WO2024124909A1 - Communication method, electronic device, and storage medium - Google Patents
- Publication number
- WO2024124909A1 (PCT/CN2023/109115)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- computing power
- target task
- computing
- target
- power unit
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W28/00—Network traffic management; Network resource management
- H04W28/02—Traffic management, e.g. flow control or congestion control
- H04W28/08—Load balancing or load distribution
- H04W28/09—Management thereof
- H04W28/0925—Management thereof using policies
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W24/00—Supervisory, monitoring or testing arrangements
- H04W24/10—Scheduling measurement reports ; Arrangements for measurement reports
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/329—Power saving characterised by the action undertaken by task scheduling
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5044—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/098—Distributed learning, e.g. federated learning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W24/00—Supervisory, monitoring or testing arrangements
- H04W24/02—Arrangements for optimising operational condition
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W24/00—Supervisory, monitoring or testing arrangements
- H04W24/06—Testing, supervising or monitoring using simulated traffic
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W28/00—Network traffic management; Network resource management
- H04W28/02—Traffic management, e.g. flow control or congestion control
- H04W28/08—Load balancing or load distribution
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/48—Indexing scheme relating to G06F9/48
- G06F2209/485—Resource constraint
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/503—Resource availability
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/509—Offload
Definitions
- the present application belongs to the field of communication technology, and specifically relates to a communication method, electronic equipment and storage medium.
- AI artificial intelligence
- AI-based training and inference engines are usually centrally deployed on a single-point computing board of a base station.
- RAN has rich application scenarios and the computing power of a single station is fixed. Supporting numerous application scenarios will bring about the problem of insufficient single-point computing power.
- data acquisition and AI applications may be located on different boards of the base station, resulting in the need to transmit data from other boards to a fixed location for AI training and inference, which will bring about problems such as data transmission overhead and feedback delay.
- the purpose of the embodiments of the present application is to provide a communication method, device, electronic device and storage medium that can address the RAN's shortage of computing power for training and inference when the volume of service data is large and application scenarios are numerous.
- in a first aspect, an embodiment of the present application provides a communication method, which is executed by a first computing power unit, and the method includes: sending target startup information to a second computing power unit, wherein the target startup information carries characteristics of a target task; receiving computing power measurement information reported by the second computing power unit, wherein the computing power measurement information is used to represent the available computing power allocated to the target task by the second computing power unit; and sending the target task to the second computing power unit when the available computing power matches the characteristics of the target task.
- in a second aspect, an embodiment of the present application provides a communication method, which is executed by a second computing power unit, and the method includes: receiving target startup information sent by a first computing power unit, wherein the target startup information carries the characteristics of a target task; allocating available computing power to the target task according to the characteristics of the target task; and sending computing power measurement information to the first computing power unit, wherein the computing power measurement information is used to represent the available computing power.
- in a third aspect, an embodiment of the present application provides a communication method, which is executed by a third computing power unit, and the method includes: receiving a processing result of a target task sent by a first computing power unit; evaluating the processing result of the target task to obtain a target evaluation result; and feeding back the target evaluation result to the first computing power unit.
- an embodiment of the present application provides a communication device, comprising: a first sending module, used to send target startup information to a second computing power unit, wherein the target startup information carries the characteristics of a target task; a first receiving module, used to receive computing power measurement information reported by the second computing power unit, wherein the computing power measurement information is used to indicate the available computing power allocated by the second computing power unit to the target task; and a second sending module, used to send the target task to the second computing power unit when the available computing power matches the characteristics of the target task.
- an embodiment of the present application provides a communication device, the device comprising: a second receiving module, configured to receive target startup information sent by a first computing power unit, wherein the target startup information carries the characteristics of a target task; an allocation module, configured to allocate available computing power to the target task according to the characteristics of the target task; and a third sending module, configured to send computing power measurement information to the first computing power unit, wherein the computing power measurement information is used to represent the available computing power.
- an embodiment of the present application provides a communication device, which includes: a third receiving module, used to receive the processing result of the target task sent by the first computing power unit; an evaluation module, used to evaluate the processing result of the target task to obtain a target evaluation result; and a feedback module, used to feed back the target evaluation result to the first computing power unit.
- an embodiment of the present application provides an electronic device, which includes a processor and a memory, wherein the memory stores programs or instructions that can be run on the processor, and when the program or instructions are executed by the processor, the steps of the communication method described in the first aspect, the second aspect, or the third aspect are implemented.
- an embodiment of the present application provides a readable storage medium, on which a program or instruction is stored; when the program or instruction is executed by a processor, the steps of the communication method described in the first aspect, the second aspect, or the third aspect are implemented.
- an embodiment of the present application provides a chip, which includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement the steps of the communication method described in the first aspect, the second aspect, or the third aspect.
- an embodiment of the present application provides a computer program product, which is stored in a storage medium and is executed by at least one processor to implement the steps of the communication method as described in the first aspect, the second aspect, or the third aspect.
- FIG. 1 is a flow chart of a communication method provided in an embodiment of the present application.
- FIG. 2 is a schematic diagram of an AI distributed framework structure provided in an embodiment of the present application.
- FIG. 3 is a schematic diagram of the structure of an AI training inference engine provided in an embodiment of the present application.
- FIG. 4 is a schematic diagram of a distributed framework structure within a base station provided in an embodiment of the present application.
- FIG. 5 is a schematic diagram of a distributed framework structure between base stations provided in an embodiment of the present application.
- FIG. 6 is a schematic diagram of a distributed framework structure between a base station and an edge computing device provided in an embodiment of the present application.
- FIG. 7 is a general architecture diagram of a 5G wireless access network RAN provided in an embodiment of the present application.
- FIG. 8 is a flow chart of a communication method provided in an embodiment of the present application.
- FIG. 9 is a flow chart of a communication method provided in an embodiment of the present application.
- FIG. 10 is a flow chart of a communication method provided in an embodiment of the present application.
- FIG. 11 is a flow chart of a communication method provided in an embodiment of the present application.
- FIG. 12 is a schematic diagram of the structure of a communication device provided in an embodiment of the present application.
- FIG. 13 is a schematic diagram of the structure of a communication device provided in an embodiment of the present application.
- FIG. 14 is a schematic diagram of the structure of a communication device provided in an embodiment of the present application.
- FIG. 15 is a schematic diagram of the hardware structure of an electronic device provided in an embodiment of the present application.
- the terms "first", "second", etc. in the specification and claims of this application are used to distinguish similar objects, rather than to describe a specific order or sequence. It should be understood that terms used in this way are interchangeable where appropriate, so that the embodiments of the present application can be implemented in an order other than those illustrated or described herein. Objects distinguished by "first", "second", etc. are generally of the same type, and the number of objects is not limited; for example, the first object can be one or more.
- "and/or" means at least one of the connected objects, and the character "/" generally means that the related objects are in an "or" relationship.
- FIG. 1 shows a flow chart of a communication method provided by an embodiment of the present application.
- the method can be executed by an electronic device or by a first computing power unit on the electronic device.
- the electronic device may include: a server or a terminal device.
- the method can be executed by software or hardware installed on the electronic device. As shown in FIG. 1, the method includes the following steps:
- S101 Send target startup information to the second computing power unit.
- the target startup information carries the characteristics of the target task.
- the AI engine of the wireless access network RAN is usually centrally deployed on a single-point computing board of the base station.
- the computing power of the base station under this deployment mode is fixed. Because scenarios differ, including across protocol layers, data acquisition and AI applications may be located on different boards of the base station. Centralized deployment requires that data from other boards be transmitted to a fixed location before AI training and inference can be performed, which brings problems such as data transmission overhead and feedback delay.
- the wireless access network has rich application scenarios, while the computing power of a single station is fixed. Supporting many application scenarios therefore leads to insufficient single-point computing power.
- the AI engine is a framework that supports users in developing machine learning and deep learning model training operations.
- the overall architecture of the distributed AI framework is shown in FIG. 2 and follows a master-slave mode: the main AI training and inference engine collaborates with multiple auxiliary AI training and inference engines to complete training and inference tasks, and the main engine and the auxiliary engines communicate through a Virtual eXtensible Local Area Network (VXLAN).
- VXLAN virtual eXtensible Local Area Network
- the main and auxiliary AI training and inference engines each consist of computing power management, task management, model management, and an AI training and inference framework.
- computing power management is responsible for measuring computing power, evaluating the AI engine's computing power, and maintaining computing power status information; model management handles adding, updating, deleting, and loading AI models; task management is responsible for managing and allocating AI training tasks and inference tasks; and the AI training and inference framework is responsible for executing model training and inference.
- the main AI training engine (i.e., the first computing power unit) sends target startup information to the auxiliary AI training engine (i.e., the second computing power unit).
- the target startup information carries the characteristics of the target task and notifies the second computing power unit to report the available computing power that can be allocated to the target task.
- S102 Receive computing power measurement information reported by the second computing power unit.
- the computing power measurement information is used to represent the available computing power allocated by the second computing power unit to the target task.
- In this step, the main AI training engine receives the computing power measurement information reported by the computing power management module of the auxiliary AI training engine.
- the computing power measurement information is used to indicate the available computing power allocated by the auxiliary AI training engine to the target task.
- a0_type indicates the type of computing power hardware, such as central processing unit (CPU), field-programmable gate array (FPGA), graphics processing unit (GPU), etc.
- a1_flops indicates the computing power of the hardware
- a2_load indicates the load of the computing power, including the maximum computing power used, the minimum computing power used, the average computing power used, etc.
- a3_time indicates the time information of computing power usage.
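The four reported fields can be pictured as a small record. The following is a minimal Python sketch, not the message format defined by the application: the field names `a0_type`, `a1_flops`, `a2_load`, and `a3_time` come from the text, while the class names, the units (FLOPS, fraction of capacity, seconds), and the `available_flops` helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class HardwareType(Enum):
    """Hardware types named in the text for a0_type."""
    CPU = "CPU"
    FPGA = "FPGA"
    GPU = "GPU"


@dataclass(frozen=True)
class LoadStats:
    """a2_load: fractions (0.0 to 1.0) of capacity in use; the unit is an assumption."""
    max_used: float
    min_used: float
    avg_used: float


@dataclass(frozen=True)
class ComputingPowerMeasurement:
    a0_type: HardwareType  # type of computing power hardware
    a1_flops: float        # hardware computing power, assumed here to be in FLOPS
    a2_load: LoadStats     # load of the computing power
    a3_time: float         # time information, assumed here to be a window in seconds

    def available_flops(self) -> float:
        """Hypothetical helper: capacity left over at the average load."""
        return self.a1_flops * (1.0 - self.a2_load.avg_used)


report = ComputingPowerMeasurement(
    a0_type=HardwareType.GPU,
    a1_flops=8.0e12,
    a2_load=LoadStats(max_used=0.9, min_used=0.1, avg_used=0.25),
    a3_time=60.0,
)
print(report.available_flops())
```

Deriving the available capacity from the average load is only one possible choice; a more conservative engine could use the maximum load instead.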
- the main AI training and inference engine receives the computing power measurement information reported by each auxiliary AI training and inference engine, and its task management module matches the characteristics of the training and inference tasks against the reported computing power.
- the characteristics of an AI training and inference task cover the computing power resources required for training and inference, the source of the training and inference data, the amount of data, and the real-time requirements of the task; the matching rule is that the engine executing a training or inference task should use local data, so as to ensure the real-time performance of training and inference.
- training and inference tasks may be withheld from auxiliary training engines with relatively high loads.
- the main AI training and inference engine sends the training and inference tasks to the auxiliary AI training and inference engines for execution based on the task matching results of the task management module.
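The matching step can be sketched as a filter over the reported engines. This is an illustrative Python sketch under stated assumptions, not the application's algorithm: the `Task` and `EngineReport` types, the `max_load` threshold of 0.8, and the choice of the least-loaded eligible engine are all hypothetical. The text only states that the executing engine should use local data and that highly loaded engines may be passed over.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Task:
    name: str
    required_flops: float  # computing power the task needs
    data_source: str       # board that holds the task's data


@dataclass(frozen=True)
class EngineReport:
    engine_id: str          # hypothetical identifier of the auxiliary engine
    board: str              # board on which the engine runs
    available_flops: float  # available computing power reported for the task
    avg_load: float         # average load, as a fraction between 0.0 and 1.0


def match_engine(task: Task, reports: List[EngineReport],
                 max_load: float = 0.8) -> Optional[EngineReport]:
    """Return an engine whose report matches the task, or None.

    Assumed rules: the engine must sit on the board that holds the data,
    must offer enough computing power, and must not exceed max_load.
    """
    eligible = [
        r for r in reports
        if r.board == task.data_source
        and r.available_flops >= task.required_flops
        and r.avg_load <= max_load
    ]
    # Hypothetical tie-break: prefer the least-loaded eligible engine.
    return min(eligible, key=lambda r: r.avg_load, default=None)


reports = [
    EngineReport("aux-1", "baseband-1", available_flops=2.0e12, avg_load=0.9),
    EngineReport("aux-2", "baseband-2", available_flops=4.0e12, avg_load=0.3),
    EngineReport("aux-3", "baseband-2", available_flops=4.0e12, avg_load=0.5),
]
task = Task("example-training-task", required_flops=1.0e12,
            data_source="baseband-2")
chosen = match_engine(task, reports)
print(chosen.engine_id if chosen else "no match")
```

Returning `None` when nothing matches leaves the fallback (for example, retrying later) to the caller, since the text does not describe one.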
- a communication method provided in an embodiment of the present application sends target startup information to a second computing power unit, wherein the target startup information carries the characteristics of the target task; receives the computing power measurement information reported by the second computing power unit, wherein the computing power measurement information is used to represent the available computing power allocated to the target task by the second computing power unit; and, when the available computing power matches the characteristics of the target task, sends the target task to the second computing power unit. This can solve the problems of insufficient computing power, large training and inference latency, and poor real-time performance of RAN when the amount of business data is large and the application scenarios are rich.
- the first computing power unit includes a first main control board of a first base station.
- the second computing power unit includes at least one of the following: at least one baseband board of the first base station, at least one baseband board of a second base station, and at least one edge computing device connected to the first base station.
- when the second computing power unit includes at least one baseband board of the second base station, the first base station and the second base station establish an inter-base station connection through the Xn port; when the second computing power unit includes at least one baseband board of the first base station, the first computing power unit and the second computing power unit are connected through a virtual extensible local area network (VXLAN).
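The two connection cases stated here can be summarised in a small lookup. The Python sketch below is illustrative only: the `UnitKind` and `ComputingPowerUnit` types and the `link_type` function are hypothetical names, and the function returns `None` for an edge computing device because this passage does not say which link is used in that case.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class UnitKind(Enum):
    BASEBAND_BOARD = "baseband board"
    EDGE_DEVICE = "edge computing device"


@dataclass(frozen=True)
class ComputingPowerUnit:
    kind: UnitKind
    base_station: str  # base station that hosts, or is connected to, the unit


def link_type(first_station: str, unit: ComputingPowerUnit) -> Optional[str]:
    """Return the link named in the text, or None where the passage is silent."""
    if unit.kind is UnitKind.BASEBAND_BOARD:
        # Same base station: VXLAN. Another base station: Xn.
        return "VXLAN" if unit.base_station == first_station else "Xn"
    return None  # link to an edge computing device is not stated in this passage


print(link_type("gNB-1", ComputingPowerUnit(UnitKind.BASEBAND_BOARD, "gNB-2")))
```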
- the overall architecture of the fifth generation mobile communication technology (5G) radio access network (RAN) is as follows: the next generation base station (Next Generation Node B, gNB) provides the 5G New Radio (NR) user plane and control plane protocols; the next generation evolved base station (Next Generation Evolved Node B, ng-eNB) provides the Evolved Universal Mobile Telecommunications System (UMTS) Terrestrial Radio Access (E-UTRA) user plane and control plane protocols.
- the connection between gNB and ng-eNB is through the Xn port.
- the gNB and ng-eNB are connected to the 5G core network (5G Core, 5GC) through the NG port.
- the 5GC includes the access and mobility management function (Access and Mobility Management Function, AMF) and the user plane function (User Plane Function, UPF).
- AMF Access and Mobility Management Function
- UPF User Plane Function
- the gNB communicates with the AMF through the NG-C port and with the UPF through the NG-U port.
- in the embodiment of the present application, the artificial intelligence (AI) distributed framework can be deployed in a distributed manner within the gNB, or in a distributed manner across the gNB and ng-eNB, and can ultimately be applied to the intelligentization of 5G wireless access networks.
- after sending the target task to the second computing power unit, the method further includes: receiving a processing result of the target task sent by the second computing power unit.
- after receiving the processing result of the target task sent by the second computing power unit, the method further includes: sending the processing result of the target task to a third computing power unit; receiving a target evaluation result fed back by the third computing power unit, wherein the target evaluation result is obtained by evaluating the processing result of the target task; and updating the processing result of the target task according to the target evaluation result.
- the target task includes: the task of training the model; the processing result of the target task includes: the model obtained by training; the target evaluation result includes: the evaluation result obtained by evaluating the model.
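One way to picture this feedback loop is the following Python sketch. It is illustrative only: the `Evaluator` callable standing in for the third computing power unit, the `Evaluation` record, the numeric `score`, and the rule that a model is kept only when its score beats the stored one are all assumptions. The application states only that the processing result is updated according to the target evaluation result.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Evaluation:
    model_id: str
    score: float  # hypothetical quality metric, higher is better


# Stand-in for the third computing power unit: takes a model, returns an evaluation.
Evaluator = Callable[[str, bytes], Evaluation]


class ModelStore:
    """Hypothetical store kept by the first computing power unit."""

    def __init__(self) -> None:
        self.models: Dict[str, bytes] = {}
        self.scores: Dict[str, float] = {}

    def update_from_evaluation(self, model_id: str, model: bytes,
                               evaluate: Evaluator) -> bool:
        """Send the trained model for evaluation and keep it if it scores higher."""
        result = evaluate(model_id, model)
        best = self.scores.get(model_id, float("-inf"))
        if result.score > best:
            self.models[model_id] = model
            self.scores[model_id] = result.score
            return True
        return False


def fake_evaluator(model_id: str, model: bytes) -> Evaluation:
    # Toy scoring rule for the example only: a longer payload scores higher.
    return Evaluation(model_id, score=float(len(model)))


store = ModelStore()
print(store.update_from_evaluation("traffic-model", b"v1", fake_evaluator))
print(store.update_from_evaluation("traffic-model", b"v", fake_evaluator))
```

In this toy run the first model is stored and the second, lower-scoring one is rejected.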
- the AI training of RAN can be carried out in the cloud: the base station's AI training tasks are completed by a cloud training system, and the training results are then returned to the base station, so that the AI training of the wireless access network is no longer limited by the computing power of the base station.
- however, this requires data transmission between the base station and the cloud. Massive data places high demands on transmission bandwidth and introduces large transmission delays, resulting in poor real-time performance for online AI training and inference, so the expected effect cannot be achieved in task scenarios with high real-time requirements.
- by deploying an artificial intelligence distributed framework inside or between wireless base stations, the embodiments of the present application can collect and measure computing resources within the deployment domain, match available computing resources with AI training and inference tasks, complete AI inference and training tasks in a distributed manner, and share trained models between base stations across sites.
- this can solve the problem that some models cannot be trained online because some sites have high service load and limited idle computing power.
- FIG. 8 is a flow chart of a communication method provided by an embodiment of the present application.
- the method can be executed by an electronic device or by a second computing power unit on the electronic device.
- the electronic device may include: a server or a terminal device.
- the method can be executed by software or hardware installed on the electronic device. As shown in FIG. 8, the method includes the following steps:
- S201 Receive target startup information sent by the first computing power unit.
- the target startup information carries the characteristics of the target task.
- In this step, the computing power management module of the auxiliary AI training engine (i.e., the second computing power unit) deployed on each baseband board receives the target startup information from the main AI training and inference engine (i.e., the first computing power unit), which prompts it to report that baseband board's computing power measurement information to the main engine.
- S202 Allocate available computing power to the target task according to the characteristics of the target task.
- the computing power management module of the auxiliary AI training engine measures the local available computing power through calculation according to the characteristics of the target task.
- S203 Send computing power measurement information to the first computing power unit, where the computing power measurement information is used to represent the available computing power.
- the auxiliary AI training engine reports the available computing power to the main AI training and inference engine.
- a communication method provided in an embodiment of the present application receives target startup information sent by a first computing power unit, wherein the target startup information carries the characteristics of a target task; allocates available computing power to the target task according to the characteristics of the target task; and sends computing power measurement information to the first computing power unit, wherein the computing power measurement information is used to represent the available computing power.
- This method can solve the problems of insufficient computing power, large training and inference latency, and poor real-time performance of RAN when the amount of business data is large and the application scenarios are rich.
- after the computing power measurement information is sent to the first computing power unit, the method further includes: receiving the target task, where the target task is sent by the first computing power unit when the available computing power matches the characteristics of the target task.
- the method further includes: executing the target task to obtain a processing result of the target task; and sending the processing result of the target task to the first computing power unit.
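Steps S201 to S203 can be sketched from the second computing power unit's side. The Python sketch below is a minimal illustration, not the application's implementation: `StartupInfo`, `MeasurementReport`, and `AuxiliaryEngine` are hypothetical names, a single `required_flops` value stands in for the task characteristics, and the allocation rule (offer what the task asks for, capped by idle capacity) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StartupInfo:
    task_name: str
    required_flops: float  # one task characteristic; others are omitted here


@dataclass(frozen=True)
class MeasurementReport:
    task_name: str
    available_flops: float  # available computing power allocated to the task


class AuxiliaryEngine:
    def __init__(self, total_flops: float, used_flops: float) -> None:
        self.total_flops = total_flops
        self.used_flops = used_flops

    def on_startup_info(self, info: StartupInfo) -> MeasurementReport:
        """S201 to S203: receive startup info, allocate, and build the report."""
        idle = max(self.total_flops - self.used_flops, 0.0)
        # Assumed rule: offer what the task asks for, capped by idle capacity.
        granted = min(info.required_flops, idle)
        return MeasurementReport(info.task_name, granted)


engine = AuxiliaryEngine(total_flops=4.0e12, used_flops=3.0e12)
report = engine.on_startup_info(StartupInfo("example-task", 2.0e12))
print(report.available_flops)
```

Here the engine can offer only half of what the task requires, so a first computing power unit applying a strict match would not send the task to it.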
- FIG. 9 shows a flow chart of a communication method provided by an embodiment of the present application. The method comprises the following steps:
- the main AI training and inference engine sends target startup information to each auxiliary AI training engine.
- the computing power management module of the auxiliary AI training engine deployed on each baseband board reports that baseband board's computing power measurement information to the main AI training and inference engine.
- the main AI training and inference engine receives the computing power measurement information reported by each auxiliary AI training and inference engine, and the task management module matches the training and inference task characteristics with the computing power.
- the main AI training and inference engine sends the training and inference tasks to the auxiliary AI training and inference engine for execution based on the task matching results of the task management module.
- the local AI training and inference engine receives the task, matches the data required for the training and inference task, and starts collecting and preprocessing local data.
- the local AI training and inference engine uses the locally preprocessed data to perform the training and inference tasks. It can also store the trained model and inference results locally through the local model management module.
- the model and inference results are also synchronously sent to the model management module of the main AI training and inference engine for management.
- the model management module updates the managed model according to the model management strategy.
- FIG. 10 is a flow chart of a communication method provided by an embodiment of the present application.
- the method can be executed by an electronic device or by a third computing power unit on the electronic device.
- the electronic device may include: a server or a terminal device.
- the method can be executed by software or hardware installed on the electronic device. As shown in FIG. 10, the method includes the following steps:
- S301 Receive the processing result of the target task sent by the first computing power unit.
- the base station (third computing power unit) that deploys the auxiliary AI training and inference engine has upper-layer applications that need to use the corresponding AI model and inference results; however, because this station has high load and little idle computing power, it has not previously performed the corresponding AI model training and inference, and its upper-layer applications cannot find a matching AI model and inference results.
- the base station that deploys the auxiliary AI training and inference engine requests the corresponding AI model and inference results from the base station of the main AI training and inference engine.
- the base station of the main AI training and inference engine sends the corresponding AI model/inference result to the base station of the auxiliary AI training and inference engine.
- the upper-layer application of the base station of the auxiliary AI training and inference engine calls this AI model/inference result to complete the upper-layer service.
- the auxiliary AI training and inference engine evaluates the effect of the upper-layer application calling this AI model/inference result.
- the auxiliary AI training and inference engine feeds back the corresponding evaluation results to the main AI training and inference engine, so that the main engine can decide, according to the effect evaluation strategy, whether to carry out the next round of model training and model inference tasks.
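The effect evaluation strategy itself is not specified in the text. As a purely hypothetical example, the main engine could start another round when the average of the fed-back scores falls below a threshold; the function name, the 0.0 to 1.0 score scale, and the threshold in this Python sketch are all assumptions.

```python
from statistics import mean
from typing import Sequence


def needs_next_round(scores: Sequence[float], threshold: float = 0.9) -> bool:
    """Return True when the mean fed-back score is below the threshold."""
    if not scores:
        # No feedback yet: assume the current model is kept as-is.
        return False
    return mean(scores) < threshold


print(needs_next_round([0.95, 0.92]))  # mean is above the threshold
print(needs_next_round([0.70, 0.80]))  # mean is below the threshold
```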
- a communication method provided in an embodiment of the application receives the processing result of a target task sent by a first computing unit, evaluates the processing result of the target task to obtain a target evaluation result, and feeds back the target evaluation result to the first computing unit. This can solve the problems of insufficient computing power, large training and inference latency, and poor real-time performance that the RAN faces when the volume of service data is large and the application scenarios are rich.
- because the base stations are deployed in a mesh and different base stations have the same application scenarios, different training tasks can be deployed on different base stations. The base stations can share the training results with one another, which supports AI training in all scenarios, achieves the effect of enhancing single-point computing power, and improves the processing capabilities of the wireless access network.
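The receive, evaluate and feed-back steps performed by the third computing unit can be sketched as follows. This is only an illustrative sketch: the class and field names, the use of mean absolute error as the evaluation metric, and the callback standing in for the link back to the first computing unit are all assumptions, not details given in the application.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class ProcessingResult:
    """Processing result of a target task (hypothetical shape)."""
    task_id: str
    predict: Callable[[float], float]  # stands in for a trained model


@dataclass
class EvaluationResult:
    """Target evaluation result fed back to the first computing unit."""
    task_id: str
    mean_abs_error: float


def evaluate(result: ProcessingResult,
             samples: Sequence[Tuple[float, float]]) -> EvaluationResult:
    """Score the received model on local (input, expected) samples.

    Mean absolute error is an assumed metric chosen for illustration.
    """
    errors = [abs(result.predict(x) - y) for x, y in samples]
    return EvaluationResult(result.task_id, sum(errors) / len(errors))


class ThirdComputingUnit:
    def __init__(self,
                 samples: Sequence[Tuple[float, float]],
                 send_feedback: Callable[[EvaluationResult], None]):
        self.samples = samples
        # Hypothetical transport back to the first computing unit.
        self.send_feedback = send_feedback

    def on_processing_result(self, result: ProcessingResult) -> EvaluationResult:
        # Receive the processing result, evaluate it, then feed the result back.
        evaluation = evaluate(result, self.samples)
        self.send_feedback(evaluation)
        return evaluation
```

Keeping the evaluation function separate from the unit makes it possible to swap in a different effect evaluation strategy without touching the messaging logic.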
- FIG11 shows a flow chart of a communication method provided by an embodiment of the present application. The method comprises the following steps:
- the base station (third computing unit) that deploys the auxiliary AI training and inference engine has upper-layer applications that need to use the corresponding AI model/inference results; however, due to the high load and low idle computing power of this station, the station has not previously performed the corresponding AI model training/inference, and the upper-layer applications of this station cannot match the corresponding AI model/inference results.
- the base station that deploys the auxiliary AI training and inference engine requests the corresponding AI model/inference result from the base station of the main AI training and inference engine.
- the base station of the main AI training and inference engine sends the corresponding AI model/inference result to the base station of the auxiliary AI training and inference engine.
- the upper-layer application of the base station of the auxiliary AI training and inference engine calls this AI model/inference result to complete the upper-layer service.
- the auxiliary AI training and inference engine evaluates the effect of the upper-layer application calling this AI model/inference result.
- the auxiliary AI training and inference engine feeds back the corresponding evaluation results to the main AI training and inference engine, so that the main AI training and inference engine decides, according to the effect evaluation strategy, whether to carry out the next round of model training/model inference tasks.
- the above steps are mainly based on the process of sharing AI models and inference results between base stations in a distributed framework.
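The sharing process described in these steps can be sketched as a small request and feedback exchange. The sketch below is hypothetical: the class names, the in-memory model registry, the numeric score, and the threshold-based evaluation policy are assumptions made for illustration and are not specified in the application.

```python
from typing import Callable, Dict, Optional


class MainEngine:
    """Main AI training and inference engine: manages shared models/results."""

    def __init__(self, retrain_threshold: float):
        self.models: Dict[str, str] = {}  # scenario -> model/inference result
        # Assumed effect evaluation strategy: retrain when the score is too low.
        self.retrain_threshold = retrain_threshold

    def publish(self, scenario: str, model: str) -> None:
        self.models[scenario] = model

    def request(self, scenario: str) -> Optional[str]:
        # Answer a request from an auxiliary engine.
        return self.models.get(scenario)

    def on_evaluation(self, scenario: str, score: float) -> bool:
        # True means "schedule the next round of training/inference".
        return score < self.retrain_threshold


class AuxiliaryEngine:
    """Auxiliary engine on a heavily loaded base station."""

    def __init__(self, main: MainEngine):
        self.main = main

    def serve(self, scenario: str,
              score_fn: Callable[[str], float]) -> Optional[bool]:
        model = self.main.request(scenario)      # request the shared model
        if model is None:
            return None                          # nothing to share yet
        score = score_fn(model)                  # upper-layer app uses it, effect is scored
        return self.main.on_evaluation(scenario, score)  # feed back the evaluation
```

In a real deployment the two engines would sit on different base stations and exchange these messages over an inter-station link rather than through direct method calls.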
- the specific deployment principle is: base stations with relatively low load and more idle computing power deploy the main AI training and inference engine; base stations with relatively high load and less idle computing power deploy the auxiliary AI training and inference engine.
- the trained models and inference results will be managed by the main AI training and inference engine.
- the base stations that deploy the auxiliary AI training and inference engines can directly use the AI models and inference results managed by the main AI training and inference engine to complete the upper-level business applications.
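The deployment principle above amounts to a simple classification of base stations by load and idle computing power. A minimal sketch follows; the `BaseStation` fields and the two threshold values are hypothetical, since the application does not state how "relatively low load" or "more idle computing power" is quantified.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class BaseStation:
    name: str
    load: float          # fraction of capacity in use, 0.0 to 1.0 (assumed metric)
    idle_compute: float  # idle computing power in arbitrary units (assumed)


def assign_engines(stations: Iterable[BaseStation],
                   max_load: float = 0.5,
                   min_idle: float = 10.0) -> Dict[str, str]:
    """Assign 'main' to lightly loaded stations and 'auxiliary' to the rest.

    The thresholds are illustrative defaults, not values from the application.
    """
    roles: Dict[str, str] = {}
    for station in stations:
        is_main = station.load <= max_load and station.idle_compute >= min_idle
        roles[station.name] = "main" if is_main else "auxiliary"
    return roles
```

For example, a station at 20% load with 40 idle units would host the main engine, while a station at 90% load with 2 idle units would host an auxiliary engine.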
- the distributed deployment mode and communication method between the base station and the edge computing device can refer to the description of the above embodiment, which can achieve the same technical effect. To avoid repetition, no further details will be given.
- the communication method provided in the embodiment of the present application can be executed by a communication device or a control module in the communication device for executing the communication method.
- the communication device provided in the embodiment of the present application is described by taking the communication device executing the communication method as an example.
- FIG12 is a schematic diagram of the structure of a communication device provided in an embodiment of the present application.
- the communication device 500 includes: a first sending module 510 , a first receiving module 520 and a second sending module 530 .
- the first sending module 510 is used to send target startup information to the second computing power unit, wherein the target startup information carries the characteristics of the target task; the first receiving module 520 is used to receive computing power measurement information reported by the second computing power unit, wherein the computing power measurement information is used to represent the available computing power allocated by the second computing power unit to the target task; the second sending module 530 is used to send the target task to the second computing power unit when the available computing power matches the characteristics of the target task.
- a communication device includes a first sending module for sending target startup information to a second computing unit, wherein the target startup information carries the characteristics of a target task; a first receiving module for receiving computing power measurement information reported by the second computing unit, wherein the computing power measurement information is used to indicate the available computing power allocated by the second computing unit to the target task; and a second sending module for sending the target task to the second computing unit when the available computing power matches the characteristics of the target task. This can solve the problems of insufficient computing power, large training and inference latency, and poor real-time performance that the RAN faces when the volume of service data is large and the application scenarios are rich.
- the first receiving module 520 is further used to: receive the processing result of the target task sent by the second computing unit.
- the device 500 also includes: a third sending module, used to send the processing result of the target task to the third computing power unit; the first receiving module 520 is also used to: receive the target evaluation result fed back by the third computing power unit, wherein the target evaluation result is obtained by evaluating the processing result of the target task; the device 500 also includes: an updating module, used to update the processing result of the target task according to the target evaluation result.
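The behaviour of the first computing power unit (send startup information, receive the measurement, send the task only on a match) can be sketched as below. The message fields, the `required_flops` characteristic, the "allocated power covers the requirement" matching rule, and the two callbacks standing in for the link to the second computing power unit are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class StartupInfo:
    """Target startup information; carries the characteristics of the target task."""
    task_id: str
    task_type: str         # e.g. "model-training" (hypothetical label)
    required_flops: float  # hypothetical characteristic: compute the task needs


@dataclass(frozen=True)
class ComputeMeasurement:
    """Computing power measurement information reported by the second unit."""
    task_id: str
    available_flops: float  # computing power allocated to the target task


def matches(info: StartupInfo, measurement: ComputeMeasurement) -> bool:
    # Assumed matching rule: the allocated computing power covers the requirement.
    return measurement.available_flops >= info.required_flops


class FirstComputingUnit:
    def __init__(self,
                 request_measurement: Callable[[StartupInfo], ComputeMeasurement],
                 send_task: Callable[[str, bytes], bytes]):
        # Both callables are hypothetical stand-ins for the real transport.
        self.request_measurement = request_measurement
        self.send_task = send_task

    def offload(self, info: StartupInfo, payload: bytes) -> Optional[bytes]:
        # Send the startup information and receive the reported measurement.
        measurement = self.request_measurement(info)
        if not matches(info, measurement):
            return None  # available computing power does not match the task
        # Send the target task and receive its processing result.
        return self.send_task(info.task_id, payload)
```

Returning `None` on a mismatch leaves the caller free to try another candidate unit; the application does not say what happens in that case.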
- FIG13 is a schematic diagram of the structure of a communication device provided in an embodiment of the present application.
- the communication device 600 includes: a second receiving module 610 , an allocating module 620 and a fourth sending module 630 .
- the second receiving module 610 is used to receive target startup information sent by the first computing power unit, wherein the target startup information carries the characteristics of the target task; the allocation module 620 is used to allocate available computing power to the target task according to the characteristics of the target task; the fourth sending module 630 is used to send computing power measurement information to the first computing power unit, wherein the computing power measurement information is used to represent the available computing power.
- a communication device configured to receive target startup information sent by a first computing power unit through a second receiving module, wherein the target startup information carries characteristics of a target task; an allocation module is configured to allocate available computing power to the target task according to the characteristics of the target task; and a fourth sending module is configured to send computing power measurement information to the first computing power unit, wherein the computing power measurement information is used to represent the available computing power, which can solve the problems of insufficient computing power, large training and inference latency, and poor real-time performance of RAN when the amount of business data is large and the application scenarios are rich.
- the second receiving module 610 is further used to: receive the target task, where the target task is sent by the first computing power unit when the available computing power matches the characteristics of the target task.
- the apparatus 600 further includes: an execution module, configured to execute the target task and obtain a processing result of the target task; and the fourth sending module 630 is further configured to send the processing result of the target task to the first computing power unit.
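The allocation, reporting and execution roles of the second computing power unit can be sketched as follows. The capacity model (peak computing power scaled by current load), the reservation bookkeeping, and all names are hypothetical; the application only states that available computing power is allocated according to the characteristics of the target task.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class SecondComputingUnit:
    peak_flops: float  # hypothetical capacity figure
    load: float        # fraction of capacity already in use, 0.0 to 1.0
    allocations: Dict[str, float] = field(default_factory=dict)

    def idle_flops(self) -> float:
        """Computing power not used by existing load or earlier reservations."""
        return self.peak_flops * (1.0 - self.load) - sum(self.allocations.values())

    def on_startup_info(self, task_id: str, required_flops: float) -> float:
        """Allocate available computing power for the task and report it.

        The returned value plays the role of the computing power
        measurement information sent to the first computing power unit.
        """
        granted = max(0.0, min(required_flops, self.idle_flops()))
        self.allocations[task_id] = granted
        return granted

    def on_target_task(self, task_id: str, task: Callable[[], object]) -> object:
        """Execute the target task and release its reservation afterwards."""
        try:
            return task()
        finally:
            self.allocations.pop(task_id, None)
```

Reserving the granted amount at startup time keeps two concurrent requests from both being promised the same idle capacity.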
- FIG14 is a schematic diagram of the structure of a communication device provided in an embodiment of the present application.
- the communication device 700 includes: a third receiving module 710 , an evaluation module 720 and a fifth sending module 730 .
- the third receiving module 710 is used to receive the processing result of the target task sent by the first computing unit; the evaluation module 720 is used to evaluate the processing result of the target task to obtain a target evaluation result; the fifth sending module 730 is used to feed back the target evaluation result to the first computing unit.
- the first computing power unit includes a first main control board of a first base station
- the second computing power unit includes at least one of: at least one baseband board of the first base station, at least one baseband board of the second base station, and at least one edge computing device connected to the first base station.
- when the second computing power unit includes at least one baseband board of the second base station, the first base station and the second base station establish an inter-base station connection through the Xn port; when the second computing power unit includes at least one baseband board of the first base station, the first computing power unit and the second computing power unit are connected through a virtual extensible local area network (VXLAN).
- the third computing power unit includes at least one of: at least one baseband board of the first base station, at least one baseband board of the second base station, and at least one edge computing device connected to the first base station; the third computing power unit and the second computing power unit are not the same computing power unit.
- the target task includes: a task of training a model; the processing result of the target task includes: a model obtained by training; and the target evaluation result includes: an evaluation result obtained by evaluating the model.
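The unit types and connection rules listed above can be captured in a small lookup. This is a hypothetical sketch: the enum, the function names and the `"unspecified"` placeholder are assumptions, and the link used for edge computing devices is left open because this passage does not name it.

```python
from enum import Enum


class UnitKind(Enum):
    LOCAL_BASEBAND_BOARD = "baseband board of the first base station"
    REMOTE_BASEBAND_BOARD = "baseband board of the second base station"
    EDGE_DEVICE = "edge computing device connected to the first base station"


def link_for(kind: UnitKind) -> str:
    """Return the connection type between the first unit and a second unit."""
    if kind is UnitKind.LOCAL_BASEBAND_BOARD:
        return "VXLAN"   # same base station: virtual extensible LAN
    if kind is UnitKind.REMOTE_BASEBAND_BOARD:
        return "Xn"      # different base stations: inter-base-station Xn port
    return "unspecified"  # not stated in this passage


def validate_roles(second_unit_id: str, third_unit_id: str) -> None:
    """The third and second computing power units must be different units."""
    if second_unit_id == third_unit_id:
        raise ValueError("third and second computing power units must differ")
```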
- the communication device provided in the embodiment of the present application can implement the various processes implemented by the communication method embodiment described in at least one embodiment of Figures 1 to 11, and achieve the same technical effect. To avoid repetition, it will not be repeated here.
- the communication device in the embodiments of the present application may be a device, or a component, integrated circuit, or chip in a terminal device.
- the device may be a mobile electronic device or a non-mobile electronic device.
- the mobile electronic device may be a mobile phone, a tablet computer, a laptop computer, a handheld computer, an in-vehicle electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a personal digital assistant (PDA), etc.
- the non-mobile electronic device may be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), an ATM or an automatic machine, etc., which is not specifically limited in the embodiments of the present application.
- the communication device in the embodiment of the present application may be a device having an operating system.
- the operating system may be an Android operating system, an iOS operating system, or other possible operating systems, which are not specifically limited in the embodiment of the present application.
- an embodiment of the present application further provides an electronic device 800, including a processor 801, a memory 802, a program or instruction stored in the memory 802 and executable on the processor 801, and when the program or instruction is executed by the processor 801, the communication method described in at least one of the embodiments of FIG1 to FIG11 is implemented.
- the electronic device in the embodiment of the present application includes: a server, a terminal device, or other devices other than a terminal device.
- the above electronic device structure does not constitute a limitation on the electronic device.
- the electronic device may include more or fewer components than shown in the figure, or combine certain components, or arrange the components differently.
- the input unit may include a graphics processing unit (GPU) and a microphone
- the display unit may be configured with a display panel in the form of a liquid crystal display, an organic light-emitting diode, etc.
- the user input unit includes a touch panel and at least one of other input devices.
- the touch panel is also called a touch screen.
- Other input devices may include, but are not limited to, a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, and a joystick, which will not be repeated here.
- the memory can be used to store software programs and various data.
- the memory may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, wherein the first storage area may store an operating system, an application program or instructions required for at least one function (such as a sound playback function, an image playback function, etc.), etc.
- the memory may include a volatile memory or a non-volatile memory, or the memory may include both volatile and non-volatile memories.
- the non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), or Flash memory.
- Volatile memory can be Random Access Memory (RAM), Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), and Direct Rambus RAM (DRRAM).
- the processor may include one or more processing units; optionally, the processor integrates an application processor and a modem processor, wherein the application processor mainly processes operations related to the operating system, user interface, and application programs, and the modem processor mainly processes communication signals, such as a baseband processor. It is understandable that the modem processor may not be integrated into the processor.
- An embodiment of the present application also provides a readable storage medium, on which a program or instruction is stored. When the program or instruction is executed by a processor, the communication method described in at least one of the embodiments in Figures 1 and 2 is implemented, and the same technical effect can be achieved. To avoid repetition, it will not be repeated here.
- the processor is a processor in the electronic device described in the above embodiment.
- the readable storage medium includes a computer readable storage medium, such as a computer read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, etc.
- An embodiment of the present application further provides a chip, which includes a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement the various processes of the above-mentioned communication method embodiment, and can achieve the same technical effect. To avoid repetition, it will not be repeated here.
- the chip mentioned in the embodiments of the present application can also be called a system-level chip, a system chip, a chip system or a system-on-chip chip, etc.
- the technical solution of the present application can be embodied in the form of a computer software product, which is stored in a storage medium (such as ROM/RAM, a disk, or an optical disk), and includes a number of instructions for a terminal (which can be a mobile phone, a computer, a server, or a network device, etc.) to execute the methods described in each embodiment of the present application.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Neurology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
Description
f=f(a0_type,a1_flops,a2_load,a3_time)
Claims (13)
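- A communication method, performed by a first computing power unit, the method comprising: sending target startup information to a second computing power unit, wherein the target startup information carries characteristics of a target task; receiving computing power measurement information reported by the second computing power unit, wherein the computing power measurement information is used to represent the available computing power allocated by the second computing power unit to the target task; and sending the target task to the second computing power unit when the available computing power matches the characteristics of the target task.
- The method according to claim 1, wherein, after sending the target task to the second computing power unit, the method further comprises: receiving a processing result of the target task sent by the second computing power unit.
- The method according to claim 2, wherein, after receiving the processing result of the target task sent by the second computing power unit, the method further comprises: sending the processing result of the target task to a third computing power unit; receiving a target evaluation result fed back by the third computing power unit, wherein the target evaluation result is obtained by evaluating the processing result of the target task; and updating the processing result of the target task according to the target evaluation result.
- A communication method, performed by a second computing power unit, the method comprising: receiving target startup information sent by a first computing power unit, wherein the target startup information carries characteristics of a target task; allocating available computing power to the target task according to the characteristics of the target task; and sending computing power measurement information to the first computing power unit, wherein the computing power measurement information is used to represent the available computing power.
- The method according to claim 4, wherein, after sending the computing power measurement information to the first computing power unit, the method further comprises: receiving the target task, wherein the target task is sent by the first computing power unit when the available computing power matches the characteristics of the target task.
- The method according to claim 5, wherein, after receiving the target task, the method further comprises: executing the target task to obtain a processing result of the target task; and sending the processing result of the target task to the first computing power unit.
- A communication method, performed by a third computing power unit, the method comprising: receiving a processing result of a target task sent by a first computing power unit; evaluating the processing result of the target task to obtain a target evaluation result; and feeding back the target evaluation result to the first computing power unit.
- The method according to any one of claims 1 to 7, wherein the first computing power unit comprises a first main control board of a first base station, and the second computing power unit comprises at least one of: at least one baseband board of the first base station, at least one baseband board of a second base station, and at least one edge computing device connected to the first base station.
- The method according to claim 8, wherein, when the second computing power unit comprises at least one baseband board of the second base station, the first base station and the second base station establish an inter-base station connection through an Xn port; and when the second computing power unit comprises at least one baseband board of the first base station, the first computing power unit and the second computing power unit are connected through a virtual extensible local area network (VXLAN).
- The method according to claim 3 or 7, wherein the third computing power unit comprises at least one of: at least one baseband board of a first base station, at least one baseband board of a second base station, and at least one edge computing device connected to the first base station; and the third computing power unit and the second computing power unit are not the same computing power unit.
- The method according to any one of claims 1 to 7, wherein the target task comprises a task of training a model; the processing result of the target task comprises a model obtained by training; and the target evaluation result comprises an evaluation result obtained by evaluating the model.
- An electronic device, comprising a processor, a memory, and a program or instruction stored in the memory and executable on the processor, wherein the program or instruction, when executed by the processor, implements the steps of the communication method according to any one of claims 1 to 11.
- A readable storage medium, on which a program or instruction is stored, wherein the program or instruction, when executed by a processor, implements the steps of the communication method according to any one of claims 1 to 11.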
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP23902119.9A EP4590020A4 (en) | 2022-12-13 | 2023-07-25 | COMMUNICATION METHOD, ELECTRONIC DEVICE AND STORAGE MEDIA |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211595595.0A CN118200979A (zh) | 2022-12-13 | 2022-12-13 | Communication method, electronic device and storage medium |
| CN202211595595.0 | 2022-12-13 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024124909A1 true WO2024124909A1 (zh) | 2024-06-20 |
Family
ID=91402130
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2023/109115 Ceased WO2024124909A1 (zh) | Communication method, electronic device and storage medium | 2022-12-13 | 2023-07-25 |
Country Status (3)
| Country | Link |
|---|---|
| EP (1) | EP4590020A4 (zh) |
| CN (1) | CN118200979A (zh) |
| WO (1) | WO2024124909A1 (zh) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN118802661B (zh) * | 2024-06-24 | 2026-01-20 | 中国移动通信有限公司研究院 | Task processing method and apparatus, device, and storage medium |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
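| WO2021208914A1 (zh) * | 2020-04-15 | 2021-10-21 | 展讯半导体(南京)有限公司 | Network-scheduling-based computing power sharing method and related product |
| WO2022028418A1 (zh) * | 2020-08-04 | 2022-02-10 | 中国移动通信有限公司研究院 | Network system for computing power processing, service processing method and device |
| CN114168331A (zh) * | 2021-12-07 | 2022-03-11 | 杭州萤石软件有限公司 | Algorithm deployment and scheduling method, and algorithm deployment and scheduling apparatus |
| WO2022143744A1 (zh) * | 2020-12-31 | 2022-07-07 | 维沃移动通信有限公司 | Information processing method, apparatus, device and storage medium |
| WO2022143748A1 (zh) * | 2020-12-31 | 2022-07-07 | 维沃移动通信有限公司 | Information processing method, apparatus, device and storage medium |
| CN115373836A (zh) * | 2022-05-09 | 2022-11-22 | 华为技术有限公司 | Computing network, computing power measurement method, scheduling apparatus and related product |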
-
2022
- 2022-12-13 CN CN202211595595.0A patent/CN118200979A/zh active Pending
-
2023
- 2023-07-25 EP EP23902119.9A patent/EP4590020A4/en active Pending
- 2023-07-25 WO PCT/CN2023/109115 patent/WO2024124909A1/zh not_active Ceased
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4590020A4 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN118200979A (zh) | 2024-06-14 |
| EP4590020A1 (en) | 2025-07-23 |
| EP4590020A4 (en) | 2025-12-24 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23902119; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2023902119; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2023902119; Country of ref document: EP; Effective date: 20250418 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWP | Wipo information: published in national office | Ref document number: 2023902119; Country of ref document: EP |