WO2024120470A1 - Model training method, terminal, and network-side device

Model training method, terminal, and network-side device

Info

Publication number
WO2024120470A1
Authority
WO
WIPO (PCT)
Prior art keywords
federated learning
model
message
information
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2023/136968
Other languages
English (en)
French (fr)
Inventor
程思涵
崇卫微
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vivo Mobile Communication Co Ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vivo Mobile Communication Co Ltd
Priority to EP23900043.3A (EP4633104A4)
Priority to JP2025532177A (JP2025538004A)
Publication of WO2024120470A1
Priority to US19/229,729 (US20250299110A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/16 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/04 Network management architectures or arrangements
    • H04L41/042 Network management architectures or arrangements comprising distributed management centres cooperatively managing the network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/2866 Architectures; Arrangements
    • H04L67/2869 Terminals specially adapted for communication
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/098 Distributed learning, e.g. federated learning

Definitions

  • the present application belongs to the field of communication technology, and specifically relates to a model training method, a terminal and a network side device.
  • Federated learning aims to build a federated learning model based on distributed data sets.
  • information related to the federated learning model can be exchanged between the parties (possibly in encrypted form), but the original data cannot be exchanged, so that the private part of the data at each site is not exposed.
  • horizontal federated learning is the union of samples. It is suitable for scenarios where the participants have the same business model but reach different customers, that is, scenarios with more feature overlap and less user overlap, such as the same service (such as session management business) in the core domain and access domain of the communication network serving different users (such as each terminal, that is, different samples).
  • horizontal federation increases the number of training samples, thereby obtaining a better federated learning model.
  • After the federated learning training is completed, the client usually remains in a state of waiting for the next round of training, which occupies a large amount of storage space and computing power.
  • the embodiments of the present application provide a model training method, a terminal, and a network-side device, which can solve the problem that the client cannot know the end of federated learning training and thus cannot perform the next operation, thereby occupying space and computing power.
  • a model training method comprising: a first device receives a first message from a second device, the first message being used to indicate that federated learning training is terminated or suspended; the first device performs a first operation based on the first message; wherein the first device comprises a federated learning client, and the second device comprises a federated learning server.
  • a model training method comprising: a second device sends a first message to a first device, wherein the first message is used to indicate that the federated learning training is terminated or suspended; wherein the first device includes a federated learning client, and the second device includes a federated learning server.
  • a model training device which is applied to a first device, including: a receiving module for receiving a first message from a second device, wherein the first message is used to indicate the termination or suspension of federated learning training; a processing module for performing a first operation based on the first message; wherein the first device includes a federated learning client, and the second device includes a federated learning server.
  • a model training device which is applied to a second device, including: a sending module, used to send a first message to a first device, wherein the first message is used to indicate the termination or suspension of federated learning training; wherein the first device includes a federated learning client, and the second device includes a federated learning server.
  • a terminal comprising a processor and a communication interface, wherein the communication interface is used to receive a first message from a second device, the first message is used to indicate that federated learning training is terminated or suspended; the processor is used to perform a first operation based on the first message; wherein the terminal includes a client of federated learning, and the second device includes a server of federated learning.
  • a terminal comprising a processor and a communication interface, wherein the communication interface is used to send a first message to a first device, the first message is used to indicate that federated learning training is terminated or suspended; wherein the first device includes a client of federated learning, and the terminal includes a server of federated learning.
  • a network side device which includes a processor and a memory, wherein the memory stores programs or instructions that can be run on the processor, and when the program or instructions are executed by the processor, the steps of the method described in the first aspect or the second aspect are implemented.
  • a network-side device comprising a processor and a communication interface, wherein the communication interface is used to receive a first message from a second device, the first message being used to indicate that federated learning training is terminated or suspended; the processor is used to perform a first operation based on the first message; wherein the network-side device comprises a client of federated learning, and the second device comprises a server of federated learning.
  • a network-side device comprising a processor and a communication interface, wherein the communication interface is used to send a first message to a first device, the first message being used to indicate that federated learning training is terminated or suspended; wherein the first device comprises a client of federated learning, and the network-side device comprises a server of federated learning.
  • a model training system comprising: a terminal and a network side device, wherein the terminal can be used to execute the steps of the method described in the first aspect, and the network side device can be used to execute the steps of the method described in the second aspect; or, the terminal can be used to execute the steps of the method described in the second aspect, and the network side device can be used to execute the steps of the method described in the first aspect.
  • a readable storage medium on which a program or instruction is stored; when the program or instruction is executed by a processor, the steps of the method described in the first aspect are implemented, or the steps of the method described in the second aspect are implemented.
  • a chip comprising a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is used to run a program or instruction to implement the steps of the method described in the first aspect, or to implement the steps of the method described in the second aspect.
  • a computer program/program product is provided, wherein the computer program/program product is stored in a storage medium, and the computer program/program product is executed by at least one processor to implement the steps of the method described in the first aspect, or to implement the steps of the method described in the second aspect.
  • the server may send a first message to the client, and the first message is used to indicate that the federated learning training is terminated or suspended.
  • the first device can be informed that the federated learning training is finished, and can perform a first operation based on the first message, for example, stopping the local federated learning training, deleting the local federated learning model, etc., thereby avoiding unnecessary use of the client's space and computing power and improving the client's performance.
  • FIG. 1 is a schematic diagram of a wireless communication system according to an embodiment of the present application.
  • FIG. 2 is a schematic flow chart of a model training method according to an embodiment of the present application.
  • FIG. 3 is a schematic flow chart of a model training method according to an embodiment of the present application.
  • FIG. 4 is a schematic flow chart of a model training method according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of the structure of a model training device according to an embodiment of the present application.
  • FIG. 6 is a schematic diagram of the structure of a model training device according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of the structure of a communication device according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of the structure of a terminal according to an embodiment of the present application.
  • FIG. 9 is a schematic diagram of the structure of a network side device according to an embodiment of the present application.
  • FIG. 10 is a schematic diagram of the structure of a network side device according to an embodiment of the present application.
  • The terms "first", "second", etc. in the specification and claims of the present application are used to distinguish similar objects, and are not used to describe a specific order or sequence. It should be understood that the terms used in this way are interchangeable under appropriate circumstances, so that the embodiments of the present application can be implemented in an order other than those illustrated or described here, and the objects distinguished by "first" and "second" are generally of the same type, and the number of objects is not limited.
  • For example, the first object can be one or more.
  • "and/or" in the specification and claims represents at least one of the connected objects, and the character "/" generally represents that the objects associated with each other are in an "or" relationship.
  • LTE Long Term Evolution
  • LTE-A Long Term Evolution-Advanced
  • CDMA Code Division Multiple Access
  • TDMA Time Division Multiple Access
  • FDMA Frequency Division Multiple Access
  • OFDMA Orthogonal Frequency Division Multiple Access
  • SC-FDMA Single-carrier Frequency Division Multiple Access
  • NR New Radio
  • 6G 6th Generation
  • FIG. 1 shows a block diagram of a wireless communication system applicable to an embodiment of the present application.
  • the wireless communication system includes a terminal 11 and a network side device 12.
  • the terminal 11 can be a mobile phone, a tablet computer (Tablet Personal Computer), a laptop computer (Laptop Computer) or a notebook computer, a personal digital assistant (Personal Digital Assistant, PDA), a handheld computer, a netbook, an ultra-mobile personal computer (ultra-mobile personal computer, UMPC), a mobile Internet device (Mobile Internet Device, MID), augmented reality (augmented reality, AR)/virtual reality (virtual reality, VR) equipment, a robot, a wearable device (Wearable Device), a vehicle-mounted device (Vehicle User Equipment, VUE), a pedestrian terminal (Pedestrian User Equipment, PUE), a smart home (home equipment with wireless communication functions, such as refrigerators, televisions, washing machines or furniture, etc.), a game console, a personal computer (personal computer, PC), a teller
  • the network side device 12 may include an access network device or a core network device, wherein the access network device may also be referred to as a radio access network device, a radio access network (RAN), a radio access network function or a radio access network unit.
  • the access network device may include a base station, a WLAN access point or a WiFi node, etc.
  • the base station may be referred to as a Node B, an evolved Node B (evolved Node B, eNB), an access point, a base transceiver station (Base Transceiver Station, BTS), a radio base station, a radio transceiver, a basic service set (Basic Service Set, BSS), an extended service set (Extended Service Set, ESS), a home Node B, a home evolved Node B, a transmitting and receiving point (Transmitting Receiving Point, TRP) or other appropriate terms in the field. As long as the same technical effect is achieved, the base station is not limited to a specific technical vocabulary. It should be noted that in the embodiment of the present application, only the base station in the NR system is used as an example for introduction, and the specific type of the base station is not limited.
  • the core network equipment may include but is not limited to at least one of the following: core network node, core network function, mobility management entity (Mobility Management Entity, MME), access and mobility management function (Access and Mobility Management Function, AMF), session management function (Session Management Function, SMF), user plane function (User Plane Function, UPF), policy control function (Policy Control Function, PCF), policy and charging rules function (Policy and Charging Rules Function, PCRF), edge application server discovery function (Edge Application Server Discovery Function, EASDF), unified data management (Unified Data Management, UDM), unified data repository (Unified Data Repository, UDR), home subscriber server (Home Subscriber Server, HSS), centralized network configuration (CNC), network repository function (NRF), network exposure function (NEF), local NEF (L-NEF), binding support function (BSF), application function (AF), etc.
  • an embodiment of the present application provides a model training method 200, which can be executed by a first device.
  • the method can be executed by software or hardware installed on the first device.
  • the method includes the following steps.
  • S202: The first device receives a first message from the second device, where the first message is used to indicate that federated learning training is terminated or suspended.
  • the first device in each embodiment of the present application may be a client of federated learning, which may be a terminal, an access network device or a core network device, etc.
  • the core network device may include, for example, a model training logical function network element (Model Training Logical Function, MTLF), an analysis logical function network element (Analytics Logical Function, AnLF), etc.
  • the second device may be a server of federated learning, which may be a terminal, an access network device or a core network device, and the core network device may include, for example, MTLF, AnLF, etc.
  • a first device may receive a first message from a second device, the first message being used to indicate that federated learning training is terminated or suspended.
  • Termination of federated learning training means that the entire federated learning process ends for the first device and the second device.
  • Suspension of federated learning training means that the federated learning process is interrupted or ended for the first device.
  • S204 The first device performs a first operation based on the first message.
  • the first device may perform the first operation according to internal logic, and may also perform the first operation according to suggestion information in the first message, etc.
  • the suggestion information will be described in detail later.
  • 1) The server, i.e., the second device, performs a member selection process: the second device sends a request to an information storage device such as a network repository function (NF Repository Function, NRF) to obtain the capability information of each intelligent network element device such as an MTLF, determines from that capability information whether each intelligent network element device can participate in federated learning, and thereby determines the members for the federated learning.
  • 2) The second device sends information such as the initialization model of the federated learning to each client, i.e., each first device.
  • 3) Each first device, after local training, feeds back its intermediate results, such as gradients, to the second device.
  • 4) The second device aggregates the intermediate results and updates the federated learning model. After the steps of member selection, intermediate model distribution, local training, intermediate result feedback, and aggregation and updating of the global model have been repeated, the training can be stopped once the federated learning model converges.
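The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the method claimed in the application: the names `Candidate`, `EndNotice`, `select_members`, `aggregate`, `run_server` and `quadratic_client` are hypothetical, capability information is reduced to a single boolean, aggregation is a plain average of gradients, and members are selected once rather than in every round.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Vector = List[float]


@dataclass
class Candidate:
    """Hypothetical candidate client; `supports_fl` stands in for capability information."""
    name: str
    supports_fl: bool
    local_gradient: Callable[[Vector], Vector]


@dataclass
class EndNotice:
    """Hypothetical stand-in for the first message sent when training ends."""
    terminated: bool
    reason: str


def select_members(candidates: List[Candidate]) -> List[Candidate]:
    # Step 1: keep only candidates whose capability allows participation.
    return [c for c in candidates if c.supports_fl]


def aggregate(gradients: List[Vector]) -> Vector:
    # Step 4 (first half): average the gradients fed back in this round.
    return [sum(column) / len(gradients) for column in zip(*gradients)]


def run_server(weights: Vector, candidates: List[Candidate], lr: float = 0.5,
               tol: float = 1e-6, max_rounds: int = 100) -> Tuple[Vector, EndNotice]:
    members = select_members(candidates)  # selected once here, for brevity
    for _ in range(max_rounds):
        # Steps 2 and 3: distribute the current model, collect local gradients.
        grads = [m.local_gradient(weights) for m in members]
        # Step 4: aggregate and update the global model.
        step = aggregate(grads)
        weights = [w - lr * g for w, g in zip(weights, step)]
        if max(abs(g) for g in step) < tol:
            return weights, EndNotice(True, "model converged")
    return weights, EndNotice(True, "round threshold reached")


def quadratic_client(name: str, target: Vector, supports_fl: bool = True) -> Candidate:
    # Toy local objective 0.5 * ||w - target||^2, whose gradient is w - target.
    return Candidate(name, supports_fl,
                     lambda w: [wi - ti for wi, ti in zip(w, target)])
```

With two participating toy clients whose local optima are 1.0 and 3.0, `run_server([0.0], ...)` drives the global weight toward their average and returns an `EndNotice` that the server could then forward to each client as the first message.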
  • the server can send a first message to the client, and the first message is used to indicate the termination or suspension of the federated learning training.
  • the first device can be informed that the federated learning training is ended, and can perform a first operation based on the first message, for example, stopping the local federated learning training, deleting the local federated learning model, etc., to avoid occupying the client's space and computing power and improve the client's performance.
  • the embodiment of the present application defines a corresponding processing mechanism after the federated learning training is completed, so as to make the execution process of the entire federated learning more complete.
  • the first message may include at least one of the following:
  • Indication information of the termination of federated learning training, that is, the second device explicitly indicates the termination of federated learning training, so that the first device can perform the first operation according to its internal logic or the suggestion information in the following 7); the information from 2) to 7) below can implicitly indicate the termination of federated learning training; or, the termination of federated learning training can be implicitly indicated through a signaling name, etc.
  • the termination of federated learning training mentioned in the various embodiments of the present application may refer to the completion of federated learning training, for example, the parameters of the federated learning model converge, the loss function of the federated learning model converges, the number of federated learning training rounds reaches a number threshold, the duration of the federated learning training reaches a duration threshold, etc.
  • Indication information of the suspension of federated learning training, that is, the second device explicitly indicates the suspension of federated learning training, so that the first device can perform the first operation according to its internal logic or the suggestion information in the following 7); the following information 2) to 7) can implicitly indicate the suspension of federated learning training; or, the suspension of federated learning training can be implicitly indicated through a signaling name, etc.
  • Model ID or identification information of the federated learning model which can be used to uniquely identify the federated learning model.
  • the federated learning model can be a trained federated learning model or a model in which training is terminated during training.
  • Model information of the federated learning model which includes, for example, the network structure, weight parameters, input and output data of the federated learning model; the model information may also include download address information or storage address information of the federated learning model file.
  • the information of input and output data may be the category information of input data, which is used to indicate what kind of data should be input and what kind of output data should be output.
  • the federated learning model may be a trained federated learning model or a model in which training is terminated during training.
  • Gradient information of the federated learning model which can be transmitted in the form of a gradient file, such as the download address information or storage address information of the gradient file, or can be transmitted through the message, etc.
  • the gradient information can be the gradient information used by the final global model, wherein the gradient information of the final global model can be the sum of the gradients fed back by multiple clients in this round. Because the update of the global model in a round can be based on the multiple gradients fed back by multiple clients in that round, these gradients can be aggregated and then used for the update, or all of the gradients can be used for the update; accordingly, the gradient information that is sent can be a sum of these multiple gradients, or multiple pieces of gradient information, etc.
  • the federated learning model can be a federated learning model that has completed training, or it can be a model when training is terminated during training.
  • Task identification information which is used to indicate the task category for which the federated learning model is used, for example, indicating which type of task the federated learning model is used to perform.
  • Task identification information and the analysis identification described below have similar meanings and can be used interchangeably; task identification information can also be called data analysis task identification (which can be analytic ID) information.
  • Task association identification information (which can be correlation ID, subscription correlation ID), which is used to indicate the target federated learning task, for example, uniquely indicating the federated learning task (or federated learning model training task). This information can be generated when the task is generated, or it can be generated by the server when issuing the global task, etc.
  • the reason information is used to indicate the reason why the second device sends the first message.
  • the reason information can be used to indicate at least one of the following: the federated learning process has ended, the federated learning process is interrupted.
  • the reason information can further indicate the reason why the federated learning is interrupted, such as the accuracy of the second device is not enough to continue the federated learning, or the second device is removed.
  • the reason information can further indicate the reason why the federated learning is terminated, such as the federated learning model has converged, the number of iterations has reached a preset value, the training time has timed out, etc.
  • Suggestion information (also referred to as recommendation information), which is used to indicate an operation to be performed by the first device after receiving the first message.
  • the suggestion information may include at least one of the following:
  • Instruction information for updating the federated learning model which is used to instruct the first device to update its local federated learning model using the received gradient information, etc., and can implicitly inform the first device that it can save and use the federated learning model (e.g., it has the authority to use the federated learning model).
  • the federated learning model can be a federated learning model that has been trained, or a federated learning model when the training is terminated during the training process.
  • Instruction information for deleting the local federated learning model, used to indicate that the first device is to delete its local federated learning model, for example, indicating that the first device should not use the federated learning model, does not have the authority to use the federated learning model, etc.
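The contents of the first message listed above can be pictured as a record in which every field is optional. The following Python sketch is an assumption made for illustration only: the application does not define a concrete encoding, and the class, field and method names (`FirstMessage`, `Suggestion`, `is_valid`, `is_explicit`) are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Suggestion(Enum):
    """Suggestion information named in the text (hypothetical encoding)."""
    UPDATE_LOCAL_MODEL = "update_local_model"
    DELETE_LOCAL_MODEL = "delete_local_model"


@dataclass
class FirstMessage:
    """Hypothetical layout of the first message; every field is optional."""
    terminated: bool = False                      # explicit termination indication
    suspended: bool = False                       # explicit suspension indication
    model_id: Optional[str] = None                # identifies the federated learning model
    model_info: Optional[dict] = None             # structure, weights, download address, ...
    gradient_info: Optional[List[float]] = None   # gradients of the final global model
    task_id: Optional[str] = None                 # task (analytics) identification
    correlation_id: Optional[str] = None          # task association identification
    reason: Optional[str] = None                  # why the message was sent
    suggestions: List[Suggestion] = field(default_factory=list)

    def is_valid(self) -> bool:
        # "At least one of the following": some field must carry a value.
        return any(vars(self).values())

    def is_explicit(self) -> bool:
        # True when the end of training is indicated explicitly; otherwise
        # the other fields can only imply it.
        return self.terminated or self.suspended
```

For instance, `FirstMessage(model_id="m1")` would indicate the end of training only implicitly, whereas `FirstMessage(terminated=True, reason="model converged")` does so explicitly.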
  • the first device may perform the first operation according to internal logic, and may also perform the first operation according to suggestion information in the first message, etc.
  • the first operation performed by the first device includes at least one of the following:
  • the first device can obtain the trained federated learning model or gradient information and use the trained federated learning model or gradient information to update the local federated learning model, and use the model subsequently.
  • the first device can obtain the trained federated learning model and use the trained federated learning model.
  • if the first device does not need the trained federated learning model, the first device also knows that the local federated learning model no longer needs to be updated, and can delete the local federated learning model, thereby saving storage space.
  • the first device may stop local federated learning training to save computing power, etc.
  • the first operation performed by the first device may include at least one of the above 1) to 4), for example, the first device stops the local federated learning training, deletes the local federated learning model used in the previous local federated learning training, and
  • the federated learning model may be a federated learning model that has been trained, or a federated learning model that has been terminated during training.
  • the first device may also determine the first operation based on its internal logic and/or the first message. Specifically, when the first device receives the first message, it may determine the first operation based on the suggestion information in the first message, such as performing the first operation as suggested. Alternatively, when the first device receives the first message, it may determine the first operation based on the model information or gradient information in the first message, such as receiving a federated learning model, or updating a local model. Alternatively, when the first device receives the first message, it may determine the first operation based on the task identification information in the first message, such as using a trained federated learning model to perform a task.
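A client-side decision along these lines might look like the sketch below. It is a hypothetical illustration: the message is modeled as a plain dictionary, the operation names are invented labels, and the precedence (suggestion information first, then gradient or model information, then the client's own need for the model) as well as the choice to always stop local training are assumptions, since the text leaves the internal logic open.

```python
from typing import Any, Dict, List


def choose_first_operation(message: Dict[str, Any], needs_model: bool = True) -> List[str]:
    """Return the operations a client performs on receiving the first message."""
    # Assumption: local training is always stopped once the end is signaled.
    operations = ["stop_local_training"]
    suggestions = message.get("suggestions") or []
    if suggestions:
        # Follow the server's suggestion information when it is present.
        operations.extend(suggestions)
    elif message.get("gradient_info") is not None:
        # Gradient information allows the local model to be updated.
        operations.append("update_local_model")
    elif message.get("model_info") is not None:
        # Model information allows the trained model to be received.
        operations.append("receive_model")
    elif not needs_model:
        # Internal logic: nothing to receive and no further use for the model.
        operations.append("delete_local_model")
    return operations
```

For example, a message carrying only gradient information leads the client to stop training and update its local model, while an empty message received by a client that has no use for the model leads it to stop training and delete the local copy.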
  • the first operation includes updating the local federated learning model, and/or receiving the federated learning model, and after the first device receives the first message from the second device, the method further includes: the first device saving the federated learning model; wherein the federated learning model supports use by the first device.
  • saving the federated learning model may mean that the first device saves the federated learning model to the first device after updating the local federated learning model or after the first device receives the federated learning model.
  • the federated learning model supports use by the first device, which may mean that when other devices initiate model requests or task requests (such as a data analysis task) to the first device, the first device can use the federated learning model as the target model of the model request to feed back to other devices, or use the model to perform operations such as calculations and reasoning to generate task results corresponding to the task request, and feed back the results to other devices, etc.
  • the federated learning model can be a federated learning model that has been trained, or a federated learning model when training is terminated during training.
  • before the first device receives the first message from the second device in S202, the method also includes: the first device receives a federated learning training request message from the second device; the first device sends a response message to the second device, and the response message includes request information for obtaining a federated learning model.
  • the request information may be a request to obtain model information or gradient information of the federated learning model, etc., and is used to request the second device to send a global or aggregated federated learning model to the first device.
  • the response message also includes a task association identifier, which is used to uniquely identify this model training task.
  • the training request message is used to request the first device to participate in federated learning.
  • the training request message includes at least one of task identification information, task association identification information, model information, model gradient information, model identification information, etc.
  • the training request message may be to instruct the first device to use the model corresponding to the federated learning training and the data that the first device can collect to perform local federated learning training.
  • the request information may include at least one of the following:
  • First request information, where the first request information is used to request to obtain a federated learning model. The federated learning model may be a federated learning model whose training has been completed, or a federated learning model obtained when the federated learning is interrupted and stopped; the same applies in the subsequent steps.
  • Second request information where the second request information is used to request model information of the federated learning model (the model information includes network architecture information, download address information, etc.).
  • Third request information where the third request information is used to request gradient information of the federated learning model.
  • the federated learning model may be a federated learning model that has completed training, or a federated learning model in which training is terminated during training.
  • Before the first device receives the first message from the second device in S202, the method further includes: the first device sends a second message to the second device after completing the local federated learning training, where the second message includes the training result of the local federated learning training and the request information for obtaining the federated learning model.
  • the training result can be the intermediate model information or intermediate gradient information of the federated learning model.
  • the second message can also include information such as the federated learning task identifier and the identifier of the federated learning model.
  • the second message may be generated by the first device after any round of federated learning training is completed.
  • the request information included in the second message may include at least one of the following:
  • First request information where the first request information is used to request to obtain a federated learning model.
  • Second request information where the second request information is used to request model information of the federated learning model (the model information includes network architecture information, download address information, etc.).
  • Third request information where the third request information is used to request gradient information of the federated learning model.
  • the embodiment includes the following steps.
  • Step 0 This step 0 can be divided into the following steps 0a and 0b.
  • Step 0a The federated learning consumer (such as AnLF) sends a federated learning model request to the federated learning server (such as MTLF).
  • the federated learning model request can be carried by the Nnwdaf_MLModelProvision_Subscribe message.
  • the federated learning model request is used to request a federated learning model (hereinafter referred to as the model) to complete its own tasks.
  • the server determines whether to trigger federated learning based on local configuration or the request of the federated learning consumer, and determines to initialize federated learning and member selection.
  • Step 0b For devices without federated learning server capabilities (such as devices with only client capabilities, or devices without federated learning capabilities), a federated learning request can also be sent to a device with federated learning server capabilities, requesting federated learning to generate the required model.
  • the device can also be regarded as requesting to obtain the trained federated learning model by sending a federated learning request to the server.
  • the device can also participate in the federated learning process, so the server can also send the first message to the device.
  • the federated learning request in this step may include at least one of the following information:
  • Analytics ID, which is used to indicate the task type for which the federated learning process is performed. It can also be called the data analysis task ID, and it is the same as the task identification information described above.
  • Model ID, the identifier of the federated learning model, which is used to uniquely identify the federated learning model.
  • Model filter information, which is used to limit the scope of the federated learning process, such as the regional scope, the time range, Single Network Slice Selection Assistance Information (S-NSSAI), Data Network Name (DNN), etc.
  • Target of the model (optional), which can be used to specify the target of the federated learning process, such as one or more specific terminals, all terminals within a certain range, or all terminals that meet certain conditions.
  • Model reporting information (optional), which can be used to indicate the reporting information of the generated federated learning model information, such as reporting time (start time, deadline, etc.) and reporting conditions (periodic trigger, event trigger, etc.).
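  • As an illustration only, the request fields listed above could be grouped as in the following minimal sketch. The class name, field names, and placeholder values are hypothetical and are not defined by the present application; the check simply mirrors the "at least one of the following" wording.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FederatedLearningRequest:
    """Hypothetical container for the fields of a federated learning request."""
    analytics_id: Optional[str] = None                 # task type the model is trained for
    model_id: Optional[str] = None                     # unique identifier of the model
    model_filter: dict = field(default_factory=dict)   # e.g. region, time range, S-NSSAI, DNN
    model_target: Optional[list] = None                # optional: terminals the process targets
    reporting: dict = field(default_factory=dict)      # optional: reporting time and conditions

    def is_valid(self) -> bool:
        # "At least one of the following": some field must be populated.
        return any([self.analytics_id, self.model_id, self.model_filter,
                    self.model_target, self.reporting])


request = FederatedLearningRequest(
    analytics_id="example-analytics-id",   # placeholder value
    model_filter={"s_nssai": "example-slice", "dnn": "example-dnn"},
)
```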
  • Step 1 The device with federated learning server capabilities determines that federated learning is to be performed and selects members. It can first formulate a strategy for the federated learning, for example, specifying after how many rounds of training the status information is to be collected, or after how many rounds of training the training status information is to be collected.
  • the server finds out the devices that are willing to participate in federated learning and meet the requirements of federated learning by searching the capability information and willingness information of other devices. For example, when the server is a MTLF network element, the MTLF network element searches for network elements in the NRF to find out other network elements (such as other MTLFs) that meet the training requirements of this federated learning.
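  • The member selection described above can be sketched as a simple filter over capability and willingness information. This is a minimal sketch under stated assumptions: the `Candidate` record, its fields, and the device names are hypothetical, and the sketch does not model an actual NRF discovery query.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record of what the server learns about another device."""
    name: str
    supports_client_role: bool   # capability information
    willing: bool                # willingness information
    analytics_ids: tuple         # task types the device can train for


def select_members(candidates, analytics_id):
    """Keep devices that are capable, willing, and support the requested task type."""
    return [c.name for c in candidates
            if c.supports_client_role and c.willing and analytics_id in c.analytics_ids]


candidates = [
    Candidate("mtlf-a", True, True, ("task-1",)),
    Candidate("mtlf-b", True, False, ("task-1",)),   # capable but unwilling
    Candidate("mtlf-c", True, True, ("task-2",)),    # willing but wrong task type
]
members = select_members(candidates, "task-1")
```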
  • Step 2 A device with federated learning server capabilities (referred to as server) and a device with federated learning client capabilities (referred to as client) interact to perform federated learning training, which may specifically include the following steps.
  • Step 2a The server sends a training request for federated learning training to the client, such as through a Nnwdaf_MLModelTraining_Subscribe message, requesting the client to participate in federated learning and perform local training of federated learning based on the global model and the client's local data.
  • the training request may include at least one of the following information:
  • Analytics ID which is used to indicate the type of task requested for the analytics ID and the type of task the federated learning model is used for.
  • Model ID, the identifier of the federated learning model, which is used to uniquely identify the federated learning model.
  • Task correlation identification information (Correlation ID), which is used to uniquely indicate the federated learning task.
  • Model initialization information which is used to indicate model information and configuration information in this round of federated learning.
  • the model information describes the model itself, such as the algorithm, architecture, parameters and hyperparameters of which the federated learning model is composed, or it is the federated learning model itself, such as the model file or the address information of the model file.
  • the configuration information in this round of federated learning (also called guideline information) can be used to determine how to perform training in the local training process of this round of federated learning, such as the number of rounds of local training to be performed, the type of data to be used, the maximum training time, and other information.
  • Step 2b After receiving the training request for federated learning training, the client can feed back to the server whether it will participate in the federated learning training, for example through the Nnwdaf_MLModelTraining_Subscribe Response message.
  • the feedback may include: whether the client participates in the federated learning training, indication information of a request to obtain the final global model or the updated global model, the analytics ID, the task association identification information, etc.
  • the updated global model refers to the aggregate model generated by the federated learning server when the federated learning is interrupted.
  • the indication information of the request to obtain the final global model or the updated global model is used to indicate that the client wants to obtain the final global model or the updated global model information, such as the model file of the final model or updated global model, the download address information or storage address information of the model file, or the updated gradient of the federated learning model.
  • the updated global model information can help the client generate and obtain the final global model information.
  • Step 3 During each iteration, the server sends model information and/or model update messages to the client.
  • the updated global model or the updated gradient information of the global model can be sent in a manner such as step 2a, or different signaling can be used to inform the client, and the client can update its local model for the next round of local training. It can include identification information such as task association identification information for indicating federated training, model information, and/or gradient information.
  • Step 4 The client sends a request to obtain data to the data source (data source, referring to the network element that can provide data) in its area or to which it belongs to collect data for local federated learning.
  • the data is provided by different network elements, such as User Plane Function (UPF), Operation Administration and Maintenance (OAM), Unified Data Management (UDM), etc.
  • the request to obtain data and the corresponding data delivery can be carried by the following messages: the Ndccf_DataManagement_Subscribe message, the Nnf_EventExposure_Subscribe message, and/or the Ndccf_DataManagement_Notify/Nnf_EventExposure_Notify message, etc.
  • the client uses the acquired data and model information to train the local model, generate intermediate results, and feed them back to the server for the server to aggregate and update the global model.
  • Step 5 After the client completes local training, it feeds back the training results of the local training to the server. The server can then use the training results to update the global model. In this step, interim model information or gradient information can be fed back through Nnwdaf_MLModelTraining_Notify.
  • the message sent by the client to the server may include at least one of the following information:
  • Result information which is used to indicate the training result of the local training, which can be an intermediate model or an updated gradient, etc.
  • Willingness information (Consent info).
  • Status information (Status info).
  • Training status information (acceleration info).
  • the willingness information is used to indicate whether the member is still willing to participate in the next round of federated learning.
  • Status information is used to describe the client status information after the local training of this round of federated learning is completed.
  • Specific status information can be member load (such as NF load); member resource usage (such as resource usage: CPU, memory, disk; GPU); member capability information (such as whether it can participate in federated learning, what kind of federated learning to participate in, etc.).
  • Training status information used to describe the client's training status information during local training of this round of federated learning.
  • the training status refers to the performance of the model after local training on the client's local data. It can be expressed as an evaluation metric and the corresponding value, such as the accuracy of the model and its specific value (80%), or the mean absolute error (MAE) and its value (0.1).
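  • To make Step 5 concrete, the following minimal sketch shows how a server might combine the reported results and read the willingness information. The unweighted averaging rule is an assumption chosen for illustration; the present application does not specify the aggregation algorithm, and the report keys (`result`, `consent`, `training_status`) are hypothetical names.

```python
def aggregate(global_weights, client_results):
    """Average the intermediate weight vectors reported by clients (unweighted mean, assumed)."""
    if not client_results:
        return list(global_weights)   # nothing reported: keep the current global model
    n = len(client_results)
    return [sum(ws[i] for ws in client_results) / n for i in range(len(global_weights))]


# Hypothetical client reports carrying result, willingness, and training status information.
reports = [
    {"result": [1.0, 2.0], "consent": True, "training_status": {"accuracy": 0.80}},
    {"result": [3.0, 4.0], "consent": False, "training_status": {"mae": 0.1}},
]
new_global = aggregate([0.0, 0.0], [r["result"] for r in reports])
# Only clients that are still willing are kept for the next round.
next_round_members = [i for i, r in enumerate(reports) if r["consent"]]
```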
  • Step 6 The server aggregates the model and determines that the training of the model can be stopped, or the server determines that the model training is terminated based on the training termination condition.
  • the server decides to stop model training.
  • the training end condition includes at least one of the following: all model parameters converge, the model loss function converges, the parameters of the model trained locally by the client converge, the loss function of the model trained locally by the client converges, the number of training rounds reaches the round number threshold, the number of training times reaches the number threshold, and the training duration reaches the duration threshold.
  • These thresholds and convergence conditions can be preconfigured internally by the server, etc.
  • the server decides to terminate the model training.
  • the training termination conditions may include one or more of the following: reduced computing power of the client, excessive client load, excessive resource usage, etc.
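  • The distinction in Step 6 between ending training normally and terminating it can be sketched as follows. This is a sketch under stated assumptions: the function name, the threshold values, and the choice of loss convergence and client load as the checked conditions are illustrative picks from the conditions listed above, not values given by the present application.

```python
def decide(round_index, losses, client_loads, *, max_rounds=100, tol=1e-3, max_load=0.9):
    """Return "stop", "terminate", or "continue" for the current round."""
    # Termination condition: a client is overloaded.
    if any(load > max_load for load in client_loads):
        return "terminate"
    # End conditions: the round threshold is reached or the loss has converged.
    if round_index >= max_rounds:
        return "stop"
    if len(losses) >= 2 and abs(losses[-1] - losses[-2]) < tol:
        return "stop"
    return "continue"
```

  • For example, `decide(5, [0.5, 0.4], [0.95])` yields a termination because one client's load exceeds the assumed limit, whereas the same call with a load of 0.2 lets training continue.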
  • Step 7 The server sends a termination message to the client, informing the client participating in the federated learning that the federated learning has been terminated; or the server sends a suspension message to the client, informing the client participating in the federated learning that the federated learning has been suspended.
  • the server can send indication information of the termination or suspension of training to the client through the Nnwdaf_MLModelTraining_unsubscribe message or through other signaling messages; it can also send the final result of the federated learning training, such as the final model or the final gradient; it can also send suggestion information to inform the client what operations can be performed with respect to the federated learning.
  • whether to send the model information can be determined by the internal logic of the server, or the model information can be requested by the client during the federated learning interaction, such as in step 2b or step 5; the request may also have been received in step 0b.
  • Regarding a message requesting to obtain a federated learning model: specifically, in step 2b and/or step 5, a request message from a device such as a client is received. For example, in step 2b, the first device sends a response message to the second device, and the response message includes request information for obtaining a trained federated learning model. In step 5, the first device sends a second message to the second device after completing local federated learning training.
  • the second message includes the training results of local training of federated learning and the request information for obtaining a trained federated learning model.
  • the server can also determine the content carried by the termination message before sending the termination message. Specifically, the server can determine the termination message when receiving a request message from the client, such as when the request message includes request model information, determining that the termination message carries the model information of the federated learning model; and when the request message includes request model gradient information, determining that the termination message carries the gradient information of the federated learning model.
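  • The rule above, under which the termination message carries model information or gradient information depending on what the client requested, can be sketched as follows. The function name, dictionary keys, and placeholder values are hypothetical and only illustrate the selection logic.

```python
def build_termination_message(correlation_id, requested, model_info=None, gradient_info=None,
                              reason="federated learning process ends"):
    """Assemble a termination message carrying only what the client asked for."""
    message = {"correlation_id": correlation_id, "training_terminated": True, "reason": reason}
    if "model_info" in requested and model_info is not None:
        message["model_info"] = model_info
    if "gradient_info" in requested and gradient_info is not None:
        message["gradient_info"] = gradient_info
    return message


msg = build_termination_message(
    "task-42",                                   # placeholder task association identifier
    requested={"gradient_info"},
    model_info={"download_address": "<address of model file>"},
    gradient_info=[0.1, -0.2],
)
```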
  • the termination (or suspension) message sent by the server to the client may include at least one of the following:
  • Task identification information where the task identification information is used to indicate the task category that the federated learning model is used for, for example, indicating which type of task the federated learning model is used for.
  • Model ID The model identifier (Model ID) or identification information of the federated learning model, which can be used to uniquely identify the federated learning model.
  • Task association identification information where the task association identification information is used to indicate the target federated learning task, for example, to uniquely indicate the federated learning task.
  • Indication information of the termination (or suspension) of the federated learning training, i.e., the second device explicitly indicates that the federated learning training is terminated (or suspended), so that the first device can perform the first operation according to its internal logic or the following suggestion information.
  • Model information of the federated learning model which includes, for example, the network structure, weight parameters, input and output data of the federated learning model; the model information may also include download address information or storage address information of the federated learning model file.
  • Gradient information of the federated learning model which may be transmitted in the form of a gradient file, and the gradient information may be the gradient information used by the final global model.
  • the server can provide the client with the final model information, or the gradient information of the final update.
  • the client can use the gradient information to update its local federated learning model (the federated learning model used in the previous federated learning) to obtain the final global model.
  • the final global model refers to the aggregated model generated by the federated learning server after the federated learning process is completed.
  • the final global model can be at least one of the following: a model file (containing the model's network structure, weight parameters, input and output data, etc.); download address information or storage address information of the model file (used to indicate the storage address of the model file, or where the model file can be downloaded from).
  • the gradient information can be delivered in the form of a gradient file, which contains the gradient information used for model update.
  • Reason information where the reason information is used to indicate the reason why the server sends the termination (or suspension) message.
  • Suggestion information where the suggestion information is used to indicate an operation to be performed by the first device after receiving the first message.
  • the suggestion information may include at least one of the following:
  • Instruction information for updating the federated learning model which is used to instruct the first device to update its local federated learning model using the received gradient information, etc., and can implicitly inform the first device that it can save and use the federated learning model (e.g., it has the authority to use the federated learning model).
  • Instruction information for deleting the local federated learning model, used to indicate that the first device is to delete its local federated learning model, for example, indicating that the first device should not use the federated learning model, does not have the authority to use the federated learning model, etc.
  • Steps 8a-8c After receiving the termination message, the client performs an action. Specifically, after learning that the federated learning training is finished, the client can decide the subsequent actions based on its internal logic, or on the suggestion information and model information sent by the server. The subsequent actions can be: updating the local model to the global model; receiving the final global model and using it later; deleting the local model used in the previous training; stopping training, etc.
  • the client updates the local model to obtain the final model, and can use the final model later.
  • the client receives the gradient information of the model update (as in step 7), and uses the gradient information to update the local model, thereby obtaining the final model.
  • the model can be used later, such as sending the model to other devices for certain data analysis tasks.
  • the client saves the final model and can use the final model later.
  • the client receives the final model, such as in step 7, receiving the model file and/or the download address information or storage address information of the model file, to obtain the final model.
  • the client deletes the local model. Specifically, the client deletes the local model trained by the federated learning. This may be because the client did not initiate a request before, and therefore did not receive the final model of the federated learning. It may also be because the model will not be used in the future, so the client chooses to delete the model. If the client still wants to obtain the final model of the federated learning training, it can re-initiate a normal model acquisition request to the server, such as sending it through Nnwdaf_MLModelProvision_Subscribe and Nnwdaf_MLModelProvision_Notify messages.
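  • The three client-side outcomes described above (update, save, delete) can be sketched as a single handler. This is a minimal sketch under stated assumptions: the message keys and the `suggestion` value are hypothetical, and the plain gradient step used for the update is an assumed rule, since the present application does not specify how the gradient information is applied.

```python
def handle_termination(local_weights, message, learning_rate=1.0):
    """Return (action, model) for a termination message; model is None after deletion."""
    if message.get("suggestion") == "delete":
        # The server suggests deleting the local model.
        return "deleted", None
    if "gradient_info" in message and local_weights is not None:
        # Update case: apply the final gradient to the local model (plain gradient step, assumed).
        updated = [w - learning_rate * g
                   for w, g in zip(local_weights, message["gradient_info"])]
        return "updated", updated
    if "model_info" in message:
        # Save case: keep the final model (or its address) delivered by the server.
        return "saved", message["model_info"]
    # Delete case: nothing was delivered, so drop the local model to free resources.
    return "deleted", None
```

  • A client that receives neither gradient information nor model information ends in the delete case and, as noted above, can still re-initiate a normal model acquisition request later.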
  • Step 9 After the server completes the federated learning model training, it sends the model information to the consumer. This step has no sequence relationship with steps 7-8, that is, this step can also occur before step 7.
  • the model information may include at least one of the following:
  • Model file including the model’s network structure, weight parameters, input and output data, etc.
  • Analytics ID which is used to indicate that the federated learning model is suitable for a certain type of reasoning task.
  • Model filter information which is used to indicate the reporting information of the generated federated learning model information, such as reporting time (start time, deadline, etc.) and reporting conditions (periodic trigger, event trigger, etc.).
  • Valid region information indicating the region to which the federated learning model is applicable.
  • Valid time information indicating the time when the federated learning model is applicable.
  • the server can send model information via the following messages: Nnwdaf_MLModelProvision_Notify or Nnwdaf_MLModelInfo_Response.
  • Fig. 4 is a schematic diagram of a flow chart of a model training method according to an embodiment of the present application, which can be applied to a second device. As shown in Fig. 4, the method 400 includes the following steps.
  • the second device sends a first message to the first device, where the first message is used to indicate that the federated learning training is terminated or suspended; wherein the first device includes a client of federated learning, and the second device includes a server of federated learning.
  • the server may send a first message to the client, where the first message is used to indicate that the federated learning training is terminated or suspended.
  • the first device may thus be informed that the federated learning training is finished, and may perform a first operation based on the first message, such as stopping the local federated learning training or deleting the local federated learning model, thereby avoiding unnecessary occupation of the client's storage space and computing power and improving the client's performance.
  • the first message includes at least one of the following:
  • Task identification information where the task identification information is used to indicate the task category for which the federated learning model is used.
  • Task association identification information where the task association identification information is used to indicate a target federated learning task.
  • Reason information where the reason information is used to indicate the reason why the second device sends the first message.
  • the reason information is used to indicate at least one of the following: the federated learning process ends; the federated learning process is interrupted.
  • Suggestion information, where the suggestion information is used to indicate an operation to be performed by the first device after receiving the first message.
  • the suggestion information is used to instruct the first device to perform at least one of the following after receiving the first message: 1) updating the local federated learning model; 2) receiving the federated learning model; 3) deleting the local federated learning model used in the previous local federated learning training; 4) stopping the local federated learning training.
  • Before the second device sends the first message to the first device, the method further includes: the second device sends a federated learning training request message to the first device; and the second device receives a response message from the first device, where the response message includes request information for obtaining a federated learning model.
  • Before the second device sends the first message to the first device, the method further includes: the second device receives a second message from the first device, where the second message includes the training result of the local federated learning training of the first device and request information for obtaining the federated learning model.
  • the request information includes at least one of the following: 1) first request information, the first request information is used to request to obtain a federated learning model; 2) second request information, the second request information is used to request to obtain model information of the federated learning model; 3) third request information, the third request information is used to request to obtain gradient information of the federated learning model.
  • the second device after receiving the request information for obtaining the federated learning model, the second device sends the first message according to the request information for obtaining the federated learning model, wherein the first message includes at least one of the following: model information of the federated learning model and gradient information of the federated learning model.
  • the federated learning model includes the final global model or the updated global model.
  • the model training method provided in the embodiment of the present application can be executed by a model training device.
  • the model training device executing the model training method is taken as an example to illustrate the model training device provided in the embodiment of the present application.
  • Fig. 5 is a schematic diagram of the structure of a model training apparatus according to an embodiment of the present application.
  • the apparatus can be applied to a first device.
  • the apparatus 500 includes the following modules.
  • the receiving module 502 may be configured to receive a first message from a second device, wherein the first message is used to indicate that the federated learning training is terminated or suspended.
  • the processing module 504 can be used to perform a first operation based on the first message; wherein the first device includes a federated learning client, and the second device includes a federated learning server.
  • the server may send a first message to the client, and the first message is used to indicate that the federated learning training is terminated or suspended.
  • the first device can thus be informed that the federated learning training is finished, and can perform a first operation based on the first message, for example, stopping the local federated learning training or deleting the local federated learning model, thereby avoiding unnecessary occupation of the client's storage space and computing power and improving the client's performance.
  • the first message includes at least one of the following:
  • Task identification information where the task identification information is used to indicate the task category for which the federated learning model is used.
  • Task association identification information where the task association identification information is used to indicate a target federated learning task.
  • Reason information where the reason information is used to indicate the reason why the second device sends the first message.
  • Suggestion information, where the suggestion information is used to indicate an operation to be performed by the first device after receiving the first message.
  • the first operation includes at least one of the following:
  • the first operation includes updating the local federated learning model, and/or receiving the federated learning model, and the processing module 504 is also used to save the federated learning model; wherein, the federated learning model supports use by the first device.
  • the receiving module 502 is also used to receive a federated learning training request message from the second device; the device also includes a sending module, used to send a response message to the second device, and the response message includes request information for obtaining a federated learning model.
  • the apparatus further includes a sending module for sending a second message to the second device after completing the local federated learning training, wherein the second message includes the training results of the local federated learning training and request information for obtaining the federated learning model.
  • the request information includes at least one of the following: 1) first request information, the first request information is used to request to obtain a federated learning model; 2) second request information, the second request information is used to request to obtain model information of the federated learning model; 3) third request information, the third request information is used to request to obtain gradient information of the federated learning model.
  • For the apparatus 500 according to the embodiment of the present application, reference may be made to the process of the method 200 of the corresponding embodiment of the present application. The units/modules in the apparatus 500 and the other operations and/or functions described above are respectively intended to implement the corresponding processes in the method 200 and can achieve the same or equivalent technical effects. For the sake of brevity, they are not repeated here.
  • the model training device in the embodiment of the present application can be an electronic device, such as an electronic device with an operating system, or a component in an electronic device, such as an integrated circuit or a chip.
  • the electronic device can be a terminal, or it can be a device other than a terminal.
  • the terminal can include but is not limited to the types of terminals 11 listed above, and other devices can be servers, network attached storage (NAS), etc., which are not specifically limited in the embodiment of the present application.
  • Fig. 6 is a schematic diagram of the structure of a model training apparatus according to an embodiment of the present application.
  • the apparatus can be applied to a second device.
  • the apparatus 600 includes the following modules.
  • the sending module 602 can be used to send a first message to a first device, where the first message is used to indicate that the federated learning training is terminated or suspended; wherein the first device includes a federated learning client, and the second device includes a federated learning server.
  • the apparatus 600 may include a processing module and the like.
  • the server may send a first message to the client, and the first message is used to indicate that the federated learning training is terminated or suspended.
  • in this way, the first device learns that the federated learning training has ended and can perform a first operation based on the first message, for example, stopping the local federated learning training or deleting the local federated learning model, which avoids occupying the client's space and computing power and improves the client's performance.
  • the first message includes at least one of the following:
  • Task identification information where the task identification information is used to indicate the task category for which the federated learning model is used.
  • Task association identification information where the task association identification information is used to indicate a target federated learning task.
  • Reason information where the reason information is used to indicate the reason why the second device sends the first message.
  • Suggestion information, where the suggestion information is used to indicate the operation that the first device performs after receiving the first message.
  • the sending module 602 is also used to send a federated learning training request message to the first device; the device also includes a receiving module, used to receive a response message from the first device, and the response message includes request information for obtaining a federated learning model.
  • the apparatus further includes a receiving module for receiving a second message from the first device, wherein the second message includes a training result of local training of federated learning of the first device and request information for obtaining a federated learning model.
  • the request information includes at least one of the following: 1) first request information, the first request information is used to request to obtain a federated learning model; 2) second request information, the second request information is used to request to obtain model information of the federated learning model; 3) third request information, the third request information is used to request to obtain gradient information of the federated learning model.
  • the process of the method 400 corresponding to the embodiment of the present application can be referred to, and the various units/modules in the device 600 and the above-mentioned other operations and/or functions are respectively for implementing the corresponding processes in the method 400, and can achieve the same or equivalent technical effects. For the sake of brevity, they will not be repeated here.
  • the model training device provided in the embodiment of the present application can implement the various processes implemented by the method embodiments of Figures 2 to 4 and achieve the same technical effects. To avoid repetition, they will not be described here.
  • an embodiment of the present application further provides a communication device 700, including a processor 701 and a memory 702, wherein the memory 702 stores a program or instruction that can be run on the processor 701.
  • when the communication device 700 is a terminal, the program or instruction, when executed by the processor 701, implements the various steps of the above-mentioned model training method embodiment and can achieve the same technical effect.
  • when the communication device 700 is a network side device, the program or instruction, when executed by the processor 701, implements the various steps of the above-mentioned model training method embodiment and can achieve the same technical effect. To avoid repetition, it will not be repeated here.
  • the embodiment of the present application also provides a terminal, including a processor and a communication interface, wherein the communication interface is used to receive a first message from a second device, wherein the first message is used to indicate that the federated learning training is terminated or suspended; the processor is used to perform a first operation based on the first message; wherein the terminal includes a client of the federated learning, and the second device includes a server of the federated learning.
  • the communication interface is used to send a first message to the first device, wherein the first message is used to indicate that the federated learning training is terminated or suspended; wherein the first device includes a client of the federated learning, and the terminal includes a server of the federated learning.
  • the terminal embodiment corresponds to the above-mentioned terminal side method embodiment, and each implementation process and implementation mode of the above-mentioned method embodiment can be applied to the terminal embodiment and can achieve the same technical effect.
  • Figure 8 is a schematic diagram of the hardware structure of a terminal implementing the embodiment of the present application.
  • the terminal 800 includes but is not limited to at least some of the following components: a radio frequency unit 801, a network module 802, an audio output unit 803, an input unit 804, a sensor 805, a display unit 806, a user input unit 807, an interface unit 808, a memory 809, and a processor 810.
  • the terminal 800 may also include a power source (such as a battery) for supplying power to various components.
  • the power supply can be logically connected to the processor 810 through the power management system, so that the power management system can manage charging, discharging, power consumption and other functions.
  • the terminal structure shown in FIG8 does not constitute a limitation on the terminal.
  • the terminal may include more or fewer components than shown in the figure, or combine certain components, or arrange the components differently, which will not be described in detail here.
  • the input unit 804 may include a graphics processing unit (GPU) 8041 and a microphone 8042, and the GPU 8041 processes the image data of the static picture or video obtained by the image capture device (such as a camera) in the video capture mode or the image capture mode.
  • the display unit 806 may include a display panel 8061, and the display panel 8061 may be configured in the form of a liquid crystal display, an organic light emitting diode, etc.
  • the user input unit 807 includes at least one of a touch panel 8071 and other input devices 8072.
  • the touch panel 8071 is also called a touch screen.
  • the touch panel 8071 may include two parts: a touch detection device and a touch controller.
  • Other input devices 8072 may include, but are not limited to, a physical keyboard, function keys (such as a volume control key, a switch key, etc.), a trackball, a mouse, and a joystick, which will not be repeated here.
  • the radio frequency unit 801 after receiving downlink data from the network side device, can transmit the data to the processor 810 for processing; in addition, the radio frequency unit 801 can send uplink data to the network side device.
  • the radio frequency unit 801 includes but is not limited to an antenna, an amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, etc.
  • the memory 809 can be used to store software programs or instructions and various data.
  • the memory 809 may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, wherein the first storage area may store an operating system, an application program or instruction required for at least one function (such as a sound playback function, an image playback function, etc.), etc.
  • the memory 809 may include a volatile memory or a non-volatile memory, or the memory 809 may include both volatile and non-volatile memories.
  • the non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory.
  • the volatile memory may be a random access memory (RAM), a static random access memory (SRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), a double data rate synchronous dynamic random access memory (DDRSDRAM), an enhanced synchronous dynamic random access memory (ESDRAM), a synchronous link dynamic random access memory (SLDRAM), or a direct Rambus random access memory (DRRAM).
  • the memory 809 in the embodiment of the present application includes but is not limited to these and any other suitable types of memory.
  • the processor 810 may include one or more processing units; optionally, the processor 810 integrates an application processor and a modem processor, wherein the application processor mainly processes operations related to an operating system, a user interface, and application programs, and the modem processor mainly processes wireless communication signals, such as a baseband processor. It is understandable that the modem processor may not be integrated into the processor 810.
  • the radio frequency unit 801 may be used to receive a first message from a second device, the first message being used to indicate that the federated learning training is terminated or suspended; the processor 810 may be used to perform a first operation based on the first message; the terminal includes a client of the federated learning, and the second device includes a server of the federated learning. Alternatively, the radio frequency unit 801 is used to send a first message to a first device, where the first message is used to indicate that federated learning training is terminated or suspended; wherein the first device includes a client of federated learning, and the terminal includes a server of federated learning.
  • the server may send a first message to the client, and the first message is used to indicate that the federated learning training is terminated or suspended.
  • in this way, the first device learns that the federated learning training has ended and can perform a first operation based on the first message, for example, stopping the local federated learning training or deleting the local federated learning model, which avoids occupying the client's space and computing power and improves the client's performance.
  • the terminal 800 provided in the embodiment of the present application can also implement the various processes of the above-mentioned model training method embodiment and can achieve the same technical effect. To avoid repetition, it will not be repeated here.
  • the embodiment of the present application also provides a network-side device, including a processor and a communication interface, wherein the communication interface is used to receive a first message from a second device, wherein the first message is used to indicate that the federated learning training is terminated or suspended; the processor is used to perform a first operation based on the first message; wherein the network-side device includes a client of federated learning, and the second device includes a server of federated learning.
  • the communication interface is used to send a first message to the first device, wherein the first message is used to indicate that the federated learning training is terminated or suspended; wherein the first device includes a client of federated learning, and the network-side device includes a server of federated learning.
  • This network side device embodiment corresponds to the above-mentioned network side device method embodiment.
  • Each implementation process and implementation method of the above-mentioned method embodiment can be applied to this network side device embodiment and can achieve the same technical effect.
  • the embodiment of the present application also provides a network side device.
  • the network side device 900 includes: an antenna 91, a radio frequency device 92, a baseband device 93, a processor 94, and a memory 95.
  • the antenna 91 is connected to the radio frequency device 92.
  • the radio frequency device 92 receives information through the antenna 91 and sends the received information to the baseband device 93 for processing.
  • the baseband device 93 processes the information to be sent and sends it to the radio frequency device 92.
  • the radio frequency device 92 processes the received information and sends it out through the antenna 91.
  • the method executed by the network-side device in the above embodiment may be implemented in the baseband device 93, which includes a baseband processor.
  • the baseband device 93 may include, for example, at least one baseband board, on which a plurality of chips are arranged, as shown in FIG. 9 , wherein one of the chips is, for example, a baseband processor, which is connected to the memory 95 through a bus interface to call a program in the memory 95 and execute the network device operations shown in the above method embodiment.
  • the network side device may also include a network interface 96, which is, for example, a common public radio interface (CPRI).
  • the network side device 900 of the embodiment of the present invention also includes: instructions or programs stored in the memory 95 and executable on the processor 94.
  • the processor 94 calls the instructions or programs in the memory 95 to execute the methods executed by the modules shown in Figure 5 or Figure 6, and achieves the same technical effect. To avoid repetition, it will not be repeated here.
  • the embodiment of the present application further provides a network side device.
  • the network side device 1000 includes: a processor 1001, a network interface 1002, and a memory 1003.
  • the network interface 1002 is, for example, a common public radio interface (CPRI).
  • the network side device 1000 of the embodiment of the present application further includes: instructions or programs stored in the memory 1003 and executable on the processor 1001.
  • the processor 1001 calls the instructions or programs in the memory 1003 to execute the methods executed by the modules shown in FIG. 5 or FIG. 6 , and achieves the same technical effect. To avoid repetition, it will not be described here.
  • An embodiment of the present application also provides a readable storage medium, on which a program or instruction is stored.
  • when the program or instruction is executed by a processor, the various processes of the above-mentioned model training method embodiment are implemented, and the same technical effect can be achieved. To avoid repetition, it will not be repeated here.
  • the processor is the processor in the terminal described in the above embodiment.
  • the readable storage medium may be non-volatile or non-transitory.
  • the readable storage medium includes a computer-readable storage medium, such as a computer read-only memory ROM, a random access memory RAM, a magnetic disk or an optical disk.
  • An embodiment of the present application further provides a chip, which includes a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement the various processes of the above-mentioned model training method embodiment, and can achieve the same technical effect. To avoid repetition, it will not be repeated here.
  • the chip mentioned in the embodiments of the present application can also be called a system-level chip, a system chip, a chip system or a system-on-chip chip, etc.
  • the embodiments of the present application further provide a computer program/program product, which is stored in a storage medium and is executed by at least one processor to implement the various processes of the above-mentioned model training method embodiment and can achieve the same technical effect. To avoid repetition, it will not be repeated here.
  • An embodiment of the present application also provides a model training system, including: a terminal and a network side device, wherein the terminal can be used to execute the steps of the model training method as described above, and the network side device can be used to execute the steps of the model training method as described above.
  • the technical solution of the present application can be embodied in the form of a computer software product, which is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disk), and includes a number of instructions for enabling a terminal (which can be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to execute the methods described in each embodiment of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

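The embodiments of the present application disclose a model training method, a terminal, and a network-side device, which belong to the field of communication technology. The model training method of the embodiments of the present application includes: a first device receives a first message from a second device, the first message being used to indicate that federated learning training is terminated or suspended; and the first device performs a first operation based on the first message; wherein the first device includes a federated learning client, and the second device includes a federated learning server.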

Description

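Model training method, terminal, and network-side device
Cross-reference
This application claims priority to Chinese patent application No. 202211579377.8, entitled "Model training method, terminal, and network-side device", filed with the China Patent Office on December 8, 2022, and to Chinese patent application No. 202310372773.1, entitled "Model training method, terminal, and network-side device", filed with the China Patent Office on April 7, 2023, the entire contents of which are incorporated herein by reference.
Technical field
The present application belongs to the field of communication technology, and specifically relates to a model training method, a terminal, and a network-side device.
Background
Federated learning aims to build a federated learning model based on distributed data sets. During training of the federated learning model, information related to the model can be exchanged between the parties (or exchanged in encrypted form), but the raw data cannot be exchanged, so the private part of the data at each site is not exposed.
Horizontal federated learning is in essence a union of samples. It suits scenarios in which the participants are in the same line of business but reach different customers, that is, scenarios with heavy feature overlap and little user overlap, such as the core network domain and the access network domain within a communication network providing the same service (such as a session management service) to different users (for example, each terminal, that is, different samples). By combining the same data features from the participants' different samples, horizontal federated learning increases the number of training samples and thereby yields a better federated learning model.
In the related art, there is no corresponding handling mechanism for after federated learning training ends. After the training ends, the client usually remains in a state of waiting for the next round of training, which occupies a large amount of space and computing power.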
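Summary
The embodiments of the present application provide a model training method, a terminal, and a network-side device, which can solve the problem that the client cannot learn that federated learning training has ended, therefore cannot perform the next operation, and keeps occupying space and computing power.
In a first aspect, a model training method is provided, including: a first device receives a first message from a second device, the first message being used to indicate that federated learning training is terminated or suspended; and the first device performs a first operation based on the first message; wherein the first device includes a federated learning client, and the second device includes a federated learning server.
In a second aspect, a model training method is provided, including: a second device sends a first message to a first device, the first message being used to indicate that federated learning training is terminated or suspended; wherein the first device includes a federated learning client, and the second device includes a federated learning server.
In a third aspect, a model training apparatus is provided, applied to a first device, including: a receiving module, configured to receive a first message from a second device, the first message being used to indicate that federated learning training is terminated or suspended; and a processing module, configured to perform a first operation based on the first message; wherein the first device includes a federated learning client, and the second device includes a federated learning server.
In a fourth aspect, a model training apparatus is provided, applied to a second device, including: a sending module, configured to send a first message to a first device, the first message being used to indicate that federated learning training is terminated or suspended; wherein the first device includes a federated learning client, and the second device includes a federated learning server.
In a fifth aspect, a terminal is provided. The terminal includes a processor and a memory, the memory stores a program or instruction executable on the processor, and the program or instruction, when executed by the processor, implements the steps of the method according to the first aspect or the second aspect.
In a sixth aspect, a terminal is provided, including a processor and a communication interface, wherein the communication interface is configured to receive a first message from a second device, the first message being used to indicate that federated learning training is terminated or suspended; the processor is configured to perform a first operation based on the first message; wherein the terminal includes a federated learning client, and the second device includes a federated learning server. Alternatively, the communication interface is configured to send a first message to a first device, the first message being used to indicate that federated learning training is terminated or suspended; wherein the first device includes a federated learning client, and the terminal includes a federated learning server.
In a seventh aspect, a network-side device is provided. The network-side device includes a processor and a memory, the memory stores a program or instruction executable on the processor, and the program or instruction, when executed by the processor, implements the steps of the method according to the first aspect or the second aspect.
In an eighth aspect, a network-side device is provided, including a processor and a communication interface, wherein the communication interface is configured to receive a first message from a second device, the first message being used to indicate that federated learning training is terminated or suspended; the processor is configured to perform a first operation based on the first message; wherein the network-side device includes a federated learning client, and the second device includes a federated learning server. Alternatively, the communication interface is configured to send a first message to a first device, the first message being used to indicate that federated learning training is terminated or suspended; wherein the first device includes a federated learning client, and the network-side device includes a federated learning server.
In a ninth aspect, a model training system is provided, including a terminal and a network-side device, where the terminal can be used to perform the steps of the method according to the first aspect and the network-side device can be used to perform the steps of the method according to the second aspect; or the terminal can be used to perform the steps of the method according to the second aspect and the network-side device can be used to perform the steps of the method according to the first aspect.
In a tenth aspect, a readable storage medium is provided. A program or instruction is stored on the readable storage medium, and the program or instruction, when executed by a processor, implements the steps of the method according to the first aspect or the steps of the method according to the second aspect.
In an eleventh aspect, a chip is provided. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or instruction to implement the steps of the method according to the first aspect or the steps of the method according to the second aspect.
In a twelfth aspect, a computer program/program product is provided. The computer program/program product is stored in a storage medium and is executed by at least one processor to implement the steps of the method according to the first aspect or the steps of the method according to the second aspect.
In the embodiments of the present application, after federated learning training is terminated or suspended, the server may send a first message to the client, the first message being used to indicate that the federated learning training is terminated or suspended. In this way, the first device learns that the federated learning training has ended and can perform a first operation based on the first message, for example, stopping local federated learning training or deleting the local federated learning model, which avoids occupying the client's space and computing power and improves the client's performance.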
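Brief description of the drawings
FIG. 1 is a schematic diagram of a wireless communication system according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of a model training method according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of a model training method according to an embodiment of the present application;
FIG. 4 is a schematic flowchart of a model training method according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a model training apparatus according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a model training apparatus according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a communication device according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a terminal according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a network-side device according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a network-side device according to an embodiment of the present application.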
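Detailed description
The technical solutions in the embodiments of the present application are described clearly below with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application fall within the scope of protection of the present application.
The terms "first", "second", and the like in the specification and claims of the present application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that terms used in this way are interchangeable where appropriate, so that the embodiments of the present application can be implemented in an order other than those illustrated or described herein. The objects distinguished by "first" and "second" are usually of one type, and the number of objects is not limited; for example, the first object may be one or more. In addition, "and/or" in the specification and claims indicates at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects before and after it.
It is worth noting that the technology described in the embodiments of the present application is not limited to Long Term Evolution (LTE)/LTE-Advanced (LTE-A) systems and may also be used in other wireless communication systems, such as Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Frequency Division Multiple Access (FDMA), Orthogonal Frequency Division Multiple Access (OFDMA), Single-carrier Frequency Division Multiple Access (SC-FDMA), and other systems. The terms "system" and "network" in the embodiments of the present application are often used interchangeably, and the described technology can be used both for the systems and radio technologies mentioned above and for other systems and radio technologies. The following description describes a New Radio (NR) system for illustrative purposes, and NR terminology is used in most of the description below, but these technologies can also be applied to applications other than NR system applications, such as 6th Generation (6G) communication systems.
FIG. 1 shows a block diagram of a wireless communication system to which the embodiments of the present application are applicable. The wireless communication system includes a terminal 11 and a network-side device 12. The terminal 11 may be a terminal-side device such as a mobile phone, a tablet personal computer (Tablet Personal Computer), a laptop computer (Laptop Computer), also called a notebook computer, a personal digital assistant (Personal Digital Assistant, PDA), a palmtop computer, a netbook, an ultra-mobile personal computer (UMPC), a mobile Internet device (Mobile Internet Device, MID), an augmented reality (AR)/virtual reality (VR) device, a robot, a wearable device (Wearable Device), vehicle user equipment (Vehicle User Equipment, VUE), pedestrian user equipment (Pedestrian User Equipment, PUE), a smart home device (a home device with a wireless communication function, such as a refrigerator, a television, a washing machine, or furniture), a game console, a personal computer (PC), a teller machine, or a self-service machine. Wearable devices include smart watches, smart bracelets, smart earphones, smart glasses, smart jewelry (smart bangles, smart chain bracelets, smart rings, smart necklaces, smart anklets, smart ankle chains, etc.), smart wristbands, smart clothing, and the like. It should be noted that the embodiments of the present application do not limit the specific type of the terminal 11. The network-side device 12 may include an access network device or a core network device, where the access network device may also be called a radio access network device, a radio access network (Radio Access Network, RAN), a radio access network function, or a radio access network unit. The access network device may include a base station, a WLAN access point, a WiFi node, or the like. The base station may be called a Node B, an evolved Node B (eNB), an access point, a base transceiver station (Base Transceiver Station, BTS), a radio base station, a radio transceiver, a basic service set (Basic Service Set, BSS), an extended service set (Extended Service Set, ESS), a home Node B, a home evolved Node B, a transmitting receiving point (Transmitting Receiving Point, TRP), or some other suitable term in the field. As long as the same technical effect is achieved, the base station is not limited to a specific technical term. It should be noted that the embodiments of the present application use only the base station in the NR system as an example and do not limit the specific type of the base station. The core network device may include, but is not limited to, at least one of the following: a core network node, a core network function, a mobility management entity (Mobility Management Entity, MME), an access and mobility management function (Access and Mobility Management Function, AMF), a session management function (Session Management Function, SMF), a user plane function (User Plane Function, UPF), a policy control function (Policy Control Function, PCF), a policy and charging rules function (Policy and Charging Rules Function, PCRF), an edge application server discovery function (Edge Application Server Discovery Function, EASDF), unified data management (Unified Data Management, UDM), a unified data repository (Unified Data Repository, UDR), a home subscriber server (Home Subscriber Server, HSS), centralized network configuration (Centralized network configuration, CNC), a network repository function (Network Repository Function, NRF), a network exposure function (Network Exposure Function, NEF), a local NEF (Local NEF, or L-NEF), a binding support function (Binding Support Function, BSF), an application function (Application Function, AF), and the like. It should be noted that the embodiments of the present application use only the core network device in the NR system as an example and do not limit the specific type of the core network device.
The model training method provided in the embodiments of the present application is described in detail below through some embodiments and their application scenarios, with reference to the accompanying drawings.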
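As shown in FIG. 2, an embodiment of the present application provides a model training method 200. The method may be performed by a first device; in other words, the method may be performed by software or hardware installed on the first device. The method includes the following steps.
S202: The first device receives a first message from a second device, the first message being used to indicate that federated learning training is terminated or suspended.
The first device in the embodiments of the present application may be a federated learning client. The client may be a terminal, an access network device, a core network device, or the like; the core network device includes, for example, a model training logical function network element (Model Training Logical Function, MTLF) or an analytics logical function network element (Analytics Logical Function, AnLF). The second device may be a federated learning server. The server may be a terminal, an access network device, or a core network device; the core network device includes, for example, an MTLF or an AnLF.
In the embodiments of the present application, the first device may receive a first message from the second device, the first message being used to indicate that federated learning training is terminated or suspended. Termination of federated learning training means that the entire federated learning process has ended for both the first device and the second device. Suspension of federated learning training means that the federated learning process is interrupted or ended for the first device.
S204: The first device performs a first operation based on the first message.
In this embodiment, after receiving the first message, the first device may perform the first operation according to its internal logic, or according to the suggestion information or other content in the first message. The suggestion information is described in detail later.
Optionally, the following steps may be included before S202: 1) the server, that is, the second device, performs a member selection process; for example, the second device sends a request to an information storage device such as a network repository function (NF Repository Function, NRF) to obtain the capability information of intelligent network element devices such as MTLFs, uses that capability information to determine whether each intelligent network element device can participate in federated learning, and determines the members that will perform federated learning; 2) the second device sends information such as the initial federated learning model to each client, that is, each first device; 3) each first device performs local training and then feeds back an intermediate result, such as a gradient, to the second device; 4) the second device aggregates the intermediate results and updates the federated learning model. After the cycle of member selection, intermediate model distribution, local training, intermediate result feedback, and aggregation and update of the global model has been repeated multiple times, training can stop once a condition such as convergence of the federated learning model is met.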
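The repeated cycle of model distribution, local training, feedback, and aggregation can be sketched as follows. This is a minimal illustration and not the method claimed in the application: the unweighted gradient average, the learning rate, the convergence test on the aggregated gradient, and all function names are assumptions made for the example, and member selection is reduced to a fixed client list.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def aggregate(gradients: List[Vector]) -> Vector:
    """Average the gradients fed back by the clients (assumed unweighted mean)."""
    count = len(gradients)
    return [sum(column) / count for column in zip(*gradients)]


def run_federated_training(
    initial_model: Vector,
    clients: List[Callable[[Vector], Vector]],
    learning_rate: float = 0.1,
    max_rounds: int = 100,
    tolerance: float = 1e-6,
) -> Tuple[Vector, int]:
    """Repeat distribute -> local training -> feedback -> aggregation."""
    model = list(initial_model)
    for round_index in range(1, max_rounds + 1):
        # Steps 2) and 3): each client receives the model and returns a gradient.
        gradients = [local_train(model) for local_train in clients]
        # Step 4): the server aggregates the intermediate results and updates the model.
        update = aggregate(gradients)
        model = [w - learning_rate * g for w, g in zip(model, update)]
        # Hypothetical convergence test: stop when the aggregated gradient is tiny.
        if max(abs(g) for g in update) < tolerance:
            return model, round_index
    return model, max_rounds
```

Each client is modelled here as a function from the current model to a gradient, which keeps the raw data on the client side, in line with the federated learning constraint described in the background.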
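With the model training method provided in the embodiments of the present application, after federated learning training is terminated or suspended, the server may send a first message to the client, the first message being used to indicate that the federated learning training is terminated or suspended. In this way, the first device learns that the federated learning training has ended and can perform a first operation based on the first message, for example, stopping local federated learning training or deleting the local federated learning model, which avoids occupying the client's space and computing power and improves the client's performance.
The embodiments of the present application define a corresponding handling mechanism for after federated learning training ends, making the overall federated learning procedure more complete.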
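Optionally, the first message may include at least one of the following:
1) Indication information that federated learning training is terminated, that is, the second device explicitly indicates that federated learning training is terminated, so that the first device can perform the first operation according to its internal logic or the suggestion information in 9) below. Alternatively, the information in 3) to 9) below may implicitly indicate that federated learning training is terminated, or the termination may be indicated implicitly through the signaling name.
It should be noted that termination of federated learning training, as mentioned in the embodiments of the present application, may mean that the training is complete, for example, the federated learning model parameters have converged, the federated learning model loss function has converged, the number of training iterations has reached a count threshold, or the training duration has reached a duration threshold.
2) Indication information that federated learning training is suspended, that is, the second device explicitly indicates that federated learning training is suspended, so that the first device can perform the first operation according to its internal logic or the suggestion information in 9) below. Alternatively, the information in 3) to 9) below may implicitly indicate that federated learning training is suspended, or the suspension may be indicated implicitly through the signaling name.
3) A model identifier (Model ID) or identification information of the federated learning model, which may be used to uniquely identify the federated learning model. The federated learning model may be a fully trained federated learning model, or the model as it stands when training is suspended partway through.
4) Model information of the federated learning model. The model information includes, for example, the network structure, weight parameters, and input and output data of the federated learning model; it may also include download address information or storage address information of the federated learning model file. The input and output data information may be category information of the input data, indicating what kind of data should be input, what type the output data is, and so on. The federated learning model may be a fully trained federated learning model, or the model as it stands when training is suspended partway through.
5) Gradient information of the federated learning model. The gradient information may be delivered in the form of a gradient file, for example as download address information or storage address information of the gradient file, or it may be carried in this message itself. The gradient information may be the gradient information used by the final global model, where the gradient information of the final global model may be the sum of the gradients fed back by multiple clients in that round (an update of the global model in one round may be based on multiple gradients fed back by multiple clients in that round; these gradients may be aggregated before the update, or all of the gradients may be used for the update. The gradient information that is sent, however, may be a sum of these gradients or multiple pieces of gradient information). The federated learning model may be a fully trained federated learning model, or the model as it stands when training is suspended partway through.
6) Task identification information, which is used to indicate the task category for which the federated learning model is used, for example, which type of task the federated learning model is used for. Task identification information has a similar meaning to the analytics ID described below, and the two can be used interchangeably; task identification information may also be called data analysis task identification (which may be an analytic ID) information.
7) Task association identification information (which may be a correlation ID or a subscription correlation ID), which is used to indicate a target federated learning task, for example, uniquely indicating this federated learning task (which may also be called a federated learning model training task). This information may be generated when the task is generated, or generated by the server when it distributes the global task.
8) Reason information, which is used to indicate the reason why the second device sends the first message. Optionally, the reason information may indicate at least one of the following: the federated learning process has ended; the federated learning process has been interrupted. Optionally, the reason information may further indicate why federated learning was interrupted, for example, because the accuracy of the second device is insufficient to continue federated learning, or because the second device has been removed. Optionally, the reason information may further indicate why federated learning ended, for example, because the federated learning model has converged, the number of iterations has reached a preset value, or the training time has run out.
9) Suggestion information, which is used to indicate the operation that the first device performs after receiving the first message.
Optionally, the suggestion information may include at least one of the following:
a: Indication information for updating the federated learning model, which instructs the first device to update its local federated learning model with the received gradient information or the like, and may implicitly inform the first device that it may save and use the federated learning model (for example, that it has permission to use the federated learning model).
b: Indication information for saving the federated learning model, which indicates that the first device may use the received model information of the federated learning model to obtain the finally trained federated learning model, and may implicitly inform the first device that it may use the federated learning model (for example, that it has permission to use the federated learning model). The federated learning model may be a fully trained federated learning model, or the model as it stands when training is suspended partway through.
c: Indication information for deleting the local federated learning model, which instructs the first device to delete its local federated learning model, for example, indicating that the first device should not use the federated learning model or has no permission to use it.
d: Indication information for stopping local federated learning training, which indicates that the first device may stop local federated learning training.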
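One way to picture the first message is as a record in which every item is optional but at least one must be present. The sketch below is an illustrative data structure only; the class names, field names, and field types are assumptions and do not correspond to any signaling format defined in the application.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Suggestion(Enum):
    """Suggestion items a) to d); the member names are hypothetical."""
    UPDATE_MODEL = "update"
    SAVE_MODEL = "save"
    DELETE_LOCAL_MODEL = "delete"
    STOP_LOCAL_TRAINING = "stop"


@dataclass
class FirstMessage:
    terminated: bool = False                      # item 1): explicit termination
    suspended: bool = False                       # item 2): explicit suspension
    model_id: Optional[str] = None                # item 3)
    model_info: Optional[dict] = None             # item 4): structure, weights, or address
    gradient_info: Optional[List[float]] = None   # item 5)
    task_id: Optional[str] = None                 # item 6): analytics ID
    correlation_id: Optional[str] = None          # item 7)
    reason: Optional[str] = None                  # item 8)
    suggestions: List[Suggestion] = field(default_factory=list)  # item 9)

    def is_valid(self) -> bool:
        """The message must carry at least one of the listed items."""
        optional_items = (
            self.model_id, self.model_info, self.gradient_info,
            self.task_id, self.correlation_id, self.reason,
        )
        return (
            self.terminated
            or self.suspended
            or any(item is not None for item in optional_items)
            or bool(self.suggestions)
        )
```

Keeping the explicit indications as separate flags leaves room for the implicit case described above, in which the presence of other items alone signals that training has ended.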
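Optionally, in the embodiments of the present application, after receiving the first message, the first device may perform the first operation according to its internal logic, or according to the suggestion information or other content in the first message. The first operation performed by the first device includes at least one of the following:
1) Updating the local federated learning model. In this example, the first device can obtain the trained federated learning model or the gradient information, use it to update the local federated learning model, and use that model afterwards.
2) Receiving the federated learning model. In this example, the first device can obtain the trained federated learning model and use it.
3) Deleting the local federated learning model used in the earlier local federated learning training. In this example, if the first device does not need the trained federated learning model, it also knows that the local federated learning model no longer needs to be updated and can delete it, saving storage space.
4) Stopping local federated learning training. For asynchronous federated learning training, the first device can stop local federated learning training to save computing power.
It can be understood that the first operation performed by the first device may include at least one of 1) to 4) above; for example, the first device stops local federated learning training, deletes the local federated learning model used in the earlier local training, and receives the federated learning model. The federated learning model may be a fully trained federated learning model, or the model as it stands when training is suspended partway through.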
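The choice among first operations 1) to 4) can be sketched as a small client-side handler. The message is modelled as a plain dictionary, and the keys (`suggestions`, `gradient`, `model`, `learning_rate`), the priority order, and the class name are all hypothetical; the application leaves this decision to the first device's internal logic.

```python
from typing import Dict, List, Optional


class FederatedClient:
    """Toy client holding a local model as a list of weights."""

    def __init__(self, local_model: List[float]):
        self.local_model: Optional[List[float]] = local_model
        self.training = True

    def handle_first_message(self, message: Dict) -> List[str]:
        """Perform the first operation(s) and return their names."""
        performed: List[str] = []
        suggestions = message.get("suggestions", [])

        # Operation 4): stop local federated learning training.
        if self.training:
            self.training = False
            performed.append("stop_training")

        if "delete" in suggestions:
            # Operation 3): delete the local model to free storage.
            self.local_model = None
            performed.append("delete_model")
        elif "gradient" in message and self.local_model is not None:
            # Operation 1): update the local model with the received gradient.
            rate = message.get("learning_rate", 1.0)
            self.local_model = [
                w - rate * g for w, g in zip(self.local_model, message["gradient"])
            ]
            performed.append("update_model")
        elif "model" in message:
            # Operation 2): receive (replace with) the delivered model.
            self.local_model = list(message["model"])
            performed.append("receive_model")
        return performed
```

In this sketch a delete suggestion takes priority over any delivered model or gradient, which is one possible policy rather than a requirement of the text.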
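Before performing the first operation, the first device may also determine the first operation according to its internal logic and/or the first message. Specifically, when the first device receives the first message, it may determine the first operation according to the suggestion information in the first message, for example, performing the first operation as suggested. Alternatively, it may determine the first operation according to the model information or gradient information in the first message, for example, receiving the federated learning model or updating the local model. Alternatively, it may determine the first operation according to the task identification information in the first message, for example, using the trained federated learning model to perform a certain task.
Optionally, the first operation includes updating the local federated learning model and/or receiving the federated learning model, and after the first device receives the first message from the second device, the method further includes: the first device saves the federated learning model; wherein the federated learning model can be used by the first device.
Specifically, saving the federated learning model may mean that the first device stores the federated learning model locally after updating the local federated learning model or after receiving the federated learning model. That the federated learning model can be used by the first device may mean that, when another device sends the first device a model request or a task request (such as a data analysis task), the first device can return the federated learning model to the other device as the target model of the model request, or can use the model for computation, inference, and similar operations to generate the task result corresponding to the task request and return that result to the other device. The federated learning model may be a fully trained federated learning model, or the model as it stands when training is suspended partway through.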
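In the embodiments of the present application, before the first device receives the first message from the second device in S202, the method further includes: the first device receives a federated learning training request message from the second device; and the first device sends a response message to the second device, the response message including request information for obtaining the federated learning model. The request information may request the model information, gradient information, or the like of the federated learning model, and is used to ask the second device to send the global or aggregated federated learning model to the first device.
Optionally, the response message further includes a task association identifier, which is used to uniquely identify this model training task.
The training request message is used to request the first device to participate in federated learning. The training request message includes at least one of task identification information, task association identification information, model information, gradient information of the model, and identification information of the model. Specifically, the training request message may instruct the first device to perform local federated learning training with the model corresponding to this federated learning training, using data that the first device can collect.
Optionally, the request information may include at least one of the following:
1) First request information, which is used to request the federated learning model. The federated learning model may be the federated learning model after training is complete, or the federated learning model obtained by the time federated learning is interrupted and stopped; the same applies below.
2) Second request information, which is used to request model information of the federated learning model (the model information includes network architecture information, download address information, and the like).
3) Third request information, which is used to request gradient information of the federated learning model.
Optionally, the federated learning model may be a fully trained federated learning model, or the model as it stands when training is suspended partway through.
In the embodiments of the present application, before the first device receives the first message from the second device in S202, the method further includes: after completing local federated learning training, the first device sends a second message to the second device, the second message including the training result of the local federated learning training and request information for obtaining the federated learning model. The training result may be intermediate model information or intermediate gradient information of the federated learning model. Optionally, the second message may also include information such as a federated learning task identifier and an identifier of the federated learning model.
The second message may be sent by the first device after any round of federated learning training is completed.
Optionally, the request information included in the second message may include at least one of the following:
1) First request information, which is used to request the federated learning model.
2) Second request information, which is used to request model information of the federated learning model (the model information includes network architecture information, download address information, and the like).
3) Third request information, which is used to request gradient information of the federated learning model.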
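A second message that combines the local training result with one or more kinds of request information could be assembled as below. The enum values, the dictionary keys, and the rule that at least one request must be present are assumptions chosen for this sketch; the application does not specify an encoding.

```python
from enum import Enum
from typing import Any, Dict, Iterable, List, Optional


class RequestInfo(Enum):
    """The three kinds of request information; the string values are hypothetical."""
    MODEL = "first_request_information"
    MODEL_INFO = "second_request_information"
    GRADIENT_INFO = "third_request_information"


def build_second_message(
    training_result: List[float],
    requests: Iterable[RequestInfo],
    task_id: Optional[str] = None,
    model_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Bundle the local training result with the client's request information."""
    requested = sorted({request.value for request in requests})
    if not requested:
        raise ValueError("at least one kind of request information is expected")
    message: Dict[str, Any] = {
        "training_result": list(training_result),
        "request_info": requested,
    }
    # Optional identifiers are included only when provided.
    if task_id is not None:
        message["task_id"] = task_id
    if model_id is not None:
        message["model_id"] = model_id
    return message
```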
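To describe the model training method provided in the embodiments of the present application in detail, a specific embodiment is described below. As shown in FIG. 3, this embodiment includes the following steps.
Step 0: Step 0 can be divided into step 0a and step 0b below.
Step 0a: A federated learning consumer (such as an AnLF) sends a federated learning model request to a federated learning server (such as an MTLF). The request may be carried in an Nnwdaf_MLModelProvision_Subscribe message and is used to request a federated learning model (hereinafter, a federated learning model may be referred to simply as a model) for completing the consumer's own task. The server determines, based on local configuration, the consumer's request, or the like, whether to trigger federated learning, and decides to initialize federated learning and perform member selection.
Step 0b: A device without federated learning server capability (for example, a device with only client capability, or a device with no federated learning capability) may also send a federated learning request to a device with federated learning server capability, requesting that federated learning be performed and the required model be generated.
In the case of step 0b, by sending a federated learning request to the server, the device can also be regarded as requesting the trained federated learning model. In particular, if the device has client capability, it can also take part in the federated learning process, so the server may also send the first message to this device.
Optionally, the federated learning request in this step may include at least one of the following:
1) A federated learning indication (FL indication), used to request that a federated learning process be performed.
2) An analytics identifier (Analytics ID), used to indicate that the federated learning process is requested for the task type of the analytics ID. It may also be called a data analysis task identifier, and it is the same as the task identification information described earlier.
3) An identifier of the federated learning model (Model ID), used to uniquely identify the federated learning model.
4) Model filter information (Model filter information) (optional), used to limit the scope of the federated learning process, such as an area range, a time range, single network slice selection assistance information (Single Network Slice Selection Assistance Information, S-NSSAI), or a data network name (Data Network Name, DNN).
5) A model target (Model target of model) (optional), which can be used to specify the objects targeted by the federated learning process, such as one or more specific terminals, all terminals within a certain range, or all terminals that meet certain conditions.
6) Model reporting information (Model reporting information) (optional), which can be used to indicate how the generated federated learning model information is reported, such as the reporting time (start time, deadline, etc.) and the reporting conditions (periodic trigger, event trigger, etc.).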
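Step 1: The device with federated learning server capability determines that federated learning is to be performed and carries out member selection. It may also initialize a federated learning policy, for example, specifying after how many rounds of training status information is collected, or after how many rounds of training the training situation information is collected.
In member selection, the server may look up the capability information, willingness information, and the like of other devices to find devices that are willing to participate in federated learning and meet the requirements of this federated learning. For example, when the server is an MTLF network element, the MTLF performs a network element lookup with the NRF to find other network elements (such as other MTLFs) that meet the requirements of this federated learning training.
Step 2: The device with federated learning server capability (the server for short) and the devices with federated learning client capability (clients for short) interact to carry out federated learning training, which may include the following steps.
Step 2a: The server sends a training request for federated learning training to the client, for example through an Nnwdaf_MLModelTraining_Subscribe message, requesting the client to participate in federated learning and to perform local federated learning training based on the global model and the client's local data.
The training request may include at least one of the following:
1) An analytics identifier (Analytics ID), used to indicate that the federated learning process is requested for the task type of the analytics ID, that is, which type of task the federated learning model is used for.
2) An identifier of the federated learning model (Model ID), used to uniquely identify the federated learning model.
3) Task association identification information (Correlation ID), used to uniquely indicate this federated learning task.
4) Model initialization information, used to indicate the model information and the configuration information for this round of federated learning. The model information describes the model itself, for example, which algorithm, architecture, parameters, and hyperparameters make up the federated learning model, or it is the federated learning model itself, such as the model file or the address information of the model file. The configuration information for this round of federated learning (which may also be called guideline information) can specify how training is carried out during local training in this round, such as the number of local training rounds to perform, the data types to use, and the maximum training time.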
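How a client might honor the guideline information during local training can be sketched as follows. Only two of the configuration items mentioned above are modelled (the number of local rounds and the maximum training time); the `Guideline` class, its field names, and the injectable `clock` parameter are assumptions introduced for the illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Guideline:
    """Hypothetical container for part of the guideline information."""
    local_rounds: int            # number of local training rounds to run
    max_training_seconds: float  # upper bound on local training time


def local_training(
    model: List[float],
    step: Callable[[List[float]], List[float]],
    guideline: Guideline,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[float], int]:
    """Run local rounds until the round budget or the time budget is used up."""
    start = clock()
    completed = 0
    for _ in range(guideline.local_rounds):
        # Check the time budget before starting another round.
        if clock() - start >= guideline.max_training_seconds:
            break
        model = step(model)
        completed += 1
    return model, completed
```

Passing the clock as a parameter makes the time limit easy to exercise in tests without waiting for real time to elapse.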
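Step 2b: After receiving the training request for federated learning training, the client may feed back to the server information about whether it will participate in the federated learning training, for example through an Nnwdaf_MLModelTraining_Subscribe Response message. This information may include: whether the client participates in the federated learning training, indication information requesting the final global model (global model) or the updated global model, the analytics identifier, the task association identification information, and the like. The updated global model refers to the aggregated model produced by the federated learning server when federated learning is interrupted.
The indication information requesting the final global model or the updated global model indicates that the client wants, after the federated learning training ends (in other words, after the federated learning model has been updated), to obtain information about the final global model or the updated global model, such as the model file of the final or updated global model, the download address information or storage address information of the model file, or the update gradient of the federated learning model. This information about the updated global model helps the client generate or obtain the information of the final global model.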
步骤3:在每轮的迭代过程中,服务器向客户端发送模型信息和/或模型更新消息等。
可以通过如步骤2a的方式发送,也可以是不同的信令发送更新后的全局模型,或者全局模型更新的梯度信息告知客户端,并使客户端更新其本地模型以进行下一轮本地训练。可以包括用于指示联邦训练的任务关联标识信息等标识信息和模型信息和/或梯度信息等。
步骤4:客户端向自己所在区域或所属的数据源(data source,指可以提供数据的网元)发送获取数据的请求,以收集数据进行本地的联邦学习。
根据任务的不同,该数据提供网元也不同,如用户面功能(User Plane Function,UPF),操作管理和维护(Operation Administration and Maintenance,OAM),统一数据管理(Unified Data Management,UDM)等。
获取数据的请求可以由如下消息携带:Ndccf_DataManagement_Subscribe消息,Nnf_EventExposure消息,Subscribe和/或Ndccf_DataManagement_Notify/Nnf_EventExposure_Notify消息等。
客户端使用所获取的数据和获得模型信息进行本地模型训练,并生成中间结果,并反馈给服务器以供服务器聚合并更新全局模型。
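Step 5: After completing local training, the client reports the result of the local training to the server, which can then use it to update the global model. In this step, interim model information or gradient information may be reported through Nnwdaf_MLModelTraining_Notify.
In this step, the message that the client sends to the server may include at least one of the following:
1) Result information, which indicates the result of this local training and may be an intermediate model, the updated gradient, and so on.
2) Identification information requesting the final model, which indicates that the client wants to obtain information about the final model after the federated learning training ends.
3) Identification information such as the Analytics ID and the Correlation ID.
4) Willingness information (consent info), status information (status info), and training performance information (accuracy info). This information can help the server decide whether the client may continue to take part in the next round of federated learning training.
The willingness information indicates whether the member is still willing to take part in the next round of federated learning.
The status information describes the state of the client after the local training of this round is complete. It may be the member's load (such as NF load), the member's resource usage (such as CPU, memory, disk, and GPU), or the member's capability information (such as whether it can take part in federated learning and in which kind of federated learning).
The training performance information describes how the client's local training went in this round, that is, how the model performs on the client's local data after this local training. It may be a statistical metric together with its value, such as the model accuracy and its value (80%) or the MAE and its value (0.1).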
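How the server might use the willingness, status, and training performance information can be sketched as a simple predicate. The field names and both thresholds are illustrative assumptions; the application leaves the decision to the server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientReport:
    """Step 5 side information sent with the local training result."""
    correlation_id: str
    consent: bool      # willingness to join the next round
    nf_load: float     # status information, as a fraction in [0, 1]
    accuracy: float    # training performance on local data, in [0, 1]


def stays_in_next_round(report, max_load=0.9, min_accuracy=0.5):
    """Server-side check; both thresholds are illustrative, not from the source."""
    return (
        report.consent
        and report.nf_load <= max_load
        and report.accuracy >= min_accuracy
    )


reports = [
    ClientReport("fl-task-001", consent=True, nf_load=0.40, accuracy=0.80),
    ClientReport("fl-task-001", consent=False, nf_load=0.10, accuracy=0.90),
    ClientReport("fl-task-001", consent=True, nf_load=0.95, accuracy=0.85),
]
print([stays_in_next_round(r) for r in reports])   # [True, False, False]
```

The second client is dropped because it withdrew consent, and the third because its reported load exceeds the assumed limit.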
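Step 6: The server aggregates the models and determines that model training can stop, or the server determines, based on a training suspension condition, that model training is suspended.
When a training end condition is met, the server decides to stop model training. The training end condition includes at least one of the following: all model parameters have converged, the model loss function has converged, the parameters of the models trained locally by the clients have converged, the loss functions of the models trained locally by the clients have converged, the number of training rounds has reached a round threshold, the number of training iterations has reached a count threshold, or the training duration has reached a duration threshold. These thresholds and convergence conditions may be preset by the server's internal design.
When a training suspension condition is met, the server decides to suspend model training. The training suspension condition includes, for example, one or more of the following: reduced client computing power, excessive client load, and excessive resource usage.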
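The distinction between ending and suspending training in step 6 can be sketched as a three-way decision. Everything concrete here is assumed for illustration: the default thresholds, the use of the last two loss values as the convergence test, and the choice to check the suspension condition first.

```python
from enum import Enum


class Decision(Enum):
    CONTINUE = "continue"
    TERMINATE = "terminate"   # a training end condition is met
    SUSPEND = "suspend"       # a training suspension condition is met


def decide(round_no, losses, client_loads, max_rounds=100, tol=1e-4, max_load=0.9):
    """Pick the server's next move after aggregating one round."""
    # Suspension condition: some client reports an excessive load.
    if any(load > max_load for load in client_loads):
        return Decision.SUSPEND
    # End condition: the round threshold has been reached.
    if round_no >= max_rounds:
        return Decision.TERMINATE
    # End condition: the loss changed by less than tol between the last two rounds.
    if len(losses) >= 2 and abs(losses[-1] - losses[-2]) < tol:
        return Decision.TERMINATE
    return Decision.CONTINUE


print(decide(3, [0.5, 0.4], [0.2]))         # Decision.CONTINUE
print(decide(3, [0.4, 0.40001], [0.2]))     # Decision.TERMINATE
print(decide(3, [0.5, 0.4], [0.2, 0.95]))   # Decision.SUSPEND
```

The result of this decision determines whether the message in step 7 is a termination message or a suspension message.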
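Step 7: The server sends a termination message to the clients, telling the clients taking part in federated learning that the federated learning has terminated; or the server sends a suspension message to the clients, telling them that the federated learning has been suspended.
Specifically, the server may send the indication information of training termination or suspension to the client in an Nnwdaf_MLModelTraining_Unsubscribe message or in other signaling. It may also send the final result of the federated learning training, such as the final model or the final gradient, and it may send suggestion information telling the client what action it may take for this federated learning.
Whether model information (such as model information or gradient information) is sent may be decided by the server's internal logic, or may have been requested by the client during its interaction with the federated learning, as in step 2b or step 5; the request to obtain the federated learning model may also have been received in step 0b. Specifically, in step 2b and/or step 5 the server receives request information from a device such as the client. In step 2b, the first device sends a response message to the second device, and the response message includes request information for obtaining the trained federated learning model. In step 5, the first device sends a second message to the second device after completing local federated learning training, and the second message includes the result of the local federated learning training and request information for obtaining the trained federated learning model. Before sending the termination message, the server may also determine the content the message carries. Specifically, when the server has received request information from the client, it determines the termination message accordingly: if the request information requests model information, the termination message carries the model information of the federated learning model; if the request information requests model gradient information, the termination message carries the gradient information of the federated learning model.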
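The rule that the termination message carries what the client asked for can be sketched as follows. The dictionary keys and the request labels `"model_info"` and `"gradient_info"` are hypothetical names chosen for this example.

```python
def build_termination_message(requested, model_info=None, gradient_info=None):
    """Choose what the termination message carries from what the client asked for.

    `requested` is the set of request kinds received in step 2b or step 5;
    the string labels are hypothetical.
    """
    message = {"indication": "terminated"}
    if "model_info" in requested and model_info is not None:
        message["model_info"] = model_info
    if "gradient_info" in requested and gradient_info is not None:
        message["gradient_info"] = gradient_info
    return message


final_model = {"file_address": "models/final.bin"}   # placeholder address
final_gradient = [0.01, -0.02]

print(build_termination_message({"gradient_info"}, final_model, final_gradient))
print(build_termination_message(set(), final_model, final_gradient))
```

A client that made no request receives only the termination indication, which matches the case described for step 8c where the client holds no final model.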
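Specifically, the termination (or suspension) message that the server sends to the client may include at least one of the following:
1) Task identification information, which indicates the task category the federated learning model is used for, that is, which type of task the model performs.
2) Model identifier (Model ID) or identification information of the federated learning model, which may uniquely identify the federated learning model.
3) Task correlation identification information, which indicates the target federated learning task, for example by uniquely identifying this federated learning task.
4) Indication information of federated learning training termination (or suspension); that is, the second device explicitly indicates that federated learning training is terminated (or suspended), so that the first device can perform the first operation according to its internal logic or the suggestion information below.
5) Model information of the federated learning model, which includes, for example, the network structure, weight parameters, and input and output data of the federated learning model; it may also include the download address information or storage address information of the federated learning model file.
6) Gradient information of the federated learning model, which may be delivered as a gradient file and may be the gradient information used by the final global model.
The server may provide the client with the final model information or with the gradient information of the final update. The client can use the gradient information to update its local federated learning model (the model it saved and used locally in the earlier federated learning) to obtain the final global model. The final global model is the aggregated model produced by the federated learning server after the federated learning process ends.
Specifically, the final global model may be at least one of the following: a model file (containing the model's network structure, weight parameters, input and output data, and so on); or the download address information or storage address information of the model file (indicating where the model file is stored or from where it can be downloaded).
The gradient information may be delivered as a gradient file that contains the gradient information used for the model update.
7) Cause information, which indicates why the server sends the termination (or suspension) message.
8) Suggestion information, which indicates the operation the first device performs after receiving the first message.
Optionally, the suggestion information may include at least one of the following:
a: Indication information to update the federated learning model, which instructs the first device to update its local federated learning model using the received gradient information or the like, and may implicitly tell the first device that it may save and use the federated learning model (for example, that it has permission to use the model).
b: Indication information to save the federated learning model, which indicates that the first device may use the received model information to obtain the finally trained federated learning model, and may implicitly tell the first device that it may use the federated learning model (for example, that it has permission to use the model).
c: Indication information to delete the local federated learning model, which instructs the first device to delete its local federated learning model, for example because the first device should not use the model or has no permission to use it.
d: Indication information to stop local federated learning training, which indicates that the first device may stop local federated learning training.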
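The message contents and suggestion options listed above can be gathered into one record. This is a sketch with hypothetical names; in particular, `unmet_suggestions` is a sanity check added by this example (a suggestion to update normally needs gradient information, and a suggestion to save normally needs model information) and is not a rule stated in the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class Suggestion(Enum):
    UPDATE_MODEL = "a"    # update the local model with received gradient info
    SAVE_MODEL = "b"      # obtain and keep the final model from model info
    DELETE_MODEL = "c"    # delete the local model
    STOP_TRAINING = "d"   # stop local training


@dataclass(frozen=True)
class TerminationMessage:
    terminated: bool = True                 # False stands for "suspended"
    task_id: Optional[str] = None
    model_id: Optional[str] = None
    correlation_id: Optional[str] = None
    model_info: Optional[dict] = None
    gradient_info: Optional[list] = None
    cause: Optional[str] = None
    suggestions: FrozenSet[Suggestion] = frozenset()

    def unmet_suggestions(self):
        """Suggestions whose usual input is absent from this message."""
        unmet = set()
        if Suggestion.UPDATE_MODEL in self.suggestions and self.gradient_info is None:
            unmet.add(Suggestion.UPDATE_MODEL)
        if Suggestion.SAVE_MODEL in self.suggestions and self.model_info is None:
            unmet.add(Suggestion.SAVE_MODEL)
        return unmet


msg = TerminationMessage(
    correlation_id="fl-task-001",
    cause="federated learning process ended",
    gradient_info=[0.1, -0.2],
    suggestions=frozenset({Suggestion.UPDATE_MODEL, Suggestion.SAVE_MODEL}),
)
print(sorted(s.name for s in msg.unmet_suggestions()))   # ['SAVE_MODEL']
```

A server could run such a check before sending, so that it does not suggest an action the client cannot carry out with the information supplied.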
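Steps 8a-8d: After receiving the termination message, the client performs an action. Specifically, once the client knows that the federated learning training has ended, it may decide on a follow-up action according to its internal logic or according to the suggestion information, model information, and so on sent by the server. The follow-up action may be to update the local model to the global model, to receive the final global model for later use, to delete the local model used in the earlier training, or to stop training.
8a: The client updates its local model to obtain the final model, which it can use later. The client receives the gradient information of the model update (as in step 7) and uses it to update the local model, thereby obtaining the final model. It can then use the model later, for example by sending it to other devices for certain data analytics tasks.
8b: The client saves the final model and can use it later. The client receives the final model, for example by receiving in step 7 the model file and/or the download address information or storage address information of the model file, from which it obtains the final model.
8c: The client deletes the local model. Specifically, the client deletes the local model associated with this federated learning training. This may be because the client did not send a request earlier and therefore did not receive the final model of this federated learning, or because it will not use the model later and chooses to delete it. If the client still wants the final model of this federated learning training, it can send an ordinary model request to the server again, for example through the Nnwdaf_MLModelProvision_Subscribe and Nnwdaf_MLModelProvision_Notify messages.
8d: The client stops local training. In asynchronous federated learning training, clients progress differently: some may have completed a round of local training while others are still training. A client that is still training locally can stop after receiving the federated learning termination message. This action can also be combined with other actions, for example stopping local training and also saving the received final model.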
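The client-side actions 8a to 8d, including their combination, can be sketched as a small handler. The class, the string labels for the actions, the flat-list model, and the plain subtraction used to apply the gradient are all assumptions of this example.

```python
class Client:
    """Minimal client state for handling a termination message."""

    def __init__(self, weights):
        self.local_weights = list(weights)
        self.final_weights = None
        self.training = True

    def on_termination(self, actions, gradient=None, model=None):
        """Carry out the requested actions; several may be combined."""
        if "stop" in actions:                              # 8d
            self.training = False
        if "update" in actions and gradient is not None:   # 8a
            self.final_weights = [
                w - g for w, g in zip(self.local_weights, gradient)
            ]
        if "save" in actions and model is not None:        # 8b
            self.final_weights = list(model)
        if "delete" in actions:                            # 8c
            self.local_weights = None


client = Client([1.0, 2.0])
client.on_termination({"stop", "update"}, gradient=[0.25, 0.5])
print(client.training, client.final_weights)   # False [0.75, 1.5]
```

Handling "delete" last means a client told to both update and delete still derives the final model from its local model before discarding it.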
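Step 9: After completing the federated learning model training, the server sends the model information to the consumer. This step has no fixed order relative to steps 7-8; that is, it may also take place before step 7.
The model information may include at least one of the following:
1) Model identifier (Model ID) or identification information of the trained federated learning model, which may uniquely identify the federated learning model.
2) Federated learning indication (FL indication), which indicates that the model was generated by federated learning.
3) Model file, which contains the model's network structure, weight parameters, input and output data, and so on.
4) Download address information or storage address information of the model file, which indicates where the model file is stored or from where it can be downloaded.
5) Analytics ID, which indicates the type of inference task the federated learning model is suitable for.
6) Model filter information, which indicates how information about the generated federated learning model is to be reported, such as the reporting time (start time, deadline, and so on) and the reporting condition (periodic trigger, event trigger, and so on).
7) Validity area information, which indicates the area in which the federated learning model applies.
8) Validity time information, which indicates the time during which the federated learning model applies.
The server may send the model information in one of the following messages: Nnwdaf_MLModelProvision_Notify or Nnwdaf_MLModelInfo_Response.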
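A consumer could use the validity area and validity time information to decide whether a delivered model applies. The sketch below uses hypothetical field names, treats an empty area list as "no restriction", and represents time as plain numbers; none of this is specified by the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ModelInfo:
    """Step 9 model information delivered to the consumer."""
    model_id: str
    fl_indication: bool = True                 # generated by federated learning
    model_file_address: Optional[str] = None
    analytics_id: Optional[str] = None
    valid_areas: Tuple[str, ...] = ()          # empty means no area restriction
    valid_from: Optional[float] = None         # validity window, in seconds
    valid_until: Optional[float] = None

    def applies(self, area, now):
        """True if the model may be used in `area` at time `now`."""
        if self.valid_areas and area not in self.valid_areas:
            return False
        if self.valid_from is not None and now < self.valid_from:
            return False
        if self.valid_until is not None and now > self.valid_until:
            return False
        return True


info = ModelInfo(
    model_id="model-42",
    analytics_id="analytics-x",
    valid_areas=("area-1", "area-2"),
    valid_from=100.0,
    valid_until=200.0,
)
print(info.applies("area-1", 150.0), info.applies("area-3", 150.0))   # True False
```

Checking validity on the consumer side keeps a model from being used for inference outside the area or period it was trained for.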
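The model training method according to the embodiments of this application has been described in detail above with reference to FIG. 2 and FIG. 3. A model training method according to another embodiment of this application is described in detail below with reference to FIG. 4. It can be understood that the interaction between the second device and the first device described from the side of the second device is the same as or corresponds to the description on the first-device side in the method shown in FIG. 2, and related descriptions are omitted where appropriate to avoid repetition.
FIG. 4 is a schematic flowchart of a model training method according to an embodiment of this application, which can be applied to the second device. As shown in FIG. 4, the method 400 includes the following step.
S402: The second device sends a first message to the first device, where the first message indicates that federated learning training is terminated or suspended; the first device includes a federated learning client, and the second device includes a federated learning server.
In this embodiment, after federated learning training is terminated or suspended, the server may send the first message to the client to indicate this. The first device thereby learns that federated learning training has ended and can perform the first operation based on the first message, for example stopping local federated learning training or deleting the local federated learning model. This avoids occupying the client's storage and computing capacity and improves the client's performance.
Optionally, in an embodiment, the first message includes at least one of the following:
1) Indication information of federated learning training termination.
2) Indication information of federated learning training suspension.
3) Model identifier or identification information of the federated learning model.
4) Model information of the federated learning model.
5) Gradient information of the federated learning model.
6) Task identification information, which indicates the task category the federated learning model is used for.
7) Task correlation identification information, which indicates the target federated learning task.
8) Cause information, which indicates why the second device sends the first message.
Optionally, the cause information indicates at least one of the following: the federated learning process has ended; the federated learning process has been interrupted.
9) Suggestion information, which indicates the operation the first device performs after receiving the first message.
Optionally, in an embodiment, the suggestion information instructs the first device to perform at least one of the following after receiving the first message: 1) update the local federated learning model; 2) receive the federated learning model; 3) delete the local federated learning model used in the earlier local federated learning training; 4) stop local federated learning training.
Optionally, in an embodiment, before the second device sends the first message to the first device, the method further includes: the second device sends a federated learning training request message to the first device; and the second device receives a response message from the first device, where the response message includes request information for obtaining the federated learning model.
Optionally, in an embodiment, before the second device sends the first message to the first device, the method further includes: the second device receives a second message from the first device, where the second message includes the result of the first device's local federated learning training and request information for obtaining the federated learning model.
Optionally, in an embodiment, the request information includes at least one of the following: 1) first request information, used to request the federated learning model; 2) second request information, used to request the model information of the federated learning model; 3) third request information, used to request the gradient information of the federated learning model.
Optionally, in an embodiment, after receiving the request information for obtaining the federated learning model, the second device sends the first message according to that request information, and the first message contains at least one of the following: the model information of the federated learning model, or the gradient information of the federated learning model. The federated learning model includes the final global model or the updated global model.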
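The model training method provided in the embodiments of this application may be performed by a model training apparatus. In the embodiments of this application, the model training apparatus is described by taking as an example the case in which the apparatus performs the model training method.
FIG. 5 is a schematic structural diagram of a model training apparatus according to an embodiment of this application. The apparatus can be applied to the first device. As shown in FIG. 5, the apparatus 500 includes the following modules.
A receiving module 502, which may be configured to receive a first message from a second device, where the first message indicates that federated learning training is terminated or suspended.
A processing module 504, which may be configured to perform a first operation based on the first message; the first device includes a federated learning client, and the second device includes a federated learning server.
In this embodiment, after federated learning training is terminated or suspended, the server may send the first message to the client to indicate this. The first device thereby learns that federated learning training has ended and can perform the first operation based on the first message, for example stopping local federated learning training or deleting the local federated learning model. This avoids occupying the client's storage and computing capacity and improves the client's performance.
Optionally, in an embodiment, the first message includes at least one of the following:
1) Indication information of federated learning training termination.
2) Indication information of federated learning training suspension.
3) Model identifier or identification information of the federated learning model.
4) Model information of the federated learning model.
5) Gradient information of the federated learning model.
6) Task identification information, which indicates the task category the federated learning model is used for.
7) Task correlation identification information, which indicates the target federated learning task.
8) Cause information, which indicates why the second device sends the first message.
9) Suggestion information, which indicates the operation the first device performs after receiving the first message.
Optionally, in an embodiment, the first operation includes at least one of the following:
1) Updating the local federated learning model.
2) Receiving the federated learning model.
3) Deleting the local federated learning model used in the earlier local federated learning training.
4) Stopping local federated learning training.
Optionally, in an embodiment, the first operation includes updating the local federated learning model and/or receiving the federated learning model, and the processing module 504 is further configured to save the federated learning model, where the federated learning model is available for use by the first device.
Optionally, in an embodiment, the receiving module 502 is further configured to receive a federated learning training request message from the second device; the apparatus further includes a sending module configured to send a response message to the second device, where the response message includes request information for obtaining the federated learning model.
Optionally, in an embodiment, the apparatus further includes a sending module configured to send a second message to the second device after local federated learning training is complete, where the second message includes the result of the local federated learning training and request information for obtaining the federated learning model.
Optionally, in an embodiment, the request information includes at least one of the following: 1) first request information, used to request the federated learning model; 2) second request information, used to request the model information of the federated learning model; 3) third request information, used to request the gradient information of the federated learning model.
For the apparatus 500 according to this embodiment, refer to the procedure of the method 200 in the corresponding embodiment of this application. The units/modules of the apparatus 500 and the other operations and/or functions described above implement the corresponding procedures of the method 200 and achieve the same or equivalent technical effects. For brevity, details are not repeated here.
The model training apparatus in the embodiments of this application may be an electronic device, for example an electronic device with an operating system, or a component of an electronic device, for example an integrated circuit or a chip. The electronic device may be a terminal or a device other than a terminal. For example, the terminal may include but is not limited to the types of terminal 11 listed above, and the other device may be a server, a network attached storage (NAS) device, or the like, which is not specifically limited in the embodiments of this application.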
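FIG. 6 is a schematic structural diagram of a model training apparatus according to an embodiment of this application. The apparatus can be applied to the second device. As shown in FIG. 6, the apparatus 600 includes the following modules.
A sending module 602, which may be configured to send a first message to a first device, where the first message indicates that federated learning training is terminated or suspended; the first device includes a federated learning client, and the second device includes a federated learning server.
Optionally, the apparatus 600 may include a processing module and the like.
In this embodiment, after federated learning training is terminated or suspended, the server may send the first message to the client to indicate this. The first device thereby learns that federated learning training has ended and can perform the first operation based on the first message, for example stopping local federated learning training or deleting the local federated learning model. This avoids occupying the client's storage and computing capacity and improves the client's performance.
Optionally, in an embodiment, the first message includes at least one of the following:
1) Indication information of federated learning training termination.
2) Indication information of federated learning training suspension.
3) Model identifier or identification information of the federated learning model.
4) Model information of the federated learning model.
5) Gradient information of the federated learning model.
6) Task identification information, which indicates the task category the federated learning model is used for.
7) Task correlation identification information, which indicates the target federated learning task.
8) Cause information, which indicates why the second device sends the first message.
9) Suggestion information, which indicates the operation the first device performs after receiving the first message.
Optionally, in an embodiment, the sending module 602 is further configured to send a federated learning training request message to the first device; the apparatus further includes a receiving module configured to receive a response message from the first device, where the response message includes request information for obtaining the federated learning model.
Optionally, in an embodiment, the apparatus further includes a receiving module configured to receive a second message from the first device, where the second message includes the result of the first device's local federated learning training and request information for obtaining the federated learning model.
Optionally, in an embodiment, the request information includes at least one of the following: 1) first request information, used to request the federated learning model; 2) second request information, used to request the model information of the federated learning model; 3) third request information, used to request the gradient information of the federated learning model.
For the apparatus 600 according to this embodiment, refer to the procedure of the method 400 in the corresponding embodiment of this application. The units/modules of the apparatus 600 and the other operations and/or functions described above implement the corresponding procedures of the method 400 and achieve the same or equivalent technical effects. For brevity, details are not repeated here.
The model training apparatus provided in the embodiments of this application can implement the processes implemented by the method embodiments of FIG. 2 to FIG. 4 and achieve the same technical effects. To avoid repetition, details are not described here again.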
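Optionally, as shown in FIG. 7, an embodiment of this application further provides a communications device 700, including a processor 701 and a memory 702. The memory 702 stores a program or instructions that can run on the processor 701. For example, when the communications device 700 is a terminal, the program or instructions, when executed by the processor 701, implement the steps of the foregoing model training method embodiments and achieve the same technical effects. When the communications device 700 is a network-side device, the program or instructions, when executed by the processor 701, implement the steps of the foregoing model training method embodiments and achieve the same technical effects. To avoid repetition, details are not described here again.
An embodiment of this application further provides a terminal, including a processor and a communications interface. The communications interface is configured to receive a first message from a second device, where the first message indicates that federated learning training is terminated or suspended; the processor is configured to perform a first operation based on the first message; the terminal includes a federated learning client, and the second device includes a federated learning server. Alternatively, the communications interface is configured to send a first message to a first device, where the first message indicates that federated learning training is terminated or suspended; the first device includes a federated learning client, and the terminal includes a federated learning server.
This terminal embodiment corresponds to the foregoing terminal-side method embodiments. The implementation processes and implementations of the foregoing method embodiments are all applicable to this terminal embodiment and achieve the same technical effects. Specifically, FIG. 8 is a schematic diagram of the hardware structure of a terminal that implements an embodiment of this application.
The terminal 800 includes but is not limited to at least some of the following components: a radio frequency unit 801, a network module 802, an audio output unit 803, an input unit 804, a sensor 805, a display unit 806, a user input unit 807, an interface unit 808, a memory 809, and a processor 810.
A person skilled in the art will understand that the terminal 800 may further include a power supply (such as a battery) that supplies power to the components. The power supply may be logically connected to the processor 810 through a power management system, so that charging, discharging, and power consumption are managed through the power management system. The terminal structure shown in FIG. 8 does not limit the terminal; the terminal may include more or fewer components than shown, combine some components, or arrange the components differently. Details are not described here.
It should be understood that, in the embodiments of this application, the input unit 804 may include a graphics processing unit (GPU) 8041 and a microphone 8042. The GPU 8041 processes image data of still pictures or video obtained by an image capture apparatus (such as a camera) in video capture mode or image capture mode. The display unit 806 may include a display panel 8061, which may be configured as a liquid crystal display, an organic light-emitting diode display, or the like. The user input unit 807 includes at least one of a touch panel 8071 and other input devices 8072. The touch panel 8071, also called a touchscreen, may include a touch detection apparatus and a touch controller. The other input devices 8072 may include but are not limited to a physical keyboard, function keys (such as volume control keys and a power key), a trackball, a mouse, and a joystick. Details are not described here.
In the embodiments of this application, after receiving downlink data from the network-side device, the radio frequency unit 801 may pass it to the processor 810 for processing; in addition, the radio frequency unit 801 may send uplink data to the network-side device. Typically, the radio frequency unit 801 includes but is not limited to an antenna, an amplifier, a transceiver, a coupler, a low noise amplifier, and a duplexer.
The memory 809 may be used to store software programs or instructions and various data. The memory 809 may mainly include a first storage area for programs or instructions and a second storage area for data, where the first storage area may store an operating system and the applications or instructions required by at least one function (such as a sound playback function or an image playback function). In addition, the memory 809 may include volatile memory or non-volatile memory, or both. The non-volatile memory may be read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), or flash memory. The volatile memory may be random access memory (RAM), static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), or direct Rambus RAM (DRRAM). The memory 809 in the embodiments of this application includes but is not limited to these and any other suitable types of memory.
The processor 810 may include one or more processing units. Optionally, the processor 810 integrates an application processor and a modem processor, where the application processor mainly handles operations involving the operating system, user interface, and applications, and the modem processor, such as a baseband processor, mainly handles wireless communication signals. It can be understood that the modem processor may also not be integrated into the processor 810.
The radio frequency unit 801 may be configured to receive a first message from a second device, where the first message indicates that federated learning training is terminated or suspended; the processor 810 may be configured to perform a first operation based on the first message; the terminal includes a federated learning client, and the second device includes a federated learning server. Alternatively, the radio frequency unit 801 is configured to send a first message to a first device, where the first message indicates that federated learning training is terminated or suspended; the first device includes a federated learning client, and the terminal includes a federated learning server.
In this embodiment, after federated learning training is terminated or suspended, the server may send the first message to the client to indicate this. The first device thereby learns that federated learning training has ended and can perform the first operation based on the first message, for example stopping local federated learning training or deleting the local federated learning model. This avoids occupying the client's storage and computing capacity and improves the client's performance.
The terminal 800 provided in this embodiment can also implement the processes of the foregoing model training method embodiments and achieve the same technical effects. To avoid repetition, details are not described here again.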
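An embodiment of this application further provides a network-side device, including a processor and a communications interface. The communications interface is configured to receive a first message from a second device, where the first message indicates that federated learning training is terminated or suspended; the processor is configured to perform a first operation based on the first message; the network-side device includes a federated learning client, and the second device includes a federated learning server. Alternatively, the communications interface is configured to send a first message to a first device, where the first message indicates that federated learning training is terminated or suspended; the first device includes a federated learning client, and the network-side device includes a federated learning server.
This network-side device embodiment corresponds to the foregoing network-side device method embodiments. The implementation processes and implementations of the foregoing method embodiments are all applicable to this network-side device embodiment and achieve the same technical effects.
Specifically, an embodiment of this application further provides a network-side device. As shown in FIG. 9, the network-side device 900 includes an antenna 91, a radio frequency apparatus 92, a baseband apparatus 93, a processor 94, and a memory 95. The antenna 91 is connected to the radio frequency apparatus 92. In the uplink direction, the radio frequency apparatus 92 receives information through the antenna 91 and sends the received information to the baseband apparatus 93 for processing. In the downlink direction, the baseband apparatus 93 processes the information to be sent and sends it to the radio frequency apparatus 92, which processes the received information and transmits it through the antenna 91.
The method performed by the network-side device in the foregoing embodiments may be implemented in the baseband apparatus 93, which includes a baseband processor.
The baseband apparatus 93 may include, for example, at least one baseband board on which multiple chips are arranged. As shown in FIG. 9, one of the chips is, for example, the baseband processor, which is connected to the memory 95 through a bus interface to invoke the program in the memory 95 and perform the network device operations shown in the foregoing method embodiments.
The network-side device may further include a network interface 96, which is, for example, a common public radio interface (CPRI).
Specifically, the network-side device 900 in this embodiment of this application further includes instructions or a program stored in the memory 95 and executable on the processor 94. The processor 94 invokes the instructions or program in the memory 95 to perform the methods performed by the modules shown in FIG. 5 or FIG. 6 and achieves the same technical effects. To avoid repetition, details are not described here.
Specifically, an embodiment of this application further provides a network-side device. As shown in FIG. 10, the network-side device 1000 includes a processor 1001, a network interface 1002, and a memory 1003. The network interface 1002 is, for example, a common public radio interface (CPRI).
Specifically, the network-side device 1000 in this embodiment of this application further includes instructions or a program stored in the memory 1003 and executable on the processor 1001. The processor 1001 invokes the instructions or program in the memory 1003 to perform the methods performed by the modules shown in FIG. 5 or FIG. 6 and achieves the same technical effects. To avoid repetition, details are not described here.
An embodiment of this application further provides a readable storage medium that stores a program or instructions. When the program or instructions are executed by a processor, the processes of the foregoing model training method embodiments are implemented and the same technical effects are achieved. To avoid repetition, details are not described here again.
The processor is the processor in the terminal described in the foregoing embodiments. The readable storage medium may be non-volatile or non-transitory. The readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
An embodiment of this application further provides a chip, including a processor and a communications interface. The communications interface is coupled to the processor, and the processor is configured to run a program or instructions to implement the processes of the foregoing model training method embodiments and achieve the same technical effects. To avoid repetition, details are not described here again.
It should be understood that the chip mentioned in the embodiments of this application may also be called a system-level chip, a system chip, a chip system, or a system-on-chip.
An embodiment of this application further provides a computer program/program product. The computer program/program product is stored in a storage medium and is executed by at least one processor to implement the processes of the foregoing model training method embodiments and achieve the same technical effects. To avoid repetition, details are not described here again.
An embodiment of this application further provides a model training system, including a terminal and a network-side device. The terminal may be configured to perform the steps of the model training method described above, and the network-side device may be configured to perform the steps of the model training method described above.
It should be noted that, in this document, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element preceded by "includes a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes the element. In addition, the scope of the methods and apparatuses in the implementations of this application is not limited to performing functions in the order shown or discussed; functions may also be performed substantially simultaneously or in the reverse order depending on the functions involved. For example, the described methods may be performed in an order different from that described, and steps may be added, omitted, or combined. Furthermore, features described with reference to some examples may be combined in other examples.
From the foregoing description of the implementations, a person skilled in the art will clearly understand that the methods of the foregoing embodiments may be implemented by software together with a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solutions of this application, in essence or in the part that contributes over the prior art, may be embodied as a computer software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of this application.
The embodiments of this application have been described above with reference to the accompanying drawings, but this application is not limited to the specific implementations described, which are illustrative rather than restrictive. Inspired by this application, a person of ordinary skill in the art may devise many other forms without departing from the purpose of this application and the scope protected by the claims, all of which fall within the protection of this application.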

Claims (30)

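  1. A model training method, comprising:
    receiving, by a first device, a first message from a second device, wherein the first message indicates that federated learning training is terminated or suspended; and
    performing, by the first device, a first operation based on the first message, wherein the first device comprises a federated learning client and the second device comprises a federated learning server.
  2. The method according to claim 1, wherein the first message comprises at least one of the following:
    indication information of federated learning training termination;
    indication information of federated learning training suspension;
    a model identifier or identification information of a federated learning model;
    model information of the federated learning model;
    gradient information of the federated learning model;
    task identification information, wherein the task identification information indicates a task category for which the federated learning model is used;
    task correlation identification information, wherein the task correlation identification information indicates a target federated learning task;
    cause information, wherein the cause information indicates why the second device sends the first message; and
    suggestion information, wherein the suggestion information indicates an operation to be performed by the first device after receiving the first message.
  3. The method according to claim 2, wherein the cause information indicates at least one of the following:
    the federated learning process has ended; and
    the federated learning process has been interrupted.
  4. The method according to any one of claims 1 to 3, wherein the first operation comprises at least one of the following:
    updating a local federated learning model;
    receiving a federated learning model;
    deleting the local federated learning model used in earlier local federated learning training; and
    stopping local federated learning training.
  5. The method according to claim 4, wherein the first operation comprises the updating of the local federated learning model and/or the receiving of the federated learning model, and after the first device receives the first message from the second device, the method further comprises:
    saving, by the first device, the federated learning model, wherein the federated learning model is available for use by the first device.
  6. The method according to claim 1, wherein before the first device receives the first message from the second device, the method further comprises:
    receiving, by the first device, a federated learning training request message from the second device; and
    sending, by the first device, a response message to the second device, wherein the response message comprises request information for obtaining a federated learning model.
  7. The method according to claim 1, wherein before the first device receives the first message from the second device, the method further comprises:
    sending, by the first device, a second message to the second device after completing local federated learning training, wherein the second message comprises a result of the local federated learning training and request information for obtaining a federated learning model.
  8. The method according to claim 6 or 7, wherein the request information comprises at least one of the following:
    first request information, used to request the federated learning model;
    second request information, used to request model information of the federated learning model; and
    third request information, used to request gradient information of the federated learning model.
  9. The method according to any one of claims 1 to 8, wherein the federated learning model comprises a final global model or an updated global model.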
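  10. A model training method, comprising:
    sending, by a second device, a first message to a first device, wherein the first message indicates that federated learning training is terminated or suspended;
    wherein the first device comprises a federated learning client and the second device comprises a federated learning server.
  11. The method according to claim 10, wherein the first message comprises at least one of the following:
    indication information of federated learning training termination;
    indication information of federated learning training suspension;
    a model identifier or identification information of a federated learning model;
    model information of the federated learning model;
    gradient information of the federated learning model;
    task identification information, wherein the task identification information indicates a task category for which the federated learning model is used;
    task correlation identification information, wherein the task correlation identification information indicates a target federated learning task;
    cause information, wherein the cause information indicates why the second device sends the first message; and
    suggestion information, wherein the suggestion information indicates an operation to be performed by the first device after receiving the first message.
  12. The method according to claim 11, wherein the cause information indicates at least one of the following:
    the federated learning process has ended; and
    the federated learning process has been interrupted.
  13. The method according to claim 11, wherein the suggestion information instructs the first device to perform at least one of the following after receiving the first message:
    updating a local federated learning model;
    receiving a federated learning model;
    deleting the local federated learning model used in earlier local federated learning training; and
    stopping local federated learning training.
  14. The method according to claim 10, wherein before the second device sends the first message to the first device, the method further comprises:
    sending, by the second device, a federated learning training request message to the first device; and
    receiving, by the second device, a response message from the first device, wherein the response message comprises request information for obtaining a federated learning model.
  15. The method according to claim 10, wherein before the second device sends the first message to the first device, the method further comprises:
    receiving, by the second device, a second message from the first device, wherein the second message comprises a result of local federated learning training of the first device and request information for obtaining a federated learning model.
  16. The method according to claim 14 or 15, wherein the sending, by the second device, of the first message to the first device comprises:
    sending, by the second device, the first message according to the request information for obtaining the federated learning model, wherein the first message contains at least one of the following:
    model information of the federated learning model; and
    gradient information of the federated learning model;
    wherein the federated learning model comprises a final global model or an updated global model.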
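  17. A model training apparatus, applied to a first device, comprising:
    a receiving module, configured to receive a first message from a second device, wherein the first message indicates that federated learning training is terminated or suspended; and
    a processing module, configured to perform a first operation based on the first message, wherein the first device comprises a federated learning client and the second device comprises a federated learning server.
  18. The apparatus according to claim 17, wherein the first message comprises at least one of the following:
    indication information of federated learning training termination;
    indication information of federated learning training suspension;
    a model identifier or identification information of a federated learning model;
    model information of the federated learning model;
    gradient information of the federated learning model;
    task identification information, wherein the task identification information indicates a task category for which the federated learning model is used;
    task correlation identification information, wherein the task correlation identification information indicates a target federated learning task; and
    suggestion information, wherein the suggestion information indicates an operation to be performed by the first device after receiving the first message.
  19. The apparatus according to claim 17 or 18, wherein the first operation comprises at least one of the following:
    updating a local federated learning model;
    receiving a federated learning model;
    deleting the local federated learning model used in earlier local federated learning training; and
    stopping local federated learning training.
  20. The apparatus according to claim 19, wherein the first operation comprises the updating of the local federated learning model and/or the receiving of the federated learning model, and the processing module is further configured to save the federated learning model, wherein the federated learning model is available for use by the first device.
  21. The apparatus according to claim 17, wherein
    the receiving module is further configured to receive a federated learning training request message from the second device; and
    the apparatus further comprises a sending module, configured to send a response message to the second device, wherein the response message comprises request information for obtaining a federated learning model.
  22. The apparatus according to claim 17, wherein the apparatus further comprises a sending module, configured to send a second message to the second device after local federated learning training is complete, wherein the second message comprises a result of the local federated learning training and request information for obtaining a federated learning model.
  23. The apparatus according to any one of claims 17 to 22, wherein the federated learning model comprises a final global model or an updated global model.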
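  24. A model training apparatus, applied to a second device, comprising:
    a sending module, configured to send a first message to a first device, wherein the first message indicates that federated learning training is terminated or suspended;
    wherein the first device comprises a federated learning client and the second device comprises a federated learning server.
  25. The apparatus according to claim 24, wherein the first message comprises at least one of the following:
    indication information of federated learning training termination;
    indication information of federated learning training suspension;
    a model identifier or identification information of a federated learning model;
    model information of the federated learning model;
    gradient information of the federated learning model;
    task identification information, wherein the task identification information indicates a task category for which the federated learning model is used;
    task correlation identification information, wherein the task correlation identification information indicates a target federated learning task; and
    suggestion information, wherein the suggestion information indicates an operation to be performed by the first device after receiving the first message.
  26. The apparatus according to claim 24, wherein
    the sending module is further configured to send a federated learning training request message to the first device; and
    the apparatus further comprises a receiving module, configured to receive a response message from the first device, wherein the response message comprises request information for obtaining a federated learning model.
  27. The apparatus according to claim 24, wherein the apparatus further comprises a receiving module, configured to receive a second message from the first device, wherein the second message comprises a result of local federated learning training of the first device and request information for obtaining a federated learning model.
  28. A terminal, comprising a processor and a memory, wherein the memory stores a program or instructions executable on the processor, and the program or instructions, when executed by the processor, implement the steps of the method according to any one of claims 1 to 16.
  29. A network-side device, comprising a processor and a memory, wherein the memory stores a program or instructions executable on the processor, and the program or instructions, when executed by the processor, implement the steps of the method according to any one of claims 1 to 16.
  30. A readable storage medium, storing a program or instructions, wherein the program or instructions, when executed by a processor, implement the steps of the method according to any one of claims 1 to 16.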
PCT/CN2023/136968 2022-12-08 2023-12-07 模型训练方法、终端及网络侧设备 Ceased WO2024120470A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP23900043.3A EP4633104A4 (en) 2022-12-08 2023-12-07 MODEL TRAINING METHOD, TERMINAL AND NETWORK-SIDE DEVICE
JP2025532177A JP2025538004A (ja) 2022-12-08 2023-12-07 モデルトレーニング方法、端末及びネットワーク側機器
US19/229,729 US20250299110A1 (en) 2022-12-08 2025-06-05 Model training method, terminal, and network-side device

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202211579377 2022-12-08
CN202211579377.8 2022-12-08
CN202310372773.1A CN118175052A (zh) 2022-12-08 2023-04-07 模型训练方法、终端及网络侧设备
CN202310372773.1 2023-04-07

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US19/229,729 Continuation US20250299110A1 (en) 2022-12-08 2025-06-05 Model training method, terminal, and network-side device

Publications (1)

Publication Number Publication Date
WO2024120470A1 true WO2024120470A1 (zh) 2024-06-13

Family

ID=91347768

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/136968 Ceased WO2024120470A1 (zh) 2022-12-08 2023-12-07 模型训练方法、终端及网络侧设备

Country Status (5)

Country Link
US (1) US20250299110A1 (zh)
EP (1) EP4633104A4 (zh)
JP (1) JP2025538004A (zh)
CN (1) CN118175052A (zh)
WO (1) WO2024120470A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2026011431A1 (zh) * 2024-07-12 2026-01-15 北京小米移动软件有限公司 信息处理方法、节点、通信设备、通信系统及存储介质

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2026011435A1 (zh) * 2024-07-12 2026-01-15 北京小米移动软件有限公司 模型训练方法、节点、通信设备、通信系统及存储介质
CN118504717B (zh) * 2024-07-19 2024-10-22 浙江霖研精密科技有限公司 基于梯度正交化的跨部门联邦学习方法、系统及存储介质

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113869533A (zh) * 2021-09-29 2021-12-31 深圳前海微众银行股份有限公司 联邦学习建模优化方法、设备、可读存储介质及程序产品
WO2022099512A1 (zh) * 2020-11-11 2022-05-19 北京小米移动软件有限公司 数据处理方法及装置、通信设备和存储介质
CN115242756A (zh) * 2021-04-01 2022-10-25 中国移动通信有限公司研究院 一种联邦学习业务的处理方法、装置、设备以及系统

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10621019B1 (en) * 2017-11-22 2020-04-14 Amazon Technologies, Inc. Using a client to manage remote machine learning jobs

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
AIHUA LI, CHINA MOBILE: "Horizontal Federated Learning among Multiple NWDAFs in TS 23.288", 3GPP DRAFT; S2-2211435; TYPE CR; CR 0582; FS_ENA_PH3, 3RD GENERATION PARTNERSHIP PROJECT (3GPP), MOBILE COMPETENCE CENTRE ; 650, ROUTE DES LUCIOLES ; F-06921 SOPHIA-ANTIPOLIS CEDEX ; FRANCE, vol. 3GPP SA 2, no. Toulouse, FR; 20221114 - 20221118, 22 November 2022 (2022-11-22), Mobile Competence Centre ; 650, route des Lucioles ; F-06921 Sophia-Antipolis Cedex ; France, XP052225434 *
See also references of EP4633104A4 *
VIVIAN CHONG, VIVO, NTT DOCOMO, ERICSSON, LG ELECTRONICS, NOKIA, NOKIA SHANGHAI-BELL, HUAWEI, HISILICON: "Updates for Nnwdaf_MLModelTraining service", 3GPP DRAFT; S2-2307882; TYPE CR; CR 0773; ENA_PH3, 3RD GENERATION PARTNERSHIP PROJECT (3GPP), MOBILE COMPETENCE CENTRE ; 650, ROUTE DES LUCIOLES ; F-06921 SOPHIA-ANTIPOLIS CEDEX ; FRANCE, vol. 3GPP SA 2, no. Berlin, DE; 20230522 - 20230526, 30 May 2023 (2023-05-30), Mobile Competence Centre ; 650, route des Lucioles ; F-06921 Sophia-Antipolis Cedex ; France, XP052382713 *

Also Published As

Publication number Publication date
JP2025538004A (ja) 2025-11-20
EP4633104A1 (en) 2025-10-15
CN118175052A (zh) 2024-06-11
EP4633104A4 (en) 2026-01-28
US20250299110A1 (en) 2025-09-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23900043

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2025532177

Country of ref document: JP

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 2025532177

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 2023900043

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2023900043

Country of ref document: EP

Effective date: 20250708

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112025011463

Country of ref document: BR

WWP Wipo information: published in national office

Ref document number: 2023900043

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 112025011463

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20250605