WO2024197810A1 - Data processing method, model training method, and related equipment - Google Patents

Info

Publication number
WO2024197810A1
WO2024197810A1 PCT/CN2023/085467 CN2023085467W WO2024197810A1 WO 2024197810 A1 WO2024197810 A1 WO 2024197810A1 CN 2023085467 W CN2023085467 W CN 2023085467W WO 2024197810 A1 WO2024197810 A1 WO 2024197810A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
machine learning
learning model
module
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2023/085467
Other languages
English (en)
French (fr)
Inventor
张公正
徐晨
王坚
李榕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to EP23929396.2A (patent EP4668175A4)
Priority to CN202380092774.9A (patent CN120641917A)
Priority to PCT/CN2023/085467 (patent WO2024197810A1)
Publication of WO2024197810A1
Legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L27/00 Modulated-carrier systems
    • H04L27/26 Systems using multi-frequency codes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/16 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions

Definitions

  • the present application relates to the field of communications, and in particular to a data processing method, a model training method, and related equipment.
  • artificial intelligence (AI) is the theory, method, technology, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results.
  • artificial intelligence is also the study of the design principles and implementation methods of various intelligent machines, so that machines have the functions of perception, reasoning, and decision-making.
  • the data to be processed can be input into a machine learning model to obtain processed data output by the model, wherein the processed data includes T sub-data, where T is an integer greater than or equal to 1.
  • T can be flexibly determined according to actual conditions.
  • the number of output channels of each machine learning model is fixed, which means that a given machine learning model can only output a fixed number of sub-data.
  • when T changes, another machine learning model must be used for processing, which requires storing multiple machine learning models and results in a large storage overhead. A machine learning model that is compatible with multiple values of T is therefore needed.
  • the embodiments of the present application provide a data processing method, a model training method, and related equipment.
  • each time a module in a first machine learning model is called at least once, one sub-data can be obtained.
  • the number of calls of the module in the first machine learning model can be flexibly adjusted according to the value of T to generate T sub-data. In this way, the first machine learning model can be compatible with multiple values of T, and there is no need to store multiple machine learning models, thereby reducing the storage space overhead.
  • the first aspect of the present application provides a data processing method, which can apply artificial intelligence technology to the field of communications.
  • the method is applied to the first device side.
  • the first device can be a device or a component that can be configured in the device (such as a chip, a chip system, etc.).
  • the method includes: the first device obtains the value of T, T represents the number of sub-data included in the output data of the first machine learning model, and T is an integer greater than or equal to 1; the first device inputs the first data into the first machine learning model to obtain the second data generated by the first machine learning model, and the second data includes T sub-data.
  • the first machine learning model includes one or more modules, and each time a module in the first machine learning model is called at least once, a sub-data is obtained; illustratively, each such call may involve a single module in the first machine learning model or multiple modules.
  • because a sub-data is obtained each time a module in the first machine learning model is called at least once, once the value of T has been obtained, the number of calls to the module can be flexibly adjusted according to the value of T to generate T sub-data. The first machine learning model is therefore compatible with multiple values of T, and it is no longer necessary to store multiple machine learning models, which reduces the storage overhead.
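The call-count idea above can be illustrated with a minimal sketch. It is not the applicant's implementation: `generate_second_data` and `toy_module` are hypothetical names, and the toy module merely stands in for a trained module. The only point shown is that a single stored model yields T sub-data by varying the number of calls.

```python
from typing import Callable, List

Vector = List[float]


def generate_second_data(first_data: Vector, t: int,
                         call_module: Callable[[Vector, int], Vector]) -> List[Vector]:
    """Produce T sub-data by calling the model's module(s) once per sub-data."""
    if t < 1:
        raise ValueError("T must be an integer greater than or equal to 1")
    # The same stored model serves every T; only the number of calls changes.
    return [call_module(first_data, index) for index in range(t)]


def toy_module(data: Vector, index: int) -> Vector:
    """Hypothetical stand-in for a trained module; it only scales the input."""
    return [(index + 1) * x for x in data]


second_data = generate_second_data([1.0, 2.0], t=3, call_module=toy_module)
```

Changing `t` changes only the length of `second_data`; no second model needs to be stored.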
  • the function of the first machine learning model includes any one or a combination of the following: encoding, modulation, and generating a reference signal.
  • if the function of the first machine learning model is encoding, the first data is the data that needs to be encoded, and the second data is the encoded data.
  • if the function of the first machine learning model is modulation, the first data is the data that needs to be modulated, and the second data is the modulated data.
  • if the function of the first machine learning model is to generate a reference signal, the first data may be the index numbers of multiple reference signals, and the second data may be a reference signal.
  • if the function of the first machine learning model is encoding and modulation, the first data is the data that needs to be encoded and modulated, and the second data is the encoded and modulated data, and so on.
  • the multiple modules in the first machine learning model include a first module and at least one second module, and the first device inputs the first data into the first machine learning model to obtain the second data output by the first machine learning model, including: the first device inputs the first data into the first module to obtain the first sub-data generated by the first module, and the first sub-data is one of the T sub-data; the first feature information of the first data is input into the second module to obtain the second sub-data generated by the second module, and the second sub-data is one of the T sub-data.
  • the first feature information includes the feature information generated when the module in the first machine learning model was last called for data processing; illustratively, the first feature information can be the feature information generated when the first module in the first machine learning model was last called for data processing, or it can be the feature information generated when the second module in the first machine learning model was last called for data processing.
  • the first machine learning model includes a first module and at least one second module. The input of the first module is the entire first data, and the input of the second module is the feature information obtained the last time a module of the first machine learning model was called. Consequently, when the second module is called for the first time, its input is the feature information obtained when the first module processed the entire first data.
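The chaining of feature information between the first module and the second module can be sketched as follows. This is an illustrative assumption, not the disclosed network: both modules are toy arithmetic stand-ins, and the names `first_module`, `second_module`, and `run_model` are hypothetical. Each call returns a sub-data together with the feature information that feeds the next call.

```python
from typing import List, Tuple

Vector = List[float]


def first_module(first_data: Vector) -> Tuple[Vector, Vector]:
    """Toy first module: consumes the entire first data."""
    features = [2.0 * x for x in first_data]   # stand-in feature extraction
    sub_data = [f + 1.0 for f in features]     # stand-in output head
    return sub_data, features


def second_module(features: Vector) -> Tuple[Vector, Vector]:
    """Toy second module: consumes the feature information of the last call."""
    new_features = [0.5 * f for f in features]
    sub_data = [f - 1.0 for f in new_features]
    return sub_data, new_features


def run_model(first_data: Vector, t: int) -> List[Vector]:
    """Return T sub-data: one from the first module, T-1 from the second."""
    if t < 1:
        raise ValueError("T must be an integer greater than or equal to 1")
    sub_data, features = first_module(first_data)
    outputs = [sub_data]
    for _ in range(t - 1):
        # The input is always the feature information from the previous call.
        sub_data, features = second_module(features)
        outputs.append(sub_data)
    return outputs
```

With `t=1` only the first module runs; larger values of `t` simply extend the chain.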
  • the multiple modules in the first machine learning model include a first module and at least one third module, and the first device inputs the first data into the first machine learning model to obtain the second data output by the first machine learning model, including: the first device inputs the first data into the first module and generates the first sub-data through the first module, where the first sub-data is one of the T sub-data, and the process of generating the first sub-data includes extracting features from the first data, that is, the feature information of the first data is obtained while the first module generates the first sub-data; the first device calls the third module multiple times to obtain the third sub-data generated by the third module, where the third sub-data is one of the T sub-data, the input of the third module includes the feature information of the first data, and the feature information of the first data is updated multiple times in the process of calling the third module multiple times.
  • when the first device calls the third module for the first time, the feature information of the first data input to the third module is obtained in the process of generating the first sub-data by the first module; when the first device calls the third module for the second time and thereafter, the feature information of the first data input to the third module is obtained in the previous call to the third module.
  • each third sub-data is generated according to the most recently updated feature information of the first data. Repeatedly updating the feature information helps the model build a more thorough understanding of the first data, thereby generating sub-data with better performance.
  • the first device inputs the first feature information into the second module to obtain second sub-data generated by the second module, including: the first device linearly transforms the first feature information through the second module, and processes it with a first activation function to obtain the transformed feature information; linearly transforms the transformed feature information, and processes it with a second activation function to obtain the second sub-data.
  • because the above method is simple and easy to implement, it reduces the computing resources consumed in generating the second data; in addition, the second module described above uses relatively few parameters, which reduces the communication resources consumed when transmitting the parameters of the first machine learning model.
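The structure of the second module (linear transformation, first activation function, linear transformation, second activation function) can be sketched in plain Python. The source does not name the activation functions; ReLU and tanh are assumptions made here for illustration, and the weights in the example are arbitrary.

```python
import math
from typing import Callable, List

Vector = List[float]
Matrix = List[Vector]
Activation = Callable[[Vector], Vector]


def linear(weights: Matrix, bias: Vector, x: Vector) -> Vector:
    """Compute weights @ x + bias using plain lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values: Vector) -> Vector:
    return [max(0.0, v) for v in values]


def tanh(values: Vector) -> Vector:
    return [math.tanh(v) for v in values]


def second_module(first_feature_information: Vector,
                  w1: Matrix, b1: Vector, w2: Matrix, b2: Vector,
                  first_activation: Activation = relu,    # assumed choice
                  second_activation: Activation = tanh    # assumed choice
                  ) -> Vector:
    """Linear -> activation -> linear -> activation, yielding one sub-data."""
    transformed = first_activation(linear(w1, b1, first_feature_information))
    return second_activation(linear(w2, b2, transformed))


# Arbitrary example weights: 2 features -> 2 hidden units -> 1 output value.
second_sub_data = second_module([2.0, 1.0],
                                w1=[[1.0, -1.0], [0.5, 0.5]], b1=[0.0, 0.0],
                                w2=[[1.0, 1.0]], b2=[0.0])
```

The parameter count is limited to two weight matrices and two bias vectors, which is consistent with the small-parameter argument made above.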
  • the at least one second module includes a plurality of second modules, wherein at least two of the plurality of second modules use different parameters; that is, the first device can generate a second sub-data each time it calls a second module, but different second modules may be called in the process of generating the T-1 second sub-data.
  • the meaning of "two second modules using different parameters" may include any of the following differences: the same type of parameters are used in the two second modules, but the parameter values used in the two second modules are not exactly the same; or the types of parameters used in the two second modules are not exactly the same, etc.
  • multiple second modules can be used in the first machine learning model, and at least two of the multiple second modules use different parameters, that is, T-1 second sub-data are generated by different second modules, which is beneficial to the matching between the parameters of the second module and the generated second sub-data, and thus is beneficial to obtaining second data with better performance.
  • the first device inputs the first data into the first module to obtain the first sub-data generated by the first module, which may include: the first device obtains the first sub-data generated by the first module by calling the first module once or multiple times.
  • each time the first device calls the first module to process the input data it may include: the first device linearly transforms the input data through the first module, and processes it with a third activation function to obtain the transformed input data; linearly transforms the transformed input data, and processes it with a fourth activation function to obtain the processing result of the first module.
  • the input data of the first module may be the first data or the feature information of the first data.
  • before the first device inputs the first data into the first machine learning model, the method further includes: the first device obtains the data to be processed and the value of H, where H is an integer greater than or equal to 1 and indicates the length of the first data; if the length of the data to be processed is less than H, the data to be processed is padded to obtain the first data, whose length is H.
  • because the data to be processed is padded to length H before being input into the first machine learning model, the model always processes first data of length H regardless of the length of the data to be processed. This is conducive to compatibility with data to be processed of any length, and it reduces the difficulty of data processing for the first machine learning model, which helps to obtain second data with better performance.
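The padding step can be sketched as below. The pad value of 0 and the function name `pad_to_length` are assumptions; the source only states that shorter data is padded to length H. Input longer than H is not addressed by the text, so this sketch rejects it.

```python
from typing import List


def pad_to_length(data_to_process: List[int], h: int, pad_value: int = 0) -> List[int]:
    """Pad the data to be processed so that the first data has length H."""
    if h < 1:
        raise ValueError("H must be an integer greater than or equal to 1")
    if len(data_to_process) > h:
        # Not covered by the description above; rejected in this sketch.
        raise ValueError("data to be processed is longer than H")
    return list(data_to_process) + [pad_value] * (h - len(data_to_process))


first_data = pad_to_length([1, 0, 1], h=5)
```

Data whose length already equals H is returned unchanged (as a copy).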
  • the first data includes the data to be processed and padding data. The padding data includes first identification information, which is used to identify the value of T and/or the value of K, where K is the length of the data to be processed and K is an integer greater than or equal to 1.
  • that is, the first identification information can identify both the value of T and the value of K, only the value of T, or only the value of K.
  • the first device can use the first function to process the value of T and/or the value of K to obtain the first identification information.
  • the conditions that the first function needs to meet include: limiting the value of the first identification information within a preset range, and being able to map different T values and/or K values to different values, that is, the value generated by the first function can uniquely identify a certain T value and/or K value, or in other words, the value generated by the first function can distinguish different T values and/or K values.
  • the first data carries first identification information for identifying the value of T and/or the value of K.
  • the first machine learning model can therefore process the first data according to the value of T and/or the value of K, that is, according to the length of the output data of the first machine learning model and/or the actual length of the data to be processed, which is conducive to obtaining second data with better performance.
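One possible first function that meets the two stated conditions (output confined to a preset range, distinct outputs for distinct T and/or K values) is sketched here. The bounds `T_MAX` and `K_MAX`, the target range (0, 1], and the formula itself are assumptions chosen for illustration; the source does not specify the function.

```python
T_MAX = 4   # assumed largest supported T
K_MAX = 8   # assumed largest supported K


def first_function(t: int, k: int) -> float:
    """Map (T, K) to a unique value in the preset range (0, 1]."""
    if not (1 <= t <= T_MAX and 1 <= k <= K_MAX):
        raise ValueError("T or K is outside the supported range")
    # Mixed-radix index: every (t, k) pair gets its own integer in [0, T_MAX*K_MAX).
    index = (t - 1) * K_MAX + (k - 1)
    # Normalize so the identification value stays inside (0, 1].
    return (index + 1) / (T_MAX * K_MAX)


first_identification_information = first_function(t=2, k=3)
```

Because the index is unique per pair and the divisor is fixed, two different (T, K) pairs never map to the same value.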
  • the size of the parameters in the first machine learning model is related to the value of H and the value of G, where G is the length of each sub-data.
  • the size of the parameters in the first machine learning model is designed according to the length of the first data and the length of each sub-data in the T sub-data, which is conducive to reducing the number of parameters in the first machine learning model while meeting the output requirements, and is conducive to further reducing the communication resources consumed by transmitting the parameters of the first machine learning model.
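To make the dependence on H and G concrete, the following sketch counts the parameters of a two-layer module that maps an input of length H to a sub-data of length G. The hidden width is a hypothetical design choice not given in the source, and the two-layer shape is borrowed from the linear/activation structure described earlier.

```python
def two_layer_parameter_count(h: int, g: int, hidden_width: int) -> int:
    """Weights and biases of: linear(H -> hidden) followed by linear(hidden -> G)."""
    first_layer = hidden_width * h + hidden_width    # weight matrix + bias
    second_layer = g * hidden_width + g              # weight matrix + bias
    return first_layer + second_layer


parameter_count = two_layer_parameter_count(h=16, g=4, hidden_width=8)
```

Under these assumptions the count grows linearly in both H and G, so sizing the model to the actual H and G keeps the amount of parameter data to transmit small.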
  • the parameters corresponding to the first machine learning model and/or the identification information of the aforementioned parameters may be carried in signaling to enable the parameters corresponding to the first machine learning model and/or the identification information of the aforementioned parameters to be transmitted between different devices.
  • the parameters corresponding to the first machine learning model are carried in one or more of the following information: downlink control information DCI, uplink control information UCI, sidelink control information SCI, radio resource control RRC signaling, or media access control control element MAC CE.
  • the identification information of the parameters is carried in any one or more of the following information: DCI, UCI, SCI, RRC signaling, MAC CE, physical broadcast channel PBCH, or physical random access channel PRACH.
  • transmitting the identification information of the aforementioned at least one set of parameters and/or each set of parameters in a signaling has higher transmission efficiency and consumes less computer resources; in addition, the present solution provides a variety of signaling that can be used to transmit the identification information of the aforementioned at least one set of parameters and/or each set of parameters, thereby improving the implementation flexibility of the present solution.
  • the second device is a receiving end of the second data, and the second device contains multiple sets of parameters corresponding to the first machine learning model and identification information of each set of parameters, and the method further includes: the first device sends second identification information to the second device, and the second identification information is used to indicate a set of parameters adopted by the first machine learning model in the first device.
  • the second device contains multiple sets of parameters of the first machine learning model and identification information of each set of parameters. The first device only needs to send the second identification information to the second device, and the second device can know which set of parameters is adopted by the first machine learning model in the first device, and the communication resources occupied by transmitting the second identification information are relatively small, which is conducive to reducing the consumed communication resources.
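The parameter-set lookup on the second device can be sketched as a simple table indexed by the second identification information. The table layout, the integer identifiers, and the name `select_parameter_set` are illustrative assumptions.

```python
from typing import Dict, List

ParameterSet = Dict[str, List[float]]


def select_parameter_set(second_identification_information: int,
                         stored_sets: Dict[int, ParameterSet]) -> ParameterSet:
    """Return the parameter set that the first device reports it is using."""
    if second_identification_information not in stored_sets:
        raise KeyError("unknown parameter-set identifier")
    return stored_sets[second_identification_information]


# Hypothetical contents stored in advance on the second device.
stored_sets = {
    0: {"weights": [0.1, 0.2]},
    1: {"weights": [0.3, 0.4]},
}
active_set = select_parameter_set(1, stored_sets)
```

Only the small identifier crosses the air interface; the parameter values themselves are never retransmitted.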
  • in a second aspect, the present application provides a data processing method that can apply artificial intelligence technology to the field of communications.
  • the method is applied to the second device side.
  • the second device can be a device or a component that can be configured in the device (such as a chip, a chip system, etc.).
  • the method includes: the second device obtains the second data, and then generates the first data based on the second data.
  • the second data includes T sub-data, T is an integer greater than or equal to 1, and the second data is generated by a first machine learning model in the first device.
  • the first machine learning model includes one or more modules, and each time a module in the first machine learning model is called at least once, a sub-data is obtained.
  • the second device can denoise the received signal and obtain the received second data (that is, the estimated first sub-data) from the denoised received signal.
  • for example, if the second data is encoded data, that is, the function of the first machine learning model is encoding, the denoised received signal is demodulated to obtain the received second data; for another example, if the second data is modulated data, that is, the function of the first machine learning model is modulation, or encoding and modulation, the denoised received signal can be directly determined as the second data.
  • the parameters corresponding to the first machine learning model are carried in one or more of the following information: downlink control information DCI, uplink control information UCI, sidelink control information SCI, radio resource control RRC signaling, or media access control control element MAC CE; and/or, the identification information of the parameters is carried in any one or more of the following information: DCI, UCI, SCI, RRC signaling, MAC CE, physical broadcast channel PBCH, or physical random access channel PRACH.
  • the second device has third data, which includes multiple groups of parameters corresponding to the first machine learning model and identification information of each group of parameters. The method further includes: the second device receives second identification information sent by the first device, and determines, according to the second identification information and the third data, the group of parameters used by the first machine learning model in the first device.
  • the second device generates the first data according to the second data, including: the second device can demodulate and/or decode the received second data according to a group of parameters used by the first machine learning model in the first device to generate estimated first data.
  • in a third aspect, the present application provides a model training method, which can apply artificial intelligence technology to the field of communications.
  • the method is applied to a training device, which can be a device or a component that can be configured in the device (such as a chip, a chip system, etc.), and the method includes: the training device obtains training data from a training data set, wherein the training data is used to obtain the first data and the value of T, and T is an integer greater than or equal to 1; illustratively, the training data may include the data to be processed and the value of T, and the data to be processed is used to obtain the first data, for example, the data to be processed is the same as the first data, or the first data is obtained after the data to be processed is padded; T indicates the number of sub-data included in the output data of the first machine learning model, and at least two training data in the training data set include different values of T.
  • the training device inputs the first data into the first machine learning model to obtain the second data generated by the first machine learning model, and the second data includes T sub-data, wherein the first machine learning model includes multiple modules, and each time a module in the first machine learning model is called at least once, a sub-data generated by the module is obtained; based on the second data and the loss function, the first machine learning model is trained to obtain the trained first machine learning model.
  • the second data is used to determine the signal to be sent, and the training device trains the first machine learning model based on the second data and the loss function, including: the training device obtains a received signal corresponding to the signal to be sent, and demodulates and/or decodes the received signal to obtain estimated data corresponding to the data to be processed; the training device trains the first machine learning model according to the estimated data and the loss function, and the loss function indicates the similarity between the estimated data and the data to be processed.
  • the training device obtaining the received signal corresponding to the signal to be sent may include: the training device multiplies the signal to be sent by the channel matrix and adds noise to the product to obtain the received signal; these steps simulate the transmission of the signal to be sent through the channel. Alternatively, the above steps are completed by two training devices in cooperation, in which case obtaining the received signal corresponding to the signal to be sent may include: the first training device sends the signal to be sent to the second training device, and the second training device receives the received signal.
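The simulated transmission described above (multiply by the channel matrix, then add noise) can be sketched as follows. Real-valued signals and Gaussian noise are simplifying assumptions; the source specifies neither the noise distribution nor whether the signal is complex. The name `simulate_channel` is hypothetical.

```python
import random
from typing import List, Optional

Vector = List[float]
Matrix = List[Vector]


def simulate_channel(signal_to_send: Vector, channel_matrix: Matrix,
                     noise_std: float,
                     rng: Optional[random.Random] = None) -> Vector:
    """Return channel_matrix @ signal_to_send + noise."""
    rng = rng or random.Random()
    received = []
    for row in channel_matrix:
        if len(row) != len(signal_to_send):
            raise ValueError("channel matrix width must match signal length")
        clean = sum(h * x for h, x in zip(row, signal_to_send))
        # Gaussian noise is an assumption; zero std gives a noiseless channel.
        noise = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
        received.append(clean + noise)
    return received


received_signal = simulate_channel([1.0, -1.0],
                                   [[1.0, 0.0], [0.0, 2.0]],
                                   noise_std=0.0)
```

Passing a seeded `random.Random` instance makes a noisy run reproducible during training experiments.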
  • a specific implementation method for training the first machine learning model is provided when the functions of the first machine learning model include encoding and/or modulation, which reduces the difficulty of implementing this solution, and the loss function uses the similarity between the estimated data and the data to be processed, that is, the goal of the loss function is to obtain estimated data with better performance.
  • the loss function is more in line with the actual needs when sending data between devices, and the second data output by the trained first machine learning model is more in line with actual needs.
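A loss that scores the similarity between the estimated data and the data to be processed could, for bit-valued data, be a binary cross-entropy. This choice is an assumption; the source does not name the loss. The sketch treats each estimate as the predicted probability that the corresponding bit is 1.

```python
import math
from typing import List


def binary_cross_entropy(data_to_process: List[int],
                         estimated_data: List[float],
                         eps: float = 1e-12) -> float:
    """Mean binary cross-entropy; lower means the estimate is more similar."""
    if not data_to_process or len(data_to_process) != len(estimated_data):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for bit, prob in zip(data_to_process, estimated_data):
        prob = min(max(prob, eps), 1.0 - eps)   # avoid log(0)
        total -= bit * math.log(prob) + (1 - bit) * math.log(1.0 - prob)
    return total / len(data_to_process)


good_loss = binary_cross_entropy([1, 0], [0.9, 0.1])
poor_loss = binary_cross_entropy([1, 0], [0.6, 0.4])
```

The closer the estimate is to the transmitted bits, the smaller the loss, which is the behavior the training objective above relies on.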
  • the training device trains the first machine learning model based on the second data and a loss function, including: the training device obtains a received reference signal corresponding to the reference signal, and generates predicted channel information according to the received reference signal corresponding to the reference signal; the training device trains the first machine learning model according to the loss function, and the loss function indicates the similarity between the predicted channel information and the correct channel information.
  • the training device acquiring the received reference signal corresponding to the reference signal may include: the training device multiplies the reference signal by a channel matrix, and adds the result of the multiplication to noise to obtain the received reference signal, and the aforementioned steps are to simulate the process of the reference signal being transmitted through a channel.
  • alternatively, the aforementioned steps are completed by two training devices in cooperation, in which case the training device acquiring the received reference signal corresponding to the reference signal may include: the first training device sends the reference signal to the second training device, and the second training device receives the received reference signal.
  • a specific implementation method for training the first machine learning model when the function of the first machine learning model is to generate a reference signal is also provided, which expands the application scenarios of this solution and improves the implementation flexibility of this solution.
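For the reference-signal case, the training objective compares predicted channel information with the correct channel information. The sketch below assumes a single complex channel coefficient, a least-squares estimator, and a squared-error loss; none of these choices is stated in the source, and the function names are hypothetical.

```python
from typing import List


def least_squares_channel_estimate(reference: List[complex],
                                   received: List[complex]) -> complex:
    """Estimate a scalar channel h from received = h * reference + noise."""
    energy = sum(abs(r) ** 2 for r in reference)
    if energy == 0:
        raise ValueError("reference signal has zero energy")
    return sum(r.conjugate() * y for r, y in zip(reference, received)) / energy


def channel_loss(predicted: complex, correct: complex) -> float:
    """Squared error between predicted and correct channel information."""
    return abs(predicted - correct) ** 2


correct_channel = 0.5 - 0.5j
reference_signal = [1 + 0j, 1j, -1 + 0j]
# Noiseless transmission for the example; training would add noise here.
received_reference = [correct_channel * r for r in reference_signal]
predicted_channel = least_squares_channel_estimate(reference_signal,
                                                   received_reference)
loss = channel_loss(predicted_channel, correct_channel)
```

In training, this loss would drive the model toward reference signals from which the channel can be recovered accurately.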
  • the training device can also be used to execute the steps performed by the first device in the first aspect and various possible implementation methods of the first aspect.
  • the specific implementation methods, meanings of terms and beneficial effects brought about by the steps in various possible implementation methods of the third aspect can all be referred to the first aspect and will not be repeated here.
  • the present application provides a data processing device that can apply artificial intelligence technology to the field of communications, the data processing device comprising a processing module; wherein the processing module is used to obtain a value of T, where T is an integer greater than or equal to 1, and T represents the number of sub-data included in the output data of the first machine learning model;
  • the processing module is also used to input the first data into the first machine learning model to obtain second data generated by the first machine learning model, where the second data includes T sub-data, wherein the first machine learning model includes one or more modules, and each time a module in the first machine learning model is called at least once, one sub-data is obtained.
  • the functionality of the first machine learning model includes any one or a combination of the following: encoding, modulating, or generating a reference signal.
  • the multiple modules in the first machine learning model include a first module and at least one second module
  • the processing module is specifically used to: input the first data into the first module to obtain first sub-data generated by the first module, and the first sub-data is one of T sub-data; input the first feature information into the second module to obtain second sub-data generated by the second module, wherein the first feature information includes the feature information generated when the module in the first machine learning model was last called for data processing, the second sub-data is one of the T sub-data, and the module in the first machine learning model that was last called is the first module or the second module.
  • the multiple modules in the first machine learning model include a first module and at least one third module
  • the processing module is specifically used to: input the first data into the first module, generate first sub-data through the first module, the first sub-data is one of T sub-data, and the process of generating the first sub-data through the first module includes extracting features of the first data; call the third module multiple times to obtain third sub-data generated by the third module, the third sub-data is one of T sub-data, wherein the input of the third module includes feature information of the first data, and the feature information of the first data is updated multiple times in the process of calling the third module multiple times.
  • the processing module is specifically used to: perform a linear transformation on the first feature information through the second module, and process it with a first activation function to obtain the transformed feature information; perform a linear transformation on the transformed feature information, and process it with a second activation function to obtain second sub-data.
  • the at least one second module includes a plurality of second modules, wherein parameters adopted by at least two second modules among the plurality of second modules are different.
  • the processing module is further used to obtain the data to be processed and the value of H, where H is an integer greater than or equal to 1, and H indicates the length of the first data; the processing module is further used to pad the data to be processed, if the length of the data to be processed is less than H, to obtain the first data, where the length of the first data is H.
  • the first data includes data to be processed and padding data
  • the padding data includes first identification information
  • the first identification information is used to identify the value of T and/or the value of K
  • K is the length of the data to be processed
  • K is an integer greater than or equal to 1.
  • the size of the parameters in the first machine learning model is related to the value of H and the value of G, where G is the length of each sub-data.
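  • The padding step described above can be sketched as follows. This is a minimal illustration, not the method defined by the application: the function name `pad_to_length`, the choice of zero as the fill value, and the layout in which the last padding positions carry K in binary as the first identification information are all assumptions.

```python
def pad_to_length(data, H, pad_value=0):
    """Pad `data` (length K) to length H.

    Hypothetical layout: padding positions are filled with `pad_value`,
    and the last padding positions carry K in binary, standing in for
    the "first identification information". The source does not specify
    how the identification information is encoded.
    """
    K = len(data)
    if K < 1 or H < 1:
        raise ValueError("K and H must be integers >= 1")
    if K > H:
        raise ValueError("data to be processed is longer than H")
    padding = [pad_value] * (H - K)
    id_bits = [int(b) for b in format(K, "b")]
    # Embed K only when the padding is long enough to hold it.
    if len(id_bits) <= len(padding):
        padding[len(padding) - len(id_bits):] = id_bits
    return list(data) + padding


first_data = pad_to_length([1, 0, 1], H=8)
print(first_data)
```

  • In this sketch the identification bits are silently omitted when the padding is too short to hold them; a real design would need an agreed rule for that case.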
  • the parameters corresponding to the first machine learning model are carried in one or more of the following information: downlink control information DCI, uplink control information UCI, sidelink control information SCI, radio resource control RRC signaling, or media access control control element MAC CE; and/or, the identification information of the parameters is carried in any one or more of the following information: DCI, UCI, SCI, RRC signaling, MAC CE, physical broadcast channel PBCH, or physical random access channel PRACH.
  • the data processing device is applied to a first device, and a second device is the receiving end of the second data.
  • the second device contains multiple groups of parameters corresponding to the first machine learning model and identification information of each group of parameters.
  • the data processing device also includes: a transceiver module, which is used to send second identification information to the second device, and the second identification information is used to indicate a set of parameters adopted by the first machine learning model in the first device.
  • the present application provides a data processing device that can apply artificial intelligence technology to the field of communications, wherein the data processing device includes a processing module; wherein the processing module is used to obtain second data; and generate first data based on the second data.
  • the second data includes T sub-data, where T is an integer greater than or equal to 1, and the second data is generated by a first machine learning model in a first device, and the first machine learning model includes one or more modules, and each module in the first machine learning model is called at least once to obtain a sub-data.
  • the parameters corresponding to the first machine learning model are carried in one or more of the following information: downlink control information DCI, uplink control information UCI, sidelink control information SCI, radio resource control RRC signaling, or media access control control element MAC CE; and/or, the identification information of the parameters is carried in any one or more of the following information: DCI, UCI, SCI, RRC signaling, MAC CE, physical broadcast channel PBCH, or physical random access channel PRACH.
  • the data processing device is applied to a second device, the second device has third data, the third data includes multiple groups of parameters corresponding to the first machine learning model and identification information of each group of parameters, and the data processing device further includes: a transceiver module, which is used to receive the second identification information sent by the first device; a processing module, which is also used to determine a group of parameters used by the first machine learning model in the first device according to the second identification information and the third data.
  • the processing module is specifically used to generate the first data according to a group of parameters used by the first machine learning model in the first device and the second data.
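  • The lookup performed by the second device can be sketched as follows. This is only an illustration under stated assumptions: the dictionary `THIRD_DATA`, the function `resolve_parameters`, and the placeholder entries for index 1 are hypothetical, and the source does not prescribe how the third data is stored.

```python
# Hypothetical "third data": groups of parameters of the first machine
# learning model, keyed by the identification information of each group.
# The string values are placeholders for real matrices.
THIRD_DATA = {
    0: {"U": "Matrix-U0", "W": "Matrix-W0", "V": "Matrix-V0"},
    1: {"U": "Matrix-U1", "W": "Matrix-W1", "V": "Matrix-V1"},
}


def resolve_parameters(third_data, second_identification):
    """Return the parameter group that the first device indicated."""
    try:
        return third_data[second_identification]
    except KeyError:
        raise ValueError(
            f"unknown parameter group: {second_identification!r}"
        ) from None


# The second device receives the second identification information (here 1)
# and determines which group the first device is using.
params = resolve_parameters(THIRD_DATA, 1)
print(params["U"])
```

  • Sending only an index rather than the parameters themselves keeps the signaling small, at the cost of both devices having to store the same groups in advance.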
  • the present application provides a model training device that can apply artificial intelligence technology to the field of communications, and the model training device includes a processing module; wherein the processing module is used to obtain training data from a training data set, wherein the training data is used to obtain first data and T values, T is an integer greater than or equal to 1, and at least two training data in the training data set include different values of T; the processing module is also used to input the first data into a first machine learning model to obtain second data generated by the first machine learning model, the second data includes T sub-data, wherein the first machine learning model includes multiple modules, and each module in the first machine learning model is called at least once to obtain a sub-data generated by the module; the processing module is also used to train the first machine learning model based on the second data and the loss function to obtain the trained first machine learning model.
  • the functionality of the first machine learning model includes any one or a combination of the following: encoding, modulating, or generating a reference signal.
  • the multiple modules in the first machine learning model include a first module and at least one second module
  • the processing module is specifically used to: input the first data into the first module to obtain first sub-data generated by the first module, and the first sub-data is one of T sub-data; input the first feature information into the second module to obtain second sub-data generated by the second module, wherein the first feature information includes the feature information generated when the module in the first machine learning model was last called for data processing, the second sub-data is one of the T sub-data, and the module in the first machine learning model that was last called is the first module or the second module.
  • the processing module is further used to obtain the data to be processed from the training data; the processing module is further used to obtain the value of H, where H is an integer greater than or equal to 1, and H indicates the length of the first data; the processing module is further used to pad the data to be processed if the length of the data to be processed is less than H to obtain the first data, and the length of the first data is H.
  • the second data is used to determine the signal to be sent
  • the processing module is specifically used to: demodulate and/or decode the received signal corresponding to the signal to be sent to obtain estimated data corresponding to the data to be processed; train the first machine learning model according to the estimated data and the loss function, and the loss function indicates the similarity between the estimated data and the data to be processed.
  • when the second data is a reference signal, the processing module is specifically used to: generate predicted channel information according to a received reference signal corresponding to the reference signal; and train the first machine learning model according to the predicted channel information and the loss function, where the loss function indicates the similarity between the predicted channel information and the correct channel information.
  • the present application provides a communication system that can apply artificial intelligence technology to the field of communications.
  • the communication system may include a data processing device as in the fourth aspect and a data processing device as in the fifth aspect.
  • the communication system further includes a training device for the model as in the fifth aspect.
  • the present application provides a data processing method that can apply artificial intelligence technology to the field of communications, the method comprising: a third device obtains a first signaling, wherein the first signaling carries at least one set of parameters adopted by a first machine learning model and indication information corresponding to each set of parameters, the indication information is used to indicate the position of multiple parameters included in each set of parameters in the first machine learning model; and sends the first signaling to the first device.
  • the third device and the second device can be the same device or different devices, which is not limited in the present application.
  • when at least one set of parameters of the first machine learning model is transmitted via signaling, the signaling carries not only the at least one set of parameters but also indication information corresponding to each set of parameters, and the indication information is used to indicate the positions, in the first machine learning model, of the multiple parameters included in each set of parameters.
  • After receiving the signaling, the first device can understand how to use the parameters carried in the signaling. Transmitting at least one set of parameters of the first machine learning model by means of signaling is conducive to reducing the communication resources consumed in the parameter transmission process and improving the efficiency of the parameter transmission process.
  • the first signaling is any one of the following: downlink control information DCI, uplink control information UCI, sidelink control information SCI, radio resource control RRC signaling, or media access control control element MAC CE.
  • the present application provides a data processing method that can apply artificial intelligence technology to the field of communications, and the method includes: a first device receives a first signaling, wherein the first signaling carries at least one set of parameters adopted by a first machine learning model and indication information corresponding to each set of parameters, and the indication information is used to indicate the positions of multiple parameters included in each set of parameters in the first machine learning model.
  • the first signaling is any one of the following: downlink control information DCI, uplink control information UCI, sidelink control information SCI, radio resource control RRC signaling, or media access control control element MAC CE.
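  • How a receiver might use the indication information can be sketched as follows. The payload layout is an assumption made for illustration: each group carries a flat list of values plus, for each value, a hypothetical `(parameter_name, index)` position. The source does not define the actual encoding of the first signaling.

```python
def apply_first_signaling(model_params, signaling):
    """Write signaled parameter values into their positions in the model.

    `signaling` is a list of groups; each group is a dict with
    "values" (list of numbers) and "positions" (list of (name, index)
    pairs). Both field names are hypothetical.
    """
    for group in signaling:
        values = group["values"]
        positions = group["positions"]
        if len(values) != len(positions):
            raise ValueError("each value needs exactly one position")
        for value, (name, index) in zip(values, positions):
            model_params[name][index] = value
    return model_params


# Toy model with two parameter arrays, updated by one signaled group.
model = {"W": [0.0, 0.0], "b": [0.0]}
signaling = [{"values": [0.5, -1.0], "positions": [("W", 1), ("b", 0)]}]
print(apply_first_signaling(model, signaling))
```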
  • an embodiment of the present application provides a device, comprising at least one processor, at least one processor coupled to a memory, the memory being used to store programs or instructions; at least one processor being used to execute programs or instructions, so that the aforementioned device executes the method in any of the above aspects.
  • an embodiment of the present application provides a computer-readable storage medium, in which a computer program is stored.
  • When the computer program is run on a computer, the computer executes the method in any of the above aspects.
  • an embodiment of the present application provides a computer program product, which includes a program.
  • When the program is run on a computer, the computer executes the method in any of the above aspects.
  • the present application provides a chip system, which includes a processor for supporting a communication device to implement the functions involved in the above aspects, for example, sending or processing the data and/or information involved in the above methods.
  • the chip system also includes a memory, which is used to store program instructions and data necessary for the communication device.
  • the chip system can be composed of a chip, or it can include a chip and other discrete devices.
  • FIG1 is a schematic diagram of an architecture of a wireless communication system provided in an embodiment of the present application.
  • FIG2 is another schematic diagram of the architecture of a wireless communication system provided in an embodiment of the present application.
  • FIG3 is a flow chart of a data processing method provided in an embodiment of the present application.
  • FIG4 is another schematic diagram of a data processing method provided in an embodiment of the present application.
  • FIG5 is a schematic flow chart of a first device and a second device determining a set of parameters used by a first machine learning model according to an embodiment of the present application.
  • FIG6 is a schematic diagram of a process in which a first device obtains a set of parameters used by a first machine learning model according to an embodiment of the present application.
  • FIG7 is a schematic diagram of first data provided in an embodiment of the present application.
  • FIG8 is a schematic diagram of generating T sub-data using a first machine learning model according to an embodiment of the present application.
  • FIG9 is a schematic diagram of a model training method provided in an embodiment of the present application.
  • FIG10 is a schematic diagram of a structure of a data processing device provided in an embodiment of the present application.
  • FIG11 is another schematic diagram of the structure of a data processing device provided in an embodiment of the present application.
  • FIG12 is a schematic diagram of a training device for a model provided in an embodiment of the present application.
  • FIG13 is a schematic diagram of a device provided in an embodiment of the present application.
  • FIG14 is another schematic diagram of a device provided in an embodiment of the present application.
  • FIG. 15 is a schematic diagram of the structure of a chip provided in an embodiment of the present application.
  • “First”, “second”, etc. in the specification and claims of this application and the above-mentioned drawings are used to distinguish similar objects (for example, to distinguish objects in the same embodiment), and are not necessarily used to describe a specific order or sequence.
  • the objects defined by “first”, “second”, etc. may refer to different objects. It should be understood that the data used in this way can be interchangeable under appropriate circumstances so that the embodiments described herein can be implemented in an order other than that illustrated or described herein.
  • “Send” and “receive” in the embodiments of the present application indicate the direction of signal transmission.
  • “Send information to XX device” can be understood as meaning that the destination of the information is the XX device, which can include direct transmission through the air interface, and also indirect transmission through the air interface by other units or modules.
  • “Receive information from YY device” can be understood as meaning that the source of the information is the YY device, which can include receiving directly from the YY device through the air interface, and also receiving indirectly from the YY device through the air interface via other units or modules.
  • “Send” can also be understood as the “output” of the chip interface, and “receive” can also be understood as the “input” of the chip interface.
  • sending and receiving can be performed between devices or within a device, for example, sending or receiving between components, modules, chips, software modules or hardware modules within the device through a bus, wiring or interface. It is understandable that the information may be subjected to necessary processing between the source and destination of the information transmission, such as encoding, modulation, etc., but the destination can understand the valid information from the source. Similar expressions in this application can be understood similarly and will not be repeated.
  • indication may include direct indication and indirect indication, and may also include explicit indication and implicit indication.
  • the information indicated by a certain information is called information to be indicated.
  • there are many ways to indicate the information to be indicated, for example but not limited to the following: the information to be indicated can be directly indicated, such as by the information to be indicated itself or by an index of the information to be indicated.
  • the information to be indicated can also be indirectly indicated by indicating other information, wherein there is an association relationship between the other information and the information to be indicated; it is also possible to indicate only a part of the information to be indicated, while the other parts of the information to be indicated are known or agreed in advance, for example, the indication of specific information can be realized by means of the arrangement order of each information agreed in advance (such as predefined by the protocol), thereby reducing the indication overhead to a certain extent.
  • the present application does not limit the specific method of indication. It can be understood that for the sender of the indication information, the indication information can be used to indicate the information to be indicated, and for the receiver of the indication information, the indication information can be used to determine the information to be indicated.
  • the present application can apply artificial intelligence technology to the field of communications, and optionally, can apply artificial intelligence technology to the application scenario of signal transmission.
  • the machine learning model can be used to perform any one or more of the following tasks: encoding, modulation, generating reference signals or other tasks in the field of communications.
  • Figure 1 is a schematic diagram of the architecture of a wireless communication system provided by an embodiment of the present application.
  • the method provided by the present application can be applied to a wireless communication system.
  • the wireless communication system includes a network device 101 and a mobile station (MS) 102.
  • a wireless connection can be established between the network device 101 and each terminal device, and a wireless connection can also be established between each terminal device.
  • the network device 101 may refer to a device that provides wireless access services in a wireless network.
  • the network device 101 may be a device that connects the mobile station 102 to the wireless network, and may also be called a base station; the aforementioned base station may be various forms of macro base stations, micro base stations, relay stations or access points, etc.
  • the names of the network devices 101 having base station functions may be different.
  • the base station may be called an evolved Node B (eNB), a Node B (NB), the next generation Node B (gNB) in the fifth generation (5G) communication system, a home base station (e.g., home evolved Node B, or home Node B, HNB), a base band unit (BBU), a wireless fidelity (Wi-Fi) access point (AP), a transmission reception point (TRP) or a radio network controller (RNC), etc.
  • multiple network nodes collaborate to assist in achieving wireless access, and different network nodes respectively implement part of the functions of a base station.
  • a network node may be a central unit (CU), a distributed unit (DU), a CU-control plane (CP), a CU-user plane (UP), or a radio unit (RU).
  • CU and DU may be set separately, or may be included in the same network element, such as a baseband unit (BBU).
  • RU may be included in a radio frequency device or a radio frequency unit, such as a remote radio unit (RRU), an active antenna unit (AAU), or a remote radio head (RRH).
  • in different systems, CU (or CU-CP and CU-UP), DU, or RU may also have different names, but those skilled in the art may understand their meanings.
  • CU may also be referred to as an open CU (O-CU)
  • DU may also be referred to as an open DU (O-DU)
  • CU-CP may also be referred to as an open CU-CP (O-CU-CP)
  • CU-UP may also be referred to as an open CU-UP (O-CU-UP)
  • RU may also be referred to as an open RU (O-RU).
  • any unit among CU (or CU-CP, CU-UP), DU and RU may be implemented by a software module, a hardware module, or a combination of a software module and a hardware module.
  • the embodiment of the present application does not limit the specific device form of the network device 101.
  • the mobile station 102 refers to a wireless terminal device that can receive scheduling information and indication information sent by the network device 101.
  • the mobile station 102 can be a handheld device with wireless communication function, a vehicle-mounted device, a wearable device, a computing device, or other processing device connected to a wireless modem.
  • the mobile station 102 can communicate with one or more core networks or the Internet via a radio access network (RAN).
  • the mobile station 102 can be a portable, pocket-sized, handheld, computer-built-in, or vehicle-mounted mobile device that exchanges voice and/or data with the radio access network.
  • the mobile station 102 can be a user agent, a cellular phone, a smart phone, a personal digital assistant (PDA), a tablet computer (Tablet PC), a wireless modem, a handset, a laptop computer, a personal communication service (PCS) phone, a remote station, an access point (AP), a remote terminal, an access terminal, customer premises equipment (CPE), a terminal, user equipment (UE), or a mobile terminal (MT), etc.
  • the mobile station 102 may also be a wearable device, which is a general term for wearable devices that are intelligently designed and developed using wearable technology for daily wear, such as glasses, gloves, watches, clothing, and shoes.
  • a wearable device is a portable device that is worn directly on the body or integrated into the user's clothes or accessories. Wearable devices are not just hardware devices, but also achieve powerful functions through software support, data interaction, and cloud interaction.
  • wearable smart devices include those that are fully functional, large in size, and can achieve complete or partial functions without relying on smartphones, such as smart watches or smart glasses, as well as those that only focus on a certain type of application function and need to be used in conjunction with other devices such as smartphones, such as various types of smart bracelets, smart helmets, and smart jewelry for vital sign monitoring.
  • the mobile station 102 may also be a drone, a robot, a terminal device in device-to-device (D2D) communication, a terminal device in vehicle to everything (V2X), a virtual reality (VR) device, an augmented reality (AR) device, a wireless terminal in industrial control, a terminal device in self driving, a terminal device in remote medical, a terminal device in a smart grid, a wireless terminal in a smart city, a terminal device in a smart home, etc.
  • the mobile station 102 may also be a terminal device in a communication system after the 5G communication system (for example, a sixth generation (6G) communication system), or a terminal device in a future-developed public land mobile network (PLMN), etc. The embodiment of the present application does not limit the device form of the mobile station 102.
  • the network device 101 can send downlink data to each terminal device, or each terminal device can also send uplink data to the network device 101; the network device 101 or each terminal device may use a machine learning model in the process of sending data, and the data processing method provided in this application can be adopted.
  • each terminal device can also send data to each other.
  • Each terminal device may use a machine learning model in the process of sending data, so the data processing method provided in this application can be adopted.
  • FIG 2 is another architectural diagram of the wireless communication system provided in the embodiment of the present application.
  • various smart home products are connected through a wireless network to enable data to be transmitted between smart home products.
  • these smart home products are all connected to the same wireless network through a wireless router, thereby enabling data interaction between various smart home products.
  • other types of smart home products may also be included in actual applications, such as smart refrigerators, smart range hoods, smart curtains, and other smart home products. This embodiment does not limit the types of smart home products.
  • smart home products can also be directly connected wirelessly without being connected to the same wireless network through a wireless router.
  • the smart home products can be connected wirelessly through Bluetooth.
  • the method provided in the embodiments of the present application can also be applied to other communication system scenarios.
  • for example, different devices (such as intelligent robots, lathes, handling vehicles and other equipment) may be connected through a wireless network and transmit data to each other via the wireless network.
  • the embodiments of the present application do not limit the specific scenarios in which the data processing method is applied.
  • the wireless communication systems mentioned in the embodiments of the present application include but are not limited to: the fifth generation (5G) mobile communication system, the 6G communication system, satellite communication systems, short-range communication systems, the Narrowband Internet of Things (NB-IoT) system, the Global System for Mobile Communications (GSM), Enhanced Data rates for GSM Evolution (EDGE), the Wideband Code Division Multiple Access (WCDMA) system, the Code Division Multiple Access 2000 (CDMA2000) system, the Time Division-Synchronous Code Division Multiple Access (TD-SCDMA) system, the Long Term Evolution (LTE) system, and other communication systems.
  • Figure 3 is a flow chart of the data processing method provided by an embodiment of the present application. As shown in Figure 3, the method includes steps 301 and 302.
  • 301. The first device obtains the value of T, where T is an integer greater than or equal to 1, and T represents the number of sub-data included in the output data of the first machine learning model.
  • 302. The first device inputs the first data into the first machine learning model to obtain the second data generated by the first machine learning model, and the second data includes T sub-data; wherein the first machine learning model includes one or more modules, and each time a module in the first machine learning model is called at least once, one of the T sub-data is obtained; exemplarily, each such call may invoke one module in the first machine learning model, or may invoke multiple modules.
  • the first device may be any device that needs to send data in the above-mentioned application scenarios; for example, the first device may be the network device 101 or the mobile station 102 in Figure 1; for another example, the first device may be a smart home product or the wireless router in Figure 2; or, the first device may also be another device that needs to send data, etc.
  • the form of the first device is not limited in the embodiments of the present application.
  • the function of the first machine learning model includes any one or a combination of the following: encoding, modulation, generating reference signals, or other functions.
  • when the function of the first machine learning model is encoding, the first data is the data to be encoded, and the second data is the encoded data.
  • when the function of the first machine learning model is modulation, the first data is the data to be modulated, and the second data is the modulated data.
  • when the function of the first machine learning model is generating reference signals, the first data may be the index numbers of multiple reference signals, and the second data may be a reference signal.
  • when the function of the first machine learning model is encoding and modulation, the first data is the data to be encoded and modulated, and the second data is the encoded and modulated data, etc.
  • the first data and the second data may be expressed as other types of data, etc., which is not limited in the embodiments of the present application.
  • the number of calls of the module in the first machine learning model can be flexibly adjusted according to the value of T to generate T sub-data, so that the first machine learning model can be compatible with multiple values of T, and there is no need to store multiple machine learning models, thereby reducing the storage space overhead.
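  • The idea of adjusting the number of module calls according to T can be sketched as a small recurrent loop. This is a minimal sketch under stated assumptions, not the model defined by the application: the parameter names `U`, `W`, `V`, `b_s`, `b_o`, the use of tanh for both activation functions, and the helper functions are all chosen for illustration.

```python
import math


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def tanh_vec(v):
    return [math.tanh(x) for x in v]


def generate_sub_data(x, T, U, W, V, b_s, b_o):
    """Return T sub-data, each of length G = len(V), from input x of length H."""
    if T < 1:
        raise ValueError("T must be an integer >= 1")
    # First call: extract feature information from the first data and
    # produce the first sub-data.
    s = tanh_vec(add(matvec(U, x), b_s))
    outputs = [tanh_vec(add(matvec(V, s), b_o))]
    # Each further call updates the feature information from the previous
    # call (linear transformation + activation), then maps it to one more
    # sub-data (linear transformation + activation).
    for _ in range(T - 1):
        s = tanh_vec(add(matvec(W, s), b_s))
        outputs.append(tanh_vec(add(matvec(V, s), b_o)))
    return outputs


# Toy sizes: H = 2, feature size 2, G = 1.
U = [[1.0, 0.0], [0.0, 1.0]]
W = [[0.5, 0.0], [0.0, 0.5]]
V = [[1.0, 1.0]]
second_data = generate_sub_data([1.0, 0.0], T=3, U=U, W=W, V=V,
                                b_s=[0.0, 0.0], b_o=[0.0])
print(len(second_data))
```

  • Because the same parameters are reused on every call, one stored parameter set serves any value of T, which is the storage saving the text describes.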
  • the detailed implementation process of the inference phase of the above-mentioned first machine learning model is introduced first below, and then the detailed implementation process of the training phase of the first machine learning model is introduced.
  • the “inference phase of the first machine learning model” is the process of using the first machine learning model to process data
  • the “training phase of the first machine learning model” is the process of iteratively training the first machine learning model using training data.
  • the process of iteratively training the first machine learning model is also the process of iteratively updating the parameters adopted by the first machine learning model.
  • one or more sets of trained parameters corresponding to the first machine learning model can be obtained, and the aforementioned parameters obtained in the training phase will be used in the inference phase.
  • FIG. 4 is another schematic diagram of a data processing method provided in an embodiment of the present application. As shown in FIG. 4 , the data processing method includes steps 401 to 411 .
  • the first device obtains a set of parameters adopted by the first machine learning model.
  • before the first device uses the first machine learning model to process data, it first needs to determine a set of trained parameters adopted by the first machine learning model; exemplarily, the aforementioned set of trained parameters includes the parameters required by all modules in the first machine learning model.
  • the first device can obtain a set of parameters used by the first machine learning model in a variety of ways.
  • multiple sets of trained parameters of the first machine learning model and identification information of each set of trained parameters can be predefined.
  • identification information of each set of trained parameters can also be referred to as the index number of each set of trained parameters.
  • a group of parameters used in the first machine learning model includes the following multiple parameters as an example: U, W, ⁇ s , V, and ⁇ 0 .
  • the parameters used in the process of extracting features from the input data of the first machine learning model to obtain feature information of the input data include U and ⁇ s
  • the parameters used in the process of updating the feature information of the input data include W and ⁇ s
  • the parameters used in the process of generating multiple sub-data in the output data of the first machine learning model according to the feature information of the input data include V and ⁇ 0 .
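  • the division of roles among these parameters can be illustrated with a minimal recurrent-style sketch. It is an assumption made for illustration only and not the model defined in this application: the two vector parameters are named b_s and b_o here, tanh is assumed as the activation, and the helper names (matvec, extract_features, update_features, generate_sub_data, run) are hypothetical.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def add(a, b):
    """Element-wise sum of two equally sized vectors."""
    return [x + y for x, y in zip(a, b)]

def extract_features(U, b_s, x):
    """Feature extraction from the input data: uses U and the s-vector."""
    return [math.tanh(z) for z in add(matvec(U, x), b_s)]

def update_features(W, b_s, s):
    """Update of the feature information: uses W and the s-vector."""
    return [math.tanh(z) for z in add(matvec(W, s), b_s)]

def generate_sub_data(V, b_o, s):
    """One piece of sub-data in the output data: uses V and the o-vector."""
    return add(matvec(V, s), b_o)

def run(params, x, num_sub_data):
    """Produce num_sub_data pieces of sub-data from one input vector x."""
    s = extract_features(params["U"], params["b_s"], x)
    outputs = []
    for _ in range(num_sub_data):
        outputs.append(generate_sub_data(params["V"], params["b_o"], s))
        s = update_features(params["W"], params["b_s"], s)
    return outputs
```

  • in this sketch, swapping in a different set of trained parameters only changes the params dictionary passed to run; the processing steps stay the same.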
  • Matrix-U0 represents a value of parameter U
  • Matrix-W0 represents a value of parameter W
  • Vector-s0 represents a value of parameter ⁇ s
  • Matrix-V0 represents a value of parameter V
  • Vector-o0 represents a value of parameter ⁇ 0
  • Matrix-U0, Matrix-W0, Vector-s0, Matrix-V0 and Vector-o0 represent a set of parameters of the first machine learning model
  • the index number "0" in the second row of Table 1 represents the identification information of the aforementioned set of parameters.
  • the third and fourth rows in Table 1 can be understood by referring to the above explanation of the first row in Table 1. It should be noted that the examples in Table 1 are only for the convenience of understanding, and the correspondence between each set of parameters in the multiple sets of parameters of the first machine learning model and the identification information is not used to limit this solution.
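  • the relationship between an index number and the set of parameters it identifies can be sketched as a simple lookup. This is a hypothetical illustration: only the entry for index 0 is named in the text, the dictionary keys "s" and "o" stand for the two vector parameters, and the names PARAMETER_SETS and lookup_parameter_set are assumptions.

```python
# Hypothetical stand-in for Table 1: each index number (identification
# information) points to one complete set of trained parameters.
# Only index 0 is spelled out in the text; other rows are not reproduced here.
PARAMETER_SETS = {
    0: {
        "U": "Matrix-U0",
        "W": "Matrix-W0",
        "s": "Vector-s0",
        "V": "Matrix-V0",
        "o": "Vector-o0",
    },
}

def lookup_parameter_set(index):
    """Return the parameter set identified by index; raise KeyError if undefined."""
    return PARAMETER_SETS[index]
```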
  • the at least one first preset indicator may include any one or more of the following indicators: the moving speed of the terminal device, the maximum value of the multipath delay spread, the peak-to-average power ratio (PAPR), or other indicators.
  • the specific indicators used are not limited in the embodiments of the present application.
  • a set of parameters corresponding to when the moving speed of the terminal device is greater than or equal to the speed threshold is different from a set of parameters corresponding to when the moving speed of the terminal device is less than the speed threshold.
  • a set of parameters corresponding to when the maximum value of the multipath delay spread is greater than threshold 1 is different from a set of parameters corresponding to when the maximum value of the multipath delay spread is less than threshold 1.
  • a set of parameters corresponding to when the PAPR value is within range 1 is different from a set of parameters corresponding to when the PAPR value is within range 2.
  • the high-speed mobile scenario represents that the moving speed of the terminal device is greater than or equal to the speed threshold
  • the low-speed mobile scenario represents that the moving speed of the terminal device is less than the speed threshold
  • the large multipath delay spread represents that the maximum value of the multipath delay spread is greater than the threshold 1
  • the small multipath delay spread represents that the maximum value of the multipath delay spread is less than the threshold 1.
  • Table 2 takes the 8 groups of parameters of the pre-defined first machine learning model as an example.
  • the 8 groups of parameters are parameter 1, parameter 2, parameter 3, parameter 4, parameter 5, parameter 6, parameter 7 and parameter 8. Different groups of parameters in the 8 groups of parameters of the first machine learning model correspond to different scenarios and different indicator ranges of the first preset indicator.
  • the examples in Table 2 are only for the convenience of understanding the relationship between different groups of parameters in the multiple groups of trained parameters of the first machine learning model.
  • the specific number of pre-defined parameter groups and the specific usage method can be flexibly set in combination with the actual scenario, and are not limited here.
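  • the mapping from indicator values to one of the 8 pre-defined groups can be sketched as below. This is only one possible arrangement under stated assumptions: the threshold values are invented, the 2 x 2 x 2 layout (speed, delay spread, PAPR range) is assumed because Table 2 itself is not reproduced here, and the names Indicators and select_parameter_group are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Indicators:
    speed: float         # moving speed of the terminal device
    delay_spread: float  # maximum value of the multipath delay spread
    papr: float          # peak-to-average power ratio

# Hypothetical values: the text names "speed threshold", "threshold 1",
# "range 1" and "range 2" without giving numbers.
SPEED_THRESHOLD = 30.0
DELAY_SPREAD_THRESHOLD_1 = 1.0
PAPR_RANGE_1_UPPER = 8.0  # assumed: range 1 is papr < 8.0, range 2 is papr >= 8.0

def select_parameter_group(ind: Indicators) -> int:
    """Map indicator values to a group number from 1 to 8.

    The ordering of the groups is an assumption; the real table may
    order or combine the scenarios differently.
    """
    high_speed = ind.speed >= SPEED_THRESHOLD
    # The text does not say which side equality falls on; ">" is assumed here.
    large_spread = ind.delay_spread > DELAY_SPREAD_THRESHOLD_1
    papr_range_2 = ind.papr >= PAPR_RANGE_1_UPPER
    return 1 + 4 * high_speed + 2 * large_spread + 1 * papr_range_2
```

  • under these assumed thresholds, a low-speed terminal with small delay spread and PAPR in range 1 maps to group 1, and the opposite extreme maps to group 8.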
  • when the first device is a terminal device, in one implementation method (hereinafter referred to as implementation method one for the convenience of description), before step 401 is executed, the first device (i.e., the terminal device) is deployed with multiple identification information corresponding to the parameters of the first machine learning model and with a first rule; the first rule indicates the correspondence between the different identification information in the multiple identification information and the different indicator ranges of at least one first preset indicator.
  • the meaning of the first rule can be understood by referring to the "correspondence between different groups of parameters in the multiple groups of trained parameters and different indicator ranges of at least one first preset indicator" described above and is not elaborated here; based on the value of the at least one first preset indicator and the first rule, the first device can determine second identification information from the multiple identification information corresponding to the parameters of the first machine learning model, and the second identification information is the identification information of the group of parameters adopted by the first machine learning model in the first device.
  • first identification information will be used in subsequent descriptions, and the meaning of the first identification information will also be explained in the subsequent descriptions, so it will not be repeated here; for the various forms of base stations and terminal devices, please refer to the above description, so it will not be repeated here.
  • the first device may send the second identification information to the base station.
  • the base station may send a set of trained parameters pointed to by the second identification information to the first device.
  • the aforementioned set of trained parameters sent by the base station is a set of trained parameters adopted by the first machine learning model in the first device.
  • Step 401 may include: the first device obtains the aforementioned set of trained parameters sent by the base station and determines the obtained set as the set of parameters adopted by the first machine learning model.
  • the base station may send the second identification information and a set of parameters pointed to by the second identification information to the first device.
  • the first device can obtain the second identification information and a set of parameters pointed to by the second identification information, and the first device determines the obtained set of parameters as a set of parameters adopted by the first machine learning model.
  • if the second device that communicates data with the first device (that is, the above-mentioned terminal device) is the above-mentioned base station, the second device has also determined which set of parameters is adopted by the first machine learning model in the first device.
  • the first device can also send a set of parameters adopted by the first machine learning model (hereinafter referred to as "a set of target parameters" for the convenience of description) to the second device, so that the second device can determine which set of parameters is adopted by the first machine learning model in the first device.
  • at least one third identification information corresponding to the parameters of the first machine learning model may be configured in the first device; each third identification information identifies a set of parameters that can be adopted by the first machine learning model, and every set of parameters pointed to by the at least one third identification information conforms to the hardware capabilities of the first device, that is, the hardware capabilities of the first device can support running the first machine learning model with the set of parameters pointed to by each third identification information.
  • the first device may send all the third identification information configured above to the base station, and correspondingly, after receiving the at least one third identification information sent by the first device, the base station may obtain a set of trained parameters pointed to by each third identification information (that is, a set of parameters that can be adopted by the first machine learning model).
  • the base station sends the aforementioned at least one set of trained parameters of the first machine learning model corresponding to the at least one third identification information to the first device, and the first device receives the aforementioned at least one set of parameters corresponding to the at least one third identification information.
  • Step 401 may include: the first device may select a set of target parameters adopted by the first machine learning model from at least one set of parameters corresponding one-to-one to at least one third identification information.
  • different third identification information in at least one third identification information can correspond to different indicator ranges of at least one second preset indicator.
  • the first device can determine a second identification information from multiple third identification information corresponding to the parameters of the first machine learning model based on the value of the at least one second preset indicator, and then select a group of target parameters corresponding to a second identification information from at least one group of parameters corresponding one-to-one to the at least one third identification information.
  • the specific implementation method of "the first device can determine a second identification information from multiple third identification information corresponding to the parameters of the first machine learning model according to the value of at least one second preset indicator” can refer to the above description of the specific implementation method of "the first device can determine a second identification information from multiple identification information corresponding to the parameters of the first machine learning model according to the value of at least one first preset indicator", which will not be repeated here.
  • At least one second preset indicator and the category of “at least one first preset indicator” may be the same or different.
  • the specific category of “at least one second preset indicator” may be flexibly set according to actual conditions and is not limited here.
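  • the two-stage selection described above (first restricting the candidates to what the hardware supports, then choosing by indicator value) can be sketched as follows. This is a minimal sketch under assumptions: memory size stands in for "hardware capability", each identification is assumed to map to a half-open indicator range, and the function names are hypothetical.

```python
def supported_ids(all_ids, required_memory, device_memory):
    """Return the ids whose parameter sets fit the device.

    These play the role of the third identification information.
    Using memory as the capability measure is an assumption.
    """
    return [i for i in all_ids if required_memory[i] <= device_memory]

def select_second_id(third_ids, indicator_ranges, indicator_value):
    """Pick the id whose range [low, high) contains the indicator value."""
    for i in third_ids:
        low, high = indicator_ranges[i]
        if low <= indicator_value < high:
            return i
    # No configured id matches; the text does not specify this case.
    return None
```

  • for example, a device that supports ids 0 and 1 and observes an indicator value inside the range assigned to id 1 would report id 1 as the second identification information.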
  • the first device may also send to the base station (i.e., an example of the second device) the second identification information that the first device determined from the at least one third identification information.
  • the base station receives the second identification information sent by the first device, so that the second device can determine which set of parameters is adopted by the first machine learning model in the first device.
  • the first device can also send a set of target parameters adopted by the first machine learning model to the second device, so that the second device can determine which set of parameters is adopted by the first machine learning model in the first device.
  • step 401 may include: the base station (that is, an example of the first device) can determine a set of target parameters adopted by the first machine learning model from the multiple groups of trained parameters of the first machine learning model based on the value of at least one first preset indicator.
  • the first device (i.e., the base station) may also send the set of target parameters adopted by the aforementioned first machine learning model to the second device (i.e., the terminal device communicating with the base station), so that the second device can determine which set of parameters is adopted by the first machine learning model in the first device.
  • both the first device and the second device already have multiple sets of trained parameters of the first machine learning model and identification information of each set of trained parameters.
  • the first device can send multiple sets of trained parameters of the first machine learning model and identification information of each set of trained parameters to the second device.
  • the base station (an example of the first device) can send multiple sets of trained parameters of the first machine learning model and identification information of each set of trained parameters to the terminal device (an example of the second device); for another example, the terminal device (an example of the first device) can send multiple sets of trained parameters of the first machine learning model and identification information of each set of trained parameters to the base station (an example of the second device); for another example, the first terminal device (an example of the first device) can send multiple sets of trained parameters of the first machine learning model and identification information of each set of trained parameters to the second terminal device (an example of the second device)
  • the network equipment and mobile station in the wireless communication system have been pre-configured with multiple groups of trained parameters of the first machine learning model and identification information of each group of trained parameters, and the network equipment and mobile station in the wireless communication system include a first device and a second device.
  • step 401 may include: the first device may obtain a set of target parameters from multiple sets of trained parameters of the first machine learning model.
  • the first device may determine a second identification information from multiple identification information corresponding to the parameters of the first machine learning model based on the value of at least one first preset indicator, and then determine a set of parameters pointed to by the second identification information, thereby determining a set of parameters used by the first machine learning model.
  • the specific implementation method of the aforementioned steps can be found in the above description and will not be repeated here.
  • the first device may also send the aforementioned second identification information to the second device, and correspondingly, the second device receives the second identification information sent by the first device.
  • the second device may determine which set of parameters is used by the first machine learning model in the first device based on the received second identification information and third data, and the third data includes multiple sets of parameters corresponding to the first machine learning model in the second device and identification information of each set of parameters.
  • the specific forms of the first device and the second device can be flexibly determined in combination with the actual application scenario, and are not limited here.
  • FIG. 5 is a flow chart of a first device and a second device determining a set of parameters used by a first machine learning model according to an embodiment of the present application.
  • the first device sends multiple sets of parameters of the first machine learning model and identification information of each set of parameters to the second device.
  • the first device sends the second identification information to the second device.
  • the second device determines the set of parameters used by the first machine learning model in the first device according to the second identification information and the third data, wherein the third data includes multiple sets of parameters of the first machine learning model and identification information of each set of parameters.
  • the example in FIG. 5 is only for facilitating the understanding of the present solution and is not used to limit the present solution.
  • the first device only needs to send the second identification information to the second device, and the second device can know which group of parameters is used by the first machine learning model in the first device.
  • the communication resources occupied by transmitting the second identification information are relatively small, which is beneficial to reducing the consumed communication resources.
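  • the flow of FIG. 5 and the resource saving it aims at can be sketched as follows. This is an illustrative sketch only: the class SecondDevice, its methods, and the use of JSON length as a rough size measure are all assumptions, not part of the described solution.

```python
import json

class SecondDevice:
    """Holds the 'third data' and resolves an id to a parameter set."""

    def __init__(self):
        self.third_data = {}         # id -> parameter set received earlier
        self.peer_parameters = None  # set currently used by the first device

    def receive_parameter_sets(self, parameter_sets):
        """First message: all parameter sets with their identification information."""
        self.third_data = dict(parameter_sets)

    def receive_identification(self, second_id):
        """Later message: only the id; the parameters are looked up locally."""
        self.peer_parameters = self.third_data[second_id]
        return self.peer_parameters

def message_size(payload):
    """Rough size in bytes of a JSON-encoded payload, for comparison only."""
    return len(json.dumps(payload).encode("utf-8"))
```

  • once the parameter sets have been shared, each later switch costs only the size of one id, which is why sending the second identification information is cheaper than resending a full set of parameters.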
  • the embodiment of the present application does not limit the execution order between the above-mentioned operation of "the second device determines which set of parameters is adopted by the first machine learning model in the first device” and the subsequent steps 402 to 411, and the operation of "the second device determines which set of parameters is adopted by the first machine learning model in the first device” can be performed before or after any step of steps 402 to 411.
  • only a set of trained parameters of the first machine learning model may be defined in advance, and the set of trained parameters of the aforementioned first machine learning model may be pre-configured in the first device. Then, when the first device needs to use the first machine learning model, the set of parameters adopted by the first machine learning model may be directly obtained locally.
  • the first device in order to enable the first device to obtain a set of trained parameters adopted by the first machine learning model, and in order to enable the second device that communicates data with the first device to determine what parameters are adopted by the first machine learning model in the first device, it may be necessary to send the trained parameters corresponding to the first machine learning model and/or identification information of the aforementioned parameters between different devices.
  • the terminal device (that is, an example of the first device) can send second identification information (that is, identification information of the trained parameters corresponding to the first machine learning model) to the base station.
  • the base station can send a set of trained parameters pointed to by the second identification information to the terminal device, that is, a set of target parameters adopted by the first machine learning model.
  • the terminal device (that is, an example of the first device) can send a set of target parameters adopted by the first machine learning model to another terminal device (that is, an example of the second device).
  • the terminal device (also an example of the first device) can send at least one third identification information (also the identification information of the trained parameters corresponding to the first machine learning model) to the base station.
  • the base station sends at least one set of trained parameters of the first machine learning model corresponding to the at least one third identification information to the terminal device (also an example of the first device).
  • the first device can also send the above-mentioned second identification information to the base station (also an example of the second device).
  • the first device can also send a set of target parameters adopted by the first machine learning model to the second device, and so on.
  • the situations in the above-mentioned implementation methods 3 and 4 are not listed one by one here. For details, please refer to the descriptions in the above-mentioned various implementation methods.
  • the trained parameters corresponding to the first machine learning model and/or identification information of the aforementioned parameters may be carried in the signaling, that is, in the various implementations described above, the trained parameters corresponding to the first machine learning model and/or identification information of the aforementioned parameters are sent between different devices by sending signaling.
  • each signaling may carry at least one set of trained parameters of the first machine learning model.
  • the trained parameters corresponding to the first machine learning model can be carried in one or more of the following: downlink control information (DCI), uplink control information (UCI), sidelink control information (SCI), radio resource control (RRC) signaling, a medium access control (MAC) control element (CE), or other types of signaling, which are not exhaustively listed here.
  • the base station can send a DCI, RRC or MAC CE carrying a set of trained parameters pointed to by the second identification information to the terminal device (that is, an example of the first device); correspondingly, the first device can obtain the set of trained parameters pointed to by the second identification information from the aforementioned DCI, RRC or MAC CE.
  • the terminal device (that is, an example of the first device) can carry a set of target parameters adopted by the first machine learning model in SCI, RRC or MAC CE and send it to another terminal device (that is, an example of the second device); correspondingly, the second device can obtain the set of target parameters adopted by the aforementioned first machine learning model from SCI, RRC or MAC CE.
  • the base station may carry each third identification information and a set of trained parameters pointed to by each third identification information in DCI, RRC or MAC CE and send them to the terminal device (that is, an example of the first device); correspondingly, the first device may obtain each third identification information and a set of trained parameters pointed to by each third identification information from DCI, RRC or MAC CE.
  • the terminal device (an example of the first device) can carry each group of trained parameters of the first machine learning model and the identification information of each group of trained parameters in UCI, RRC or MAC CE and send them to the base station (an example of the second device); correspondingly, the base station can obtain each group of trained parameters of the first machine learning model and the identification information of each group of trained parameters from UCI, RRC or MAC CE, etc.
  • the first signaling may carry at least one set of parameters adopted by the first machine learning model and indication information corresponding to each set of parameters, and the indication information corresponding to each set of parameters is used to indicate the position of multiple parameters included in each set of parameters in the first machine learning model.
  • the first signaling is any of the following: DCI, UCI, SCI, RRC, MAC CE or other types of signaling.
  • the base station when the base station needs to send a set of trained parameters pointed to by the second identification information to the first device, the base station can send a first signaling to the first device.
  • the first device can receive the first signaling sent by the base station and obtain a set of parameters adopted by the first machine learning model from the first signaling.
  • the base station when the base station needs to send a set of trained parameters pointed to by each third identification information to the first device, the base station may send one or more first signalings to the first device, each first signaling carrying a third identification information and a set of parameters pointed to by the third identification information, and so on.
  • Other situations in the various implementation methods mentioned above are not listed one by one here.
  • when at least one set of parameters of the first machine learning model is transmitted by signaling, the signaling carries not only the aforementioned at least one set of parameters but also indication information corresponding to each set of parameters; the indication information indicates the positions, in the first machine learning model, of the multiple parameters included in each set, so that after receiving the signaling, the first device can understand how to use the parameters carried in it. Transmitting at least one set of parameters of the first machine learning model by means of signaling is conducive to reducing the communication resources consumed in the parameter transmission process and improving the efficiency of that process.
  • the indication information corresponding to each group of parameters may include the layer of the first machine learning model in which each parameter in the group is used, and the value that the parameter takes when operating in that layer.
  • the name of each parameter in the first signaling may be consistent with the name of each parameter in the first machine learning model, and the information carried by the first signaling may include the following:
  • the above information represents that the parameters used by the first neural network layer in the first machine learning model include matrix U1_1, matrix U1_2 and vector b1_1.
  • the values of matrix U1_1 are {u11(1,1), u11(1,2), u11(2,1), u11(2,2), ...}
  • the values of matrix U1_2 are {u12(1,1), u12(1,2), u12(1,3), u12(2,1), ...}
  • the values of vector b1_1 are {b11(1), b11(2), ...}, etc.
  • the meanings of the parameters used by the second to fifth neural network layers of the first machine learning model can be understood in combination with the above description and will not be elaborated here.
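  • one way to picture signaling that carries parameters together with their layer positions is sketched below. The entry layout (layer number, parameter name, values) and the function names are assumptions for illustration; the actual content of the first signaling is not reproduced in this text.

```python
def build_first_signaling(layers):
    """Flatten {layer_number: {name: values}} into (layer, name, values) entries.

    The layer number in each entry plays the role of the indication
    information: it tells the receiver where the parameter belongs.
    """
    entries = []
    for layer in sorted(layers):
        for name, values in layers[layer].items():
            entries.append((layer, name, values))
    return entries

def apply_first_signaling(entries):
    """Receiver side: rebuild the per-layer parameter mapping."""
    model_parameters = {}
    for layer, name, values in entries:
        model_parameters.setdefault(layer, {})[name] = values
    return model_parameters
```

  • because the parameter names in the signaling match the names used in the model, the receiver can place each value without any further negotiation.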
  • the following takes the first signaling as MAC CE and RRC as an example to show the specific format of carrying a set of parameters of the first machine learning model in MAC CE and RRC.
  • First refer to the following Table 3, which shows the format of a set of parameters of the first machine learning model in MAC CE.
  • R in Table 3 represents a meaningless parameter, that is, R will not be used in the first machine learning model.
  • the first machine learning model includes two neural network layers as an example.
  • the parameters used by the first neural network layer of the first machine learning model include U 0 , W 0 , ⁇ s0 , V 0 and ⁇ o0 , and the values of the five parameters used by the first neural network layer of the first machine learning model are shown in the second to sixth rows, respectively.
  • the seventh row of Table 3 (i.e., the seventh row of MAC CE) states that the parameters used by the second neural network layer of the first machine learning model include U 1 , W 1 , ⁇ s1 , V 1 and ⁇ o1 , and the values of the five parameters used by the second neural network layer of the first machine learning model are shown in the eighth to twelfth rows. Since the parameters used in the first machine learning model of the first device are also U 0 , W 0 , ⁇ s0 , V 0 , ⁇ o0 , U 1 , W 1 , ⁇ s1 , V 1 and ⁇ o1 , the first device can determine the value of each parameter in the first machine learning model after receiving the MAC CE. It should be understood that the examples in Table 3 are only for the convenience of understanding this solution and are not used to limit this solution.
  • the content carried in RRC may be as follows:
  • "matrixU SEQUENCE { }" means that the values of the parameters inside the braces are all matrices; Ui0j0 represents the parameter in the i0th row and j0th column of the parameter matrixU of the first machine learning model, and REAL in "Ui0j0,REAL" represents that the value type of the parameter Ui0j0 is a real number; similarly, Ui1j1 represents the parameter in the i1th row and j1th column of the parameter matrixU of the first machine learning model, and REAL in "Ui1j1,REAL" represents that the value type of the parameter Ui1j1 is a real number.
  • "vector SEQUENCE { }" means that the values of the parameters inside the braces are all vectors; v0 is the 0th parameter of the parameter vector of the first machine learning model, and REAL in "v0,REAL" represents that the value type of the parameter v0 is a real number; v1 is the 1st parameter of the parameter vector of the first machine learning model, and REAL in "v1,REAL" represents that the value type of the parameter v1 is a real number.
  • a set of parameters of the first machine learning model can be carried in the RRC, and after obtaining the RRC, the position of each parameter in the set of parameters in the first machine learning model can also be known. It should be understood that the above examples are only for the convenience of understanding of this solution and are not used to limit this solution.
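  • the idea of listing each matrix element as a named real-valued entry, so that its position can be recovered from its name, can be sketched as follows. This is not an ASN.1 encoder; the entry naming with an underscore between row and column (for example "U0_1") is an assumption made to keep names unambiguous, and the function names are hypothetical.

```python
def matrix_to_entries(name, matrix):
    """Flatten a matrix into (entry_name, value) pairs in row-major order."""
    return [
        (f"{name}{i}_{j}", float(v))
        for i, row in enumerate(matrix)
        for j, v in enumerate(row)
    ]

def entries_to_matrix(name, entries, rows, cols):
    """Rebuild the matrix from named entries; the name encodes row and column."""
    lookup = dict(entries)
    return [[lookup[f"{name}{i}_{j}"] for j in range(cols)] for i in range(rows)]
```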
  • FIG. 6 is a flow chart of a first device obtaining a set of parameters adopted by a first machine learning model provided in an embodiment of the present application.
  • 601. the first device sends a UCI to a base station, and the UCI carries second identification information.
  • 602. the base station obtains a set of parameters pointed to by the second identification information, that is, a set of parameters adopted by the first machine learning model.
  • 603. the base station obtains a DCI; the DCI carries the set of parameters pointed to by the second identification information and the indication information corresponding to that set of parameters, and the indication information indicates the positions, in the first machine learning model, of the multiple parameters included in that set.
  • 604. the base station sends the DCI to the first device, and correspondingly, the first device receives the DCI.
  • the UCI and DCI in Figure 6 can also be replaced by other types of signaling.
  • the example in Figure 6 is only for the convenience of understanding this solution and is not used to limit this solution.
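  • the exchange in FIG. 6 can be sketched as a request carrying an id and a reply carrying the parameters plus their positions. The message layouts (plain dictionaries), the field names, and the function names handle_uplink and handle_downlink are assumptions for illustration and do not reflect real signaling formats.

```python
def handle_uplink(uplink, parameter_store):
    """Base station side: read the id and build the downlink reply.

    parameter_store maps id -> {name: {"layer": int, "values": ...}}.
    """
    second_id = uplink["second_id"]
    parameter_set = parameter_store[second_id]
    # Indication information: the layer in which each parameter is used.
    indication = {name: entry["layer"] for name, entry in parameter_set.items()}
    values = {name: entry["values"] for name, entry in parameter_set.items()}
    return {"parameters": values, "indication": indication}

def handle_downlink(downlink):
    """First device side: place each received parameter in its layer."""
    model = {}
    for name, values in downlink["parameters"].items():
        layer = downlink["indication"][name]
        model.setdefault(layer, {})[name] = values
    return model
```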
  • the identification information of the parameters is carried in any one or more of the following information: DCI, UCI, SCI, RRC signaling, MAC CE, physical broadcast channel (PBCH), physical random access channel (PRACH) or other types of signaling, which are not exhaustive here.
  • the terminal device (that is, an example of the first device) can carry the second identification information in UCI, MAC CE or PRACH and send it to the base station.
  • the base station can carry the second identification information in DCI, MAC CE or PBCH and send it to the terminal device.
  • the first terminal device can carry the second identification information in SCI or MAC CE and send it to the second terminal device, and so on; it should be noted that the above-mentioned various implementation methods in which the identification information of the trained parameters corresponding to the first machine learning model is carried by signaling are not described one by one here. Other situations in which the aforementioned identification information is carried by signaling in the aforementioned various implementation methods can be understood by referring to the aforementioned description.
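  • the examples above pair each transmission direction with the signaling types that may carry the identification information; this can be summarized in a small table. The table below lists only the examples named in the preceding lines and is not exhaustive; the names ID_CARRIER_EXAMPLES and is_listed_example are hypothetical.

```python
# Examples from the text only; other signaling types are also allowed.
ID_CARRIER_EXAMPLES = {
    "uplink": {"UCI", "MAC CE", "PRACH"},   # terminal device -> base station
    "downlink": {"DCI", "MAC CE", "PBCH"},  # base station -> terminal device
    "sidelink": {"SCI", "MAC CE"},          # terminal device -> terminal device
}

def is_listed_example(direction, signaling):
    """Return True if the text lists this signaling type for this direction."""
    return signaling in ID_CARRIER_EXAMPLES.get(direction, set())
```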
  • carrying the aforementioned at least one set of parameters and/or the identification information of each set of parameters in signaling for transmission has higher transmission efficiency and consumes fewer computer resources; in addition, the present solution provides a variety of signaling that can be used to transmit the aforementioned at least one set of parameters and/or the identification information of each set of parameters, thereby improving the implementation flexibility of the present solution.
  • the trained parameters corresponding to the first machine learning model and/or identification information of the aforementioned parameters may be carried in a data packet. That is, in the above implementations, the trained parameters corresponding to the first machine learning model and/or identification information of the aforementioned parameters are sent between different devices by sending data packets.
  • the terminal device (also an example of the first device) can send a first data packet carrying the second identification information to the base station, and correspondingly, the base station can obtain the second identification information from the first data packet.
  • the base station can send a second data packet to the terminal device (also an example of the first device), and the second data packet carries a set of trained parameters pointed to by the second identification information; the terminal device can obtain the aforementioned set of target parameters adopted by the first machine learning model from the second data packet.
  • the terminal device (also an example of the first device) can send a third data packet to another terminal device (also an example of the second device), and the third data packet carries a set of target parameters adopted by the first machine learning model; the second device can obtain a set of target parameters adopted by the first machine learning model in the first device from the third data packet, and so on.
  • implementation method 2, implementation method 3 and implementation method 4 are not described one by one here.
  • the specific implementation methods in implementation method 2, implementation method 3 and implementation method 4 can be understood by referring to the above description of implementation method 1.
  • the first machine learning model may be retrained to optimize the parameters used in the first machine learning model, and the first device obtains a set of updated parameters of the first machine learning model.
  • the set of updated parameters of the first machine learning model may be sent to the second device.
  • the base station retrains the first machine learning model to obtain a set of updated parameters of the first machine learning model.
  • the first device may send a request to the base station, the aforementioned request is used to request the base station to retrain the first machine learning model, and the base station sends a set of updated parameters of the first machine learning model to the first device.
  • the first device may also send the aforementioned set of updated parameters to the second device.
  • the base station can also send the aforementioned set of updated parameters to the second device, so that the second device can determine the set of updated parameters adopted by the first machine learning model in the first device.
  • the first device retrains the first machine learning model to obtain a set of updated parameters of the first machine learning model.
  • the first device also sends the set of updated parameters of the first machine learning model to the second device.
  • the first device is a terminal device and the second device is a base station
  • after the terminal device retrains the first machine learning model and obtains a set of updated parameters of the first machine learning model, it can send the aforementioned set of updated parameters to the base station.
  • the first device is a first terminal device and the second device is a second terminal device
  • after the first terminal device retrains the first machine learning model and obtains a set of updated parameters of the first machine learning model, it can send the aforementioned set of updated parameters to the second terminal device, and so on.
  • Various situations are not enumerated here.
  • the first device obtains the value of T, where T is an integer greater than or equal to 1, and T represents the number of sub-data included in the output data of the first machine learning model.
  • the output data of the first machine learning model is modulated data
  • the modulated data may include T groups of modulated symbols.
  • the output data of the first machine learning model is encoded data
  • the encoded data may include T groups of encoded bit data.
  • the task performed by the first machine learning model is to generate a reference signal
  • the output data of the first machine learning model is a reference signal
  • T may represent the length of the aforementioned reference signal.
  • the length of the reference signal may indicate the number of symbols included in the reference signal.
  • the first device obtains data to be processed.
  • the first device determines whether the length of the data to be processed is less than H. If the determination result is yes, the process proceeds to step 405; if the determination result is no, the process proceeds to step 406, where H indicates the length of the first data.
  • step 404 is an optional step.
  • the first device can also obtain the length of the data to be processed and the value of H.
  • the length of the data to be processed can be K, where K is an integer greater than or equal to 1, and H indicates the length of the first data, that is, H indicates the expected length of the input data of the first machine learning model, and H is an integer greater than or equal to 1.
  • the first device may determine whether K is less than H; if the determination result is yes, proceed to step 405; if the determination result is no, proceed to step 406.
  • the length of the data to be processed may be the number of bits of the data to be processed.
  • the data to be processed is the data that needs to be encoded
  • the length of the data to be processed may be the number of bits of the data that needs to be encoded.
  • the data to be processed is the data that needs to be modulated
  • the length of the data to be processed may be the number of bits of the data that needs to be modulated.
  • the data to be processed is the data that needs to be encoded and modulated, and the length of the data to be processed is the number of bits of the data that needs to be encoded and modulated.
  • the task performed by the first machine learning model is to generate a reference signal, the data to be processed includes the index numbers of multiple reference signals, and the length of the data to be processed is the number of bits of the index numbers of the aforementioned multiple reference signals, and so on.
  • the data to be processed includes index numbers of multiple reference signals, and the length of the data to be processed can be the number of the aforementioned multiple reference signals, etc.
  • the meaning of "the length of the data to be processed" can be flexibly determined based on actual conditions and is not limited here.
  • the first device fills the data to be processed to obtain first data, and the length of the first data is H.
  • step 405 is an optional step. If the first device determines that the length of the data to be processed is less than H, the first device can fill the data to be processed to obtain the first data.
  • the length of the first data is H, and the first data may include the data to be processed and the filled data.
  • when the length of the data to be processed is less than H, the data to be processed is padded to obtain first data of length H, which is then input into the first machine learning model. In this way, however long the data to be processed is, the first machine learning model processes first data of length H, which is conducive both to compatibility with data to be processed of any length and to reducing the difficulty of data processing for the first machine learning model, so as to obtain second data with better performance.
  • the padding data may include first identification information, and the first identification information is used to identify the value of T and/or the value of K, that is, the first identification information can be used to identify the value of T and the value of K, and can also be used to identify the value of T, and can also be used to identify the value of K.
  • the value of T is the number of sub-data included in the output data of the first machine learning model
  • K is the length of the data to be processed
  • T and K are both integers greater than or equal to 1.
  • the first device may use the first function to process the value of T and/or the value of K to obtain the first identification information.
  • the conditions that the first function needs to meet include: limiting the value of the first identification information within a preset range, and being able to map different values of T and/or K to different values, that is, the value generated by the first function can uniquely identify a certain value of T and/or value of K, or in other words, the value generated by the first function can distinguish different values of T and/or value of K.
  • the first function may be a binary function, a linear function, or a nonlinear function.
  • f(T, K) represents an example of the first function, and the example in formula (1) is only for the convenience of understanding the present solution and is not used to limit the present solution.
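  • The two conditions on the first function (a bounded output and distinct outputs for distinct values of T and K) can be illustrated with a minimal sketch. The mapping below is not formula (1); the bounds `T_MAX` and `K_MAX`, the range [0, 1) and both function names are assumptions made only for illustration.

```python
from typing import Tuple

# Hypothetical upper bounds on T and K; they are not taken from the source.
T_MAX = 16
K_MAX = 64


def first_function(t: int, k: int) -> float:
    """Map a (T, K) pair to a distinct value inside the preset range [0, 1)."""
    if not (1 <= t <= T_MAX and 1 <= k <= K_MAX):
        raise ValueError("T or K is outside the assumed range")
    index = (t - 1) * K_MAX + (k - 1)  # one unique integer per (T, K) pair
    return index / (T_MAX * K_MAX)     # scale the integer into [0, 1)


def inverse_first_function(value: float) -> Tuple[int, int]:
    """Recover (T, K) from the value, showing that the mapping is unambiguous."""
    index = round(value * (T_MAX * K_MAX))
    return index // K_MAX + 1, index % K_MAX + 1
```

  • Because every pair maps to its own integer index before scaling, two different pairs can never produce the same identification value under these assumptions.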
  • the first data carries first identification information for identifying the value of T and/or the value of K.
  • the first machine learning model can process the first data according to the value of T and/or the value of K, that is, according to the length of the output data of the first machine learning model and/or the actual length of the data to be processed, which is conducive to obtaining second data with better performance.
  • the padding data may also carry identification information of the first device.
  • if the first device is a terminal device, the identification information of the first device may be the identification information of the terminal device; if the first device is a base station, the identification information of the first device may be the identification information of the base station; for example, the identification information of the first device may be a radio network temporary identity (RNTI), a cell identity (ID), a physical cell identity (PCI), or other types of identification information, etc., which are not exhaustive here.
  • the padding data may also carry invalid information.
  • the remaining space in the padding data may be filled with 0, 1 or other values. The example here is only for the convenience of understanding the present solution and is not used to limit the present solution.
  • i is taken from 0 to H-1 in sequence. If i is less than K, b_i is obtained from the data to be processed, and b_i is converted to c_i and put into the first data, where c_i represents the (i+1)-th element of the first data; if i is greater than or equal to K and less than H-1, 0 is filled into the first data; if i is equal to H-1, the first identification information is filled into the first data.
  • the meaning of the first identification information can be found in the above description and will not be elaborated here. It should be understood that the example here is only for the convenience of understanding this scheme and is not used to limit this scheme.
  • Figure 7 is a schematic diagram of the first data provided in the embodiment of the present application.
  • the first data includes data to be processed, 0 for padding, and first identification information.
  • the example in Figure 7 is only for the convenience of understanding this solution and is not used to limit this solution.
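  • The padding layout described above (data to be processed, then zeros, then the first identification information in the last position) can be sketched as follows. The function name and the identity conversion from b_i to c_i are assumptions; the source does not specify how b_i is converted.

```python
from typing import List, Sequence


def build_first_data(bits: Sequence[int], h: int, first_id: float) -> List[float]:
    """Pad K data elements to length H: data, zero padding, then the identifier."""
    k = len(bits)
    if not 1 <= k < h:
        raise ValueError("padding applies only when 1 <= K < H")
    first_data: List[float] = []
    for i in range(h):
        if i < k:
            # c_i is taken from b_i; an identity conversion is assumed here.
            first_data.append(float(bits[i]))
        elif i < h - 1:
            first_data.append(0.0)       # invalid (padding) information
        else:
            first_data.append(first_id)  # first identification information at i = H-1
    return first_data
```

  • For example, with K = 3 and H = 6, the first data holds three data elements, two zeros and the identifier in the sixth position.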
  • the padding data may include identification information of the first device but not the first identification information. In another case, the padding data may carry only invalid information.
  • the first device merges the first identification information and the data to be processed to obtain the first data, where the first identification information is used to identify the value of T and/or the value of K, where K is the length of the data to be processed and K is an integer greater than or equal to 1.
  • step 406 is an optional step. If the first device determines that the length of the data to be processed is equal to H, the first device may also merge the first identification information and the data to be processed to obtain the first data. For the meaning of the first identification information, refer to the above description, which is not repeated here.
  • “merging the first identification information and the data to be processed” includes, but is not limited to: concatenating, adding or other merging methods of the first identification information and the data to be processed, etc., which are not limited here.
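  • The two merging methods named above can be sketched as follows. The function name, the `method` argument and the choice to append the identifier at the end (for concatenation) or add it to every element (for addition) are assumptions; the source leaves these details open.

```python
from typing import List, Sequence


def merge_identification(data: Sequence[float], first_id: float,
                         method: str = "concat") -> List[float]:
    """Merge the first identification information with the data to be processed."""
    if method == "concat":
        # Concatenation: the result is one element longer than the data.
        return list(data) + [first_id]
    if method == "add":
        # Addition: the length is unchanged; every element is offset by the identifier.
        return [value + first_id for value in data]
    raise ValueError("unknown merging method: " + method)
```

  • Under these assumptions, addition keeps the length of the first data equal to the length of the data to be processed, whereas concatenation increases it by one.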
  • steps 404 to 406 are all optional steps, and if steps 404 to 406 are not performed, the data to be processed can be directly determined as the first data. Alternatively, steps 404 and 405 can be omitted, and only step 406 can be performed.
  • steps 404 and 405 may be performed, and step 406 may not be performed. Then, when the first device determines that the length of the data to be processed is equal to H, the data to be processed may be directly determined as the first data.
  • the first device inputs the first data into the first machine learning model to obtain second data generated by the first machine learning model, where the second data includes T sub-data, wherein the first machine learning model includes one or more modules, and each time a module in the first machine learning model is called at least once, one sub-data is obtained.
  • the first device may directly input the first data into the first machine learning model.
  • the first device may also scramble the first data using the identification information of the first device, and input the scrambled first data into the first machine learning model.
  • the meaning of the identification information of the first device can be referred to the above description, which will not be repeated here.
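  • Scrambling with the identification information of the first device can be sketched as follows. The source does not specify the scrambling method; this sketch assumes bit-valued first data and an XOR with a pseudo-random sequence seeded by the device identifier, using Python's general-purpose generator as a stand-in for whatever sequence a real system would use.

```python
import random
from typing import List, Sequence


def scramble(bits: Sequence[int], device_id: int) -> List[int]:
    """XOR the bits with a pseudo-random mask derived from the device identifier."""
    rng = random.Random(device_id)              # seed assumed to be e.g. an RNTI value
    mask = [rng.getrandbits(1) for _ in bits]   # one mask bit per data bit
    return [b ^ m for b, m in zip(bits, mask)]
```

  • Because XOR is its own inverse, applying `scramble` again with the same identifier restores the original bits.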
  • "One sub-data is obtained each time a module in the first machine learning model is called at least once" means that one sub-data among the T sub-data can be obtained each time one module in the first machine learning model is called at least once, or each time multiple modules in the first machine learning model are each called at least once.
  • the function of the first machine learning model includes any one or more of the following combinations: encoding, modulation, or generating a reference signal.
  • the T sub-data represent T groups of encoded bit data, and each group of encoded bit data may include one or more bit data.
  • the T sub-data represent T groups of modulated symbols, and each group of modulated symbols may include one or more symbols.
  • the T sub-data represent T groups of encoded and modulated symbols, and each group of encoded and modulated symbols may include one or more symbols.
  • the T sub-data may represent T groups of symbols in the reference signal, etc. It should be understood that the examples given here are only for the convenience of understanding this scheme and are not used to limit this scheme.
  • the first machine learning model may include a first module and at least one second module, and the difference between the first module and the second module includes: the initial input of the first module is the first data or the scrambled first data, and the initial input of the second module is the feature information of the first data (or the scrambled first data).
  • step 407 may include: the first device inputs the first data (or the scrambled first data) into the first module to obtain the first sub-data generated by the first module, and the first sub-data is one of the T sub-data; the first device can obtain the characteristic information of the first data in the process of using the first module to generate the first sub-data.
  • the first device inputs the first characteristic information into the second module to obtain the second sub-data generated by the second module, wherein the first characteristic information includes the characteristic information generated when the module in the first machine learning model was last called for data processing, and the second sub-data is one of the T sub-data.
  • the first machine learning model includes a first module and at least one second module; the input of the first module is the entire first data, and the input of the second module is the feature information obtained the last time a module of the first machine learning model was called. The feature information obtained when the first module processes the entire first data is input when the second module is called for the first time, so the feature information input into the second module each time reflects the entire first data; that is, the information of the entire first data is referred to when each of the T sub-data is generated, which is conducive to obtaining second data with better performance. In addition, each call of the second module yields one second sub-data among the T sub-data, which is conducive to quickly obtaining the T sub-data included in the second data.
  • the first device may obtain the first sub-data generated by the first module by calling the first module once or multiple times; optionally, if the first device obtains the first sub-data generated by the first module by calling the first module multiple times, the first characteristic information input when the second module is called for the first time is the characteristic information generated when the first module is called for the last time.
  • the first device inputs the first data (or the first data after scrambling) into the first module, processes the first data (or the first data after scrambling) through the first module, and then directly outputs the first sub-data.
  • the first module may use a convolutional neural network, a recurrent neural network, a fully connected neural network, or other types of neural networks, etc., which are not limited here.
  • the process of the first device using the first module to process the first data may include: the first device performs a linear transformation on the first data (or the first data after scrambling) through the first module, and processes it with a third activation function to obtain characteristic information of the first data (or the first data after scrambling); linearly transforms the characteristic information of the first data (or the first data after scrambling), and processes it with a fourth activation function to obtain the first sub-data generated by the first module.
  • the activation function in the first machine learning model can be any of the following: tanh(x), max(min(a*x,+1),-1), sin(x) or other types of activation functions, etc.
  • the examples here are only used to prove the feasibility of this solution and are not used to limit this solution.
  • the third activation function and the fourth activation function may use the same activation function or different activation functions, which may be flexibly set according to actual conditions and are not limited in the embodiments of the present application.
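  • The three activation functions listed above can be written down directly. The names `clipped_linear` and `ACTIVATIONS` and the default slope a = 1.0 are assumptions made for illustration.

```python
import math


def clipped_linear(x: float, a: float = 1.0) -> float:
    """max(min(a*x, +1), -1): a linear ramp of slope a, clipped to [-1, +1]."""
    return max(min(a * x, 1.0), -1.0)


# Lookup table so that a module can select its activation function by name.
ACTIVATIONS = {
    "tanh": math.tanh,
    "clip": clipped_linear,
    "sin": math.sin,
}
```

  • All three functions keep their output within [-1, +1], so either can serve as the third or fourth activation function in a sketch of the modules.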
  • the first device inputs the first data (or the first data after scrambling) into the first module, and processes the first data (or the first data after scrambling) through the first module.
  • the first device obtains the second characteristic information of the first data (or the first data after scrambling) obtained in the aforementioned processing process, inputs the second characteristic information into the first module again, and processes the second characteristic information through the first module; the first device repeats the aforementioned step of "obtaining the second characteristic information obtained in the process of calling the first module for data processing last time, inputting the second characteristic information into the first module again, and processing the second characteristic information through the first module" at least once, and uses the processing result of the last call of the first module to process the second characteristic information as the first sub-data.
  • the process of the first device using the first module to process the input data each time may include: the first device performs a linear transformation on the input data through the first module, and processes it using the third activation function to obtain the transformed input data; performs a linear transformation on the transformed input data, and processes it using the fourth activation function to obtain the processing result of the first module.
  • the input data of the first module may be the first data (or the first data after scrambling), or the characteristic information of the first data (or the first data after scrambling).
  • for each second sub-data, specifically, each time the first device calls the second module, it inputs the first feature information into the second module and processes the first feature information through the second module to obtain the second sub-data generated by the second module, where the second sub-data is one of the T sub-data.
  • the first feature information is the feature information generated when the module in the first machine learning model is called for data processing last time; exemplarily, the first feature information can be the feature information generated when the first module in the first machine learning model is called for data processing last time, or it can be the feature information generated when the second module in the first machine learning model is called for data processing last time.
  • each second module can use a convolutional neural network, a recurrent neural network, a fully connected neural network, or other types of neural networks, etc., which are not limited here.
  • the process in which the first device uses the second module to process the first feature information each time may include: the first device performs a linear transformation on the first feature information through the second module, and processes it using the first activation function to obtain the transformed feature information; performs a linear transformation on the transformed feature information, and processes it using the second activation function to obtain the second sub-data generated by the second module.
  • the first activation function and the second activation function are both activation functions within the first machine learning model, and the specific activation function to be used can be flexibly set according to the actual situation.
  • a specific implementation method for data processing by the second module is provided. Since the above method is simple and easy to implement, it helps reduce the computer resources consumed in generating the second data; in addition, the number of parameters used in the second module in the above method is relatively small, which helps reduce the communication resources consumed when transmitting the parameters of the first machine learning model.
  • the first machine learning model includes a first module and a second module, and a first sub-data is obtained by calling the first module once, and T-1 second sub-data are obtained by calling the second module T-1 times:
  • the parameters used in the first module include U, ⁇_s, V and ⁇_o
  • the parameters used in the second module include W, ⁇_s, V and ⁇_o
  • ⁇′_s represents the transpose of ⁇_s
  • ⁇′_o represents the transpose of ⁇_o
  • the values of the parameters used in the first module and the second module are the same, that is, the values of U in the first module and W in the second module are the same.
  • c represents the first data
  • c′ represents the transpose of the first data
  • Uc′ + ⁇′_s represents the linear transformation of the transposed first data.
  • o′_0 = Vs′_0 + ⁇′_o represents that the feature information s′_0 of the first data is subjected to a linear transformation to obtain o′_0
  • o_0 represents the transpose of o′_0
  • exp(j2πo_0) represents that o_0 is processed by the fourth activation function to obtain the first sub-data x_0.
  • the example here in which the first machine learning model only includes a first module and a second module, and the parameters used in the first module and the second module are consistent, is only an example for the convenience of understanding the present solution.
  • the parameters used by the first module and the second module may also be inconsistent, and the first machine learning model may also include multiple second modules.
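  • The case of one first module and one second module can be sketched as a simple recurrence. This is a minimal sketch, not the formulas of the application: tanh is assumed for the activation that produces the feature information, exp(j2π·) is assumed to be applied to every sub-data as it is to x_0, and the names `affine`, `generate_sub_data`, `bias_s` and `bias_o` (the additive terms of the two linear transformations) are illustrative.

```python
import cmath
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def affine(m: Matrix, x: Vector, bias: Vector) -> Vector:
    """Linear transformation m @ x + bias, written without external libraries."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(m, bias)]


def generate_sub_data(c: Vector, u: Matrix, w: Matrix, v: Matrix,
                      bias_s: Vector, bias_o: Vector, t_count: int) -> List[List[complex]]:
    """Call the first module once and the second module T-1 times."""
    # First module: feature information s_0 of the whole first data c.
    s = [math.tanh(z) for z in affine(u, c, bias_s)]
    outputs: List[List[complex]] = []
    for t in range(t_count):
        if t > 0:
            # Second module: update the feature information from the previous call.
            s = [math.tanh(z) for z in affine(w, s, bias_s)]
        o = affine(v, s, bias_o)
        # Output activation exp(j*2*pi*o): unit-modulus complex symbols.
        outputs.append([cmath.exp(2j * math.pi * val) for val in o])
    return outputs
```

  • Every sub-data is derived from feature information that traces back to the entire first data, and each additional sub-data costs one call of the second module.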
  • FIG8 is a schematic diagram of generating T sub-data using the first machine learning model provided in an embodiment of the present application.
  • the first data can be linearly transformed by the first module and processed by the third activation function to obtain S_0 (i.e., the feature information of the first data); S_0 is linearly transformed to obtain O_0, and O_0 is processed to obtain the first sub-data generated by the first module.
  • S_0 generated in the process of calling the first module to process the first data is input into the second module of the first machine learning model; S_0 is linearly transformed by the second module and processed by the first activation function to obtain S_1 (that is, the updated feature information of the first data); S_1 is linearly transformed to obtain O_1, and after O_1 is processed, the first of the second sub-data generated by the second module is obtained.
  • the feature information (i.e., S_{t-1}) generated when a module of the first machine learning model was last called for processing is input into the second module. For example, if the second module of the first machine learning model is called for the first time, the feature information (an example of S_{t-1}) generated when the first module of the first machine learning model was called for processing is input into the second module; if the second module is called for the second to (T-1)-th time, the feature information (another example of S_{t-1}) generated when the second module was last called for processing is input into the second module called currently.
  • S_{t-1} is linearly transformed by the second module and processed using the first activation function to obtain S_t (i.e., the updated feature information of the first data); S_t is linearly transformed to obtain O_t, and after O_t is processed, the t-th second sub-data generated by the second module is obtained. The second module can thus be called T-1 times to obtain T-1 second sub-data.
  • the T-1 second sub-data and 1 first sub-data can constitute T sub-data in the second data.
  • the example in FIG8 is only for facilitating the understanding of the present solution and is not used to limit the present solution.
  • the first machine learning model may include multiple second modules, wherein at least two of the multiple second modules use different parameters. That is, the first device can generate one second sub-data each time it calls a second module, but different second modules may be called in the process of generating the T-1 second sub-data.
  • two second modules using different parameters may include any of the following differences: the same type of parameters are used in the two second modules, but the parameter values used in the two second modules are not exactly the same; or the types of parameters used in the two second modules are not exactly the same, etc., which are not exhaustive here.
  • the same parameters are used in the two second modules means that not only the types of parameters used in the two second modules are exactly the same, but also the values of each parameter are exactly the same.
  • the multiple second modules may include a second module 1, a second module 2 and a second module 3. If the value of T is 8, 7 second sub-data need to be generated.
  • Each second module can be used to generate the same number (for example, 3) of second sub-data, that is, the second module 1 is used to generate the first three second sub-data, the second module 2 is used to generate the fourth, fifth and sixth second sub-data, and the second module 3 is used to generate the seventh second sub-data.
  • the multiple second modules may include a second module 1, a second module 2 and a second module 3, and the value of T is 8, so 7 second sub-data need to be generated.
  • the number of second sub-data generated by each second module may also be different, that is, the second module 1 is used to generate the first 3 second sub-data, the second module 2 is used to generate the 4th and 5th second sub-data, and the second module 3 is used to generate the 6th and 7th second sub-data. It should be noted that the examples here are only for the convenience of understanding the present scheme and are not used to limit the present scheme.
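  • Both allocation examples (T = 8, so 7 second sub-data shared among three second modules) can be reproduced with a small helper. The function name and the list-of-counts representation are assumptions made for illustration.

```python
from typing import List, Sequence


def assign_modules(counts: Sequence[int]) -> List[int]:
    """Return, for each second sub-data in order, the number of the second module
    that generates it; counts[i] is how many sub-data module i+1 generates."""
    schedule: List[int] = []
    for module_number, count in enumerate(counts, start=1):
        schedule.extend([module_number] * count)
    return schedule


equal_split = assign_modules([3, 3, 1])    # up to three sub-data per module
unequal_split = assign_modules([3, 2, 2])  # different numbers per module
```

  • In both cases the schedule has 7 entries, one per second sub-data, and each module handles a contiguous run of sub-data.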
  • x_t represents the (t+1)-th sub-data among the T sub-data included in the second data.
  • Different second modules use the same ⁇_s and ⁇_o, and different second modules use different W and V.
  • W_{t mod ⁇} and V_{t mod ⁇} represent the periodic calling of ⁇ second modules. There are ⁇ groups of different parameters in the ⁇ second modules.
  • W_{t mod ⁇} can be specifically expressed as W_0, W_1, ..., W_{⁇-1}
  • V_{t mod ⁇} can be specifically expressed as V_0, V_1, ..., V_{⁇-1}.
  • the ⁇ groups of different parameters include W_0 and V_0, W_1 and V_1, ..., W_{⁇-1} and V_{⁇-1}, respectively.
  • ⁇ second sub-data are generated by ⁇ second modules in one cycle, and the ⁇ second modules are reused in the next cycle.
  • the model may include three second modules, namely, the second module 1, the second module 2 and the second module 3. The three second modules can be called cyclically.
  • the second module 1 After generating a second sub-data through the second module 1 (that is, using W 1 and V 1 ), and then generating a second sub-data through the second module 2 (that is, using W 2 and V 2 ), and then generating a second sub-data through the second module 3 (that is, using W 0 and V 0 ), the second module 1 can be called again, and so on. It should be understood that the examples here are only for the convenience of understanding the present solution and are not used to limit the present solution.
  • multiple second modules may be used in the first machine learning model, and at least two of the multiple second modules use different parameters, that is, the T-1 second sub-data are generated by different second modules, which helps the parameters of each second module better match the second sub-data it generates, and thus is beneficial to obtaining second data with better performance.
  • the first machine learning model may include a first module and at least one third module, and the difference between the first module and the third module includes: the initial input of the first module is the first data or the scrambled first data, and the initial input of the third module is the feature information of the first data (or the scrambled first data).
  • the meaning of the "third module” is similar to that of the "second module", which can be understood by referring to the above description and will not be repeated here.
  • step 407 may include: the first device inputs the first data (or the first data after interference) into the first module, and generates the first sub-data through the first module, and the first sub-data is one of the T sub-data; the specific implementation of the aforementioned steps can refer to the above description, which is not repeated here.
  • the first device calls the third module multiple times to obtain the third sub-data generated by the third module, the third sub-data being one of the T sub-data, wherein the input of the third module includes the feature information of the first data (or the scrambled first data), and this feature information is updated multiple times in the process of calling the third module multiple times.
  • the process by which the third module processes its input is similar to that of the second module. The difference is that each call to a second module yields one second sub-data, whereas the first device calls a third module multiple times, updates the feature information of the first data (or the scrambled first data) in the course of these calls, and uses only the processing result of the last call as one third sub-data.
  • the third sub-data, the second sub-data and the first sub-data are all sub-data included in the second data. For the meaning of the sub-data, please refer to the above description, which will not be repeated here.
  • the first device inputs the feature information of the first data (or the scrambled first data) into the third module and processes it through the third module; this processing includes updating the feature information.
  • the first device then inputs the updated feature information into the third module again and processes it through the third module again, which updates the feature information once more; the first device repeats this operation at least once, and when the number of times the feature information has been processed by the third module reaches a preset number, one third sub-data generated by the third module is obtained.
  • s′_{t,l-1} represents the feature information of the first data generated the last time the third module was called
  • s′_{t,N} represents the updated feature information of the first data after it has been updated N times
  • x_t represents the third sub-data generated after the third module has been called N times
  • N is an integer greater than or equal to 2.
  • a third sub-data is generated based on the last updated characteristic information of the first data. After multiple updates to the first data, it is helpful to have a more thorough understanding of the first data, thereby generating sub-data with better performance.
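  • The repeated calling of a third module can be sketched as follows. This is only an illustration under stated assumptions: the function name `third_module_sub_data`, the tanh activations, and the absence of bias terms are hypothetical, and the exact update formula of the present application is not reproduced here. The sketch updates the feature information N times and derives one sub-data from the last update only.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def third_module_sub_data(features, w, v, n_calls):
    """Update the feature information n_calls times, then emit one sub-data.

    Returns (updated_features, sub_data). tanh is an assumed activation.
    """
    if n_calls < 2:
        raise ValueError("N must be an integer greater than or equal to 2")
    for _ in range(n_calls):
        # Each call refines the feature information; no sub-data is emitted yet.
        features = [math.tanh(z) for z in matvec(w, features)]
    # Only the last updated feature information is used to produce the sub-data.
    sub_data = [math.tanh(z) for z in matvec(v, features)]
    return features, sub_data
```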
  • the size of the parameters in the first machine learning model is related to the value of H and the value of G, where H is the length of the first data and G is the length of each sub-data in the T sub-data.
  • the T sub-data represent T groups of encoded bit data, and G represents the number of bits in each group of encoded bit data.
  • the T sub-data represent T groups of modulated symbols, and G represents the number of symbols in each group of modulated symbols.
  • the T sub-data may represent T groups of symbols in the reference signal, and G represents the number of symbols in each group of symbols, etc. It should be understood that the examples given here are only for the convenience of understanding this solution and are not used to limit this solution.
  • for example, when the value of H is 12 and the value of G is 12, the parameters used in the first machine learning model may include U, W, ⁇_s, V, and ⁇_o. In this case, U is a 12 × 12 matrix, W and V have the same size as U, ⁇_s is a 1 × 12 vector, and ⁇_o has the same size as ⁇_s.
  • for another example, when the value of H is 6 and the value of G is 6, the parameters used in the first machine learning model may include U, W, ⁇_s, V, and ⁇_o. In this case, U is a 6 × 6 matrix, W and V have the same size as U, ⁇_s is a 1 × 6 vector, and ⁇_o has the same size as ⁇_s.
  • for another example, when the value of H is 12 and the value of G is 6, the parameters used in the first machine learning model may include U, W, ⁇_s, V, and ⁇_o; the dimensions of U, W, ⁇_s, V and ⁇_o can be understood by referring to the above description and are not repeated here. It should be noted that the examples given here are only for ease of understanding the present solution and are not used to limit it.
  • the size of the parameters in the first machine learning model is designed according to the length of the first data and the length of each sub-data in the T sub-data, which is beneficial to reducing the amount of parameters in the first machine learning model while meeting the output requirements, and is beneficial to further reducing the communication resources consumed by transmitting the parameters of the first machine learning model.
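  • To make the effect on the amount of parameters concrete, the following sketch counts scalar parameters for the two examples above in which H equals G. It assumes, as stated in those examples, three square matrices (U, W, V) and two 1 × n vectors; the function name `parameter_count` is hypothetical, and the case where H differs from G is not covered.

```python
def parameter_count(n):
    """Number of scalar parameters when H == G == n.

    U, W and V are n x n matrices; the two vectors are 1 x n.
    """
    matrices = 3 * n * n
    vectors = 2 * n
    return matrices + vectors

print(parameter_count(12))  # 3 * 144 + 2 * 12 = 456
print(parameter_count(6))   # 3 * 36 + 2 * 6 = 120
```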
  • the first device determines a signal to be sent according to the second data.
  • after the first device generates the second data through the first machine learning model, it can also determine the signal to be sent according to the second data. For example, if the function of the first machine learning model is encoding, the second data is the encoded data, and the first device also needs to modulate the second data to obtain the signal to be sent.
  • for another example, if the second data is modulated data, the first device can also rate-match the second data by truncation to obtain the signal to be sent.
  • for another example, if the second data is encoded and modulated data, the first device can also rate-match the second data by truncation to obtain the signal to be sent.
  • for another example, if the function of the first machine learning model is to generate a reference signal, the second data is the reference signal, and the first device can determine the reference signal as the signal to be sent.
  • the first device may also perform other operations in the process of obtaining the second data and determining the signal to be sent according to the second data, which are not limited here.
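  • Rate matching by truncation can be sketched as below. The present application only states that truncation is used; keeping the leading elements and the function name `rate_match_by_truncation` are assumptions made for illustration.

```python
def rate_match_by_truncation(second_data, target_length):
    """Keep only the first target_length elements of the second data.

    Which elements are kept is an assumption; the source only says "truncation".
    """
    if not 0 < target_length <= len(second_data):
        raise ValueError("target_length must be between 1 and len(second_data)")
    return list(second_data[:target_length])
```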
  • the first device sends the signal to be sent to the second device.
  • the second device obtains second data, the second data includes T sub-data, T is an integer greater than or equal to 1, the second data is generated by the first machine learning model in the first device, the first machine learning model includes one or more modules, and each time a module in the first machine learning model is called at least once, one sub-data is obtained.
  • after the second device obtains the received signal corresponding to the signal to be sent, it can denoise the received signal and obtain the received second data (that is, the estimated second data) from the denoised received signal.
  • for example, if the second data is encoded data, the received second data is obtained after demodulating the denoised received signal.
  • the second data is modulated data, that is, the function of the first machine learning model is modulation, or the function of the first machine learning model is encoding and modulation, then the denoised received signal can be directly determined as the second data.
  • for another example, if the second data is a reference signal, the second device, after acquiring the received signal (that is, the received second data) corresponding to the signal to be sent, can determine estimated channel information based on the received signal.
  • the second device generates first data according to the second data.
  • the second device may generate the estimated first data according to the received second data.
  • the second device may acquire a set of parameters adopted by the first machine learning model in the first device.
  • the specific implementation of the above steps can refer to the description in step 401.
  • the second device may demodulate and/or decode the second data according to a set of parameters adopted by the first machine learning model in the first device to generate the estimated first data.
  • that is, the estimated first data can be computed from the received second data.
  • the second device demodulates and/or decodes the second data according to a set of parameters adopted by the first machine learning model in the first device, which may include: after obtaining a set of parameters adopted by the first machine learning model, the second device can determine what operations the first device has performed on the first data using the first machine learning model, and then can use an estimation algorithm to perform an inverse operation on the second data to achieve demodulation and/or decoding of the second data, the aforementioned inverse operation being the inverse operation of the operation performed on the first data using the first machine learning model.
  • the estimation algorithm may be any of the following: a maximum likelihood estimation algorithm, a maximum a posteriori probability estimation algorithm, or another type of estimation algorithm.
  • the examples given here are only for facilitating the understanding of the present solution and are not used to limit the present solution.
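  • The inverse operation by maximum likelihood estimation can be sketched as an exhaustive search: knowing the operation applied at the first device, the second device tries every candidate first data, re-applies the operation, and keeps the candidate whose output is closest to the received second data. In the sketch below, `toy_encoder` is a hypothetical stand-in for the first machine learning model (bits plus one parity bit mapped to ±1), and minimum Euclidean distance is used as the likelihood criterion, which assumes additive Gaussian noise. Exhaustive search is only practical for short data.

```python
from itertools import product

def toy_encoder(bits):
    """Hypothetical stand-in for the first machine learning model.

    Appends one parity bit and maps each bit b to the symbol 1 - 2b.
    """
    parity = sum(bits) % 2
    return [1.0 - 2.0 * b for b in list(bits) + [parity]]

def ml_estimate(received, encoder, length):
    """Return the candidate whose encoding is closest to the received data."""
    best, best_distance = None, float("inf")
    for candidate in product((0, 1), repeat=length):
        expected = encoder(candidate)
        distance = sum((r - s) ** 2 for r, s in zip(received, expected))
        if distance < best_distance:
            best, best_distance = candidate, distance
    return list(best)
```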
  • the second device demodulates and/or decodes the second data according to a set of parameters adopted by the first machine learning model in the first device, which may include: after the second device obtains a set of parameters adopted by the first machine learning model, it can obtain a second machine learning model corresponding to the first machine learning model, input the second data into the second machine learning model, and demodulate and/or decode the second data through the second machine learning model.
  • the second device can also obtain the estimated first data in other ways. The examples here are only used to prove the feasibility of this solution and are not used to limit this solution.
  • Figure 9 is a schematic diagram of a model training method provided in an embodiment of the present application.
  • the model training method includes steps 901 to 903.
  • Acquire training data from a training data set, wherein the training data is used to obtain first data and a value of T, T is an integer greater than or equal to 1, and at least two training data in the training data set include different values of T.
  • each training data may include a value of T, where T represents the number of sub-data included in the output data of the first machine learning model, and at least two training data in the training data set include different values of T.
  • each training data may also include data to be processed, and the training device may directly determine the aforementioned data to be processed as the first data; or, the first data may be obtained based on the aforementioned data to be processed using the method shown in steps 403 to 406 in the corresponding embodiment of Figure 4.
  • for the specific implementation, refer to the description in the embodiment corresponding to Figure 4 above; the meaning of the terms in step 901 can also be understood in conjunction with that description, and is not elaborated here.
  • the training device inputs the first data into the first machine learning model to obtain the second data generated by the first machine learning model.
  • for the specific implementation of step 902 and the meaning of the terms in step 902, refer to the description of step 407 in the embodiment corresponding to Figure 4, which is not repeated here.
  • the training device in steps 901 and 902 can be a terminal device or a base station, which can be flexibly set according to actual conditions.
  • the training device can train the first machine learning model based on the second data and the loss function to obtain a trained first machine learning model. It should be noted that step 903 can be performed by the same device or by different devices.
  • in one implementation, step 903 is performed by one device, and the training device in step 903 can be a terminal device or a base station.
  • if the function of the first machine learning model includes encoding and/or modulation, step 903 includes: denoising a received signal corresponding to a signal to be sent to obtain a denoised received signal, and demodulating and/or decoding the denoised received signal to obtain estimated data corresponding to the data to be processed; and training the first machine learning model according to the estimated data and a first loss function, wherein the first loss function indicates the similarity between the estimated data and the data to be processed.
  • for example, if the function of the first machine learning model is modulation, the training device can directly determine the second data as the signal to be sent; the training device obtains the received signal corresponding to the signal to be sent, denoises the received signal to obtain the denoised received signal, and demodulates the denoised received signal to obtain the estimated data corresponding to the data to be processed.
  • the training device computes the similarity between the data to be processed and the estimated data, that is, obtains the function value of the first loss function, and uses the function value of the first loss function to update the weight parameters of the first machine learning model, thereby completing one training iteration of the first machine learning model.
  • the first loss function can be a cross entropy loss function, an L1 loss function, or other types of loss functions, etc., which can be flexibly determined in combination with the actual application scenario, and is not limited in the embodiments of the present application.
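  • As an illustration of the two loss types named above, the following sketch computes an L1 loss and a binary cross entropy loss between estimated data and the data to be processed. The function names, the averaging over elements, and the clipping constant `eps` are assumptions; a lower value indicates that the estimated data is closer to the data to be processed.

```python
import math

def l1_loss(estimated, target):
    """Mean absolute difference between estimated data and target data."""
    return sum(abs(e - t) for e, t in zip(estimated, target)) / len(target)

def binary_cross_entropy(estimated_probs, target_bits, eps=1e-12):
    """Mean binary cross entropy; estimated_probs are probabilities of bit 1."""
    total = 0.0
    for p, b in zip(estimated_probs, target_bits):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= b * math.log(p) + (1 - b) * math.log(1.0 - p)
    return total / len(target_bits)
```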
  • the training device obtaining a received signal corresponding to the signal to be sent may include: the training device multiplies the signal to be sent with a channel matrix, and adds the result of the multiplication to noise to obtain the received signal, and the aforementioned steps are for simulating the process of the signal to be sent being transmitted through the channel.
  • the same channel matrix and noise can be used, or different channel matrices and/or different noises can be used. Different channel matrices and/or different noises are used to simulate channel environments with different signal-to-noise ratios.
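  • The channel simulation described above (multiply the signal to be sent by a channel matrix, then add noise) can be sketched as follows. The sketch is real-valued and draws Gaussian noise, both of which are simplifying assumptions; the function name `simulate_channel` and the parameter `noise_std` are hypothetical. Varying `channel_matrix` or `noise_std` between training iterations corresponds to simulating different channel environments.

```python
import random

def simulate_channel(signal, channel_matrix, noise_std, rng=None):
    """Return channel_matrix times signal plus Gaussian noise (real-valued sketch)."""
    rng = rng or random.Random()
    received = []
    for row in channel_matrix:
        clean = sum(h * x for h, x in zip(row, signal))
        received.append(clean + rng.gauss(0.0, noise_std))
    return received
```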
  • the function of the first machine learning model is encoding, and after obtaining the second data (i.e., the encoded data), the training device may also modulate the second data to obtain a signal to be sent; the training device obtains a received signal corresponding to the signal to be sent, denoises the received signal to obtain a denoised received signal, and demodulates and decodes the denoised received signal to obtain estimated data corresponding to the data to be processed in the training data.
  • the training device trains the first machine learning model based on the data to be processed, the estimated data, and the first loss function; the aforementioned steps and the specific implementation of "the training device obtains a received signal corresponding to the signal to be sent" can be found in the above description and will not be repeated here.
  • for another example, if the function of the first machine learning model is modulation, the training device can rate-match the second data by truncation to obtain the signal to be sent; the training device obtains a received signal corresponding to the signal to be sent, denoises the received signal to obtain a denoised received signal, and demodulates the denoised received signal to obtain estimated data corresponding to the data to be processed in the training data.
  • the subsequent steps performed by the training device can refer to the above description and will not be repeated here.
  • for another example, if the function of the first machine learning model is encoding and modulation, the training device can rate-match the second data by truncation to obtain the signal to be sent; the training device obtains a received signal corresponding to the signal to be sent, denoises the received signal to obtain a denoised received signal, and demodulates and decodes the denoised received signal to obtain estimated data corresponding to the data to be processed in the training data.
  • the subsequent steps performed by the training device can refer to the above description and will not be repeated here.
  • if the function of the first machine learning model is to generate a reference signal, the second data is the reference signal. The training device can generate a received reference signal corresponding to the reference signal, and generate predicted channel information based on the received reference signal.
  • the specific implementation method of “the training device generates a received reference signal corresponding to the reference signal” is similar to the specific implementation method of "the training device generates a received signal corresponding to the signal to be sent", the difference is that "the signal to be sent” is replaced by "reference signal”, and "the received signal” is replaced by "received reference signal", which will not be repeated here.
  • the training device trains the first machine learning model according to the second loss function, and the second loss function indicates the similarity between the predicted channel information and the correct channel information.
  • the training device calculates the similarity between the predicted channel information and the correct channel information to obtain the function value of the second loss function, and uses the function value of the second loss function to update the weight parameters of the first machine learning model, thereby realizing one training of the first machine learning model.
  • in another implementation, step 903 is performed by two training devices (hereinafter referred to as the first training device and the second training device for ease of description). For example, the first training device may be a base station and the second training device may be a terminal device that requests the base station to retrain the first machine learning model; or the first training device may be the first device and the second training device may be the second device.
  • if the function of the first machine learning model includes encoding and/or modulation, this implementation differs from the one described above mainly in how "obtaining a received signal corresponding to the signal to be sent" is implemented: the signal to be sent is sent to the second training device, which obtains the corresponding received signal and then obtains estimated data corresponding to the data to be processed; the first machine learning model is trained according to the estimated data and the first loss function.
  • a specific implementation method for training the first machine learning model is provided when the function of the first machine learning model includes encoding and/or modulation, which reduces the difficulty of implementing the present solution, and the loss function uses the similarity between the estimated data and the data to be processed, that is, the goal of the loss function is to obtain estimated data with better performance.
  • the loss function is more in line with the actual needs when sending data between devices, and the second data output by the trained first machine learning model is more in line with actual needs.
  • if the function of the first machine learning model is to generate a reference signal, this implementation differs from the one described above mainly in how "obtaining a received reference signal corresponding to the reference signal" is implemented: the first training device sends the reference signal to the second training device; after the second training device obtains the received reference signal, it generates predicted channel information according to the received reference signal; the second training device trains the first machine learning model according to the loss function.
  • a specific implementation method for training the first machine learning model when the function of the first machine learning model is to generate a reference signal is also provided, which expands the application scenarios of the present solution and improves the implementation flexibility of the present solution.
  • the training device repeatedly executes steps 901 to 903 to iteratively train the first machine learning model until the convergence condition of the first loss function is satisfied, thereby obtaining a set of trained parameters of the first machine learning model.
  • the training device can also use multiple different training data sets to train the first machine learning model separately, so as to obtain multiple sets of trained parameters of the first machine learning model.
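  • The iterative training can be sketched as a loop that repeats one pass of steps 901 to 903 until the loss stops changing. The convergence criterion used here (the change in loss between two consecutive iterations falls below a tolerance), the iteration cap, and the names `train_until_converged` and `run_iteration` are assumptions; the present application does not specify the convergence condition in this passage.

```python
def train_until_converged(run_iteration, tolerance=1e-6, max_iterations=1000):
    """Repeat one training pass until the loss changes by less than tolerance.

    run_iteration is a callable that performs one pass (steps 901 to 903)
    and returns the loss value. Returns (iterations_run, last_loss).
    """
    previous = None
    for iteration in range(1, max_iterations + 1):
        loss = run_iteration()
        if previous is not None and abs(previous - loss) < tolerance:
            return iteration, loss
        previous = loss
    return max_iterations, previous
```

  • Running the same loop once per training data set would yield one set of trained parameters per data set.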
  • Figure 10 is a schematic diagram of a data processing device provided in an embodiment of the present application.
  • the data processing device 1000 can implement the function of the first device in the above method embodiment, and thus can also achieve the beneficial effects possessed by the above method embodiment.
  • the data processing device 1000 may include a processing module 1001; wherein the processing module 1001 is used to obtain the value of T, T is an integer greater than or equal to 1, and T represents the number of sub-data included in the output data of the first machine learning model; the processing module 1001 is also used to input the first data into the first machine learning model to obtain the second data generated by the first machine learning model, the second data including T sub-data, wherein the first machine learning model includes one or more modules, and each module in the first machine learning model is called at least once to obtain one sub-data.
  • the functionality of the first machine learning model includes any one or a combination of the following: encoding, modulation, and generating a reference signal.
  • the multiple modules in the first machine learning model include a first module and at least one second module, and the processing module 1001 is specifically used to: input the first data into the first module to obtain first sub-data generated by the first module, where the first sub-data is one of the T sub-data; and input the first feature information into the second module to obtain second sub-data generated by the second module, wherein the first feature information includes the feature information generated when a module in the first machine learning model was last called for data processing, the second sub-data is one of the T sub-data, and the module in the first machine learning model that was last called is the first module or the second module.
  • the multiple modules in the first machine learning model include a first module and at least one third module, and the processing module 1001 is specifically used to: input the first data into the first module and generate first sub-data through the first module, where the first sub-data is one of the T sub-data, and the process of generating the first sub-data through the first module includes extracting features of the first data; and call the third module multiple times to obtain third sub-data generated by the third module, where the third sub-data is one of the T sub-data, the input of the third module includes feature information of the first data, and the feature information of the first data is updated multiple times in the process of calling the third module multiple times.
  • the processing module 1001 is specifically used to: perform a linear transformation on the first feature information through the second module, and process it with a first activation function to obtain the transformed feature information; perform a linear transformation on the transformed feature information, and process it with a second activation function to obtain second sub-data.
  • the at least one second module includes multiple second modules, wherein parameters adopted by at least two second modules among the multiple second modules are different.
  • the processing module 1001 is also used to obtain the data to be processed and the value of H, where H is an integer greater than or equal to 1, and H indicates the length of the first data; the processing module 1001 is also used to pad the data to be processed, if the length of the data to be processed is less than H, to obtain the first data, the length of the first data being H.
  • the first data includes the data to be processed and padding data, the padding data includes first identification information, and the first identification information is used to identify the value of T and/or the value of K, where K is the length of the data to be processed and K is an integer greater than or equal to 1.
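  • The padding operation can be sketched as below. The sketch pads with a constant fill value, which is an assumption; the passage above states that the padding data may carry first identification information (the value of T and/or K), but its encoding is not given here, so it is not modelled. The function name `pad_to_length` is hypothetical.

```python
def pad_to_length(data_to_process, h, fill_value=0):
    """Pad data of length K up to length H; the fill value is an assumption."""
    k = len(data_to_process)
    if k > h:
        raise ValueError("data to be processed is longer than H")
    return list(data_to_process) + [fill_value] * (h - k)
```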
  • the size of the parameters in the first machine learning model is related to the value of H and the value of G, where G is the length of each sub-data.
  • the parameters corresponding to the first machine learning model are carried in one or more of the following information: downlink control information DCI, uplink control information UCI, sidelink control information SCI, radio resource control RRC signaling, or media access control control element MAC CE; and/or, the identification information of the parameter is carried in any one or more of the following information: DCI, UCI, SCI, RRC signaling, MAC CE, physical broadcast channel PBCH, or physical random access channel PRACH.
  • the data processing device 1000 is applied to the first device, and the second device is the receiving end of the second data.
  • the second device has multiple groups of parameters corresponding to the first machine learning model and identification information of each group of parameters. Referring to Figure 10, the data processing device 1000 may also include: a transceiver module 1002, used to send second identification information to the second device, where the second identification information is used to indicate the set of parameters adopted by the first machine learning model in the first device.
  • the data processing device 1100 can implement the function of the second device in the above method embodiment, and thus can also achieve the beneficial effects possessed by the above method embodiment.
  • the data processing device 1100 may include a processing module 1101; wherein the processing module 1101 is used to obtain second data; and based on the second data, generate first data.
  • the second data includes T sub-data, T is an integer greater than or equal to 1, and the second data is generated by a first machine learning model in the first device, and the first machine learning model includes one or more modules, and each time a module in the first machine learning model is called at least once, a sub-data is obtained.
  • the parameters corresponding to the first machine learning model are carried in one or more of the following information: downlink control information DCI, uplink control information UCI, sidelink control information SCI, radio resource control RRC signaling, or media access control control element MAC CE; and/or, the identification information of the parameters is carried in any one or more of the following information: DCI, UCI, SCI, RRC signaling, MAC CE, physical broadcast channel PBCH, or physical random access channel PRACH.
  • the data processing device is applied to a second device, and the second device has third data, and the third data includes multiple groups of parameters corresponding to the first machine learning model and identification information of each group of parameters.
  • the data processing device 1100 may also include: a transceiver module 1102, which is used to receive the second identification information sent by the first device; the processing module 1101 is also used to determine the set of parameters used by the first machine learning model in the first device according to the second identification information and the third data.
  • the processing module 1101 is specifically used to generate the first data according to a set of parameters used by the first machine learning model in the first device and the second data.
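  • Determining the parameter set from the second identification information and the third data can be sketched as a table lookup. Representing the third data as a dictionary keyed by identifier, the contents of each parameter group, and the function name `select_parameters` are all assumptions made for illustration.

```python
def select_parameters(third_data, second_identification):
    """Return the parameter group whose identifier matches the received one."""
    try:
        return third_data[second_identification]
    except KeyError:
        raise ValueError(
            f"unknown parameter identifier: {second_identification!r}"
        ) from None

# Hypothetical third data: identifier -> parameter group (contents are placeholders).
third_data = {0: {"H": 12, "G": 12}, 1: {"H": 6, "G": 6}}
```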
  • the model training device 1200 can implement the functions of the model training device in the above-mentioned method embodiment, and therefore can also achieve the beneficial effects possessed by the above-mentioned method embodiment.
  • the model training device 1200 may include a processing module 1201; wherein the processing module 1201 is used to obtain training data from a training data set, wherein the training data is used to obtain the first data and the value of T, T is an integer greater than or equal to 1, and at least two training data in the training data set include different values of T; the processing module 1201 is also used to input the first data into the first machine learning model to obtain the second data generated by the first machine learning model, the second data including T sub-data, wherein the first machine learning model includes multiple modules, and each time a module in the first machine learning model is called at least once, a sub-data generated by the module is obtained; the processing module 1201 is also used to train the first machine learning model based on the second data and the loss function to obtain the trained first machine learning model.
  • the functionality of the first machine learning model includes any one or a combination of the following: encoding, modulating, or generating a reference signal.
  • the multiple modules in the first machine learning model include a first module and at least one second module, and the processing module 1201 is specifically used to: input the first data into the first module to obtain first sub-data generated by the first module, where the first sub-data is one of the T sub-data; and input the first feature information into the second module to obtain second sub-data generated by the second module, wherein the first feature information includes the feature information generated when a module in the first machine learning model was last called for data processing, the second sub-data is one of the T sub-data, and the module in the first machine learning model that was last called is the first module or the second module.
  • the processing module 1201 is also used to obtain the data to be processed from the training data; the processing module 1201 is also used to obtain the value of H, where H is an integer greater than or equal to 1, and H indicates the length of the first data; the processing module 1201 is also used to pad the data to be processed if the length of the data to be processed is less than H to obtain the first data, and the length of the first data is H.
  • the second data is used to determine the signal to be sent, and the processing module 1201 is specifically used to: demodulate and/or decode the received signal corresponding to the signal to be sent to obtain estimated data corresponding to the data to be processed; and train the first machine learning model according to the estimated data and the loss function, where the loss function indicates the similarity between the estimated data and the data to be processed.
  • when the second data is a reference signal, the processing module 1201 is specifically used to: generate predicted channel information according to a received reference signal corresponding to the reference signal; and train the first machine learning model according to the loss function, wherein the loss function indicates the similarity between the predicted channel information and the correct channel information.
  • FIG 13 is a schematic diagram of a device provided in an embodiment of the present application.
  • the communication device 1300 may specifically be the terminal device in the above embodiments; in the example shown in Figure 13, the device is implemented by a terminal device (or a component in the terminal device).
  • the communication device 1300 may include but is not limited to at least one processor 1301 and a communication port 1302 .
  • the device may further include at least one of a memory 1303 and a bus 1304 .
  • the at least one processor 1301 is used to control and process the actions of the communication device 1300 .
  • the processor 1301 may be a central processing unit, a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute various exemplary logic blocks, modules, and circuits described in conjunction with the disclosure of this application.
  • the processor may also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, and the like.
  • a person skilled in the art may clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above can refer to the corresponding processes in the aforementioned method embodiments and will not be repeated here.
  • the device 1300 shown in Figure 13 can be used to implement the steps implemented by the terminal device in the aforementioned method embodiment.
  • the specific implementation methods of the device 1300 shown in Figure 13 to perform the aforementioned steps can all be referred to the description in the aforementioned method embodiment, and will not be repeated here one by one.
  • the device 1400 can specifically be the network device in the above embodiments; in the example shown in Figure 14, the device is implemented by a network device (or a component in the network device). That is, when the first device, the second device or the training device involved in the above embodiments is specifically a network device, it can be implemented as the device 1400 shown in Figure 14; illustratively, when the first device, the second device or the training device is a base station, it can be implemented as the device 1400 shown in Figure 14.
  • the structure of the communication device can refer to the structure shown in Figure 14.
  • the device 1400 includes at least one processor 1411 and at least one network interface 1412. Further optionally, the communication device also includes at least one memory 1414, at least one transceiver 1413 and one or more antennas 1415.
  • the processor 1411, the memory 1414, the transceiver 1413 and the network interface 1412 are connected, for example, through a bus. In an embodiment of the present application, the connection may include various interfaces, transmission lines or buses, etc., which are not limited in this embodiment.
  • the antenna 1415 is connected to the transceiver 1413.
  • the network interface 1412 is used to enable the communication device to communicate with other communication devices through a communication link.
  • the network interface 1412 may include a network interface between the communication device and the core network device, such as an S1 interface, and the network interface may include a network interface between the communication device and other communication devices (such as other network devices or core network devices), such as an X2 or Xn interface.
  • the processor 1411 is mainly used to process the communication protocol and communication data, and to control the entire communication device, execute the software program, and process the data of the software program, for example, to support the communication device to perform the actions described in the embodiment.
  • the communication device may include a baseband processor and a central processor.
  • the baseband processor is mainly used to process the communication protocol and communication data
  • the central processor is mainly used to control the entire terminal device, execute the software program, and process the data of the software program.
  • the processor 1411 in Figure 14 can integrate the functions of the baseband processor and the central processor. It can be understood by those skilled in the art that the baseband processor and the central processor can also be independent processors, interconnected by technologies such as buses.
  • the terminal device can include multiple baseband processors to adapt to different network formats, and the terminal device can include multiple central processors to enhance its processing capabilities.
  • the various components of the terminal device can be connected through various buses.
  • the baseband processor can also be described as a baseband processing circuit or a baseband processing chip.
  • the central processor can also be described as a central processing circuit or a central processing chip.
  • the function of processing the communication protocol and communication data can be built into the processor, or it can be stored in the memory in the form of a software program, and the processor executes the software program to realize the baseband processing function.
  • the memory is mainly used to store software programs and data.
  • the memory 1414 can exist independently and be connected to the processor 1411.
  • the memory 1414 can be integrated with the processor 1411, for example, integrated into a chip.
  • the memory 1414 can store program codes for executing the technical solutions of the embodiments of the present application, and the execution is controlled by the processor 1411.
  • the various types of computer program codes executed can also be regarded as drivers of the processor 1411.
  • FIG14 shows only one memory and one processor.
  • the memory may also be referred to as a storage medium or a storage device.
  • the memory may be a storage element on the same chip as the processor, that is, an on-chip storage element, or an independent storage element, which is not limited in the embodiments of the present application.
  • the transceiver 1413 can be used to support the reception or transmission of radio frequency signals between the communication device and the terminal, and the transceiver 1413 can be connected to the antenna 1415.
  • the transceiver 1413 includes a transmitter Tx and a receiver Rx.
  • one or more antennas 1415 can receive radio frequency signals
  • the receiver Rx of the transceiver 1413 is used to receive the radio frequency signal from the antenna, and convert the radio frequency signal into a digital baseband signal or a digital intermediate frequency signal, and provide the digital baseband signal or the digital intermediate frequency signal to the processor 1411, so that the processor 1411 further processes the digital baseband signal or the digital intermediate frequency signal, such as demodulation and decoding.
  • the transmitter Tx in the transceiver 1413 is also used to receive a modulated digital baseband signal or a digital intermediate frequency signal from the processor 1411, and convert the modulated digital baseband signal or the digital intermediate frequency signal into a radio frequency signal, and send the radio frequency signal through one or more antennas 1415.
  • the receiver Rx can selectively perform one or more stages of down-mixing and analog-to-digital conversion processing on the RF signal to obtain a digital baseband signal or a digital intermediate frequency signal, and the order of the down-mixing and analog-to-digital conversion processing is adjustable.
  • the transmitter Tx can selectively perform one or more stages of up-mixing and digital-to-analog conversion processing on the modulated digital baseband signal or digital intermediate frequency signal to obtain a RF signal, and the order of the up-mixing and digital-to-analog conversion processing is adjustable.
  • the digital baseband signal and the digital intermediate frequency signal can be collectively referred to as a digital signal.
  • the transceiver 1413 may also be referred to as a transceiver unit, a transceiver device, etc.
  • a device in the transceiver unit for implementing a receiving function may be regarded as a receiving unit
  • a device in the transceiver unit for implementing a sending function may be regarded as a sending unit, that is, the transceiver unit includes a receiving unit and a sending unit
  • the receiving unit may also be referred to as a receiver, an input port, a receiving circuit, etc.
  • the sending unit may be referred to as a transmitter, a transmitting circuit, etc.
  • the device 1400 shown in Figure 14 can be used to implement the steps implemented by the base station in the aforementioned method embodiment.
  • the specific implementation methods of the device 1400 shown in Figure 14 to perform the aforementioned steps can all be referred to the description in the aforementioned method embodiment, and will not be repeated here one by one.
  • a computer-readable storage medium is also provided in an embodiment of the present application, in which a program for performing signal processing is stored.
  • when the program runs on a computer, the computer executes the steps executed by the first device in the method described in the embodiments shown in Figures 3 to 8 above, or the computer executes the steps executed by the second device in the method described in the embodiments shown in Figures 3 to 8 above, or the computer executes the steps executed by the training device in the method described in the embodiment shown in Figure 9 above.
  • Also provided in an embodiment of the present application is a computer program product, which, when executed on a computer, enables the computer to execute the steps executed by the first device in the method described in the embodiments shown in the aforementioned Figures 3 to 8, or enables the computer to execute the steps executed by the second device in the method described in the embodiments shown in the aforementioned Figures 3 to 8, or enables the computer to execute the steps executed by the training device in the method described in the embodiment shown in the aforementioned Figure 9.
  • the first device, the second device, the training device, the data processing device or the model training device provided in the embodiment of the present application may be a chip, and the chip includes: a processing unit and a communication unit, the processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, a pin or a circuit.
  • the processing unit may execute the computer-executable instructions stored in the storage unit so that the chip executes the data processing method described in the embodiments shown in Figures 3 to 8 above, or so that the chip executes the model training method described in the embodiment shown in Figure 9.
  • the storage unit is a storage unit in the chip, such as a register, a cache, etc.
  • the storage unit can also be a storage unit located outside the chip in the wireless access device, such as a read-only memory (ROM) or other types of static storage devices that can store static information and instructions, a random access memory (RAM), etc.
  • FIG. 15 is a schematic diagram of a structure of a chip provided in an embodiment of the present application.
  • the chip can be expressed as a neural network processor NPU 150.
  • NPU 150 is mounted on the host CPU (Host CPU) as a coprocessor, and tasks are assigned by the Host CPU.
  • the core part of the NPU is the operation circuit 1503; the controller 1504 controls the operation circuit 1503 to extract matrix data from the memory and perform multiplication operations.
  • the operation circuit 1503 includes multiple processing units (Process Engine, PE) inside.
  • the operation circuit 1503 is a two-dimensional systolic array.
  • the operation circuit 1503 can also be a one-dimensional systolic array or other electronic circuits capable of performing mathematical operations such as multiplication and addition.
  • the operation circuit 1503 is a general-purpose matrix processor.
  • the operation circuit takes the corresponding data of matrix B from the weight memory 1502 and caches it on each PE in the operation circuit.
  • the operation circuit takes the matrix A data from the input memory 1501 and performs matrix operation with matrix B, and the partial result or final result of the matrix is stored in the accumulator 1508.
  • Unified memory 1506 is used to store input data and output data. Weight data is directly transferred to weight memory 1502 through Direct Memory Access Controller (DMAC) 1505. Input data is also transferred to unified memory 1506 through DMAC.
  • BIU stands for Bus Interface Unit, that is, the bus interface unit 1510, which is used for the interaction between AXI bus and DMAC and instruction fetch buffer (IFB) 1509.
  • the bus interface unit 1510 (Bus Interface Unit, BIU for short) is used by the instruction fetch buffer 1509 to obtain instructions from the external memory, and is also used by the direct memory access controller 1505 to obtain the original data of the input matrix A or the weight matrix B from the external memory.
  • DMAC is mainly used to transfer input data in the external memory DDR to the unified memory 1506 or to transfer weight data to the weight memory 1502 or to transfer input data to the input memory 1501.
  • the vector calculation unit 1507 includes multiple operation processing units, which further process the output of the operation circuit when necessary, such as vector multiplication, vector addition, exponential operation, logarithmic operation, size comparison, etc. It is mainly used for non-convolutional/fully connected layer network calculations in neural networks, such as Batch Normalization, pixel-level summation, upsampling of feature planes, etc.
  • the vector calculation unit 1507 can store the processed output vector to the unified memory 1506.
  • the vector calculation unit 1507 can apply a linear function and/or a nonlinear function to the output of the operation circuit 1503, for example, linear interpolation of the feature plane extracted by the convolution layer, or, as another example, a nonlinear function applied to a vector of accumulated values to generate an activation value.
  • the vector calculation unit 1507 generates a normalized value, a pixel-level summed value, or both.
  • the processed output vector can be used as an activation input to the operation circuit 1503, for example, for use in a subsequent layer in a neural network.
  • the controller 1504 is connected to an instruction fetch buffer 1509, which stores the instructions used by the controller 1504.
  • Unified memory 1506, input memory 1501, weight memory 1502 and instruction fetch buffer 1509 are all on-chip memories. External memories are private to the NPU hardware architecture.
  • the operations of each layer in the first machine learning model can be performed by the operation circuit 1503 or the vector calculation unit 1507.
  • the processor mentioned in any of the above places may be a general-purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits for controlling the execution of the program of the above-mentioned first aspect method.
  • the device embodiments described above are merely schematic, wherein the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the scheme of this embodiment.
  • the connection relationship between the modules indicates that there is a communication connection between them, which may be specifically implemented as one or more communication buses or signal lines.
  • the technical solution of the present application, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product, which is stored in a readable storage medium, such as a computer floppy disk, a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk or an optical disk, and includes a number of instructions to enable a computer device (which can be a personal computer, a training device, a network device, etc.) to execute the methods described in each embodiment of the present application.
  • all or part of the embodiments may be implemented by software, hardware, firmware or any combination thereof.
  • all or part of the embodiments may be implemented in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • the computer instructions may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center in a wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) manner.
  • the computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a training device or a data center, that integrates one or more available media.
  • the available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid-state drive (SSD)), etc.
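The matrix operation described above for the operation circuit 1503 (matrix B taken from the weight memory 1502, matrix A taken from the input memory 1501, partial or final results held in the accumulator 1508) can be illustrated with a minimal software sketch. It only mirrors the accumulate-partial-products idea; it does not model the systolic array or the actual hardware data flow, and the function name `matmul_with_accumulator` is hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul_with_accumulator(a: Matrix, b: Matrix) -> Matrix:
    """Multiply a (rows x inner) by b (inner x cols), accumulating partial results."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    # The accumulator starts at zero and holds partial sums between steps.
    acc = [[0.0] * cols for _ in range(rows)]
    for k in range(inner):
        # Each step adds one rank-1 partial product into the accumulator.
        for i in range(rows):
            for j in range(cols):
                acc[i][j] += a[i][k] * b[k][j]
    return acc
```

After the last step the accumulator holds the final matrix product; before that it holds the partial results.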

Abstract

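A data processing method, a model training method, and related devices, which can apply artificial intelligence technology to the communications field. The method includes: obtaining the value of T, where T represents the number of sub-data included in the output data of a first machine learning model; and inputting first data into the first machine learning model to obtain second data generated by the first machine learning model, where the second data includes T sub-data, the first machine learning model includes one or more modules, and one sub-data is obtained each time the modules in the first machine learning model are called at least once. The number of times the modules in the first machine learning model are called can be flexibly adjusted according to the value of T to generate T sub-data, so that the first machine learning model is compatible with multiple values of T, multiple machine learning models no longer need to be stored, and storage overhead is reduced.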

Description

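Data processing method, model training method, and related device
Technical Field
This application relates to the communications field, and in particular to a data processing method, a model training method, and related devices.
Background
Artificial intelligence (AI) is the theory, methods, technology, and application systems that use digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, AI studies the design principles and implementation methods of various intelligent machines, so that machines have the functions of perception, reasoning, and decision-making.
Applying AI technology to wireless communications is one application of AI, for example, using a machine learning model to perform modulation, encoding, or other tasks. For example, data to be processed may be input into a machine learning model to obtain processed data output by the model, where the processed data includes T sub-data and T is an integer greater than or equal to 1.
However, the value of T can be determined flexibly according to actual conditions, whereas in the related art the number of output channels of each machine learning model is fixed, so the model can output only a fixed number of sub-data. When T changes, another machine learning model must be used, so multiple machine learning models must be stored, which incurs a large storage overhead. A machine learning model that is compatible with multiple values of T is therefore urgently needed.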
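Summary
Embodiments of this application provide a data processing method, a model training method, and related devices. One sub-data is obtained each time the modules in the first machine learning model are called at least once, so the number of module calls can be flexibly adjusted according to the value of T to generate T sub-data. The first machine learning model is thus compatible with multiple values of T, multiple machine learning models no longer need to be stored, and storage overhead is reduced.
A first aspect of this application provides a data processing method that can apply artificial intelligence technology to the communications field. The method is applied on the first device side; the first device may be a piece of equipment or a component (for example, a chip or a chip system) that can be configured in the equipment. The method includes: the first device obtains the value of T, where T represents the number of sub-data included in the output data of the first machine learning model and T is an integer greater than or equal to 1; and the first device inputs first data into the first machine learning model to obtain second data generated by the first machine learning model, where the second data includes T sub-data. The first machine learning model includes one or more modules, and one sub-data is obtained each time the modules in the first machine learning model are called at least once. For example, each call may invoke one module or multiple modules of the first machine learning model. In this implementation, because one sub-data is obtained each time the modules are called at least once, after the value of T is obtained, the number of module calls can be flexibly adjusted according to that value to generate T sub-data. The first machine learning model is therefore compatible with multiple values of T, multiple machine learning models no longer need to be stored, and storage overhead is reduced.
In a possible implementation, the functions of the first machine learning model include any one or a combination of the following: encoding, modulation, and generating a reference signal. For example, if the function is encoding, the first data is the data to be encoded and the second data is the encoded data. If the function is modulation, the first data is the data to be modulated and the second data is the modulated data. If the function is generating a reference signal, the first data may be the index numbers of multiple reference signals and the second data may be the reference signals. If the function is encoding and modulation, the first data is the data to be encoded and modulated and the second data is the encoded and modulated data.
This implementation provides multiple functions for the first machine learning model, which broadens the application scenarios of the solution and improves its implementation flexibility.
In a possible implementation, the multiple modules in the first machine learning model include a first module and at least one second module. That the first device inputs the first data into the first machine learning model to obtain the second data output by the first machine learning model includes: the first device inputs the first data into the first module to obtain first sub-data generated by the first module, where the first sub-data is one of the T sub-data; and inputs first feature information of the first data into the second module to obtain second sub-data generated by the second module, where the second sub-data is one of the T sub-data. The first feature information includes the feature information generated the last time a module in the first machine learning model was called for data processing. For example, the first feature information may be the feature information generated the last time the first module was called, or the feature information generated the last time the second module was called.
In this implementation, the first machine learning model includes a first module and at least one second module. The input of the first module is the entire first data, and the input of the second module is the feature information obtained the last time a module of the first machine learning model was called. The first call to the second module therefore receives the feature information obtained when the first module processed the entire first data, so the feature information input to the second module on every call draws on the entire first data. In other words, each of the T sub-data is generated with reference to the entire first data, which helps obtain second data with better performance. In addition, each call to the second module yields one second sub-data of the T sub-data, which helps obtain the T sub-data of the second data quickly.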
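The call pattern described above (one call to the first module, then T-1 calls to a second module, each fed the feature information from the previous call) can be sketched as follows. This is a minimal illustration, not the implementation in the application: the function name `generate_sub_data`, the convention that a module returns a pair of sub-data and feature information, and the toy modules are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

Vector = List[float]
# Hypothetical convention: a module maps its input to (sub_data, feature_info).
Module = Callable[[Vector], Tuple[Vector, Vector]]


def generate_sub_data(first_data: Vector, t: int,
                      first_module: Module, second_module: Module) -> List[Vector]:
    """Return T sub-data by calling the first module once and the second module T-1 times."""
    if t < 1:
        raise ValueError("T must be an integer greater than or equal to 1")
    sub_data, features = first_module(first_data)  # sees the entire first data
    outputs = [sub_data]
    for _ in range(t - 1):
        # Each later call consumes the feature information from the previous call.
        sub_data, features = second_module(features)
        outputs.append(sub_data)
    return outputs


def toy_first_module(x: Vector) -> Tuple[Vector, Vector]:
    """Toy stand-in for a trained first module."""
    features = [0.5 * v for v in x]
    return [sum(features)], features


def toy_second_module(features: Vector) -> Tuple[Vector, Vector]:
    """Toy stand-in for a trained second module."""
    updated = [v + 1.0 for v in features]
    return [sum(updated)], updated
```

With this structure the same two modules serve any value of T: only the loop count changes, so no separate model is stored per value of T.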
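In a possible implementation, the multiple modules in the first machine learning model include a first module and at least one third module. That the first device inputs the first data into the first machine learning model to obtain the second data output by the first machine learning model includes: the first device inputs the first data into the first module and generates first sub-data through the first module, where the first sub-data is one of the T sub-data and generating the first sub-data includes extracting features from the first data, so that the feature information of the first data is obtained in the process; and the first device calls the third module multiple times to obtain third sub-data generated by the third module, where the third sub-data is one of the T sub-data, the input of the third module includes the feature information of the first data, and the feature information of the first data is updated multiple times over the course of the calls. For example, on the first call to the third module, the feature information input to it is the feature information obtained while the first module generated the first sub-data; on the second and subsequent calls, the feature information input to it is the feature information obtained during the previous call to the third module.
In this implementation, the third module is called multiple times to update the feature information of the first data repeatedly, and only then is one third sub-data generated from the final updated feature information. The repeated updates help the model understand the first data more thoroughly and thus generate sub-data with better performance.
In a possible implementation, that the first device inputs the first feature information into the second module to obtain the second sub-data generated by the second module includes: the first device applies, through the second module, a linear transformation to the first feature information and processes the result with a first activation function to obtain transformed feature information; and applies a linear transformation to the transformed feature information and processes the result with a second activation function to obtain the second sub-data.
In this implementation, the approach is simple and easy to implement, which helps reduce the computing resources consumed in generating the second data. In addition, the second module described above uses few parameters, which helps reduce the communication resources consumed in transmitting the parameters of the first machine learning model.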
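The two-stage structure of the second module described above (linear transformation, first activation function, linear transformation, second activation function) can be sketched in plain Python. The choice of ReLU and tanh as the two activation functions, the use of the transformed feature information as the feature information passed to the next call, and all names are assumptions for illustration; the application does not fix them in this passage.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def linear(weights: Matrix, bias: Vector, x: Vector) -> Vector:
    """Compute W x + b for a dense layer."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def tanh(x: Vector) -> Vector:
    return [math.tanh(v) for v in x]


def second_module(features: Vector, w1: Matrix, b1: Vector,
                  w2: Matrix, b2: Vector) -> Tuple[Vector, Vector]:
    """Return (second sub-data, transformed feature information)."""
    # Stage 1: linear transformation followed by the first activation (ReLU assumed).
    transformed = relu(linear(w1, b1, features))
    # Stage 2: linear transformation followed by the second activation (tanh assumed).
    sub_data = tanh(linear(w2, b2, transformed))
    return sub_data, transformed
```

The module holds only two weight matrices and two bias vectors, which is consistent with the small parameter count the passage points to.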
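In a possible implementation, the at least one second module includes multiple second modules, and at least two of the multiple second modules use different parameters. That is, each call to a second module generates one second sub-data, but different second modules may be called in generating the T-1 second sub-data. For example, "two second modules that use different parameters" may mean either of the following: the two second modules use the same types of parameters but the parameter values are not all the same; or the types of parameters used by the two second modules are not all the same.
In this implementation, the first machine learning model may use multiple second modules, at least two of which use different parameters. That is, the T-1 second sub-data are generated by different second modules, which helps improve the match between the parameters of each second module and the second sub-data it generates, and thus helps obtain second data with better performance.
In a possible implementation, that the first device inputs the first data into the first module to obtain the first sub-data generated by the first module may include: the first device obtains the first sub-data generated by the first module by calling the first module one or more times. Optionally, each time the first device calls the first module to process input data, the first device applies, through the first module, a linear transformation to the input data and processes the result with a third activation function to obtain transformed input data; and applies a linear transformation to the transformed input data and processes the result with a fourth activation function to obtain the processing result of the first module. The input data of the first module may be the first data or the feature information of the first data.
In a possible implementation, before the first device inputs the first data into the first machine learning model, the method further includes: the first device obtains the data to be processed and the value of H, where H is an integer greater than or equal to 1 and indicates the length of the first data; and if the length of the data to be processed is less than H, pads the data to be processed to obtain the first data, whose length is H. In this implementation, when the data to be processed is shorter than H, it is padded to obtain first data of length H, which is then input into the first machine learning model. Whatever the length of the data to be processed, the first machine learning model therefore always processes first data of length H. This helps accommodate data to be processed of any length and reduces the difficulty of data processing for the first machine learning model, so that second data with better performance is obtained.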
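The padding step described above can be sketched in a few lines. The function name `pad_to_length`, the zero padding value, and the decision to reject inputs longer than H are assumptions for this example; the passage only specifies the case where the data to be processed is shorter than H.

```python
from typing import List


def pad_to_length(data: List[int], h: int, pad_value: int = 0) -> List[int]:
    """Pad the data to be processed so that the first data has length H."""
    if h < 1:
        raise ValueError("H must be an integer greater than or equal to 1")
    if len(data) > h:
        # Not covered by the passage; rejecting is an assumption of this sketch.
        raise ValueError("data to be processed is longer than H")
    return list(data) + [pad_value] * (h - len(data))
```

For example, `pad_to_length([1, 0, 1], 5)` yields a list of length 5 whose first three entries are the original bits.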
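In a possible implementation, the first data includes the data to be processed and padding data. The padding data includes first identification information, which identifies the value of T and/or the value of K, where K is the length of the data to be processed and is an integer greater than or equal to 1. That is, the first identification information may identify both T and K, only T, or only K. For example, after obtaining the value of T and/or the value of K, the first device may process it with a first function to obtain the first identification information. The first function must satisfy the following conditions: it limits the value of the first identification information to a preset range, and it maps different values of T and/or K to different values. In other words, a value generated by the first function uniquely identifies a particular value of T and/or K, or equivalently, distinguishes between different values of T and/or K.
In this implementation, the first data carries first identification information identifying the value of T and/or the value of K, so the first machine learning model can process the first data according to the value of T and/or the value of K, that is, according to the length of its output data and/or the true length of the data to be processed, which helps obtain second data with better performance.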
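One function that meets the two stated conditions (bounded output, distinct outputs for distinct pairs of T and K) is a normalized mixed-radix index. The sketch below is only one possible choice, not the first function used in the application; the name `first_identifier` and the upper bounds `t_max` and `k_max` are assumptions introduced for the example.

```python
def first_identifier(t: int, k: int, t_max: int, k_max: int) -> float:
    """Map (T, K) to a value in [0, 1) that is distinct for every valid pair."""
    if not (1 <= t <= t_max and 1 <= k <= k_max):
        raise ValueError("T or K is outside the supported range")
    # Each (T, K) pair gets its own integer index in [0, t_max * k_max).
    index = (t - 1) * k_max + (k - 1)
    # Normalizing keeps the identifier inside the preset range [0, 1).
    return index / (t_max * k_max)
```

Because the integer index is unique per pair and all indices are divided by the same constant, different values of T and/or K map to different identifiers.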
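In a possible implementation, the size of the parameters in the first machine learning model is related to the value of H and the value of G, where G is the length of each sub-data. In this implementation, the size of the parameters is designed according to the length of the first data and the length of each of the T sub-data, which helps reduce the number of parameters in the first machine learning model while still meeting the output requirements, and further reduces the communication resources consumed in transmitting those parameters.
In a possible implementation, the parameters corresponding to the first machine learning model and/or the identification information of those parameters may be carried in signaling, so that they can be transmitted between devices. Optionally, the parameters corresponding to the first machine learning model are carried in one or more of the following: downlink control information (DCI), uplink control information (UCI), sidelink control information (SCI), radio resource control (RRC) signaling, or a medium access control control element (MAC CE). And/or, the identification information of the parameters is carried in any one or more of the following: DCI, UCI, SCI, RRC signaling, a MAC CE, a physical broadcast channel (PBCH), or a physical random access channel (PRACH).
In this implementation, compared with carrying at least one group of parameters of the first machine learning model and/or the identification information of each group in data packets, carrying them in signaling is more efficient and consumes fewer computing resources. In addition, the solution provides multiple types of signaling that can carry the at least one group of parameters and/or the identification information of each group, which improves implementation flexibility.
In a possible implementation, a second device is the receiver of the second data, and the second device holds multiple groups of parameters corresponding to the first machine learning model together with the identification information of each group. The method further includes: the first device sends second identification information to the second device, where the second identification information indicates the group of parameters used by the first machine learning model in the first device. In this implementation, because the second device holds multiple groups of parameters of the first machine learning model and the identification information of each group, the first device only needs to send the second identification information for the second device to learn which group of parameters the first machine learning model in the first device uses. Transmitting the second identification information occupies few communication resources, which helps reduce the communication resources consumed.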
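The exchange described above amounts to a table lookup on the receiving side: the second device stores several parameter groups keyed by identifier and resolves the second identification information it receives. The sketch below assumes integer identifiers and a dictionary store; the name `select_parameters` and the parameter layout are hypothetical.

```python
from typing import Dict, List

ParameterGroup = Dict[str, List[float]]


def select_parameters(stored_groups: Dict[int, ParameterGroup],
                      second_identifier: int) -> ParameterGroup:
    """Return the parameter group that the first device reports it is using."""
    if second_identifier not in stored_groups:
        raise ValueError(f"unknown parameter group identifier: {second_identifier}")
    return stored_groups[second_identifier]
```

Only the short identifier crosses the link; the parameter values themselves stay in the table already held by the second device.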
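According to a second aspect, this application provides a data processing method that can apply AI technology to the communications field. The method is applied on the second device side; the second device may be a piece of equipment or a component (for example, a chip or a chip system) that can be configured in the equipment. The method includes: the second device obtains second data and generates first data according to the second data. The second data includes T sub-data, where T is an integer greater than or equal to 1; the second data is generated by the first machine learning model in the first device; the first machine learning model includes one or more modules; and one sub-data is obtained each time the modules in the first machine learning model are called at least once. For example, after obtaining a received signal, the second device may denoise it and obtain the received second data (that is, the estimated second data) from the denoised received signal. For example, if the second data is encoded data, that is, the function of the first machine learning model is encoding, the received second data is obtained by demodulating the denoised received signal. If the second data is modulated data, that is, the function of the first machine learning model is modulation, or encoding and modulation, the denoised received signal may be taken directly as the second data.
In a possible implementation, the parameters corresponding to the first machine learning model are carried in one or more of the following: downlink control information (DCI), uplink control information (UCI), sidelink control information (SCI), radio resource control (RRC) signaling, or a medium access control control element (MAC CE); and/or, the identification information of the parameters is carried in any one or more of the following: DCI, UCI, SCI, RRC signaling, a MAC CE, a physical broadcast channel (PBCH), or a physical random access channel (PRACH).
In a possible implementation, the second device holds third data, which includes multiple groups of parameters corresponding to the first machine learning model and the identification information of each group. The method further includes: the second device receives the second identification information sent by the first device, and determines, according to the second identification information and the third data, the group of parameters used by the first machine learning model in the first device. That the second device generates the first data according to the second data includes: the second device may demodulate and/or decode the received second data according to the group of parameters used by the first machine learning model in the first device, to generate estimated first data.
For the specific implementation of the steps, the meanings of the terms, and the beneficial effects of each possible implementation of the second aspect, refer to the first aspect; details are not repeated here.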
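According to a third aspect, this application provides a model training method that can apply AI technology to the communications field. The method is applied to a training device, which may be a piece of equipment or a component (for example, a chip or a chip system) that can be configured in the equipment. The method includes: the training device obtains training data from a training data set, where the training data is used to obtain first data and the value of T, and T is an integer greater than or equal to 1. For example, the training data may include data to be processed and the value of T. The data to be processed is used to obtain the first data: for example, the data to be processed is the same as the first data, or the first data is obtained by padding the data to be processed. T indicates the number of sub-data included in the output data of the first machine learning model, and at least two training data in the training data set include different values of T. The training device inputs the first data into the first machine learning model to obtain second data generated by the first machine learning model, where the second data includes T sub-data, the first machine learning model includes multiple modules, and one sub-data generated by a module is obtained each time the modules in the first machine learning model are called at least once. The training device then trains the first machine learning model based on the second data and a loss function to obtain the trained first machine learning model.
In a possible implementation, when the functions of the first machine learning model include encoding and/or modulation, the second data is used to determine a signal to be sent. That the training device trains the first machine learning model based on the second data and the loss function includes: the training device obtains a received signal corresponding to the signal to be sent, and demodulates and/or decodes that received signal to obtain estimated data corresponding to the data to be processed; and the training device trains the first machine learning model according to the estimated data and the loss function, where the loss function indicates the similarity between the estimated data and the data to be processed.
For example, obtaining the received signal corresponding to the signal to be sent may include: the training device multiplies the signal to be sent by a channel matrix and adds noise to the product to obtain the received signal; this step simulates transmission of the signal to be sent over a channel. Alternatively, the step is completed by two training devices in cooperation, in which case obtaining the received signal may include: a first training device sends the signal to be sent to a second training device, and the second training device receives the received signal.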
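The simulated transmission described above (multiply by a channel matrix, then add noise) and a loss that measures how close the estimated data is to the data to be processed can be sketched as follows. Real-valued signals, Gaussian noise, and mean squared error as the similarity measure are assumptions of this example; the names `simulate_channel` and `mean_squared_error` are hypothetical.

```python
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def simulate_channel(signal: Vector, channel: Matrix,
                     noise_std: float, rng: random.Random) -> Vector:
    """Return channel @ signal + noise, a stand-in for over-the-air transmission."""
    return [
        sum(h * s for h, s in zip(row, signal)) + rng.gauss(0.0, noise_std)
        for row in channel
    ]


def mean_squared_error(estimated: Vector, target: Vector) -> float:
    """Smaller values mean the estimated data is closer to the data to be processed."""
    return sum((e - t) ** 2 for e, t in zip(estimated, target)) / len(target)
```

In a training loop, the output of `simulate_channel` would be demodulated and/or decoded, and the resulting estimate compared with the original data through the loss.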
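This implementation provides a specific way to train the first machine learning model when its functions include encoding and/or modulation, which reduces the difficulty of implementing the solution. In addition, the loss function uses the similarity between the estimated data and the data to be processed; that is, its goal is to obtain better estimated data. This loss function better matches the actual needs of sending data between devices, so the second data output by the trained first machine learning model better meets actual needs.
In a possible implementation, when the second data is a reference signal, that the training device trains the first machine learning model based on the second data and the loss function includes: the training device obtains a received reference signal corresponding to the reference signal, and generates predicted channel information according to the received reference signal; and the training device trains the first machine learning model according to the loss function, where the loss function indicates the similarity between the predicted channel information and the correct channel information.
For example, obtaining the received reference signal corresponding to the reference signal may include: the training device multiplies the reference signal by a channel matrix and adds noise to the product to obtain the received reference signal; this step simulates transmission of the reference signal over a channel. Alternatively, the step is completed by two training devices in cooperation, in which case obtaining the received reference signal may include: a first training device sends the reference signal to a second training device, and the second training device receives the received reference signal.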
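The passage does not say how the predicted channel information is derived from the received reference signal. As one illustration only, the sketch below assumes a per-element (diagonal) channel and uses a least-squares estimate, with mean squared error against the correct channel as the loss. Both choices, and the names `ls_channel_estimate` and `channel_loss`, are assumptions rather than the method of the application.

```python
from typing import List

ComplexVector = List[complex]


def ls_channel_estimate(received: ComplexVector,
                        reference: ComplexVector) -> ComplexVector:
    """Per-element least-squares estimate h_hat = y / x (assumes nonzero reference symbols)."""
    return [y / x for y, x in zip(received, reference)]


def channel_loss(predicted: ComplexVector, correct: ComplexVector) -> float:
    """Mean squared error between predicted and correct channel information."""
    return sum(abs(p - c) ** 2 for p, c in zip(predicted, correct)) / len(correct)
```

A reference signal that makes this loss small after noisy transmission is, under these assumptions, a better reference signal, which is the quantity the training would drive down.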
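This implementation also provides a specific way to train the first machine learning model when its function is generating a reference signal, which broadens the application scenarios of the solution and improves its implementation flexibility.
In the third aspect of this application, the training device may also perform the steps performed by the first device in the first aspect and its possible implementations. For the specific implementation of the steps, the meanings of the terms, and the beneficial effects of each possible implementation of the third aspect, refer to the first aspect; details are not repeated here.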
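According to a fourth aspect, this application provides a data processing device that can apply AI technology to the communications field. The data processing device includes a processing module, where the processing module is configured to obtain the value of T, T is an integer greater than or equal to 1, and T represents the number of sub-data included in the output data of the first machine learning model;
the processing module is further configured to input first data into the first machine learning model to obtain second data generated by the first machine learning model, where the second data includes T sub-data, the first machine learning model includes one or more modules, and one sub-data is obtained each time the modules in the first machine learning model are called at least once.
In a possible implementation, the functions of the first machine learning model include any one or a combination of the following: encoding, modulation, or generating a reference signal.
In a possible implementation, the multiple modules in the first machine learning model include a first module and at least one second module, and the processing module is specifically configured to: input the first data into the first module to obtain first sub-data generated by the first module, where the first sub-data is one of the T sub-data; and input first feature information into the second module to obtain second sub-data generated by the second module, where the first feature information includes the feature information generated the last time a module in the first machine learning model was called for data processing, the second sub-data is one of the T sub-data, and the module called last time is the first module or the second module.
In a possible implementation, the multiple modules in the first machine learning model include a first module and at least one third module, and the processing module is specifically configured to: input the first data into the first module and generate first sub-data through the first module, where the first sub-data is one of the T sub-data and generating the first sub-data includes extracting features from the first data; and call the third module multiple times to obtain third sub-data generated by the third module, where the third sub-data is one of the T sub-data, the input of the third module includes the feature information of the first data, and the feature information of the first data is updated multiple times over the course of the calls.
In a possible implementation, the processing module is specifically configured to: apply, through the second module, a linear transformation to the first feature information and process the result with a first activation function to obtain transformed feature information; and apply a linear transformation to the transformed feature information and process the result with a second activation function to obtain the second sub-data.
In a possible implementation, the at least one second module includes multiple second modules, and at least two of the multiple second modules use different parameters.
In a possible implementation, the processing module is further configured to obtain the data to be processed and the value of H, where H is an integer greater than or equal to 1 and indicates the length of the first data; and the processing module is further configured to, if the length of the data to be processed is less than H, pad the data to be processed to obtain the first data, whose length is H.
In a possible implementation, the first data includes the data to be processed and padding data, the padding data includes first identification information, and the first identification information identifies the value of T and/or the value of K, where K is the length of the data to be processed and is an integer greater than or equal to 1.
In a possible implementation, the size of the parameters in the first machine learning model is related to the value of H and the value of G, where G is the length of each sub-data.
In a possible implementation, the parameters corresponding to the first machine learning model are carried in one or more of the following: downlink control information (DCI), uplink control information (UCI), sidelink control information (SCI), radio resource control (RRC) signaling, or a medium access control control element (MAC CE); and/or, the identification information of the parameters is carried in any one or more of the following: DCI, UCI, SCI, RRC signaling, a MAC CE, a physical broadcast channel (PBCH), or a physical random access channel (PRACH).
In a possible implementation, the data processing device is applied to the first device, a second device is the receiver of the second data, and the second device holds multiple groups of parameters corresponding to the first machine learning model and the identification information of each group. The data processing device further includes a transceiver module, configured to send second identification information to the second device, where the second identification information indicates the group of parameters used by the first machine learning model in the first device.
For the specific implementation of the steps, the meanings of the terms, and the beneficial effects of each possible implementation of the fourth aspect, refer to the first aspect; details are not repeated here.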
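According to a fifth aspect, this application provides a data processing device that can apply AI technology to the communications field. The data processing device includes a processing module, where the processing module is configured to obtain second data and generate first data according to the second data. The second data includes T sub-data, where T is an integer greater than or equal to 1; the second data is generated by the first machine learning model in the first device; the first machine learning model includes one or more modules; and one sub-data is obtained each time the modules in the first machine learning model are called at least once.
In a possible implementation, the parameters corresponding to the first machine learning model are carried in one or more of the following: downlink control information (DCI), uplink control information (UCI), sidelink control information (SCI), radio resource control (RRC) signaling, or a medium access control control element (MAC CE); and/or, the identification information of the parameters is carried in any one or more of the following: DCI, UCI, SCI, RRC signaling, a MAC CE, a physical broadcast channel (PBCH), or a physical random access channel (PRACH).
In a possible implementation, the data processing device is applied to the second device, and the second device holds third data, which includes multiple groups of parameters corresponding to the first machine learning model and the identification information of each group. The data processing device further includes a transceiver module, configured to receive the second identification information sent by the first device. The processing module is further configured to determine, according to the second identification information and the third data, the group of parameters used by the first machine learning model in the first device. The processing module is specifically configured to generate the first data according to the second data and the group of parameters used by the first machine learning model in the first device.
For the specific implementation of the steps, the meanings of the terms, and the beneficial effects of each possible implementation of the fifth aspect, refer to the second aspect; details are not repeated here.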
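According to a sixth aspect, this application provides a model training device that can apply AI technology to the communications field. The model training device includes a processing module, where the processing module is configured to obtain training data from a training data set, the training data is used to obtain first data and the value of T, T is an integer greater than or equal to 1, and at least two training data in the training data set include different values of T. The processing module is further configured to input the first data into the first machine learning model to obtain second data generated by the first machine learning model, where the second data includes T sub-data, the first machine learning model includes multiple modules, and one sub-data generated by a module is obtained each time the modules in the first machine learning model are called at least once. The processing module is further configured to train the first machine learning model based on the second data and a loss function to obtain the trained first machine learning model.
In a possible implementation, the functions of the first machine learning model include any one or a combination of the following: encoding, modulation, or generating a reference signal.
In a possible implementation, the multiple modules in the first machine learning model include a first module and at least one second module, and the processing module is specifically configured to: input the first data into the first module to obtain first sub-data generated by the first module, where the first sub-data is one of the T sub-data; and input first feature information into the second module to obtain second sub-data generated by the second module, where the first feature information includes the feature information generated the last time a module in the first machine learning model was called for data processing, the second sub-data is one of the T sub-data, and the module called last time is the first module or the second module.
In a possible implementation, the processing module is further configured to obtain the data to be processed from the training data; the processing module is further configured to obtain the value of H, where H is an integer greater than or equal to 1 and indicates the length of the first data; and the processing module is further configured to, if the length of the data to be processed is less than H, pad the data to be processed to obtain the first data, whose length is H.
In a possible implementation, when the functions of the first machine learning model include encoding and/or modulation, the second data is used to determine a signal to be sent, and the processing module is specifically configured to: demodulate and/or decode the received signal corresponding to the signal to be sent to obtain estimated data corresponding to the data to be processed; and train the first machine learning model according to the estimated data and the loss function, where the loss function indicates the similarity between the estimated data and the data to be processed.
In a possible implementation, when the second data is a reference signal, the processing module is specifically configured to: generate predicted channel information according to the received reference signal corresponding to the reference signal; and train the first machine learning model according to the loss function, where the loss function indicates the similarity between the predicted channel information and the correct channel information.
For the specific implementation of the steps, the meanings of the terms, and the beneficial effects of each possible implementation of the sixth aspect, refer to the first aspect; details are not repeated here.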
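According to a seventh aspect, this application provides a communication system that can apply AI technology to the communications field. The communication system may include the data processing device of the fourth aspect and the data processing device of the fifth aspect.
In a possible implementation, the communication system further includes the model training device of the sixth aspect.
According to an eighth aspect, this application provides a data processing method that can apply AI technology to the communications field. The method includes: a third device obtains first signaling, where the first signaling carries at least one group of parameters used by the first machine learning model and indication information corresponding to each group, and the indication information indicates the positions, in the first machine learning model, of the multiple parameters included in each group; and the third device sends the first signaling to the first device. The third device and the second device may be the same device or different devices, which is not limited in this application.
In this implementation, when at least one group of parameters of the first machine learning model is transmitted through signaling, the signaling carries not only the parameters but also indication information corresponding to each group, which indicates the positions in the first machine learning model of the parameters in that group. After receiving the signaling, the first device therefore knows how to use the parameters it carries. Transmitting the parameters through signaling also helps reduce the communication resources consumed and improves the efficiency of parameter transmission.
In a possible implementation, the first signaling is any one of the following: downlink control information (DCI), uplink control information (UCI), sidelink control information (SCI), radio resource control (RRC) signaling, or a medium access control control element (MAC CE).
According to a ninth aspect, this application provides a data processing method that can apply AI technology to the communications field. The method includes: the first device receives first signaling, where the first signaling carries at least one group of parameters used by the first machine learning model and indication information corresponding to each group, and the indication information indicates the positions, in the first machine learning model, of the multiple parameters included in each group.
In a possible implementation, the first signaling is any one of the following: downlink control information (DCI), uplink control information (UCI), sidelink control information (SCI), radio resource control (RRC) signaling, or a medium access control control element (MAC CE).
According to a tenth aspect, an embodiment of this application provides a device including at least one processor coupled to a memory, where the memory is configured to store a program or instructions, and the at least one processor is configured to execute the program or instructions so that the device performs the method of any of the above aspects.
According to an eleventh aspect, an embodiment of this application provides a computer-readable storage medium storing a computer program that, when run on a computer, causes the computer to perform the method of any of the above aspects.
According to a twelfth aspect, an embodiment of this application provides a computer program product including a program that, when run on a computer, causes the computer to perform the method of any of the above aspects.
According to a thirteenth aspect, this application provides a chip system including a processor, configured to support a communication device in implementing the functions involved in the above aspects, for example, sending or processing the data and/or information involved in the above methods. In a possible design, the chip system further includes a memory, configured to store the program instructions and data necessary for the communication device. The chip system may consist of a chip, or may include a chip and other discrete components.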
附图说明
图1为本申请实施例提供的无线通信系统的一种架构示意图;
图2为本申请实施例提供的无线通信系统的另一种架构示意图;
图3为本申请实施例提供的数据处理方法的一种流程示意图;
图4为本申请实施例提供的数据处理方法的另一种示意图;
图5为本申请实施例提供的第一装置和第二装置确定第一机器学习模型采用的一组参数的一种流程示意图;
图6为本申请实施例提供的第一装置获取第一机器学习模型采用的一组参数的一种流程示意图;
图7为本申请实施例提供的第一数据的一种示意图;
图8为本申请实施例提供的利用第一机器学习模型生成T个子数据的一种示意图;
图9为本申请实施例提供的模型的训练方法的一种示意图;
图10为本申请实施例提供的数据处理装置的一种结构示意图;
图11为本申请实施例提供的数据处理装置的另一种结构示意图;
图12为本申请实施例提供的模型的训练装置的一种示意图;
图13为本申请的实施例提供的装置的一种示意图;
图14为本申请的实施例提供的装置的另一种示意图;
图15为本申请实施例提供的芯片的一种结构示意图。
具体实施方式
下面结合附图,对本申请的实施例进行描述,显然,所描述的实施例仅仅是本申请一部分的实施例,而不是全部的实施例。本领域普通技术人员可知,随着技术的发展和新场景的出现,本申请实施例提供的技术方案对于类似的技术问题,同样适用。
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象(例如,区分同一实施例中的对象),而不必用于描述特定的顺序或先后次序,且在不同实施例中“第一”、“第二”等限定的对象(如“第一装置”和“第二装置”)可能指代不同的对象,应该理解这样使用的数据在适当情况下可以互换,以便这里描述的实施例能够以除了在这里图示或描述的内容以外的顺序实施。
此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或模块的过程、方法、系统、产品或设备不必限于清楚地列出的那些步骤或模块,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或模块。在本申请中出现的对步骤进行的命名或者编号,并不意味着必须按照命名或者编号所指示的时间/逻辑先后顺序执行方法流程中的步骤,已经命名或者编号的流程步骤可以根据要实现的技术目的变更执行次序,只要能达到相同或者相类似的技术效果即可。
本申请实施例中的“发送”和“接收”,表示信号传递的走向。例如,“向XX设备发送信息”可以理解为该信息的目的端是XX设备,可以包括通过空口直接发送,也包括其他单元或模块通过空口间接发送。“接收来自YY设备的信息”可以理解为该信息的源端是YY设 备,可以包括通过空口直接从YY设备接收,也可以包括通过空口从其他单元或模块间接地从YY设备接收。“发送”也可以理解为芯片接口的“输出”,“接收”也可以理解为芯片接口的“输入”。换言之,发送和接收可以是在设备之间进行的,也可以是在设备内进行的,例如,通过总线、走线或接口在设备内的部件之间、模组之间、芯片之间、软件模块或者硬件模块之间发送或接收。可以理解的是,信息在信息发送的源端和目的端之间可能会被进行必要的处理,比如编码、调制等,但目的端可以理解来自源端的有效信息。本申请中类似的表述可以做相似的理解,不再赘述。
在本申请实施例中,“指示”可以包括直接指示和间接指示,也可以包括显式指示和隐式指示。将某一信息(如下文所述的指示信息)所指示的信息称为待指示信息,则具体实现过程中,对待指示信息进行指示的方式有很多种,例如但不限于,可以直接指示待指示信息,如待指示信息本身或者该待指示信息的索引等。也可以通过指示其他信息来间接指示待指示信息,其中该其他信息与待指示信息之间存在关联关系;还可以仅仅指示待指示信息的一部分,而待指示信息的其他部分则是已知的或者提前约定的,例如可以借助预先约定(例如协议预定义)的各个信息的排列顺序来实现对特定信息的指示,从而在一定程度上降低指示开销。本申请对于指示的具体方式不作限定。可以理解的是,对于该指示信息的发送方来说,该指示信息可用于指示待指示信息,对于指示信息的接收方来说,该指示信息可用于确定待指示信息。
本申请可以将人工智能技术应用于通信领域中,可选地,可以将人工智能技术应用于信号发送这一应用场景中。示例性地,可以利用机器学习模型执行如下任一项或多项任务:编码、调制、生成参考信号或通信领域的其他任务等。
在对本申请提供的数据处理方法进行详细说明之前,先对本申请实施例提供的方法所应用的场景进行介绍。请先参阅图1,图1为本申请实施例提供的无线通信系统的一种架构示意图。本申请提供的方法可以应用于无线通信系统中,如图1所示,无线通信系统中包括网络设备101和移动台(mobile station,MS)102。其中,网络设备101与各个终端设备之间可以建立无线连接,各个终端设备之间也可以建立有无线连接。
网络设备101可以是指无线网络中提供无线接入服务的设备。示例性地,网络设备101可以为将移动台102接入到无线网络的设备,又可以称为基站;前述基站可以为各种形式的宏基站、微基站、中继站或接入点等等。在采用不同的无线接入技术的无线通信系统中,具备基站功能的网络设备101的名称可能会有所不同,例如,基站可以为称为演进型节点B(evolved Node B,eNB)、节点B(Node B,NB)、第五代(5th generation,5G)通信系统中的下一代基站(the next Generation Node B,gNB)、家庭基站(例如,home evolved Node B,或home Node B,HNB)、基带单元(base band unit,BBU)、无线保真(wireless fidelity,Wi-Fi)接入点(Access Point,AP)、传输接收点(transmission reception point,TRP)或无线网络控制器(radio network controller,RNC)等等。在另一种可能的场景中,由多个网络节点协作协助实现无线接入,不同网络节点分别实现基站的部分功能。例如,网络节点可以是集中式单元(central unit,CU),分布式单元(distributed unit,DU),CU-控制面(control plane,CP),CU-用户面(user plane,UP),或者无线单元(radio  unit,RU)等。CU和DU可以是单独设置,或者也可以包括在同一个网元中,例如基带单元(baseband unit,BBU)中。RU可以包括在射频设备或者射频单元中,例如包括在射频拉远单元(remote radio unit,RRU)、有源天线处理单元(active antenna unit,AAU)或远程射频头(remote radio head,RRH)中。在不同系统中,CU(或CU-CP和CU-UP)、DU或RU也可以有不同的名称,但是本领域的技术人员可以理解其含义。例如,在ORAN系统中,CU也可以称为开放式CU(O-CU),DU也可以称为开放式DU(O-DU),CU-CP也可以称为开放式CU-CP(O-CU-CP),CU-UP也可以称为开放式CU-UP(O-CU-UP),RU也可以称为开放式RU(O-RU)。其中,CU(或CU-CP、CU-UP)、DU和RU中的任一单元,可以是通过软件模块、硬件模块、或者软件模块与硬件模块结合来实现。本申请实施例对网络设备101的具体设备形态不做限定。
移动台102是指能够接收网络设备101发送的调度信息和指示信息的无线终端设备(terminal)。移动台102可以是具有无线通信功能的手持设备、车载设备、可穿戴设备、计算设备或连接到无线调制解调器的其他处理设备。
移动台102可以经无线接入网(wireless access network,RAN)与一个或多个核心网或者互联网进行通信。例如,移动台102可以是便携式、袖珍式、手持式、计算机内置的或者车载的移动装置,它们与无线接入网交换语音和/或数据。示例性地,移动台102可以是用户单元(user agent)、蜂窝电话(cellular phone)、智能手机(smart phone)、个人数字助理(personal digital assistant,PDA)、平板电脑(Tablet Personal Computer,Tablet PC)、无线调制解调器(modem)、手持设备(handset)、膝上型电脑(laptop computer)、个人通信业务(personal communication service,PCS)电话、远程站(remote station)、接入点(access point,AP)、远程终端设备(remote terminal)、接入终端设备(access terminal)、用户端设备(customer premises equipment,CPE)、终端(terminal)、用户设备(user equipment,UE)或移动终端(mobile terminal,MT)等等。
又例如,移动台102还可以是可穿戴设备,是应用穿戴式技术对日常穿戴进行智能化设计、开发出可以穿戴的设备的总称,如眼镜、手套、手表、服饰及鞋等。可穿戴设备即直接穿在身上,或是整合到用户的衣服或配件的一种便携式设备。可穿戴设备不仅仅是一种硬件设备,更是通过软件支持以及数据交互、云端交互来实现强大的功能。广义穿戴式智能设备包括功能全、尺寸大、可不依赖智能手机实现完整或者部分的功能,例如:智能手表或智能眼镜等,以及只专注于某一类应用功能,需要和其它设备如智能手机配合使用,如各类进行体征监测的智能手环、智能头盔、智能首饰等。
又例如,移动台102还可以是无人机、机器人、设备到设备通信(device-to-device,D2D)中的终端设备、车到一切(vehicle to everything,V2X)中的终端设备、虚拟现实(virtual reality,VR)设备、增强现实(augmented reality,AR)设备、工业控制(industrial control)中的无线终端、无人驾驶(self driving)中的终端设备、远程医疗(remote medical)中的终端设备、智能电网(smart grid)中的终端设备、智慧城市(smart city)中的无线终端、智慧家庭(smart home)中的终端设备等。
此外,移动台102也可以是5G通信系统之后的通信系统(例如第六代(6th generation, 6G)通信系统等)中的终端设备或者未来演进的公共陆地移动网络(public land mobile network,PLMN)中的终端设备等等,本申请实施例不限定移动台102的设备形态。
在一些应用场景中,网络设备101可以向各个终端设备发送下行数据,或者,各个终端设备也可以向网络设备101发送上行数据;网络设备101或各个终端设备在发送数据的过程中可能会利用机器学习模型,则可以采用本申请提供的数据处理方法。
在另一些应用场景中,各个终端设备之间还可以互相发送数据,在各个终端设备在发送数据的过程中可能会利用机器学习模型,则可以采用本申请提供的数据处理方法。
请继续参阅图2,图2为本申请实施例提供的无线通信系统的另一种架构示意图。如图2所示,在智能家居场景中,各种智能家居产品之间通过无线网络连接,以实现智能家居产品之间能够互相传输数据。在图2中,以智慧电视、智能空气净化器、智能饮水机、智能音箱以及扫地机器人等智能家居产品为例,这些智能家居产品均通过无线路由器连接至同一个无线网络中,从而实现各个智能家居产品之间的数据交互。除了上述示例的智能家居产品之外,在实际应用中还可以包括其他类型的智能家居产品,例如智能冰箱、智能抽油烟机、智能窗帘等智能家居产品,本实施例并不对智能家居产品的类型进行限定。
此外,不同的智能家居产品之间也可以是直接进行无线连接,而不需要通过无线路由器接入到同一个无线网络中。例如,各个智能家居产品之间通过蓝牙来实现无线连接。
除了上述图1和图2所介绍的场景以外,本申请实施例提供的方法还可以应用于其他的通信系统场景下。例如,在智能工厂场景中,不同的设备(例如智能机器人、车床、搬运车辆等设备)之间通过无线网络进行连接,并通过无线网络互相传递数据。本申请实施例并不对数据处理方法所应用的具体场景进行限定。
需要说明的是,本申请实施例所提及的无线通信系统包括但不限于:第五代移动通信技术(5th Generation Mobile Communication Technology,5G)通信系统、6G通信系统、卫星通信系统、短距通信系统、窄带物联网系统(Narrow Band-Internet of Things,NB-IoT)、全球移动通信系统(Global System for Mobile Communications,GSM)、增强型数据速率GSM演进系统(Enhanced Data rate for GSM Evolution,EDGE)、宽带码分多址系统(Wideband Code Division Multiple Access,WCDMA)、码分多址2000系统(Code Division Multiple Access,CDMA2000)、时分同步码分多址系统(Time Division-Synchronization Code Division Multiple Access,TD-SCDMA)以及长期演进系统(Long Term Evolution,LTE)等通信系统。本申请实施例并不对无线通信系统的具体架构进行限定。
在上述种种应用场景中,当某一装置需要发送数据时,均可以采用本申请提供的数据处理方法,具体的,请参阅图3,图3为本申请实施例提供的数据处理方法的一种流程示意图。如图3所示,301、第一装置获取T的取值,T为大于或等于1的整数,T代表第一机器学习模型的输出数据包括的子数据的个数。302、第一装置将第一数据输入第一机器学习模型,得到第一机器学习模型生成的第二数据,第二数据包括T个子数据;其中,第一机器学习模型包括一个或多个模块,每调用第一机器学习模型中的模块至少一次得到T个子数据中的一个子数据;示例性地,每次调用第一机器学习模型中的模块时可以调用第一 机器学习模型中的一个模块,也可以调用多个模块。
示例性地,第一装置可以为上述多个应用场景中任意一个需要发送数据的设备;例如,第一装置可以为图1中的网络设备101或移动台102;又例如,第一装置可以为图2中的智能家居或无线路由器;或者,第一装置也可以为其他需要发送数据的设备等,本申请实施例中不对第一装置的形态进行限定。
可选地,第一机器学习模型的功能包括如下任一项或多项的组合:编码、调制、生成参考信号或其他功能等。例如,若第一机器学习模型的功能是编码,则第一数据为需要编码的数据,第二数据为编码后的数据。又例如,若第一机器学习模型的功能是调制,则第一数据为需要调制的数据,第二数据为调制后的数据。又例如,若第一机器学习模型的功能是生成参考信号,则第一数据可以为多个参考信号的索引号,第二数据可以为参考信号。又例如,第一机器学习模型的功能是编码和调制,则第一数据为需要编码和调制的数据,第二数据为编码以及调制后的数据等。当第一机器学习模型为其他功能时,第一数据和第二数据可以表现为其他类型的数据等,本申请实施例中不做限定。
本申请实施例中,由于每调用第一机器学习模型中的模块至少一次就能得到一个子数据,则在获取到T的取值之后,可以根据T的取值灵活调整第一机器学习模型中模块的调用次数,以生成T个子数据,从而第一机器学习模型能够兼容T的多种取值,则不再需要存储多个机器学习模型,减少了存储空间的开销。
本申请实施例中,以下先对上述第一机器学习模型的推理阶段的详细实现过程进行介绍,再介绍第一机器学习模型的训练阶段的详细实现过程。其中,“第一机器学习模型的推理阶段”为利用第一机器学习模型进行数据处理的过程;“第一机器学习模型的训练阶段”为利用训练数据对第一机器学习模型进行迭代训练的过程。在对第一机器学习模型进行迭代训练的过程也是对第一机器学习模型采用的参数进行迭代更新的过程,则在利用训练数据对第一机器学习模型进行迭代训练之后能够得到与第一机器学习模型对应的一组或多组训练后的参数,训练阶段得到的前述参数将在推理阶段使用。
一、推理阶段
具体的,请参阅图4,图4为本申请实施例提供的数据处理方法的另一种示意图,如图4所示,该数据处理方法包括步骤401至411。
401、第一装置获取第一机器学习模型采用的一组参数。
本申请实施例中,第一装置在利用第一机器学习模型进行数据处理之前,需要先确定第一机器学习模型采用的一组训练后的参数;示例性地,前述一组训练后的参数包括第一机器学习模型中的所有模块需要的参数。
第一装置可以通过多种方式获取到第一机器学习模型采用的一组参数。在一种实现方式中,可以预先定义第一机器学习模型的多组训练后的参数,以及每组训练后的参数的标识信息,示例性地,“每组训练后的参数的标识信息”也可以称为每组训练后的参数的索引号。
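To understand this solution more intuitively, Table 1 below shows the correspondence between the multiple sets of trained parameters of the first machine learning model and the identification information of each set. Table 1 takes as an example a set of parameters of the first machine learning model that includes the following parameters: U, W, θs, V, and θo. For example, the parameters used when performing feature extraction on the input data of the first machine learning model to obtain feature information of the input data include U and θs; the parameters used when updating the feature information of the input data include W and θs; and the parameters used when generating, from the feature information of the input data, the multiple pieces of sub-data in the output data of the first machine learning model include V and θo.
Table 1
In the second row of Table 1, Matrix-U0 represents one value of the parameter U, Matrix-W0 represents one value of the parameter W, Vector-s0 represents one value of the parameter θs, Matrix-V0 represents one value of the parameter V, and Vector-o0 represents one value of the parameter θo. Matrix-U0, Matrix-W0, Vector-s0, Matrix-V0, and Vector-o0 together represent one set of parameters of the first machine learning model, and the index "0" in the second row of Table 1 represents the identification information of that set. The third and fourth rows of Table 1 can be understood by referring to the above explanation of the second row. It should be noted that the example in Table 1 is only intended to facilitate understanding of the correspondence between each of the multiple sets of parameters of the first machine learning model and its identification information, and is not intended to limit this solution.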
可选地,第一机器学习模型的多组训练后的参数中不同组参数与至少一种第一预设指标的不同指标范围对应。例如,至少一种第一预设指标可以包括如下任一个或多个指标:终端设备的移动速度、多径时延扩展的最大值、峰值平均功率比(peak to average power ratio,PAPR)或其他指标等等,本申请实施例中不限定具体采用哪些指标。
例如,当终端设备的移动速度大于或等于速度阈值时所对应的一组参数,和,当终端设备的移动速度小于速度阈值时所对应的一组参数不同。又例如,多径时延扩展的最大值大于阈值1时所对应的一组参数,和,多径时延扩展的最大值小于阈值1时所对应的一组参数不同。又例如,PAPR的值位于范围1内时所对应的一组参数,和,PAPR的值位于范围2内时所对应的一组参数不同。
为更直观地理解本方案,如下通过表2示出多组不同的训练后的参数与不同的场景以及第一预设指标的不同指标范围之间的对应关系。
表2
其中,高速移动场景代表终端设备的移动速度大于或等于速度阈值,低速移动场景代表终端设备的移动速度小于速度阈值,多径时延扩展大代表多径时延扩展的最大值大于阈值1,多径时延扩展小代表多径时延扩展的最大值小于阈值1,表2中以预先定义第一机器学习模型的8组参数为例,8组参数分别为参数1、参数2、参数3、参数4、参数5、参数6、参数7以及参数8,第一机器学习模型的8组参数中不同组参数与不同的场景以及第一预设指标的不同指标范围对应,需要说明的是,表2中的示例仅为方便理解第一机器学习模型的多组训练后的参数中不同组参数之间的关系,具体预先定义的参数的组数以及具体的使用方式可以结合实际场景灵活设定,此处不做限定。
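The correspondence between indicator ranges and parameter sets described above can be sketched as a small lookup. This is a minimal sketch under stated assumptions, not the method of this application: the threshold values, the use of PAPR as a third two-range indicator, and the ordering of the identifiers 0 to 7 are all hypothetical, because the text only names a "speed threshold", a "threshold 1", and PAPR ranges without giving values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicators:
    speed_kmh: float            # moving speed of the terminal device
    max_delay_spread_us: float  # maximum multipath delay spread
    papr_db: float              # peak to average power ratio


# Hypothetical thresholds; the source does not give numeric values.
SPEED_THRESHOLD_KMH = 120.0
DELAY_SPREAD_THRESHOLD_US = 1.0
PAPR_THRESHOLD_DB = 8.0


def select_parameter_set_id(ind: Indicators) -> int:
    """Map the indicator ranges to one of 8 identifiers (0..7).

    The bit layout (speed, delay spread, PAPR) is an assumed ordering.
    """
    high_speed = ind.speed_kmh >= SPEED_THRESHOLD_KMH
    large_spread = ind.max_delay_spread_us > DELAY_SPREAD_THRESHOLD_US
    high_papr = ind.papr_db > PAPR_THRESHOLD_DB
    return (int(high_speed) << 2) | (int(large_spread) << 1) | int(high_papr)
```

With two ranges for each of three indicators, the lookup yields eight identifiers, which matches the eight parameter sets used as the example above; a deployment could equally use more ranges or fewer indicators.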
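In one case, the first apparatus is a terminal device. In one implementation (referred to below as implementation 1 for ease of description), before step 401 is performed, the first apparatus (that is, the terminal device) is deployed with multiple pieces of identification information corresponding to the parameters of the first machine learning model and with a first rule. The first rule indicates the correspondence between different pieces of identification information, among the multiple pieces of identification information corresponding to the parameters of the first machine learning model, and different indicator ranges of at least one first preset indicator. Because the multiple pieces of identification information corresponding to the parameters of the first machine learning model include the identification information of each of the multiple sets of parameters of the first machine learning model, the meaning of the first rule can be understood by referring to the "correspondence between different sets among the multiple sets of trained parameters and different indicator ranges of at least one first preset indicator" disclosed above, and details are not repeated here. The first apparatus may determine, based on the value of the at least one first preset indicator and the first rule, one piece of second identification information from the multiple pieces of identification information corresponding to the parameters of the first machine learning model; the second identification information is the identification information of the set of parameters used by the first machine learning model in the first apparatus. It should be noted that the concept of "first identification information" is used in later descriptions, where its meaning is also explained, so it is not described here; for the various forms of the base station and the terminal device, refer to the above description, which is not repeated here.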
第一装置可以向基站发送第二标识信息,基站在接收到第一装置发送的第二标识信息之后,可以向第一装置发送第二标识信息指向的一组训练后的参数,前述由基站发送的一组训练后的参数为第一装置中的第一机器学习模型采用的一组训练后的参数。步骤401可以包括:第一装置能够获取到基站发送的前述一组训练后的参数,从而将获取到的前述一组参数确定为第一机器学习模型采用的一组参数。
可选地,基站可以向第一装置发送第二标识信息以及该第二标识信息指向的一组参数。对应的,第一装置能够获取到第二标识信息以及第二标识信息指向的一组参数,第一装置将获取到的前述一组参数确定为第一机器学习模型采用的一组参数。
需要说明的是,在上述实现方式中,若与第一装置(也即上述终端设备)进行数据通信的第二装置为上述基站,则在第一装置获取第一机器学习模型采用的一组训练后的参数的过程中,第二装置也已经确定第一装置中的第一机器学习模型采用的是哪组参数。
若与第一装置(也即上述终端设备)进行数据通信的第二装置为另一个终端设备,则在步骤401之后,第一装置还可以将第一机器学习模型采用的一组参数(为方便描述,后续称为“一组目标参数”)发送给第二装置,从而第二装置能够确定第一装置中的第一机器学习模型采用的是哪一组参数。
若第一装置为终端设备,在另一种实现方式中(为方便描述,后续称为实现方式二),第一装置中可以配置有与第一机器学习模型的参数对应的至少一个第三标识信息,前述至少一个第三标识信息中每个第三标识信息为第一机器学习模型能够采用的一组参数的标识信息,前述至少一个第三标识信息中每个第三标识信息指向的一组参数均符合第一装置的硬件能力,也即第一装置的硬件能力能够支持每个第三标识信息指向的一组参数被执行。在执行步骤401之前,第一装置可以向基站发送上述配置的所有第三标识信息,对应的,基站在接收到第一装置发送的至少一个第三标识信息之后,可以获取每个第三标识信息指向的一组训练后的参数(也即第一机器学习模型能够采用的一组参数)。基站向第一装置发送前述与至少一个第三标识信息一一对应的第一机器学习模型的至少一组训练后的参数,第一装置接收到前述与至少一个第三标识信息一一对应的至少一组参数。
步骤401可以包括:第一装置可以从前述与至少一个第三标识信息一一对应的至少一组参数中选择第一机器学习模型采用的一组目标参数。
示例性地,至少一个第三标识信息中不同的第三标识信息可以与至少一种第二预设指标的不同指标范围对应,第一装置可以根据至少一种第二预设指标的值,从与第一机器学习模型的参数对应的多个第三标识信息中确定一个第二标识信息,进而从与至少一个第三标识信息一一对应的至少一组参数中选择与一个第二标识信息对应的一组目标参数。
其中,“第一装置可以根据至少一种第二预设指标的值,从与第一机器学习模型的参数对应的多个第三标识信息中确定一个第二标识信息”的具体实现方式可以参阅上述对“第一装置可以根据至少一种第一预设指标的值,从与第一机器学习模型的参数对应的多个标识信息中确定一个第二标识信息”的具体实现方式的描述,此处不做赘述。
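The categories of the "at least one second preset indicator" and of the "at least one first preset indicator" may be the same or different. The specific categories of the "at least one second preset indicator" can be set flexibly according to the actual situation and are not limited here.
Optionally, if the second apparatus that performs data communication with the first apparatus (that is, the above terminal device) is the above base station, then after step 401 the first apparatus may further send to the base station (an example of the second apparatus) the one piece of second identification information that the first apparatus determined from the at least one piece of third identification information. Correspondingly, the base station receives the one piece of second identification information sent by the first apparatus, so that the second apparatus can determine which set of parameters is used by the first machine learning model in the first apparatus.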
若与第一装置(也即上述终端设备)进行数据通信的第二装置为另一个终端设备,则在步骤401之后,第一装置还可以将第一机器学习模型采用的一组目标参数发送给第二装置,从而第二装置能够确定第一装置中的第一机器学习模型采用的是哪一组参数。
在另一种情况中,若第一装置为基站(为方便描述,后续称为实现方式三),与第一装置进行数据通信的第二装置为终端设备,由于基站中可以部署有第一机器学习模型的多组训练后的参数,以及每组训练后的参数的标识信息,则步骤401可以包括:基站(也即第一装置的一个示例)可以根据至少一种第一预设指标的值,从与第一机器学习模型的多组训练后的参数中确定第一机器学习模型采用的一组目标参数。
示例性地,第一装置(也即基站)可以根据至少一种第一预设指标的值,从与第一机器学习模型对应的多个标识信息中确定一个第二标识信息,进而确定第二标识信息指向的一组参数,从而确定了第一机器学习模型采用的一组目标参数。
可选地,在步骤401之后,第一装置(也即基站)还可以将前述第一机器学习模型采用的一组目标参数发送给第二装置(也即与该基站通信的终端设备),从而第二装置能够确定第一装置中的第一机器学习模型采用的是哪一组参数。
在另一种实现方式中(为方便描述,后续称为实现方式四),在第一装置和第二装置进行数据通信之前,第一装置和第二装置中均已经有第一机器学习模型的多组训练后的参数以及每组训练后的参数的标识信息。示例性地,第一装置可以将第一机器学习模型的多组训练后的参数以及每组训练后的参数的标识信息发送给第二装置。例如,基站(第一装置的一个示例)可以将第一机器学习模型的多组训练后的参数以及每组训练后的参数的标识信息发送给终端设备(第二装置的一个示例);又例如,终端设备(第一装置的一个示例)可以将第一机器学习模型的多组训练后的参数以及每组训练后的参数的标识信息发送给基站(第二装置的一个示例);又例如,第一终端设备(第一装置的一个示例)可以将第一机器学习模型的多组训练后的参数以及每组训练后的参数的标识信息发送给第二终端设备(第二装置的一个示例)
或者,无线通信系统中的网络设备和移动台中均已经预先配置了第一机器学习模型的多组训练后的参数以及每组训练后的参数的标识信息,无线通信系统中的网络设备和移动台包括第一装置和第二装置。
则步骤401可以包括:第一装置可以从第一机器学习模型的多组训练后的参数中获取一组目标参数。示例性地,第一装置可以根据至少一种第一预设指标的值,从与第一机器学习模型的参数对应的多个标识信息中确定一个第二标识信息,进而确定第二标识信息指向的一组参数,从而确定了第一机器学习模型采用的一组参数。前述步骤的具体实现方式可以参阅上述描述,此处不做赘述。
可选地,在步骤401之后,第一装置还可以向第二装置发送前述一个第二标识信息,对应的,第二装置接收第一装置发送的一个第二标识信息。第二装置可以根据接收到的一个第二标识信息和第三数据,确定第一装置中的第一机器学习模型采用的是哪一组参数,第三数据包括第二装置中的与第一机器学习模型对应的多组参数以及每组参数的标识信息。对于第一装置和第二装置的具体形态可以结合实际应用场景灵活确定,此处不做限定。
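To understand this solution more intuitively, refer to FIG. 5, which is a schematic flowchart, provided in an embodiment of this application, of the first apparatus and the second apparatus determining the set of parameters used by the first machine learning model. As shown in FIG. 5: 501. The first apparatus sends, to the second apparatus, multiple sets of parameters of the first machine learning model and the identification information of each set. 502. The first apparatus sends second identification information to the second apparatus. 503. The second apparatus determines, based on the second identification information and third data, the set of parameters used by the first machine learning model in the first apparatus, where the third data includes the multiple sets of parameters of the first machine learning model and the identification information of each set. It should be understood that the example in FIG. 5 is only intended to facilitate understanding of this solution and is not intended to limit it.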
本申请实施例中,第二装置中存在第一机器学习模型的多组参数以及每组参数的标识信息,第一装置仅需要向第二装置发送第二标识信息,第二装置就能够得知第一装置中的第一机器学习模型采用的是哪组参数,而传输第二标识信息所占用的通信资源较少,有利于减少所消耗的通信资源。
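The identifier-based lookup on the side of the second apparatus can be sketched as follows. This is an illustrative sketch only: the dictionary layout of the third data and the helper names `resolve_parameter_set` and `identifier_bits` are assumptions, not definitions from this application.

```python
from typing import Dict, List

ParameterSet = Dict[str, List[float]]


def resolve_parameter_set(third_data: Dict[int, ParameterSet],
                          second_id: int) -> ParameterSet:
    """Return the parameter set that the received identifier points to."""
    if second_id not in third_data:
        raise KeyError(f"unknown parameter-set identifier: {second_id}")
    return third_data[second_id]


def identifier_bits(num_sets: int) -> int:
    """Number of bits needed to index num_sets parameter sets."""
    if num_sets < 1:
        raise ValueError("num_sets must be >= 1")
    return max(1, (num_sets - 1).bit_length())
```

Under this sketch, eight predefined parameter sets can be distinguished with a 3-bit identifier, whereas sending the parameter values themselves would require transmitting every matrix and vector entry.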
需要说明的是,本申请实施例中不限定上述“第二装置确定第一装置中的第一机器学习模型采用的是哪一组参数”这一操作与后续步骤402至411之间的执行顺序,“第二装置确定第一装置中的第一机器学习模型采用的是哪一组参数”这一操作可以在步骤402至411中任一步骤之前或之后执行。
在另一种实现方式中,可以预先仅定义第一机器学习模型的一组训练后的参数,前述第一机器学习模型的一组训练后的参数可以预先配置在第一装置中,则当第一装置需要使用第一机器学习模型时,可以直接从本地获取第一机器学习模型采用的一组参数。
本申请实施例中,为了使得第一装置获取到第一机器学习模型采用的一组训练后的参数,以及,为了使得与第一装置进行数据通信的第二装置能够确定第一装置中的第一机器学习模型采用的是什么样的参数,不同装置之间可能会需要发送与第一机器学习模型对应的训练后的参数和/或前述参数的标识信息。
例如,在上述实现方式一中,终端设备(也即第一装置的一个示例)可以向基站发送第二标识信息(也即与第一机器学习模型对应的训练后的参数的标识信息)。又例如,基站可以向终端设备发送第二标识信息指向的一组训练后的参数,也即第一机器学习模型采用的一组目标参数。又例如,终端设备(也即第一装置的一个示例)可以向另一个终端设备(也即第二装置的一个示例)发送第一机器学习模型采用的一组目标参数。
又例如,在上述实现方式二中,终端设备(也即第一装置的一个示例)可以向基站发送至少一个第三标识信息(也即与第一机器学习模型对应的训练后的参数的标识信息)。又例如,基站向终端设备(也即第一装置的一个示例)发送前述与至少一个第三标识信息一一对应的第一机器学习模型的至少一组训练后的参数。又例如,若与第一装置(也即上述终端设备)进行数据通信的第二装置为上述基站,第一装置还可以向基站(也即第二装置的一个示例)发送前述一个第二标识信息。又例如,若与第一装置(也即上述终端设备)进行数据通信的第二装置为另一个终端设备,第一装置还可以将第一机器学习模型采用的一组目标参数发送给第二装置等等,此处不再对上述实现方式三和实现方式四中的情况进行一一列举,具体可以参阅上述种种实现方式中的描述。
在一种实现方式中,与第一机器学习模型对应的训练后的参数和/或前述参数的标识信息可以携带于信令中,也即在上述种种实现方式中,不同装置之间通过发送信令的方式来发送与第一机器学习模型对应的训练后的参数和/或前述参数的标识信息。
当不同装置之间通过发送信令的方式来传输与第一机器学习模型对应的训练后的参数时,每个信令可以携带第一机器学习模型的至少一组训练后的参数。可选地,与第一机器 学习模型对应的训练后的参数可以携带于如下一种或多种信息中:下行控制信息(downlink control information,DCI)、上行控制信息(uplink control information,UCI)、侧行链路控制信息(sidelink control information,SCI)、无线资源控制(radio resource control,RRC)信令、媒体访问控制的控制元素(media access control control element,MAC CE)或者其他类型的信令中,此处不做穷举。
例如,在上述实现方式一中,基站可以将携带有该第二标识信息指向的一组训练后的参数的DCI、RRC或MAC CE发送给终端设备(也即第一装置的一种示例);对应的,第一装置可以从前述DCI、RRC或MAC CE中获取该第二标识信息指向的一组训练后的参数。
又例如,在上述实现方式一中,终端设备(也即第一装置的一种示例)可以将第一机器学习模型采用的一组目标参数携带于SCI、RRC或MAC CE发送给另一个终端设备(也即第二装置的一种示例);对应的,第二装置可以从SCI、RRC或MAC CE中获取前述第一机器学习模型采用的一组目标参数。
又例如,在上述实现方式二中,基站可以将每个第三标识信息以及每个第三标识信息指向的一组训练后的参数携带于DCI、RRC或MAC CE中发送给终端设备(也即第一装置的一个示例);对应的,第一装置可以从DCI、RRC或MAC CE中获取每个第三标识信息以及每个第三标识信息指向的一组训练后的参数。
又例如,在上述实现方式四中,终端设备(第一装置的一个示例)可以将第一机器学习模型的每组训练后的参数以及每组训练后的参数的标识信息携带于UCI、RRC或MAC CE中发送给基站(第二装置的一个示例);对应的,基站可以从UCI、RRC或MAC CE中获取第一机器学习模型的每组训练后的参数以及每组训练后的参数的标识信息等等。
需要说明的是,此处不对上述种种实现方式中通过信令携带与第一机器学习模型对应的训练后的参数的情况一一进行赘述,上述种种实现方式中其他通过信令携带与第一机器学习模型对应的训练后的参数的情况可参阅前述描述进行理解。
示例性地,当通过信令(为方便描述,后续称为“第一信令”)携带第一机器学习模型的至少一组训练后的参数时,第一信令中可以携带有第一机器学习模型采用的至少一组参数以及每组参数所对应的指示信息,每组参数所对应的指示信息用于指示每组参数包括的多个参数在第一机器学习模型中的位置。可选地,第一信令为如下任一种:DCI、UCI、SCI、RRC、MAC CE或其他类型的信令等。
例如,在上述实现方式一中,当基站需要向第一装置发送第二标识信息指向的一组训练后的参数时,基站可以向第一装置发送第一信令,对应的,第一装置可以接收基站发送的第一信令,并从第一信令中获取第一机器学习模型采用的一组参数。
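For another example, in implementation 2 above, when the base station needs to send to the first apparatus the set of trained parameters to which each piece of third identification information points, the base station may send one or more pieces of first signaling to the first apparatus, where each piece of first signaling carries one piece of third identification information and the set of parameters to which that third identification information points, and so on. The other cases in the above implementations are not listed one by one here.
In the embodiments of this application, when at least one set of parameters of the first machine learning model is transmitted through signaling, the signaling carries not only the at least one set of parameters but also indication information corresponding to each set of parameters. The indication information indicates the positions, in the first machine learning model, of the multiple parameters included in each set. After receiving the signaling, the first apparatus therefore knows how to use the parameters carried in it. In addition, transmitting the at least one set of parameters of the first machine learning model through signaling helps reduce the communication resources consumed during parameter transmission and improves the efficiency of the transmission.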
可选地,每组参数所对应的指示信息可以包括每组参数中每个参数在第一机器学习模型的层数,以及该层中运算时所采用的参数值。示例性地,第一信令中每个参数的名称和第一机器学习模型中的每个参数的名称可以一致,第一信令携带的信息可以包括如下内容:
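{Layer[1]:{Matrix U1_1:{u11(1,1),u11(1,2),u11(2,1),u11(2,2),…},Matrix U1_2:{u12(1,1),u12(1,2),u12(1,3),u12(2,1),…},Vector b1_1:{b11(1),b11(2),…}}
Layer[2,3,4]:{Matrix U2_1:{u21(1,1),u21(1,2),u21(2,1),u21(2,2),…},Matrix U2_2:{u22(1,1),u22(1,2),u22(1,3),u22(2,1),…},Vector b2_1:{b21(1),b21(2),…}}
Layer[5]:{Matrix U3_1:{u31(1,1),u31(1,2),u31(1,3),…},Matrix U3_2:{u32(1,1),u32(1,2),u32(2,1),…},Vector b3_1:{b31(1),b31(2),…}}}
The above information indicates that the parameters used by the first neural network layer of the first machine learning model include matrix U1_1, matrix U1_2, and vector b1_1, where the value of matrix U1_1 is {u11(1,1),u11(1,2),u11(2,1),u11(2,2),…}, the value of matrix U1_2 is {u12(1,1),u12(1,2),u12(1,3),u12(2,1),…}, the value of vector b1_1 is {b11(1),b11(2),…}, and so on. The meaning of the parameters used by the second to fifth neural network layers of the first machine learning model can be understood with reference to the above description and is not repeated here.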
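How a receiver might turn such layer-indexed signaling content into per-layer parameters can be sketched as follows. This is a sketch under assumptions: the in-memory representation (a dictionary keyed by tuples of layer numbers), the reading of "Layer[2,3,4]" as three layers sharing one group of parameters, and all numeric values are hypothetical and are not specified by this application.

```python
from typing import Dict, List, Tuple

LayerParams = Dict[str, List[float]]


def expand_layers(signaling: Dict[Tuple[int, ...], LayerParams]) -> Dict[int, LayerParams]:
    """Assign each named parameter group to every layer it is declared for."""
    per_layer: Dict[int, LayerParams] = {}
    for layer_ids, params in signaling.items():
        for layer in layer_ids:
            if layer in per_layer:
                raise ValueError(f"layer {layer} is described twice")
            per_layer[layer] = params
    return per_layer


# Made-up values that mirror only the shape of the example above.
example_signaling = {
    (1,): {"U1_1": [0.1, 0.2], "b1_1": [0.0]},
    (2, 3, 4): {"U2_1": [0.3, 0.4], "b2_1": [0.1]},
    (5,): {"U3_1": [0.5], "b3_1": [0.2]},
}
layers = expand_layers(example_signaling)
```

Because the parameter names in the signaling match the names used inside the model, the receiver can place each value by layer number and name without any additional mapping table.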
为了更直观地理解本方案,如下以第一信令为MAC CE和RRC为例,展示了在MAC CE和RRC中携带第一机器学习模型的一组参数时的具体格式。先参阅如下表3,表3中示出了第一机器学习模型的一组参数在MAC CE中的格式。
表3
表3中的R代表无意义的参数,也即第一机器学习模型中不会采用R,表3中以第一机器学习模型包括2个神经网络层为例,在表3的第一行(也即MAC CE的第一行)先声明 第一机器学习模型的第1个神经网络层采用的参数包括U0、W0、θs0、V0以及θo0,并在第二行至第六行中分别示出了第一机器学习模型的第1个神经网络层采用的5个参数的值。在表3的第七行(也即MAC CE的第七行)声明了第一机器学习模型的第2个神经网络层采用的参数包括U1、W1、θs1、V1以及θo1,并在第八行至第十二行示出了第一机器学习模型的第2个神经网络层采用的5个参数的值,由于第一装置的第一机器学习模型中采用的参数也是U0、W0、θs0、V0、θo0、U1、W1、θs1、V1以及θo1,则第一装置在接收到MAC CE之后,能够确定第一机器学习模型中每个参数的取值,应理解,表3中的示例仅为方便理解本方案,不用于限定本方案。
接下来介绍当通过RRC携带第一机器学习模型的一组或多组参数时,RRC中携带的内容可以如下:
其中,matrixU SEQUENCE{}代表{}中的参数的值均为矩阵,Ui0j0代表第一机器学习模型的参数matrixU的第i0行第j0列的参数,Ui0j0,REAL中的REAL代表的是Ui0j0这一参数的值的类型是实数;同理,Ui1j1代表第一机器学习模型的参数matrixU的第i1行第j1列的参数,Ui1j1,REAL中的REAL代表的是Ui1j1这一参数的值的类型是实数。vector SEQUENCE{}代表{}中的参数的值均为向量,v0为第一机器学习模型的参数vector的第0个参数,v0,REAL中的REAL代表的是v0这一参数的值的类型是实数;v1为第一机器学习模型的参数vector的第1个参数,v1,REAL中的REAL代表的是v1这一参数的值的类型是实数。采用前述方式RRC中可以携带第一机器学习模型的一组参数,且在获取到RRC之后,还可以知道一组参数中的每个参数在第一机器学习模型中的位置,应理解,上述举例仅为方便理解本方案,不用于限定本方案。
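To understand this solution more intuitively, refer to FIG. 6, which is a schematic flowchart, provided in an embodiment of this application, of the first apparatus obtaining the set of parameters used by the first machine learning model. 601. The first apparatus sends UCI to the base station, where the UCI carries second identification information. 602. The base station obtains the set of parameters to which the second identification information points, that is, the set of parameters used by the first machine learning model. 603. The base station obtains DCI, where the DCI carries the set of parameters to which the second identification information points and the indication information corresponding to that set; the indication information indicates the positions, in the first machine learning model, of the multiple parameters included in that set. 604. The base station sends the DCI to the first apparatus, and correspondingly the first apparatus receives the DCI. It should be noted that the UCI and DCI in FIG. 6 may also be replaced with other types of signaling; the example in FIG. 6 is only intended to facilitate understanding of this solution and is not intended to limit it.
When different apparatuses transmit the identification information of the trained parameters corresponding to the first machine learning model by sending signaling, optionally, the identification information of the parameters is carried in any one or more of the following: DCI, UCI, SCI, RRC signaling, a MAC CE, a physical broadcast channel (PBCH), a physical random access channel (PRACH), or other types of signaling, which are not exhaustively listed here.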
例如,在上述实现方式一中,终端设备(也即第一装置的一个示例)可以将第二标识信息携带于UCI、MAC CE或PRACH中发送给基站。又例如,在上述实现方式四中,若第一装置为基站,第二装置为终端设备,则基站可以将第二标识信息携带于DCI、MAC CE或PBCH中发送给终端设备。又例如,在上述实现方式四中,若第一装置和第二装置均为终端设备,则第一终端设备可以将第二标识信息携带于SCI或MAC CE中发送给第二终端设备等等;需要说明的是,此处不对上述种种实现方式中通过信令携带与第一机器学习模型对应的训练后的参数的标识信息情况一一进行赘述,上述种种实现方式中其他通过信令携带前述标识信息的情况可参阅前述描述进行理解。
本申请实施例中,相比于将第一机器学习模型的至少一组参数和/或每组参数的标识信息携带于数据包中传输,将前述至少一组参数和/或每组参数的标识信息携带于信令中传输,传输效率更高,且消耗的计算机资源更少;此外,本方案中提供了多种能够用于传输前述至少一组参数和/或每组参数的标识信息的信令,提高了本方案的实现灵活性。
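In another implementation, the trained parameters corresponding to the first machine learning model and/or the identification information of those parameters may be carried in data packets; that is, in the above implementations, different apparatuses send the trained parameters corresponding to the first machine learning model and/or the identification information of those parameters by sending data packets.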
例如,在上述实现方式一中,终端设备(也即第一装置的一个示例)可以向基站发送携带第二标识信息的第一数据包,对应的,基站可以从第一数据包中获取到第二标识信息。又例如,基站可以向终端设备(也即第一装置的一个示例)发送第二数据包,第二数据包中携带第二标识信息指向的一组训练后的参数;终端设备可以从第二数据包中获取第一机器学习模型采用的前述一组目标参数。又例如,终端设备(也即第一装置的一个示例)可以向另一个终端设备(也即第二装置的一个示例)发送第三数据包,第三数据包中携带第一机器学习模型采用的一组目标参数;第二装置可以从第三数据包中获取第一装置中的第一机器学习模型采用的一组目标参数等等。
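It should be noted that the specific implementations in implementation 2, implementation 3, and implementation 4 are not described one by one here; they can be understood by referring to the above description of implementation 1.
Optionally, while the first apparatus and the second apparatus are transmitting data, the first machine learning model may be retrained to optimize the parameters it uses, so that the first apparatus obtains a set of updated parameters of the first machine learning model. Optionally, after the first apparatus obtains the set of updated parameters of the first machine learning model, it may send that set of updated parameters to the second apparatus.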
示例性地,在一种实现方式中,无论第一装置是终端设备还是基站,均由基站对第一机器学习模型进行再次训练,得到第一机器学习模型的一组更新后的参数。在一种情况中,若第一装置为终端设备,则第一装置可以向基站发送请求,前述请求用于请求基站对第一机器学习模型进行再次训练,基站将第一机器学习模型的一组更新后的参数发送给第一装置。可选地,第一装置还可以向第二装置发送前述一组更新后的参数。
在另一种情况中,若第一装置为基站,则基站在得到第一机器学习模型的一组更新后的参数之后,还可以向第二装置发送前述一组更新后的参数,从而第二装置能够确定第一装置中的第一机器学习模型采用的一组更新后的参数。
在另一种实现方式中,由第一装置对第一机器学习模型进行再次训练,得到第一机器学习模型的一组更新后的参数。可选地,第一装置还会将第一机器学习模型的一组更新后的参数发送给第二装置。
例如,若第一装置为终端设备,第二装置为基站,则终端设备在对第一机器学习模型进行再次训练,得到第一机器学习模型的一组更新后的参数之后,可以将前述一组更新后的参数发送给基站。又例如,若第一装置为第一终端设备,第二装置为第二终端设备,则第一终端设备在对第一机器学习模型进行再次训练,得到第一机器学习模型的一组更新后的参数之后,可以将前述一组更新后的参数发送给第二终端设备等等,此处不对各种情形进行穷举。
需要说明的是,上述各种实现方式中各个装置之间发送“第一机器学习模型的一组更新后的参数”的方式可以参阅上述描述,此处不做赘述。
402、第一装置获取T的取值,T为大于或等于1的整数,T代表第一机器学习模型的输出数据包括的子数据的个数。
本申请实施例中,T代表第一机器学习模型的输出数据包括的子数据的个数。例如,若采用第一机器学习模型执行的任务是调制,则第一机器学习模型的输出数据是调制后的数据,调制后的数据可以包括T组调制后的符号。又例如,若采用第一机器学习模型执行的任务是编码,则第一机器学习模型的输出数据是编码后的数据,编码后的数据可以包括T组编码后的比特数据。又例如,若采用第一机器学习模型执行的任务是生成参考信号,则第一机器学习模型的输出数据是参考信号,T代表的可以为前述参考信号的长度,示例性地,参考信号的长度可以指示参考信号包括的符号的数量。
403、第一装置获取待处理数据。
404、第一装置判断待处理数据的长度是否小于H,若判断结果为是,则进入步骤405;若判断结果为否,则进入步骤406,H指示第一数据的长度。
本申请实施例中,步骤404为可选步骤,第一装置还可以获取待处理数据的长度和H的取值,待处理数据的长度可以为K,K为大于或等于1的整数,H指示第一数据的长度,也即H指示第一机器学习模型的输入数据的期望长度,H为大于或等于1的整数。
第一装置在获取到K和H的取值之后,可以判断K是否小于H;若判断结果为是,则进入步骤405;若判断结果为否,则进入步骤406。
可选地,待处理数据的长度可以为待处理数据的比特位数。例如,若采用第一机器学 习模型执行的任务是编码,则待处理数据为需要编码的数据,待处理数据的长度可以为需要编码的数据的比特位数。又例如,若采用第一机器学习模型执行的任务是调制,则待处理数据为需要调制的数据,待处理数据的长度可以为需要调制的数据的比特位数。又例如,若采用第一机器学习模型执行的任务是编码和调制,则待处理数据为需要编码和调制的数据,待处理数据的长度为需要执行编码和调制的数据的比特位数。又例如,若采用第一机器学习模型执行的任务是生成参考信号,则待处理数据包括多个参考信号的索引号,待处理数据的长度为前述多个参考信号的索引号的比特位数等等。
或者,若采用第一机器学习模型执行的任务是生成参考信号,则待处理数据包括多个参考信号的索引号,待处理数据的长度可以为前述多个参考信号的个数等等,“待处理数据的长度”的含义可以结合实际情况灵活确定,此处不做限定。
405、第一装置对待处理数据进行填充,得到第一数据,第一数据的长度为H。
本申请实施例中,步骤405为可选步骤,若第一装置确定待处理数据的长度小于H,则第一装置可以对待处理数据进行填充,得到第一数据,第一数据的长度为H,则第一数据可以包括待处理数据和填充数据。本申请实施例中,当待处理数据的长度小于H时,对待处理数据进行填充得到长度为H的第一数据,再将长度为H的第一数据输入第一机器学习模型中,从而无论待处理数据的长度是多少,第一机器学习模型处理的都是长度为H的第一数据,不仅有利于兼容任意长度的待处理数据,且有利于降低第一机器学习模型在进行数据处理时的难度,以得到性能更好的第二数据。
在一种情况中,填充数据可以包括第一标识信息,第一标识信息用于标识T的取值和/或K的取值,也即第一标识信息可以用于标识T的取值和K的取值,也可以用于标识T的取值,也可以用于标识K的取值。其中,T的取值为第一机器学习模型的输出数据包括的子数据的个数,K为待处理数据的长度,T和K均为大于或等于1的整数。
示例性地,第一装置在获取到T的取值和/或K的取值之后,可以采用第一函数对T的取值和/或K的取值进行处理,以得到第一标识信息。第一函数需要满足的条件包括:将第一标识信息的取值限制在预设范围内,且,能够将不同T的取值和/或K的取值映射为不同的值,也即通过该第一函数生成的值能够唯一的标识某一个T的取值和/或K的取值,又或者说通过该第一函数生成的值能够对不同的T的取值和/或K的取值进行区分。
示例性地,第一函数可以为二元函数、线性函数或非线性函数。例如,此处以第一标识信息用于标识T的取值和K的取值为例,如下公开了第一函数的一个示例:
f(T,K)=1–2*((K-3)*11+(T-4))/98;  (1)
其中,f(T,K)代表第一函数的一个示例,式(1)中的示例仅为方便理解本方案,不用于限定本方案。
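The two conditions on the first function, a bounded output and a distinct output for each pair of values, can be checked numerically for formula (1). This is a sketch under an assumption: the ranges T = 4..14 and K = 3..11 are inferred from the constants 4, 3, 11, and 98 in formula (1) and are not stated in this application.

```python
def first_function(T: int, K: int) -> float:
    """Formula (1): map the pair (T, K) to a single value."""
    return 1 - 2 * ((K - 3) * 11 + (T - 4)) / 98


# Assumed ranges (not stated in the source): 11 values of T, 9 values of K.
T_VALUES = range(4, 15)
K_VALUES = range(3, 12)

values = {(T, K): first_function(T, K) for K in K_VALUES for T in T_VALUES}
```

Over these assumed ranges the inner index (K-3)*11+(T-4) runs through 0 to 98 without repetition, so the function output stays within [-1, 1] and each (T, K) pair receives its own value.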
本申请实施例中,在第一数据中携带有用于标识T的取值和/或K的取值的第一标识信息,则第一机器学习模型能够根据T的取值和/或K的取值,即根据第一机器学习模型的输出数据的长度和/或真实的待处理数据的长度处理第一数据,进而通过第一机器学习模型输出的第二数据,有利于得到性能更好的第二数据。
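Optionally, if the padding data still has remaining space after carrying the first identification information, then in one case the padding data may further carry identification information of the first apparatus. For example, if the first apparatus is a terminal device, the identification information of the first apparatus may be identification information of that terminal device; if the first apparatus is a base station, the identification information of the first apparatus may be identification information of that base station. For example, the identification information of the first apparatus may be a radio network temporary identity (RNTI), a cell identity (cell ID), a physical cell identity (PCI), or another type of identification information, which are not exhaustively listed here.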
在另一种情况中,填充数据中还可以携带无效信息,示例性地,填充数据中的剩余空间中可以均填充为0、1或其他数值等,此处示例仅为方便理解本方案,不用于限定本方案。
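To further understand this solution, an example of the code used by the first apparatus when performing the above padding operation is disclosed below, where the to-be-processed data is b_0, b_1, …, b_{K-1}.
As can be seen from the above code, in the process of padding the to-be-processed data, i takes values from 0 to H-1 in turn. If i is less than K, b_i is obtained from the to-be-processed data, converted into c_i, and placed in the first data, where c_i represents the (i+1)-th element of the first data. If i is greater than or equal to K and less than H-1, 0 is padded into the first data. If i is equal to H-1, the first identification information is padded into the first data. For the meaning of the first identification information, refer to the above description; details are not repeated here. It should be understood that this example is only intended to facilitate understanding of this solution and is not intended to limit it.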
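The padding procedure described in the preceding paragraph can be sketched in Python as follows. This is a reconstruction from the prose description only: the function name `pad_to_first_data` is hypothetical, and the conversion from b_i to c_i is not specified in this text, so it is left as a caller-supplied function that defaults to the identity.

```python
from typing import Callable, List, Sequence


def pad_to_first_data(b: Sequence[float], H: int, first_id: float,
                      convert: Callable[[float], float] = lambda bit: bit) -> List[float]:
    """Pad K elements of to-be-processed data into first data of length H.

    Positions 0..K-1 hold the converted data, positions K..H-2 hold 0,
    and position H-1 holds the first identification information.
    """
    K = len(b)
    if not 1 <= K < H:
        raise ValueError("padding applies only when 1 <= K < H")
    c: List[float] = []
    for i in range(H):
        if i < K:
            c.append(convert(b[i]))   # c_i derived from b_i
        elif i < H - 1:
            c.append(0)               # invalid (filler) information
        else:
            c.append(first_id)        # i == H-1: first identification information
    return c
```

For instance, `pad_to_first_data([1, 0, 1], 6, 0.5)` places the three data elements first, two zeros next, and the identifier 0.5 in the last position, so the model always sees an input of length H = 6.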
为更直观地理解本方案,请参阅图7,图7为本申请实施例提供的第一数据的一种示意图。如图7所示,第一数据包括待处理数据、用于填充的0以及第一标识信息,图7中的示例仅为方便理解本方案,不用于限定本方案。
在另一种情况中,填充数据中可以包括第一装置的标识信息,且不包括第一标识信息。在另一种情况中,填充数据中携带的可以均为无效信息。
406、第一装置将第一标识信息和待处理数据进行融合,得到第一数据,第一标识信息用于标识T的取值和/或K的取值,K为待处理数据的长度,K为大于或等于1的整数。
本申请实施例中,步骤406为可选步骤,若第一装置确定待处理数据的长度等于H,第一装置还可以将第一标识信息和待处理数据进行融合,得到第一数据;对于第一标识信息的含义可以参阅上述描述,此处不做赘述。
示例性地,“将第一标识信息和待处理数据进行融合”包括但不限于:将第一标识信息和待处理数据进行拼接、相加或其他融合方式等等,此处不做限定。
需要说明的是,步骤404至406均为可选步骤,若步骤404至406均不执行,则可以直接将待处理数据确定为第一数据。或者,也可以不执行步骤404和405,仅执行步骤406。
或者,也可以执行步骤404和405,且不执行步骤406,则当第一装置确定待处理数据的长度等于H时,也可以直接将待处理数据确定为第一数据。
407、第一装置将第一数据输入第一机器学习模型,得到第一机器学习模型生成的第二数据,第二数据包括T个子数据,其中,第一机器学习模型包括一个或多个模块,每调用第一机器学习模型中的模块至少一次得到一个子数据。
本申请实施例中,第一装置在获取到第一数据之后,在一种实现方式中,第一装置可以直接将第一数据输入第一机器学习模型中。在另一种实现方式中,第一装置还可以采用第一装置的标识信息对第一数据进行加扰,并将加扰后的第一数据输入第一机器学习模型中,对于第一装置的标识信息的含义可以参阅上述描述,此处不做赘述。
“每调用第一机器学习模型中的模块至少一次得到一个子数据”代表每调用第一机器学习模型中的一个模块至少一次能够得到T个子数据中的一个子数据,或者,每调用第一机器学习模型中的多个模块至少一次能够得到T个子数据中的一个子数据。
第一机器学习模型的功能包括如下任一项或多项的组合:编码、调制或生成参考信号。例如,若第一机器学习模型的功能为编码,则T个子数据代表T组编码后的比特数据,每组编码后的比特数据可以包括一个或多个比特数据。又例如,若第一机器学习模型的功能是调制,则T个子数据代表T组调制后的符号,每组调制后的符号可以包括一个或多个符号。又例如,若第一机器学习模型的功能是编码和调制,则T个子数据代表T组编码和调制后的符号,每组编码和调制后的符号可以包括一个或多个符号。又例如,若第一机器学习模型的功能是生成参考信号,则T个子数据可以代表参考信号中的T组符号等,应理解,此处举例仅为方便理解本方案,不用于限定本方案。
本申请实施例中,提供了第一机器学习模型的多种功能,扩展了本方案的应用场景,提高了本方案的实现灵活性。
在一种情况中,第一机器学习模型可以包括第一模块和至少一个第二模块,第一模块和第二模块的区别包括:第一模块的初始输入为第一数据或加扰后的第一数据,第二模块的初始输入为第一数据(或加扰后的第一数据)的特征信息。
示例性地,每调用第一机器学习模型中的第二模块一次,就能够得到T个子数据中的一个子数据,则步骤407可以包括:第一装置将第一数据(或加扰后的第一数据)输入第一模块,得到第一模块生成的第一子数据,第一子数据为T个子数据中的一个;第一装置在利用第一模块生成第一子数据的过程中能够得到第一数据的特征信息。第一装置将第一特征信息输入第二模块,得到第二模块生成的第二子数据,其中,第一特征信息包括上一次调用第一机器学习模型中的模块进行数据处理时生成的特征信息,第二子数据为T个子数据中的一个。本申请实施例中,第一机器学习模型包括第一模块和至少一个第二模块,第一机器学习模型的第一模块的输入为整个第一数据,第二模块的输入为上一次调用第一 机器学习模型的模块时得到的特征信息,则第一次调用第二模块时输入的为通过第一模块对整个第一数据进行处理时得到的特征信息,从而每次输入第二模块的特征信息均参考了整个第一数据,也即在生成T个子数据中的每个子数据时均参考了整个第一数据的信息,有利于得到性能更好的第二数据;且每调用第二模块一次就能够得到T个子数据中的一个第二子数据,有利于快速的得到第二数据中包括的T个子数据。
针对“第一子数据”的获取过程,具体的,第一装置可以通过调用第一模块一次或多次的方式,得到第一模块生成的第一子数据;可选地,若第一装置通过调用第一模块多次的方式,得到第一模块生成的第一子数据,则第一次调用第二模块时输入的第一特征信息为最后一次调用第一模块时生成的特征信息。在一种实现方式中,第一装置在将第一数据(或加扰后的第一数据)输入第一模块中,通过第一模块对第一数据(或加扰后的第一数据)进行处理之后,直接输出第一子数据。
示例性地,第一模块可以选用卷积神经网络、循环神经网络、全连接神经网络或其他类型的神经网络等等,此处均不做限定。
可选地,第一装置利用第一模块对第一数据(或加扰后的第一数据)进行处理的过程可以包括:第一装置通过第一模块对第一数据(或加扰后的第一数据)进行线性变换,并采用第三激活函数进行处理,得到第一数据(或加扰后的第一数据)的特征信息;对第一数据(或加扰后的第一数据)的特征信息进行线性变换,并采用第四激活函数进行处理,得到第一模块生成的第一子数据。
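For example, an activation function in the first machine learning model may be any of the following: tanh(x), max(min(a*x,+1),-1), sin(x), or another type of activation function. The examples here only demonstrate the feasibility of this solution and are not intended to limit it.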
第三激活函数和第四激活函数可以采用相同的激活函数,也可以采用不同的激活函数,具体可以结合实际情况灵活设定,本申请实施例中不做限定。
在另一种实现方式中,第一装置在将第一数据(或加扰后的第一数据)输入第一模块中,通过第一模块对第一数据(或加扰后的第一数据)进行处理。第一装置获取前述处理过程中得到的第一数据(或加扰后的第一数据)的第二特征信息,将第二特征信息再次输入第一模块,通过第一模块对第二特征信息进行处理;第一装置重复执行前述“获取上一次调用第一模块进行数据处理的过程中得到的第二特征信息,将第二特征信息再次输入第一模块,通过第一模块对第二特征信息进行处理”的步骤至少一次,将最后一次调用第一模块对第二特征信息进行处理时的处理结果作为第一子数据。
示例性地,第一装置每次利用第一模块对输入数据进行处理的过程可以包括:第一装置通过第一模块对输入数据进行线性变换,并采用第三激活函数进行处理,得到变换后的输入数据;对变换后的输入数据进行线性变换,并采用第四激活函数进行处理,得到第一模块的处理结果。其中,第一模块的输入数据可以为第一数据(或加扰后的第一数据),或者,为第一数据(或加扰后的第一数据)的特征信息。
针对“每个第二子数据”的获取过程,具体的,第一装置每次调用第二模块时,会将 第一特征信息输入第二模块中,通过第二模块对第一特征信息进行处理,得到第二模块生成的第二子数据,第二子数据为T个子数据中的一个。
其中,第一特征信息为上一次调用第一机器学习模型中的模块进行数据处理时生成的特征信息;示例性地,第一特征信息可以为上一次调用第一机器学习模型中的第一模块进行数据处理时生成的特征信息,也可以为上一次调用第一机器学习模型中的第二模块进行数据处理时生成的特征信息。
示例性地,每个第二模块可以选用卷积神经网络、循环神经网络、全连接神经网络或其他类型的神经网络等等,此处均不做限定。
可选地,第一装置每次利用第二模块对第一特征信息进行处理的过程可以包括:第一装置通过第二模块对第一特征信息进行线性变换,并采用第一激活函数进行处理,得到变换后的特征信息;对变换后的特征信息进行线性变换,并采用第二激活函数进行处理,得到第二模块生成的第二子数据。其中,第一激活函数和第二激活函数均为第一机器学习模型内的激活函数,具体采用哪种激活函数可以结合实际情况灵活设定。
本申请实施例中,提供了第二模块进行数据处理时的具体实现方式,由于上述方式简单且易于实现,不仅有利于减少在生成第二数据的过程中消耗的计算机资源;且上述方式示出的第二模块中采用的参数量较少,有利于降低在传输第一机器学习模型的参数时消耗的通信资源。
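To further understand this solution, an example of the code by which the first apparatus generates the T pieces of sub-data is disclosed below. The T pieces of sub-data include x(0), x(1), …, x(T-1). In the following code, the first machine learning model includes one first module and one second module; one piece of first sub-data is obtained by calling the first module once, and T-1 pieces of second sub-data are obtained by calling the second module T-1 times:
Referring to the above code, the parameters used in the first module include U, θs, V, and θo, and the parameters used in the second module include W, θs, V, and θo; θ′s represents the transpose of θs, and θ′o represents the transpose of θo. In this example, the first module and the second module use the same parameter values, that is, U in the first module and W in the second module have the same value. Here, c represents the first data and c′ represents the transpose of the first data; Uc′+θ′s represents a linear transformation of the transposed first data; s′_0 = tanh(Uc′+θ′s) represents applying the linear transformation to the transposed first data and then processing the result with the third activation function to obtain the transformed first data s′_0 (that is, the feature information s′_0 of the first data). o′_0 = Vs′_0+θ′o represents applying a linear transformation to the feature information s′_0 of the first data to obtain o′_0; o_0 represents the transpose of o′_0; and exp(j2πo_0) represents processing o_0 with the fourth activation function to obtain the first sub-data x_0.
For each of the T-1 pieces of second sub-data, with t running from 1 to T-1, s′_{t-1} represents the first feature information, that is, the feature information generated the last time a module of the first machine learning model was called for data processing. Similar to the processing in the first module, s′_t = tanh(Ws′_{t-1}+θ′s) represents applying a linear transformation to the first feature information and then processing the result with the first activation function to obtain the transformed feature information s′_t; o′_t = Vs′_t+θ′o represents applying a linear transformation to the transformed feature information s′_t to obtain o′_t; o_t represents the transpose of o′_t; and exp(j2πo_t) represents processing o_t with the second activation function to obtain one piece of second sub-data x_t. By calling the second module T-1 times, T-1 pieces of second sub-data can be obtained.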
需要说明的是,此处以第一机器学习模型仅包括一个第一模块和一个第二模块,且第一模块和第二模块中采用的参数一致为例,仅为方便理解本方案的一个示例,在实际应用的过程中,第一模块和第二模块采用的参数也可以不一致,第一机器学习模型中也可以包括多个第二模块。
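The recurrences s′_0 = tanh(Uc′+θ′s), s′_t = tanh(Ws′_{t-1}+θ′s), o′_t = Vs′_t+θ′o, and x_t = exp(j2πo_t) described above can be sketched in plain Python as follows. This is a reconstruction from the prose, not the code of this application: the function names, the list-based matrix representation, and treating all vectors as flat lists (so the transposes disappear) are assumptions made for illustration.

```python
import cmath
import math
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def affine(M: Matrix, v: Vector, bias: Vector) -> List[float]:
    """Return M @ v + bias for plain Python sequences."""
    return [sum(m * x for m, x in zip(row, v)) + b for row, b in zip(M, bias)]


def generate_sub_data(c: Vector, T: int, U: Matrix, W: Matrix, V: Matrix,
                      theta_s: Vector, theta_o: Vector) -> List[List[complex]]:
    """Generate T pieces of sub-data from the first data c."""
    if T < 1:
        raise ValueError("T must be an integer >= 1")
    # First module: extract feature information from the whole first data.
    s = [math.tanh(z) for z in affine(U, c, theta_s)]
    outputs: List[List[complex]] = []
    for t in range(T):
        if t > 0:
            # Second module: update the feature information of the previous call.
            s = [math.tanh(z) for z in affine(W, s, theta_s)]
        o = affine(V, s, theta_o)
        # Output activation exp(j*2*pi*o) yields unit-modulus symbols.
        outputs.append([cmath.exp(2j * math.pi * z) for z in o])
    return outputs
```

Because T only controls the number of loop iterations, the same parameters serve any value of T, and the output for a smaller T is a prefix of the output for a larger T; this is the property that lets one model replace several fixed-length models.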
为更直观地理解本方案,请参阅图8,图8为本申请实施例提供的利用第一机器学习模型生成T个子数据的一种示意图。图8中以每调用第一机器学习模型中的模块(也即第一模块或第二模块)一次,能够得到T个子数据中的一个子数据为例,如图8所示,在将第一数据输入第一机器学习模型之后,可以通过第一模块对第一数据进行线性变换,并利用第三激活函数进行处理得到S0(也即第一数据的特征信息),对S0进行线性变换得到O0,对O0进行处理之后得到第一模块生成的一个第一子数据。
将调用第一模块对第一数据进行处理的过程中生成的S0输入第一机器学习模型的第二模块,通过第二模块对S0进行线性变换,并利用第一激活函数进行处理得到S1(也即第一数据的更新后的特征信息),对S1进行线性变换得到O1,对O1进行处理之后得到第二模块生成的第一个第二子数据。
在生成T-1个第二子数据的过程中,均将上一次调用第一机器学习模型的模块进行处理时生成的特征信息(也即St-1)输入第二模块,示例性地,若为第一次调用第一机器学习模型的第二模块,则将调用第一机器学习模型的第一模块进行处理时生成的特征信息(St-1的一个示例)输入第二模块;若为第二次至第T-1次调用第二模块,则将上一次调用第二模块进行处理时生成的特征信息(St-1的另一个示例)输入当前次调用的第二模块。通过第二模块对St-1进行线性变换,并利用第一激活函数进行处理得到St(也即第一数据的 更新后的特征信息),对St进行线性变换得到Ot,对Ot进行处理之后得到第二模块生成的第T个子数据,则可以通过调用第二模块T-1次的方式得到T-1个第二子数据,T-1个第二子数据和1个第一子数据可以组成第二数据中的T个子数据,图8中的示例仅为方便理解本方案,不用于限定本方案。
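Optionally, the first machine learning model may include multiple second modules, where at least two of the second modules use different parameters. That is, each time the first apparatus calls a second module, one piece of second sub-data is generated, but different second modules may be called in the process of generating the T-1 pieces of second sub-data.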
“采用的参数不同的两个第二模块”的含义可以包括如下任一种不同:两个第二模块中采用了相同类型的参数,但两个第二模块中采用的参数值不完全相同;或者,两个第二模块中采用的参数的类型不完全相同等,此处不做穷举。“两个第二模块采用的参数相同”代表两个第二模块不仅采用的参数的类型完全一致,而且每个参数的取值也完全一致。
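For example, the multiple second modules may include second module 1, second module 2, and second module 3, and the value of T is 8, so 7 pieces of second sub-data need to be generated. Each second module may be used to generate up to the same number (for example, 3) of pieces of second sub-data: second module 1 is used to generate the first 3 pieces of second sub-data, second module 2 is used to generate the 4th, 5th, and 6th pieces, and second module 3 is used to generate the 7th piece.
For another example, the multiple second modules may include second module 1, second module 2, and second module 3, and the value of T is 8, so 7 pieces of second sub-data need to be generated. The numbers of pieces of second sub-data generated by the individual second modules may also differ: second module 1 is used to generate the first 3 pieces of second sub-data, second module 2 is used to generate the 4th and 5th pieces, and second module 3 is used to generate the 6th and 7th pieces. It should be noted that the examples here are only intended to facilitate understanding of this solution and are not intended to limit it.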
为了进一步理解本方案,如下公开了第一装置采用多个第二模块生成T-1个第二子数据时的代码的一种示例:
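Here, for the meanings of s′_{t-1}, W, θ′s, V, and θ′o, refer to the above description; details are not repeated here. x_t represents the (t+1)-th of the T pieces of sub-data included in the second data. Different second modules use the same θs and θo but different W and V. W_{t mod τ} and V_{t mod τ} represent calling τ second modules periodically, where the τ second modules hold τ different groups of parameters. W_{t mod τ} may take the form W_0, W_1, …, W_{τ-1}, and V_{t mod τ} may take the form V_0, V_1, …, V_{τ-1}; the τ different groups of parameters are W_0 and V_0, W_1 and V_1, …, W_{τ-1} and V_{τ-1}. In one period, τ pieces of second sub-data are generated by the τ second modules, and the same τ second modules are used again in the next period. For example, the first machine learning model may include three second modules: second module 1, second module 2, and second module 3. These three second modules can then be called cyclically: one piece of second sub-data is generated by second module 1 (which uses W_1 and V_1), then one piece by second module 2 (which uses W_2 and V_2), then one piece by second module 3 (which uses W_0 and V_0), after which second module 1 is called again, and so on. It should be understood that this example is only intended to facilitate understanding of this solution and is not intended to limit it.
In the embodiments of this application, the first machine learning model may use multiple second modules, at least two of which use different parameters; that is, the T-1 pieces of second sub-data are generated by different second modules. This helps improve the match between the parameters of a second module and the second sub-data it generates, and thus helps obtain second data with better performance.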
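The periodic selection W_{t mod τ}, V_{t mod τ} described above can be sketched as follows. This is an illustrative reconstruction: the function names and the list-based parameter layout are assumptions, and the shared θs and θo follow the statement that different second modules use the same θs and θo.

```python
import cmath
import math
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def affine(M: Matrix, v: Vector, bias: Vector) -> List[float]:
    """Return M @ v + bias for plain Python sequences."""
    return [sum(m * x for m, x in zip(row, v)) + b for row, b in zip(M, bias)]


def module_schedule(T: int, tau: int) -> List[int]:
    """Index (t mod tau) of the second module used for t = 1..T-1."""
    return [t % tau for t in range(1, T)]


def second_sub_data_periodic(s0: Vector, T: int, W_list: Sequence[Matrix],
                             V_list: Sequence[Matrix], theta_s: Vector,
                             theta_o: Vector) -> List[List[complex]]:
    """Generate the T-1 pieces of second sub-data with tau alternating modules."""
    tau = len(W_list)
    if tau == 0 or len(V_list) != tau:
        raise ValueError("W_list and V_list must have the same non-zero length")
    s = list(s0)  # feature information produced by the first module
    out: List[List[complex]] = []
    for idx in module_schedule(T, tau):
        s = [math.tanh(z) for z in affine(W_list[idx], s, theta_s)]
        out.append([cmath.exp(2j * math.pi * z) for z in affine(V_list[idx], s, theta_o)])
    return out
```

With T = 8 and τ = 3, `module_schedule` selects the parameter indices 1, 2, 0, 1, 2, 0, 1, which corresponds to the cyclic order second module 1, second module 2, second module 3 in the example above.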
在另一种情况中,第一机器学习模型可以包括第一模块和至少一个第三模块,第一模块和第三模块的区别包括:第一模块的初始输入为第一数据或加扰后的第一数据,第三模块的初始输入为第一数据(或加扰后的第一数据)的特征信息。“第三模块”和“第二模块”的含义类似,可参阅上述描述理解,此处不做赘述。
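For example, one piece of the T pieces of sub-data is obtained each time the third module in the first machine learning model has been called multiple times. Step 407 may then include: the first apparatus inputs the first data (or the scrambled first data) into the first module and generates first sub-data through the first module, where the first sub-data is one of the T pieces of sub-data; for the specific implementation of this step, refer to the above description, which is not repeated here. The first apparatus calls the third module multiple times to obtain third sub-data generated by the third module, where the third sub-data is one of the T pieces of sub-data, the input of the third module includes feature information of the first data, and the feature information of the first data (or the scrambled first data) is updated multiple times in the course of calling the third module multiple times.
The "process in which the third module processes input data" is similar to the "process in which the second module processes input data". The difference is that each time the first apparatus calls the second module, the processing result generated by the second module is taken as one piece of second sub-data, whereas the first apparatus needs to call a third module multiple times, updating the feature information of the first data (or the scrambled first data) multiple times along the way, and only the processing result of the last call to the third module is taken as one piece of third sub-data. The third sub-data, the second sub-data, and the first sub-data are all sub-data included in the second data; for the meaning of sub-data, refer to the above description, which is not repeated here.
For example, the first apparatus inputs the feature information of the first data (or the scrambled first data) into the third module and processes it through the third module, where the processing includes updating the feature information of the first data. The first apparatus inputs the updated feature information of the first data (or the scrambled first data) into the third module again and processes the updated feature information through the third module again, where the processing includes updating the feature information of the first data (or the scrambled first data) once more. The first apparatus repeats the foregoing operation at least once, and when the number of times the third module has processed the feature information of the first data (or the scrambled first data) reaches a preset number, obtains one piece of third sub-data generated by the third module.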
为了进一步理解本方案,如下公开了利用一个第三模块生成一个第三子数据时采用的代码的一个示例:

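Here, for the meanings of W, θ′s, V, θ′o, and x_t, refer to the above description; details are not repeated here. s′_{t,l-1} represents the feature information of the first data generated the last time the third module was called; s′_{t,N} represents the updated feature information of the first data obtained after the feature information has been updated N times; x_t represents one piece of third sub-data generated after the third module has been called N times; and N is an integer greater than or equal to 2. This example is only intended to facilitate understanding of this solution and is not intended to limit it.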
本申请实施例中,调用第三模块多次以对第一数据的特征信息进行多次更新之后,才根据最后的第一数据的更新后的特征信息,生成一个第三子数据,在对第一数据的多次更新之后,有利于更加透彻的了解第一数据,从而生成性能更好的子数据。
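The behaviour of a third module, N feature updates followed by a single output, can be sketched as follows. This is an assumption-laden reconstruction: the update rule tanh(Ws+θs) and the output rule exp(j2π(Vs+θo)) are carried over from the description of the second module, the index convention s′_{t,l} is inferred from the notation above, and the function names are hypothetical.

```python
import cmath
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def affine(M: Matrix, v: Vector, bias: Vector) -> List[float]:
    """Return M @ v + bias for plain Python sequences."""
    return [sum(m * x for m, x in zip(row, v)) + b for row, b in zip(M, bias)]


def third_module_sub_data(s_prev: Vector, N: int, W: Matrix, V: Matrix,
                          theta_s: Vector, theta_o: Vector
                          ) -> Tuple[List[complex], List[float]]:
    """Update the feature information N times, then emit one third sub-data."""
    if N < 2:
        raise ValueError("N must be an integer >= 2")
    s = list(s_prev)                      # s'_{t,0}: features from the previous call
    for _ in range(N):                    # s'_{t,l} = tanh(W s'_{t,l-1} + theta_s)
        s = [math.tanh(z) for z in affine(W, s, theta_s)]
    x = [cmath.exp(2j * math.pi * z) for z in affine(V, s, theta_o)]
    return x, s                           # s is reused as input for the next sub-data
```

Returning the final feature vector alongside the sub-data lets the caller chain the calls, so the next third sub-data starts from the features produced by the last of the N updates.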
可选地,第一机器学习模型中的参数的尺寸与H的取值以及G的取值相关,H为第一数据的长度,G为T个子数据中每个子数据的长度。
例如,若第一机器学习模型的功能为编码,则T个子数据代表T组编码后的比特数据,G代表每组编码后的比特数据中的比特位数。又例如,若第一机器学习模型的功能是调制,则T个子数据代表T组调制后的符号,G代表每组调制后的符号中符号的个数。又例如,若第一机器学习模型的功能是编码和调制,则T个子数据代表T组调制后的符号,G代表每组调制后的符号中符号的个数。又例如,若第一机器学习模型的功能是生成参考信号,则T个子数据可以代表参考信号中T组符号,G代表每组符号中符号的个数等,应理解,此处举例仅为方便理解本方案,不用于限定本方案。
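For example, if the value of H is 12 and the value of G is 12, and the parameters used in the first machine learning model include U, W, θs, V, and θo, then U may be a 12-by-12 matrix, that is, both dimensions of U are 12; W and V have the same size as U; θs is a 1-by-12 vector, that is, its width is 1 and its length is 12; and θo has the same size as θs.
For another example, if the value of H is 6 and the value of G is 6, and the parameters used in the first machine learning model include U, W, θs, V, and θo, then U may be a 6-by-6 matrix, that is, both dimensions of U are 6; W and V have the same size as U; θs is a 1-by-6 vector, that is, its width is 1 and its length is 6; and θo has the same size as θs.
For another example, the value of H is 12 and the value of G is 6, and the parameters used in the first machine learning model may include U, W, θs, V, and θo. For an explanation of the sizes of U, W, θs, V, and θo in this case, refer to the above description; details are not repeated here. It should be noted that the examples here are only intended to facilitate understanding of this solution and are not intended to limit it.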
本申请实施例中,根据第一数据的长度以及T个子数据中每个子数据的长度来设计第一机器学习模型中参数的尺寸,有利于在满足输出要求的前提下,减少第一机器学习模型中的参数量,有利于进一步减少传输第一机器学习模型的参数所消耗的通信资源。
408、第一装置根据第二数据确定待发送的信号。
本申请实施例中,第一装置在通过第一机器学习模型生成第二数据之后,还可以根据第二数据确定待发送的信号。例如,若第一机器学习模型的功能为编码,则第二数据为编码后的数据,第一装置还需要对第二数据进行调制得到待发送的信号。
又例如,若第一机器学习模型的功能是调制,则第二数据为调制后的数据,第一装置还可以采用截断的方式对第二数据进行速率匹配,以得到该待发送的信号。
又例如,若第一机器学习模型的功能是编码和调制,则第二数据为编码和调制后的数据,第一装置还可以采用截断的方式对第二数据进行速率匹配,以得到该待发送的信号。
又例如,若第一机器学习模型的功能是生成参考信号,则第二数据为参考信号,第一装置可以将参考信号确定为待发送的信号。
需要说明的是,第一装置在得到第二数据,根据第二数据确定待发送的信号的过程中还可以执行其他操作,此处不做限定。
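Rate matching by truncation, as mentioned in step 408, can be sketched as follows. This is only an illustration: the function name, the flattening order of the T pieces of sub-data, and the parameter `num_symbols` (the number of symbols that can actually be sent) are assumptions that this application does not specify.

```python
from typing import List, Sequence


def rate_match_by_truncation(sub_data: Sequence[Sequence[complex]],
                             num_symbols: int) -> List[complex]:
    """Flatten the T pieces of sub-data and keep only the first num_symbols."""
    flat = [sym for group in sub_data for sym in group]
    if not 0 <= num_symbols <= len(flat):
        raise ValueError("num_symbols must be between 0 and the number of generated symbols")
    return flat[:num_symbols]
```

Truncation pairs naturally with a model whose output length is set by T: the first apparatus can generate slightly more symbols than needed and drop the excess instead of switching to a different model.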
409、第一装置向第二装置发送该待发送的信号。
410、第二装置获取第二数据,第二数据包括T个子数据,T为大于或等于1的整数,第二数据由第一装置中的第一机器学习模型生成,第一机器学习模型包括一个或多个模块,每调用第一机器学习模型中的模块至少一次得到一个子数据。
本申请实施例中,在一些应用场景中,若第一机器学习模型的功能为编码和/或调制,第二装置在获取到与该待发送的信号对应的接收信号之后,可以对接收信号进行去噪,并从去噪后的接收信号中获取接收的第二数据(也即估计的第二数据),对于第二数据的含义可以参阅上述描述,此处不做赘述。
例如,若第二数据为编码后的数据,也即第一机器学习模型的功能为编码,则对去噪后的接收信号进行解调制之后得到接收的第二数据。又例如,若第二数据为调制后的数据,也即第一机器学习模型的功能为调制,或者,第一机器学习模型的功能为编码和调制,则可以直接将去噪后的接收信号确定为第二数据。
在另一些应用场景中,若第一机器学习模型的功能为生成参考信号,则第二装置在获取到与该待发送的信号对应的接收信号(也即接收到的第二数据)之后,可以根据接收信号确定估计的信道信息。
411、第二装置根据第二数据,生成第一数据。
本申请实施例中,第二装置在获取到接收的第二数据之后,可以根据接收的第二数据,生成估计的第一数据。可选地,第二装置可以获取第一装置中的第一机器学习模型采用的一组参数,前述步骤的具体实现方式可以参阅步骤401中的描述。第二装置可以根据第一装置中的第一机器学习模型采用的一组参数,对第二数据进行解调制和/或解码,以生成估 计的第一数据。
示例性地,在一种实现方式中,第二装置根据第一装置中的第一机器学习模型采用的一组参数,对第二数据进行解调制和/或解码,可以包括:第二装置在获取到第一机器学习模型采用的一组参数之后,可以确定第一装置利用第一机器学习模型对第一数据执行了哪些操作,则可以采用估计算法对第二数据执行逆操作,以实现对第二数据的解调制和/或解码,前述逆操作为利用第一机器学习模型对第一数据执行的操作的逆操作。
例如,估计算法可以为如下任一种:最大似然估计算法、最大后验概率估计或其他类型的估计算法等等,此处举例仅为方便理解本方案,不用于限定本方案。
在另一种实现方式中,第二装置根据第一装置中的第一机器学习模型采用的一组参数,对第二数据进行解调制和/或解码,可以包括:第二装置在获取到第一机器学习模型采用的一组参数之后,可以获取与第一机器学习模型对应的一个第二机器学习模型,将第二数据输入第二机器学习模型中,通过第二机器学习模型对第二数据进行解调制和/或解码。需要说明的是,第二装置还可以采用其他方式得到估计的第一数据,此处举例仅为证明本方案的可实现性,不用于限定本方案。
二、训练阶段
具体的,请参阅图9,图9为本申请实施例提供的模型的训练方法的一种示意图,如图9所示,该模型的训练方法包括步骤901至903。
901、从训练数据集合中获取训练数据,其中,训练数据用于得到第一数据和T的取值,T为大于或等于1的整数,训练数据集合中至少两个训练数据包括的T的取值不同。
本申请实施例中,训练装置中可以存在训练数据集合,在每次训练过程中,训练装置可以从训练数据集合中获取一个或多个训练数据。其中,每个训练数据中可以包括T的取值,T代表第一机器学习模型的输出数据包括的子数据的数量,训练数据集合中至少两个训练数据包括的T的取值不同。
示例性地,每个训练数据中还可以包括待处理数据,训练装置可以直接将前述待处理数据确定为第一数据;或者,也可以基于前述待处理数据采用图4对应实施例中步骤403至406中示出的方式得到第一数据,具体实现方式可以参阅上述图4对应实施例中的描述,步骤901中名词的含义也可以结合图4对应实施例中的描述进行理解,此处均不做赘述。
902、将第一数据输入第一机器学习模型,得到第一机器学习模型生成的第二数据,第二数据包括T个子数据,其中,第一机器学习模型包括多个模块,每调用第一机器学习模型中的模块至少一次,得到模块生成的一个子数据。
本申请实施例中,训练装置将第一数据输入第一机器学习模型,得到第一机器学习模型生成的第二数据,步骤902的具体实现方式以及步骤902中名词的含义均可以参阅图4对应实施例中步骤407中的描述,此处不做赘述。需要说明的是,步骤901和902中的训练装置可以为终端设备,也可以为基站,具体可以根据实际情况灵活设定。
903、基于第二数据和损失函数,对第一机器学习模型进行训练,得到训练后的第一机器学习模型。
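In this embodiment of the application, after obtaining the second data, the training apparatus may train the first machine learning model based on the second data and a loss function to obtain the trained first machine learning model. Note that step 903 may be performed by a single device or by different devices.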
在一种实现方式中,步骤903由同一设备执行,则步骤903中的训练装置可以为终端设备,也可以为基站。在一种情况下,第一机器学习模型的功能包括编码和/或调制,步骤903包括:对与待发送的信号对应的接收信号进行去噪后得到去噪后的接收信号,对去噪后的接收信号进行解调制和/或解码以得到与待处理数据对应的估计数据;根据估计数据和第一损失函数,对第一机器学习模型进行训练,第一损失函数指示估计数据和待处理数据之间的相似度。
具体的,在一种实现方式中,第一机器学习模型的功能为调制,训练装置在得到第二数据(也即调制后的数据)之后,可以直接将第二数据确定为待发送的信号;训练装置获取与该待发送的信号对应的接收信号,对该接收信号进行去噪后得到去噪后的接收信号,对去噪后的接收信号进行解调制以得到与待处理数据对应的估计数据。训练装置生成待处理数据和估计数据之间的相似度,也即得到第一损失函数的函数值,利用第一损失函数的函数值对第一机器学习模型的权重参数进行更新,实现了对第一机器学习模型的一次训练。示例性地,第一损失函数可以为交叉熵损失函数、L1损失函数或其他类型的损失函数等等,具体可以结合实际应用场景灵活确定,本申请实施例中不做限定。
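The two loss functions named above can be sketched as follows. This is a hedged example: it assumes the to-be-processed data is a bit sequence and that the estimated data is either a real-valued sequence (for the L1 loss) or a sequence of per-bit probabilities (for the cross-entropy loss). The function names and the clipping constant `eps` are assumptions.

```python
import math


def l1_loss(pending, estimated):
    """Mean absolute difference between to-be-processed and estimated data."""
    return sum(abs(p - e) for p, e in zip(pending, estimated)) / len(pending)


def binary_cross_entropy(pending_bits, estimated_probs, eps=1e-12):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for bit, prob in zip(pending_bits, estimated_probs):
        prob = min(max(prob, eps), 1.0 - eps)
        total -= bit * math.log(prob) + (1 - bit) * math.log(1.0 - prob)
    return total / len(pending_bits)


pending = [1, 0, 1, 1]
good_estimate = [0.9, 0.1, 0.8, 0.95]
poor_estimate = [0.4, 0.6, 0.5, 0.3]
good_loss = binary_cross_entropy(pending, good_estimate)
poor_loss = binary_cross_entropy(pending, poor_estimate)
```

A lower value of either loss means the estimated data is closer to the to-be-processed data, which is the direction in which the weight update pushes the model.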
示例性地,训练装置获取与该待发送的信号对应的接收信号可以包括:训练装置将该待发送的信号与信道矩阵相乘,并将前述相乘的结果与噪声相加,得到该接收信号,前述步骤是为了模拟该待发送的信号经过信道传输的过程。
需要说明的是,在对第一机器学习模型进行多次训练的过程中,可以均采用相同的信道矩阵和噪声,也可以采用不同的信道矩阵和/或不同的噪声,不同的信道矩阵和/或不同的噪声用于模拟不同的信噪比的信道环境。
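The channel simulation described above (multiply the signal by a channel matrix, then add noise) can be sketched as follows. The function name `simulate_channel`, the use of complex Gaussian noise, and the `noise_std` parameter are assumptions for illustration; the application does not fix a noise model.

```python
import random


def simulate_channel(signal, channel_matrix, noise_std, rng):
    """Return channel_matrix @ signal + noise, using plain Python lists.

    Assumption: noise is complex Gaussian with independent real and imaginary
    parts, each with standard deviation `noise_std`.
    """
    received = []
    for row in channel_matrix:
        if len(row) != len(signal):
            raise ValueError("channel matrix width must match signal length")
        clean = sum(h * x for h, x in zip(row, signal))
        noise = complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
        received.append(clean + noise)
    return received


signal = [1 + 1j, -1 + 1j]
identity = [[1, 0], [0, 1]]
noiseless = simulate_channel(signal, identity, 0.0, random.Random(0))
noisy = simulate_channel(signal, identity, 0.1, random.Random(1))
```

Varying `channel_matrix` and `noise_std` across training iterations corresponds to simulating channel environments with different signal-to-noise ratios.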
在另一种实现方式中,第一机器学习模型的功能为编码,训练装置在得到第二数据(也即编码后的数据)之后,也可以对第二数据进行调制得到待发送的信号;训练装置获取与该待发送的信号对应的接收信号,对接收信号进行去噪后得到去噪后的接收信号,并对该去噪后的接收信号进行解调制和解码以得到与训练数据中的待处理数据对应的估计数据。训练装置根据待处理数据、估计数据和第一损失函数对第一机器学习模型进行训练;前述步骤以及“训练装置获取与该待发送的信号对应的接收信号”的具体实现方式可以参阅上述描述,此处不做赘述。
在另一种实现方式中,第一机器学习模型的功能为调制,训练装置在得到第二数据(也即调制后的数据)之后,可以采用截断的方式对第二数据进行速率匹配,以得到该待发送的信号;训练装置获取与该待发送的信号对应的接收信号,对接收信号进行去噪后得到去噪后的接收信号,并对该去噪后的接收信号进行解调制以得到与训练数据中的待处理数据对应的估计数据。训练装置执行的后续步骤可以参阅上述描述,此处不做赘述。
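In another implementation, the function of the first machine learning model is encoding and modulation. After obtaining the second data (that is, the encoded and modulated data), the training apparatus may rate-match the second data by truncation to obtain the signal to be sent. The training apparatus obtains the received signal corresponding to the signal to be sent, denoises it, and demodulates and decodes the denoised received signal to obtain the estimated data corresponding to the to-be-processed data in the training data. For the subsequent steps performed by the training apparatus, see the description above; details are not repeated here.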
在另一种情况中,第一机器学习模型的功能为生成参考信号,则第二数据为参考信号。训练装置可以生成与参考信号对应的接收的参考信号,根据与参考信号对应的接收的参考信号,生成预测的信道信息。其中,“训练装置生成与参考信号对应的接收的参考信号”的具体实现方式与“训练装置生成与待发送的信号对应的接收信号”的具体实现方式类似,区别在于将“待发送的信号”替换为“参考信号”,将“接收信号”替换为“接收的参考信号”,此处不再赘述。
训练装置根据第二损失函数,对第一机器学习模型进行训练,第二损失函数指示预测的信道信息和正确的信道信息之间的相似度。示例性地,训练装置计算预测的信道信息和正确的信道信息之间的相似度,以得到第二损失函数的函数值,利用第二损失函数的函数值对第一机器学习模型的权重参数进行更新,实现了对第一机器学习模型的一次训练。
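In another implementation, step 903 is performed jointly by two training apparatuses (for ease of description, referred to below as the first training apparatus and the second training apparatus). For example, the first training apparatus may be a base station and the second training apparatus a terminal device that requests the base station to retrain the first machine learning model; as another example, the first training apparatus may be the first apparatus and the second training apparatus may be the second apparatus.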
在一种情况中,第一机器学习模型的功能包括编码和/或调制,相比于步骤903由同一个设备执行,本实现方式的区别之处主要在于“获取与待发送的信号对应的接收信号”的实现方式,示例性地,第一训练装置在基于第二数据得到待发送的信号之后,将待发送的信号发送给第二训练装置,第二训练装置接收到了该接收信号,进而由第二训练装置获取与待处理数据对应的估计数据;根据估计数据和第一损失函数,对第一机器学习模型进行训练。
本申请实施例中,提供了在第一机器学习模型的功能包括编码和/或调制的情况下,对第一机器学习模型进行训练的具体实现方式,降低了本方案的实现难度,且损失函数采用的是估计数据和待处理数据之间的相似度,也即损失函数的目标是获得性能更好的估计数据,该损失函数更加符合装置之间发送数据时的实际需求,则训练后的第一机器学习模型输出的第二数据更符合实际的需求。
在另一种情况中,第一机器学习模型的功能为生成参考信号,相比于步骤903由同一个设备执行,本实现方式的区别之处主要在于“获取与参考信号对应的接收的参考信号”的实现方式,示例性地,第一训练装置在得到参考信号之后,将参考信号发送给第二训练装置;第二训练装置获取到接收的参考信号之后,根据接收的参考信号生成预测的信道信息;第二训练装置根据损失函数,对第一机器学习模型进行训练。
本申请实施例中,还提供了在第一机器学习模型的功能为生成参考信号的情况下,对第一机器学习模型进行训练的具体实现方式,扩展了本方案的应用场景,提高了本方案的实现灵活性。
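The training apparatus repeats steps 901 to 903 to iteratively train the first machine learning model until the convergence condition of the first loss function is met, yielding one set of trained parameters for the first machine learning model.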
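The iterate-until-convergence loop can be sketched as follows. The application does not specify the convergence condition, so this example assumes one: training stops when the loss changes by less than `tolerance` between two consecutive iterations. `train_step` stands for one pass through steps 901 to 903, and `toy_step` is a hypothetical scalar stand-in, not the model of this application.

```python
def train_until_converged(train_step, tolerance=1e-6, max_iterations=10_000):
    """Call `train_step` repeatedly until the loss stops changing.

    `train_step` performs one update and returns the current loss value.
    Returns (number of iterations run, last loss).
    """
    previous_loss = None
    for iteration in range(1, max_iterations + 1):
        loss = train_step()
        if previous_loss is not None and abs(previous_loss - loss) < tolerance:
            return iteration, loss
        previous_loss = loss
    return max_iterations, previous_loss


# Toy stand-in for steps 901 to 903: gradient descent on the loss (w - 3)^2.
state = {"w": 0.0}


def toy_step(learning_rate=0.1):
    gradient = 2.0 * (state["w"] - 3.0)
    state["w"] -= learning_rate * gradient
    return (state["w"] - 3.0) ** 2


iterations, final_loss = train_until_converged(toy_step)
```

Running the same loop once per training data set would give the multiple sets of trained parameters mentioned in the text.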
此外,训练装置还可以采用多个不同的训练数据集合,分别对第一机器学习模型进行训练,从而得到第一机器学习模型的多组训练后的参数。
本申请实施例中,不仅提供了第一机器学习模型推理阶段的实现方式,还提供了第一机器学习模型在训练阶段的实现方式,降低了本方案的实现难度。
在上述图1至图9对应的实施例的基础上,请参阅图10,图10为本申请实施例提供的数据处理装置的一种示意图。该数据处理装置1000可以实现上述方法实施例中第一装置的功能,因此也能实现上述方法实施例所具备的有益效果。数据处理装置1000可以包括处理模块1001;其中,处理模块1001,用于获取T的取值,T为大于或等于1的整数,T代表第一机器学习模型的输出数据包括的子数据的个数;处理模块1001,还用于将第一数据输入第一机器学习模型,得到第一机器学习模型生成的第二数据,第二数据包括T个子数据,其中,第一机器学习模型包括一个或多个模块,每调用第一机器学习模型中的模块至少一次得到一个子数据。
可选地,第一机器学习模型的功能包括如下任一项或多项的组合:编码、调制、生成参考信号。
可选地,第一机器学习模型中的多个模块包括第一模块和至少一个第二模块,处理模块1001,具体用于:将第一数据输入第一模块,得到第一模块生成的第一子数据,第一子数据为T个子数据中的一个;将第一特征信息输入第二模块,得到第二模块生成的第二子数据,其中,第一特征信息包括上一次调用第一机器学习模型中的模块进行数据处理时生成的特征信息,第二子数据为T个子数据中的一个,上一次调用的第一机器学习模型中的模块为第一模块或者第二模块。
可选地,第一机器学习模型中的多个模块包括第一模块和至少一个第三模块,处理模块1001,具体用于:将第一数据输入第一模块,通过第一模块生成第一子数据,第一子数据为T个子数据中的一个,通过第一模块生成第一子数据的过程中包括对第一数据进行特征提取;调用第三模块多次,得到第三模块生成的第三子数据,第三子数据为T个子数据中的一个,其中,第三模块的输入包括第一数据的特征信息,在调用第三模块多次的过程中对第一数据的特征信息进行多次更新。
可选地,处理模块1001,具体用于:通过第二模块对第一特征信息进行线性变换,并采用第一激活函数进行处理,得到变换后的特征信息;对变换后的特征信息进行线性变换,并采用第二激活函数进行处理,得到第二子数据。
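The interplay of the first module and the second module can be sketched as a small recurrent generator. This is a toy sketch under stated assumptions, not the architecture of this application: the class `ToyFirstModel`, the feature width `F`, the random initialisation, and the choice of `tanh` and sigmoid as the first and second activation functions are all hypothetical. The sketch also assumes that the transformed feature information is what gets passed to the next call.

```python
import math
import random


def linear(weights, bias, vector):
    """Affine map weights @ vector + bias on plain Python lists."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def random_layer(rng, out_dim, in_dim):
    """Hypothetical initialisation: uniform weights, zero bias."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


class ToyFirstModel:
    """First data of length H in, T sub-data of length G out."""

    def __init__(self, H, F, G, seed=0):
        rng = random.Random(seed)
        self.extract = random_layer(rng, F, H)        # first module: features
        self.first_head = random_layer(rng, G, F)     # first module: sub-data
        self.second_hidden = random_layer(rng, F, F)  # second module: layer 1
        self.second_head = random_layer(rng, G, F)    # second module: layer 2

    def first_module(self, first_data):
        feature = [math.tanh(v) for v in linear(*self.extract, first_data)]
        sub_data = [sigmoid(v) for v in linear(*self.first_head, feature)]
        return sub_data, feature

    def second_module(self, feature):
        # Linear transform + first activation, then linear transform + second activation.
        transformed = [math.tanh(v) for v in linear(*self.second_hidden, feature)]
        sub_data = [sigmoid(v) for v in linear(*self.second_head, transformed)]
        return sub_data, transformed

    def generate(self, first_data, T):
        if T < 1:
            raise ValueError("T must be an integer >= 1")
        sub_data, feature = self.first_module(first_data)
        second_data = [sub_data]
        for _ in range(T - 1):
            # Each call consumes the feature information of the previous call.
            sub_data, feature = self.second_module(feature)
            second_data.append(sub_data)
        return second_data


model = ToyFirstModel(H=4, F=3, G=2)
second_data = model.generate([1.0, 0.0, 1.0, 0.0], T=3)
```

Because each sub-data depends only on earlier calls, the output for a smaller T is a prefix of the output for a larger T, which is what lets one model serve several output lengths.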
可选地,至少一个第二模块包括多个第二模块,其中,多个第二模块中至少两个第二模块采用的参数不同。
可选地,处理模块1001,还用于获取待处理数据和H的取值,H为大于或等于1的整数,H指示第一数据的长度;处理模块1001,还用于若待处理数据的长度小于H,则对待处理数据进行填充,得到第一数据,第一数据的长度为H。
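Optionally, the first data includes the to-be-processed data and padding data. The padding data includes first identification information, which identifies the value of T and/or the value of K, where K is the length of the to-be-processed data and is an integer greater than or equal to 1.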
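Padding the to-be-processed data to length H, with identification information carried in the padding, can be sketched as follows. The layout is an assumption made for illustration: the padding starts with a fixed-width binary encoding of K and is filled out with zeros. The function name `pad_first_data` and the parameter `id_width` are hypothetical.

```python
def pad_first_data(pending_bits, H, id_width=4):
    """Pad `pending_bits` (length K) to length H.

    Assumed layout: [pending data | K as id_width bits, MSB first | zeros].
    """
    K = len(pending_bits)
    if K > H:
        raise ValueError("to-be-processed data is longer than H")
    if K == H:
        return list(pending_bits)  # nothing to pad
    pad_len = H - K
    if pad_len < id_width or K >= 2 ** id_width:
        raise ValueError("padding is too short to carry the identifier")
    id_bits = [(K >> shift) & 1 for shift in range(id_width - 1, -1, -1)]
    return list(pending_bits) + id_bits + [0] * (pad_len - id_width)


first_data = pad_first_data([1, 0, 1], H=10)
```

With this layout a receiver that knows `id_width` could in principle scan for the identifier to recover K; how the identifier is actually encoded is not specified in the text.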
可选地,第一机器学习模型中的参数的尺寸与H的取值以及G的取值相关,G为每个子数据的长度。
可选地,与第一机器学习模型对应的参数携带于如下一种或多种信息中:下行控制信息DCI、上行控制信息UCI、侧行链路控制信息SCI、无线资源控制RRC信令或者媒体访问控制的控制元素MAC CE;和/或,参数的标识信息携带于如下任一种或多种信息中:DCI、UCI、SCI、RRC信令、MAC CE、物理广播信道PBCH或者物理随机接入信道PRACH。
可选地,数据处理装置1000应用于第一装置,第二装置为第二数据的接收端,第二装置中有与第一机器学习模型对应的多组参数以及每组参数的标识信息,请参阅图10,数据处理装置1000还可以包括:收发模块1002,用于向第二装置发送第二标识信息,第二标识信息用于指示第一装置中的第一机器学习模型采用的一组参数。
请参阅图11,图11为本申请实施例提供的数据处理装置的另一种示意图。该数据处理装置1100可以实现上述方法实施例中第二装置的功能,因此也能实现上述方法实施例所具备的有益效果。数据处理装置1100可以包括处理模块1101;其中,处理模块1101用于获取第二数据;根据第二数据,生成第一数据。其中,第二数据包括T个子数据,T为大于或等于1的整数,第二数据由第一装置中的第一机器学习模型生成,第一机器学习模型包括一个或多个模块,每调用第一机器学习模型中的模块至少一次得到一个子数据。
在一种可能实现方式中,与第一机器学习模型对应的参数携带于如下一种或多种信息中:下行控制信息DCI、上行控制信息UCI、侧行链路控制信息SCI、无线资源控制RRC信令或者媒体访问控制的控制元素MAC CE;和/或,参数的标识信息携带于如下任一种或多种信息中:DCI、UCI、SCI、RRC信令、MAC CE、物理广播信道PBCH或者物理随机接入信道PRACH。
在一种可能实现方式中,该数据处理装置应用于第二装置,第二装置中有第三数据,第三数据包括与第一机器学习模型对应的多组参数以及每组参数的标识信息,请参阅图11,数据处理装置1100还可以包括:收发模块1102,用于接收第一装置发送的第二标识信息;处理模块1101,还用于根据第二标识信息和第三数据,确定第一装置中的第一机器学习模型采用的一组参数。处理模块1101,具体用于根据第一装置中的第一机器学习模型采用的一组参数和第二数据,生成第一数据。
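The lookup of a parameter set from the second identification information can be sketched as a keyed table. The structure of `third_data`, the integer identifiers, and the placeholder parameter values below are assumptions for illustration only.

```python
def select_parameters(third_data, second_identification):
    """Return the parameter set that the identifier points to."""
    try:
        return third_data[second_identification]
    except KeyError:
        raise ValueError(
            f"unknown parameter-set identifier: {second_identification!r}"
        ) from None


# Hypothetical third data: identifier -> one set of model parameters.
third_data = {
    0: {"H": 64, "G": 8},
    1: {"H": 128, "G": 16},
}
chosen = select_parameters(third_data, 1)
```

Sending only the identifier keeps the signalling small, since the parameter sets themselves are already stored at the second apparatus.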
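Refer to FIG. 12, a schematic diagram of a model training apparatus provided in an embodiment of this application. The model training apparatus 1200 can implement the functions of the model training apparatus in the foregoing method embodiments and therefore also achieves the beneficial effects of those embodiments. The model training apparatus 1200 may include a processing module 1201. The processing module 1201 is configured to obtain training data from a training data set, where the training data is used to obtain first data and a value of T, T is an integer greater than or equal to 1, and at least two pieces of training data in the training data set include different values of T. The processing module 1201 is further configured to input the first data into the first machine learning model to obtain second data generated by the first machine learning model, where the second data includes T pieces of sub-data, the first machine learning model includes multiple modules, and each time a module in the first machine learning model is invoked at least once, one piece of sub-data generated by that module is obtained. The processing module 1201 is further configured to train the first machine learning model based on the second data and a loss function to obtain the trained first machine learning model.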
在一种可能实现方式中,第一机器学习模型的功能包括如下任一项或多项的组合:编码、调制或生成参考信号。
在一种可能实现方式中,第一机器学习模型中的多个模块包括第一模块和至少一个第二模块,处理模块1201,具体用于:将第一数据输入第一模块,得到第一模块生成的第一子数据,第一子数据为T个子数据中的一个;将第一特征信息输入第二模块,得到第二模块生成的第二子数据,其中,第一特征信息包括上一次调用第一机器学习模型中的模块进行数据处理时生成的特征信息,第二子数据为T个子数据中的一个,上一次调用的第一机器学习模型中的模块为第一模块或者第二模块。
在一种可能实现方式中,处理模块1201,还用于从训练数据中获取待处理数据;处理模块1201,还用于获取H的取值,H为大于或等于1的整数,H指示第一数据的长度;处理模块1201,还用于若待处理数据的长度小于H,则对待处理数据进行填充,得到第一数据,第一数据的长度为H。
在一种可能实现方式中,在第一机器学习模型的功能包括编码和/或调制的情况下,第二数据用于确定待发送的信号,处理模块1201,具体用于:对与待发送的信号对应的接收信号进行解调制和/或解码以得到与待处理数据对应的估计数据;根据估计数据和损失函数,对第一机器学习模型进行训练,损失函数指示估计数据和待处理数据之间的相似度。
在一种可能实现方式中,在第二数据为参考信号的情况下,基于第二数据和损失函数,处理模块1201,具体用于:根据与参考信号对应的接收的参考信号,生成预测的信道信息;根据损失函数,对第一机器学习模型进行训练,损失函数指示预测的信道信息和正确的信道信息之间的相似度。
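Refer to FIG. 13, a schematic diagram of an apparatus provided in an embodiment of this application. The communication apparatus 1300 may specifically be the apparatus acting as a terminal device in the foregoing embodiments; in the example shown in FIG. 13, the terminal device is implemented by a terminal device (or by a component in a terminal device).
FIG. 13 shows one possible logical structure of the communication apparatus 1300, which may include, but is not limited to, at least one processor 1301 and a communication port 1302.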
可选地,该装置还可以包括存储器1303、总线1304中的至少一个,在本申请的实施例中,该至少一个处理器1301用于对通信装置1300的动作进行控制处理。
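In addition, the processor 1301 may be a central processing unit, a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various example logical blocks, modules, and circuits described in connection with the disclosure of this application. The processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a digital signal processor and a microprocessor. A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.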
需要说明的是,当上述实施例中所涉及的第一装置、第二装置或训练设备具体表现为终端设备时,可以采用图13中示出的装置1300实现前述方法实施例中终端设备所实现的步骤,图13所示装置1300执行前述步骤的具体实现方式,均可以参考前述方法实施例中的叙述,此处不再一一赘述。
请参阅图14,图14为本申请的实施例提供的装置的另一种示意图。该装置1400具体可以为上述实施例中的作为网络设备的装置,图14所示示例为网络设备通过网络设备(或者网络设备中的部件)实现;也即当上述实施例中所涉及的第一装置、第二装置或训练设备具体表现为网络设备时,可以采用图14中示出的装置1400实现;示例性地,当第一装置、第二装置或训练设备为基站时,可以采用图14中示出的装置1400实现。
其中,该通信装置的结构可以参考图14所示的结构。装置1400包括至少一个处理器1411以及至少一个网络接口1412。进一步可选地,该通信装置还包括至少一个存储器1414、至少一个收发器1413和一个或多个天线1415。处理器1411、存储器1414、收发器1413和网络接口1412相连,例如通过总线相连,在本申请实施例中,该连接可包括各类接口、传输线或总线等,本实施例对此不做限定。天线1415与收发器1413相连。网络接口1412用于使得通信装置通过通信链路,与其它通信设备通信。例如网络接口1412可以包括通信装置与核心网设备之间的网络接口,例如S1接口,网络接口可以包括通信装置和其他通信装置(例如其他网络设备或者核心网设备)之间的网络接口,例如X2或者Xn接口。
处理器1411主要用于对通信协议以及通信数据进行处理,以及对整个通信装置进行控制,执行软件程序,处理软件程序的数据,例如用于支持通信装置执行实施例中所描述的动作。通信装置可以包括基带处理器和中央处理器,基带处理器主要用于对通信协议以及通信数据进行处理,中央处理器主要用于对整个终端设备进行控制,执行软件程序,处理软件程序的数据。图14中的处理器1411可以集成基带处理器和中央处理器的功能,本领域技术人员可以理解,基带处理器和中央处理器也可以是各自独立的处理器,通过总线等技术互联。本领域技术人员可以理解,终端设备可以包括多个基带处理器以适应不同的网络制式,终端设备可以包括多个中央处理器以增强其处理能力,终端设备的各个部件可以通过各种总线连接。该基带处理器也可以表述为基带处理电路或者基带处理芯片。该中央处理器也可以表述为中央处理电路或者中央处理芯片。对通信协议以及通信数据进行处理的功能可以内置在处理器中,也可以以软件程序的形式存储在存储器中,由处理器执行软件程序以实现基带处理功能。
存储器主要用于存储软件程序和数据。存储器1414可以是独立存在,与处理器1411相连。可选地,存储器1414可以和处理器1411集成在一起,例如集成在一个芯片之内。其中,存储器1414能够存储执行本申请实施例的技术方案的程序代码,并由处理器1411来控制执行,被执行的各类计算机程序代码也可被视为是处理器1411的驱动程序。
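FIG. 14 shows only one memory and one processor. An actual terminal device may have multiple processors and multiple memories. A memory may also be called a storage medium or a storage device. The memory may be a storage element on the same chip as the processor, that is, an on-chip storage element, or it may be a separate storage element; this is not limited in the embodiments of this application.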
收发器1413可以用于支持通信装置与终端之间射频信号的接收或者发送,收发器1413可以与天线1415相连。收发器1413包括发射机Tx和接收机Rx。具体地,一个或多个天线1415可以接收射频信号,该收发器1413的接收机Rx用于从天线接收该射频信号,并将射频信号转换为数字基带信号或数字中频信号,并将该数字基带信号或数字中频信号提供给该处理器1411,以便处理器1411对该数字基带信号或数字中频信号做进一步的处理,例如解调处理和译码处理。此外,收发器1413中的发射机Tx还用于从处理器1411接收经过调制的数字基带信号或数字中频信号,并将该经过调制的数字基带信号或数字中频信号转换为射频信号,并通过一个或多个天线1415发送该射频信号。具体地,接收机Rx可以选择性地对射频信号进行一级或多级下混频处理和模数转换处理以得到数字基带信号或数字中频信号,该下混频处理和模数转换处理的先后顺序是可调整的。发射机Tx可以选择性地对经过调制的数字基带信号或数字中频信号时进行一级或多级上混频处理和数模转换处理以得到射频信号,该上混频处理和数模转换处理的先后顺序是可调整的。数字基带信号和数字中频信号可以统称为数字信号。
收发器1413也可以称为收发单元、收发机、收发装置等。可选地,可以将收发单元中用于实现接收功能的器件视为接收单元,将收发单元中用于实现发送功能的器件视为发送单元,即收发单元包括接收单元和发送单元,接收单元也可以称为接收机、输入口、接收电路等,发送单元可以称为发射机、发射器或者发射电路等。
示例性地,当上述实施例中所涉及的第一装置、第二装置或训练设备具体表现为基站时,可以采用图14中示出的装置1400实现前述方法实施例中基站所实现的步骤,图14所示装置1400执行前述步骤的具体实现方式,均可以参考前述方法实施例中的叙述,此处不再一一赘述。
本申请实施例中还提供一种计算机可读存储介质,该计算机可读存储介质中存储有用于进行信号处理的程序,当其在计算机上运行时,使得计算机执行如前述图3至图8所示实施例描述的方法中第一装置所执行的步骤,或者,使得计算机执行如前述图3至图8所示实施例描述的方法中第二装置所执行的步骤,或者,使得计算机执行如前述图9所示实施例描述的方法中训练设备所执行的步骤。
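An embodiment of this application further provides a computer program product that, when run on a computer, causes the computer to perform the steps performed by the first apparatus in the methods described in the embodiments shown in FIG. 3 to FIG. 8, or the steps performed by the second apparatus in the methods described in the embodiments shown in FIG. 3 to FIG. 8, or the steps performed by the training device in the method described in the embodiment shown in FIG. 9.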
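The first apparatus, second apparatus, training device, data processing apparatus, or model training apparatus provided in the embodiments of this application may specifically be a chip. The chip includes a processing unit and a communication unit; the processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, a pin, or a circuit. The processing unit can execute computer-executable instructions stored in a storage unit so that the chip performs the data processing method described in the embodiments shown in FIG. 3 to FIG. 8, or the model training method described in the embodiment shown in FIG. 9. Optionally, the storage unit is a storage unit inside the chip, such as a register or a cache; the storage unit may also be a storage unit located outside the chip within the radio access device, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM).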
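Specifically, refer to FIG. 15, a schematic structural diagram of a chip provided in an embodiment of this application. The chip may take the form of a neural network processing unit NPU 150. The NPU 150 is mounted on a host CPU as a coprocessor, and the host CPU assigns tasks to it. The core of the NPU is the operation circuit 1503; the controller 1504 controls the operation circuit 1503 to fetch matrix data from memory and perform multiplication.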
在一些实现中,运算电路1503内部包括多个处理单元(Process Engine,PE)。在一些实现中,运算电路1503是二维脉动阵列。运算电路1503还可以是一维脉动阵列或者能够执行例如乘法和加法这样的数学运算的其它电子线路。在一些实现中,运算电路1503是通用的矩阵处理器。
举例来说,假设有输入矩阵A,权重矩阵B,输出矩阵C。运算电路从权重存储器1502中取矩阵B相应的数据,并缓存在运算电路中每一个PE上。运算电路从输入存储器1501中取矩阵A数据与矩阵B进行矩阵运算,得到的矩阵的部分结果或最终结果,保存在累加器(accumulator)1508中。
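The role of the accumulator in the matrix example above can be sketched in software. This is only an analogy under stated assumptions: the inner dimension is processed in tiles of size `tile`, and each tile contributes a partial result that is added into the accumulator. The function name and the tiling scheme are hypothetical and do not describe the actual circuit.

```python
def matmul_with_accumulator(A, B, tile=2):
    """Compute C = A @ B by accumulating partial results tile by tile."""
    rows, inner, cols = len(A), len(B), len(B[0])
    if any(len(row) != inner for row in A):
        raise ValueError("inner dimensions of A and B do not match")
    accumulator = [[0] * cols for _ in range(rows)]
    for start in range(0, inner, tile):
        stop = min(start + tile, inner)
        for i in range(rows):
            for j in range(cols):
                # Partial result for this tile, added to what is already stored.
                accumulator[i][j] += sum(A[i][k] * B[k][j]
                                         for k in range(start, stop))
    return accumulator


A = [[1, 2, 3], [4, 5, 6]]        # input matrix
B = [[7, 8], [9, 10], [11, 12]]   # weight matrix
C = matmul_with_accumulator(A, B)
```

After the last tile the accumulator holds the final result; before that it holds the partial results that the text refers to.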
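The unified memory 1506 stores input data and output data. Weight data is moved into the weight memory 1502 directly by the direct memory access controller (DMAC) 1505. Input data is likewise moved into the unified memory 1506 by the DMAC.
The bus interface unit (BIU) 1510 handles the interaction between the AXI bus and both the DMAC and the instruction fetch buffer (IFB) 1509.
Through the bus interface unit 1510, the instruction fetch buffer 1509 obtains instructions from the external memory, and the direct memory access controller 1505 obtains the raw data of the input matrix A or the weight matrix B from the external memory.
The DMAC mainly moves input data from the external DDR memory into the unified memory 1506, moves weight data into the weight memory 1502, or moves input data into the input memory 1501.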
向量计算单元1507包括多个运算处理单元,在需要的情况下,对运算电路的输出做进一步处理,如向量乘,向量加,指数运算,对数运算,大小比较等等。主要用于神经网络中非卷积/全连接层网络计算,如Batch Normalization(批归一化),像素级求和,对特征平面进行上采样等。
在一些实现中,向量计算单元1507能将经处理的输出的向量存储到统一存储器1506。例如,向量计算单元1507可以将线性函数和/或非线性函数应用到运算电路1503的输出,例如对卷积层提取的特征平面进行线性插值,再例如累加值的向量,用以生成激活值。在一些实现中,向量计算单元1507生成归一化的值、像素级求和的值,或二者均有。在一些实现中,处理过的输出的向量能够用作到运算电路1503的激活输入,例如用于在神经网络中的后续层中的使用。
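The instruction fetch buffer 1509, connected to the controller 1504, stores the instructions used by the controller 1504.
The unified memory 1506, the input memory 1501, the weight memory 1502, and the instruction fetch buffer 1509 are all on-chip memories. The external memory is private to the NPU hardware architecture.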
其中,第一机器学习模型中各层的运算可以由运算电路1503或向量计算单元1507执行。
其中,上述任一处提到的处理器,可以是一个通用中央处理器,微处理器,ASIC,或一个或多个用于控制上述第一方面方法的程序执行的集成电路。
另外需说明的是,以上所描述的装置实施例仅仅是示意性的,其中所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。另外,本申请提供的装置实施例附图中,模块之间的连接关系表示它们之间具有通信连接,具体可以实现为一条或多条通信总线或信号线。
通过以上的实施方式的描述,所属领域的技术人员可以清楚地了解到本申请可借助软件加必需的通用硬件的方式来实现,当然也可以通过专用硬件包括专用集成电路、专用CPU、专用存储器、专用元器件等来实现。一般情况下,凡由计算机程序完成的功能都可以很容易地用相应的硬件来实现,而且,用来实现同一功能的具体硬件结构也可以是多种多样的,例如模拟电路、数字电路或专用电路等。但是,对本申请而言更多情况下软件程序实现是更佳的实施方式。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在可读取的存储介质中,如计算机的软盘、U盘、移动硬盘、ROM、RAM、磁碟或者光盘等,包括若干指令用以使得一台计算机设备(可以是个人计算机,训练设备,或者网络设备等)执行本申请各个实施例所述的方法。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。
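The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can store, or a data storage device such as a training device or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).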

Claims (28)

  1. 一种数据处理方法,其特征在于,所述方法包括:
    获取T的取值,所述T为大于或等于1的整数,所述T代表第一机器学习模型的输出数据包括的子数据的个数;
    将第一数据输入所述第一机器学习模型,得到所述第一机器学习模型生成的第二数据,所述第二数据包括所述T个子数据,其中,所述第一机器学习模型包括一个或多个模块,每调用所述第一机器学习模型中的模块至少一次得到一个所述子数据。
  2. 根据权利要求1所述的方法,其特征在于,所述第一机器学习模型的功能包括如下任一项或多项的组合:编码、调制、生成参考信号。
  3. 根据权利要求1或2所述的方法,其特征在于,所述第一机器学习模型中的多个模块包括第一模块和至少一个第二模块,所述将第一数据输入所述第一机器学习模型,得到所述第一机器学习模型输出的第二数据,包括:
    将所述第一数据输入所述第一模块,得到所述第一模块生成的第一子数据,所述第一子数据为所述T个子数据中的一个;
    将第一特征信息输入所述第二模块,得到所述第二模块生成的第二子数据,其中,所述第一特征信息包括上一次调用所述第一机器学习模型中的模块进行数据处理时生成的特征信息,所述第二子数据为所述T个子数据中的一个,所述上一次调用的所述第一机器学习模型中的模块为所述第一模块或者所述第二模块。
  4. 根据权利要求1或2所述的方法,其特征在于,所述第一机器学习模型中的多个模块包括第一模块和至少一个第三模块,所述将第一数据输入所述第一机器学习模型,得到所述第一机器学习模型输出的第二数据,包括:
    将所述第一数据输入所述第一模块,通过所述第一模块生成第一子数据,所述第一子数据为所述T个子数据中的一个,所述通过所述第一模块生成第一子数据的过程中包括对所述第一数据进行特征提取;
    调用所述第三模块多次,得到所述第三模块生成的第三子数据,所述第三子数据为所述T个子数据中的一个,其中,所述第三模块的输入包括所述第一数据的特征信息,在所述调用所述第三模块多次的过程中对所述第一数据的特征信息进行多次更新。
  5. 根据权利要求3所述的方法,其特征在于,所述将第一特征信息输入所述第二模块,得到所述第二模块生成的第二子数据,包括:
    通过所述第二模块对所述第一特征信息进行线性变换,并采用第一激活函数进行处理,得到变换后的特征信息;
    对所述变换后的特征信息进行线性变换,并采用第二激活函数进行处理,得到所述第二子数据。
  6. 根据权利要求3所述的方法,其特征在于,所述至少一个第二模块包括多个所述第二模块,其中,所述多个第二模块中至少两个第二模块采用的参数不同。
  7. 根据权利要求1或2所述的方法,其特征在于,所述将第一数据输入所述第一机器学习模型之前,所述方法还包括:
    获取待处理数据和H的取值,H为大于或等于1的整数,所述H指示所述第一数据的长度;
    若所述待处理数据的长度小于所述H,则对所述待处理数据进行填充,得到所述第一数据,所述第一数据的长度为所述H。
  8. 根据权利要求7所述的方法,其特征在于,所述第一数据包括所述待处理数据和填充数据,所述填充数据包括第一标识信息,所述第一标识信息用于标识所述T的取值和/或K的取值,所述K为所述待处理数据的长度,所述K为大于或等于1的整数。
  9. 根据权利要求7所述的方法,其特征在于,所述第一机器学习模型中的参数的尺寸与所述H的取值以及G的取值相关,所述G为每个所述子数据的长度。
  10. 根据权利要求1或2所述的方法,其特征在于,与所述第一机器学习模型对应的参数携带于如下一种或多种信息中:下行控制信息DCI、上行控制信息UCI、侧行链路控制信息SCI、无线资源控制RRC信令或者媒体访问控制的控制元素MAC CE;和/或,
    所述参数的标识信息携带于如下任一种或多种信息中:DCI、UCI、SCI、RRC信令、MAC CE、物理广播信道PBCH或者物理随机接入信道PRACH。
  11. 根据权利要求1或2所述的方法,其特征在于,所述方法应用于第一装置侧,第二装置为所述第二数据的接收端,所述第二装置中有与所述第一机器学习模型对应的多组参数以及每组所述参数的标识信息,所述方法还包括:
    向所述第二装置发送第二标识信息,所述第二标识信息用于指示所述第一装置中的所述第一机器学习模型采用的一组所述参数。
  12. 一种数据处理方法,其特征在于,所述方法包括:
    获取第二数据,其中,所述第二数据包括T个子数据,所述T为大于或等于1的整数,所述第二数据由第一装置中的第一机器学习模型生成,所述第一机器学习模型包括一个或多个模块,每调用所述第一机器学习模型中的模块至少一次得到一个所述子数据;
    根据所述第二数据,生成第一数据。
  13. 根据权利要求12所述的方法,其特征在于,与所述第一机器学习模型对应的参数携带于如下一种或多种信息中:下行控制信息DCI、上行控制信息UCI、侧行链路控制信息SCI、无线资源控制RRC信令或者媒体访问控制的控制元素MAC CE;和/或,
    所述参数的标识信息携带于如下任一种或多种信息中:DCI、UCI、SCI、RRC信令、MAC CE、物理广播信道PBCH或者物理随机接入信道PRACH。
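  14. The method according to claim 12, wherein the method is applied on the second apparatus side, the second apparatus holds third data, the third data includes multiple sets of parameters corresponding to the first machine learning model and identification information of each set of the parameters, and the method further comprises:
    receiving second identification information sent by the first apparatus;
    determining, according to the second identification information and the third data, the set of the parameters used by the first machine learning model in the first apparatus;
    wherein the generating first data according to the second data comprises:
    generating the first data according to the second data and the set of the parameters used by the first machine learning model in the first apparatus.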
  15. 一种模型的训练方法,其特征在于,所述方法包括:
    从训练数据集合中获取训练数据,其中,所述训练数据用于得到第一数据和T的取值,所述T为大于或等于1的整数,所述训练数据集合中至少两个所述训练数据包括的所述T的取值不同;
    将所述第一数据输入所述第一机器学习模型,得到所述第一机器学习模型生成的第二数据,所述第二数据包括所述T个子数据,其中,所述第一机器学习模型包括多个模块,每调用所述第一机器学习模型中的模块至少一次,得到所述模块生成的一个所述子数据;
    基于所述第二数据和损失函数,对所述第一机器学习模型进行训练,得到训练后的所述第一机器学习模型。
  16. 根据权利要求15所述的方法,其特征在于,所述第一机器学习模型的功能包括如下任一项或多项的组合:编码、调制、生成参考信号。
  17. 根据权利要求15或16所述的方法,其特征在于,所述第一机器学习模型中的多个模块包括第一模块和至少一个第二模块,所述将第一数据输入所述第一机器学习模型,得到所述第一机器学习模型输出的第二数据,包括:
    将所述第一数据输入所述第一模块,得到所述第一模块生成的第一子数据,所述第一子数据为所述T个子数据中的一个;
    将第一特征信息输入所述第二模块,得到所述第二模块生成的第二子数据,其中,所述第一特征信息包括上一次调用所述第一机器学习模型中的模块进行数据处理时生成的特征信息,所述第二子数据为所述T个子数据中的一个,所述上一次调用的所述第一机器学习模型中的模块为所述第一模块或者所述第二模块。
  18. 根据权利要求15或16所述的方法,其特征在于,所述将第一数据输入所述第一机器学习模型之前,所述方法还包括:
    从所述训练数据中获取待处理数据;
    获取H的取值,H为大于或等于1的整数,所述H指示所述第一数据的长度;
    若所述待处理数据的长度小于所述H,则对所述待处理数据进行填充,得到所述第一数据,所述第一数据的长度为所述H。
  19. 根据权利要求18所述的方法,其特征在于,在所述第一机器学习模型的功能包括编码和/或调制的情况下,所述第二数据用于确定待发送的信号,所述基于所述第二数据和损失函数,对所述第一机器学习模型进行训练,包括:
    对与所述待发送的信号对应的接收信号进行解调制和/或解码以得到与所述待处理数据对应的估计数据;
    根据所述估计数据和所述损失函数,对所述第一机器学习模型进行训练,所述损失函数指示所述估计数据和所述待处理数据之间的相似度。
  20. 根据权利要求15或16所述的方法,其特征在于,在所述第二数据为参考信号的情况下,所述基于所述第二数据和损失函数,对所述第一机器学习模型进行训练,包括:
    根据与所述参考信号对应的接收的参考信号,生成预测的信道信息;
    根据所述损失函数,对所述第一机器学习模型进行训练,所述损失函数指示所述预测的信道信息和正确的信道信息之间的相似度。
  21. 一种数据处理装置,其特征在于,所述数据处理装置包括处理模块和收发模块;
    所述处理模块用于执行如权利要求1至11中任一项所述的处理操作,所述收发模块用于执行如权利要求1至11中任一项所述的收发操作。
  22. 一种数据处理装置,其特征在于,所述数据处理装置包括处理模块和收发模块;
    所述处理模块用于执行如权利要求12至14中任一项所述的处理操作,所述收发模块用于执行如权利要求12至14中任一项所述的收发操作。
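  23. A model training apparatus, wherein the model training apparatus comprises a processing module, and the processing module is configured to perform the processing operations according to any one of claims 15 to 20.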
  24. 一种通信系统,其特征在于,所述通信系统包括:如权利要求21所述的数据处理装置以及如权利要求22所述的数据处理装置。
  25. 根据权利要求24所述的系统,其特征在于,所述通信系统还包括:如权利要求23所述的模型的训练装置。
  26. 一种装置,其特征在于,所述装置包括至少一个处理器,所述至少一个处理器与存储器耦合,所述存储器用于存储程序或指令;
    所述至少一个处理器用于执行所述程序或指令,以使所述装置实现如权利要求1至11中任一项所述的方法;或者,实现如权利要求12至14中任一项所述的方法;或者,实现如权利要求15至20中任一项所述的方法。
  27. 一种计算机可读存储介质,其特征在于,所述可读存储介质存储有指令,当所述指令被计算机执行时,使得权利要求1至11中任一项所述的方法被执行;或者,使得权利要求12至14中任一项所述的方法被执行;或者,使得权利要求15至20中任一项所述的方法被执行。
  28. 一种计算机程序产品,其特征在于,所述计算机程序产品包括指令,当所述指令在计算机上运行时,使得权利要求1至11中任一项所述的方法被执行;或者,使得权利要求12至14中任一项所述的方法被执行;或者,使得权利要求15至20中任一项所述的方法被执行。
PCT/CN2023/085467 2023-03-31 2023-03-31 一种数据处理方法、模型的训练方法以及相关设备 Ceased WO2024197810A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP23929396.2A EP4668175A4 (en) 2023-03-31 2023-03-31 DATA PROCESSING METHOD, MODEL TRAINING METHOD AND ASSOCIATED DEVICE
CN202380092774.9A CN120641917A (zh) 2023-03-31 2023-03-31 一种数据处理方法、模型的训练方法以及相关设备
PCT/CN2023/085467 WO2024197810A1 (zh) 2023-03-31 2023-03-31 一种数据处理方法、模型的训练方法以及相关设备

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2023/085467 WO2024197810A1 (zh) 2023-03-31 2023-03-31 一种数据处理方法、模型的训练方法以及相关设备

Publications (1)

Publication Number Publication Date
WO2024197810A1 true WO2024197810A1 (zh) 2024-10-03

Family

ID=92907363

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/085467 Ceased WO2024197810A1 (zh) 2023-03-31 2023-03-31 一种数据处理方法、模型的训练方法以及相关设备

Country Status (3)

Country Link
EP (1) EP4668175A4 (zh)
CN (1) CN120641917A (zh)
WO (1) WO2024197810A1 (zh)

Also Published As

Publication number Publication date
CN120641917A (zh) 2025-09-12
EP4668175A4 (en) 2026-03-18
EP4668175A1 (en) 2025-12-24

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23929396

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202380092774.9

Country of ref document: CN

WWP Wipo information: published in national office

Ref document number: 202380092774.9

Country of ref document: CN

WWE Wipo information: entry into national phase

Ref document number: 2023929396

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2023929396

Country of ref document: EP

Effective date: 20250916

WWP Wipo information: published in national office

Ref document number: 2023929396

Country of ref document: EP