WO2023283765A1 - Training method and apparatus for a machine learning model, computer device, and storage medium - Google Patents
- Publication number
- WO2023283765A1 (application PCT/CN2021/105777)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- machine learning
- model
- learning model
- training
- sample set
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H30/00—ICT specially adapted for the handling or processing of medical images
- G16H30/40—ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Definitions
- the present application relates to the technical field of model training, in particular to a training method, device, computer equipment and storage medium for a machine learning model.
- training a machine learning model requires a large number of training samples.
- medical images involve patient privacy and data security, medical images cannot be shared between hospitals, so there will be problems with fewer training samples for machine learning models and poor accuracy of machine learning models.
- a training method for a machine learning model comprising:
- the training samples in the first training sample set and the second training sample set include medical images obtained by scanning objects scanned by medical scanning equipment;
- the first machine learning model and the second machine learning model have at least partially the same structure; the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
- the above-mentioned multiple rounds of model training are performed based on the first training sample set to obtain the first machine learning model, including:
- the Nth round of model training is performed to obtain the first machine learning model; N is a positive integer greater than 1;
- the training of the first machine learning model ends.
- the first machine learning model has partially the same structure as the second machine learning model, and performing the Nth round of model training based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model includes:
- the Nth round of model training is performed to obtain the first machine learning model.
- the first machine learning model and the second machine learning model have entirely the same structure, and performing the Nth round of model training based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model includes:
- the Nth round of model training is performed to obtain the first machine learning model.
- the method also includes:
- the N+1th round of training of the first machine learning model is performed.
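The partially shared case above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each model's parameters are stored as a dictionary keyed by layer name; the layer names and the helper `merge_shared_parameters` are hypothetical and do not come from the application. The starting parameters for a round take the structurally identical layers from the other model and keep the remaining layers from this model's previous round.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def merge_shared_parameters(
    own_previous: Params, peer_current: Params, shared_layers: List[str]
) -> Params:
    """Build the starting parameters for the next round of training.

    Layers listed in shared_layers (the structurally identical part) are
    copied from the peer model; every other layer keeps the values this
    model reached in its previous round.
    """
    merged = {name: list(values) for name, values in own_previous.items()}
    for name in shared_layers:
        merged[name] = list(peer_current[name])
    return merged


# Hypothetical layer names: both models share an "encoder" but have different heads.
first_previous = {"encoder": [0.1, 0.2], "dose_head": [0.5]}
second_current = {"encoder": [0.3, 0.4], "contour_head": [0.9]}

start_params = merge_shared_parameters(
    first_previous, second_current, shared_layers=["encoder"]
)
print(start_params)
```

When the two models have entirely the same structure, `shared_layers` would simply list every layer, so the starting parameters are a full copy of the other model's current parameters.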
- the above-mentioned multiple rounds of model training are performed based on the second training sample set to obtain the second machine learning model, including:
- the Mth round of model training is performed to obtain the second machine learning model; wherein M is a positive integer greater than 0;
- the first machine learning model has partially the same structure as the second machine learning model, and performing the Mth round of model training based on the second training sample set and at least part of the model parameters of the current first machine learning model to obtain the second machine learning model includes:
- a first round of training is performed based on the second training sample set, the model parameters of the structurally identical part of the first initial model, and the second initial model parameters to obtain a second initial model; at least part of the parameters of the second initial model are used to train the first machine learning model;
- the first machine learning model and the second machine learning model have entirely the same structure, and performing the Mth round of model training based on the second training sample set and the model parameters of the current first machine learning model to obtain the second machine learning model includes:
- An M-th round of model training is performed based on the second training sample set and all model parameters of the current first machine learning model to obtain a second machine learning model.
- the method also includes:
- the M+1th round of training of the second machine learning model is performed.
- the model index includes the accuracy rate of the output result
- the first preset index includes the first preset accuracy rate
- the second preset index includes the second preset accuracy rate
- the method also includes:
- the batch gradient algorithm is used to determine the descent gradient and training continues until it is determined that the output result of the machine learning model meets the preset convergence condition, at which point the current round of training is stopped.
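A minimal sketch of one such round follows. The application does not specify the model, the loss function, or the convergence condition, so the one-dimensional linear model, the mean squared error loss, the tolerance `tol`, and the function name `train_one_round` are all assumptions chosen only to make the loop concrete.

```python
from typing import List, Tuple


def train_one_round(
    xs: List[float],
    ys: List[float],
    w: float,
    b: float,
    lr: float = 0.05,
    tol: float = 1e-6,
    max_steps: int = 10_000,
) -> Tuple[float, float, float]:
    """Run batch gradient descent until the loss meets the convergence condition.

    Assumed setup: model y = w * x + b, loss = mean squared error, and the
    convergence condition is loss < tol (max_steps is only a safety cap).
    """
    n = len(xs)
    loss = float("inf")
    for _ in range(max_steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        loss = sum(e * e for e in errors) / n
        if loss < tol:
            break  # convergence condition met: stop the current round
        # Batch gradient: computed over the whole training sample set.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss


# Toy data generated from y = 2x + 1.
w, b, loss = train_one_round([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0], w=0.0, b=0.0)
print(round(w, 2), round(b, 2))
```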
- the acquisition of the first training sample set and the second training sample set includes:
- the first hospital and the second hospital are different hospitals.
- the first machine learning model and the second machine learning model include at least one of a dose prediction model, an automatic delineation model, a curative effect evaluation model, a survival index evaluation model, a cancer screening model, and a deformation registration model.
- model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through the network
- model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a storage medium.
- the training of the first machine learning model and the training of the second machine learning model are respectively performed in two independent networks.
- the training of the first machine learning model and the training of the second machine learning model are performed alternately.
- the first machine learning model and the second machine learning model have the same structure and the same application.
- the applications of the first machine learning model and the second machine learning model are different.
- the method also includes:
- a training method for a machine learning model comprising:
- the training samples in the training sample set include medical images obtained by scanning objects scanned by medical scanning equipment;
- every two machine learning models in the at least two machine learning models have at least part of the same structure, and when training one of the machine learning models, at least partly use the model parameters of the same part of the other machine learning model.
- multiple rounds of model training are performed based on each training sample set to obtain a machine learning model corresponding to each training sample set, including:
- model parameters of the current round are obtained, and model training is performed based on the training sample set corresponding to the machine learning model and the model parameters of the current round to obtain the machine learning model; wherein the model parameters of the current round include initial model parameters or the model parameters of the structurally identical part of another machine learning model.
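The choice of current-round parameters can be sketched for more than two training sample sets. The example below is an illustration under stated assumptions: the models have entirely the same structure (a single weight `w`), the "training" step is a toy update toward the local sample mean, and the names `params_for_round` and `local_update` are hypothetical. Only parameters are handed from one sample set's training to the next; the samples themselves are never pooled.

```python
from typing import Dict, List, Optional

Params = Dict[str, float]


def params_for_round(initial: Params, peer: Optional[Params]) -> Params:
    """Current-round parameters: the initial ones if no other model has been
    trained yet, otherwise the parameters of the most recently trained model."""
    return dict(initial if peer is None else peer)


def local_update(params: Params, samples: List[float], lr: float = 0.5) -> Params:
    """Toy stand-in for a round of training: move w toward the local sample mean."""
    target = sum(samples) / len(samples)
    return {"w": params["w"] + lr * (target - params["w"])}


# Three training sample sets, each used only by its own model.
sample_sets = [[1.0, 3.0], [5.0, 7.0], [9.0, 11.0]]
initial = {"w": 0.0}

latest: Optional[Params] = None
history: List[float] = []
for _ in range(2):  # two passes over all sample sets
    for samples in sample_sets:
        start = params_for_round(initial, latest)
        latest = local_update(start, samples)
        history.append(latest["w"])

print(history)
```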
- each machine learning model is trained on an independent network.
- At least two machine learning models have the same structure and the same application.
- the application of each machine learning model is different.
- a training device for a machine learning model comprising:
- the sample set obtaining module is used to obtain the first training sample set and the second training sample set; the training samples in the first training sample set and the second training sample set include medical images obtained by scanning objects scanned by medical scanning equipment;
- the first training module is used to perform multiple rounds of model training based on the first training sample set to obtain the first machine learning model
- the second training module is used to perform multiple rounds of model training based on the second training sample set to obtain a second machine learning model
- the first machine learning model and the second machine learning model have at least partially the same structure; the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
- the above-mentioned first training module is specifically configured to: perform the first round of model training based on the first initial model parameters and the first training sample set to obtain the first initial model, at least part of the model parameters of the first initial model being used to train the second machine learning model; perform the Nth round of model training based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model, N being a positive integer greater than 1; and end the training of the first machine learning model if it is determined that the model index of the first machine learning model satisfies the first preset index.
- the first machine learning model has partially the same structure as the second machine learning model, and the above-mentioned first training module is specifically configured to perform the Nth round of model training based on the first training sample set, the model parameters of the structurally identical part of the current second machine learning model, and part of the model parameters of the first machine learning model obtained in the previous round of training, to obtain the first machine learning model.
- the first machine learning model and the second machine learning model have entirely the same structure, and the above-mentioned first training module is specifically configured to perform the Nth round of model training based on the first training sample set and all model parameters of the current second machine learning model, to obtain the first machine learning model.
- the above-mentioned first training module is further configured to perform the N+1th round of training of the first machine learning model if it is determined that the model index of the first machine learning model does not meet the first preset index.
- the above-mentioned second training module is specifically used to perform the Mth round of model training based on the second training sample set and at least part of the model parameters of the current first machine learning model to obtain the second machine learning model; wherein , M is a positive integer greater than 0; if the model index of the second machine learning model satisfies the second preset index, the training of the second machine learning model ends.
- the first machine learning model has partially the same structure as the second machine learning model, and the above-mentioned second training module is specifically configured to: perform the first round of training based on the second training sample set, the model parameters of the structurally identical part of the first initial model, and the second initial model parameters to obtain the second initial model, at least part of the parameters of the second initial model being used to train the first machine learning model; and continue model training based on the second training sample set, the model parameters of the structurally identical part of the current first machine learning model, and part of the model parameters of the second machine learning model obtained in the previous round of training, to obtain the second machine learning model.
- the first machine learning model and the second machine learning model have entirely the same structure, and the above-mentioned second training module is specifically configured to perform the Mth round of model training based on the second training sample set and all model parameters of the current first machine learning model, to obtain the second machine learning model.
- the above-mentioned second training module is configured to perform the M+1th round of training of the second machine learning model if it is determined that the model index of the second machine learning model does not meet the second preset index.
- the model index includes the accuracy rate of the output result
- the first preset index includes the first preset accuracy rate
- the second preset index includes the second preset accuracy rate
- the device also includes:
- a gradient descent module, configured to, in each round of training, use the batch gradient algorithm to determine the descent gradient and continue training if it is determined according to the preset loss function that the output result of the machine learning model does not meet the preset convergence condition, and to stop the current round of training once the output result of the machine learning model is determined to meet the preset convergence condition.
- the above-mentioned sample set acquisition module is specifically configured to acquire medical images of the first hospital and generate a first training sample set based on the medical images of the first hospital; and to acquire medical images of the second hospital and generate a second training sample set based on the medical images of the second hospital; wherein the first hospital and the second hospital are different hospitals.
- the first machine learning model and the second machine learning model include at least one of a dose prediction model, an automatic delineation model, a curative effect evaluation model, a survival index evaluation model, a cancer screening model, and a deformation registration model.
- model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through the network
- model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a storage medium.
- the training of the first machine learning model and the training of the second machine learning model are respectively performed in two independent networks.
- the training of the first machine learning model and the training of the second machine learning model are performed alternately.
- the first machine learning model and the second machine learning model have the same structure and the same application.
- the applications of the first machine learning model and the second machine learning model are different.
- the device also includes:
- the combination processing module is used to combine the first machine learning model and the second machine learning model to obtain the target machine learning model.
- a training device for a machine learning model comprising:
- the sample set obtaining module is used to obtain at least two training sample sets; the training samples in the training sample set include medical images obtained by scanning objects scanned by medical scanning equipment;
- a training module configured to perform multiple rounds of model training based on each training sample set, to obtain a machine learning model corresponding to each training sample set;
- every two machine learning models in the at least two machine learning models have at least part of the same structure, and when training one of the machine learning models, at least partly use the model parameters of the same part of the other machine learning model.
- the above-mentioned training module is configured to obtain the model parameters of the current round for each machine learning model, and to perform model training based on the training sample set corresponding to the machine learning model and the model parameters of the current round to obtain the machine learning model;
- the model parameters of the current round include initial model parameters or model parameters of the same structure in another machine learning model.
- each machine learning model is trained on an independent network.
- At least two machine learning models have the same structure and the same application.
- the application of each machine learning model is different.
- a computer device comprising a memory and a processor, the memory stores a computer program, and the processor implements the following steps when executing the computer program:
- the training samples in the first training sample set and the second training sample set include medical images obtained by scanning objects scanned by medical scanning equipment;
- the first machine learning model and the second machine learning model have at least partially the same structure; the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
- a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:
- the training samples in the first training sample set and the second training sample set include medical images obtained by scanning objects scanned by medical scanning equipment;
- the first machine learning model and the second machine learning model have at least partially the same structure; the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
- the above machine learning model training method, device, computer equipment, and storage medium obtain the first training sample set and the second training sample set; perform multiple rounds of model training based on the first training sample set to obtain the first machine learning model;
- and perform multiple rounds of model training based on the second training sample set to obtain the second machine learning model. Since the first machine learning model and the second machine learning model have parts with the same structure, the parts with the same structure can use the same model parameters.
- when training the first machine learning model, the model parameters of the second machine learning model are at least partially used, and when training the second machine learning model, the model parameters of the first machine learning model are at least partially used.
- the training of the first machine learning model does not use the second training sample set, and the training of the second machine learning model does not use the first training sample set, which can ensure the data security of the training samples; at the same time, training the first machine learning model with the model parameters of the second machine learning model, and training the second machine learning model with the model parameters of the first machine learning model, can improve the model training speed and the accuracy of the machine learning models.
- Fig. 1 is a diagram of the application environment of the training method of a machine learning model in an embodiment
- Fig. 2 is a schematic flow chart of a training method of a machine learning model in an embodiment
- Fig. 3a is a schematic structural diagram of the first machine learning model in an embodiment
- Fig. 3b is a schematic structural diagram of the second machine learning model in an embodiment
- Fig. 4 is the first schematic flow chart of the steps of performing multiple rounds of model training based on the first training sample set in an embodiment
- Fig. 5 is the second schematic flow chart of the steps of performing multiple rounds of model training based on the first training sample set in an embodiment
- Fig. 6 is the first schematic flow chart of the steps of performing multiple rounds of model training based on the second training sample set in an embodiment
- Fig. 7 is the second schematic flow chart of the steps of performing multiple rounds of model training based on the second training sample set in an embodiment
- Fig. 8 is a schematic flow chart of the steps of alternately training the first machine learning model and the second machine learning model in an embodiment
- Fig. 9 is a schematic flow chart of a training method of a machine learning model in another embodiment
- Fig. 10 is a structural block diagram of a training device for a machine learning model in an embodiment
- Fig. 11 is a structural block diagram of a training device for a machine learning model in another embodiment
- Fig. 12 is a diagram of the internal structure of a computer device in an embodiment.
- This application provides a training scheme for a machine learning model, including: obtaining a first training sample set and a second training sample set; performing multiple rounds of model training based on the first training sample set to obtain a first machine learning model; and performing multiple rounds of model training based on the second training sample set to obtain a second machine learning model.
- the first machine learning model and the second machine learning model have at least partially the same structure; the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
- training the first machine learning model does not use the second training sample set, and training the second machine learning model does not use the first training sample set, which can ensure the data security of the training samples; at the same time, training the first machine learning model with the model parameters of the second machine learning model, and training the second machine learning model with the model parameters of the first machine learning model, can improve the model training speed and the accuracy of the machine learning models. This addresses the prior-art problem of having few training samples and therefore poor accuracy of the machine learning model.
- the training method of the machine learning model provided in this application can be applied to the application environment shown in FIG. 1 .
- the application environment may include a model training system, and the model training system includes multiple model training terminals 101, and the multiple model training terminals 101 may communicate through a network.
- the model training terminal 101 may be a terminal connected to the medical scanning device 102 .
- the terminal can be but not limited to various personal computers, notebook computers and tablet computers.
- the above-mentioned medical scanning device 102 can be a single-modality device or a multi-modality device, such as but not limited to DR (digital radiography) equipment, CT (computed tomography) equipment, CBCT (cone beam computed tomography) equipment, PET (positron emission computed tomography) equipment, MRI (magnetic resonance imaging) equipment, ultrasound equipment, PET-CT equipment, PET-MR equipment, RT (radiotherapy) equipment, CT-RT equipment, and MR-RT equipment.
- the model training terminal 101 can also be a PACS (Picture Archiving and Communication Systems, image archiving and communication system) server.
- the above PACS server can be realized by an independent server or a server cluster composed of multiple servers.
- a training method of a machine learning model is provided, and the method is applied to the model training system in Figure 1 as an example for illustration, including the following steps:
- Step 201 acquire a first training sample set and a second training sample set.
- the training samples in the first training sample set and the second training sample set include medical images obtained by scanning objects scanned by a medical scanning device.
- the medical image can be a two-dimensional image or a three-dimensional image.
- the training sample set satisfies data diversity, label consistency, and the data structure is the same.
- the model training system can use medical images in the same hospital for model training, or use medical images in different hospitals for model training.
- when the model training system uses medical images from the same hospital for model training, the model training terminal generates a first training sample set and a second training sample set from multiple medical images, wherein the first training sample set and the second training sample set are used for different training tasks.
- for example, the model training terminal obtains 100 CT images from the same CT device and divides the 100 CT images into two image sets to obtain the first training sample set and the second training sample set, where the first training sample set is used for the training task of the dose prediction model and the second training sample set is used for the training task of the automatic delineation model.
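The 100-image example can be sketched as below. The application does not say how the images are divided, so the even 50/50 split, the seeded shuffle, the identifier format, and the helper name `split_into_two_sets` are assumptions made only for illustration.

```python
import random
from typing import List, Tuple


def split_into_two_sets(
    image_ids: List[str], seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Divide one pool of images into two disjoint training sample sets.

    Assumption: an even split after a seeded shuffle; the source text does
    not prescribe a ratio or a selection rule.
    """
    shuffled = list(image_ids)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical identifiers for 100 CT images from the same CT device.
ct_image_ids = [f"ct_{i:03d}" for i in range(100)]
first_set, second_set = split_into_two_sets(ct_image_ids)
print(len(first_set), len(second_set))
```

In this sketch `first_set` would feed the dose prediction task and `second_set` the automatic delineation task.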
- the first model training terminal in the model training system acquires medical images of the first hospital and generates a first training sample set based on the medical images of the first hospital;
- the second model training terminal in the model training system acquires medical images of the second hospital and generates a second training sample set based on the medical images of the second hospital; wherein the first hospital and the second hospital are different hospitals.
- the model training terminal A1 obtains CT images from the hospital B1 and generates a first training sample set; the model training terminal A2 obtains CT images from the hospital B2 and generates a second training sample set.
- the embodiment of the present disclosure does not limit the manner of obtaining the training sample set.
- Step 202 Perform multiple rounds of model training based on the first training sample set to obtain a first machine learning model.
- Step 203 performing multiple rounds of model training based on the second training sample set to obtain a second machine learning model.
- when the model training system uses medical images from the same hospital for model training, it can use the same model training terminal to train the first machine learning model and the second machine learning model, or use different model training terminals to train them.
- when the model training system uses medical images from different hospitals for model training, it uses different model training terminals to train the first machine learning model and the second machine learning model.
- the first machine learning model and the second machine learning model have at least part of the same structure, and at least partially use the model parameters of the second machine learning model when training the first machine learning model, and when training the second machine learning model Model parameters of the first machine learning model are utilized at least in part.
- as shown by the first machine learning model in Figure 3a and the second machine learning model in Figure 3b, the first machine learning model and the second machine learning model have parts with the same structure, so the parts with the same structure can use the same model parameters. In this way, when training the first machine learning model, the model parameters of the second machine learning model are at least partially used, and when training the second machine learning model, the model parameters of the first machine learning model are at least partially used.
- the first model training end of the model training system performs a round of model training to obtain model parameters of the first machine learning model, and then transfers the model parameters of the first machine learning model to the second model training end of the model training system.
- the second model training end uses the model parameters of the first machine learning model to conduct a round of training of the second machine learning model; after the training, the model parameters of the second machine learning model are passed to the first model training end.
- the first model training end uses the model parameters of the second machine learning model to conduct another round of training of the first machine learning model; after the training, the model parameters of the first machine learning model are passed to the second model training end. And so on, until the training of the first machine learning model and the second machine learning model is completed.
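- the hand-off between the two training ends can be sketched as follows. This is an illustration under strong assumptions: both models are reduced to a single shared weight `w` of a linear model, each round is a few gradient steps on mean squared error, and the sample values and the `train_one_round` helper are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (input, label)


def train_one_round(w: float, samples: List[Sample], lr: float = 0.05,
                    steps: int = 20) -> float:
    """One round of local training: gradient descent on mean squared error."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= lr * grad
    return w


# Hypothetical private data held by each training end; both follow y = 2x.
first_samples = [(1.0, 2.0), (2.0, 4.0)]
second_samples = [(3.0, 6.0), (0.5, 1.0)]

w = 0.0  # first initial model parameter
for round_index in range(3):
    # The first training end trains on its own samples only, then hands w over.
    w = train_one_round(w, first_samples)
    # The second training end starts from the received parameter.
    w = train_one_round(w, second_samples)

print(round(w, 3))
```

Neither training end ever reads the other's samples; only the parameter `w` crosses the boundary.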
- model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a network; or, the model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a storage medium.
- the first model training terminal and the second model training terminal send the model parameters through the network, or the user uses a storage medium such as a mobile hard disk to copy the model parameters to realize the transfer of the model parameters. Embodiments of the present disclosure do not limit this.
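- a minimal sketch of the storage-medium route, assuming the parameters are serialized as JSON. The file name, the dictionary layout, and the two helper functions are hypothetical, and a temporary directory stands in for the mobile hard disk.

```python
import json
import os
import tempfile
from typing import Dict, List

Params = Dict[str, List[float]]


def export_parameters(params: Params, path: str) -> None:
    """Write model parameters to a file that can be copied to removable media."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(params, handle)


def import_parameters(path: str) -> Params:
    """Read model parameters back at the receiving training terminal."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


# The temporary directory stands in for the removable disk (an assumption).
with tempfile.TemporaryDirectory() as disk:
    param_file = os.path.join(disk, "first_model_params.json")
    sent = {"encoder": [0.1, 0.2, 0.3]}
    export_parameters(sent, param_file)
    received = import_parameters(param_file)

print(received == sent)  # True
```

Only parameters are written to the medium; the training samples stay at the terminal that owns them.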
- the training of the first machine learning model does not use the second training sample set, and the training of the second machine learning model does not use the first training sample set, which can ensure the data security of the training samples; meanwhile, training the first machine learning model uses the model parameters of the second machine learning model, and training the second machine learning model uses the model parameters of the first machine learning model, which can improve the model training speed and the accuracy of the machine learning model.
- the first machine learning model and the second machine learning model include at least one of a dose prediction model, an automatic delineation model, a curative effect evaluation model, a survival index evaluation model, a cancer screening model, and a deformation registration model.
- the first machine learning model is a dose prediction model
- the second machine learning model is an automatic delineation model
- the first machine learning model is a curative effect evaluation model
- the second machine learning model is a survival index evaluation model.
- the embodiment of the present disclosure does not limit the model types of the first machine learning model and the second machine learning model.
- the training samples are limited, and the accuracy of the application model obtained by using the limited samples for model training in the prior art is limited.
- the training method of this embodiment obtains a model with higher accuracy by repeatedly using limited samples to iteratively train the machine learning model, and better technical effects can be obtained when it is applied.
- the automatic delineation model obtained by using the training method of this embodiment has a better delineation effect on the region of interest;
- compared with a dose prediction model obtained by training with few samples, the dose prediction model obtained by using the training method of this embodiment can predict the dose more accurately.
- the first training sample set and the second training sample set are obtained; multiple rounds of model training are performed based on the first training sample set to obtain the first machine learning model; multiple rounds of model training are performed based on the second training sample set to obtain a second machine learning model. Since the first machine learning model and the second machine learning model have parts with the same structure, the parts with the same structure can use the same model parameters.
- when training the first machine learning model, the model parameters of the second machine learning model are at least partly used, and when training the second machine learning model, the model parameters of the first machine learning model are at least partly used.
- the training of the first machine learning model does not use the second training sample set, and the training of the second machine learning model does not use the first training sample set, which can ensure the data security of the training samples; meanwhile, training the first machine learning model using the model parameters of the second machine learning model, and training the second machine learning model using the model parameters of the first machine learning model, can improve the model training speed and the accuracy of the machine learning model.
- the above-mentioned step of performing multiple rounds of model training based on the first training sample set to obtain the first machine learning model may include:
- Step 301 Perform a first round of model training based on the first initial model parameters and the first training sample set to obtain a first initial model.
- the first initial model parameter may be a random model parameter, or a model parameter assigned by a user, which is not limited in this embodiment of the present disclosure.
- after the model training end obtains the first initial model parameters and the first training sample set, it performs the first round of model training of the first machine learning model to obtain the first initial model.
- the model parameters of the first initial model are used to train the second machine learning model.
- Step 302 Perform the Nth round of model training based on the first training sample set and at least some model parameters of the current second machine learning model to obtain the first machine learning model, where N is a positive integer greater than 1.
- the first machine learning model and the second machine learning model can be trained alternately. After the model training end obtains the first initial model, it uses the model parameters of the first initial model to perform the first round of training of the second machine learning model.
- the model training end performs the Nth round of training of the first machine learning model.
- the process of performing the Nth round of model training based on the first training sample set and at least some model parameters of the current second machine learning model to obtain the first machine learning model may include: performing the Nth round of model training based on the first training sample set, the model parameters of the structurally identical part of the current second machine learning model, and some model parameters of the first machine learning model obtained from the previous round of training, to obtain the first machine learning model.
- the model training end performs the second round of model training of the first machine learning model based on the first training sample set, some model parameters of the first initial model, and the model parameters of the structurally identical part obtained by the first round of training of the second machine learning model, to obtain the first machine learning model, and performs step 303 or step 304.
- the model training end performs the third round of model training of the first machine learning model based on the first training sample set, the first machine learning model from the second round of training, and the model parameters of the structurally identical part obtained from the second round of training of the second machine learning model, to obtain the first machine learning model, and performs step 303 or step 304.
- alternatively, the process of performing the Nth round of model training based on the first training sample set and at least some model parameters of the current second machine learning model to obtain the first machine learning model may include: performing the Nth round of model training based on the first training sample set and all model parameters of the current second machine learning model to obtain the first machine learning model.
- Step 303 If it is determined that the model index of the first machine learning model satisfies the first preset index, the training of the first machine learning model is ended.
- the model index includes the accuracy rate of the output result
- the first preset index includes the first preset accuracy rate
- the model training end calculates the accuracy rate of the output result of the first machine learning model, and compares it with the first preset accuracy rate. If the accuracy rate of the output result of the first machine learning model is greater than the first preset accuracy rate, it is determined that the model index of the first machine learning model meets the first preset index; if the accuracy rate of the output result of the first machine learning model is less than or equal to the first preset accuracy rate, it is determined that the model index of the first machine learning model does not meet the first preset index.
- the above-mentioned process of calculating the accuracy rate of the output result of the first machine learning model may include: inputting a preset number of test samples into the first machine learning model to obtain the output result corresponding to each test sample; counting the number of output results that are consistent with the labels of the test samples; and calculating the ratio between that number and the preset number to obtain the accuracy rate of the output result.
- the test samples can be selected from the first training sample set, or can be obtained in the same way as the first training sample set, which is not limited in this embodiment of the present disclosure; the preset number is also not limited in the embodiment of the present disclosure.
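- the accuracy calculation and the comparison against the preset accuracy rate described above can be sketched as follows. The `toy_model` stand-in, the sample values, and the helper names are hypothetical; only the ratio and the strict "greater than" comparison come from the text.

```python
from typing import Callable, Sequence, Tuple


def accuracy_rate(model: Callable[[float], int],
                  test_samples: Sequence[Tuple[float, int]]) -> float:
    """Fraction of test samples whose output result matches the label."""
    matches = sum(1 for x, label in test_samples if model(x) == label)
    return matches / len(test_samples)


def meets_preset_index(accuracy: float, preset_accuracy: float) -> bool:
    """The index is met only when accuracy is strictly greater than the preset."""
    return accuracy > preset_accuracy


# Hypothetical stand-in model: thresholds its input at 0.5.
def toy_model(x: float) -> int:
    return 1 if x >= 0.5 else 0


test_samples = [(0.9, 1), (0.2, 0), (0.7, 0), (0.4, 0)]  # preset number = 4
acc = accuracy_rate(toy_model, test_samples)
print(acc)                            # 0.75
print(meets_preset_index(acc, 0.75))  # False: equality does not satisfy the index
```

An accuracy equal to the preset value leads to a further round of training, because the text treats "less than or equal to" as not meeting the index.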
- the embodiment of the present disclosure may further include the following steps:
- step 304 if it is determined that the model index of the first machine learning model does not satisfy the first preset index, perform the N+1th round of training of the first machine learning model.
- taking N equal to 2 as an example, if the model index of the first machine learning model does not meet the first preset index, at least part of the model parameters obtained by the second round of training of the second machine learning model are obtained; then, based on the first training sample set, the first machine learning model obtained in the second round of training, and at least part of the model parameters obtained in the second round of training of the second machine learning model, the third round of training is performed on the first machine learning model, and step 303 or step 304 is performed.
- taking N equal to 3 as an example, if the model index of the first machine learning model does not meet the first preset index, at least part of the model parameters obtained by the third round of training of the second machine learning model are obtained; then, based on the first training sample set, the first machine learning model obtained in the third round of training, and at least part of the model parameters obtained in the third round of training of the second machine learning model, the fourth round of training is performed on the first machine learning model, and step 303 or step 304 is performed.
- in each round of training, the batch gradient descent method is used to determine the descent gradient, and training continues until the output result of the machine learning model is determined to meet the preset convergence condition, at which point the current round of training is stopped.
- in the first round of training, the batch gradient descent method is used to determine the descent gradient, and training continues until the output result of the first machine learning model is determined to meet the preset convergence condition, at which point the first round of training is stopped.
- the preset loss function and batch gradient descent method are also used for model training. The embodiment of the present disclosure does not limit the preset loss function and the preset convergence condition.
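- one round of training with batch gradient descent can be sketched as follows. Because the disclosure limits neither the loss function nor the convergence condition, this sketch assumes a mean squared error loss on a linear model and stops when the loss changes by less than a tolerance; the function names, learning rate, and sample values are all hypothetical.

```python
from typing import List, Tuple


def mse_loss(w: float, b: float, samples: List[Tuple[float, float]]) -> float:
    """Assumed preset loss function: mean squared error of y = w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in samples) / len(samples)


def batch_gradient_descent(samples, w=0.0, b=0.0, lr=0.1,
                           tolerance=1e-10, max_steps=10_000):
    """Run one round of training; every step uses the whole sample set."""
    n = len(samples)
    previous = mse_loss(w, b, samples)
    for step in range(1, max_steps + 1):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        w, b = w - lr * grad_w, b - lr * grad_b
        current = mse_loss(w, b, samples)
        # Assumed preset convergence condition: loss change below tolerance.
        if abs(previous - current) < tolerance:
            return w, b, step
        previous = current
    return w, b, max_steps


samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]  # generated from y = 2x + 1
w, b, steps = batch_gradient_descent(samples)
print(round(w, 2), round(b, 2))
```

The gradient is computed over the full sample set at every step, which is what distinguishes batch gradient descent from stochastic or mini-batch variants.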
- the first round of model training is performed based on the first initial model parameters and the first training sample set to obtain the first initial model; the Nth round of model training is performed based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model; if it is determined that the model index of the first machine learning model meets the first preset index, the training of the first machine learning model is ended; if it is determined that the model index of the first machine learning model does not meet the first preset index, the N+1th round of training of the first machine learning model is performed.
- the first machine learning model that meets the first preset index can be trained by using at least part of the model parameters of the second machine learning model without using the second training sample set. This both ensures the data security of the training samples and improves the model training speed and model accuracy.
- the above-mentioned step of performing multiple rounds of model training based on the second training sample set to obtain the second machine learning model may include:
- Step 401 Perform the Mth round of model training based on the second training sample set and at least some model parameters of the current first machine learning model to obtain a second machine learning model; where M is a positive integer greater than 0.
- after the first machine learning model is trained to obtain the first initial model, the model training end performs the Mth round of training of the second machine learning model based on the second training sample set and at least part of the model parameters of the first initial model.
- the process of performing the Mth round of model training based on the second training sample set and the model parameters of the current first machine learning model to obtain the second machine learning model may include: performing a first round of training based on the second training sample set, the model parameters of the structurally identical part of the first initial model, and the second initial model parameters to obtain the second initial model, wherein at least part of the parameters of the second initial model are used to train the first machine learning model; then, continuing model training based on the second training sample set, the model parameters of the structurally identical part of the current first machine learning model, and some model parameters of the second machine learning model obtained from the previous round of training, to obtain a second machine learning model.
- the model training end performs the second round of model training of the second machine learning model based on the second training sample set, the second machine learning model obtained from the first round of training, and at least part of the model parameters obtained from the second round of training of the first machine learning model, to obtain the second machine learning model, and then performs step 402 or step 403.
- the model training end performs the third round of model training of the second machine learning model based on the second training sample set, the second machine learning model obtained from the second round of training, and at least part of the model parameters obtained from the third round of training of the first machine learning model, to obtain the second machine learning model, and then performs step 402 or step 403.
- alternatively, the process of performing the Mth round of model training based on the second training sample set and the model parameters of the current first machine learning model to obtain the second machine learning model may include: performing the Mth round of model training based on the second training sample set and all model parameters of the current first machine learning model to obtain the second machine learning model.
- Step 402 If the model index of the second machine learning model satisfies the second preset index, the training of the second machine learning model is ended.
- the model index of the second machine learning model is calculated, and it is determined whether the model index meets the second preset index; if the second preset index is met, the training of the second machine learning model is ended.
- the model index includes the accuracy rate of the output result
- the second preset index includes the second preset accuracy rate
- the accuracy rate of the output result of the second machine learning model can be calculated and compared with the second preset accuracy rate. If the accuracy rate of the output result of the second machine learning model is greater than the second preset accuracy rate, it is determined that the model index of the second machine learning model satisfies the second preset index; if the accuracy rate of the output result of the second machine learning model is less than or equal to the second preset accuracy rate, it is determined that the model index of the second machine learning model does not meet the second preset index.
- the above-mentioned process of calculating the accuracy rate of the output result of the second machine learning model may include: inputting a preset number of test samples into the second machine learning model to obtain the output result corresponding to each test sample; counting the number of output results that are consistent with the labels of the test samples; and calculating the ratio between that number and the preset number to obtain the accuracy rate of the output result.
- the test samples can be selected from the second training sample set, or can be obtained in the same way as the second training sample set, which is not limited in this embodiment of the present disclosure; the preset number is also not limited in the embodiment of the present disclosure.
- the foregoing first preset index and the second preset index may be the same preset index, or may be different preset indexes, which are not limited in this embodiment of the present disclosure.
- the embodiment of the present disclosure may further include the following steps:
- Step 403 if it is determined that the model index of the second machine learning model does not meet the second preset index, perform the M+1th round of training of the second machine learning model.
- if the model index of the second machine learning model does not meet the second preset index, the model parameters obtained from the third round of training of the first machine learning model are obtained; then, based on the second training sample set, the second machine learning model obtained in the second round of training, and the model parameters obtained in the third round of training of the first machine learning model, a third round of training is performed on the second machine learning model.
- if the model index still does not meet the second preset index, the model parameters obtained from the fourth round of training of the first machine learning model are obtained; then, based on the second training sample set, the second machine learning model obtained in the third round of training, and the model parameters obtained in the fourth round of training of the first machine learning model, a fourth round of training is performed on the second machine learning model.
- and so on, until the model index of the second machine learning model satisfies the second preset index.
- in each round of training, the batch gradient descent method is used to determine the descent gradient, and training continues until the output result of the machine learning model is determined to meet the preset convergence condition, at which point the current round of training is stopped.
- in the first round of training, the batch gradient descent method is used to determine the descent gradient, and training continues until the output result of the second machine learning model is determined to meet the preset convergence condition, at which point the first round of training is stopped.
- the preset loss function and batch gradient descent method are also used for model training.
- the Mth round of model training is performed based on the second training sample set and at least part of the model parameters of the current first machine learning model to obtain the second machine learning model; if the model index of the second machine learning model meets the second preset index, the training of the second machine learning model is ended; if it is determined that the model index of the second machine learning model does not meet the second preset index, the M+1th round of training of the second machine learning model is performed.
- the second machine learning model that satisfies the second preset index can be trained by using the model parameters of the first machine learning model without using the first training sample set. This both ensures the data security of the training samples and improves the model training speed and model accuracy.
- the training of the first machine learning model and the training of the second machine learning model are performed alternately, and the description will be made by taking the first machine learning model and the second machine learning model having the same partial structure as an example. As shown in Figure 8, the following steps may be included:
- Step 501 Perform a first round of model training based on the first initial model parameters and the first training sample set to obtain a first initial model.
- Step 502 Perform a first round of training based on the second training sample set, the model parameters of the structurally identical part of the first initial model, and the second initial model parameters to obtain a second initial model.
- the same structure includes the same model hierarchy and the same connection relationship.
- Step 503 Continue model training based on the first training sample set, at least some model parameters of the current second machine learning model, and some model parameters of the first machine learning model obtained from the previous round of training, to obtain the first machine learning model.
- Step 504 continue model training based on the second training sample set, the model parameters of the same structural part of the current first machine learning model, and some model parameters of the second machine learning model obtained in the previous round of training, to obtain the second machine learning model Model.
- Step 505 if it is determined that the model index of the first machine learning model satisfies the first preset index, and the model index of the second machine learning model satisfies the second preset index, then end the training of the first machine learning model and the second machine learning model.
- Step 506 if it is determined that the model index of the first machine learning model does not meet the first preset index, and/or the model index of the second machine learning model does not meet the second preset index, perform the first machine learning model and the second The next round of training for the machine learning model.
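- steps 501 to 506 can be sketched end to end under simplifying assumptions: each model is a linear model `y = slope*x + bias`, the `slope` plays the role of the structurally identical part that is passed between the models, the `bias` is private to each model, and the model index is taken to be the share of samples predicted within a tolerance. The data, helper names, preset index, and the cap of 100 rounds are all hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, float]


def train_round(params: Dict[str, float], samples: List[Sample],
                lr: float = 0.05, steps: int = 50) -> Dict[str, float]:
    """One round of local training of y = slope*x + bias on local samples."""
    slope, bias = params["slope"], params["bias"]
    n = len(samples)
    for _ in range(steps):
        g_slope = sum(2 * (slope * x + bias - y) * x for x, y in samples) / n
        g_bias = sum(2 * (slope * x + bias - y) for x, y in samples) / n
        slope, bias = slope - lr * g_slope, bias - lr * g_bias
    return {"slope": slope, "bias": bias}


def model_index(params: Dict[str, float], samples: List[Sample],
                tol: float = 0.05) -> float:
    """Assumed model index: share of samples predicted within `tol`."""
    hits = sum(1 for x, y in samples
               if abs(params["slope"] * x + params["bias"] - y) <= tol)
    return hits / len(samples)


first_set = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]     # y = 2x + 1
second_set = [(0.0, -3.0), (1.0, -1.0), (2.0, 1.0)]  # y = 2x - 3
shared_slope = 0.0                  # parameter of the structurally identical part
first_bias, second_bias = 0.0, 0.0  # private parameters of each model
preset_index = 0.9                  # same preset index for both models here

for rounds in range(1, 101):
    # Steps 501/503: train the first model from the current shared parameter.
    first = train_round({"slope": shared_slope, "bias": first_bias}, first_set)
    shared_slope, first_bias = first["slope"], first["bias"]
    # Steps 502/504: train the second model from the first model's shared part.
    second = train_round({"slope": shared_slope, "bias": second_bias}, second_set)
    shared_slope, second_bias = second["slope"], second["bias"]
    # Step 505: stop when both indexes are met; otherwise step 506 continues.
    if (model_index(first, first_set) > preset_index
            and model_index(second, second_set) > preset_index):
        break

print(rounds, round(shared_slope, 2))
```

Each model's index is evaluated on its own sample set, so the stopping decision also never needs the other side's data.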
- the first machine learning model and the second machine learning model have the same structure.
- in step 503, model training is continued based on the first training sample set and all model parameters of the current second machine learning model to obtain the first machine learning model; in step 504, model training is continued based on the second training sample set and all model parameters of the current first machine learning model to obtain a second machine learning model.
- the training of the first machine learning model and the training of the second machine learning model are respectively performed in two independent networks.
- the first model training terminal and the second model training terminal in the model training system are in two independent networks, and the first model training terminal and the second training terminal cannot communicate through the network.
- the model parameters of the first machine learning model and the model parameters of the second machine learning model can be transmitted through the storage medium.
- the first machine learning model and the second machine learning model have the same structure and the same application.
- the first machine learning model is a dose prediction model trained using the first training sample set
- the second machine learning model is a dose prediction model trained using the second training sample set
- the first machine learning model and the second machine learning model have the same structure and are both applied to dose prediction.
- the applications of the first machine learning model and the second machine learning model are different.
- the first machine learning model is a dose prediction model
- the second machine learning model is an automatic delineation model
- the structures of the first machine learning model and the second machine learning model can be partially or completely the same.
- the embodiment of the present disclosure may further include: combining the first machine learning model and the second machine learning model to obtain the target machine learning model.
- the model training end combines the first machine learning model and the second machine learning model to obtain a combined target machine learning model. For example, by combining the dose prediction model with the automatic delineation model, a target machine learning model that performs dose prediction first and then automatic delineation can be obtained, making the model more powerful.
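- one simple way to read "combining" the two models is sequential composition, in which the output of the first model feeds the second. The sketch below assumes that reading; the two stand-in functions are placeholders operating on a flat list of pixel values and are not real dose prediction or delineation models.

```python
from typing import Callable, List

Stage = Callable[[List[float]], List[float]]


def combine_models(first: Stage, second: Stage) -> Stage:
    """Return a target model that applies `first`, then `second`."""
    def target(image: List[float]) -> List[float]:
        return second(first(image))
    return target


# Hypothetical stand-ins for the trained models.
def dose_prediction(image: List[float]) -> List[float]:
    return [2.0 * p for p in image]  # placeholder: scales intensities


def automatic_delineation(dose: List[float]) -> List[float]:
    return [1.0 if d > 1.0 else 0.0 for d in dose]  # placeholder: thresholds


target_model = combine_models(dose_prediction, automatic_delineation)
print(target_model([0.2, 0.6, 0.9]))  # [0.0, 1.0, 1.0]
```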
- the first machine learning model and the second machine learning model are alternately trained; the training of the first machine learning model does not use the second training sample set, and the training of the second machine learning model does not use the first training sample set, which can ensure the data security of the training samples; meanwhile, training the first machine learning model using the model parameters of the second machine learning model, and training the second machine learning model using the model parameters of the first machine learning model, can improve the model training speed and the accuracy of the machine learning model.
- a training method of a machine learning model is provided.
- the method is applied to the model training system in FIG. 1 as an example, including the following steps:
- Step 601 acquire at least two training sample sets.
- the training samples in the training sample set include medical images obtained by the medical scanning equipment scanning a scanned object, such as CT images, CBCT images, PET images, MR images, ultrasound images, etc.
- the model training system acquires at least two training sample sets, and the acquisition method can refer to step 201 .
- Step 602 Perform multiple rounds of model training based on each training sample set to obtain a machine learning model corresponding to each training sample set.
- every two machine learning models in the at least two machine learning models have at least part of the same structure, and when training one of the machine learning models, at least partly use the model parameters of the same part of the other machine learning model.
- for each machine learning model, the model parameters of the current round are obtained, and model training is performed based on the training sample set corresponding to the machine learning model and the model parameters of the current round to obtain the machine learning model; wherein the model parameters of the current round include initial model parameters or the model parameters of the structurally identical part of another machine learning model.
- the first machine learning model and the second machine learning model have at least part of the same structure
- the second machine learning model and the third machine learning model have at least part of the same structure
- the first machine learning model and the third machine learning model have at least part of the same structure.
- for the first machine learning model, the initial model parameters of the first round are obtained, and then the first round of model training is performed based on the first training sample set and the initial model parameters.
- for the second machine learning model, at least part of the model parameters of the first machine learning model are obtained, and a first round of model training is performed based on the second training sample set and at least part of the model parameters of the first machine learning model.
- for the third machine learning model, at least part of the model parameters of the second machine learning model are obtained, and a first round of model training is performed based on the third training sample set and at least part of the model parameters of the second machine learning model.
- after the first round, for the first machine learning model, at least part of the model parameters of the third machine learning model are obtained, and a second round of model training is performed based on the first training sample set and at least part of the model parameters of the third machine learning model.
- for the second machine learning model, at least part of the model parameters of the first machine learning model are obtained, and a second round of model training is performed based on the second training sample set and at least part of the model parameters of the first machine learning model.
- for the third machine learning model, at least part of the model parameters of the second machine learning model are obtained, and a second round of model training is performed based on the third training sample set and at least part of the model parameters of the second machine learning model.
- model parameters are transferred between at least two machine learning models in a preset order. For example, at least some of the model parameters of the first machine learning model are passed to the second machine learning model, at least some of the model parameters of the second machine learning model are passed to the third machine learning model, and at least some of the model parameters of the third machine learning model are passed to The first machine learning model.
- the transfer of model parameters may also be in other transfer forms, and the embodiment of the present disclosure does not limit the preset order.
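- the ring-shaped hand-off in the example above can be written down as a schedule that records, for every round and every model, where its starting parameters come from. The `schedule` helper and the model labels are hypothetical, and the ring is only the preset order used in the example; other orders are possible.

```python
from typing import List, Tuple


def schedule(order: List[str], rounds: int) -> List[Tuple[int, str, str]]:
    """List (round, model, parameter source) for a ring-shaped preset order."""
    plan = []
    for r in range(1, rounds + 1):
        for i, model in enumerate(order):
            if r == 1 and i == 0:
                source = "initial"     # the first model starts from initial parameters
            else:
                source = order[i - 1]  # index -1 wraps to the last model in the ring
            plan.append((r, model, source))
    return plan


plan = schedule(["first", "second", "third"], rounds=2)
for entry in plan:
    print(entry)
# (1, 'first', 'initial')
# (1, 'second', 'first')
# (1, 'third', 'second')
# (2, 'first', 'third')
# (2, 'second', 'first')
# (2, 'third', 'second')
```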
- each machine learning model is trained on an independent network.
- the training of the first machine learning model, the second machine learning model and the third machine learning model is performed in three independent networks, and the three networks cannot communicate with each other.
- At least two machine learning models have the same structure and the same application.
- the first machine learning model, the second machine learning model and the third machine learning model are all dose prediction models, and the structures of the first machine learning model, the second machine learning model and the third machine learning model are the same.
- the application of each machine learning model is different.
- the first machine learning model is a dose prediction model
- the second machine learning model is an automatic delineation model
- the third machine learning model is a curative effect evaluation model.
- the structures of every two machine learning models may be partly or completely the same.
- at least two training sample sets are obtained; multiple rounds of model training are performed based on each training sample set, and a machine learning model corresponding to each training sample set is obtained. Since every two machine learning models have at least part of the same structure, and at least part of the model parameters of the structurally identical part of another machine learning model are used when one of the machine learning models is trained, the data security of the training samples can be guaranteed, and the speed of model training and the accuracy of the machine learning model can be improved.
- although the steps in the flow charts of FIG. 2 to FIG. 9 are displayed sequentially as indicated by the arrows, these steps are not necessarily executed sequentially in the order indicated by the arrows. Unless otherwise specified herein, there is no strict order restriction on the execution of these steps, and these steps can be executed in other orders. Moreover, at least some of the steps in FIGS. 2 to 9 may include multiple steps or stages, and these steps or stages are not necessarily performed at the same time, but may be performed at different times. The order of execution of these steps or stages is not necessarily sequential; they may be performed in turn or alternately with other steps, or with at least a part of the steps or stages in other steps.
- a training device for a machine learning model including:
- the sample set acquisition module 701 is configured to acquire a first training sample set and a second training sample set; the training samples in the first training sample set and the second training sample set include medical images obtained by a medical scanning device scanning a scan object;
- the first training module 702 is configured to perform multiple rounds of model training based on the first training sample set to obtain a first machine learning model
- the second training module 703 is configured to perform multiple rounds of model training based on the second training sample set to obtain a second machine learning model
- the first machine learning model and the second machine learning model have at least part of the same structure, the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
- the above-mentioned first training module 702 is specifically configured to perform the first round of model training based on the first initial model parameters and the first training sample set to obtain the first initial model, at least part of the model parameters of the first initial model being used to train the second machine learning model; perform the Nth round of model training based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model, where N is a positive integer greater than 1; and, if it is determined that the model index of the first machine learning model satisfies the first preset index, end the training of the first machine learning model.
- the first machine learning model and the second machine learning model have partly the same structure, and the above-mentioned first training module 702 is specifically configured to perform the Nth round of model training based on the first training sample set, the model parameters of the structurally identical part of the current second machine learning model, and some model parameters of the first machine learning model obtained in the previous round of training, to obtain the first machine learning model.
- the first machine learning model and the second machine learning model have entirely the same structure, and the above-mentioned first training module 702 is specifically configured to perform the Nth round of model training based on the first training sample set and all model parameters of the current second machine learning model to obtain the first machine learning model.
- the above-mentioned first training module 702 is further configured to perform the N+1th round of training of the first machine learning model if it is determined that the model index of the first machine learning model does not meet the first preset index.
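The round structure described for the first training module can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the models are tiny linear regressors `y = w*x + b`, the "model index" is mean squared error instead of accuracy, both models share their whole structure, and the helper names (`train_round`, `make_samples`, `alternate_training`) and all numeric settings are hypothetical. For brevity the loop stops when both models meet the same target, whereas the text lets each model stop against its own preset index.

```python
import random

def train_round(w, b, samples, lr=0.1, steps=20):
    """One training round: a few full-batch gradient steps on mean squared error."""
    n = len(samples)
    for _ in range(steps):
        gw = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        gb = sum(2 * (w * x + b - y) for x, y in samples) / n
        w, b = w - lr * gw, b - lr * gb
    return w, b

def mse(w, b, samples):
    """Stand-in for the model index: mean squared error on the local sample set."""
    return sum((w * x + b - y) ** 2 for x, y in samples) / len(samples)

def make_samples(seed, n=50):
    """Synthetic private data for one site, drawn from y = 3x + 1 plus noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 3.0 * x + 1.0 + rng.gauss(0.0, 0.05)) for x in xs]

def alternate_training(set_a, set_b, target_mse=0.01, max_rounds=20):
    # Round 1 of the first model starts from the initial parameters (0, 0).
    params_a = train_round(0.0, 0.0, set_a)
    params_b = params_a
    for round_no in range(2, max_rounds + 1):
        # The second model starts from the first model's current parameters.
        params_b = train_round(*params_a, set_b)
        # Round N of the first model starts from the second model's parameters.
        params_a = train_round(*params_b, set_a)
        # Simplified stopping rule: both models must meet the preset index.
        if mse(*params_a, set_a) <= target_mse and mse(*params_b, set_b) <= target_mse:
            return params_a, params_b, round_no
    return params_a, params_b, max_rounds

set_a = make_samples(seed=1)   # stays at the first site
set_b = make_samples(seed=2)   # stays at the second site
params_a, params_b, rounds_used = alternate_training(set_a, set_b)
```

Only the `(w, b)` tuples cross between the two sites in this sketch; neither sample set is ever read by the other side's training call.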
- the above-mentioned second training module 703 is specifically configured to perform the Mth round of model training based on the second training sample set and at least part of the model parameters of the current first machine learning model to obtain the second machine learning model, where M is a positive integer greater than 0; if the model index of the second machine learning model satisfies the second preset index, the training of the second machine learning model ends.
- the first machine learning model and the second machine learning model have partly the same structure, and the above-mentioned second training module 703 is specifically configured to perform a first round of training based on the second training sample set, the model parameters of the structurally identical part of the first initial model, and the second initial model parameters to obtain the second initial model, at least part of the parameters of the second initial model being used to train the first machine learning model; and to continue model training based on the second training sample set, the model parameters of the structurally identical part of the current first machine learning model, and some model parameters of the second machine learning model obtained in the previous round of training, to obtain the second machine learning model.
- the first machine learning model and the second machine learning model have entirely the same structure, and the above-mentioned second training module 703 is specifically configured to perform the Mth round of model training based on the second training sample set and all model parameters of the current first machine learning model to obtain the second machine learning model.
- the second training module 703 is configured to perform the M+1th round of training of the second machine learning model if it is determined that the model index of the second machine learning model does not meet the second preset index.
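When the two models share only part of their structure, the starting parameters of a round combine the peer's parameters for the shared part with the model's own parameters from its previous round. The sketch below illustrates one way to express that merge; the dictionary layout, the layer names (`encoder.*`, `dose_head.w`, `contour_head.w`) and the function `merge_shared` are assumptions made for illustration and do not come from the source.

```python
def merge_shared(own_prev, peer_current, shared_keys):
    """Build start-of-round parameters: shared part from the peer, the rest from our own previous round."""
    merged = {name: list(values) for name, values in own_prev.items()}
    for name in shared_keys:
        # Copy so that later in-place updates cannot alter the peer's parameters.
        merged[name] = list(peer_current[name])
    return merged

# Hypothetical layout: both models share an encoder, each keeps its own task head.
SHARED = ("encoder.w", "encoder.b")
first_prev = {"encoder.w": [0.1, 0.2], "encoder.b": [0.0], "dose_head.w": [1.5]}
second_now = {"encoder.w": [0.3, 0.4], "encoder.b": [0.05], "contour_head.w": [-0.7]}

start_params = merge_shared(first_prev, second_now, SHARED)
```

The peer's task-specific head is never copied, which matches the idea that only the structurally identical part is reused.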
- the model index includes the accuracy rate of the output result
- the first preset index includes the first preset accuracy rate
- the second preset index includes the second preset accuracy rate
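As a small illustration of how an accuracy-based model index might be compared with its preset value, the following sketch computes accuracy over discrete labels. The threshold values, the label data, and the function names are hypothetical; the source does not say how accuracy is measured for a given model type.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the reference labels."""
    correct = sum(1 for p, t in zip(predictions, labels) if p == t)
    return correct / len(labels)

def meets_preset_index(predictions, labels, preset_accuracy):
    """True when the model index (accuracy) reaches the preset accuracy."""
    return accuracy(predictions, labels) >= preset_accuracy

FIRST_PRESET_ACCURACY = 0.90    # hypothetical first preset accuracy
SECOND_PRESET_ACCURACY = 0.75   # hypothetical second preset accuracy

labels = [1, 0, 1, 1, 0, 1, 0, 0]
preds = [1, 0, 1, 0, 0, 1, 0, 1]   # two of eight predictions are wrong
```

With these made-up values the same predictions would end training under the second preset accuracy but trigger another round under the first.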
- the device also includes:
- a gradient descent module, configured to, in each round of training, if it is determined according to a preset loss function that the output result of the machine learning model does not meet a preset convergence condition, use a batch gradient algorithm to determine the descent gradient and continue training, and stop the current round of training when it is determined that the output result of the machine learning model meets the preset convergence condition.
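A single training round driven by a loss function and a convergence condition can be sketched as follows. The choice of mean squared error as the preset loss, the convergence test (change in loss below `tol`), the learning rate, and the toy data are all assumptions for illustration; the source only states that a batch gradient algorithm is used until a preset convergence condition is met.

```python
def batch_gradient_round(w, b, samples, lr=0.1, tol=1e-8, max_steps=10_000):
    """Run full-batch gradient descent until the loss stops changing by more than tol."""
    n = len(samples)

    def loss(w_, b_):
        return sum((w_ * x + b_ - y) ** 2 for x, y in samples) / n

    prev = loss(w, b)
    for step in range(1, max_steps + 1):
        # The gradient is computed over the whole sample set (batch gradient).
        gw = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        gb = sum(2 * (w * x + b - y) for x, y in samples) / n
        w, b = w - lr * gw, b - lr * gb
        cur = loss(w, b)
        if abs(prev - cur) < tol:   # assumed convergence condition
            return w, b, cur, step
        prev = cur
    return w, b, prev, max_steps

samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]   # points on y = 2x + 1
w, b, final_loss, steps = batch_gradient_round(0.0, 0.0, samples)
```

The `max_steps` cap is a safety net added in this sketch so that a badly chosen learning rate cannot loop forever.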
- the above-mentioned sample set acquisition module 701 is specifically configured to acquire medical images of the first hospital and generate the first training sample set based on the medical images of the first hospital; and to acquire medical images of the second hospital and generate the second training sample set based on the medical images of the second hospital; wherein the first hospital and the second hospital are different hospitals.
- the first machine learning model and the second machine learning model include at least one of a dose prediction model, an automatic delineation model, a curative effect evaluation model, a survival index evaluation model, a cancer screening model, and a deformation registration model.
- model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through the network
- model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a storage medium.
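Transfer through a storage medium can be pictured as writing the parameters to a file that is then carried to the other site. The sketch below uses a JSON file in a temporary directory as a stand-in for the storage medium; the file name, the parameter layout, and the functions `export_params` and `import_params` are assumptions, and a real deployment would also need integrity and access controls that are not shown.

```python
import json
import os
import tempfile

def export_params(params, path):
    """Write model parameters (and nothing else) to a file on the storage medium."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(params, fh)

def import_params(path):
    """Read model parameters back at the receiving site."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)

params_first = {"encoder.w": [0.3, 0.4], "encoder.b": [0.05]}

with tempfile.TemporaryDirectory() as medium:
    path = os.path.join(medium, "first_model_params.json")   # hypothetical file name
    export_params(params_first, path)
    received = import_params(path)
```

The same two functions could sit behind a network endpoint instead of a file path; in either case the payload contains parameters only, never training samples.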
- the training of the first machine learning model and the training of the second machine learning model are respectively performed in two independent networks.
- the training of the first machine learning model and the training of the second machine learning model are performed alternately.
- the first machine learning model and the second machine learning model have the same structure and the same application.
- the applications of the first machine learning model and the second machine learning model are different.
- a training device for a machine learning model comprising:
- a sample set acquisition module 801 configured to acquire at least two training sample sets; the training samples in each training sample set include medical images obtained by a medical scanning device scanning a scan object;
- the training module 802 is used to perform multiple rounds of model training based on each training sample set to obtain a machine learning model corresponding to each training sample set;
- every two machine learning models among the at least two machine learning models have at least part of the same structure, and when one of the machine learning models is trained, the model parameters of the structurally identical part of another machine learning model are at least partially used.
- the above-mentioned training module 802 is configured to, for each machine learning model, obtain the model parameters of the current round, and perform model training based on the training sample set corresponding to the machine learning model and the model parameters of the current round to obtain the machine learning model; wherein the model parameters of the current round include initial model parameters or the model parameters of the structurally identical part of another machine learning model.
- each machine learning model is trained on an independent network.
- At least two machine learning models have the same structure and the same application.
- the application of each machine learning model is different.
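For more than two sample sets, one possible schedule is to pass the parameters around the sites in a fixed order, so that each model's current-round parameters are either the initial parameters or the parameters most recently produced by another model. The sketch below assumes this ring order, a one-weight model `y = w*x`, and made-up data; none of these choices are specified by the source.

```python
def train_one_round(w, samples, lr=0.02, steps=10):
    """A few full-batch gradient steps on mean squared error for y = w*x."""
    n = len(samples)
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in samples) / n
        w -= lr * grad
    return w

def ring_training(sample_sets, initial_w=0.0, rounds=5):
    latest = [None] * len(sample_sets)   # latest parameters of each model
    carried = initial_w                  # parameters handed to the next model
    for _ in range(rounds):
        for i, samples in enumerate(sample_sets):
            # Current-round parameters: the initial value at the very start,
            # afterwards the parameters produced by the previously trained model.
            latest[i] = train_one_round(carried, samples)
            carried = latest[i]
    return latest

# Three hypothetical sites, each with private samples close to y = 2x.
sample_sets = [
    [(1.0, 2.0), (2.0, 4.1)],
    [(1.0, 1.9), (3.0, 6.0)],
    [(2.0, 4.0), (4.0, 8.2)],
]
models = ring_training(sample_sets)
```

Each entry of `models` is tuned last on its own site's data, so the values differ slightly while all sitting close to the common slope.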
- Each module in the above-mentioned machine learning model training device can be realized in whole or in part by software, hardware, or a combination thereof.
- the above-mentioned modules can be embedded in or independent of the processor in the computer device in the form of hardware, and can also be stored in the memory of the computer device in the form of software, so that the processor can invoke and execute the corresponding operations of the above-mentioned modules.
- a computer device is provided.
- the computer device may be a terminal, and its internal structure may be as shown in FIG. 12 .
- the computer device includes a processor, a memory, a communication interface, a display screen and an input device connected through a system bus. Wherein, the processor of the computer device is used to provide calculation and control capabilities.
- the memory of the computer device includes a non-volatile storage medium and an internal memory.
- the non-volatile storage medium stores an operating system and computer programs.
- the internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
- the communication interface of the computer device is used to communicate with an external terminal in a wired or wireless manner, and the wireless manner can be realized through Wi-Fi (Wireless Fidelity), an operator network, NFC (Near Field Communication) or other technologies.
- when the computer program is executed by the processor, a method for training a machine learning model is realized.
- the display screen of the computer device may be a liquid crystal display screen or an electronic ink display screen
- the input device of the computer device may be a touch layer covering the display screen, or a button, a trackball or a touch pad provided on the casing of the computer device, and can also be an external keyboard, touchpad, or mouse.
- FIG. 12 is only a block diagram of a part of the structure related to the solution of this application, and does not constitute a limitation on the computer device to which the solution of this application is applied. A specific computer device may include more or fewer components than shown in the figure, may combine some components, or may have a different arrangement of components.
- a computer device including a memory and a processor, a computer program is stored in the memory, and the processor implements the following steps when executing the computer program:
- the training samples in the first training sample set and the second training sample set include medical images obtained by scanning objects scanned by medical scanning equipment;
- the first machine learning model and the second machine learning model have at least part of the same structure, the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
- based on the first training sample set and at least part of the model parameters of the current second machine learning model, the Nth round of model training is performed to obtain the first machine learning model; N is a positive integer greater than 1;
- if it is determined that the model index of the first machine learning model satisfies the first preset index, the training of the first machine learning model ends.
- the first machine learning model and the second machine learning model have partly the same structure, and the processor also implements the following steps when executing the computer program:
- based on the first training sample set, the model parameters of the structurally identical part of the current second machine learning model, and some model parameters of the first machine learning model obtained in the previous round of training, the Nth round of model training is performed to obtain the first machine learning model.
- the first machine learning model and the second machine learning model have entirely the same structure, and the processor also implements the following steps when executing the computer program:
- based on the first training sample set and all model parameters of the current second machine learning model, the Nth round of model training is performed to obtain the first machine learning model.
- if it is determined that the model index of the first machine learning model does not meet the first preset index, the (N+1)th round of training of the first machine learning model is performed.
- based on the second training sample set and at least part of the model parameters of the current first machine learning model, the Mth round of model training is performed to obtain the second machine learning model; wherein M is a positive integer greater than 0;
- the first machine learning model has the same partial structure as the second machine learning model, and the processor also implements the following steps when executing the computer program:
- a first round of training is performed based on the second training sample set, the model parameters of the structurally identical part of the first initial model, and the second initial model parameters to obtain a second initial model; at least part of the parameters of the second initial model are used to train the first machine learning model;
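- the first machine learning model and the second machine learning model have entirely the same structure, and the processor also implements the following steps when executing the computer program:
- based on the second training sample set and all model parameters of the current first machine learning model, the Mth round of model training is performed to obtain the second machine learning model.
- if it is determined that the model index of the second machine learning model does not meet the second preset index, the (M+1)th round of training of the second machine learning model is performed.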
- the model index includes the accuracy rate of the output result
- the first preset index includes the first preset accuracy rate
- the second preset index includes the second preset accuracy rate
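- in each round of training, if it is determined according to the preset loss function that the output result of the machine learning model does not meet the preset convergence condition, the batch gradient algorithm is used to determine the descent gradient and training continues; when it is determined that the output result of the machine learning model meets the preset convergence condition, the current round of training is stopped.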
- the first hospital and the second hospital are different hospitals.
- the first machine learning model and the second machine learning model include at least one of a dose prediction model, an automatic delineation model, a curative effect evaluation model, a survival index evaluation model, a cancer screening model, and a deformation registration model.
- model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through the network
- model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a storage medium.
- the training of the first machine learning model and the training of the second machine learning model are respectively performed in two independent networks.
- the training of the first machine learning model is alternated with the training of the second machine learning model.
- the first machine learning model and the second machine learning model have the same structure and the same application.
- the applications of the first machine learning model and the second machine learning model are different.
- a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:
- the training samples in the first training sample set and the second training sample set include medical images obtained by scanning objects scanned by medical scanning equipment;
- the first machine learning model and the second machine learning model have at least part of the same structure, the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
- based on the first training sample set and at least part of the model parameters of the current second machine learning model, the Nth round of model training is performed to obtain the first machine learning model; N is a positive integer greater than 1;
- if it is determined that the model index of the first machine learning model satisfies the first preset index, the training of the first machine learning model ends.
- the first machine learning model and the second machine learning model have partly the same structure, and when the computer program is executed by the processor, the following steps are also implemented:
- based on the first training sample set, the model parameters of the structurally identical part of the current second machine learning model, and some model parameters of the first machine learning model obtained in the previous round of training, the Nth round of model training is performed to obtain the first machine learning model.
- the first machine learning model and the second machine learning model have entirely the same structure, and when the computer program is executed by the processor, the following steps are also implemented:
- based on the first training sample set and all model parameters of the current second machine learning model, the Nth round of model training is performed to obtain the first machine learning model.
- if it is determined that the model index of the first machine learning model does not meet the first preset index, the (N+1)th round of training of the first machine learning model is performed.
- based on the second training sample set and at least part of the model parameters of the current first machine learning model, the Mth round of model training is performed to obtain the second machine learning model; wherein M is a positive integer greater than 0;
- the first machine learning model has the same partial structure as the second machine learning model, and when the computer program is executed by the processor, the following steps are also implemented:
- a first round of training is performed based on the second training sample set, the model parameters of the structurally identical part of the first initial model, and the second initial model parameters to obtain a second initial model; at least part of the parameters of the second initial model are used to train the first machine learning model;
- based on the second training sample set, the model parameters of the structurally identical part of the current first machine learning model, and some model parameters of the second machine learning model obtained in the previous round of training, model training continues to obtain the second machine learning model.
- the first machine learning model and the second machine learning model have entirely the same structure, and when the computer program is executed by the processor, the following steps are also implemented:
- based on the second training sample set and all model parameters of the current first machine learning model, the Mth round of model training is performed to obtain the second machine learning model.
- if it is determined that the model index of the second machine learning model does not meet the second preset index, the (M+1)th round of training of the second machine learning model is performed.
- the model index includes the accuracy rate of the output result
- the first preset index includes the first preset accuracy rate
- the second preset index includes the second preset accuracy rate
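- in each round of training, if it is determined according to the preset loss function that the output result of the machine learning model does not meet the preset convergence condition, the batch gradient algorithm is used to determine the descent gradient and training continues; when it is determined that the output result of the machine learning model meets the preset convergence condition, the current round of training is stopped.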
- the first hospital and the second hospital are different hospitals.
- the first machine learning model and the second machine learning model include at least one of a dose prediction model, an automatic delineation model, a curative effect evaluation model, a survival index evaluation model, a cancer screening model, and a deformation registration model.
- model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through the network
- model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a storage medium.
- the training of the first machine learning model and the training of the second machine learning model are respectively performed in two independent networks.
- the training of the first machine learning model is alternated with the training of the second machine learning model.
- the first machine learning model and the second machine learning model have the same structure and the same application.
- the applications of the first machine learning model and the second machine learning model are different.
- Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory or optical memory, etc.
- Volatile memory can include Random Access Memory (RAM) or external cache memory.
- RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).
Claims (29)
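- A method for training a machine learning model, characterized in that the method comprises: acquiring a first training sample set and a second training sample set, wherein the training samples in the first training sample set and the second training sample set comprise medical images obtained by a medical scanning device scanning a scan object; performing multiple rounds of model training based on the first training sample set to obtain a first machine learning model; and performing multiple rounds of model training based on the second training sample set to obtain a second machine learning model; wherein the first machine learning model and the second machine learning model have at least partly the same structure, the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.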
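- The method according to claim 1, characterized in that performing multiple rounds of model training based on the first training sample set to obtain the first machine learning model comprises: performing a first round of model training based on first initial model parameters and the first training sample set to obtain a first initial model, at least part of the model parameters of the first initial model being used to train the second machine learning model; performing an Nth round of model training based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model, N being a positive integer greater than 1; and, if it is determined that a model index of the first machine learning model satisfies a first preset index, ending the training of the first machine learning model.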
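- The method according to claim 2, characterized in that the first machine learning model and the second machine learning model have partly the same structure, and performing the Nth round of model training based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model comprises: performing the Nth round of model training based on the first training sample set, the model parameters of the structurally identical part of the current second machine learning model, and some model parameters of the first machine learning model obtained in the previous round of training, to obtain the first machine learning model.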
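- The method according to claim 2, characterized in that the first machine learning model and the second machine learning model have entirely the same structure, and performing the Nth round of model training based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model comprises: performing the Nth round of model training based on the first training sample set and all model parameters of the current second machine learning model to obtain the first machine learning model.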
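- The method according to claim 2, characterized in that the method further comprises: if it is determined that the model index of the first machine learning model does not satisfy the first preset index, performing an (N+1)th round of training of the first machine learning model.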
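- The method according to claim 2, characterized in that performing multiple rounds of model training based on the second training sample set to obtain the second machine learning model comprises: performing an Mth round of model training based on the second training sample set and at least part of the model parameters of the current first machine learning model to obtain the second machine learning model, wherein M is a positive integer greater than 0; and, if a model index of the second machine learning model satisfies a second preset index, ending the training of the second machine learning model.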
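- The method according to claim 6, characterized in that the first machine learning model and the second machine learning model have partly the same structure, and performing the Mth round of model training based on the second training sample set and at least part of the model parameters of the current first machine learning model to obtain the second machine learning model comprises: performing a first round of training based on the second training sample set, the model parameters of the structurally identical part of the first initial model, and second initial model parameters to obtain a second initial model, at least part of the parameters of the second initial model being used to train the first machine learning model; and continuing model training based on the second training sample set, the model parameters of the structurally identical part of the current first machine learning model, and some model parameters of the second machine learning model obtained in the previous round of training, to obtain the second machine learning model.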
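- The method according to claim 6, characterized in that the first machine learning model and the second machine learning model have entirely the same structure, and performing the Mth round of model training based on the second training sample set and at least part of the model parameters of the current first machine learning model to obtain the second machine learning model comprises: performing the Mth round of model training based on the second training sample set and all model parameters of the current first machine learning model to obtain the second machine learning model.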
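- The method according to claim 6, characterized in that the method further comprises: if it is determined that the model index of the second machine learning model does not satisfy the second preset index, performing an (M+1)th round of training of the second machine learning model.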
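- The method according to claim 6, characterized in that the model index comprises the accuracy rate of the output result, the first preset index comprises a first preset accuracy rate, and the second preset index comprises a second preset accuracy rate.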
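- The method according to claim 6, characterized in that the method further comprises: in each round of training, if it is determined according to a preset loss function that the output result of the machine learning model does not meet a preset convergence condition, using a batch gradient algorithm to determine the descent gradient and continuing training, and stopping the current round of training when it is determined that the output result of the machine learning model meets the preset convergence condition.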
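- The method according to claim 1, characterized in that acquiring the first training sample set and the second training sample set comprises: acquiring medical images of a first hospital and generating the first training sample set based on the medical images of the first hospital; and acquiring medical images of a second hospital and generating the second training sample set based on the medical images of the second hospital; wherein the first hospital and the second hospital are different hospitals.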
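- The method according to claim 1, characterized in that the first machine learning model and the second machine learning model comprise at least one of a dose prediction model, an automatic delineation model, a curative effect evaluation model, a survival index evaluation model, a cancer screening model, and a deformation registration model.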
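- The method according to claim 1, characterized in that the model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a network; or, the model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a storage medium.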
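- The method according to claim 1, characterized in that the training of the first machine learning model and the training of the second machine learning model are performed in two independent networks, respectively.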
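- The method according to claim 1, characterized in that the training of the first machine learning model and the training of the second machine learning model are performed alternately.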
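- The method according to claim 1, characterized in that the first machine learning model and the second machine learning model have the same structure and the same application.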
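- The method according to claim 1, characterized in that the applications of the first machine learning model and the second machine learning model are different.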
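- The method according to claim 1, characterized in that, in the process of training the first machine learning model and the second machine learning model, only model parameters are transferred.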
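- The method according to claim 1, characterized in that the method further comprises: combining the first machine learning model and the second machine learning model to obtain a target machine learning model.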
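- A method for training a machine learning model, characterized in that the method comprises: acquiring at least two training sample sets, wherein the training samples in the training sample sets comprise medical images obtained by a medical scanning device scanning a scan object; and performing multiple rounds of model training based on each training sample set to obtain a machine learning model corresponding to each training sample set; wherein every two machine learning models among the at least two machine learning models have at least partly the same structure, and when one of the machine learning models is trained, the model parameters of the structurally identical part of another machine learning model are at least partially used.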
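- The method according to claim 21, characterized in that performing multiple rounds of model training based on each training sample set to obtain the machine learning model corresponding to each training sample set comprises: for each machine learning model, obtaining the model parameters of the current round, and performing model training based on the training sample set corresponding to the machine learning model and the model parameters of the current round to obtain the machine learning model; wherein the model parameters of the current round comprise initial model parameters or the model parameters of the structurally identical part of another machine learning model.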
- The method according to claim 21, characterized in that the training of each machine learning model is performed in an independent network.
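- The method according to claim 21, characterized in that the at least two machine learning models have the same structure and the same application.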
- The method according to claim 21, characterized in that the application of each machine learning model is different.
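- A device for training a machine learning model, characterized in that the device comprises: a sample set acquisition module, configured to acquire a first training sample set and a second training sample set, wherein the training samples in the first training sample set and the second training sample set comprise medical images obtained by a medical scanning device scanning a scan object; a first training module, configured to perform multiple rounds of model training based on the first training sample set to obtain a first machine learning model; and a second training module, configured to perform multiple rounds of model training based on the second training sample set to obtain a second machine learning model; wherein the first machine learning model and the second machine learning model have at least partly the same structure, the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.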
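- A device for training a machine learning model, characterized in that the device comprises: a sample set acquisition module, configured to acquire at least two training sample sets, wherein the training samples in the training sample sets comprise medical images obtained by a medical scanning device scanning a scan object; and a training module, configured to perform multiple rounds of model training based on each training sample set to obtain a machine learning model corresponding to each training sample set; wherein every two machine learning models among the at least two machine learning models have at least partly the same structure, and when one of the machine learning models is trained, the model parameters of the structurally identical part of another machine learning model are at least partially used.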
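- A computer device, comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method according to any one of claims 1 to 25 when executing the computer program.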
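- A computer-readable storage medium on which a computer program is stored, characterized in that the steps of the method according to any one of claims 1 to 25 are implemented when the computer program is executed by a processor.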
Priority Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP21949563.7A EP4343708A4 (en) | 2021-07-12 | 2021-07-12 | METHOD AND APPARATUS FOR TRAINING MACHINE LEARNING MODELS, COMPUTER DEVICE AND STORAGE MEDIUM |
| US18/579,328 US20240346374A1 (en) | 2021-07-12 | 2021-07-12 | Method and apparatus for training machine learning models, computer device, and storage medium |
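| CN202180098440.3A CN117355850A (zh) | 2021-07-12 | 2021-07-12 | Method and apparatus for training machine learning models, computer device, and storage medium |
| PCT/CN2021/105777 WO2023283765A1 (zh) | 2021-07-12 | 2021-07-12 | Method and apparatus for training machine learning models, computer device, and storage medium |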
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2021/105777 WO2023283765A1 (zh) | 2021-07-12 | 2021-07-12 | Method and apparatus for training machine learning models, computer device, and storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023283765A1 (zh) | 2023-01-19 |
Family
ID=84919822
Family Applications (1)
| Application Number | Status | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| PCT/CN2021/105777 WO2023283765A1 (zh) | Ceased | 2021-07-12 | 2021-07-12 | Method and apparatus for training machine learning models, computer device, and storage medium |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US20240346374A1 (zh) |
| EP (1) | EP4343708A4 (zh) |
| CN (1) | CN117355850A (zh) |
| WO (1) | WO2023283765A1 (zh) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN118245810A (zh) * | 2024-05-28 | 2024-06-25 | 北京壹永科技有限公司 | Method, apparatus, and electronic device for training a large language model |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP4490720A1 (en) * | 2022-04-21 | 2025-01-15 | Google LLC | Joint segmenting and automatic speech recognition |
| US12573495B2 (en) | 2022-12-30 | 2026-03-10 | Cilag Gmbh International | Surgical computing system with support for interrelated machine learning models |
| US20240221892A1 (en) * | 2022-12-30 | 2024-07-04 | Cilag Gmbh International | Surgical computing system with support for interrelated machine learning models |
| US12531156B2 (en) | 2022-12-30 | 2026-01-20 | Cilag Gmbh International | Method for advanced algorithm support |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20180218502A1 (en) * | 2017-01-27 | 2018-08-02 | Arterys Inc. | Automated segmentation utilizing fully convolutional networks |
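| CN110348436A (zh) * | 2019-06-19 | 2019-10-18 | 平安普惠企业管理有限公司 | Method for recognizing text information in an image and related device |
| CN110400251A (zh) * | 2019-06-13 | 2019-11-01 | 深圳追一科技有限公司 | Video processing method and apparatus, terminal device, and storage medium |
| CN112257738A (zh) * | 2020-07-31 | 2021-01-22 | 北京京东尚科信息技术有限公司 | Method and apparatus for training a machine learning model, and method and apparatus for image classification |
| CN112861892A (zh) * | 2019-11-27 | 2021-05-28 | 杭州海康威视数字技术股份有限公司 | Method and apparatus for determining attributes of a target in a picture |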
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11593634B2 (en) * | 2018-06-19 | 2023-02-28 | Adobe Inc. | Asynchronously training machine learning models across client devices for adaptive intelligence |
| CN112651510B (zh) * | 2019-10-12 | 2024-09-06 | 华为技术有限公司 | Model update method, working node, and model update system |
| US11604984B2 (en) * | 2019-11-18 | 2023-03-14 | Shanghai United Imaging Intelligence Co., Ltd. | Systems and methods for machine learning based modeling |
2021
- 2021-07-12: WO application PCT/CN2021/105777, published as WO2023283765A1 (zh); not active, Ceased
- 2021-07-12: US application US18/579,328, published as US20240346374A1 (en); active, Pending
- 2021-07-12: EP application EP21949563.7A, published as EP4343708A4 (en); active, Pending
- 2021-07-12: CN application CN202180098440.3A, published as CN117355850A (zh); active, Pending
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20180218502A1 (en) * | 2017-01-27 | 2018-08-02 | Arterys Inc. | Automated segmentation utilizing fully convolutional networks |
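| CN110400251A (zh) * | 2019-06-13 | 2019-11-01 | 深圳追一科技有限公司 | Video processing method and apparatus, terminal device, and storage medium |
| CN110348436A (zh) * | 2019-06-19 | 2019-10-18 | 平安普惠企业管理有限公司 | Method for recognizing text information in an image and related device |
| CN112861892A (zh) * | 2019-11-27 | 2021-05-28 | 杭州海康威视数字技术股份有限公司 | Method and apparatus for determining attributes of a target in a picture |
| CN112257738A (zh) * | 2020-07-31 | 2021-01-22 | 北京京东尚科信息技术有限公司 | Method and apparatus for training a machine learning model, and method and apparatus for image classification |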
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4343708A4 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN117355850A (zh) | 2024-01-05 |
| EP4343708A4 (en) | 2024-08-07 |
| US20240346374A1 (en) | 2024-10-17 |
| EP4343708A1 (en) | 2024-03-27 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2023283765A1 (zh) | | Method and apparatus for training machine learning models, computer device, and storage medium |
| Foley et al. | | OpenFL: the open federated learning library |
| CN109754447B (zh) | | Image generation method, apparatus, device, and storage medium |
| US11328412B2 (en) | | Hierarchical learning of weights of a neural network for performing multiple analyses |
| Tolu‐Akinnawo et al. | | Advancements in artificial intelligence in noninvasive cardiac imaging: a comprehensive review |
| JP2021056995A (ja) | | Medical information processing apparatus, medical information processing system, and medical information processing method |
| US20220130525A1 (en) | | Artificial intelligence orchestration engine for medical studies |
| CN109567852B (zh) | | Method for determining a scan range, method for acquiring a medical image, apparatus, and device |
| US20210177261A1 (en) | | System, method, and computer-accessible medium for magnetic resonance value driven autonomous scanner |
| CN107330951A (zh) | | Image reconstruction system and method |
| CN105279364A (zh) | | Protocol management system |
| Saldanha et al. | | Swarm learning with weak supervision enables automatic breast cancer detection in magnetic resonance imaging |
| US10296713B2 (en) | | Method and system for reviewing medical study data |
| CN114596304A (zh) | | Method for generating an image detection model, image detection method, and computer device |
| Pacheco et al. | | Pilot deployment of a cloud-based universal medical image repository in a large public health system: A protocol study |
| ElBedoui et al. | | SoK: federated learning and unlearning for medical image analysis |
| CN114723723B (zh) | | Medical image processing method, computer device, and storage medium |
| CN113742506B (zh) | | Image display method and computer device |
| US20240145068A1 (en) | | Medical image analysis platform and associated methods |
| CN114565530B (zh) | | Image reconstruction method and apparatus, computer device, and storage medium |
| CN114913260B (zh) | | Image reconstruction method and apparatus, computer device, and storage medium |
| CN117635511A (zh) | | Medical image processing method and apparatus, computer device, and storage medium |
| CN108511052A (zh) | | Method for determining a projection data set and projection determination system |
| JP7216660B2 (ja) | | Device, system, and method for determining a reading environment by synthesizing downstream needs |
| KR102665091B1 (ko) | | Apparatus and method for processing medical information |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | WWE | WIPO information: entry into national phase | Ref document number: 202180098440.3; Country of ref document: CN |
| | WWE | WIPO information: entry into national phase | Ref document number: 2021949563; Country of ref document: EP. Ref document number: 21949563; Country of ref document: EP |
| | WWE | WIPO information: entry into national phase | Ref document number: 18579328; Country of ref document: US |
| | ENP | Entry into the national phase | Ref document number: 2021949563; Country of ref document: EP; Effective date: 20231221 |
| | NENP | Non-entry into the national phase | Ref country code: DE |