WO2023283765A1 - Training method and apparatus for machine learning model, computer device, and storage medium - Google Patents

Training method and apparatus for machine learning model, computer device, and storage medium

Info

Publication number
WO2023283765A1
Authority
WO
WIPO (PCT)
Prior art keywords
machine learning
model
learning model
training
sample set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2021/105777
Other languages
English (en)
French (fr)
Inventor
倪成
刘润鑫
章卫
张康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai United Imaging Healthcare Co Ltd
Original Assignee
Shanghai United Imaging Healthcare Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai United Imaging Healthcare Co Ltd
Priority to EP21949563.7A (EP4343708A4)
Priority to US18/579,328 (US20240346374A1)
Priority to CN202180098440.3A (CN117355850A)
Priority to PCT/CN2021/105777 (WO2023283765A1)
Publication of WO2023283765A1
Anticipated expiration
Ceased (current legal status)

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/20Ensemble learning
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00ICT specially adapted for the handling or processing of medical images
    • G16H30/40ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Definitions

  • the present application relates to the technical field of model training, in particular to a training method, device, computer equipment and storage medium for a machine learning model.
  • training a machine learning model requires a large number of training samples.
  • because medical images involve patient privacy and data security, they cannot be shared between hospitals; as a result, machine learning models have few training samples and poor accuracy.
  • a training method for a machine learning model comprising:
  • the training samples in the first training sample set and the second training sample set include medical images obtained by scanning a scanned object with a medical scanning device;
  • the first machine learning model and the second machine learning model have at least partly the same structure; the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
  • performing multiple rounds of model training based on the first training sample set to obtain the first machine learning model includes:
  • an Nth round of model training is performed to obtain the first machine learning model, where N is a positive integer greater than 1;
  • the training of the first machine learning model ends.
  • the first machine learning model and the second machine learning model have partly the same structure, and performing the Nth round of model training based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model includes:
  • the Nth round of model training is performed to obtain the first machine learning model.
  • the first machine learning model and the second machine learning model have entirely the same structure, and performing the Nth round of model training based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model includes:
  • the Nth round of model training is performed to obtain the first machine learning model.
  • the method also includes:
  • the (N+1)th round of training of the first machine learning model is performed.
  • performing multiple rounds of model training based on the second training sample set to obtain the second machine learning model includes:
  • an Mth round of model training is performed to obtain the second machine learning model, where M is a positive integer greater than 0;
  • the first machine learning model and the second machine learning model have partly the same structure, and performing the Mth round of model training based on the second training sample set and at least part of the model parameters of the current first machine learning model to obtain the second machine learning model includes:
  • a first round of training is performed based on the second training sample set, the model parameters of the structurally identical part of the first initial model, and the second initial model parameters to obtain a second initial model; at least part of the parameters of the second initial model are used to train the first machine learning model;
  • the first machine learning model and the second machine learning model have entirely the same structure, and performing the Mth round of model training based on the second training sample set and the model parameters of the current first machine learning model to obtain the second machine learning model includes:
  • an Mth round of model training is performed based on the second training sample set and all model parameters of the current first machine learning model to obtain the second machine learning model.
  • the method also includes:
  • the (M+1)th round of training of the second machine learning model is performed.
  • the model index includes the accuracy rate of the output result
  • the first preset index includes the first preset accuracy rate
  • the second preset index includes the second preset accuracy rate
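The index check described above can be sketched as follows. This is a minimal illustration, assuming that accuracy is the fraction of outputs matching reference labels and that a further round is triggered whenever the accuracy falls below the preset value; the function names and the sample values are hypothetical and do not come from the application.

```python
def accuracy(predictions, labels):
    """Fraction of model outputs that match the reference labels."""
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


def needs_another_round(predictions, labels, preset_accuracy):
    """True if the model index does not yet meet the preset index."""
    return accuracy(predictions, labels) < preset_accuracy


# Hypothetical outputs of one model on a small validation set.
predictions = [1, 0, 1, 1]
labels = [1, 0, 0, 1]

current_accuracy = accuracy(predictions, labels)
continue_training = needs_another_round(predictions, labels, preset_accuracy=0.9)
```

Under this reading, the first and second models would each be checked against their own preset accuracy, so one model may stop training while the other continues.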
  • the method also includes:
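  • in each round of training, if it is determined according to a preset loss function that the output result of the machine learning model does not meet a preset convergence condition, a batch gradient algorithm is used to determine the descent gradient and training continues;
  • once it is determined that the output result of the machine learning model meets the preset convergence condition, the current round of training is stopped.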
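A single training round of this kind might look like the sketch below. It assumes, purely for illustration, a one-parameter linear model y = w * x, a mean-squared-error loss, and a convergence condition of the loss falling below a threshold; the application does not specify the model, the loss function, or the convergence condition, and all names here are hypothetical.

```python
def batch_gradient_round(w, samples, lr=0.1, tol=1e-6, max_steps=10_000):
    """Run one training round: full-batch gradient descent until convergence.

    Assumptions (not from the source): model y = w * x, MSE loss,
    convergence when the loss drops below `tol`.
    """
    def loss(weight):
        return sum((weight * x - y) ** 2 for x, y in samples) / len(samples)

    steps = 0
    while loss(w) >= tol and steps < max_steps:
        # The gradient is computed over the whole batch of samples.
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= lr * grad
        steps += 1
    return w, loss(w), steps


samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w, final_loss, steps = batch_gradient_round(0.0, samples)
```

The `max_steps` cap is a safeguard added for the sketch so the loop terminates even if the convergence condition is never met.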
  • the acquisition of the first training sample set and the second training sample set includes:
  • the first hospital and the second hospital are different hospitals.
  • the first machine learning model and the second machine learning model include at least one of a dose prediction model, an automatic delineation model, a curative effect evaluation model, a survival index evaluation model, a cancer screening model, and a deformation registration model.
  • model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through the network
  • model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a storage medium.
  • the training of the first machine learning model and the training of the second machine learning model are respectively performed in two independent networks.
  • the training of the first machine learning model and the training of the second machine learning model are performed alternately.
  • the first machine learning model and the second machine learning model have the same structure and the same application.
  • the applications of the first machine learning model and the second machine learning model are different.
  • the method also includes:
  • a training method for a machine learning model comprising:
  • the training samples in the training sample set include medical images obtained by scanning a scanned object with a medical scanning device;
  • every two of the at least two machine learning models have at least partly the same structure, and training one of them at least partly uses the model parameters of the structurally identical part of the other.
  • multiple rounds of model training are performed based on each training sample set to obtain a machine learning model corresponding to each training sample set, including:
  • for each machine learning model, the model parameters of the current round are obtained, and model training is performed based on the training sample set corresponding to the machine learning model and the model parameters of the current round to obtain the machine learning model; the model parameters of the current round include initial model parameters or the model parameters of the structurally identical part of another machine learning model.
  • each machine learning model is trained on an independent network.
  • At least two machine learning models have the same structure and the same application.
  • the application of each machine learning model is different.
  • a training device for a machine learning model comprising:
  • a sample set acquisition module, configured to acquire the first training sample set and the second training sample set; the training samples in the first training sample set and the second training sample set include medical images obtained by scanning a scanned object with a medical scanning device;
  • a first training module, configured to perform multiple rounds of model training based on the first training sample set to obtain the first machine learning model;
  • a second training module, configured to perform multiple rounds of model training based on the second training sample set to obtain the second machine learning model;
  • the first machine learning model and the second machine learning model have at least partly the same structure; the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
  • the first training module is specifically configured to: perform the first round of model training based on the first initial model parameters and the first training sample set to obtain the first initial model, where at least part of the model parameters of the first initial model are used for training the second machine learning model; perform the Nth round of model training based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model, where N is a positive integer greater than 1; and end the training of the first machine learning model if it is determined that the model index of the first machine learning model satisfies the first preset index.
  • the first machine learning model and the second machine learning model have partly the same structure, and the first training module is specifically configured to perform the Nth round of model training based on the first training sample set, the model parameters of the structurally identical part of the current second machine learning model, and part of the model parameters of the first machine learning model obtained in the previous round of training, to obtain the first machine learning model.
  • the first machine learning model and the second machine learning model have entirely the same structure, and the first training module is specifically configured to perform the Nth round of model training based on the first training sample set and all model parameters of the current second machine learning model to obtain the first machine learning model.
  • the first training module is further configured to perform the (N+1)th round of training of the first machine learning model if it is determined that the model index of the first machine learning model does not meet the first preset index.
  • the second training module is specifically configured to: perform the Mth round of model training based on the second training sample set and at least part of the model parameters of the current first machine learning model to obtain the second machine learning model, where M is a positive integer greater than 0; and end the training of the second machine learning model if the model index of the second machine learning model satisfies the second preset index.
  • the first machine learning model and the second machine learning model have partly the same structure, and the second training module is specifically configured to: perform the first round of training based on the second training sample set, the model parameters of the structurally identical part of the first initial model, and the second initial model parameters to obtain the second initial model, where at least part of the parameters of the second initial model are used to train the first machine learning model; and continue model training based on the second training sample set, the model parameters of the structurally identical part of the current first machine learning model, and part of the model parameters of the second machine learning model obtained in the previous round of training, to obtain the second machine learning model.
  • the first machine learning model and the second machine learning model have entirely the same structure, and the second training module is specifically configured to perform the Mth round of model training based on the second training sample set and all model parameters of the current first machine learning model to obtain the second machine learning model.
  • the second training module is further configured to perform the (M+1)th round of training of the second machine learning model if it is determined that the model index of the second machine learning model does not meet the second preset index.
  • the model index includes the accuracy rate of the output result
  • the first preset index includes the first preset accuracy rate
  • the second preset index includes the second preset accuracy rate
  • the device also includes:
  • a gradient descent module, configured to, for each round of training: if it is determined according to the preset loss function that the output result of the machine learning model does not meet the preset convergence condition, use the batch gradient algorithm to determine the descent gradient and continue training; and stop the current round of training once it is determined that the output result of the machine learning model meets the preset convergence condition.
  • the sample set acquisition module is specifically configured to acquire medical images of the first hospital and generate the first training sample set based on the medical images of the first hospital, and to acquire medical images of the second hospital and generate the second training sample set based on the medical images of the second hospital, where the first hospital and the second hospital are different hospitals.
  • the first machine learning model and the second machine learning model include at least one of a dose prediction model, an automatic delineation model, a curative effect evaluation model, a survival index evaluation model, a cancer screening model, and a deformation registration model.
  • model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through the network
  • model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a storage medium.
  • the training of the first machine learning model and the training of the second machine learning model are respectively performed in two independent networks.
  • the training of the first machine learning model and the training of the second machine learning model are performed alternately.
  • the first machine learning model and the second machine learning model have the same structure and the same application.
  • the applications of the first machine learning model and the second machine learning model are different.
  • the device also includes:
  • the combination processing module is used to combine the first machine learning model and the second machine learning model to obtain the target machine learning model.
  • a training device for a machine learning model comprising:
  • a sample set acquisition module, configured to acquire at least two training sample sets; the training samples in the training sample sets include medical images obtained by scanning a scanned object with a medical scanning device;
  • a training module, configured to perform multiple rounds of model training based on each training sample set to obtain a machine learning model corresponding to each training sample set;
  • every two of the at least two machine learning models have at least partly the same structure, and training one of them at least partly uses the model parameters of the structurally identical part of the other.
  • the training module is configured to, for each machine learning model, obtain the model parameters of the current round and perform model training based on the training sample set corresponding to the machine learning model and the model parameters of the current round to obtain the machine learning model;
  • the model parameters of the current round include initial model parameters or the model parameters of the structurally identical part of another machine learning model.
  • each machine learning model is trained on an independent network.
  • At least two machine learning models have the same structure and the same application.
  • the application of each machine learning model is different.
  • a computer device comprising a memory and a processor, the memory stores a computer program, and the processor implements the following steps when executing the computer program:
  • the training samples in the first training sample set and the second training sample set include medical images obtained by scanning a scanned object with a medical scanning device;
  • the first machine learning model and the second machine learning model have at least partly the same structure; the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:
  • the training samples in the first training sample set and the second training sample set include medical images obtained by scanning a scanned object with a medical scanning device;
  • the first machine learning model and the second machine learning model have at least partly the same structure; the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
  • with the above training method, device, computer equipment, and storage medium for a machine learning model, the first training sample set and the second training sample set are acquired; multiple rounds of model training are performed based on the first training sample set to obtain the first machine learning model; and multiple rounds of model training are performed based on the second training sample set to obtain the second machine learning model.
  • since the first machine learning model and the second machine learning model have parts with the same structure, those parts can use the same model parameters. Training the first machine learning model therefore at least partly uses the model parameters of the second machine learning model, and training the second machine learning model at least partly uses the model parameters of the first machine learning model.
  • the training of the first machine learning model does not use the second training sample set, and the training of the second machine learning model does not use the first training sample set, which can ensure the data security of the training samples; meanwhile, using the model parameters of the second machine learning model when training the first machine learning model, and the model parameters of the first machine learning model when training the second machine learning model, can improve the model training speed and the accuracy of the machine learning models.
  • Fig. 1 is a diagram of the application environment of a training method for a machine learning model in an embodiment;
  • Fig. 2 is a schematic flowchart of a training method for a machine learning model in an embodiment;
  • Fig. 3a is a schematic structural diagram of the first machine learning model in an embodiment;
  • Fig. 3b is a schematic structural diagram of the second machine learning model in an embodiment;
  • Fig. 4 is a first schematic flowchart of the step of performing multiple rounds of model training based on the first training sample set in an embodiment;
  • Fig. 5 is a second schematic flowchart of the step of performing multiple rounds of model training based on the first training sample set in an embodiment;
  • Fig. 6 is a first schematic flowchart of the step of performing multiple rounds of model training based on the second training sample set in an embodiment;
  • Fig. 7 is a second schematic flowchart of the step of performing multiple rounds of model training based on the second training sample set in an embodiment;
  • Fig. 8 is a schematic flowchart of alternately training the first machine learning model and the second machine learning model in an embodiment;
  • Fig. 9 is a schematic flowchart of a training method for a machine learning model in another embodiment;
  • Fig. 10 is a structural block diagram of a training device for a machine learning model in an embodiment;
  • Fig. 11 is a structural block diagram of a training device for a machine learning model in another embodiment;
  • Fig. 12 is a diagram of the internal structure of a computer device in an embodiment.
  • this application provides a training scheme for a machine learning model, including: acquiring a first training sample set and a second training sample set; performing multiple rounds of model training based on the first training sample set to obtain a first machine learning model; and performing multiple rounds of model training based on the second training sample set to obtain a second machine learning model.
  • the first machine learning model and the second machine learning model have at least partly the same structure; the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
  • training the first machine learning model does not use the second training sample set, and training the second machine learning model does not use the first training sample set, which can ensure the data security of the training samples; meanwhile, using the model parameters of the second machine learning model when training the first machine learning model, and the model parameters of the first machine learning model when training the second machine learning model, can improve the model training speed and the accuracy of the machine learning models. This addresses the prior-art problem of few training samples and poor accuracy of machine learning models.
  • the training method of the machine learning model provided in this application can be applied to the application environment shown in Fig. 1.
  • the application environment may include a model training system that includes multiple model training terminals 101, which may communicate through a network.
  • the model training terminal 101 may be a terminal connected to the medical scanning device 102.
  • the terminal may be, but is not limited to, a personal computer, a notebook computer, or a tablet computer.
  • the medical scanning device 102 may be a single-modality device or a multi-modality device, such as, but not limited to, DR (digital radiography, i.e., digital X-ray radiography) equipment, CT (computed tomography) equipment, CBCT (cone beam computed tomography) equipment, PET (positron emission tomography) equipment, MRI (magnetic resonance imaging) equipment, ultrasound equipment, PET-CT equipment, PET-MR equipment, RT (radiotherapy) equipment, CT-RT equipment, and MR-RT equipment.
  • the model training terminal 101 can also be a PACS (Picture Archiving and Communication Systems, image archiving and communication system) server.
  • the above PACS server can be realized by an independent server or a server cluster composed of multiple servers.
  • a training method for a machine learning model is provided; taking its application to the model training system in Fig. 1 as an example, the method includes the following steps:
  • Step 201: acquire a first training sample set and a second training sample set.
  • the training samples in the first training sample set and the second training sample set include medical images obtained by scanning a scanned object with a medical scanning device.
  • the medical image can be a two-dimensional image or a three-dimensional image.
  • the training sample sets satisfy data diversity and label consistency and have the same data structure.
  • the model training system can use medical images in the same hospital for model training, or use medical images in different hospitals for model training.
  • when the model training system uses medical images from the same hospital for model training, the model training terminal generates the first training sample set and the second training sample set from multiple medical images, where the two training sample sets are used for different training tasks.
  • for example, the model training terminal obtains 100 CT images from the same CT device and divides them into two image sets to obtain the first training sample set and the second training sample set, where the first training sample set is used for the training task of the dose prediction model and the second training sample set is used for the training task of the automatic delineation model.
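The same-hospital example above can be illustrated with a short sketch. It assumes an even, shuffled split of image identifiers; the application does not state how the 100 images are divided, so the split size, the seed, and the identifier format are all hypothetical.

```python
import random


def split_images(image_ids, first_size, seed=0):
    """Divide image identifiers into two disjoint training sample sets."""
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = list(image_ids)
    rng.shuffle(shuffled)
    return shuffled[:first_size], shuffled[first_size:]


# Hypothetical identifiers for 100 CT images from one CT device.
image_ids = [f"ct_{i:03d}" for i in range(100)]

# First set: dose prediction task; second set: automatic delineation task.
dose_set, delineation_set = split_images(image_ids, first_size=50)
```

Keeping the two sets disjoint mirrors the requirement that each model is trained only on its own training sample set.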
  • when the model training system uses medical images from different hospitals for model training, the first model training terminal in the model training system acquires medical images of the first hospital and generates the first training sample set based on them; the second model training terminal in the model training system acquires medical images of the second hospital and generates the second training sample set based on them; the first hospital and the second hospital are different hospitals.
  • the model training terminal A1 obtains CT images from the hospital B1 and generates a first training sample set; the model training terminal A2 obtains CT images from the hospital B2 and generates a second training sample set.
  • the embodiment of the present disclosure does not limit the manner of obtaining the training sample set.
  • Step 202: perform multiple rounds of model training based on the first training sample set to obtain a first machine learning model.
  • Step 203: perform multiple rounds of model training based on the second training sample set to obtain a second machine learning model.
  • when the model training system uses medical images from the same hospital for model training, it can train the first machine learning model and the second machine learning model on the same model training terminal or on different model training terminals.
  • when the model training system uses medical images from different hospitals for model training, it trains the first machine learning model and the second machine learning model on different model training terminals.
  • the first machine learning model and the second machine learning model have at least part of the same structure; the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
  • for example, the first machine learning model shown in Figure 3a and the second machine learning model shown in Figure 3b have parts with the same structure, so those parts can use the same model parameters. In this way, the model parameters of the second machine learning model are at least partly used when training the first machine learning model, and the model parameters of the first machine learning model are at least partly used when training the second machine learning model.
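  • the reuse of parameters for the structurally identical parts can be illustrated with a minimal sketch. The dictionary representation of a model, the layer names (`encoder`, `dose_head`, `contour_head`) and the helper functions are hypothetical and chosen only for illustration; the disclosure does not prescribe any particular data structure.

```python
def shared_keys(params_a, params_b):
    """Return names of layers present in both models with the same size.

    Assumption: a layer counts as "structurally identical" when it has the
    same name and the same number of parameters in both models.
    """
    return sorted(
        name for name in params_a
        if name in params_b and len(params_a[name]) == len(params_b[name])
    )


def import_shared(target, source):
    """Return a copy of `target` whose shared layers take `source`'s values."""
    merged = {name: list(values) for name, values in target.items()}
    for name in shared_keys(target, source):
        merged[name] = list(source[name])
    return merged


# Hypothetical parameters: both models share an "encoder" part only.
first = {"encoder": [0.1, 0.2], "dose_head": [0.5]}
second = {"encoder": [0.7, 0.9], "contour_head": [0.3, 0.4]}

# The first model takes over the encoder of the second; its own head is kept.
updated_first = import_shared(first, second)
```

  • only parameter values cross between the two models in this sketch; no training sample is read or copied.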
  • the first model training end of the model training system performs a round of model training to obtain model parameters of the first machine learning model, and then transfers the model parameters of the first machine learning model to the second model training end of the model training system.
  • the second model training end uses the model parameters of the first machine learning model to conduct a round of training of the second machine learning model; after the training, the model parameters of the second machine learning model are passed to the first model training end.
  • the first model training end uses the model parameters of the second machine learning model to conduct another round of training of the first machine learning model; after the training, the model parameters of the first machine learning model are passed to the second model training end. And so on, until the training of the first machine learning model and the second machine learning model is completed.
  • model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a network; or, the model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a storage medium.
  • the first model training terminal and the second model training terminal send the model parameters through the network, or the user uses a storage medium such as a mobile hard disk to copy the model parameters to realize the transfer of the model parameters. Embodiments of the present disclosure do not limit this.
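  • a transfer through a storage medium can be sketched as writing the parameters to a file on one training end and reading that file on the other. The JSON file format, the file name and the function names are assumptions made for this example; a temporary directory stands in for the removable medium.

```python
import json
import os
import tempfile


def export_params(params, path):
    """Write model parameters to a file on the storage medium."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(params, f)


def import_params(path):
    """Read model parameters back from the storage medium."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# A temporary directory plays the role of the mobile hard disk here.
with tempfile.TemporaryDirectory() as medium:
    path = os.path.join(medium, "first_model_params.json")
    export_params({"encoder": [0.1, 0.2]}, path)   # first training end
    received = import_params(path)                 # second training end
```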
  • the training of the first machine learning model does not use the second training sample set, and the training of the second machine learning model does not use the first training sample set, which can ensure the data security of the training samples; at the same time, the training of the first machine learning model uses the model parameters of the second machine learning model, and the training of the second machine learning model uses the model parameters of the first machine learning model, which can improve the model training speed and the accuracy of the machine learning model.
  • the first machine learning model and the second machine learning model include at least one of a dose prediction model, an automatic delineation model, a curative effect evaluation model, a survival index evaluation model, a cancer screening model, and a deformation registration model .
  • the first machine learning model is a dose prediction model
  • the second machine learning model is an automatic delineation model
  • the first machine learning model is a curative effect evaluation model
  • the second machine learning model is a survival index evaluation model.
  • the embodiment of the present disclosure does not limit the model types of the first machine learning model and the second machine learning model.
  • the training samples are limited, and the accuracy of the application model obtained by using the limited samples for model training in the prior art is limited.
  • the training method of this embodiment obtains a model with higher accuracy by repeatedly using limited samples to iteratively train the machine learning model, and better technical effects can be obtained when it is applied.
  • the automatic delineation model obtained by using the training method of this embodiment has a better delineation effect on the region of interest;
  • compared with the dose prediction model obtained by training with few samples, the dose prediction model obtained by using the training method of this embodiment can predict the dose more accurately.
  • the first training sample set and the second training sample set are obtained; multiple rounds of model training are performed based on the first training sample set to obtain the first machine learning model; multiple rounds are performed based on the second training sample set Model training to obtain a second machine learning model. Since the first machine learning model and the second machine learning model have parts with the same structure, the parts with the same structure can use the same model parameters.
  • when training the first machine learning model, the model parameters of the second machine learning model are at least partly used, and when training the second machine learning model, the model parameters of the first machine learning model are at least partly used.
  • the training of the first machine learning model does not use the second training sample set, and the training of the second machine learning model does not use the first training sample set, which can ensure the data security of the training samples; at the same time, training the first machine learning model using the model parameters of the second machine learning model, and training the second machine learning model using the model parameters of the first machine learning model, can improve the model training speed and the accuracy of the machine learning model.
  • the above-mentioned step of performing multiple rounds of model training based on the first training sample set to obtain the first machine learning model may include:
  • Step 301 Perform a first round of model training based on the first initial model parameters and the first training sample set to obtain a first initial model.
  • the first initial model parameter may be a random model parameter, or a model parameter assigned by a user, which is not limited in this embodiment of the present disclosure.
  • after the model training end obtains the first initial model parameters and the first training sample set, it performs the first round of model training of the first machine learning model to obtain the first initial model.
  • the model parameters of the first initial model are used to train the second machine learning model.
  • Step 302 Perform the Nth round of model training based on the first training sample set and at least some model parameters of the current second machine learning model to obtain the first machine learning model, where N is a positive integer greater than 1.
  • the first machine learning model and the second machine learning model can be trained alternately. After the model training end obtains the first initial model, it uses the model parameters of the first initial model to perform the first round of training of the second machine learning model.
  • the model training end performs the Nth round of training of the first machine learning model.
  • the process of performing the Nth round of model training based on the first training sample set and at least some model parameters of the current second machine learning model to obtain the first machine learning model may include: performing the Nth round of model training based on the first training sample set, the model parameters of the same structural part of the current second machine learning model, and some model parameters of the first machine learning model obtained from the previous round of training, to obtain the first machine learning model.
  • the model training end performs the second round of model training of the first machine learning model based on the first training sample set, some model parameters of the first initial model, and the model parameters of the same structure obtained by the first round of training of the second machine learning model, to obtain the first machine learning model, and performs step 303 or step 304.
  • the model training end performs the third round of model training of the first machine learning model based on the first training sample set, the first machine learning model from the second round of training, and the model parameters of the same structure obtained from the second round of training of the second machine learning model, to obtain the first machine learning model, and performs step 303 or step 304.
  • the process of performing the Nth round of model training based on the first training sample set and at least some model parameters of the current second machine learning model to obtain the first machine learning model may alternatively include: performing the Nth round of model training based on the first training sample set and all model parameters of the current second machine learning model to obtain the first machine learning model.
  • Step 303 If it is determined that the model index of the first machine learning model satisfies the first preset index, the training of the first machine learning model is ended.
  • the model index includes the accuracy rate of the output result
  • the first preset index includes the first preset accuracy rate
  • the model training end calculates the accuracy rate of the output result of the first machine learning model, and compares it with the first preset accuracy rate. If the accuracy rate of the output result of the first machine learning model is greater than the first preset accuracy rate, it is determined that the model index of the first machine learning model meets the first preset index; if the accuracy rate is less than or equal to the first preset accuracy rate, it is determined that the model index of the first machine learning model does not meet the first preset index.
  • the above-mentioned process of calculating the accuracy rate of the output result of the first machine learning model may include: inputting a preset number of test samples into the first machine learning model to obtain the output result corresponding to each test sample; counting the number of output results that are consistent with the labels of the test samples; and calculating the ratio between that number and the preset number to obtain the accuracy rate of the output result.
  • the test samples can be selected from the first training sample set, or can be obtained in the same way as the first training sample set, which is not limited in this embodiment of the present disclosure, and the preset number is also not limited in the embodiment of the present disclosure .
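  • the accuracy calculation and the comparison with the preset accuracy rate can be sketched as follows. The toy threshold model, the sample values and the function names are hypothetical; only the counting rule and the strict "greater than" comparison are taken from the description above.

```python
def output_accuracy(model, test_samples):
    """Ratio of outputs that match the test-sample labels."""
    correct = sum(1 for x, label in test_samples if model(x) == label)
    return correct / len(test_samples)


def meets_preset_index(accuracy, preset_accuracy):
    """The index is met only when accuracy is strictly greater than the preset."""
    return accuracy > preset_accuracy


def toy_model(x):
    """Hypothetical stand-in for a trained model: a fixed threshold classifier."""
    return x >= 0.5


# Hypothetical (input, label) test samples; the third one is misclassified.
test_samples = [(0.9, True), (0.2, False), (0.6, False), (0.1, False)]
acc = output_accuracy(toy_model, test_samples)
```

  • with these samples three of the four outputs match their labels, so `acc` is 0.75; `meets_preset_index(acc, 0.75)` is then false because equality does not satisfy the preset index.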
  • the embodiment of the present disclosure may further include the following steps:
  • step 304 if it is determined that the model index of the first machine learning model does not satisfy the first preset index, perform the N+1th round of training of the first machine learning model.
  • taking N equal to 2 as an example, if the model index of the first machine learning model does not meet the first preset index, at least part of the model parameters obtained by the second round of training of the second machine learning model are obtained, and then, based on the first training sample set, the first machine learning model obtained in the second round of training, and at least part of the model parameters obtained in the second round of training of the second machine learning model, the third round of training is performed on the first machine learning model, and step 303 or step 304 is performed.
  • taking N equal to 3 as an example, if the model index of the first machine learning model does not meet the first preset index, at least part of the model parameters obtained from the third round of training of the second machine learning model are obtained, and then, based on the first training sample set, the first machine learning model obtained in the third round of training, and at least part of the model parameters obtained in the third round of training of the second machine learning model, the fourth round of training is performed on the first machine learning model, and step 303 or step 304 is performed.
  • the batch gradient descent method is used to determine the descent gradient and training continues; when it is determined that the output result of the machine learning model meets the preset convergence condition, the current round of training is stopped.
  • the batch gradient descent method is used to determine the descent gradient and training continues; when it is determined that the output result of the first machine learning model meets the preset convergence condition, the first round of training is stopped.
  • the preset loss function and batch gradient descent method are also used for model training. The embodiment of the present disclosure does not limit the preset loss function and the preset convergence condition.
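  • one round of training with a preset loss function, batch gradient descent and a preset convergence condition can be sketched as below. The one-parameter linear model, the mean squared error loss, the learning rate and the convergence tolerance are all assumptions for illustration, since the disclosure leaves the loss function and the convergence condition open.

```python
def train_one_round(w, samples, lr=0.1, tol=1e-9, max_steps=10_000):
    """Batch gradient descent on y = w * x until the loss stops changing.

    Assumed convergence condition: the change in mean squared error
    between two consecutive steps falls below `tol`.
    """
    def loss(weight):
        return sum((weight * x - y) ** 2 for x, y in samples) / len(samples)

    previous = loss(w)
    for _ in range(max_steps):
        # The gradient is computed over the whole batch of samples.
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= lr * grad
        current = loss(w)
        if abs(previous - current) < tol:
            break  # convergence condition met: stop the current round
        previous = current
    return w


# Hypothetical samples generated by y = 2 * x.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = train_one_round(0.0, samples)
```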
  • the first round of model training is performed based on the first initial model parameters and the first training sample set to obtain the first initial model; the Nth round of model training is performed based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model; if it is determined that the model index of the first machine learning model meets the first preset index, the training of the first machine learning model is ended; if it is determined that the model index of the first machine learning model does not meet the first preset index, the N+1th round of training of the first machine learning model is performed.
  • the first machine learning model that meets the first preset index can be trained by using at least part of the model parameters of the second machine learning model without using the second training sample set.
  • in this process, the data security of the training samples can be ensured, and the model training speed and model accuracy can also be improved.
  • the above-mentioned step of performing multiple rounds of model training based on the second training sample set to obtain the second machine learning model may include:
  • Step 401 Perform the Mth round of model training based on the second training sample set and at least some model parameters of the current first machine learning model to obtain a second machine learning model, where M is a positive integer greater than 0.
  • the model training end After training the first machine learning model to obtain the first initial model, the model training end performs the M-th round of training of the second machine learning model based on the second training sample set and at least part of the model parameters of the first initial model.
  • the process of performing the Mth round of model training based on the second training sample set and the model parameters of the current first machine learning model to obtain the second machine learning model may include: performing a first round of training based on the second training sample set, the model parameters of the same part of the first initial model, and the second initial model parameters to obtain the second initial model, wherein at least part of the model parameters of the second initial model are used to train the first machine learning model; then, continuing model training based on the second training sample set, the model parameters of the same part of the current first machine learning model, and some model parameters of the second machine learning model obtained from the previous round of training, to obtain the second machine learning model.
  • the model training end performs the second round of model training of the second machine learning model based on the second training sample set, the second machine learning model obtained from the first round of training, and at least part of the model parameters obtained from the second round of training of the first machine learning model, to obtain the second machine learning model, and then performs step 402 or step 403.
  • the model training end performs the third round of model training of the second machine learning model based on the second training sample set, the second machine learning model obtained from the second round of training, and at least part of the model parameters obtained from the third round of training of the first machine learning model, to obtain the second machine learning model, and then performs step 402 or step 403.
  • the process of performing the Mth round of model training based on the second training sample set and the model parameters of the current first machine learning model to obtain the second machine learning model may alternatively include: performing the Mth round of model training based on the second training sample set and all model parameters of the current first machine learning model to obtain the second machine learning model.
  • Step 402 If the model index of the second machine learning model satisfies the second preset index, the training of the second machine learning model is ended.
  • the model index of the second machine learning model is calculated, and it is determined whether the model index meets the second preset index; if the second preset index is met, the training of the second machine learning model is ended.
  • the model index includes the accuracy rate of the output result
  • the second preset index includes the second preset accuracy rate
  • the accuracy rate of the output result of the second machine learning model can be calculated and compared with the second preset accuracy rate. If the accuracy rate of the output result of the second machine learning model is greater than the second preset accuracy rate, it is determined that the model index of the second machine learning model satisfies the second preset index; if the accuracy rate is less than or equal to the second preset accuracy rate, it is determined that the model index of the second machine learning model does not meet the second preset index.
  • the above-mentioned process of calculating the accuracy rate of the output result of the second machine learning model may include: inputting a preset number of test samples into the second machine learning model to obtain the output result corresponding to each test sample; counting the number of output results that are consistent with the labels of the test samples; and calculating the ratio between that number and the preset number to obtain the accuracy rate of the output result.
  • the test samples can be selected from the second training sample set, or can be obtained in the same way as the second training sample set, which is not limited in this embodiment of the present disclosure, and the preset number is also not limited in the embodiment of the present disclosure .
  • the foregoing first preset index and the second preset index may be the same preset index, or may be different preset indexes, which are not limited in this embodiment of the present disclosure.
  • the embodiment of the present disclosure may further include the following steps:
  • Step 403 if it is determined that the model index of the second machine learning model does not meet the second preset index, perform the M+1th round of training of the second machine learning model.
  • if the model index of the second machine learning model does not meet the second preset index, the model parameters obtained from the third round of training of the first machine learning model are obtained, and then, based on the second training sample set, the second machine learning model obtained in the second round of training, and the model parameters obtained in the third round of training of the first machine learning model, a third round of training is performed on the second machine learning model.
  • the model parameters obtained by the fourth round of training of the first machine learning model are obtained, and then, based on the second training sample set, the second machine learning model obtained in the third round of training, and the model parameters obtained in the fourth round of training of the first machine learning model, the fourth round of training is performed on the second machine learning model.
  • the model index of the second machine learning model satisfies the second preset index.
  • the batch gradient descent method is used to determine the descent gradient and training continues; when it is determined that the output result of the machine learning model meets the preset convergence condition, the current round of training is stopped.
  • the batch gradient descent method is used to determine the descent gradient and training continues; when it is determined that the output result of the second machine learning model meets the preset convergence condition, the first round of training is stopped.
  • the preset loss function and batch gradient descent method are also used for model training.
  • the Mth round of model training is performed based on the second training sample set and at least part of the model parameters of the current first machine learning model to obtain the second machine learning model; if the model index of the second machine learning model meets the second preset index, the training of the second machine learning model is ended; if it is determined that the model index of the second machine learning model does not meet the second preset index, the M+1th round of training of the second machine learning model is performed.
  • the second machine learning model that satisfies the second preset index can be trained by using the model parameters of the first machine learning model without using the first training sample set. In this process, the data security of the training samples can be ensured, and the model training speed and model accuracy can also be improved.
  • the training of the first machine learning model and the training of the second machine learning model are performed alternately, and the description will be made by taking the first machine learning model and the second machine learning model having the same partial structure as an example. As shown in Figure 8, the following steps may be included:
  • Step 501 Perform a first round of model training based on the first initial model parameters and the first training sample set to obtain a first initial model.
  • Step 502 Perform a first round of training based on the second training sample set, the model parameters of the structurally identical part of the first initial model, and the second initial model parameters to obtain a second initial model.
  • the same structure includes the same model hierarchy and the same connection relationship.
  • Step 503 Continue model training based on the first training sample set, at least some model parameters of the current second machine learning model, and some model parameters of the first machine learning model obtained from the previous round of training, to obtain the first machine learning model.
  • Step 504 continue model training based on the second training sample set, the model parameters of the same structural part of the current first machine learning model, and some model parameters of the second machine learning model obtained in the previous round of training, to obtain the second machine learning model Model.
  • Step 505 if it is determined that the model index of the first machine learning model satisfies the first preset index, and the model index of the second machine learning model satisfies the second preset index, end the training of the first machine learning model and the second machine learning model.
  • Step 506 if it is determined that the model index of the first machine learning model does not meet the first preset index, and/or the model index of the second machine learning model does not meet the second preset index, perform the next round of training of the first machine learning model and the second machine learning model.
  • the first machine learning model and the second machine learning model have the same structure.
  • in step 503, model training is continued based on the first training sample set and all model parameters of the current second machine learning model to obtain the first machine learning model; in step 504, model training is continued based on the second training sample set and all model parameters of the current first machine learning model to obtain the second machine learning model.
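  • the alternating flow of steps 501 to 506 can be sketched with two toy models that share one parameter. Everything concrete here is an assumption for illustration: each model is a line `y = shared * x + own`, the model index is a mean squared error that must fall below a threshold (rather than an accuracy rate), and the sample sets are synthetic. Each model is trained only on its own sample set; only the shared parameter is exchanged.

```python
def train_round(params, samples, lr=0.1, steps=100):
    """One round of batch gradient descent on y = shared * x + own."""
    w, b = params["shared"], params["own"]
    n = len(samples)
    for _ in range(steps):
        gw = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        gb = sum(2 * (w * x + b - y) for x, y in samples) / n
        w, b = w - lr * gw, b - lr * gb
    return {"shared": w, "own": b}


def mse(params, samples):
    """Assumed model index: mean squared error on the model's own samples."""
    return sum(
        (params["shared"] * x + params["own"] - y) ** 2 for x, y in samples
    ) / len(samples)


def alternate_training(set_1, set_2, threshold=1e-4, max_rounds=100):
    # Step 501: first round of the first model from initial parameters.
    first = train_round({"shared": 0.0, "own": 0.0}, set_1)
    # Step 502: first round of the second model, reusing the shared part.
    second = train_round({"shared": first["shared"], "own": 0.0}, set_2)
    for _ in range(max_rounds):
        # Step 505: stop once both models meet their preset index.
        if mse(first, set_1) < threshold and mse(second, set_2) < threshold:
            break
        # Steps 503, 504 and 506: next round, exchanging the shared part.
        first = train_round(
            {"shared": second["shared"], "own": first["own"]}, set_1)
        second = train_round(
            {"shared": first["shared"], "own": second["own"]}, set_2)
    return first, second


# Hypothetical sample sets: y = 2x + 1 and y = 2x - 1 (same slope).
set_1 = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
set_2 = [(0.0, -1.0), (1.0, 1.0), (2.0, 3.0)]
first, second = alternate_training(set_1, set_2)
```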
  • the training of the first machine learning model and the training of the second machine learning model are respectively performed in two independent networks.
  • the first model training terminal and the second model training terminal in the model training system are in two independent networks, and the first model training terminal and the second training terminal cannot communicate through the network.
  • the model parameters of the first machine learning model and the model parameters of the second machine learning model can be transmitted through the storage medium.
  • the first machine learning model and the second machine learning model have the same structure and the same application.
  • the first machine learning model is a dose prediction model trained using the first training sample set
  • the second machine learning model is a dose prediction model trained using the second training sample set
  • the first machine learning model and the second machine learning model have the same structure and are both applied to dose prediction.
  • the applications of the first machine learning model and the second machine learning model are different.
  • the first machine learning model is a dose prediction model
  • the second machine learning model is an automatic delineation model
  • the structures of the first machine learning model and the second machine learning model can be partially or completely the same.
  • the embodiment of the present disclosure may further include: combining the first machine learning model and the second machine learning model to obtain the target machine learning model.
  • the model training end combines the first machine learning model and the second machine learning model to obtain a combined target machine learning model. For example, by combining the dose prediction model with the automatic delineation model, a target machine learning model that performs dose prediction first and then automatic delineation can be obtained, making the model more powerful.
  • the first machine learning model and the second machine learning model are alternately trained; the training of the first machine learning model does not use the second training sample set, and the training of the second machine learning model does not use the first training sample set, which can ensure the data security of the training samples; at the same time, training the first machine learning model using the model parameters of the second machine learning model, and training the second machine learning model using the model parameters of the first machine learning model, can improve the model training speed and the accuracy of the machine learning model.
  • a training method of a machine learning model is provided.
  • the method is applied to the model training system in FIG. 1 as an example, including the following steps:
  • Step 601 acquire at least two training sample sets.
  • the training samples in the training sample set include medical images obtained by the medical scanning equipment scanning a scanned object.
  • the medical images include CT images, CBCT images, PET images, MR images, ultrasound images, etc.
  • the model training system acquires at least two training sample sets, and the acquisition method can refer to step 201 .
  • Step 602 Perform multiple rounds of model training based on each training sample set to obtain a machine learning model corresponding to each training sample set.
  • every two machine learning models in the at least two machine learning models have at least part of the same structure, and when training one of the machine learning models, at least partly use the model parameters of the same part of the other machine learning model.
  • model parameters of the current round are obtained, and model training is performed based on the training sample set corresponding to the machine learning model and the model parameters of the current round to obtain the machine learning model; wherein the model parameters of the current round include initial model parameters or the model parameters of the structurally identical part of another machine learning model.
  • the first machine learning model and the second machine learning model have at least part of the same structure
  • the second machine learning model and the third machine learning model have at least part of the same structure
  • the first machine learning model and the third machine learning model are at least partially identical in structure.
  • for the first machine learning model, the initial model parameters of the first round are obtained, and then the first round of model training is performed based on the first training sample set and the initial model parameters.
  • for the second machine learning model, at least part of the model parameters of the first machine learning model are obtained, and a first round of model training is performed based on the second training sample set and at least part of the model parameters of the first machine learning model.
  • for the third machine learning model, at least part of the model parameters of the second machine learning model are acquired, and a first round of model training is performed based on the third training sample set and at least part of the model parameters of the second machine learning model.
  • after the first round, for the first machine learning model, at least part of the model parameters of the third machine learning model are obtained, and a second round of model training is performed based on the first training sample set and at least part of the model parameters of the third machine learning model.
  • for the second machine learning model, at least some model parameters of the first machine learning model are obtained, and a second round of model training is performed based on the second training sample set and at least some model parameters of the first machine learning model.
  • for the third machine learning model, at least part of the model parameters of the second machine learning model are obtained, and a second round of model training is performed based on the third training sample set and at least part of the model parameters of the second machine learning model.
  • model parameters are transferred between at least two machine learning models in a preset order. For example, at least some of the model parameters of the first machine learning model are passed to the second machine learning model, at least some of the model parameters of the second machine learning model are passed to the third machine learning model, and at least some of the model parameters of the third machine learning model are passed to The first machine learning model.
  • the transfer of model parameters may also be in other transfer forms, and the embodiment of the present disclosure does not limit the preset order.
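  • the example order described above, in which each model receives parameters from the previous model and the first model receives them from the last, can be sketched as a small schedule function. The zero-based indexing and the function name are assumptions for illustration; other preset orders are equally possible.

```python
def parameter_source(model_index, round_number, num_models):
    """Index of the model whose parameters feed this training step.

    Returns None when the step starts from initial model parameters,
    which in this order happens only for the first model in round 1.
    """
    if model_index == 0 and round_number == 1:
        return None
    return (model_index - 1) % num_models


# Sources for three models (indices 0, 1, 2) in rounds 1 and 2.
schedule = [
    [parameter_source(m, r, 3) for m in range(3)]
    for r in (1, 2)
]
```

  • here `schedule[0]` describes the first round (initial parameters, then model 0, then model 1) and `schedule[1]` the second round, where model 0 takes the parameters of model 2.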
  • each machine learning model is trained on an independent network.
  • the training of the first machine learning model, the second machine learning model and the third machine learning model is performed in three independent networks, and the three networks cannot communicate with each other.
  • At least two machine learning models have the same structure and the same application.
  • the first machine learning model, the second machine learning model and the third machine learning model are all dose prediction models, and the structures of the first machine learning model, the second machine learning model and the third machine learning model are the same.
  • the application of each machine learning model is different.
  • the first machine learning model is a dose prediction model
  • the second machine learning model is an automatic delineation model
  • the third machine learning model is a curative effect evaluation model.
  • the structures of every two machine learning models may be partly or completely the same.
  • At least two training sample sets are obtained; multiple rounds of model training are performed based on each training sample set, and a machine learning model corresponding to each training sample set is obtained. Every two machine learning models have at least partially the same structure, and when one of the machine learning models is trained, at least part of the model parameters of the structurally identical part of the other machine learning model are used. Therefore, the data security of the training samples can be guaranteed, and the speed of model training and the accuracy of the machine learning model can be improved.
  • Although the steps in the flow charts of FIG. 2 to FIG. 9 are displayed sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless otherwise specified herein, there is no strict restriction on the execution order of these steps, and they can be executed in other orders. Moreover, at least some of the steps in FIG. 2 to FIG. 9 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times. These sub-steps or stages are not necessarily performed one after another, but may be performed in turn or alternately with other steps, or with at least a part of the sub-steps or stages of other steps.
  • a training device for a machine learning model including:
  • the sample set acquisition module 701 is configured to acquire a first training sample set and a second training sample set; the training samples in the first training sample set and the second training sample set include medical images obtained by a medical scanning device scanning a scan object;
  • the first training module 702 is configured to perform multiple rounds of model training based on the first training sample set to obtain a first machine learning model
  • the second training module 703 is configured to perform multiple rounds of model training based on the second training sample set to obtain a second machine learning model
  • wherein the first machine learning model and the second machine learning model have at least partially the same structure, the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
  • the above-mentioned first training module 702 is specifically configured to perform the first round of model training based on the first initial model parameters and the first training sample set to obtain the first initial model, where at least part of the model parameters of the first initial model are used to train the second machine learning model; to perform the Nth round of model training based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model, where N is a positive integer greater than 1; and, if it is determined that the model index of the first machine learning model satisfies the first preset index, to end the training of the first machine learning model.
  • the first machine learning model has partially the same structure as the second machine learning model, and the above-mentioned first training module 702 is specifically configured to perform the Nth round of model training based on the first training sample set, the model parameters of the structurally identical part of the current second machine learning model, and some model parameters of the first machine learning model obtained in the previous round of training, to obtain the first machine learning model.
  • the first machine learning model and the second machine learning model have entirely the same structure, and the above-mentioned first training module 702 is specifically configured to perform the Nth round of model training based on the first training sample set and all model parameters of the current second machine learning model, to obtain the first machine learning model.
  • the above-mentioned first training module 702 is further configured to perform the N+1th round of training of the first machine learning model if it is determined that the model index of the first machine learning model does not meet the first preset index.
  • the above-mentioned second training module 703 is specifically configured to perform the Mth round of model training based on the second training sample set and at least part of the model parameters of the current first machine learning model to obtain the second machine learning model, where M is a positive integer greater than 0; if the model index of the second machine learning model satisfies the second preset index, the training of the second machine learning model ends.
  • the first machine learning model has partially the same structure as the second machine learning model, and the above-mentioned second training module 703 is specifically configured to perform the first round of training based on the second training sample set, the model parameters of the structurally identical part of the first initial model, and the second initial model parameters, to obtain the second initial model, where at least part of the parameters of the second initial model are used to train the first machine learning model; and to continue model training based on the second training sample set, the model parameters of the structurally identical part of the current first machine learning model, and some model parameters of the second machine learning model obtained from the previous round of training, to obtain the second machine learning model.
  • the first machine learning model and the second machine learning model have entirely the same structure, and the above-mentioned second training module 703 is specifically configured to perform the Mth round of model training based on the second training sample set and all model parameters of the current first machine learning model, to obtain the second machine learning model.
  • the second training module 703 is configured to perform the M+1th round of training of the second machine learning model if it is determined that the model index of the second machine learning model does not meet the second preset index.
  • the model index includes the accuracy rate of the output result
  • the first preset index includes the first preset accuracy rate
  • the second preset index includes the second preset accuracy rate
  • the device also includes:
  • a gradient descent module, configured to, in each round of training, if it is determined according to the preset loss function that the output result of the machine learning model does not meet the preset convergence condition, use the batch gradient algorithm to determine the descent gradient and continue training; when it is determined that the output result of the machine learning model meets the preset convergence condition, the current round of training is stopped.
  • the above-mentioned sample set acquisition module 701 is specifically configured to acquire medical images of the first hospital and generate the first training sample set based on the medical images of the first hospital; and to acquire medical images of the second hospital and generate the second training sample set based on the medical images of the second hospital; wherein the first hospital and the second hospital are different hospitals.
  • the first machine learning model and the second machine learning model include at least one of a dose prediction model, an automatic delineation model, a curative effect evaluation model, a survival index evaluation model, a cancer screening model, and a deformation registration model.
  • the model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a network;
  • or, the model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a storage medium.
  • the training of the first machine learning model and the training of the second machine learning model are respectively performed in two independent networks.
  • the training of the first machine learning model and the training of the second machine learning model are performed alternately.
  • the first machine learning model and the second machine learning model have the same structure and the same application.
  • the applications of the first machine learning model and the second machine learning model are different.
  • a training device for a machine learning model comprising:
  • a sample set acquisition module 801 configured to acquire at least two training sample sets; the training samples in the training sample sets include medical images obtained by a medical scanning device scanning a scan object;
  • the training module 802 is used to perform multiple rounds of model training based on each training sample set to obtain a machine learning model corresponding to each training sample set;
  • every two machine learning models in the at least two machine learning models have at least part of the same structure, and when training one of the machine learning models, at least partly use the model parameters of the same part of the other machine learning model.
  • the above-mentioned training module 802 is configured to, for each machine learning model, obtain the model parameters of the current round and perform model training based on the training sample set corresponding to that machine learning model and the model parameters of the current round, to obtain the machine learning model; wherein the model parameters of the current round include initial model parameters or the model parameters of the structurally identical part of another machine learning model.
  • each machine learning model is trained on an independent network.
  • At least two machine learning models have the same structure and the same application.
  • the application of each machine learning model is different.
  • Each module in the above-mentioned machine learning model training device can be realized in whole or in part by software, hardware, or a combination thereof.
  • the above-mentioned modules can be embedded in or independent of the processor in the computer device in the form of hardware, and can also be stored in the memory of the computer device in the form of software, so that the processor can invoke and execute the corresponding operations of the above-mentioned modules.
  • a computer device is provided.
  • the computer device may be a terminal, and its internal structure may be as shown in FIG. 12 .
  • the computer device includes a processor, a memory, a communication interface, a display screen and an input device connected through a system bus. Wherein, the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system and computer programs.
  • the internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
  • the communication interface of the computer device is used to communicate with an external terminal in a wired or wireless manner; the wireless manner can be realized through WIFI (Wireless Fidelity), an operator network, NFC (Near Field Communication), or other technologies.
  • When the computer program is executed by the processor, a method for training a machine learning model is realized.
  • the display screen of the computer device may be a liquid crystal display screen or an electronic ink display screen; the input device of the computer device may be a touch layer covering the display screen, a button, trackball, or touch pad provided on the casing of the computer device, or an external keyboard, touchpad, or mouse.
  • FIG. 12 is only a block diagram of the part of the structure related to the solution of this application, and does not limit the computer device to which the solution of this application is applied. A specific computer device may include more or fewer components than shown in the figure, combine some components, or have a different arrangement of components.
  • a computer device including a memory and a processor, a computer program is stored in the memory, and the processor implements the following steps when executing the computer program:
  • the training samples in the first training sample set and the second training sample set include medical images obtained by scanning objects scanned by medical scanning equipment;
  • wherein the first machine learning model and the second machine learning model have at least partially the same structure, the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
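  • based on the first training sample set and at least part of the model parameters of the current second machine learning model, the Nth round of model training is performed to obtain the first machine learning model; N is a positive integer greater than 1;
  • if it is determined that the model index of the first machine learning model satisfies the first preset index, the training of the first machine learning model ends.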
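  • the first machine learning model has partially the same structure as the second machine learning model, and the processor also implements the following steps when executing the computer program:
  • based on the first training sample set, the model parameters of the structurally identical part of the current second machine learning model, and some model parameters of the first machine learning model obtained in the previous round of training, the Nth round of model training is performed to obtain the first machine learning model.
  • the first machine learning model and the second machine learning model have entirely the same structure, and the processor also implements the following steps when executing the computer program:
  • based on the first training sample set and all model parameters of the current second machine learning model, the Nth round of model training is performed to obtain the first machine learning model.
  • if it is determined that the model index of the first machine learning model does not satisfy the first preset index, the (N+1)th round of training of the first machine learning model is performed.
  • based on the second training sample set and at least part of the model parameters of the current first machine learning model, the Mth round of model training is performed to obtain the second machine learning model; wherein, M is a positive integer greater than 0;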
  • the first machine learning model has partially the same structure as the second machine learning model, and the processor also implements the following steps when executing the computer program:
  • a first round of training is performed based on the second training sample set, the model parameters of the structurally identical part of the first initial model, and the second initial model parameters to obtain a second initial model; at least part of the parameters of the second initial model are used to train the first machine learning model;
  • the first machine learning model and the second machine learning model have entirely the same structure, and the processor also implements the following steps when executing the computer program:
  • based on the second training sample set and all model parameters of the current first machine learning model, the Mth round of model training is performed to obtain the second machine learning model.
  • if it is determined that the model index of the second machine learning model does not satisfy the second preset index, the (M+1)th round of training of the second machine learning model is performed.
  • the model index includes the accuracy rate of the output result
  • the first preset index includes the first preset accuracy rate
  • the second preset index includes the second preset accuracy rate
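  • in each round of training, if it is determined according to the preset loss function that the output result of the machine learning model does not meet the preset convergence condition, the batch gradient algorithm is used to determine the descent gradient and training continues; when it is determined that the output result of the machine learning model meets the preset convergence condition, the current round of training is stopped.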
  • the first hospital and the second hospital are different hospitals.
  • the first machine learning model and the second machine learning model include at least one of a dose prediction model, an automatic delineation model, a curative effect evaluation model, a survival index evaluation model, a cancer screening model, and a deformation registration model.
  • model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through the network
  • model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a storage medium.
  • the training of the first machine learning model and the training of the second machine learning model are respectively performed in two independent networks.
  • the training of the first machine learning model is alternated with the training of the second machine learning model.
  • the first machine learning model and the second machine learning model have the same structure and the same application.
  • the applications of the first machine learning model and the second machine learning model are different.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:
  • the training samples in the first training sample set and the second training sample set include medical images obtained by scanning objects scanned by medical scanning equipment;
  • wherein the first machine learning model and the second machine learning model have at least partially the same structure, the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
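  • based on the first training sample set and at least part of the model parameters of the current second machine learning model, the Nth round of model training is performed to obtain the first machine learning model; N is a positive integer greater than 1;
  • if it is determined that the model index of the first machine learning model satisfies the first preset index, the training of the first machine learning model ends.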
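  • the first machine learning model has partially the same structure as the second machine learning model, and when the computer program is executed by the processor, the following steps are also implemented:
  • based on the first training sample set, the model parameters of the structurally identical part of the current second machine learning model, and some model parameters of the first machine learning model obtained in the previous round of training, the Nth round of model training is performed to obtain the first machine learning model.
  • the first machine learning model has entirely the same structure as the second machine learning model, and when the computer program is executed by the processor, the following steps are also implemented:
  • based on the first training sample set and all model parameters of the current second machine learning model, the Nth round of model training is performed to obtain the first machine learning model.
  • if it is determined that the model index of the first machine learning model does not satisfy the first preset index, the (N+1)th round of training of the first machine learning model is performed.
  • based on the second training sample set and at least part of the model parameters of the current first machine learning model, the Mth round of model training is performed to obtain the second machine learning model; wherein, M is a positive integer greater than 0;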
  • the first machine learning model has partially the same structure as the second machine learning model, and when the computer program is executed by the processor, the following steps are also implemented:
  • a first round of training is performed based on the second training sample set, the model parameters of the structurally identical part of the first initial model, and the second initial model parameters to obtain a second initial model; at least part of the parameters of the second initial model are used to train the first machine learning model;
  • based on the second training sample set, the model parameters of the structurally identical part of the current first machine learning model, and some model parameters of the second machine learning model obtained from the previous round of training, model training continues to obtain the second machine learning model.
  • the first machine learning model has entirely the same structure as the second machine learning model, and when the computer program is executed by the processor, the following steps are also implemented:
  • an Mth round of model training is performed based on the second training sample set and all model parameters of the current first machine learning model to obtain a second machine learning model.
  • if it is determined that the model index of the second machine learning model does not satisfy the second preset index, the (M+1)th round of training of the second machine learning model is performed.
  • the model index includes the accuracy rate of the output result
  • the first preset index includes the first preset accuracy rate
  • the second preset index includes the second preset accuracy rate
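  • in each round of training, if it is determined according to the preset loss function that the output result of the machine learning model does not meet the preset convergence condition, the batch gradient algorithm is used to determine the descent gradient and training continues; when it is determined that the output result of the machine learning model meets the preset convergence condition, the current round of training is stopped.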
  • the first hospital and the second hospital are different hospitals.
  • the first machine learning model and the second machine learning model include at least one of a dose prediction model, an automatic delineation model, a curative effect evaluation model, a survival index evaluation model, a cancer screening model, and a deformation registration model.
  • model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through the network
  • model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a storage medium.
  • the training of the first machine learning model and the training of the second machine learning model are respectively performed in two independent networks.
  • the training of the first machine learning model is alternated with the training of the second machine learning model.
  • the first machine learning model and the second machine learning model have the same structure and the same application.
  • the applications of the first machine learning model and the second machine learning model are different.
  • Non-volatile memory may include read-only memory (Read-Only Memory, ROM), magnetic tape, floppy disk, flash memory or optical memory, etc.
  • Volatile memory can include Random Access Memory (RAM) or external cache memory.
  • RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Image Analysis (AREA)

Abstract

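The present application relates to a training method and device for a machine learning model, a computer device, and a storage medium. The method includes: acquiring a first training sample set and a second training sample set, where the training samples in the first training sample set and the second training sample set include medical images obtained by a medical scanning device scanning a scan object; performing multiple rounds of model training based on the first training sample set to obtain a first machine learning model; and performing multiple rounds of model training based on the second training sample set to obtain a second machine learning model; wherein the first machine learning model and the second machine learning model have at least partially the same structure, the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model. With this method, model accuracy can be improved without sharing medical images.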

Description

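Training method and device for a machine learning model, computer device, and storage medium
Technical Field
The present application relates to the technical field of model training, and in particular to a training method and device for a machine learning model, a computer device, and a storage medium.
Background
With the development of medical imaging equipment, machine learning models that perform image processing on medical images have come into wide use.
Training a machine learning model usually requires a large number of training samples. However, because medical images involve patient privacy and data security, hospitals cannot share medical images with one another. As a result, few training samples are available and the accuracy of the machine learning model is poor.
Summary
In view of this, it is necessary to address the above technical problem by providing a training method and device for a machine learning model, a computer device, and a storage medium that can improve model accuracy without sharing medical images.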
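A training method for a machine learning model, the method including:
acquiring a first training sample set and a second training sample set; the training samples in the first training sample set and the second training sample set include medical images obtained by a medical scanning device scanning a scan object;
performing multiple rounds of model training based on the first training sample set to obtain a first machine learning model;
performing multiple rounds of model training based on the second training sample set to obtain a second machine learning model;
wherein the first machine learning model and the second machine learning model have at least partially the same structure, the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.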
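In one embodiment, performing multiple rounds of model training based on the first training sample set to obtain the first machine learning model includes:
performing a first round of model training based on first initial model parameters and the first training sample set to obtain a first initial model; at least part of the model parameters of the first initial model are used to train the second machine learning model;
performing an Nth round of model training based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model; N is a positive integer greater than 1;
if it is determined that the model index of the first machine learning model satisfies a first preset index, ending the training of the first machine learning model.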
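In one embodiment, the first machine learning model has partially the same structure as the second machine learning model, and performing the Nth round of model training based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model includes:
performing the Nth round of model training based on the first training sample set, the model parameters of the structurally identical part of the current second machine learning model, and some model parameters of the first machine learning model obtained in the previous round of training, to obtain the first machine learning model.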
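The parameter selection for round N with partially identical structures can be illustrated with a minimal sketch. It assumes, purely for illustration, that model parameters are held in a dictionary keyed by layer name; the function `merge_parameters` and the layer names `encoder.w`, `dose_head.w`, and `contour_head.w` are hypothetical and do not appear in the application.

```python
def merge_parameters(own_previous, partner_current, shared_names):
    """Build the starting parameters for round N of a partially shared model.

    own_previous:    this model's parameters from its previous round
    partner_current: the other model's current parameters
    shared_names:    names of the layers in the structurally identical part
    """
    merged = dict(own_previous)  # non-shared layers keep last round's values
    for name in shared_names:
        merged[name] = partner_current[name]  # shared layers come from the partner
    return merged


# Hypothetical layer names: both models share an encoder, each has its own head.
first_prev = {"encoder.w": [0.1, 0.2], "dose_head.w": [0.5]}
second_now = {"encoder.w": [0.3, 0.4], "contour_head.w": [0.9]}
start = merge_parameters(first_prev, second_now, shared_names=["encoder.w"])
```

Here the first model starts round N with the partner's encoder weights and its own head weights from the previous round; the partner's head is never copied because it lies outside the shared structure.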
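In one embodiment, the first machine learning model has entirely the same structure as the second machine learning model, and performing the Nth round of model training based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model includes:
performing the Nth round of model training based on the first training sample set and all model parameters of the current second machine learning model, to obtain the first machine learning model.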
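In one embodiment, the method further includes:
if it is determined that the model index of the first machine learning model does not satisfy the first preset index, performing the (N+1)th round of training of the first machine learning model.
In one embodiment, performing multiple rounds of model training based on the second training sample set to obtain the second machine learning model includes:
performing an Mth round of model training based on the second training sample set and at least part of the model parameters of the current first machine learning model to obtain the second machine learning model; wherein M is a positive integer greater than 0;
if the model index of the second machine learning model satisfies a second preset index, ending the training of the second machine learning model.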
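In one embodiment, the first machine learning model has partially the same structure as the second machine learning model, and performing the Mth round of model training based on the second training sample set and at least part of the model parameters of the current first machine learning model to obtain the second machine learning model includes:
performing a first round of training based on the second training sample set, the model parameters of the structurally identical part of the first initial model, and second initial model parameters, to obtain a second initial model; at least part of the parameters of the second initial model are used to train the first machine learning model;
continuing model training based on the second training sample set, the model parameters of the structurally identical part of the current first machine learning model, and some model parameters of the second machine learning model obtained in the previous round of training, to obtain the second machine learning model.
In one embodiment, the first machine learning model has entirely the same structure as the second machine learning model, and performing the Mth round of model training based on the second training sample set and the model parameters of the current first machine learning model to obtain the second machine learning model includes:
performing the Mth round of model training based on the second training sample set and all model parameters of the current first machine learning model, to obtain the second machine learning model.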
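In one embodiment, the method further includes:
if it is determined that the model index of the second machine learning model does not satisfy the second preset index, performing the (M+1)th round of training of the second machine learning model.
In one embodiment, the model index includes the accuracy rate of the output result, the first preset index includes a first preset accuracy rate, and the second preset index includes a second preset accuracy rate.
In one embodiment, the method further includes:
in each round of training, if it is determined according to a preset loss function that the output result of the machine learning model does not meet a preset convergence condition, using the batch gradient algorithm to determine the descent gradient and continuing training; when it is determined that the output result of the machine learning model meets the preset convergence condition, stopping the current round of training.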
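The per-round loop can be sketched as follows. The application does not specify the loss function or the convergence condition, so this sketch assumes a mean squared error loss on a one-parameter model y = w * x and treats a loss change smaller than `tol` as convergence; `train_one_round`, `lr`, `tol`, and `max_steps` are illustrative names.

```python
def train_one_round(w, samples, lr=0.1, tol=1e-6, max_steps=10_000):
    """Fit y = w * x by batch gradient descent within a single training round."""

    def loss(value):
        # Assumed preset loss function: mean squared error over all samples.
        return sum((value * x - y) ** 2 for x, y in samples) / len(samples)

    previous = loss(w)
    for _ in range(max_steps):
        # Descent gradient computed over the whole batch of samples.
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= lr * grad
        current = loss(w)
        if abs(previous - current) < tol:  # assumed convergence condition
            break  # stop the current round
        previous = current
    return w


w = train_one_round(0.0, [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
```

On this toy data the parameter settles close to 2.0 after a handful of steps; a real model would apply the same stop-or-continue decision to its full parameter set.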
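In one embodiment, acquiring the first training sample set and the second training sample set includes:
acquiring medical images of a first hospital, and generating the first training sample set based on the medical images of the first hospital;
acquiring medical images of a second hospital, and generating the second training sample set based on the medical images of the second hospital;
wherein the first hospital and the second hospital are different hospitals.
In one embodiment, the first machine learning model and the second machine learning model include at least one of a dose prediction model, an automatic delineation model, a curative effect evaluation model, a survival index evaluation model, a cancer screening model, and a deformation registration model.
In one embodiment, the model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a network;
or, the model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a storage medium.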
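As a rough illustration of the storage-medium option, the sketch below writes a parameter dictionary to a file and reads it back at the receiving side. The JSON format, the file name, and the helper names `export_parameters` and `import_parameters` are assumptions for this example; the application does not prescribe a serialization format. A temporary directory stands in for the removable medium.

```python
import json
import os
import tempfile


def export_parameters(params, path):
    """Write model parameters, and nothing else, to a file on the transfer medium."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(params, handle)


def import_parameters(path):
    """Read model parameters back at the receiving training end."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


# A temporary directory plays the role of the storage medium in this sketch.
with tempfile.TemporaryDirectory() as medium:
    path = os.path.join(medium, "first_model_params.json")
    export_parameters({"encoder.w": [0.3, 0.4]}, path)
    received = import_parameters(path)
```

Only the parameter values cross the boundary between the two sites; no medical image is written to the medium.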
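In one embodiment, the training of the first machine learning model and the training of the second machine learning model are performed in two independent networks, respectively.
In one embodiment, the training of the first machine learning model and the training of the second machine learning model are performed alternately.
In one embodiment, the first machine learning model and the second machine learning model have the same structure and the same application.
In one embodiment, the applications of the first machine learning model and the second machine learning model are different.
In one embodiment, only model parameters are transferred during the training of the first machine learning model and the second machine learning model.
In one embodiment, the method further includes:
combining the first machine learning model and the second machine learning model to obtain a target machine learning model.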
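A training method for a machine learning model, the method including:
acquiring at least two training sample sets; the training samples in the training sample sets include medical images obtained by a medical scanning device scanning a scan object;
performing multiple rounds of model training based on each training sample set to obtain a machine learning model corresponding to each training sample set;
wherein every two machine learning models among the at least two machine learning models have at least partially the same structure, and when one of the machine learning models is trained, the model parameters of the structurally identical part of another machine learning model are at least partially used.
In one embodiment, performing multiple rounds of model training based on each training sample set to obtain the machine learning model corresponding to each training sample set includes:
for each machine learning model, obtaining the model parameters of the current round, and performing model training based on the training sample set corresponding to that machine learning model and the model parameters of the current round, to obtain the machine learning model; wherein the model parameters of the current round include initial model parameters or the model parameters of the structurally identical part of another machine learning model.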
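The generalized procedure can be sketched for three sites under stated assumptions: each model is a one-parameter toy model y = w * x with entirely the same structure, and parameters are passed in a ring (first to second, second to third, third back to first), which is only one possible preset order. The names `local_round` and `ring_training` are hypothetical.

```python
def local_round(w, samples, lr=0.05, steps=5):
    """Toy local training: a few batch gradient steps fitting y = w * x."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= lr * grad
    return w


def ring_training(site_samples, initial_w=0.0, rounds=3):
    """Train one model per site; each site starts from the previous site's parameters."""
    current = initial_w  # the first site's first round uses the initial parameters
    params = [None] * len(site_samples)
    for _ in range(rounds):
        for index, samples in enumerate(site_samples):
            # Only `current` (model parameters) moves between sites, never samples.
            current = local_round(current, samples)
            params[index] = current
    return params


# Three sites with slightly different local data, all roughly following y = 2x.
sites = [[(1.0, 2.1)], [(2.0, 3.9)], [(3.0, 6.0)]]
params = ring_training(sites)
```

Each entry of `params` is the model held by one site after the final round. Because every site starts from its neighbor's result rather than from scratch, the three models end up close to one another even though no site ever sees another site's samples.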
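In one embodiment, the training of each machine learning model is performed in an independent network.
In one embodiment, the at least two machine learning models have the same structure and the same application.
In one embodiment, the applications of the machine learning models are different.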
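A training device for a machine learning model, the device including:
a sample set acquisition module, configured to acquire a first training sample set and a second training sample set; the training samples in the first training sample set and the second training sample set include medical images obtained by a medical scanning device scanning a scan object;
a first training module, configured to perform multiple rounds of model training based on the first training sample set to obtain a first machine learning model;
a second training module, configured to perform multiple rounds of model training based on the second training sample set to obtain a second machine learning model;
wherein the first machine learning model and the second machine learning model have at least partially the same structure, the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.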
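In one embodiment, the first training module is specifically configured to perform a first round of model training based on first initial model parameters and the first training sample set to obtain a first initial model, where at least part of the model parameters of the first initial model are used to train the second machine learning model; to perform an Nth round of model training based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model, where N is a positive integer greater than 1; and, if it is determined that the model index of the first machine learning model satisfies a first preset index, to end the training of the first machine learning model.
In one embodiment, the first machine learning model has partially the same structure as the second machine learning model, and the first training module is specifically configured to perform the Nth round of model training based on the first training sample set, the model parameters of the structurally identical part of the current second machine learning model, and some model parameters of the first machine learning model obtained in the previous round of training, to obtain the first machine learning model.
In one embodiment, the first machine learning model has entirely the same structure as the second machine learning model, and the first training module is specifically configured to perform the Nth round of model training based on the first training sample set and all model parameters of the current second machine learning model, to obtain the first machine learning model.
In one embodiment, the first training module is further configured to perform the (N+1)th round of training of the first machine learning model if it is determined that the model index of the first machine learning model does not satisfy the first preset index.
In one embodiment, the second training module is specifically configured to perform an Mth round of model training based on the second training sample set and at least part of the model parameters of the current first machine learning model to obtain the second machine learning model, where M is a positive integer greater than 0; and, if the model index of the second machine learning model satisfies a second preset index, to end the training of the second machine learning model.
In one embodiment, the first machine learning model has partially the same structure as the second machine learning model, and the second training module is specifically configured to perform a first round of training based on the second training sample set, the model parameters of the structurally identical part of the first initial model, and second initial model parameters, to obtain a second initial model, where at least part of the parameters of the second initial model are used to train the first machine learning model; and to continue model training based on the second training sample set, the model parameters of the structurally identical part of the current first machine learning model, and some model parameters of the second machine learning model obtained in the previous round of training, to obtain the second machine learning model.
In one embodiment, the first machine learning model has entirely the same structure as the second machine learning model, and the second training module is specifically configured to perform the Mth round of model training based on the second training sample set and all model parameters of the current first machine learning model, to obtain the second machine learning model.
In one embodiment, the second training module is configured to perform the (M+1)th round of training of the second machine learning model if it is determined that the model index of the second machine learning model does not satisfy the second preset index.
In one embodiment, the model index includes the accuracy rate of the output result, the first preset index includes a first preset accuracy rate, and the second preset index includes a second preset accuracy rate.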
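In one embodiment, the device further includes:
a gradient descent module, configured to, in each round of training, if it is determined according to a preset loss function that the output result of the machine learning model does not meet a preset convergence condition, use the batch gradient algorithm to determine the descent gradient and continue training; when it is determined that the output result of the machine learning model meets the preset convergence condition, the current round of training is stopped.
In one embodiment, the sample set acquisition module is specifically configured to acquire medical images of a first hospital and generate the first training sample set based on the medical images of the first hospital; and to acquire medical images of a second hospital and generate the second training sample set based on the medical images of the second hospital; wherein the first hospital and the second hospital are different hospitals.
In one embodiment, the first machine learning model and the second machine learning model include at least one of a dose prediction model, an automatic delineation model, a curative effect evaluation model, a survival index evaluation model, a cancer screening model, and a deformation registration model.
In one embodiment, the model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a network;
or, the model parameters of the first machine learning model and the model parameters of the second machine learning model are transmitted through a storage medium.
In one embodiment, the training of the first machine learning model and the training of the second machine learning model are performed in two independent networks, respectively.
In one embodiment, the training of the first machine learning model and the training of the second machine learning model are performed alternately.
In one embodiment, the first machine learning model and the second machine learning model have the same structure and the same application.
In one embodiment, the applications of the first machine learning model and the second machine learning model are different.
In one embodiment, only model parameters are transferred during the training of the first machine learning model and the second machine learning model.
In one embodiment, the device further includes:
a combination processing module, configured to combine the first machine learning model and the second machine learning model to obtain a target machine learning model.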
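A training device for a machine learning model, the device including:
a sample set acquisition module, configured to acquire at least two training sample sets; the training samples in the training sample sets include medical images obtained by a medical scanning device scanning a scan object;
a training module, configured to perform multiple rounds of model training based on each training sample set to obtain a machine learning model corresponding to each training sample set;
wherein every two machine learning models among the at least two machine learning models have at least partially the same structure, and when one of the machine learning models is trained, the model parameters of the structurally identical part of another machine learning model are at least partially used.
In one embodiment, the training module is configured to, for each machine learning model, obtain the model parameters of the current round and perform model training based on the training sample set corresponding to that machine learning model and the model parameters of the current round, to obtain the machine learning model; wherein the model parameters of the current round include initial model parameters or the model parameters of the structurally identical part of another machine learning model.
In one embodiment, the training of each machine learning model is performed in an independent network.
In one embodiment, the at least two machine learning models have the same structure and the same application.
In one embodiment, the applications of the machine learning models are different.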
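A computer device, including a memory and a processor, the memory storing a computer program, and the processor implementing the following steps when executing the computer program:
acquiring a first training sample set and a second training sample set; the training samples in the first training sample set and the second training sample set include medical images obtained by a medical scanning device scanning a scan object;
performing multiple rounds of model training based on the first training sample set to obtain a first machine learning model;
performing multiple rounds of model training based on the second training sample set to obtain a second machine learning model;
wherein the first machine learning model and the second machine learning model have at least partially the same structure, the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
A computer-readable storage medium, on which a computer program is stored, the computer program implementing the following steps when executed by a processor:
acquiring a first training sample set and a second training sample set; the training samples in the first training sample set and the second training sample set include medical images obtained by a medical scanning device scanning a scan object;
performing multiple rounds of model training based on the first training sample set to obtain a first machine learning model;
performing multiple rounds of model training based on the second training sample set to obtain a second machine learning model;
wherein the first machine learning model and the second machine learning model have at least partially the same structure, the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model.
With the above training method and device for a machine learning model, computer device, and storage medium, a first training sample set and a second training sample set are acquired; multiple rounds of model training are performed based on the first training sample set to obtain a first machine learning model; and multiple rounds of model training are performed based on the second training sample set to obtain a second machine learning model. Because the first machine learning model and the second machine learning model have a structurally identical part, that part can use the same model parameters. When the first machine learning model is trained, the model parameters of the second machine learning model are at least partially used; when the second machine learning model is trained, the model parameters of the first machine learning model are at least partially used. With the embodiments of the present disclosure, training the first machine learning model does not use the second training sample set, and training the second machine learning model does not use the first training sample set, which guarantees the data security of the training samples. At the same time, training the first machine learning model uses the model parameters of the second machine learning model, and training the second machine learning model uses the model parameters of the first machine learning model, which can improve the speed of model training and the accuracy of the machine learning models.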
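Brief Description of the Drawings
FIG. 1 is a diagram of the application environment of the training method for a machine learning model in one embodiment;
FIG. 2 is a schematic flowchart of the training method for a machine learning model in one embodiment;
FIG. 3a is a schematic structural diagram of the first machine learning model in one embodiment;
FIG. 3b is a schematic structural diagram of the second machine learning model in one embodiment;
FIG. 4 is a first schematic flowchart of the step of performing multiple rounds of model training based on the first training sample set in one embodiment;
FIG. 5 is a second schematic flowchart of the step of performing multiple rounds of model training based on the first training sample set in one embodiment;
FIG. 6 is a first schematic flowchart of the step of performing multiple rounds of model training based on the second training sample set in one embodiment;
FIG. 7 is a second schematic flowchart of the step of performing multiple rounds of model training based on the second training sample set in one embodiment;
FIG. 8 is a schematic flowchart of the steps of alternately training the first machine learning model and the second machine learning model in one embodiment;
FIG. 9 is a schematic flowchart of the training method for a machine learning model in another embodiment;
FIG. 10 is a structural block diagram of the training device for a machine learning model in one embodiment;
FIG. 11 is a structural block diagram of the training device for a machine learning model in another embodiment;
FIG. 12 is a diagram of the internal structure of the computer device in one embodiment.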
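Detailed Description
To make the purpose, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it.
Before the technical solutions of the embodiments of the present disclosure are described in detail, the technical background on which they are based is introduced. Training a machine learning model usually requires a large number of training samples. However, because medical images involve patient privacy and data security, hospitals cannot share medical images with one another. As a result, few training samples are available and the accuracy of the machine learning model is poor.
The present application provides a training solution for a machine learning model, including: acquiring a first training sample set and a second training sample set; performing multiple rounds of model training based on the first training sample set to obtain a first machine learning model; and performing multiple rounds of model training based on the second training sample set to obtain a second machine learning model. The first machine learning model and the second machine learning model have at least partially the same structure, the model parameters of the second machine learning model are at least partially used when training the first machine learning model, and the model parameters of the first machine learning model are at least partially used when training the second machine learning model. In this way, training the first machine learning model does not use the second training sample set, and training the second machine learning model does not use the first training sample set, which guarantees the data security of the training samples. At the same time, training the first machine learning model uses the model parameters of the second machine learning model, and training the second machine learning model uses the model parameters of the first machine learning model, which can improve the speed of model training and the accuracy of the machine learning models. This addresses the prior-art problem of having few training samples and poor machine learning model accuracy.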
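The training method for machine learning models provided by this application can be applied in the application environment shown in FIG. 1. The application environment may include a model training system. The model training system includes multiple model training ends 101, which can communicate with one another over a network. A model training end 101 may be a terminal connected to a medical scanning device 102. The terminal may be, but is not limited to, a personal computer, a notebook computer, or a tablet computer. The medical scanning device 102 may be a single-modality device or a multi-modality device, for example, but not limited to, a DR (digital radiography) device, a CT (computed tomography) device, a CBCT (cone beam computed tomography) device, a PET (positron emission tomography) device, an MRI (magnetic resonance imaging) device, an ultrasound device, a PET-CT device, a PET-MR device, an RT (radiotherapy) device, a CT-RT device, or an MR-RT device. A model training end 101 may also be a PACS (picture archiving and communication system) server. The PACS server may be implemented as a standalone server or as a server cluster composed of multiple servers.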
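In one embodiment, as shown in FIG. 2, a training method for machine learning models is provided. The method is described using its application to the model training system in FIG. 1 as an example and includes the following steps:
Step 201: acquire a first training sample set and a second training sample set.
The training samples in the first training sample set and the second training sample set include medical images obtained by a medical scanning device scanning a scan subject. A medical image may be a two-dimensional image or a three-dimensional image, for example a CT image, a CBCT image, a PET image, an MR image, or an ultrasound image. The training sample sets satisfy data diversity and label consistency, and they share the same data structure.
When training machine learning models, the model training system may use medical images from the same hospital or medical images from different hospitals.
When the model training system uses medical images from the same hospital, the model training end generates the first training sample set and the second training sample set from multiple medical images, and the two sample sets are used for different training tasks.
For example, the model training end acquires 100 CT images from the same CT device and divides them into two image sets to obtain the first training sample set and the second training sample set. The first training sample set is used for the training task of a dose prediction model, and the second training sample set is used for the training task of an automatic delineation model.
When the model training system uses medical images from different hospitals, a first model training end in the model training system acquires medical images of a first hospital and generates the first training sample set from them; a second model training end in the model training system acquires medical images of a second hospital and generates the second training sample set from them. The first hospital and the second hospital are different hospitals.
For example, model training end A1 acquires CT images from hospital B1 and generates the first training sample set; model training end A2 acquires CT images from hospital B2 and generates the second training sample set.
The embodiments of the present disclosure do not limit how the training sample sets are acquired.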
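Step 202: perform multiple rounds of model training based on the first training sample set to obtain a first machine learning model.
Step 203: perform multiple rounds of model training based on the second training sample set to obtain a second machine learning model.
When the model training system uses medical images from the same hospital, it may train the first machine learning model and the second machine learning model on the same model training end or on different model training ends.
When the model training system uses medical images from different hospitals, it trains the first machine learning model and the second machine learning model on different model training ends.
In one embodiment, the first machine learning model and the second machine learning model have at least partially identical structures; training the first machine learning model uses, at least in part, the model parameters of the second machine learning model, and training the second machine learning model uses, at least in part, the model parameters of the first machine learning model.
As illustrated by the first machine learning model in FIG. 3a and the second machine learning model in FIG. 3b, the two models have a structurally identical part, and that part can use the same model parameters. Accordingly, training the first machine learning model uses, at least in part, the model parameters of the second machine learning model, and training the second machine learning model uses, at least in part, the model parameters of the first machine learning model.
For example, the first model training end of the model training system performs one round of model training to obtain the model parameters of the first machine learning model and then passes those parameters to the second model training end. The second model training end uses the model parameters of the first machine learning model to perform one round of training of the second machine learning model and, when that round ends, passes the model parameters of the second machine learning model to the first model training end. The first model training end then uses the model parameters of the second machine learning model to perform another round of training of the first machine learning model and, when that round ends, again passes the model parameters of the first machine learning model to the second model training end. This continues until both machine learning models have finished training.
The model parameters of the first machine learning model and of the second machine learning model are passed over a network, or they are passed via a storage medium. For example, the first model training end and the second model training end send model parameters over a network, or a user copies the model parameters with a storage medium such as a portable hard disk. The embodiments of the present disclosure do not limit this.
It can be understood that, in the above training process, the first machine learning model is trained without the second training sample set and the second machine learning model is trained without the first training sample set, which keeps the training samples secure; and because each model is trained using the other model's parameters, the training speed and the accuracy of the machine learning models can be improved.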
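The parameter exchange described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: the one-variable linear model, the learning rate, the sample values, and the names `train_one_round`, `site1_samples`, and `site2_samples` are all hypothetical, and the two models are assumed to have fully identical structures so that the whole parameter set is handed over.

```python
# Hypothetical sketch: two model training ends take turns training and hand
# over only model parameters; each sample set stays at its own site.

def train_one_round(params, samples, lr=0.1, epochs=50):
    """One round of batch gradient descent for y = w*x + b on local samples."""
    w, b = params["w"], params["b"]
    n = len(samples)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return {"w": w, "b": b}

# Toy stand-ins for the two training sample sets (both follow y = 2x + 1).
site1_samples = [(-1.0, -1.0), (0.0, 1.0), (1.0, 3.0)]
site2_samples = [(-0.5, 0.0), (0.5, 2.0), (1.5, 4.0)]

params = {"w": 0.0, "b": 0.0}  # assumed first initial model parameters
for _ in range(5):
    # First model training end trains, then passes its parameters on.
    params = train_one_round(params, site1_samples)
    # Second model training end starts from the received parameters.
    params = train_one_round(params, site2_samples)
```

Only the `params` dictionary crosses between the two sites in this sketch; neither list of samples is ever read by the other site's training call.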
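In one embodiment, the first machine learning model and the second machine learning model include at least one of a dose prediction model, an automatic delineation model, an efficacy evaluation model, a survival index evaluation model, a cancer screening model, and a deformable registration model.
For example, the first machine learning model is a dose prediction model and the second machine learning model is an automatic delineation model. Alternatively, the first machine learning model is an efficacy evaluation model and the second machine learning model is a survival index evaluation model. The embodiments of the present disclosure do not limit the types of the first and second machine learning models.
Each hospital has a limited number of training samples, and in the prior art a model trained on limited samples has limited accuracy. The training method of this embodiment iteratively trains the machine learning models by repeatedly using the limited samples, which yields higher model accuracy and a better technical effect in application. For example, compared with an automatic delineation model trained on few samples in the prior art, an automatic delineation model obtained with the training method of this embodiment delineates regions of interest better; likewise, compared with a dose prediction model trained on few samples in the prior art, a dose prediction model obtained with the training method of this embodiment can predict dose more accurately.
In the above training method for machine learning models, a first training sample set and a second training sample set are acquired; multiple rounds of model training are performed based on the first training sample set to obtain a first machine learning model; and multiple rounds of model training are performed based on the second training sample set to obtain a second machine learning model. Because the two models have a structurally identical part, that part can use the same model parameters. Training the first machine learning model uses, at least in part, the model parameters of the second machine learning model, and training the second machine learning model uses, at least in part, the model parameters of the first machine learning model. In the embodiments of the present disclosure, the first machine learning model is trained without the second training sample set and the second machine learning model is trained without the first training sample set, which keeps the training samples secure; and because each model is trained using the other model's parameters, the training speed and the accuracy of the machine learning models can be improved.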
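In one embodiment, as shown in FIG. 4, the step of performing multiple rounds of model training based on the first training sample set to obtain the first machine learning model may include:
Step 301: perform a first round of model training based on first initial model parameters and the first training sample set to obtain a first initial model.
The first initial model parameters may be random model parameters or model parameters assigned by a user; the embodiments of the present disclosure do not limit this.
After acquiring the first initial model parameters and the first training sample set, the model training end performs the first round of training of the first machine learning model to obtain the first initial model. At least part of the model parameters of the first initial model are used to train the second machine learning model.
Step 302: perform an N-th round of model training based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model, where N is a positive integer greater than 1.
The first machine learning model and the second machine learning model may be trained alternately. After obtaining the first initial model, the model training end uses the model parameters of the first initial model to perform the first round of training of the second machine learning model.
The model training end then performs the N-th round of training of the first machine learning model. When the first machine learning model and the second machine learning model have partially identical structures, the process of performing the N-th round of model training based on the first training sample set and at least part of the model parameters of the current second machine learning model may include: performing the N-th round of model training based on the first training sample set, the model parameters of the structurally identical part of the current second machine learning model, and part of the model parameters of the first machine learning model obtained in the previous round, to obtain the first machine learning model.
Taking N equal to 2 as an example, the model training end performs the second round of training of the first machine learning model based on the first training sample set, part of the model parameters of the first initial model, and the model parameters of the structurally identical part obtained from the first round of training of the second machine learning model, obtains the first machine learning model, and executes step 303 or step 304. Taking N equal to 3 as an example, the model training end performs the third round of training of the first machine learning model based on the first training sample set, the first machine learning model obtained in the second round, and the model parameters of the structurally identical part obtained from the second round of training of the second machine learning model, obtains the first machine learning model, and executes step 303 or step 304.
When the first machine learning model and the second machine learning model have entirely identical structures, the process of performing the N-th round of model training based on the first training sample set and at least part of the model parameters of the current second machine learning model may include: performing the N-th round of model training based on the first training sample set and all model parameters of the current second machine learning model, to obtain the first machine learning model.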
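For the partially identical case, the starting parameters of round N combine two sources: the shared part comes from the other model and the remainder from the model's own previous round. The sketch below shows one way this could look. The layer names (`encoder`, `dose_head`, `contour_head`), the dictionary representation, and the helper `merge_shared_parameters` are assumptions for illustration; the disclosure does not prescribe a data layout.

```python
# Hypothetical sketch: build the starting parameters for round N of one model
# from its own previous-round parameters plus the peer's shared-part parameters.

def merge_shared_parameters(own_params, peer_params, shared_layers):
    """Return a new dict: shared layers from the peer, all others from own_params."""
    merged = dict(own_params)  # keep the model-specific layers
    for name in shared_layers:
        merged[name] = peer_params[name]  # overwrite the structurally identical part
    return merged

# Previous-round first model (e.g. dose prediction) and current second model
# (e.g. automatic delineation); only "encoder" is structurally identical here.
first_prev = {"encoder": [0.1, 0.2], "dose_head": [0.3]}
second_current = {"encoder": [0.9, 0.8], "contour_head": [0.7]}

round_n_start = merge_shared_parameters(first_prev, second_current, ["encoder"])
```

When the structures are entirely identical, `shared_layers` would simply list every layer, which reduces to starting from all of the other model's parameters.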
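Step 303: if it is determined that the model metric of the first machine learning model satisfies a first preset metric, end the training of the first machine learning model.
After each round of training, the model metric of the first machine learning model is computed and checked against the first preset metric; if the first preset metric is satisfied, the training of the first machine learning model ends.
In one embodiment, the model metric includes the accuracy of the output results, and the first preset metric includes a first preset accuracy.
The model training end computes the accuracy of the output results of the first machine learning model and compares it with the first preset accuracy. If the accuracy is greater than the first preset accuracy, the model metric of the first machine learning model is determined to satisfy the first preset metric; if the accuracy is less than or equal to the first preset accuracy, the model metric is determined not to satisfy the first preset metric.
The process of computing the accuracy of the output results of the first machine learning model may include: inputting a preset number of test samples into the first machine learning model to obtain the output result for each test sample; counting the output results that agree with the annotations of the test samples; and computing the ratio of that count to the preset number to obtain the accuracy of the output results. The test samples may be selected from the first training sample set or acquired in the same way as the first training sample set; the embodiments of the present disclosure do not limit this, nor do they limit the preset number.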
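The accuracy check can be written in a few lines. The sketch below is illustrative only: the threshold classifier `toy_model`, the test samples, and the preset accuracy of 0.7 are assumptions, while the counting rule and the strict "greater than" comparison follow the text above.

```python
# Hypothetical sketch of the accuracy-based stopping check.

def output_accuracy(model, test_samples):
    """Ratio of outputs that agree with the annotation to the number of test samples."""
    matches = sum(1 for features, label in test_samples if model(features) == label)
    return matches / len(test_samples)

def satisfies_preset(accuracy, preset_accuracy):
    """Satisfied only when strictly greater; equal counts as not satisfied."""
    return accuracy > preset_accuracy

def toy_model(x):
    # Stand-in for a trained model: classify by a fixed threshold.
    return 1 if x >= 0.5 else 0

test_samples = [(0.9, 1), (0.2, 0), (0.7, 1), (0.4, 1)]  # last one is misclassified
acc = output_accuracy(toy_model, test_samples)
stop_training = satisfies_preset(acc, preset_accuracy=0.7)
```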
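In one embodiment, as shown in FIG. 5, the embodiments of the present disclosure may further include the following step:
Step 304: if it is determined that the model metric of the first machine learning model does not satisfy the first preset metric, perform the (N+1)-th round of training of the first machine learning model.
After each round of training, if the model metric of the first machine learning model does not satisfy the first preset metric, the next round of training of the first machine learning model is performed.
Taking N equal to 2 as an example, if the model metric of the first machine learning model does not satisfy the first preset metric, at least part of the model parameters obtained from the second round of training of the second machine learning model are acquired; the first machine learning model is then trained for a third round based on the first training sample set, the first machine learning model obtained in the second round, and those parameters, and step 303 or step 304 is executed.
Taking N equal to 3 as an example, if the model metric of the first machine learning model does not satisfy the first preset metric, at least part of the model parameters obtained from the third round of training of the second machine learning model are acquired; the first machine learning model is then trained for a fourth round based on the first training sample set, the first machine learning model obtained in the third round, and those parameters, and step 303 or step 304 is executed.
In one embodiment, within each round of training, if it is determined from a preset loss function that the output results of the machine learning model do not meet a preset convergence condition, the batch gradient descent method is used to determine the descent gradient and training continues; the current round of training stops once the output results of the machine learning model are determined to meet the preset convergence condition.
For example, in the first round of training, if it is determined from the preset loss function that the output results of the first machine learning model do not meet the preset convergence condition, the batch gradient descent method is used to determine the descent gradient and training continues; the first round of training stops once the output results of the first machine learning model are determined to meet the preset convergence condition. The second round of training likewise uses the preset loss function and the batch gradient descent method. The embodiments of the present disclosure do not limit the preset loss function or the preset convergence condition.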
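A single round of this kind can be sketched as follows. Because the disclosure leaves the loss function and the convergence condition open, the choices here are assumptions: mean squared error as the preset loss, "loss below `tol`" as the preset convergence condition, a one-variable linear model, and the names `mse_loss` and `train_round`. A `max_steps` guard is added so the loop always terminates.

```python
# Hypothetical sketch: one training round that runs batch gradient descent
# until an assumed convergence condition (loss < tol) is met.

def mse_loss(w, b, samples):
    return sum((w * x + b - y) ** 2 for x, y in samples) / len(samples)

def train_round(w, b, samples, lr=0.1, tol=1e-8, max_steps=10_000):
    n = len(samples)
    loss = mse_loss(w, b, samples)
    steps = 0
    while loss >= tol and steps < max_steps:
        # Batch gradient: averaged over the whole local sample set.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        w -= lr * grad_w
        b -= lr * grad_b
        loss = mse_loss(w, b, samples)
        steps += 1
    return w, b, loss

samples = [(-1.0, -1.0), (0.0, 1.0), (1.0, 3.0)]  # toy data following y = 2x + 1
w, b, final_loss = train_round(0.0, 0.0, samples)
```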
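In the above process of performing multiple rounds of model training based on the first training sample set to obtain the first machine learning model, a first round of model training is performed based on the first initial model parameters and the first training sample set to obtain the first initial model; an N-th round of model training is performed based on the first training sample set and at least part of the model parameters of the current second machine learning model to obtain the first machine learning model; if the model metric of the first machine learning model is determined to satisfy the first preset metric, the training of the first machine learning model ends; and if it is determined not to satisfy the first preset metric, the (N+1)-th round of training of the first machine learning model is performed. In the embodiments of the present disclosure, a first machine learning model that satisfies the first preset metric can be trained using only at least part of the model parameters of the second machine learning model, without the second training sample set. This keeps the training samples secure and can also improve training speed and model accuracy.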
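In one embodiment, as shown in FIG. 6, the step of performing multiple rounds of model training based on the second training sample set to obtain the second machine learning model may include:
Step 401: perform an M-th round of model training based on the second training sample set and at least part of the model parameters of the current first machine learning model to obtain the second machine learning model, where M is a positive integer.
After the first initial model is obtained by training the first machine learning model, the model training end performs the M-th round of training of the second machine learning model based on the second training sample set and at least part of the model parameters of the first initial model.
When the first machine learning model and the second machine learning model have partially identical structures, the process of performing the M-th round of model training based on the second training sample set and the model parameters of the current first machine learning model may include: performing a first round of training based on the second training sample set, the model parameters of the structurally identical part of the first initial model, and second initial model parameters, to obtain a second initial model, at least part of whose parameters are used to train the first machine learning model; and then continuing model training based on the second training sample set, the model parameters of the structurally identical part of the current first machine learning model, and part of the model parameters of the second machine learning model obtained in the previous round, to obtain the second machine learning model.
For example, after the first round of training is completed, the model training end performs the second round of training of the second machine learning model based on the second training sample set, the second machine learning model obtained in the first round, and at least part of the model parameters obtained from the second round of training of the first machine learning model, obtains the second machine learning model, and then executes step 402 or step 403.
After the second round of training is completed, if training needs to continue, the model training end performs the third round of training of the second machine learning model based on the second training sample set, the second machine learning model obtained in the second round, and at least part of the model parameters obtained from the third round of training of the first machine learning model, obtains the second machine learning model, and then executes step 402 or step 403.
When the first machine learning model and the second machine learning model have entirely identical structures, the process of performing the M-th round of model training based on the second training sample set and the model parameters of the current first machine learning model may include: performing the M-th round of model training based on the second training sample set and all model parameters of the current first machine learning model, to obtain the second machine learning model.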
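Step 402: if the model metric of the second machine learning model satisfies a second preset metric, end the training of the second machine learning model.
After each round of training, the model metric of the second machine learning model is computed and checked against the second preset metric; if the second preset metric is satisfied, the training of the second machine learning model ends.
In one embodiment, the model metric includes the accuracy of the output results, and the second preset metric includes a second preset accuracy.
In practice, the accuracy of the output results of the second machine learning model may be computed and compared with the second preset accuracy. If the accuracy is greater than the second preset accuracy, the model metric of the second machine learning model is determined to satisfy the second preset metric; if the accuracy is less than or equal to the second preset accuracy, the model metric is determined not to satisfy the second preset metric.
The process of computing the accuracy of the output results of the second machine learning model may include: inputting a preset number of test samples into the second machine learning model to obtain the output result for each test sample; counting the output results that agree with the annotations of the test samples; and computing the ratio of that count to the preset number to obtain the accuracy of the output results. The test samples may be selected from the second training sample set or acquired in the same way as the second training sample set; the embodiments of the present disclosure do not limit this, nor do they limit the preset number.
The first preset metric and the second preset metric may be the same or different; the embodiments of the present disclosure do not limit this.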
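In one embodiment, as shown in FIG. 7, the embodiments of the present disclosure may further include the following step:
Step 403: if it is determined that the model metric of the second machine learning model does not satisfy the second preset metric, perform the (M+1)-th round of training of the second machine learning model.
After each round of training, if the model metric of the second machine learning model does not satisfy the second preset metric, the next round of training of the second machine learning model is performed.
For example, after the second round of training, if the model metric of the second machine learning model does not satisfy the second preset metric, the model parameters obtained from the third round of training of the first machine learning model are acquired; the second machine learning model is then trained for a third round based on the second training sample set, the second machine learning model obtained in the second round, and those parameters.
After the third round of training, if the model metric of the second machine learning model does not satisfy the second preset metric, the model parameters obtained from the fourth round of training of the first machine learning model are acquired; the second machine learning model is then trained for a fourth round based on the second training sample set, the second machine learning model obtained in the third round, and those parameters. This continues until the model metric of the second machine learning model satisfies the second preset metric.
In one embodiment, within each round of training, if it is determined from a preset loss function that the output results of the machine learning model do not meet a preset convergence condition, the batch gradient descent method is used to determine the descent gradient and training continues; the current round of training stops once the output results of the machine learning model are determined to meet the preset convergence condition.
For example, in the first round of training, if it is determined from the preset loss function that the output results of the second machine learning model do not meet the preset convergence condition, the batch gradient descent method is used to determine the descent gradient and training continues; the first round of training stops once the output results of the second machine learning model are determined to meet the preset convergence condition. The second round of training likewise uses the preset loss function and the batch gradient descent method.
In the above process of performing multiple rounds of model training based on the second training sample set to obtain the second machine learning model, an M-th round of model training is performed based on the second training sample set and at least part of the model parameters of the current first machine learning model to obtain the second machine learning model; if the model metric of the second machine learning model satisfies the second preset metric, the training of the second machine learning model ends; and if it is determined not to satisfy the second preset metric, the (M+1)-th round of training of the second machine learning model is performed. In the embodiments of the present disclosure, a second machine learning model that satisfies the second preset metric can be trained using only the model parameters of the first machine learning model, without the first training sample set. This keeps the training samples secure and can also improve training speed and model accuracy.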
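In one embodiment, the training of the first machine learning model and the training of the second machine learning model alternate. The following description takes the case in which the two models have partially identical structures as an example. As shown in FIG. 8, the following steps may be included:
Step 501: perform a first round of model training based on first initial model parameters and the first training sample set to obtain a first initial model.
Step 502: perform a first round of training based on the second training sample set, the model parameters of the structurally identical part of the first initial model, and second initial model parameters, to obtain a second initial model.
Here, identical structure includes identical model layers, identical connection relationships, and the like.
Step 503: continue model training based on the first training sample set, at least part of the model parameters of the current second machine learning model, and part of the model parameters of the first machine learning model obtained in the previous round, to obtain the first machine learning model.
Step 504: continue model training based on the second training sample set, the model parameters of the structurally identical part of the current first machine learning model, and part of the model parameters of the second machine learning model obtained in the previous round, to obtain the second machine learning model.
Step 505: if it is determined that the model metric of the first machine learning model satisfies the first preset metric and the model metric of the second machine learning model satisfies the second preset metric, end the training of the first machine learning model and the second machine learning model.
Step 506: if it is determined that the model metric of the first machine learning model does not satisfy the first preset metric and/or the model metric of the second machine learning model does not satisfy the second preset metric, perform the next round of training of the first machine learning model and the second machine learning model.
In one embodiment, the first machine learning model and the second machine learning model have identical structures. In that case, in step 503, model training continues based on the first training sample set and all model parameters of the current second machine learning model to obtain the first machine learning model; in step 504, model training continues based on the second training sample set and all model parameters of the current first machine learning model to obtain the second machine learning model.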
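The control flow of steps 501 to 506 can be sketched as a single loop. Everything concrete below is an assumption made so the sketch runs on its own: the function `alternate_training`, the toy "training" functions that move each parameter halfway toward a target, the metric functions, the layer names `shared` and `head`, the preset metric of 0.95, and the `max_rounds` guard. Only the ordering of the steps is taken from the text.

```python
# Hypothetical sketch of the alternating loop (steps 501-506) for two models
# with a partially identical structure ("shared") and a model-specific part ("head").

def alternate_training(train_first, train_second, metric_first, metric_second,
                       first_init, second_init, first_preset, second_preset,
                       shared_layers, max_rounds=100):
    def with_shared(own, peer):
        # Own parameters, with the structurally identical part taken from the peer.
        return {**own, **{name: peer[name] for name in shared_layers}}

    first = train_first(first_init)                          # step 501
    second = train_second(with_shared(second_init, first))   # step 502
    rounds = 1
    while rounds < max_rounds:
        # Step 505: stop only when both models satisfy their preset metrics.
        if metric_first(first) > first_preset and metric_second(second) > second_preset:
            break
        # Step 506 leads back to steps 503 and 504 for the next round.
        first = train_first(with_shared(first, second))      # step 503
        second = train_second(with_shared(second, first))    # step 504
        rounds += 1
    return first, second, rounds

# Toy stand-ins: each "round" moves every parameter halfway to its target value.
def make_trainer(head_target):
    def train(p):
        return {"shared": p["shared"] + 0.5 * (1.0 - p["shared"]),
                "head": p["head"] + 0.5 * (head_target - p["head"])}
    return train

def make_metric(head_target):
    def metric(p):
        return 1.0 - (abs(p["shared"] - 1.0) + abs(p["head"] - head_target)) / 2
    return metric

first, second, rounds = alternate_training(
    make_trainer(1.0), make_trainer(-1.0), make_metric(1.0), make_metric(-1.0),
    {"shared": 0.0, "head": 0.0}, {"shared": 0.0, "head": 0.0},
    first_preset=0.95, second_preset=0.95, shared_layers=["shared"])
```

In this sketch the shared part improves twice per round because each model continues from where the other left off, while each `head` is updated only by its own model.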
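In one embodiment, the training of the first machine learning model and the training of the second machine learning model are carried out in two separate networks.
For example, the first model training end and the second model training end of the model training system are located in two separate networks and cannot communicate with each other over a network. In this case, the model parameters of the first machine learning model and of the second machine learning model can be passed via a storage medium.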
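Passing parameters via a storage medium amounts to writing them to a file at one site and reading the file at the other. The sketch below assumes JSON as the file format and a temporary directory standing in for a removable drive; the names `export_parameters`, `import_parameters`, and `first_model_params.json` are hypothetical, and the disclosure does not specify any of them.

```python
# Hypothetical sketch: hand over model parameters through a file on a
# transfer medium when the two training ends have no network connection.
import json
import os
import tempfile

def export_parameters(params, path):
    """Write only the model parameters to the medium; no training samples."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(params, f)

def import_parameters(path):
    """Read the parameters back at the receiving training end."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

# Temporary directory used here as a stand-in for a portable hard disk.
medium_dir = tempfile.mkdtemp()
param_file = os.path.join(medium_dir, "first_model_params.json")

first_model_params = {"encoder.weight": [0.12, -0.40], "encoder.bias": [0.05]}
export_parameters(first_model_params, param_file)  # at the first model training end
received = import_parameters(param_file)           # at the second model training end
```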
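In one embodiment, only model parameters are passed during the training of the first machine learning model and the second machine learning model.
In one embodiment, the first machine learning model and the second machine learning model have the same structure and the same application.
For example, the first machine learning model is a dose prediction model trained with the first training sample set, and the second machine learning model is a dose prediction model trained with the second training sample set; the two models have the same structure, and both are applied to dose prediction.
In one embodiment, the first machine learning model and the second machine learning model have different applications.
For example, the first machine learning model is a dose prediction model and the second machine learning model is an automatic delineation model; the structures of the two models may be partially identical or entirely identical.
In one embodiment, the embodiments of the present disclosure may further include: combining the first machine learning model and the second machine learning model to obtain a target machine learning model.
The model training end combines the first machine learning model and the second machine learning model to obtain the combined target machine learning model. For example, combining a dose prediction model with an automatic delineation model yields a target machine learning model that first performs dose prediction and then performs automatic delineation, which gives the model broader functionality.
In the above training process, the first machine learning model and the second machine learning model are trained alternately. The first machine learning model is trained without the second training sample set and the second machine learning model is trained without the first training sample set, which keeps the training samples secure; and because each model is trained using the other model's parameters, the training speed and the accuracy of the machine learning models can be improved.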
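In one embodiment, as shown in FIG. 9, a training method for machine learning models is provided. The method is described using its application to the model training system in FIG. 1 as an example and includes the following steps:
Step 601: acquire at least two training sample sets.
The training samples in the training sample sets include medical images obtained by a medical scanning device scanning a scan subject, for example CT images, CBCT images, PET images, MR images, or ultrasound images. The model training system acquires at least two training sample sets; for the manner of acquisition, see step 201.
Step 602: perform multiple rounds of model training based on each training sample set to obtain the machine learning model corresponding to each training sample set.
Every two of the at least two machine learning models have at least partially identical structures, and training one of them uses, at least in part, the model parameters of the structurally identical part of another.
For each machine learning model, the model parameters of the current round are acquired, and model training is performed based on the training sample set corresponding to that machine learning model and the model parameters of the current round, to obtain the machine learning model. The model parameters of the current round include initial model parameters or the model parameters of the structurally identical part of another machine learning model.
For example, three training sample sets are acquired to train three machine learning models, where the first and second machine learning models, the second and third machine learning models, and the first and third machine learning models each have at least partially identical structures. During training, initial model parameters for the first round are acquired for the first machine learning model, and the first round of model training is performed based on the first training sample set and the initial model parameters. Next, for the second machine learning model, at least part of the model parameters of the first machine learning model are acquired, and the first round of model training is performed based on the second training sample set and those parameters. Then, for the third machine learning model, at least part of the model parameters of the second machine learning model are acquired, and the first round of model training is performed based on the third training sample set and those parameters.
After the first round, for the first machine learning model, at least part of the model parameters of the third machine learning model are acquired, and the second round of model training is performed based on the first training sample set and those parameters. For the second machine learning model, at least part of the model parameters of the first machine learning model are acquired, and the second round of model training is performed based on the second training sample set and those parameters. For the third machine learning model, at least part of the model parameters of the second machine learning model are acquired, and the second round of model training is performed based on the third training sample set and those parameters. Training proceeds in this order for subsequent rounds.
During training, model parameters are passed among the at least two machine learning models in a preset order. For example, at least part of the model parameters of the first machine learning model are passed to the second machine learning model, at least part of the model parameters of the second machine learning model are passed to the third machine learning model, and at least part of the model parameters of the third machine learning model are passed to the first machine learning model. In other embodiments, the model parameters may be passed in other ways; the embodiments of the present disclosure do not limit the preset order.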
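The ring-shaped passing order in the three-model example can be written down as a small schedule generator. The function name `ring_schedule` and the tuple layout are assumptions; the order itself (1 to 2, 2 to 3, 3 back to 1, with the very first training step starting from initial parameters) follows the example above and is only one of the preset orders the text allows.

```python
# Hypothetical sketch: enumerate which model's parameters each model receives
# in each round when parameters are passed around a ring.

def ring_schedule(num_models, num_rounds):
    """Yield (round, receiver, sender); sender is None when initial parameters are used.

    Models are numbered from 1, matching "first", "second", "third" in the text.
    """
    for rnd in range(1, num_rounds + 1):
        for receiver in range(1, num_models + 1):
            if rnd == 1 and receiver == 1:
                yield (rnd, receiver, None)  # first model, first round: initial parameters
            else:
                sender = receiver - 1 if receiver > 1 else num_models
                yield (rnd, receiver, sender)

schedule = list(ring_schedule(num_models=3, num_rounds=2))
```

Each entry would be followed by a local training call on the receiver's own sample set; only parameters move between entries.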
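In one embodiment, the training of each machine learning model is carried out in a separate network.
For example, the first, second, and third machine learning models are trained in three separate networks that cannot communicate with one another.
In one embodiment, at least two machine learning models have the same structure and the same application.
For example, the first, second, and third machine learning models are all dose prediction models, and they have the same structure.
In one embodiment, the machine learning models have different applications.
For example, the first machine learning model is a dose prediction model, the second machine learning model is an automatic delineation model, and the third machine learning model is an efficacy evaluation model. The structures of every two machine learning models may be partially identical or entirely identical.
In the above training method for machine learning models, at least two training sample sets are acquired, and multiple rounds of model training are performed based on each training sample set to obtain the machine learning model corresponding to each training sample set. Every two machine learning models have at least partially identical structures, and training one of them uses, at least in part, the model parameters of the structurally identical part of another. The training samples are therefore kept secure, and the training speed and the accuracy of the machine learning models can be improved.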
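It should be understood that, although the steps in the flowcharts of FIG. 2 to FIG. 9 are shown in sequence as indicated by the arrows, they are not necessarily executed in that sequence. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in FIG. 2 to FIG. 9 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be executed at different moments; nor are they necessarily executed in sequence, and they may instead be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.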
在一个实施例中,如图10所示,提供了一种机器学习模型的训练装置,包括:
样本集获取模块701,用于获取第一训练样本集和第二训练样本集;第一训练样本集和第二训练样本集中的训练样本包括医学扫描设备对扫描对象进行扫描得到的医学影像;
第一训练模块702,用于基于第一训练样本集进行多轮模型训练,得到第一机器学习模型;
第二训练模块703,用于基于第二训练样本集进行多轮模型训练,得到第二机器学习模型;
其中,第一机器学习模型与第二机器学习模型至少部分结构相同,且训练第一机器学习模型时至少部分利用第二机器学习模型的模型参数,训练第二机器学习模型时至少部分利用第一机器学习模型的模型参数。
在其中一个实施例中,上述第一训练模块702,具体用于基于第一初始模型参数和第一训练样本集进行第一轮模型训练,得到第一初始模型;第一初始模型的至少部分模型参数用于训练所述第二机器学习模型;基于第一训练样本集和当前的第二机器学习模型的至少部分模型参数进行第N轮模型训练,得到第一机器学习模型;N为大于1的正整数;若确定第一机器学习模型的模型指标满足第一预设指标,则结束第一机器学习模型的训练。
在其中一个实施例中,第一机器学习模型与第二机器学习模型部分结构相同,上述第一训练模块702,具体用于基于第一训练样本集、当前的第二机器学习模型的结构相同部分的模型参数以及前一轮训练得到的第一机器学习模型的部分模型参数进行第N轮模型训练,得到第一机器学习模型。
在其中一个实施例中,第一机器学习模型与第二机器学习模型全部结构相同,上述第一训练模块702,具体用于基于第一训练样本集、当前的第二机器学习模型的全部模型参数进行第N轮模型训练,得到第一机器学习模型。
在其中一个实施例中,上述第一训练模块702,还用于若确定第一机器学习模型的模型指标不满足第一预设指标,则进行第一机器学习模型的第N+1轮训练。
在其中一个实施例中,上述第二训练模块703,具体用于基于第二训练样本集和当前的第一机器学习模型的至少部分模型参数进行第M轮模型训练,得到第二机器学习模型;其中,M为大于0的正整数;若第二机器学习模型的模型指标满足第二预设指标,则结束第二机器学习模型的训练。
在其中一个实施例中,第一机器学习模型与第二机器学习模型部分结构相同,上述第二训练模块703,具体用于基于第二训练样本集、第一初始模型的结构相同部分的模型参数以及第二初始模型参数进行第一轮训练,得到第二初始模型;第二初始模型的至少部分参数用于训练第一机器学习模型;基于第二训练样本集、当前的第一机器学习模型的结构相同部分的模型参数以及前一轮训练得到的第二机器学习模型的部分模型参数继续进行模型训练,得到第二机器学习模型。
在其中一个实施例中,第一机器学习模型与第二机器学习模型全部结构相同,上述第二训练模块703,具体用于基于第二训练样本集、当前的第一机器学习模型的全部模型参数进行第M轮模型训练,得到第二机器学习模型。
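In one embodiment, the second training module 703 is further configured to perform the (M+1)-th round of training of the second machine learning model if it is determined that the model metric of the second machine learning model does not satisfy the second preset metric.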
在其中一个实施例中,模型指标包括输出结果的准确率,第一预设指标包括第一预设准确率;第二预设指标包括第二预设准确率。
在其中一个实施例中,该装置还包括:
梯度下降模块,用于在每一轮训练过程中,若根据预设损失函数确定机器学习模型的输出结果不符合预设收敛条件,则利用批量梯度算法确定下降梯度并继续训练,直到确定机器学习模型的输出结果符合预设收敛条件时,停止本轮训练。
在其中一个实施例中,上述样本集获取模块701,具体用于获取第一医院的医学影像,并基于第一医院的医学影像生成第一训练样本集;获取第二医院的医学影像,并基于第二医院的医学影像生成第二训练样本集;其中,第一医院与第二医院为不同的医院。
在其中一个实施例中,第一机器学习模型和第二机器学习模型包括剂量预测模型、自动勾画模型、疗效评估模型、生存指标评估模型、癌症筛查模型和形变配准模型中的至少一种。
在其中一个实施例中,第一机器学习模型的模型参数和第二机器学习模型的模型参数通过网络传递;
或,第一机器学习模型的模型参数和第二机器学习模型的模型参数通过存储介质传递。
在其中一个实施例中,第一机器学习模型的训练和第二机器学习模型的训练分别在两个独立的网络进行。
在其中一个实施例中,第一机器学习模型的训练与第二机器学习模型的训练交替进行。
在其中一个实施例中,第一机器学习模型和第二机器学习模型的结构相同且应用相同。
在其中一个实施例中,第一机器学习模型和第二机器学习模型的应用不同。
在其中一个实施例中,训练第一机器学习模型和第二机器学习模型的过程中,仅传递模型参数。
在一个实施例中,如图11所示,提供了一种机器学习模型的训练装置,该装置包括:
样本集获取模块801,用于获取至少两个训练样本集;训练样本集中的训练样本包括医学扫描设备对扫描对象进行扫描得到的医学影像;
训练模块802,用于基于各训练样本集进行多轮模型训练,得到各训练样本集对应的机器学习模型;
其中,至少两个机器学习模型中的每两个机器学习模型至少部分结构相同,且训练其中一个机器学习模型时,至少部分利用另一个机器学习模型的结构相同部分的模型参数。
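In one embodiment, the training module 802 is configured to, for each machine learning model, acquire the model parameters of the current round and perform model training based on the training sample set corresponding to that machine learning model and the model parameters of the current round, to obtain the machine learning model; the model parameters of the current round include initial model parameters or the model parameters of the structurally identical part of another machine learning model.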
在其中一个实施例中,各机器学习模型的训练在独立的网络进行。
在其中一个实施例中,至少两个机器学习模型的结构相同且应用相同。
在其中一个实施例中,各机器学习模型的应用不同。
关于机器学习模型的训练装置的具体限定可以参见上文中对于机器学习模型的训练方法的限定,在此不再赘述。上述机器学习模型的训练装置中的各个模块可全部或部分通过软件、硬件及其组合来实现。上述各模块可以硬件形式内嵌于或独立于计算机设备中的处理器中,也可以以软件形式存储于计算机设备中的存储器中,以便于处理器调用执行以上各个模块对应的操作。
在一个实施例中,提供了一种计算机设备,该计算机设备可以是终端,其内部结构图可以如图12所示。该计算机设备包括通过系统总线连接的处理器、存储器、通信接口、显示屏和输入装置。其中,该计算机设备的处理器用于提供计算和控制能力。该计算机设备的存储器包括非易失性存储介质、内存储器。该非易失性存储介质存储有操作系统和计算机程序。该内存储器为非易失性存储介质中的操作系统和计算机程序的运行提供环境。该计算机设备的通信接口用于与外部的终端进行有线或无线方式的通信,无线方式可通过WIFI、运营商网络、NFC(近场通信)或其他技术实现。该计算机程序被处理器执行时以实现一种机器学习模型的训练方法。该计算机设备的显示屏可以是液晶显示屏或者电子墨水显示屏,该计算机设备的输入装置可以是显示屏上覆盖的触摸层,也可以是计算机设备外壳上设置的按键、轨迹球或触控板,还可以是外接的键盘、触控板或鼠标等。
本领域技术人员可以理解,图12中示出的结构,仅仅是与本申请方案相关的部分结构的框图,并不构成对本申请方案所应用于其上的计算机设备的限定,具体的计算机设备可以包括比图中所示更多或更少的部件,或者组合某些部件,或者具有不同的部件布置。
在一个实施例中,提供了一种计算机设备,包括存储器和处理器,存储器中存储有计算机程序,该处理器执行计算机程序时实现以下步骤:
获取第一训练样本集和第二训练样本集;第一训练样本集和第二训练样本集中的训练样本包括医学扫描设备对扫描对象进行扫描得到的医学影像;
基于第一训练样本集进行多轮模型训练,得到第一机器学习模型;
基于第二训练样本集进行多轮模型训练,得到第二机器学习模型;
其中,第一机器学习模型与第二机器学习模型至少部分结构相同,且训练第一机器学习模型时至少部分利用第二机器学习模型的模型参数,训练第二机器学习模型时至少部分利用第一机器学习模型的模型参数。
在一个实施例中,处理器执行计算机程序时还实现以下步骤:
基于第一初始模型参数和第一训练样本集进行第一轮模型训练,得到第一初始模型;第一初始模型的至少部分模型参数用于训练第二机器学习模型;
基于第一训练样本集和当前的第二机器学习模型的至少部分模型参数进行第N轮模型训练,得到第一机器学习模型;N为大于1的正整数;
若确定第一机器学习模型的模型指标满足第一预设指标,则结束第一机器学习模型的训练。
在一个实施例中,第一机器学习模型与第二机器学习模型部分结构相同,处理器执行计算机程序时还实现以下步骤:
基于第一训练样本集、当前的第二机器学习模型的结构相同部分的模型参数以及前一轮训练得到的第一机器学习模型的部分模型参数进行第N轮模型训练,得到第一机器学习模型。
在一个实施例中,第一机器学习模型与第二机器学习模型全部结构相同,处理器执行计算机程序时还实现以下步骤:
基于第一训练样本集、当前的第二机器学习模型的全部模型参数进行第N轮模型训练,得到第一机器学习模型。
在一个实施例中,处理器执行计算机程序时还实现以下步骤:
若确定第一机器学习模型的模型指标不满足第一预设指标,则进行第一机器学习模型的第N+1轮训练。
在一个实施例中,处理器执行计算机程序时还实现以下步骤:
基于第二训练样本集和当前的第一机器学习模型的至少部分模型参数进行第M轮模型训练,得到第二机器学习模型;其中,M为大于0的正整数;
若第二机器学习模型的模型指标满足第二预设指标,则结束第二机器学习模型的训练。
在一个实施例中,第一机器学习模型与第二机器学习模型部分结构相同,处理器执行计算机程序时还实现以下步骤:
基于第二训练样本集、第一初始模型的结构相同部分的模型参数以及第二初始模型参数进行第一轮训练,得到第二初始模型;第二初始模型的至少部分参数用于训练第一机器学习模型;
基于第二训练样本集、当前的第一机器学习模型的结构相同部分的模型参数以及前一轮训练得到的第二机器学习模型的部分模型参数继续进行模型训练,得到第二机器学习模型。
在一个实施例中,第一机器学习模型与第二机器学习模型全部结构相同,处理器执行计算机程序时还实现以下步骤:
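performing the M-th round of model training based on the second training sample set and all model parameters of the current first machine learning model, to obtain the second machine learning model.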
在一个实施例中,处理器执行计算机程序时还实现以下步骤:
若确定第二机器学习模型的模型指标不满足第二预设指标,则进行第二机器学习模型的第M+1轮训练。
在一个实施例中,模型指标包括输出结果的准确率,第一预设指标包括第一预设准确率;第二预设指标包括第二预设准确率。
在一个实施例中,处理器执行计算机程序时还实现以下步骤:
在每一轮训练过程中,若根据预设损失函数确定机器学习模型的输出结果不符合预设收敛条件,则利用批量梯度算法确定下降梯度并继续训练,直到确定机器学习模型的输出结果符合预设收敛条件时,停止本轮训练。
在一个实施例中,处理器执行计算机程序时还实现以下步骤:
获取第一医院的医学影像,并基于第一医院的医学影像生成第一训练样本集;
获取第二医院的医学影像,并基于第二医院的医学影像生成第二训练样本集;
其中,第一医院与第二医院为不同的医院。
在一个实施例中,第一机器学习模型和第二机器学习模型包括剂量预测模型、自动勾画模型、疗效评估模型、生存指标评估模型、癌症筛查模型和形变配准模型中的至少一种。
在一个实施例中,第一机器学习模型的模型参数和第二机器学习模型的模型参数通过网络传递;
或,第一机器学习模型的模型参数和第二机器学习模型的模型参数通过存储介质传递。
在一个实施例中,第一机器学习模型的训练和第二机器学习模型的训练分别在两个独立的网络进行。
在一个实施例中,第一机器学习模型的训练与第二机器学习模型的训练交替进行。
在一个实施例中,第一机器学习模型和第二机器学习模型的结构相同且应用相同。
在一个实施例中,第一机器学习模型和第二机器学习模型的应用不同。
在一个实施例中,训练第一机器学习模型和第二机器学习模型的过程中,仅传递模型参数。
在一个实施例中,处理器执行计算机程序时还实现以下步骤:
将第一机器学习模型与第二机器学习模型进行组合处理,得到目标机器学习模型。
在一个实施例中,提供了一种计算机可读存储介质,其上存储有计算机程序,计算机程序被处理器执行时实现以下步骤:
获取第一训练样本集和第二训练样本集;第一训练样本集和第二训练样本集中的训练样本包括医学扫描设备对扫描对象进行扫描得到的医学影像;
基于第一训练样本集进行多轮模型训练,得到第一机器学习模型;
基于第二训练样本集进行多轮模型训练,得到第二机器学习模型;
其中,第一机器学习模型与第二机器学习模型至少部分结构相同,且训练第一机器学习模型时至少部分利用第二机器学习模型的模型参数,训练第二机器学习模型时至少部分利用第一机器学习模型的模型参数。
在一个实施例中,计算机程序被处理器执行时还实现以下步骤:
基于第一初始模型参数和第一训练样本集进行第一轮模型训练,得到第一初始模型;第一初始模型的至少部分模型参数用于训练第二机器学习模型;
基于第一训练样本集和当前的第二机器学习模型的至少部分模型参数进行第N轮模型训练,得到第一机器学习模型;N为大于1的正整数;
若确定第一机器学习模型的模型指标满足第一预设指标,则结束第一机器学习模型的训练。
在一个实施例中,第一机器学习模型与第二机器学习模型部分结构相同,计算机程序被处理器执行时还实现以下步骤:
基于第一训练样本集、当前的第二机器学习模型的结构相同部分的模型参数以及前一轮训练得到的第一机器学习模型的部分模型参数进行第N轮模型训练,得到第一机器学习模型。
在一个实施例中,第一机器学习模型与第二机器学习模型全部结构相同,计算机程序被处理器执行时还实现以下步骤:
基于第一训练样本集、当前的第二机器学习模型的全部模型参数进行第N轮模型训练,得到第一机器学习模型。
在一个实施例中,计算机程序被处理器执行时还实现以下步骤:
若确定第一机器学习模型的模型指标不满足第一预设指标,则进行第一机器学习模型的第N+1轮训练。
在一个实施例中,计算机程序被处理器执行时还实现以下步骤:
基于第二训练样本集和当前的第一机器学习模型的至少部分模型参数进行第M轮模型训练,得到第二机器学习模型;其中,M为大于0的正整数;
若第二机器学习模型的模型指标满足第二预设指标,则结束第二机器学习模型的训练。
在一个实施例中,第一机器学习模型与第二机器学习模型部分结构相同,计算机程序被处理器执行时还实现以下步骤:
基于第二训练样本集、第一初始模型的结构相同部分的模型参数以及第二初始模型参数进行第一轮训练,得到第二初始模型;第二初始模型的至少部分参数用于训练第一机器学习模型;
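continuing model training based on the second training sample set, the model parameters of the structurally identical part of the current first machine learning model, and part of the model parameters of the second machine learning model obtained in the previous round, to obtain the second machine learning model.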
在一个实施例中,第一机器学习模型与第二机器学习模型全部结构相同,计算机程序被处理器执行时还实现以下步骤:
基于第二训练样本集、当前的第一机器学习模型的全部模型参数进行第M轮模型训练,得到第二机器学习模型。
在一个实施例中,计算机程序被处理器执行时还实现以下步骤:
若确定第二机器学习模型的模型指标不满足第二预设指标,则进行第二机器学习模型的第M+1轮训练。
在一个实施例中,模型指标包括输出结果的准确率,第一预设指标包括第一预设准确率;第二预设指标包括第二预设准确率。
在一个实施例中,计算机程序被处理器执行时还实现以下步骤:
在每一轮训练过程中,若根据预设损失函数确定机器学习模型的输出结果不符合预设收敛条件,则利用批量梯度算法确定下降梯度并继续训练,直到确定机器学习模型的输出结果符合预设收敛条件时,停止本轮训练。
在一个实施例中,计算机程序被处理器执行时还实现以下步骤:
获取第一医院的医学影像,并基于第一医院的医学影像生成第一训练样本集;
获取第二医院的医学影像,并基于第二医院的医学影像生成第二训练样本集;
其中,第一医院与第二医院为不同的医院。
在一个实施例中,第一机器学习模型和第二机器学习模型包括剂量预测模型、自动勾画模型、疗效评估模型、生存指标评估模型、癌症筛查模型和形变配准模型中的至少一种。
在一个实施例中,第一机器学习模型的模型参数和第二机器学习模型的模型参数通过网络传递;
或,第一机器学习模型的模型参数和第二机器学习模型的模型参数通过存储介质传递。
在一个实施例中,第一机器学习模型的训练和第二机器学习模型的训练分别在两个独立的网络进行。
在一个实施例中,第一机器学习模型的训练与第二机器学习模型的训练交替进行。
在一个实施例中,第一机器学习模型和第二机器学习模型的结构相同且应用相同。
在一个实施例中,第一机器学习模型和第二机器学习模型的应用不同。
在一个实施例中,训练第一机器学习模型和第二机器学习模型的过程中,仅传递模型参数。
在一个实施例中,计算机程序被处理器执行时还实现以下步骤:
将第一机器学习模型与第二机器学习模型进行组合处理,得到目标机器学习模型。
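A person of ordinary skill in the art will understand that all or part of the procedures in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the procedures of the above method embodiments. Any reference to memory, storage, a database, or another medium used in the embodiments provided in this application may include at least one of non-volatile memory and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, a floppy disk, flash memory, optical memory, and the like. Volatile memory may include random access memory (RAM) or an external cache. By way of illustration and not limitation, RAM may take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).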
以上实施例的各技术特征可以进行任意的组合,为使描述简洁,未对上述实施例中的各个技术特征所有可能的组合都进行描述,然而,只要这些技术特征的组合不存在矛盾,都应当认为是本说明书记载的范围。
以上所述实施例仅表达了本申请的几种实施方式,其描述较为具体和详细,但并不能因此而理解为对发明专利范围的限制。应当指出的是,对于本领域的普通技术人员来说,在不脱离本申请构思的前提下,还可以做出若干变形和改进,这些都属于本申请的保护范围。因此,本申请专利的保护范围应以所附权利要求为准。

Claims (29)

  1. 一种机器学习模型的训练方法,其特征在于,所述方法包括:
    获取第一训练样本集和第二训练样本集;所述第一训练样本集和所述第二训练样本集中的训练样本包括医学扫描设备对扫描对象进行扫描得到的医学影像;
    基于所述第一训练样本集进行多轮模型训练,得到第一机器学习模型;
    基于所述第二训练样本集进行多轮模型训练,得到第二机器学习模型;
    其中,所述第一机器学习模型与所述第二机器学习模型至少部分结构相同,且训练所述第一机器学习模型时至少部分利用所述第二机器学习模型的模型参数,训练所述第二机器学习模型时至少部分利用所述第一机器学习模型的模型参数。
  2. 根据权利要求1所述的方法,其特征在于,所述基于所述第一训练样本集进行多轮模型训练,得到第一机器学习模型,包括:
    基于第一初始模型参数和所述第一训练样本集进行第一轮模型训练,得到第一初始模型,所述第一初始模型的至少部分模型参数用于训练所述第二机器学习模型;
    基于所述第一训练样本集和当前的第二机器学习模型的至少部分模型参数进行第N轮模型训练,得到所述第一机器学习模型;N为大于1的正整数;
    若确定所述第一机器学习模型的模型指标满足第一预设指标,则结束所述第一机器学习模型的训练。
  3. 根据权利要求2所述的方法,其特征在于,所述第一机器学习模型与所述第二机器学习模型部分结构相同,所述基于所述第一训练样本集和当前的第二机器学习模型的至少部分模型参数进行第N轮模型训练,得到所述第一机器学习模型,包括:
    基于所述第一训练样本集、当前的第二机器学习模型的结构相同部分的模型参数以及前一轮训练得到的第一机器学习模型的部分模型参数进行第N轮模型训练,得到所述第一机器学习模型。
  4. 根据权利要求2所述的方法,其特征在于,所述第一机器学习模型与所述第二机器学习模型全部结构相同,所述基于所述第一训练样本集和当前的第二机器学习模型的至少部分模型参数进行第N轮模型训练,得到所述第一机器学习模型,包括:
    基于所述第一训练样本集、当前的第二机器学习模型的全部模型参数进行第N轮模型训练,得到所述第一机器学习模型。
  5. 根据权利要求2所述的方法,其特征在于,所述方法还包括:
    若确定所述第一机器学习模型的模型指标不满足所述第一预设指标,则进行所述第一机器学习模型的第N+1轮训练。
  6. 根据权利要求2所述的方法,其特征在于,所述基于所述第二训练样本集进行多轮模型训练,得到第二机器学习模型,包括:
    基于所述第二训练样本集和当前的第一机器学习模型的至少部分模型参数进行第M轮模型训练,得到所述第二机器学习模型;其中,M为大于0的正整数;
    若所述第二机器学习模型的模型指标满足第二预设指标,则结束所述第二机器学习模型的训练。
  7. 根据权利要求6所述的方法,其特征在于,所述第一机器学习模型与所述第二机器学习模型部分结构相同,所述基于所述第二训练样本集和当前的第一机器学习模型的至少部分模型参数进行第M轮模型训练,得到所述第二机器学习模型,包括:
    基于所述第二训练样本集、所述第一初始模型的结构相同部分的模型参数以及第二初始模型参数进行第一轮训练,得到第二初始模型;所述第二初始模型的至少部分参数用于训练所述第一机器学习模型;
    基于所述第二训练样本集、当前的第一机器学习模型的结构相同部分的模型参数以及前一轮训练得到的第二机器学习模型的部分模型参数继续进行模型训练,得到所述第二机器学习模型。
  8. 根据权利要求6所述的方法,其特征在于,所述第一机器学习模型与所述第二机器学习模型全部结构相同,所述基于所述第二训练样本集和当前的第一机器学习模型的至少部分模型参数进行第M轮模型训练,得到所述第二机器学习模型,包括:
    基于所述第二训练样本集、当前的第一机器学习模型的全部模型参数进行第M轮模型训练,得到所述第二机器学习模型。
  9. 根据权利要求6所述的方法,其特征在于,所述方法还包括:
    若确定所述第二机器学习模型的模型指标不满足所述第二预设指标,则进行所述第二机器学习模型的第M+1轮训练。
  10. 根据权利要求6所述的方法,其特征在于,所述模型指标包括输出结果的准确率,所述第一预设指标包括第一预设准确率;所述第二预设指标包括第二预设准确率。
  11. 根据权利要求6所述的方法,其特征在于,所述方法还包括:
    在每一轮训练过程中,若根据预设损失函数确定机器学习模型的输出结果不符合预设收敛条件,则利用批量梯度算法确定下降梯度并继续训练,直到确定所述机器学习模型的输出结果符合所述预设收敛条件时,停止本轮训练。
  12. 根据权利要求1所述的方法,其特征在于,所述获取第一训练样本集和第二训练样本集,包括:
    获取所述第一医院的医学影像,并基于所述第一医院的医学影像生成所述第一训练样本集;
    获取所述第二医院的医学影像,并基于所述第二医院的医学影像生成所述第二训练样本集;
    其中,所述第一医院与所述第二医院为不同的医院。
  13. 根据权利要求1所述的方法,其特征在于,所述第一机器学习模型和所述第二机器学习模型包括剂量预测模型、自动勾画模型、疗效评估模型、生存指标评估模型、癌症筛查模型和形变配准模型中的至少一种。
  14. 根据权利要求1所述的方法,其特征在于,所述第一机器学习模型的模型参数和所述第二机器学习模型的模型参数通过网络传递;
    或,所述第一机器学习模型的模型参数和所述第二机器学习模型的模型参数通过存储介质传递。
  15. 根据权利要求1所述的方法,其特征在于,所述第一机器学习模型的训练和第二机器学习模型的训练分别在两个独立的网络进行。
  16. 根据权利要求1所述的方法,其特征在于,所述第一机器学习模型的训练与所述第二机器学习模型的训练交替进行。
  17. 根据权利要求1所述的方法,其特征在于,所述第一机器学习模型和第二机器学习模型的结构相同且应用相同。
  18. 根据权利要求1所述的方法,其特征在于,所述第一机器学习模型和第二机器学习模型的应用不同。
  19. 根据权利要求1所述的方法,其特征在于,训练所述第一机器学习模型和所述第二机器学习模型的过程中,仅传递模型参数。
  20. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    将所述第一机器学习模型与所述第二机器学习模型进行组合处理,得到目标机器学习模型。
  21. 一种机器学习模型的训练方法,其特征在于,所述方法包括:
    获取至少两个训练样本集;所述训练样本集中的训练样本包括医学扫描设备对扫描对象进行扫描得到的医学影像;
    基于各所述训练样本集进行多轮模型训练,得到各所述训练样本集对应的机器学习模型;
    其中,至少两个所述机器学习模型中的每两个机器学习模型至少部分结构相同,且训练其中一个机器学习模型时,至少部分利用另一个机器学习模型的结构相同部分的模型参数。
  22. 根据权利要求21所述的方法,其特征在于,所述基于各所述训练样本集进行多轮模型训练,得到各所述训练样本集对应的机器学习模型,包括:
    对于各所述机器学习模型,获取当前轮的模型参数,并基于所述机器学习模型对应的训练样本集和所述当前轮的模型参数进行模型训练,得到所述机器学习模型;其中,所述当前轮的模型参数包括初始模型参数或另一机器学习模型中结构相同部分的模型参数。
  23. 根据权利要求21所述的方法,其特征在于,各所述机器学习模型的训练在独立的网络进行。
  24. 根据权利要求21所述的方法,其特征在于,至少两个所述机器学习模型的结构相同且应用相同。
  25. 根据权利要求21所述的方法,其特征在于,各所述机器学习模型的应用不同。
  26. 一种机器学习模型的训练装置,其特征在于,所述装置包括:
    样本集获取模块,用于获取第一训练样本集和第二训练样本集;所述第一训练样本集和所述第二训练样本集中的训练样本包括医学扫描设备对扫描对象进行扫描得到的医学影像;
    第一训练模块,用于基于所述第一训练样本集进行多轮模型训练,得到第一机器学习模型;
    第二训练模块,用于基于所述第二训练样本集进行多轮模型训练,得到第二机器学习模型;
    其中,所述第一机器学习模型与所述第二机器学习模型至少部分结构相同;且训练所述第一机器学习模型时至少部分利用所述第二机器学习模型的模型参数,训练所述第二机器学习模型时至少部分利用所述第一机器学习模型的模型参数。
  27. 一种机器学习模型的训练装置,其特征在于,所述装置包括:
    样本集获取模块,用于获取至少两个训练样本集;所述训练样本集中的训练样本包括医学扫描设备对扫描对象进行扫描得到的医学影像;
    训练模块,用于基于各所述训练样本集进行多轮模型训练,得到各所述训练样本集对应的机器学习模型;
    其中,至少两个所述机器学习模型中的每两个机器学习模型至少部分结构相同,且训练其中一个机器学习模型时,至少部分利用另一个机器学习模型的结构相同部分的模型参数。
  28. 一种计算机设备,包括存储器和处理器,所述存储器存储有计算机程序,其特征在于,所述处理器执行所述计算机程序时实现权利要求1至25中任一项所述的方法的步骤。
  29. 一种计算机可读存储介质,其上存储有计算机程序,其特征在于,所述计算机程序被处理器执行时实现权利要求1至25中任一项所述的方法的步骤。
PCT/CN2021/105777, priority date 2021-07-12, filing date 2021-07-12: Method and apparatus for training machine learning models, computer device, and storage medium. Legal status: Ceased.
Published as WO2023283765A1 (zh) on 2023-01-19.