WO2023167448A1 - Method and apparatus for analyzing pathology slide images - Google Patents
Method and apparatus for analyzing pathology slide images
- Publication number
- WO2023167448A1 (application PCT/KR2023/002241)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- patch
- information
- slide image
- pathology slide
- machine learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/69—Microscopic objects, e.g. biological cells or cellular parts
- G06V20/698—Matching; Classification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10056—Microscopic image
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30024—Cell structures in vitro; Tissue sections in vitro
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
Definitions
- the present disclosure relates to methods and apparatus for analyzing pathology slide images.
- Digital pathology is a field in which histological information is obtained, or the prognosis of a subject is predicted, using a whole slide image generated by scanning a pathology slide.
- a pathology slide image can be obtained from a stained tissue sample of a subject.
- Tissue samples can be stained by various methods, such as hematoxylin and eosin, trichrome, periodic acid-Schiff, autoradiography, enzyme histochemistry, immunofluorescence, and immunohistochemistry. Stained tissue samples can be used for histology and biopsy evaluation, thereby providing a basis for determining whether to proceed to molecular profile analysis to understand disease status.
- Recognizing and detecting biological factors from pathology slide images has important effects on histological diagnosis of a specific disease, prediction of prognosis, and determination of treatment direction.
- If the performance of the machine learning model for detecting or segmenting biological elements from the pathology slide image is low, this may be an obstacle to establishing an accurate treatment plan for the subject.
- An object of the present invention is to provide a method and apparatus for analyzing a pathology slide image.
- Another object is to provide a computer-readable recording medium on which a program for executing the method on a computer is recorded.
- the technical problem to be solved is not limited to the technical problems described above, and other technical problems may exist.
- A computing device includes at least one memory and at least one processor, wherein the processor obtains a first pathology slide image in which at least one first object is expressed and biological information of the at least one first object, generates training data using at least one first patch included in the first pathology slide image and the biological information, trains a first machine learning model based on the training data, and analyzes a second pathology slide image using the trained first machine learning model.
- A method of analyzing a pathology slide image may include: acquiring a first pathology slide image in which at least one first object is expressed and biological information of the at least one first object; generating training data using at least one first patch included in the first pathology slide image and the biological information; training a first machine learning model based on the training data; and analyzing a second pathology slide image using the trained first machine learning model.
- a computer-readable recording medium includes a recording medium on which a program for executing the above-described method is recorded on a computer.
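The four steps of the method (acquire, generate training data, train, analyze) can be sketched end to end as follows. This is only an illustrative sketch: the nearest-centroid classifier stands in for the "first machine learning model", whose architecture is not specified here, and the function names, the mean-intensity feature, and the `"stroma"`/`"tumor"` labels are all assumptions made for the example.

```python
from statistics import mean

def make_training_data(patches, biological_info):
    """Step 2: pair each patch with the label its biological information provides."""
    return [(patch, biological_info[patch_id]) for patch_id, patch in patches.items()]

def patch_feature(patch):
    """Hypothetical feature: mean pixel intensity of a patch (list of pixel rows)."""
    return mean(value for row in patch for value in row)

def train_model(training_data):
    """Step 3: nearest-centroid stand-in for the first machine learning model."""
    features_by_label = {}
    for patch, label in training_data:
        features_by_label.setdefault(label, []).append(patch_feature(patch))
    return {label: mean(values) for label, values in features_by_label.items()}

def analyze(model, patch):
    """Step 4: assign the label whose centroid is closest to the patch feature."""
    feature = patch_feature(patch)
    return min(model, key=lambda label: abs(model[label] - feature))

# Step 1: toy patches from a first slide image plus (assumed) biological labels.
first_slide_patches = {"p0": [[10, 12], [11, 9]], "p1": [[200, 210], [190, 205]]}
biological_info = {"p0": "stroma", "p1": "tumor"}

model = train_model(make_training_data(first_slide_patches, biological_info))
result_a = analyze(model, [[180, 190], [185, 200]])  # patch from a second slide image
result_b = analyze(model, [[20, 15], [18, 22]])
```

In practice the label source would be the biological information described in this disclosure (for example, gene expression per patch) rather than hand-written strings.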
- FIG. 1 is a diagram for explaining an example of a system for analyzing a pathology slide image according to an exemplary embodiment.
- FIG. 2 is a block diagram of a system and network for preparing, processing, and reviewing slide images of tissue specimens using a machine learning model, according to one embodiment.
- FIG. 3A is a configuration diagram illustrating an example of a user terminal according to an embodiment.
- FIG. 3B is a configuration diagram illustrating an example of a server according to an embodiment.
- FIG. 4 is a flowchart illustrating an example of a method of processing a pathology slide image, according to an exemplary embodiment.
- FIG. 5 is a diagram for explaining examples of biological information according to an exemplary embodiment.
- FIG. 6 is a flowchart illustrating an example in which a processor obtains spatial transcriptomics information according to an exemplary embodiment.
- FIG. 7 is a diagram for explaining an example of learning data according to an exemplary embodiment.
- FIG. 8 is a flowchart illustrating another example of a method of processing a pathology slide image according to an exemplary embodiment.
- FIG. 9 is a diagram for explaining an example in which a processor predicts a treatment response of a subject according to an exemplary embodiment.
- FIG. 10 is a diagram for explaining an example in which a processor learns a first machine learning model according to an embodiment.
- FIG. 11 is a diagram for explaining another example in which a processor learns a first machine learning model according to an embodiment.
- FIG. 12 is a diagram for explaining an example in which an operation of a processor is implemented according to an exemplary embodiment.
- FIGS. 13A and 13B are diagrams for describing examples in which annotations are generated based on user input according to an exemplary embodiment.
- A computing device includes at least one memory and at least one processor, wherein the processor obtains a first pathology slide image in which at least one first object is expressed and biological information of the at least one first object, generates training data using at least one first patch included in the first pathology slide image and the biological information, trains a first machine learning model based on the training data, and analyzes a second pathology slide image using the trained first machine learning model.
- Terms such as "~unit" and "~module" described in the specification mean a unit that processes at least one function or operation, which may be implemented as hardware, software, or a combination of hardware and software.
- A "pathology slide image" may refer to an image obtained by photographing a pathology slide that has been fixed and stained through a series of chemical treatment processes applied to tissue or the like removed from a human body.
- The pathology slide image may refer to a whole slide image (WSI) including a high-resolution image of the entire slide, and may also refer to a part of the whole slide image, for example, one or more patches.
- A pathology slide image may refer to a digital image photographed or scanned through a scanning device (e.g., a digital scanner), and may include information about a specific protein, cell, tissue, and/or structure in the human body.
- the pathology slide image may include one or more patches, and histological information may be applied to the one or more patches through an annotation (eg, tagging).
- Medical information may refer to any medically meaningful information that can be extracted from a medical image. For example, it may include, but is not limited to, information about a specific tissue (e.g., cancer tissue, cancer stromal tissue, etc.) and/or specific cells (e.g., tumor cells, lymphoid cells, macrophage cells, endothelial cells, fibroblast cells, etc.) in the medical image, cancer diagnosis information, information related to the subject's possibility of developing cancer, and/or medical conclusions related to cancer treatment.
- medical information may include not only quantified numerical values obtained from medical images, but also information visualizing numerical values, prediction information according to numerical values, image information, statistical information, and the like. The medical information generated in this way may be provided to a user terminal or output or transferred to a display device and displayed.
- FIG. 1 is a diagram for explaining an example of a system for analyzing a pathology slide image according to an exemplary embodiment.
- a system 1 includes a user terminal 10 and a server 20 .
- the user terminal 10 and the server 20 may be connected in a wired or wireless communication method to transmit and receive data (eg, image data, etc.) between them.
- FIG. 1 shows that the system 1 includes the user terminal 10 and the server 20, but is not limited thereto.
- The system 1 may include other external devices (not shown), and the operations of the user terminal 10 and the server 20 described below may be implemented by a single device (e.g., the user terminal 10 or the server 20) or by more devices.
- the user terminal 10 may be a computing device including a display device and a device (eg, a keyboard, mouse, etc.) that receives a user input, and includes a memory and a processor.
- the user terminal 10 may include a notebook PC, a desktop PC, a laptop, a tablet computer, a smart phone, and the like, but is not limited thereto.
- the server 20 may be a device that communicates with an external device (not shown) including the user terminal 10 .
- The server 20 may store various data, including a pathology slide image, a bitmap image corresponding to the pathology slide image, information generated by analyzing the pathology slide image (e.g., information about at least one tissue and cell expressed in the pathology slide image, biomarker expression information, etc.), and information on machine learning models used for analysis of pathology slide images.
- the server 20 may be a computing device including a memory and a processor and having its own computing capability. When the server 20 is a computing device, the server 20 may perform at least some of the operations of the user terminal 10 to be described later with reference to FIGS. 1 to 13B .
- the server 20 may be a cloud server, but is not limited thereto.
- the user terminal 10 outputs a pathology slide image and/or an image representing information generated through analysis of the pathology slide. For example, various information about at least one tissue and cell expressed in the pathology slide image may be expressed in the image. Also, biomarker expression information may be expressed in the image. Also, the image may be a report including medical information about at least a partial area included in the pathology slide image.
- the pathology slide image may refer to an image obtained by taking a pathology slide fixed and stained through a series of chemical treatments in order to observe a tissue removed from the human body under a microscope.
- a pathology slide image may refer to a whole slide image including a high-resolution image of the entire slide.
- a pathology slide image may refer to a portion of such a high-resolution, full-slide image.
- the pathology slide image may refer to a patch region divided in patch units from an entire slide image.
- a patch may have a certain area size.
- Alternatively, a patch may refer to a region that contains each of the objects included in the entire slide.
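Dividing a whole slide image into fixed-size patches, as described above, can be sketched as follows. This is a minimal sketch, assuming the image is already in memory as a row-major grid of pixel values; `iter_patches`, `patch_size`, and `stride` are hypothetical names, and a real whole slide image would normally be read region by region from a pyramidal file rather than held in a Python list.

```python
from typing import Iterator, List, Tuple

Image = List[List[int]]  # row-major grid of pixel values (grayscale for brevity)

def iter_patches(image: Image, patch_size: int, stride: int) -> Iterator[Tuple[int, int, Image]]:
    """Yield (top, left, patch) for every full patch_size x patch_size window."""
    height = len(image)
    width = len(image[0]) if height else 0
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patch = [row[left:left + patch_size] for row in image[top:top + patch_size]]
            yield top, left, patch

# Toy 4 x 6 "slide" whose pixel value encodes its own position (row * 10 + column).
slide = [[r * 10 + c for c in range(6)] for r in range(4)]
patches = list(iter_patches(slide, patch_size=2, stride=2))
```

Keeping the `(top, left)` offset with each patch makes it possible to map a patch-level result back to its position on the slide.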
- the pathology slide image may refer to a digital image taken using a microscope, and may include information about cells, tissues, and/or structures in the human body.
- Biological elements (e.g., cancer cells, immune cells, cancer regions, etc.) may be identified from the pathology slide image.
- These biological factors can be used for histological diagnosis of disease, prediction of disease prognosis, and determination of treatment direction for disease.
- a machine learning model may be used in analyzing the pathology slide image.
- the machine learning model should be trained to recognize biological elements from the pathology slide image.
- Learning data often depends on annotation work performed by an expert (eg, a pathologist) on pathology slide images.
- the annotation work includes an expert marking the location and type of cells and/or tissues expressed in the pathology slide image one by one.
- the user terminal 10 analyzes the pathology slide image using a machine learning model. At this time, the user terminal 10 generates learning data using the pathology slide image in which the object is expressed and biological information of the object, and learns the machine learning model using the learning data.
- The user terminal 10 can improve the performance of the machine learning model even if no annotation work is performed (or with only a small number of annotation results). Accordingly, the accuracy of the analysis result of the pathology slide image by the machine learning model may be improved. In addition, since the user terminal 10 can predict a treatment response of the subject using the analysis result of the pathology slide image, the accuracy of the treatment response prediction can be ensured.
- the user terminal 10 may generate learning data by utilizing spatial transcriptomics information of the object. Therefore, unlike the conventional learning data that depends on the expert's annotation work, the problem of deterioration in the performance of the machine learning model due to different expert standards can be solved.
- Through spatial transcriptomics information, spatial gene expression information on the pathology slide image can be obtained.
- a single spot can be set to include several cells. Therefore, gene expression information obtained within a single spot may be more objective information than judgment by an expert's visual recognition ability.
- the user terminal 10 may generate learning data using pathology slide images in which the object is stained in different ways.
- Depending on the staining method, different biological elements (e.g., proteins located in cell membranes or cell nuclei) are expressed in specific colors, so different biological elements can be identified through pathology slide images stained in different ways. Accordingly, when pathology slide images stained in different ways are used as training data, the performance of the machine learning model can be improved.
- The following describes an example in which the user terminal 10 trains a machine learning model, analyzes a pathology slide image using the trained machine learning model, and predicts a subject's treatment response using the analysis result.
- Throughout the specification, the user terminal 10 is described as training the machine learning model, analyzing the pathology slide image using the trained machine learning model, and predicting the subject's treatment response using the analysis result, but the disclosure is not limited thereto. For example, at least some of the operations performed by the user terminal 10 may be performed by the server 20.
- the server 20 may generate learning data using a pathology slide image representing an object and biological information of the object. And, the server 20 may learn the machine learning model based on the learning data. In addition, the server 20 may analyze the pathology slide image using the learned machine learning model and transmit the result of the analysis to the user terminal 10 . In addition, the server 20 may predict a treatment response of the subject using the result of the analysis and transmit the result of the prediction to the user terminal 10 .
- the operation of the server 20 is not limited to the above.
- FIG. 2 is a block diagram of a system and network for preparing, processing, and reviewing slide images of tissue specimens using a machine learning model, according to one embodiment.
- The system 2 includes user terminals 11 and 12, a scanner 50, an image management system 61, an AI-based biomarker analysis system 62, a laboratory information management system 63, and a server 70.
- the components 11, 12, 50, 61, 62, 63, and 70 included in the system 2 may be connected to each other through a network 80.
- the network 80 may be a network in which components 11, 12, 50, 61, 62, 63, and 70 may be connected to each other through wired or wireless communication.
- The system 2 shown in FIG. 2 may include a network that can be connected to servers in hospitals and laboratories and/or to user terminals of doctors or researchers.
- The method described later with reference to FIGS. 3A to 13B may be performed by the user terminals 11 and 12, the image management system 61, the AI-based biomarker analysis system 62, the laboratory information management system 63, and/or the hospital or laboratory server 70.
- the scanner 50 may acquire a digitized image from a tissue sample slide generated using a tissue sample of the subject 90 .
- The scanner 50, the user terminals 11 and 12, the image management system 61, the AI-based biomarker analysis system 62, the laboratory information management system 63, and/or the hospital or laboratory server 70 may each be connected to a network 80, such as the Internet, through one or more computers, servers, and/or mobile devices, or may communicate with the user 30 and/or the subject 90 through one or more computers and/or mobile devices.
- The user terminals 11 and 12, the image management system 61, the AI-based biomarker analysis system 62, the laboratory information management system 63, and/or the hospital or laboratory server 70 may generate tissue samples of one or more subjects 90, tissue sample slides, digitized images of tissue sample slides, or any combination thereof, or may otherwise obtain them from another device.
- The user terminals 11 and 12, the image management system 61, the AI-based biomarker analysis system 62, and the laboratory information management system 63 may obtain any combination of subject-specific information, such as the age, medical history, cancer treatment history, family history, and past biopsy records of the subject 90, or disease information of the subject 90.
- The scanner 50, the user terminals 11 and 12, the image management system 61, the laboratory information management system 63, and/or the hospital or laboratory server 70 may transmit digitized slide images and/or subject-specific information to the AI-based biomarker analysis system 62.
- The AI-based biomarker analysis system 62 may include one or more storage devices (not shown) for storing images and data received from at least one of the scanner 50, the user terminals 11 and 12, the image management system 61, the laboratory information management system 63, and/or the hospital or laboratory server 70.
- the AI-based biomarker analysis system 62 may include a machine learning model repository that stores a machine learning model trained to process the received images and data.
- The AI-based biomarker analysis system 62 may include a machine learning model trained to predict, from a pathology slide image of the subject 90, at least one of information about at least one cell, information about at least one region, information related to a biomarker, medical diagnosis information, and/or medical treatment information.
- The scanner 50, the user terminals 11 and 12, the AI-based biomarker analysis system 62, the laboratory information management system 63, and/or the hospital or laboratory server 70 may transmit the digitized slide image, subject-specific information, and/or a result of analyzing the digitized slide image to the image management system 61 through the network 80.
- the image management system 61 may include a storage for storing received images and a storage for storing analysis results.
- a machine learning model learned and trained to predict at least one of the pieces of information may be stored and operated in the user terminals 11 and 12 and/or the image management system 61 .
- A method of analyzing a pathology slide image, a method of processing subject information, a method of selecting a subject group, a method of designing a clinical trial, a method of generating biomarker expression information, and/or a method of setting a reference value for a specific biomarker may be performed not only in the AI-based biomarker analysis system 62 but also in the user terminals 11 and 12, the image management system 61, the laboratory information management system 63, and/or the hospital or laboratory server 70.
- FIG. 3A is a configuration diagram illustrating an example of a user terminal according to an embodiment.
- the user terminal 100 includes a processor 110, a memory 120, an input/output interface 130 and a communication module 140.
- The operation of the user terminal 100 may be performed by the user terminals 11 and 12 of FIG. 2, the image management system 61, the AI-based biomarker analysis system 62, the laboratory information management system 63, and/or the hospital or laboratory server 70.
- the processor 110 may process commands of a computer program by performing basic arithmetic, logic, and input/output operations.
- the command may be provided from the memory 120 or an external device (eg, the server 20).
- The processor 110 may control the overall operation of the other components included in the user terminal 100.
- the processor 110 may obtain a first pathology slide image in which at least one first object is expressed and biological information of the at least one first object.
- the biological information may include at least one of information identified from the third pathology slide image and spatial transcriptome information of the first object.
- the third pathology slide image may include a stained image distinguished from the first pathology slide image.
- the processor 110 may generate learning data using at least one first patch included in the first pathology slide image and biological information.
- the learning data may include at least one of gene expression information corresponding to the first patch and at least one cell type represented in the first patch.
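One way to derive gene expression information corresponding to a patch is to aggregate the spatial transcriptomics spots that fall inside it. The sketch below assumes each spot is a hypothetical `(x, y, {gene: count})` record in slide pixel coordinates and that patches lie on a regular grid; averaging is an illustrative choice, not necessarily the aggregation used in this disclosure, and the gene names are examples only.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Spot = Tuple[float, float, Dict[str, float]]  # (x, y, gene -> expression count)

def label_patches_with_expression(
    spots: Sequence[Spot], patch_size: int, genes: Sequence[str]
) -> Dict[Tuple[int, int], List[float]]:
    """Average the expression of all spots whose centre lies inside each patch.

    Keys are (patch_row, patch_col) indices on a regular grid of patch_size pixels.
    """
    sums: Dict[Tuple[int, int], List[float]] = defaultdict(lambda: [0.0] * len(genes))
    counts: Dict[Tuple[int, int], int] = defaultdict(int)
    for x, y, expression in spots:
        key = (int(y // patch_size), int(x // patch_size))
        for i, gene in enumerate(genes):
            sums[key][i] += expression.get(gene, 0.0)  # missing gene counts as 0
        counts[key] += 1
    return {key: [s / counts[key] for s in vector] for key, vector in sums.items()}

genes = ["CD8A", "EPCAM"]  # example gene panel, chosen only for illustration
spots = [
    (10.0, 10.0, {"CD8A": 4.0, "EPCAM": 0.0}),
    (20.0, 30.0, {"CD8A": 2.0, "EPCAM": 2.0}),
    (150.0, 40.0, {"EPCAM": 6.0}),
]
labels = label_patches_with_expression(spots, patch_size=100, genes=genes)
```

Each resulting vector can then be paired with the pixels of the patch at the same grid index to form one training example.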
- The processor 110 may train a first machine learning model based on the training data and analyze the second pathology slide image using the trained first machine learning model.
- The processor 110 may train the first machine learning model by using the training data as ground truth data.
- The processor 110 may train the first machine learning model by using at least one annotation generated based on a user input as ground truth data.
- The processor 110 may train the first machine learning model by using the training data and at least one annotation as ground truth data.
- the processor 110 may generate a second machine learning model by adding or removing at least one layer included in the learned first machine learning model.
- the second machine learning model may be used to identify the type of at least one cell shown in the second pathology slide image.
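Deriving a second model by removing and adding layers of a trained first model can be sketched as follows. This is only a sketch under assumptions: the `Dense` class, the hand-set weights, and the choice to replace exactly the last layer are hypothetical, and a real system would use a deep learning framework and fine-tune the new layer.

```python
from typing import List, Sequence

class Dense:
    """Fully connected layer: one weight row and one bias per output unit."""

    def __init__(self, weights: List[List[float]], bias: List[float], relu: bool = True):
        self.weights, self.bias, self.relu = weights, bias, relu

    def __call__(self, x: Sequence[float]) -> List[float]:
        out = [sum(w * xi for w, xi in zip(row, x)) + b
               for row, b in zip(self.weights, self.bias)]
        return [max(0.0, v) for v in out] if self.relu else out

def forward(layers: Sequence[Dense], x: Sequence[float]) -> List[float]:
    for layer in layers:
        x = layer(x)
    return list(x)

def derive_second_model(first_model: Sequence[Dense], new_head: Dense) -> List[Dense]:
    """Remove the last layer of the first model and add a new output layer."""
    return list(first_model[:-1]) + [new_head]

# "Trained" first model (weights set by hand here): feature layer + one regression output.
first_model = [
    Dense([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    Dense([[1.0, 1.0]], [0.0], relu=False),
]
# New head producing two scores, e.g. one per cell type (an assumption for illustration).
cell_type_head = Dense([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0], relu=False)
second_model = derive_second_model(first_model, cell_type_head)

first_out = forward(first_model, [2.0, 1.0])
second_out = forward(second_model, [2.0, 1.0])
```

The feature layer is shared by reference, so whatever the first model learned about patches is reused by the second model without retraining from scratch.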
- The processor 110 may predict a treatment response of the subject corresponding to the second pathology slide image by using spatial transcriptomics information of the second object represented in the second pathology slide image.
- The spatial transcriptomics information of the second object may include at least one of spatial transcriptomics information obtained by the trained first machine learning model and spatial transcriptomics information obtained separately.
- prediction of treatment response may be performed by a third machine learning model.
- the third machine learning model may be learned using a feature vector extracted from at least one layer included in the first machine learning model.
- the third machine learning model may be learned using gene expression information included in spatial transcriptome information and location information corresponding to the gene expression information.
- the processor 110 may use the first patch and the second patch included in the third pathology slide image as training data for learning the first machine learning model.
- the processor 110 may use a third patch obtained by image processing of the first patch and the second patch as training data for learning the first machine learning model.
- the second patch may include a patch indicating a position corresponding to the first patch.
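Taking a first patch and a second patch from corresponding positions of two differently stained slide images can be sketched as below. The sketch assumes the two images are already registered to the same pixel grid, which is a strong assumption in practice; `blend` is a purely hypothetical example of the "image processing" that could produce a third patch, since the specific processing is not stated here.

```python
from typing import List, Tuple

Image = List[List[float]]

def crop(image: Image, top: int, left: int, size: int) -> Image:
    """Cut a size x size patch whose upper-left corner is at (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]

def paired_patches(first_image: Image, third_image: Image,
                   top: int, left: int, size: int) -> Tuple[Image, Image]:
    """Return patches at the same position in two registered slide images."""
    if (len(first_image), len(first_image[0])) != (len(third_image), len(third_image[0])):
        raise ValueError("images must share one coordinate frame (same height and width)")
    return crop(first_image, top, left, size), crop(third_image, top, left, size)

def blend(patch_a: Image, patch_b: Image) -> Image:
    """Hypothetical image processing: pixel-wise average of two patches."""
    return [[(a + b) / 2 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(patch_a, patch_b)]

first_image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
third_image = [[10.0, 20.0, 30.0], [40.0, 50.0, 60.0], [70.0, 80.0, 90.0]]

first_patch, second_patch = paired_patches(first_image, third_image, top=1, left=1, size=2)
third_patch = blend(first_patch, second_patch)
```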
- the processor 110 may use at least one annotation generated based on the first patch and the user input as training data for learning the first machine learning model.
- at least one annotation may be generated based on the third pathology slide image.
- In machine learning technology and cognitive science, a machine learning model means a statistical learning algorithm implemented based on the structure of a biological neural network, or a structure for executing that algorithm.
- As in a biological neural network, the machine learning model may represent a model with problem-solving ability in which nodes, which are artificial neurons forming a network through synaptic connections, repeatedly adjust the synaptic weights and learn to reduce the error between the correct output corresponding to a specific input and the inferred output.
- the machine learning model may include an arbitrary probability model, a neural network model, and the like used in artificial intelligence learning methods such as machine learning and deep learning.
- a machine learning model may be implemented as a multilayer perceptron (MLP) composed of multilayer nodes and connections between them.
- the machine learning model according to this embodiment may be implemented using one of various artificial neural network model structures including MLP.
- A machine learning model may be composed of an input layer that receives input signals or data from the outside, an output layer that outputs output signals or data corresponding to the input data, and at least one hidden layer that is located between the input layer and the output layer, receives signals from the input layer, extracts features, and delivers them to the output layer. The output layer receives signals or data from the hidden layer and outputs them to the outside.
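The input, hidden, and output layers, and the repeated weight adjustment that reduces the error between the correct and inferred outputs, can be illustrated with a very small multilayer perceptron. This is a sketch under assumptions: the layer sizes, the tanh activation, the squared-error loss, the learning rate, and the toy data are all choices made for the example, not details taken from this disclosure.

```python
import math
import random

def init_mlp(n_in, n_hidden, seed=0):
    """One hidden layer; small random weights drawn from a seeded generator."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }

def forward(p, x):
    """Input layer -> hidden layer (tanh features) -> single linear output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(p["w1"], p["b1"])]
    output = sum(w * h for w, h in zip(p["w2"], hidden)) + p["b2"]
    return hidden, output

def mean_squared_error(p, data):
    return sum((forward(p, x)[1] - y) ** 2 for x, y in data) / len(data)

def train_step(p, x, y, lr):
    """One gradient-descent update for the loss 0.5 * (output - y) ** 2."""
    hidden, output = forward(p, x)
    err = output - y
    for j, h in enumerate(hidden):
        # Backpropagate through tanh using the weight value before it is updated.
        grad_hidden = err * p["w2"][j] * (1.0 - h * h)
        p["w2"][j] -= lr * err * h
        p["b1"][j] -= lr * grad_hidden
        for i, xi in enumerate(x):
            p["w1"][j][i] -= lr * grad_hidden * xi
    p["b2"] -= lr * err

# Toy data: the correct output is the sum of the two inputs.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 2.0)]
params = init_mlp(n_in=2, n_hidden=3)
loss_before = mean_squared_error(params, data)
for _ in range(500):
    for x, y in data:
        train_step(params, x, y, lr=0.1)
loss_after = mean_squared_error(params, data)
```

Comparing `loss_before` with `loss_after` shows the error shrinking as the weights are adjusted, which is the behavior the text attributes to learning.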
- the machine learning model may be trained to receive one or more pathology slide images and extract information about one or more objects (eg, cells, tissues, structures, etc.) included in the pathology slide images.
- the processor 110 may be implemented as an array of a plurality of logic gates, or may be implemented as a combination of a general-purpose microprocessor and a memory in which programs executable by the microprocessor are stored.
- the processor 110 may include a general-purpose processor, a central processing unit (CPU), a microprocessor, a digital signal processor (DSP), a controller, a microcontroller, a state machine, and the like.
- the processor 110 may include an application specific integrated circuit (ASIC), a programmable logic device (PLD), a field programmable gate array (FPGA), or the like.
- The processor 110 may also refer to a combination of processing devices, such as a combination of a digital signal processor (DSP) and a microprocessor, a combination of a plurality of microprocessors, a combination of one or more microprocessors coupled with a DSP core, or any other such combination of configurations.
- Memory 120 may include any non-transitory computer readable recording medium.
- The memory 120 may include random access memory (RAM) and a permanent mass storage device such as read only memory (ROM), a disk drive, a solid state drive (SSD), or flash memory.
- A non-volatile mass storage device such as a ROM, SSD, flash memory, or disk drive may also be a permanent storage device separate from the memory 120.
- The memory 120 may store an operating system (OS) and at least one program code (e.g., code for the processor 110 to perform an operation to be described later with reference to FIGS. 4 to 13B).
- the software components may be loaded from a computer-readable recording medium separate from the memory 120 .
- Such a separate computer-readable recording medium may be a recording medium that can be directly connected to the user terminal 100, and may include, for example, a computer-readable recording medium such as a floppy drive, a disk, a tape, a DVD/CD-ROM drive, or a memory card.
- the software components may be loaded into the memory 120 through the communication module 140 rather than a computer-readable recording medium.
- At least one program may be loaded into the memory 120 based on a computer program (e.g., a computer program for the processor 110 to perform an operation to be described later with reference to FIGS. 4 to 13B) installed from files provided, through the communication module 140, by developers or by a file distribution system that distributes application installation files.
- the input/output interface 130 may be a means for interface with a device (eg, keyboard, mouse, etc.) for input or output that may be connected to the user terminal 100 or included in the user terminal 100 .
- In FIG. 3A, the input/output interface 130 is shown as a component separate from the processor 110, but is not limited thereto, and the input/output interface 130 may be included in the processor 110.
- the communication module 140 may provide a configuration or function for the server 20 and the user terminal 100 to communicate with each other through a network. Also, the communication module 140 may provide a configuration or function for the user terminal 100 to communicate with other external devices. For example, control signals, commands, data, etc. provided under the control of the processor 110 may be transmitted to the server 20 and/or an external device via the communication module 140 and a network.
- the user terminal 100 may further include a display device.
- the user terminal 100 may be connected to an independent display device through a wired or wireless communication method to transmit and receive data between them.
- a pathology slide image, analysis information of the pathology slide image, and treatment response prediction information may be provided to the user 30 through the display device.
- FIG. 3B is a configuration diagram illustrating an example of a server according to an embodiment.
- the server 20 includes a processor 210 , a memory 220 and a communication module 230 .
- For convenience of description, only components related to the present invention are shown in FIG. 3B. Accordingly, other general-purpose components may be further included in the server 200 in addition to the components shown in FIG. 3B.
- the processor 210, memory 220, and communication module 230 shown in FIG. 3B may be implemented as independent devices.
- the processor 210 may acquire a pathology slide image from at least one of the internal memory 220, an external memory (not shown), the user terminal 10, or an external device.
- The processor 210 may acquire a first pathology slide image in which at least one first object is expressed and biological information of the at least one first object, generate learning data using at least one first patch included in the first pathology slide image and the biological information, train a first machine learning model based on the learning data, or analyze a second pathology slide image using the trained first machine learning model.
- the processor 210 may predict a treatment response of the subject corresponding to the second pathology slide image by using spatial transcript information of the second object represented in the second pathology slide image.
- the user terminal 100 may output the information transmitted from the server 20 through the display device.
- Since an implementation example of the processor 210 is the same as the implementation example of the processor 110 described above with reference to FIG. 3A, a detailed description thereof is omitted.
- the memory 220 may store various data such as pathology slide images and data generated according to the operation of the processor 210 . Also, an operating system (OS) and at least one program (eg, a program required for the processor 210 to operate) may be stored in the memory 220 .
- The communication module 230 may provide a configuration or function for the server 200 and the user terminal 100 to communicate with each other through a network. Also, the communication module 230 may provide a configuration or function for the server 200 to communicate with other external devices. For example, control signals, commands, data, etc. provided under the control of the processor 210 may be transmitted to the user terminal 100 and/or an external device via the communication module 230 and a network.
- FIG. 4 is a flowchart illustrating an example of a method of processing a pathology slide image, according to an exemplary embodiment.
- The method of processing a pathology slide image is composed of steps processed sequentially by the user terminal 10 or 100 or the processor 110 shown in FIGS. 1 to 3A . Therefore, even where omitted below, the contents described above regarding the user terminals 10 and 100 or the processor 110 shown in FIGS. 1 to 3A may also be applied to the method of processing the pathology slide image of FIG. 4.
- At least one of the steps of the flowchart shown in FIG. 4 may be processed by the server 20 or 200 or the processor 210 .
- In step 410, the processor 110 obtains a first pathology slide image in which at least one first object is expressed and biological information of the at least one first object.
- the first object may mean a cell, tissue, and/or structure in the human body.
- The biological information may include at least one of spatial transcriptomics information of the first object and information identified from a third pathology slide image.
- the third pathology slide image means an image stained in a manner distinct from the first pathology slide image.
- FIG. 5 is a diagram for explaining examples of biological information according to an exemplary embodiment.
- a subject 90 and an object 91 included in the human body of the subject 90 are illustrated.
- the biological information of the object 91 may include spatial transcript information 511 of the object 91 .
- the spatial transcript information 511 refers to information obtained through a spatial transcript process.
- the spatial transcriptome information 511 may include sequence data obtained through a spatial transcriptome process, gene expression information identified through data processing on the sequence data, and the like.
- the spatial transcriptomic process is a molecular profiling method that allows measuring gene expression in tissue samples and mapping the locations at which genes are expressed.
- the relative positional relationship of cells and tissues is important for understanding the normal development of cells or tissues and the pathology of disease.
- Because conventional bulk RNA-seq analyzes various tissues and cells mixed together at once, it cannot reveal detailed gene expression patterns in space.
- In contrast, through the spatial transcriptomics process, gene expression patterns in space can be identified. Accordingly, not only the understanding of the disease but also the accuracy of diagnosis and treatment of the disease can be improved.
- the spatial transcriptome information includes a pathology slide image and/or genetic information corresponding to at least one grid included in the pathology slide image.
- a pathology slide image may be divided into a plurality of grids, and a single grid may be an area of 1 mm * 1 mm, but is not limited thereto.
- The processor 110 may process the sequence data to extract a partial region (eg, a single grid or a plurality of grids) of the pathology slide image, and may obtain spatial transcriptome information by obtaining genetic information corresponding to the extracted region.
- FIG. 6 is a flowchart illustrating an example in which a processor obtains spatial transcript information according to an exemplary embodiment.
- In step 610, the processor 110 obtains sequence data through a spatial transcriptomics process.
- the spatial transcriptome process may include steps of sample prep, imaging, barcoding & library construction, and sequencing.
- In step 620, the processor 110 obtains gene expression information corresponding to the spot by performing data processing on the sequence data.
- the processor 110 obtains genetic information corresponding to the spot location on the pathology slide image by processing the sequence data.
- a pathology slide image may be divided into a plurality of spots, and a single spot may be a circular area with a diameter of 55 μm, but is not limited thereto.
- the processor 110 may determine at which position of the pathology slide image the gene information included in the sequence data is expressed, based on the barcode information included in the sequence data.
- the barcode is a location coordinate value of a specific spot on the pathology slide image, and may be determined in advance. That is, the barcode and the coordinates on the pathology slide image may be matched with each other.
- sequence read means a portion sequenced from a DNA fragment.
- read1 of the paired-end sequence data may include a barcode matched to coordinates (ie, location coordinates on the pathology slide image), and read2 may include transcript sequence information. That is, one end of the DNA fragment may include a barcode value corresponding to the coordinates of the spot where the DNA fragment was obtained, and the other end may include sequence information.
- the processor 110 may check gene expression information by aligning the FASTQ file including the sequence information with the reference genome. In addition, the processor 110 may obtain gene expression information of multiple (eg, about 5,000) genes for each spot of the pathology slide image through the spatial information identified from the barcode.
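The barcode-to-coordinate matching described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation of the disclosure: the barcode table, the example reads, and the function name `count_genes_per_spot` are hypothetical, and alignment of read2 against the reference genome is assumed to have been done already, so each read arrives with a gene name.

```python
from collections import Counter, defaultdict

# Hypothetical barcode table: each barcode is matched in advance to the
# (x, y) coordinates of one spot on the pathology slide image.
BARCODE_TO_SPOT = {
    "AAAC": (0, 0),
    "AAAG": (0, 1),
    "AACT": (1, 0),
}


def count_genes_per_spot(read_pairs, barcode_to_spot=BARCODE_TO_SPOT):
    """Aggregate aligned read pairs into per-spot gene counts.

    read_pairs: iterable of (barcode_from_read1, gene_from_read2) tuples.
    Returns {spot_xy: Counter({gene: count})}.
    """
    counts = defaultdict(Counter)
    for barcode, gene in read_pairs:
        spot = barcode_to_spot.get(barcode)
        if spot is None:
            continue  # barcode matches no known spot, so the read is skipped
        counts[spot][gene] += 1
    return dict(counts)


# Hypothetical reads; the last one carries an unknown barcode.
reads = [("AAAC", "CD68"), ("AAAC", "CD68"), ("AAAG", "ERBB2"), ("TTTT", "CD68")]
spot_counts = count_genes_per_spot(reads)
```

Here `spot_counts` holds one gene-count table per spot coordinate, which corresponds to the per-spot gene expression information referred to in the text.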
- the processor 110 may determine which type of cells are present in the spot using gene expression information corresponding to the spot.
- In general, immune cells, cancer cells, etc. have cell-specific highly expressed genes. Accordingly, by interpreting the gene expression information corresponding to the spot, the processor 110 may determine which cells are distributed in the corresponding spot area or which cells are included in what ratio.
- the processor 110 may further use single cell RNAseq data to determine the number and type of cells distributed in the spot area.
- single cell RNAseq data does not include spatial information, but only RNA information of each cell. Therefore, the processor 110 may mathematically analyze the sequence data and the single cell RNAseq data for the plurality of cells (eg, about 10) included in each spot, and determine how many cells of which types are included in each spot or in what proportions they are included.
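The disclosure does not specify the mathematical analysis used here. As a deliberately simple stand-in, the sketch below scores each cell type by the summed counts of its marker genes and normalizes the scores into proportions. The marker table, the example counts, and the function name `estimate_cell_type_proportions` are hypothetical; in practice the signatures would be derived from single cell RNAseq data and a more rigorous deconvolution method would likely be used.

```python
def estimate_cell_type_proportions(spot_expression, markers):
    """Estimate cell-type proportions in one spot from marker-gene counts.

    spot_expression: {gene: count} for a single spot.
    markers: {cell_type: [marker genes]} (hypothetical signature table).
    """
    scores = {
        cell_type: sum(spot_expression.get(gene, 0) for gene in genes)
        for cell_type, genes in markers.items()
    }
    total = sum(scores.values())
    if total == 0:
        # No marker gene is expressed, so no proportion can be estimated.
        return {cell_type: 0.0 for cell_type in markers}
    return {cell_type: score / total for cell_type, score in scores.items()}


# Hypothetical marker table and spot counts, for illustration only.
MARKERS = {"macrophage": ["CD68"], "tumor": ["ERBB2", "MKI67"]}
spot = {"CD68": 30, "ERBB2": 50, "MKI67": 20, "ACTB": 400}
proportions = estimate_cell_type_proportions(spot, MARKERS)
```

Genes that are not markers of any listed cell type (`ACTB` in this example) do not affect the estimate.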
- the processor 110 may determine which type of cell is present in the spot from sequence data using a machine learning model. To this end, the processor 110 may learn a machine learning model using training data.
- the learning data may include sequence data obtained through step 610 as input data and a type of cell as output data.
- the training data may include sequence data acquired through step 610 and a patch of a pathology slide image corresponding to the sequence data as input data, and cell types as output data. That is, the machine learning model may be trained to identify cell types by considering not only sequence data but also morphology characteristics included in pathology slide images.
- the processor 110 may generate a plurality of pairs including [a patch of a pathology slide image - gene expression information corresponding to the patch]. Also, the processor 110 may generate a plurality of pairs including [a patch of the pathology slide image - information on the type of at least one cell corresponding to the patch]. Also, the processor 110 may generate a plurality of pairs including [a patch of the pathology slide image - gene expression information corresponding to the patch - information about the type of at least one cell corresponding to the patch]. Pairs thus created may be used as training data of the first machine learning model. Learning of the first machine learning model will be described later with reference to steps 420 and 430 .
- the biological information of the object 91 may include information 512 about biological elements (eg, cancer cells, immune cells, cancer regions, etc.) of the object 91 .
- the information 512 on the biological factors may be identified from the pathology slide image of the object 91 .
- Depending on how the pathology slide image is stained, various biological information about the object 91 may be identified. Accordingly, when stained in different ways, different biological information may be identified from the same object 91 .
- For example, in hematoxylin and eosin (H&E) staining, hematoxylin mainly stains the nuclear region blue to purple, and eosin stains the cytoplasm or extracellular matrix pink. Therefore, with H&E staining, the morphology of cells and tissues included in the subject can be easily identified.
- immunohistochemical staining methods include programmed cell death-ligand 1 (PD-L1) staining, human epidermal growth factor receptor 2 (HER2) staining, estrogen receptor (ER) staining, progesterone receptor (PR) staining, Ki-67 staining, CD68 staining, and the like.
- special staining may include Van Gieson staining, Toluidine blue staining, Giemsa staining, Masson's trichrome staining, Periodic acid Schiff (PAS) staining, and the like.
- immunofluorescence staining may include fluorescence in situ hybridization (FISH) and the like.
- With these staining methods, the expression level of a specific cell signal that cannot be identified from an H&E-stained pathology slide image can be confirmed.
- For example, PD-L1 or HER2 is a protein or receptor expressed in malignant tumor cell membranes, etc., and its level of expression in tumor cell tissues can be evaluated through PD-L1 staining or HER2 staining. Therefore, when the expression level is high, a high therapeutic response to an anticancer drug targeting the corresponding protein or receptor can be expected.
- Also, components of tissue that are not clearly observed in an H&E-stained pathology slide image can be accurately identified.
- For example, because Van Gieson staining specifically stains only collagen, collagen expression in tissue can be identified on its own.
- Also, the presence and/or amount of specific cells that cannot be identified in an H&E-stained pathology slide image may be confirmed.
- For example, since CD68 staining specifically stains macrophages, the number of macrophages, which may not be well distinguished from other inflammatory cells in an H&E-stained pathology slide image, can be easily identified in a CD68-stained pathology slide image.
- the processor 110 may use spatial transcriptome information 511 and/or biological element information 512 as learning data of a machine learning model. An example of using the information 512 on a biological element as learning data will be described later with reference to FIGS. 10 and 11 .
- In step 420, the processor 110 generates learning data using at least one first patch included in the first pathology slide image and the biological information.
- the learning data may include at least one of gene expression information corresponding to the patch and at least one cell type displayed in the patch.
- Information on 'the type of at least one cell shown in the patch' included in the learning data may be information obtained by processing gene expression information, as described above with reference to FIG. 6 .
- FIG. 7 is a diagram for explaining an example of learning data according to an exemplary embodiment.
- a patch 711 in a pathology slide image 710 is shown.
- the processor 110 may generate a plurality of pairs that may be used as training data.
- a pair may be [patch 711 - gene expression information 721 corresponding to patch 711], [patch 711 - information 722 on the type of at least one cell corresponding to patch 711], or [patch 711 - gene expression information 721 corresponding to patch 711 - information 722 on the type of at least one cell corresponding to patch 711].
- the training data may include gene expression information 721 of the object displayed in the patch 711 and/or information 722 about the type of at least one cell of the object displayed in the patch 711 .
- the information about the type of at least one cell displayed in the patch 711 may be information obtained by processing the gene expression information 721 .
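The three kinds of pairs described above can be represented with a single record type in which either label may be absent. The sketch below is illustrative only: `TrainingPair`, `build_pairs`, and the lookup dictionaries keyed by patch location are hypothetical names and structures, not part of the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class TrainingPair:
    patch_xy: Tuple[int, int]                            # location of the patch in the slide image
    gene_expression: Optional[Dict[str, float]] = None   # eg, information 721
    cell_types: Optional[List[str]] = None               # eg, information 722


def build_pairs(patch_locations, expression_by_xy, cell_types_by_xy):
    """Pair each patch with whatever biological information exists for it."""
    pairs = []
    for xy in patch_locations:
        expression = expression_by_xy.get(xy)
        cell_types = cell_types_by_xy.get(xy)
        if expression is None and cell_types is None:
            continue  # a patch without any biological information gives no pair
        pairs.append(TrainingPair(xy, expression, cell_types))
    return pairs


# Hypothetical lookups keyed by patch location.
pairs = build_pairs(
    patch_locations=[(0, 0), (0, 1), (1, 0)],
    expression_by_xy={(0, 0): {"CD68": 2.0}, (0, 1): {"ERBB2": 5.0}},
    cell_types_by_xy={(0, 0): ["macrophage"]},
)
```

In this example the first pair carries both labels, the second carries gene expression only, and the third patch is dropped because nothing is known about it.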
- In step 430, the processor 110 trains the first machine learning model based on the learning data.
- the processor 110 may learn the first machine learning model by using the training data generated in step 420 as ground truth data.
- For example, a patch of a pathology slide image may be used as input data, and the remaining element or elements of a [patch of the pathology slide image - gene expression information corresponding to the patch] pair, a [patch of the pathology slide image - information on the type of at least one cell corresponding to the patch] pair, or a [patch of the pathology slide image - gene expression information corresponding to the patch - information on the type of at least one cell corresponding to the patch] pair may be used as output data.
- When the output data is gene expression information, the first machine learning model may be trained to receive a patch and predict gene expression information at the location of that patch.
- When the output data is cell type information, the first machine learning model may be trained to receive a patch and predict which type of cell is present at the location of that patch.
- When the output data includes both, the first machine learning model may be trained to receive a patch and predict the gene expression information and the cell type corresponding to the location of that patch.
- the processor 110 may learn the first machine learning model using at least one annotation generated based on a user input. For example, learning of the first machine learning model using annotations may be additionally performed when the performance of learning using the training data generated in step 420 as ground truth data is not sufficient. However, it is not limited thereto.
- the user 30 may perform annotation by referring to a patch of the pathology slide image, and location information within the patch may be included in the annotation.
- the number of users who perform annotations is not limited.
- The processor 110 may generate a second machine learning model for identifying the type of at least one cell included in the object by adding and/or removing at least one layer of the learned first machine learning model.
- a second machine learning model may be generated by adding at least one layer that
- the processor 110 may generate a second machine learning model by removing at least one layer predicting gene expression information from the learned first machine learning model and adding a new layer.
- In step 440, the processor 110 analyzes the second pathology slide image using the learned first machine learning model.
- the processor 110 may analyze the second pathology slide image using the second machine learning model.
- As described above, the processor 110 can improve the performance of the machine learning model even without annotation work (or with only a small amount of annotation results). Accordingly, the accuracy of the analysis result of the pathology slide image by the machine learning model may be improved.
- FIG. 8 is a flowchart illustrating another example of a method of processing a pathology slide image according to an exemplary embodiment.
- The method of processing a pathology slide image is composed of steps processed sequentially by the user terminal 10 or 100 or the processor 110 shown in FIGS. 1 to 3A . Therefore, even where omitted below, the contents described above regarding the user terminals 10 and 100 or the processor 110 shown in FIGS. 1 to 3A may also be applied to the method of processing the pathology slide image of FIG. 8 .
- At least one of the steps of the flowchart shown in FIG. 8 may be processed by the server 20 or 200 or the processor 210 .
- steps 810 to 840 correspond to steps 410 to 440, respectively. Therefore, detailed descriptions of steps 810 to 840 are omitted below.
- In step 850, the processor 110 predicts a treatment response of the subject 90 corresponding to the second pathology slide image by using the spatial transcriptome information of the second object represented in the second pathology slide image.
- the processor 110 may predict the treatment response of the subject 90 using the third machine learning model.
- the spatial transcriptome information of the second object includes at least one of spatial transcriptome information (eg, gene expression information) obtained by the learned first machine learning model and/or separately obtained spatial transcriptome information.
- An example in which the processor 110 predicts the treatment response of the subject 90 is described below with reference to FIG. 9 .
- FIG. 9 is a diagram for explaining an example in which a processor predicts a treatment response of a subject according to an exemplary embodiment.
- spatial transcript information 921 may be generated through a learned first machine learning model 911 .
- Spatial transcriptome information 922 may be generated through a separate spatial transcriptomics process 912 .
- the pathology slide image and gene expression information corresponding to each of the grids included in the image can be obtained through the spatial transcript process 912 as described above with reference to step 610 .
- the processor 110 generates a treatment response prediction result 940 using the third machine learning model 930 .
- spatial transcript information 921 and/or spatial transcript information 922 may be input to the third machine learning model 930, and a treatment response prediction result 940 of the subject 90 may be generated.
- the third machine learning model 930 may be trained using gene expression information included in spatial transcriptome information and location information corresponding to the gene expression information.
- In general, a machine learning model for 2D images (eg, convolutional neural network, etc.) applies a filter of a certain size (eg, 3 * 3 pixels) to the image. This operation is performed for each channel (eg, 3 RGB channels).
- The machine learning model is then trained by passing the filtered values through a multilayer neural network and performing backpropagation based on the difference between the output result value and the actual result value (eg, ground truth).
- Similarly, the processor 110 may map the gene expression information corresponding to each spot to channels of a 2D image, and the location information corresponding to the gene expression information to pixels of the 2D image. The processor 110 may then train the third machine learning model 930 by performing backpropagation based on the difference between the result value output after passing through the multilayer neural network of the third machine learning model 930 and the actual patient's treatment response or prognosis.
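The mapping of spots to pixels and genes to channels can be sketched as a conversion from per-spot expression tables to a multi-channel 2D grid. This is a minimal sketch under assumptions: spot locations are taken to be integer (row, column) indices, the gene list fixes the channel order, and the function name `spots_to_grid` is hypothetical. Nested lists are used to keep the example dependency-free; an array library would normally be used instead.

```python
def spots_to_grid(spot_expression, genes, height, width):
    """Arrange per-spot gene expression as grid[channel][row][col].

    spot_expression: {(row, col): {gene: value}}.
    genes: ordered gene list; each gene becomes one channel, in the same way
           that R, G and B are the three channels of an ordinary image.
    """
    grid = [[[0.0] * width for _ in range(height)] for _ in genes]
    for (row, col), expression in spot_expression.items():
        for channel, gene in enumerate(genes):
            grid[channel][row][col] = float(expression.get(gene, 0.0))
    return grid


# Hypothetical 2 x 2 layout of spots with two genes, for illustration only.
spots = {(0, 1): {"CD68": 3, "ERBB2": 1}, (1, 0): {"ERBB2": 7}}
genes = ["CD68", "ERBB2"]
grid = spots_to_grid(spots, genes, height=2, width=2)
```

A grid of this shape can be fed to a convolutional model in the same way as an image with as many channels as there are genes.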
- the processor 110 may acquire genetic information corresponding to each spot location on the pathology slide image by performing the above-described process with reference to step 610 .
- the third machine learning model 930 may be trained using a feature vector extracted from at least one layer included in the first machine learning model.
- As described above, the first machine learning model may be trained as a model that predicts gene expression information at the location of a patch based on the patch, a model that predicts which type of cell exists at the location of a patch based on the patch, or a model that predicts both the gene expression information and the cell type corresponding to the location of a patch based on the patch.
- the processor 110 may input a pathology slide image to a first machine learning model that has been learned, and extract a feature vector from at least one layer included in the first machine learning model that has been learned.
- The layer from which the feature vector is extracted may be a layer selected empirically by the user 30, or a layer that appropriately predicts the treatment response or prognosis of the subject 90 . That is, assuming that the first machine learning model has been trained to extract genetically and/or histologically important information (eg, gene expression information or cell types and characteristics, which form the basis for predicting treatment response) well from the pathology slide image, the feature vector extracted from any intermediate layer of the trained first machine learning model can also be expected to contain genetically and/or histologically important information.
- the processor 110 may perform a process of extracting a feature vector from at least one layer included in the learned first machine learning model for all of the plurality of patches included in a single pathology slide image.
- The processor 110 may perform pooling to integrate the feature vectors into a single fixed-length vector. For example, the processor 110 may perform pooling using the average value of the feature vectors, pooling using the maximum value in each dimension of the feature vectors, dictionary-based pooling such as bag-of-words or Fisher Vector, or attention-based pooling using an artificial neural network. Through this pooling, a single vector corresponding to the pathology slide image of a single subject 90 may be defined.
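Three of the pooling options named above can be sketched in a few lines. The functions below are illustrative only. In particular, `attention_pool` takes the per-patch scores as a plain argument, whereas in attention-based pooling those scores would be produced by a trained neural network; that network is outside this sketch.

```python
import math


def mean_pool(vectors):
    """Average the feature vectors dimension by dimension."""
    count = len(vectors)
    return [sum(dim) / count for dim in zip(*vectors)]


def max_pool(vectors):
    """Take the maximum value in each dimension."""
    return [max(dim) for dim in zip(*vectors)]


def attention_pool(vectors, scores):
    """Weighted sum of the vectors using softmax-normalized scores."""
    peak = max(scores)  # subtracted for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    weights = [value / total for value in exps]
    return [sum(w * v for w, v in zip(weights, dim)) for dim in zip(*vectors)]


# Hypothetical feature vectors for three patches of one slide image.
features = [[1.0, 4.0], [3.0, 2.0], [5.0, 0.0]]
slide_mean = mean_pool(features)
slide_max = max_pool(features)
slide_attn = attention_pool(features, scores=[0.0, 0.0, 0.0])
```

Whatever the number of patches, each function returns one vector whose length equals the feature dimension, which is what allows a single vector per slide image to be defined. With equal scores, attention pooling reduces to the mean.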
- the processor 110 may learn a third machine learning model 930 that predicts responsiveness to a specific immuno-cancer drug or responsiveness to a specific treatment using the defined vector.
- By training the third machine learning model 930 and predicting the treatment response of the subject 90 through the third machine learning model 930 as described above, the processor 110 can improve prediction accuracy compared to predicting responsiveness to treatment using only the morphological characteristics of the pathology slide image.
- the processor 110 may learn the first machine learning model using the spatial transcript information 511 .
- the processor 110 may learn the first machine learning model using the information 512 on the biological element.
- Hereinafter, examples in which the processor 110 learns the first machine learning model using the information 512 on the biological element will be described with reference to FIGS. 10 and 11 .
- FIG. 10 is a diagram for explaining an example in which a processor learns a first machine learning model according to an embodiment.
- the first staining scheme 1021 of the pathology slide image 1031 and the second staining scheme 1022 of the pathology slide image 1041 are different from each other.
- The first staining scheme 1021 may include not only staining schemes that selectively stain a specific biological element, but also staining schemes (eg, H&E staining) in which the shapes of the nucleus, cytoplasm, and extracellular matrix of all cells included in the object can be easily identified.
- the processor 110 may generate training data for learning the first machine learning model 1050 .
- the training data may include a patch 1032 included in the pathology slide image 1031 and a patch 1042 included in the pathology slide image 1041 .
- the patch 1032 and the patch 1042 may represent the same location of the object 1010 .
- the patch 1042 and the patch 1032 may indicate positions corresponding to each other.
- For example, the first staining scheme 1021 may be a scheme capable of selectively staining biological factor A, and the second staining scheme 1022 may be a scheme capable of selectively staining biological factor B.
- Methods for selectively staining various biological factors are as described above with reference to FIG. 5 .
- Although pathology slide images 1031 and 1041 according to two types of staining methods 1021 and 1022 are shown in FIG. 10 , the present invention is not limited thereto.
- The processor 110 performs image processing so that the object 1010 on the image 1031 and the object 1010 on the image 1041 overlap exactly. For example, the processor 110 may apply a geometric transformation (eg, enlargement, reduction, rotation, etc.) to the image 1031 and the image 1041 so that the object 1010 on the image 1031 and the object 1010 on the image 1041 are accurately aligned. The processor 110 then extracts patches 1032 and 1042 from mutually corresponding positions of the images 1031 and 1041 . In this way, the processor 110 may generate a plurality of pairs, each consisting of a patch extracted from the image 1031 and a patch extracted from the image 1041 .
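Once the two images are aligned, extracting patches from corresponding positions amounts to cropping both images with the same window. The sketch below assumes the geometric alignment has already been done and represents each image as a 2D list of pixel values. The function names, patch size, and stride are hypothetical choices for illustration.

```python
def extract_patch(image, top, left, size):
    """Crop a size x size patch whose upper-left corner is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def paired_patches(image_a, image_b, size, stride):
    """Crop patches at identical positions of two already-aligned images.

    Returns a list of ((top, left), patch_from_a, patch_from_b) tuples.
    """
    height = min(len(image_a), len(image_b))
    width = min(len(image_a[0]), len(image_b[0]))
    pairs = []
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            pairs.append((
                (top, left),
                extract_patch(image_a, top, left, size),
                extract_patch(image_b, top, left, size),
            ))
    return pairs


# Two hypothetical 4 x 4 images standing in for the two differently stained slides.
image_a = [[row * 4 + col for col in range(4)] for row in range(4)]
image_b = [[-value for value in row] for row in image_a]
pairs = paired_patches(image_a, image_b, size=2, stride=2)
```

Each tuple plays the role of one (patch 1032, patch 1042) pair: the first patch can serve as input data and the second as the corresponding ground truth.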
- processor 110 uses patches 1032 and 1042 to learn first machine learning model 1050 .
- the processor 110 may learn the first machine learning model 1050 by using the patch 1032 as input data and the patch 1042 as output data.
- the patch 1042 may be utilized as ground truth data.
- FIG. 11 is a diagram for explaining another example in which a processor learns a first machine learning model according to an embodiment.
- In FIG. 11, an object 1110 and pathology slide images 1131 and 1141 representing the object are shown.
- detailed descriptions of the staining methods 1121 and 1122, the pathology slide images 1131 and 1141, and the patches 1132 and 1142 are the same as those described above with reference to FIG. 10 .
- the processor 110 may generate training data for learning the first machine learning model 1160 .
- the training data may include a patch 1132 and a patch 1143 obtained by performing image processing 1150 on the patch 1142 .
- Processor 110 may perform one or more image processing on patch 1142 to create patch 1143 .
- the processor 110 may perform image filtering so that only a portion of the patch 1142 dyed to a certain level or higher remains, or may perform image filtering to leave only a portion where a specific color is expressed and erase the rest.
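The first kind of filtering mentioned above, keeping only the portion stained to a certain level or higher, can be sketched as a simple threshold. The sketch assumes a single-channel patch in which a larger pixel value means stronger staining; the threshold, the background value, and the function name `keep_strong_stain` are hypothetical.

```python
def keep_strong_stain(patch, threshold, background=0):
    """Keep pixels stained at or above the threshold and erase the rest.

    patch: 2D list of stain intensities (assumed: higher means more stain).
    """
    return [
        [value if value >= threshold else background for value in row]
        for row in patch
    ]


# Hypothetical 2 x 2 patch of stain intensities.
patch = [[10, 200], [180, 40]]
filtered = keep_strong_stain(patch, threshold=150)
```

The filtered patch retains only the strongly stained pixels, so it can serve as a cleaner training target than the raw stained patch.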
- a technique of image processing performed by the processor 110 is not limited to the above description.
- The processor 110 may also extract semantic information by applying a more complex image processing technique or a separate machine learning model to the patch 1142, and use the extracted information as the learning data corresponding to the patch 1143.
- For example, the extracted information may be information marking the position of a specific cell (eg, cancer cell, immune cell, etc.) as a dot, or information indicating the type or class of cells determined from the degree of staining expression and/or the form in which the staining is expressed.
- the image processing technique may be an algorithm that quantifies the expression amount of staining for each pixel included in the image 1141 and utilizes pixel location information.
- the extracted information may include information on the type and location of specific cells.
- a separate machine learning model may be a model that recognizes the locations and types of biological elements targeted by the staining method 1122 of the image 1141 .
- a separate machine learning model may be trained to detect biological factor B expressed by the second staining scheme 1122 when a patch stained with the second staining scheme 1122 is input.
- For example, if the second staining method 1122 is a stain expressed in cancer cells, the separate machine learning model may receive a patch stained with the second staining method 1122 and be trained to detect cancer cells.
- the detection result may be a point indicating the location of each cancer cell or may be a result of segmenting cancer cells at the pixel level.
- the processor 110 may learn the first machine learning model 1160 using the patch 1132 and at least one annotation generated based on a user input.
- the annotation may be generated based on the image 1141 .
- For example, learning of the first machine learning model using annotations may be additionally performed when the performance of learning that uses the training data described above with reference to FIGS. 10 and 11 as ground truth data is not sufficient, but is not limited thereto.
- the user 30 may perform an annotation by referring to the image 1141 , and location information within the patch 1142 may be included in the annotation.
- the number of users who perform annotations is not limited.
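An annotation as described above carries a location within a patch and may come from any number of users. A minimal record for that could look like the sketch below. The `Annotation` class, its field names, and the helper functions are hypothetical and are not defined in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One user-provided label at a location inside a patch (hypothetical schema)."""
    patch_id: str
    x: int
    y: int
    label: str
    annotator: str


def is_inside_patch(annotation, patch_width, patch_height):
    """Check that the annotated location lies within the patch bounds."""
    return 0 <= annotation.x < patch_width and 0 <= annotation.y < patch_height


def annotations_by_patch(annotations):
    """Group annotations from any number of users by the patch they refer to."""
    grouped = {}
    for annotation in annotations:
        grouped.setdefault(annotation.patch_id, []).append(annotation)
    return grouped
```

Keeping the annotator on each record allows labels from several users to be combined or compared later.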
- the processor 110 may create a separate machine learning model by adding or removing at least one layer included in the trained first machine learning model 1160.
- the processor 110 may generate a separate machine learning model by removing a layer that serves to draw an image from the trained first machine learning model 1160 and newly adding a layer that performs a final target task.
- the final target task may mean a task of additionally recognizing a biological factor that needs to be separately identified in addition to the biological factors that can be identified from the images 1131 and 1141.
- the final target task may refer to a task capable of deriving medical information such as the expression level of a biomarker or a prediction of treatment response.
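Removing the image-drawing layer and attaching a new task layer can be sketched in a framework-independent way. Here a model is simply an ordered list of `(name, function)` pairs. The helper names `build_task_model` and `run_model` are hypothetical, and in a real deep learning framework the retained layers would also keep their trained weights.

```python
def build_task_model(trained_layers, drop_last, new_layers):
    """Return trained layers minus the last `drop_last`, followed by `new_layers`.

    The input list is not modified, so the original model remains usable.
    """
    if drop_last < 0 or drop_last > len(trained_layers):
        raise ValueError("drop_last must be between 0 and the number of layers")
    kept = list(trained_layers[: len(trained_layers) - drop_last])
    return kept + list(new_layers)


def run_model(layers, value):
    """Apply each layer in order to the input value."""
    for _name, layer in layers:
        value = layer(value)
    return value


# Toy stand-ins for a trained encoder, an image-drawing layer and a new task head.
def encoder(patch):
    return [sum(patch) / len(patch), max(patch)]


def image_decoder(features):
    return [features[0]] * 4


def classifier(features):
    return "positive" if features[0] > 0.5 else "negative"


first_model = [("encoder", encoder), ("image_decoder", image_decoder)]
second_model = build_task_model(first_model, 1, [("classifier", classifier)])
```

The second model reuses the encoder and replaces only the output stage, which is the pattern the passage describes.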
- pathology slide images in which the same tissue is stained with different types of materials are used to train the machine learning model, which can address the inaccuracy and increased cost caused by human annotation and can secure a large amount of training data.
- FIG. 12 is a diagram for explaining an example in which an operation of a processor is implemented according to an exemplary embodiment.
- An example to be described later with reference to FIG. 12 may be the operation of the processor 110 described above with reference to FIGS. 10 and 11 .
- the processor 110 may train the first machine learning models 1050 and 1160.
- a screen 1210 for selecting pathology slide images stained in different ways is shown.
- the configuration of the screen 1210 is only an example and may be changed in various ways.
- a list 1220 of target slide images and a list 1230 of reference slide images may be displayed on the screen 1210.
- the target slide image may be an image stained with the first staining schemes 1021 and 1121.
- the reference slide image may be an image stained with the second staining schemes 1022 and 1122.
- the processor 110 may perform the operations described above with reference to FIGS. 10 and 11.
- the processor 110 may train the first machine learning models 1050 and 1160 to predict, based on the image 1231, the location and/or type of biological elements (e.g., cells, proteins, and/or tissues) represented in the image 1221.
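Training on a target image and a reference image requires patches taken from corresponding positions in both. The sketch below builds such pairs on a regular grid. It assumes the two images are already registered to each other and have identical dimensions, which the source does not guarantee; the function names and the list-of-rows image representation are also assumptions.

```python
def patch_origins(width, height, patch_size):
    """List the top-left (x, y) of each non-overlapping patch that fits the image."""
    return [
        (x, y)
        for y in range(0, height - patch_size + 1, patch_size)
        for x in range(0, width - patch_size + 1, patch_size)
    ]


def crop(image, x, y, size):
    """Cut a size-by-size patch out of an image stored as a list of rows."""
    return [row[x:x + size] for row in image[y:y + size]]


def paired_patches(target_image, reference_image, patch_size):
    """Return (target_patch, reference_patch, (x, y)) for each matching position.

    Assumes both images are registered and have the same size.
    """
    height, width = len(target_image), len(target_image[0])
    if len(reference_image) != height or len(reference_image[0]) != width:
        raise ValueError("images must have the same dimensions")
    return [
        (crop(target_image, x, y, patch_size), crop(reference_image, x, y, patch_size), (x, y))
        for x, y in patch_origins(width, height, patch_size)
    ]
```

Each pair can then serve as one training example, with the reference patch (or information extracted from it) acting as the ground truth for the target patch.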
- a screen 1250 in which the position and/or type of the biological element expressed in the image 1221 is predicted can be output.
- the configuration of the screen 1250 shown in FIG. 12 is only an example and may be changed in various ways.
- a mini-map 1251 displaying a portion of the image 1221 output on the current screen 1250 may be displayed on the screen 1250.
- a window 1252 indicating a portion that the user 30 pays attention to from among portions currently displayed on the screen 1250 may be set on the screen 1250.
- the position and size of the window 1252 may be preset or adjusted by the user 30.
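The geometry behind a mini-map and an adjustable window can be sketched as two small coordinate helpers. The function names, argument order, and the clamping behaviour are assumptions for illustration; the source only states that the mini-map shows the displayed portion and that the window position and size can be preset or adjusted.

```python
def minimap_rect(view_x, view_y, view_w, view_h, slide_w, slide_h, map_w, map_h):
    """Scale the currently displayed slide region into mini-map coordinates."""
    return (
        view_x * map_w / slide_w,
        view_y * map_h / slide_h,
        view_w * map_w / slide_w,
        view_h * map_h / slide_h,
    )


def clamp_window(x, y, w, h, screen_w, screen_h):
    """Keep a user-adjusted window fully inside the screen."""
    w = min(w, screen_w)
    h = min(h, screen_h)
    x = min(max(x, 0), screen_w - w)
    y = min(max(y, 0), screen_h - h)
    return (x, y, w, h)
```

For instance, a 500 by 250 pixel view at (1000, 2000) on a 10000 by 5000 slide maps to a 10 by 5 rectangle at (20, 40) on a 200 by 100 mini-map.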
- FIGS. 13A and 13B are diagrams for describing examples in which annotations are generated based on user input according to an exemplary embodiment.
- the user 30 may directly modify the annotation.
- biological elements (e.g., tissue, cell, structure, etc.)
- the user 30 may select the cells 1321, 1322, and 1323 and directly modify them.
- the user 30 may select a grid 1340 including a plurality of cells shown in an area 1331 of a pathology slide image 1330, and may collectively modify the labeling of the cells or tissues included in the grid 1340.
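Collective relabeling within a selected grid can be sketched as a simple filter over cell records. The dictionary-based cell representation, the half-open rectangle convention for the grid, and the function name `relabel_in_grid` are assumptions made for this illustration.

```python
def relabel_in_grid(cells, grid, new_label):
    """Return copies of `cells`, with the label replaced for cells inside `grid`.

    `cells` is a list of dicts with "x", "y" and "label" keys.
    `grid` is (x0, y0, x1, y1), inclusive of x0/y0 and exclusive of x1/y1.
    """
    x0, y0, x1, y1 = grid
    updated = []
    for cell in cells:
        inside = x0 <= cell["x"] < x1 and y0 <= cell["y"] < y1
        if inside:
            updated.append({**cell, "label": new_label})
        else:
            updated.append(dict(cell))
    return updated
```

Returning new records rather than editing in place keeps the original predictions available, for example to compare them with the corrected labels.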
- the processor 110 can improve the performance of the machine learning model even if no annotation work is performed (or with only a small amount of annotation results). Accordingly, the accuracy of the analysis result of the pathology slide image by the machine learning model may be improved. In addition, since the processor 110 can predict the subject's treatment response using the analysis result of the pathology slide image, the accuracy of the prediction result of the treatment response can be guaranteed.
- the above-described method can be written as a program that can be executed on a computer, and can be implemented in a general-purpose digital computer that operates the program using a computer-readable recording medium.
- the structure of data used in the above-described method can be recorded on a computer-readable recording medium through various means.
- the computer-readable recording medium includes storage media such as magnetic storage media (e.g., ROM, RAM, USB, floppy disk, hard disk, etc.) and optical reading media (e.g., CD-ROM, DVD, etc.).
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Biomedical Technology (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Databases & Information Systems (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Radiology & Medical Imaging (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Quality & Reliability (AREA)
- Image Analysis (AREA)
- Investigating Or Analysing Biological Materials (AREA)
- Epidemiology (AREA)
- Primary Health Care (AREA)
- Public Health (AREA)
- Medical Treatment And Welfare Office Work (AREA)
- Computational Linguistics (AREA)
Claims (15)
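- A computing device comprising: at least one memory; and at least one processor, wherein the processor obtains a first pathology slide image in which at least one first object is represented and biological information of the at least one first object, generates training data using at least one first patch included in the first pathology slide image and the biological information, trains a first machine learning model with the training data, and analyzes a second pathology slide image using the trained first machine learning model.
- The computing device of claim 1, wherein the biological information includes at least one of information identified from a third pathology slide image and spatial transcriptomics information of the first object, and the third pathology slide image includes an image stained in a manner distinct from that of the first pathology slide image.
- The computing device of claim 1, wherein the training data includes data indicating at least one of gene expression information corresponding to the first patch and a type of at least one cell shown in the first patch, and the processor trains the first machine learning model using the training data as ground truth data.
- The computing device of claim 3, wherein the processor generates a second machine learning model that identifies a type of at least one cell included in the object by adding or removing at least one layer included in the trained first machine learning model.
- The computing device of claim 1, wherein the processor predicts a treatment response (therapeutic reaction) of a subject corresponding to the second pathology slide image using spatial transcriptomics information of a second object represented in the second pathology slide image.
- The computing device of claim 5, wherein the prediction of the treatment response is performed by a third machine learning model, and the spatial transcriptomics information of the second object includes at least one of spatial transcriptomics information obtained by the trained first machine learning model and separately obtained spatial transcriptomics information.
- The computing device of claim 6, wherein the third machine learning model is trained to predict the treatment response of the subject using a feature vector extracted from at least one layer included in the trained first machine learning model.
- The computing device of claim 6, wherein the third machine learning model is trained to predict the treatment response of the subject using gene expression information included in spatial transcriptomics information and position information corresponding to the gene expression information.
- The computing device of claim 2, wherein the training data includes the first patch and a second patch included in the third pathology slide image, and the second patch includes a patch indicating a position in the third pathology slide image corresponding to the first patch.
- The computing device of claim 2, wherein the training data includes the first patch and a third patch obtained by image-processing a second patch included in the third pathology slide image, and the second patch includes a patch indicating a position in the third pathology slide image corresponding to the first patch.
- A method of analyzing a pathology slide image, the method comprising: obtaining a first pathology slide image in which at least one first object is represented and biological information of the at least one first object; generating training data using at least one first patch included in the first pathology slide image and the biological information; training a first machine learning model with the training data; and analyzing a second pathology slide image using the trained first machine learning model.
- The method of claim 11, wherein the biological information includes at least one of information identified from a third pathology slide image and spatial transcriptomics information of the first object, and the third pathology slide image includes an image stained in a manner distinct from that of the first pathology slide image.
- The method of claim 11, wherein the training data includes data indicating at least one of gene expression information corresponding to the first patch and a type of at least one cell shown in the first patch, and the training comprises training the first machine learning model using the training data as ground truth data.
- The method of claim 12, wherein the training data includes the first patch and a second patch included in the third pathology slide image, and the second patch includes a patch indicating a position in the third pathology slide image corresponding to the first patch.
- A computer-readable recording medium having recorded thereon a program for executing the method of claim 11 on a computer.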
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2024552307A JP2025506993A (ja) | 2022-03-03 | 2023-02-16 | 病理スライド画像を分析する方法及び装置 |
| EP23763647.7A EP4489011A4 (en) | 2022-03-03 | 2023-02-16 | METHOD AND APPARATUS FOR PATHOLOGICAL SLIDE IMAGE ANALYSIS |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR20220027211 | 2022-03-03 | ||
| KR10-2022-0027211 | 2022-03-03 | ||
| KR10-2023-0017338 | 2023-02-09 | ||
| KR1020230017338A KR20230130536A (ko) | 2022-03-03 | 2023-02-09 | 병리 슬라이드 이미지를 분석하는 방법 및 장치 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023167448A1 (ko) | 2023-09-07 |
Family
ID=87850835
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/KR2023/002241 Ceased WO2023167448A1 (ko) | 2022-03-03 | 2023-02-16 | 병리 슬라이드 이미지를 분석하는 방법 및 장치 |
Country Status (4)
| Country | Link |
|---|---|
| US (2) | US12530874B2 (ko) |
| EP (1) | EP4489011A4 (ko) |
| JP (1) | JP2025506993A (ko) |
| WO (1) | WO2023167448A1 (ko) |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP4489011A4 (en) * | 2022-03-03 | 2025-08-06 | Lunit Inc | METHOD AND APPARATUS FOR PATHOLOGICAL SLIDE IMAGE ANALYSIS |
| CN117831612A (zh) * | 2024-03-05 | 2024-04-05 | 安徽省立医院(中国科学技术大学附属第一医院) | 基于人工智能的gist靶向药物类型选择预测方法及系统 |
| CN120431990B (zh) * | 2025-07-07 | 2025-11-04 | 杭州华大生命科学研究院 | 功能单元预测模型构建方法及预测方法、装置及电子设备 |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR102155381B1 (ko) * | 2019-09-19 | 2020-09-11 | 두에이아이(주) | 인공지능 기반 기술의 의료영상분석을 이용한 자궁경부암 판단방법, 장치 및 소프트웨어 프로그램 |
| KR102170297B1 (ko) * | 2019-12-16 | 2020-10-26 | 주식회사 루닛 | 조직병리체학 데이터의 해석 정보를 제공하는 방법 및 시스템 |
| KR102246319B1 (ko) * | 2021-01-07 | 2021-05-03 | 주식회사 딥바이오 | 병리 검체에 대한 판단 결과를 제공하는 인공 뉴럴 네트워크의 학습 방법, 및 이를 수행하는 컴퓨팅 시스템 |
| KR20210139195A (ko) * | 2020-05-13 | 2021-11-22 | 주식회사 루닛 | 의학 데이터로부터 바이오마커와 관련된 의학적 예측을 생성하는 방법 및 시스템 |
| WO2021236544A1 (en) * | 2020-05-18 | 2021-11-25 | Genentech, Inc. | Pathology prediction based on spatial feature analysis |
Family Cites Families (22)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2014140070A2 (en) | 2013-03-14 | 2014-09-18 | Ventana Medical Systems, Inc. | Whole slide image registration and cross-image annotation devices, systems and methods |
| US20190287645A1 (en) * | 2016-07-06 | 2019-09-19 | Guardant Health, Inc. | Methods for fragmentome profiling of cell-free nucleic acids |
| KR101879207B1 (ko) * | 2016-11-22 | 2018-07-17 | 주식회사 루닛 | 약한 지도 학습 방식의 객체 인식 방법 및 장치 |
| US10713794B1 (en) * | 2017-03-16 | 2020-07-14 | Facebook, Inc. | Method and system for using machine-learning for object instance segmentation |
| WO2019172901A1 (en) * | 2018-03-07 | 2019-09-12 | Google Llc | Virtual staining for tissue slide images |
| US10957041B2 (en) * | 2018-05-14 | 2021-03-23 | Tempus Labs, Inc. | Determining biomarkers from histopathology slide images |
| CA3109279A1 (en) * | 2018-05-24 | 2019-11-28 | University Of Pittsburgh-Of The Commonwealth System Of Higher Education | Predicting cancer recurrence from spatial multi-parameter cellular and subcellular imaging data. |
| US12165743B2 (en) * | 2018-11-09 | 2024-12-10 | The Broad Institute, Inc. | Compressed sensing for screening and tissue imaging |
| US20220180975A1 (en) * | 2019-01-28 | 2022-06-09 | The Broad Institute, Inc. | Methods and systems for determining gene expression profiles and cell identities from multi-omic imaging data |
| KR20220015367A (ko) * | 2019-05-31 | 2022-02-08 | 프리놈 홀딩스, 인크. | 메틸화된 핵산의 고심도 시퀀싱 방법 및 시스템 |
| KR102068279B1 (ko) * | 2019-10-04 | 2020-01-20 | 주식회사 루닛 | 이미지 분석 방법 및 시스템 |
| KR102068277B1 (ko) * | 2019-10-04 | 2020-02-11 | 주식회사 루닛 | 이미지 분석 방법 및 시스템 |
| KR102166835B1 (ko) * | 2019-10-28 | 2020-10-16 | 주식회사 루닛 | 신경망 학습 방법 및 그 장치 |
| US12300006B2 (en) * | 2019-12-23 | 2025-05-13 | The Regents Of The University Of California | Method and system for digital staining of microscopy images using deep learning |
| US11830227B2 (en) * | 2020-05-12 | 2023-11-28 | Lunit Inc. | Learning apparatus and learning method for three-dimensional image |
| US20210358571A1 (en) * | 2020-05-13 | 2021-11-18 | Tempus Labs, Inc. | Systems and methods for predicting pathogenic status of fusion candidates detected in next generation sequencing data |
| CN115698335A (zh) * | 2020-05-22 | 2023-02-03 | 因斯特罗公司 | 使用机器学习模型预测疾病结果 |
| US20220261988A1 (en) * | 2021-02-18 | 2022-08-18 | Lunit Inc. | Method and system for detecting region of interest in pathological slide image |
| US12488893B2 (en) * | 2021-02-18 | 2025-12-02 | Lunit Inc. | Method and system for training machine learning model for detecting abnormal region in pathological slide image |
| US11482319B2 (en) * | 2021-03-09 | 2022-10-25 | PAIGE.AI, Inc. | Systems and methods for processing electronic images to determine testing for unstained specimens |
| EP4489011A4 (en) * | 2022-03-03 | 2025-08-06 | Lunit Inc | METHOD AND APPARATUS FOR PATHOLOGICAL SLIDE IMAGE ANALYSIS |
| US20240257910A1 (en) * | 2023-01-30 | 2024-08-01 | Research & Business Foundation Sungkyunkwan University | Method and electronic device for predicting patch-level gene expression from histology image by using artificial intelligence model |
2023
- 2023-02-16 EP EP23763647.7A patent/EP4489011A4/en active Pending
- 2023-02-16 JP JP2024552307A patent/JP2025506993A/ja active Pending
- 2023-02-16 WO PCT/KR2023/002241 patent/WO2023167448A1/ko not_active Ceased
- 2023-03-03 US US18/178,233 patent/US12530874B2/en active Active
2025
- 2025-12-24 US US19/432,487 patent/US20260120281A1/en active Pending
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4489011A4 * |
Also Published As
| Publication number | Publication date |
|---|---|
| EP4489011A1 (en) | 2025-01-08 |
| JP2025506993A (ja) | 2025-03-13 |
| US12530874B2 (en) | 2026-01-20 |
| US20260120281A1 (en) | 2026-04-30 |
| US20230281971A1 (en) | 2023-09-07 |
| EP4489011A4 (en) | 2025-08-06 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2023167448A1 (ko) | 병리 슬라이드 이미지를 분석하는 방법 및 장치 | |
| WO2021060899A1 (ko) | 인공지능 모델을 사용 기관에 특화시키는 학습 방법, 이를 수행하는 장치 | |
| WO2020045848A1 (ko) | 세그멘테이션을 수행하는 뉴럴 네트워크를 이용한 질병 진단 시스템 및 방법 | |
| WO2019235828A1 (ko) | 투 페이스 질병 진단 시스템 및 그 방법 | |
| WO2017135496A1 (ko) | 약물과 단백질 간 관계 분석 방법 및 장치 | |
| WO2021125744A1 (en) | Method and system for providing interpretation information on pathomics data | |
| WO2020032560A2 (ko) | 진단 결과 생성 시스템 및 방법 | |
| WO2012041333A1 (en) | Automated imaging, detection and grading of objects in cytological samples | |
| WO2020032559A2 (ko) | 뉴럴 네트워크를 이용한 질병의 진단 시스템 및 방법 | |
| WO2022124725A1 (ko) | 화합물과 단백질의 상호작용 예측 방법, 장치 및 컴퓨터 프로그램 | |
| WO2021010671A9 (ko) | 뉴럴 네트워크 및 비국소적 블록을 이용하여 세그멘테이션을 수행하는 질병 진단 시스템 및 방법 | |
| WO2020076133A1 (ko) | 암 영역 검출의 유효성 평가 장치 | |
| WO2023191472A1 (ko) | 면역조직화학 염색 이미지를 분석하기 위한 기계학습모델을 학습하는 방법 및 이를 수행하는 컴퓨팅 시스템 | |
| WO2023146361A1 (ko) | 인공지능 기반의 바이오 마커 선별 장치 및 방법 | |
| WO2025037735A1 (ko) | 생체 조직 이미지를 분석하여 종양 조직의 지역적 특성을 측정하는 방법 및 이를 수행하는 컴퓨팅 시스템 | |
| WO2020032561A2 (ko) | 다중 색 모델 및 뉴럴 네트워크를 이용한 질병 진단 시스템 및 방법 | |
| WO2023287235A1 (ko) | 병리 이미지 분석 방법 및 시스템 | |
| WO2023018085A1 (ko) | 병리 슬라이드 이미지와 관련된 정보를 출력하는 방법 및 장치 | |
| WO2023068787A1 (ko) | 의료 영상 분석 방법 | |
| WO2023128059A1 (en) | Method and apparatus for tumor purity based on pathological slide image | |
| WO2024242440A1 (en) | Method and apparatus for predicting treatment response to immune checkpoint inhibitors | |
| WO2025164990A1 (ko) | 병리 진단 케이스의 대표 병변 이미지 생성 방법 및 이를 수행하는 컴퓨팅 시스템 | |
| KR20230130536A (ko) | 병리 슬라이드 이미지를 분석하는 방법 및 장치 | |
| Fu et al. | Pix2Path: integrating spatial Transcriptomics and digital pathology with deep learning to score pathological risk and link gene expression to disease mechanisms | |
| WO2023113414A1 (ko) | 병리 검체에 대한 판단 결과를 제공하는 인공 뉴럴 네트워크의 학습 방법, 및 이를 수행하는 컴퓨팅 시스템 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23763647; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2024552307; Country of ref document: JP |
| | WWE | Wipo information: entry into national phase | Ref document number: 2023763647; Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2023763647; Country of ref document: EP; Effective date: 20241004 |