WO2020137745A1 - Image processing apparatus, image processing system, image processing method, and program
- Publication number
- WO2020137745A1 (PCT/JP2019/049623)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- dimensional
- image data
- image
- image processing
- class
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/174—Segmentation; Edge detection involving the use of two or more images
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B6/00—Apparatus or devices for radiation diagnosis; Apparatus or devices for radiation diagnosis combined with radiation therapy equipment
- A61B6/02—Arrangements for diagnosis sequentially in different planes; Stereoscopic radiation diagnosis
- A61B6/03—Computed tomography [CT]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/2431—Multiple classes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/149—Segmentation; Edge detection involving deformable models, e.g. active contour models
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/162—Segmentation; Edge detection involving graph-based methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/187—Segmentation; Edge detection involving region growing; involving region merging; involving connected component labelling
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/762—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
- G06V10/7635—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks based on graphs, e.g. graph cuts or spectral clustering
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional [3D] objects
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H30/00—ICT specially adapted for the handling or processing of medical images
- G16H30/40—ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/01—Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/04—Indexing scheme for image data processing or generation, in general involving 3D image data
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10081—Computed x-ray tomography [CT]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10088—Magnetic resonance imaging [MRI]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20072—Graph-based image processing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
- G06T2207/20116—Active contour; Active surface; Snakes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
- G06T2207/20161—Level set
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30056—Liver; Hepatic
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30084—Kidney; Renal
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30096—Tumor; Lesion
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
- G06V2201/031—Recognition of patterns in medical or anatomical images of internal organs
Definitions
- the disclosure of the present specification relates to an image processing device, an image processing system, an image processing method, and a program.
- Segmentation is a process of distinguishing a region of interest in an image from the rest of the image, and is also called region extraction, region division, image division, or the like. Many segmentation methods have been proposed so far. In recent years, as disclosed in Non-Patent Document 1, a method has been proposed that performs segmentation with high accuracy when given pixel information about a region of interest or about a region other than the region of interest.
- However, when performing segmentation by the method described in Non-Patent Document 1, the user may have to give the above-described pixel information in advance, which may impose a burden on the user. It is an object of the present invention to provide an image processing apparatus that reduces the user's burden of giving pixel information and can extract a region with high accuracy.
- An image processing apparatus according to the present invention includes first classifying means for classifying, by a learned classifier, a plurality of pixels in two-dimensional image data forming first three-dimensional image data including an object into a first class group, and second classifying means for classifying, based on the classification result by the first classifying means, a plurality of pixels in second three-dimensional image data including the object into a second class group including at least one class of the first class group.
- According to the image processing device of the present invention, it is possible to reduce the user's burden of giving pixel information and to extract a region with high accuracy.
- A flowchart showing an example of the processing procedure of the image processing apparatus according to the first embodiment.
- Diagrams explaining examples of images according to the first embodiment.
- A diagram explaining an example of the processing of the first classification unit according to the first embodiment.
- A diagram explaining an example of the teaching data of the first classification unit according to the first embodiment.
- A diagram explaining an example of the output of the first classification unit according to the first embodiment.
- Diagrams explaining examples of images according to the second embodiment.
- The target image data is image data captured by a modality that outputs three-dimensional image data. Image data captured by any such modality may be used, for example an MRI (magnetic resonance imaging) apparatus, an ultrasonic diagnostic apparatus, or an X-ray computed tomography (X-ray CT) apparatus.
- The image processing apparatus roughly extracts the attention area (coarse extraction) by the two-dimensional segmentation method from each of the two-dimensional image data (slices) constituting the spatial three-dimensional image data (three-dimensional tomographic image) including the object.
- the target refers to, for example, the subject.
- spatial three-dimensional image data will be referred to as three-dimensional image data.
- a two-dimensional rough extracted image of the attention area corresponding to the input two-dimensional image data is obtained.
- the three-dimensional rough extracted image is obtained by stacking or interpolating or integrating the two-dimensional rough extracted image.
- stacking refers to a process of converting two or more rough extracted images into continuous images.
- the integration process refers to a process of combining overlapping regions between two or more roughly extracted images into one. Then, based on the three-dimensional image data and the three-dimensional rough extraction image acquired by the two-dimensional segmentation method, the region of interest with higher accuracy is extracted by the three-dimensional segmentation method.
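The stacking, interpolation, and integration steps described above can be sketched as follows. This is only an illustration under stated assumptions: volumes are plain nested lists indexed as `[z][y][x]`, interpolation is linear along the slice axis, and integration is taken to be a per-voxel maximum. The function names `stack_slices`, `interpolate_slices`, and `integrate` are hypothetical and do not come from the specification, which does not fix any of these choices.

```python
def stack_slices(slices):
    """Stack 2D rough extracted images into a 3D volume indexed as [z][y][x]."""
    return [[row[:] for row in s] for s in slices]


def interpolate_slices(known, depth):
    """Fill a volume of `depth` slices by linear interpolation along z.

    known: dict mapping a slice index to its 2D likelihood map.
    Slices outside the known range copy the nearest known slice.
    """
    idx = sorted(known)
    volume = []
    for z in range(depth):
        lo = max([i for i in idx if i <= z], default=idx[0])
        hi = min([i for i in idx if i >= z], default=idx[-1])
        t = 0.0 if hi == lo else (z - lo) / (hi - lo)
        volume.append([
            [(1 - t) * a + t * b for a, b in zip(row_lo, row_hi)]
            for row_lo, row_hi in zip(known[lo], known[hi])
        ])
    return volume


def integrate(vol_a, vol_b):
    """Merge two overlapping rough extractions with a per-voxel maximum (an assumption)."""
    return [
        [[max(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(sa, sb)]
        for sa, sb in zip(vol_a, vol_b)
    ]
```

Nested lists keep the sketch dependency-free; a practical pipeline would more likely hold the volumes in an array library.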
- Extracting an area refers to classifying each pixel in an image into one class of a predetermined class group. The classification only needs to identify the position of the extraction target: it may distinguish whether a pixel is inside the extraction target, such as an organ or a lesion, or it may distinguish whether a pixel is on the contour of the extraction target.
- an abdominal CT image of a human body taken by an X-ray computed tomography (X-ray CT) device will be described as an example of three-dimensional image data.
- The regions of interest in the two-dimensional segmentation method are, for example, the liver and the right kidney. That is, the processing here is a classification problem of classifying each pixel into one of a class group (hereinafter, first class group) consisting of three classes: "liver", "right kidney", and "region other than liver and right kidney". The region of interest in the three-dimensional segmentation method is the liver. That is, the processing here is a problem of classifying each pixel into one of a class group (hereinafter, second class group) consisting of two classes: "liver" and "region other than liver".
- The intention of this configuration is to suppress the error of classifying the right kidney region as the liver region (erroneous extraction) by setting, as attention regions of the two-dimensional segmentation method, both the liver region and the right kidney region, which is likely to be erroneously extracted as the liver region.
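- Hereinafter, the region of interest of the two-dimensional segmentation method is referred to as the region of interest of the first classifier, and the region of interest of the three-dimensional segmentation method is referred to as the region of interest of the second classifier.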
- a method that does not necessarily require the user to give pixel information in advance is used as the two-dimensional segmentation method.
- the pixel information refers to information including at least one of position (foreground) information of an extraction target and position (background) information that is not an extraction target.
- One of the methods that does not necessarily require prior pixel information in the segmentation method is a segmentation method based on machine learning. In machine learning, the machine itself learns features from given data. This means, for example, that classification can be performed even if the user does not provide a classification condition in advance in the classification problem.
- CNN Convolutional Neural Network
- FCN [J.
- the three-dimensional segmentation method may be a region expansion method, a level set method, a graph cut method, or a Snakes method as long as it is a method of performing segmentation using given pixel information.
- the graph cut method is used.
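To illustrate what "performing segmentation using given pixel information" means, the following is a minimal sketch of the region expansion (region growing) alternative named above, not of the graph cut method that the embodiment actually uses. The acceptance rule (intensity within `tolerance` of the mean seed intensity), the 6-connectivity, and the name `region_grow` are assumptions made for this example only.

```python
from collections import deque


def region_grow(volume, seeds, tolerance):
    """Grow a region from foreground seed voxels over 6-connected neighbours.

    volume: intensities indexed as [z][y][x].
    seeds: list of (z, y, x) voxels given as foreground pixel information.
    tolerance: maximum allowed difference from the mean seed intensity.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    mean = sum(volume[z][y][x] for z, y, x in seeds) / len(seeds)
    mask = [[[0] * width for _ in range(height)] for _ in range(depth)]
    for z, y, x in seeds:
        mask[z][y][x] = 1
    queue = deque(seeds)
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in steps:
            nz, ny, nx = z + dz, y + dy, x + dx
            inside = 0 <= nz < depth and 0 <= ny < height and 0 <= nx < width
            if (inside and not mask[nz][ny][nx]
                    and abs(volume[nz][ny][nx] - mean) <= tolerance):
                mask[nz][ny][nx] = 1
                queue.append((nz, ny, nx))
    return mask
```

The seed list plays the role of the foreground pixel information. A voxel with a similar intensity is still excluded if it is not connected to a seed through accepted voxels.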
- the attention area in the target image looks almost the same even between different image data.
- the classifier may not be able to learn the variation of the appearance of the attention area, and the classification accuracy may decrease.
- The appearance of the region of interest varies in the body axis direction as a property of the human body that is the subject.
- the reason why the appearance of the attention area varies is that, for example, the position where the attention area is drawn is different.
- the position and resolution at which the region of interest is drawn are generally the same even between different three-dimensional image data.
- the resolution of the different three-dimensional image data in the body axis direction is different depending on the image, so that the accuracy of the classifier may be lowered.
- the body-axis cross-sectional image is less susceptible to the difference in resolution in the body-axis direction.
- the 2D image represented by the body-axis cross-sectional image is classified by using only the 2D information among the 3D information of the 3D image data. Therefore, when the image data based on the two-dimensional information is targeted even between different three-dimensional image data, the difference in appearance of the extraction target in the images during learning and classification is small.
- 2D-CNN: a CNN that takes two-dimensional image data as input
- Body axis cross-section image data: an example of two-dimensional image data that constitutes three-dimensional image data
- 3D-CNN: a CNN that takes three-dimensional image data as input
- Spatial normalization is required, such as cutting out the three-dimensional spatial area in which the attention area exists and aligning the positions where the attention area exists.
- In addition, a process of aligning the resolutions is performed.
- A classifier based on machine learning other than a CNN, such as an SVM (Support Vector Machine), may also be used.
- The image processing apparatus combines a 2D-FCN (Fully Convolutional Network), which is an example of a 2D-CNN and receives body axis cross-sectional image data as input, with the graph cut method, which is a three-dimensional segmentation method.
- the image processing apparatus 100 includes an acquisition unit 101, a first classification unit 102, and a second classification unit 103. Further, the image processing system according to this embodiment includes a storage device 70 outside the image processing device 100.
- the storage device 70 is an example of a computer-readable storage medium, and is a large-capacity information storage device represented by a hard disk drive (HDD) and a solid state drive (SSD).
- the storage device 70 holds at least one or more three-dimensional image data.
- the acquisition unit 101 acquires three-dimensional image data from the storage device 70. Then, the acquired three-dimensional image data is transmitted to the first classification unit 102 as the first three-dimensional image data and to the second classification unit 103 as the second three-dimensional image data.
- The first classification unit 102 receives the two-dimensional image data that constitutes the three-dimensional image data (first three-dimensional image data) acquired from the acquisition unit 101. The first classification unit 102 then performs two-dimensional segmentation on each of the two-dimensional image data forming the first three-dimensional image data, thereby obtaining two-dimensional rough extracted images corresponding to the first class group. The first classification unit 102 further generates a three-dimensional rough extracted image by performing, for each class, at least one of stacking, interpolation processing, and integration processing on the two-dimensional rough extracted images corresponding to the first class group, and transmits it to the second classification unit 103.
- the two-dimensional rough extracted image or the three-dimensional rough extracted image obtained by stacking or interpolating/integrating the two-dimensional rough extracted image is set as the classification result by the first classification unit 102.
- the classification result corresponding to the first class group is a likelihood map in which each pixel expresses the class likelihood with a pixel value of 0 or more and 1 or less.
- each roughly extracted image is represented by a value close to 1 for a pixel that seems to be the class and a value close to 0 for a pixel that does not seem to be the class.
- the two-dimensional rough extracted image has the same image size as the body-axis cross-sectional image, and the three-dimensional rough extracted image has the same size as the three-dimensional image data.
- the pixel value in the roughly extracted image that is the classification result may be represented by any value as long as it is a value that can express the likelihood of the class.
- the pixel value may be given as a binary value, or a value in a different range may be given for each class.
- the first class is assigned 0 or more and less than 1
- the second class is assigned 1 or more and less than 2.
- the classification result may be output as to which of the class groups each of the pixels belongs to, or may be output as the likelihood to each class of the class group.
- the two-dimensional rough extracted image may have the same image size as the body-axis cross-sectional image as described above, or may have a different image size.
- the three-dimensional rough extracted image may have the same image size as the three-dimensional image as described above, or may have a different image size.
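The alternative output representations described above (a likelihood per class, a hard class assignment, or a value range reserved for each class) can be sketched for a single pixel as follows. This is an illustration under assumptions: the softmax normalisation is a common way to obtain likelihoods between 0 and 1 but is not stated in the text, and `to_offset_value` is one possible reading of "a value in a different range for each class", with class k mapped to the range from k up to but excluding k + 1. All function names are hypothetical.

```python
import math


def softmax(scores):
    """Turn per-class scores for one pixel into likelihoods in [0, 1] that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def to_label(likelihoods):
    """Hard classification: index of the most likely class."""
    return max(range(len(likelihoods)), key=likelihoods.__getitem__)


def to_offset_value(likelihoods):
    """Encode the winning class k and its likelihood p as one value in [k, k + 1)."""
    k = to_label(likelihoods)
    p = min(likelihoods[k], 1.0 - 1e-9)  # keep the value strictly below k + 1
    return k + p
```

With this encoding, the integer part of a pixel value identifies the class and the fractional part carries its likelihood.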
- the first classification unit 102 is a learned 2D-FCN.
- a learning method of 2D-FCN will be described with reference to FIG. 2D-FCN is one of supervised learning in machine learning, and exhibits classification ability by causing a machine to learn in advance by associating a two-dimensional correct answer image and a two-dimensional learning image with each other.
- the correct image composed of a plurality of two-dimensional correct images and the learning image composed of a plurality of two-dimensional learning images are combined as teaching data.
- FIG. 6A is a learning image 610, which is a set of two-dimensional body-axis cross-sectional images, and is an abdominal CT image captured by the X-ray CT apparatus in the present embodiment.
- the learning image 610 is composed of a body axis section image 611a and a body axis section image 611b.
- FIG. 6B is a set of two-dimensional correct answer images corresponding to each class of the first class group of the learning image 610.
- the liver region, the right kidney region, and regions other than the liver and the right kidney are classified.
- the correct answer image corresponding to each class is composed of a plurality of two-dimensional correct answer images.
- For example, the correct answer image 630 of the liver region is composed of a two-dimensional correct answer image 631a of the liver region corresponding to the body-axis sectional image 611a and a two-dimensional correct answer image 631b of the liver region corresponding to the body-axis sectional image 611b.
- the correct answer image of the right kidney region is composed of a two-dimensional correct answer image 641a of the right kidney region corresponding to the body-axis sectional image 611a and a two-dimensional correct answer image 641b of the right kidney region corresponding to the body-axis sectional image 611b.
- the correct answer image of the regions other than the liver and the right kidney is likewise composed of a two-dimensional correct image 651a and a two-dimensional correct image 651b.
- Each pixel in the correct answer image corresponding to a class of the first class group expresses, as a binary value, whether or not the pixel belongs to that class. The correct answer images corresponding to different classes have no overlapping areas.
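As a small illustration of binary, non-overlapping correct answer images, the sketch below splits a single label image into one binary image per class and checks that every pixel belongs to exactly one class. The label coding (0 for the region other than liver and right kidney, 1 for liver, 2 for right kidney) and the names `one_hot_masks` and `is_partition` are assumptions made for this example.

```python
def one_hot_masks(label_image, num_classes):
    """Split a 2D label image into one binary correct answer image per class."""
    return [
        [[1 if label == k else 0 for label in row] for row in label_image]
        for k in range(num_classes)
    ]


def is_partition(masks):
    """True when every pixel is set in exactly one of the class masks."""
    height, width = len(masks[0]), len(masks[0][0])
    return all(
        sum(m[y][x] for m in masks) == 1
        for y in range(height)
        for x in range(width)
    )
```

Because the first class group includes a class for everything outside the liver and the right kidney, the masks not only avoid overlap but also cover the whole image, which is what `is_partition` verifies.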
- the 2D-FCN is learned by using a teaching data set including one or more of the above teaching data.
- As a learning method, for example, the error backpropagation method (Backpropagation), which is a general method in CNN learning, is used.
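The following toy example shows the kind of gradient computation that error backpropagation performs, reduced to its simplest case: a single-layer per-pixel softmax classifier on pixel intensity, trained by gradient descent on the cross-entropy loss. It is not a 2D-FCN, which has many convolutional layers through which the same error signal would be propagated. The model, the learning rate, the step count, and the name `train` are all assumptions for illustration.

```python
import math


def softmax(scores):
    """Normalise per-class scores into likelihoods that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(pixels, labels, num_classes, lr=0.5, steps=200):
    """Fit a per-class weight and bias on pixel intensity by gradient descent.

    Returns the parameters and the mean cross-entropy loss at each step.
    """
    w = [0.0] * num_classes
    b = [0.0] * num_classes
    losses = []
    n = len(pixels)
    for _ in range(steps):
        grad_w = [0.0] * num_classes
        grad_b = [0.0] * num_classes
        loss = 0.0
        for x, y in zip(pixels, labels):
            p = softmax([w[k] * x + b[k] for k in range(num_classes)])
            loss -= math.log(p[y])
            for k in range(num_classes):
                # Error signal: gradient of the loss w.r.t. the score of class k.
                delta = p[k] - (1.0 if k == y else 0.0)
                grad_w[k] += delta * x
                grad_b[k] += delta
        losses.append(loss / n)
        for k in range(num_classes):
            w[k] -= lr * grad_w[k] / n
            b[k] -= lr * grad_b[k] / n
    return w, b, losses
```

Here `pixels` would be intensities taken from the learning images and `labels` the class indices taken from the matching correct answer images.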
- the correct image may be an image in which whether or not the class is represented by a binary value for each pixel as described above, or may be an image in which the likelihood of the class is expressed by a continuous value in pixels.
- When the correct answer image is an image in which the likelihood of the class is expressed by continuous values, the first classification unit 102 is configured to solve a regression problem for each pixel.
- The second classification unit 103 receives as input the three-dimensional image data (second three-dimensional image data) acquired from the acquisition unit 101 and the three-dimensional rough extracted image corresponding to each class acquired from the first classification unit 102, and outputs a three-dimensional image of interest corresponding to the second class group.
- the three-dimensional image of interest is an image showing the likelihood of the region to be extracted, and the pixel value is expressed similarly to the above-described three-dimensional rough extracted image.
- each unit of the image processing apparatus 100 shown in FIG. 1 may be realized as an independent device. Moreover, you may implement
- FIG. 2 is a diagram showing an example of the hardware configuration of the image processing apparatus 100.
- a CPU (Central Processing Unit) 201 mainly controls the operation of each component.
- the main memory 202 stores a control program executed by the CPU 201 and provides a work area when the CPU 201 executes the program.
- the magnetic disk 203 stores an OS (Operating System), a device driver for peripheral devices, and programs for realizing various application software including programs for performing processing to be described later.
- the CPU 201 executes the programs stored in the main memory 202, the magnetic disk 203, etc., so that the functions (software) of the image processing apparatus 100 shown in FIG. 1 and the processing in the flowcharts described later are realized.
- the display memory 204 temporarily stores display data.
- the monitor 205 is, for example, a CRT monitor, a liquid crystal monitor, or the like, and displays an image, text, or the like based on the data from the display memory 204.
- the mouse 206 and the keyboard 207 are used by the user to perform pointing input and input of characters and the like, respectively.
- the above components are connected to each other via a common bus 208 so that they can communicate with each other.
- the CPU 201 corresponds to an example of a processor.
- the image processing apparatus 100 may include at least one of a GPU (Graphics Processing Unit) and an FPGA (Field-Programmable Gate Array) in addition to the CPU 201. Further, instead of the CPU 201, it may have at least one of GPU and FPGA.
- the main memory 202 and the magnetic disk 203 correspond to an example of the memory.
- Step S310: Step of acquiring three-dimensional image data
- the acquisition unit 101 acquires three-dimensional image data from the storage device 70.
- Step S320: The first classification unit 102 extracts a two-dimensional region of interest by a two-dimensional segmentation method for each of the two-dimensional body-axis cross-sectional images forming the three-dimensional image data.
- the two-dimensional attention area is output as a two-dimensional rough extracted image corresponding to each class of the first class group.
- The two-dimensional rough extracted images are stacked class by class to generate a three-dimensional rough extracted image corresponding to each class of the first class group, which is transmitted to the second classification unit 103.
- The first classification unit 102 is a trained 2D-FCN that classifies each pixel into three classes: liver, right kidney, and other than the liver and right kidney.
- FIG. 4A shows the three-dimensional image data 410 acquired in step S310.
- the three-dimensional image data 410 is composed of a plurality of body axis section images such as a body axis section image 411a and a body axis section image 411b, and each of these body axis section images is a 2D-FCN input.
- FIG. 4C shows the three-dimensional rough extracted images corresponding to each class: the three-dimensional rough extracted image 430 of the liver region, the three-dimensional rough extracted image 440 of the right kidney region, and the three-dimensional rough extracted image 450 of regions other than the liver and the right kidney.
- Each three-dimensional rough extracted image is composed of a two-dimensional rough extracted image corresponding to each body-axis cross-sectional image that is an input of 2D-FCN.
- For example, the two-dimensional rough extracted images 431a and 431b of the liver region are the two-dimensional rough extracted images corresponding to the body-axis cross-sectional images 411a and 411b, respectively.
- the 2D-FCN outputs a two-dimensional rough extracted image corresponding to each class when the body axis cross-sectional image is input by the processing described later.
- The rough extracted images 441a and 441b of the right kidney region correspond to the body-axis cross-sectional images 411a and 411b of the three-dimensional image data 410, respectively.
- the rough extracted images 451a and 451b of the regions other than the liver and the right kidney similarly correspond to the body axis sectional image 411a and the body axis sectional image 411b, respectively.
- 2D-FCN processing will be described with reference to FIG.
- When the input image 510 (a body-axis cross-sectional image) is input, a plurality of Convolution, Pooling, and Upsampling processes are executed in the intermediate layer 530.
- In the output layer 540, the Convolution process is performed, the output value of each pixel is normalized by the Softmax process, and the output image 520, a two-dimensional image containing the pixel classification information, is obtained.
- the Convolution process extracts features while maintaining the shape of the image.
- the Pooling process reduces the spatial size of the image width and height and enlarges the receptive field. Upsampling uses the pooling information to obtain a detailed resolution.
- The output image 520 is composed of as many two-dimensional rough extracted images as there are classes to be classified. In the present embodiment, the output image 520 is composed of a two-dimensional rough extracted image 521 of the liver region, a two-dimensional rough extracted image 522 of the right kidney region, and a two-dimensional rough extracted image 523 of regions other than the liver and the right kidney.
- a model having a Convolution process, a Pooling process (encoder), and an Upsampling process (decoder) represented by FCN as a network structure is called an encoder/decoder model.
- the encoder holds some global information by allowing a slight pixel displacement.
- the decoder restores the resolution while retaining the feature amount of the encoder. According to this model, a certain degree of accuracy can be expected even if the positions of the extraction targets are different between the two-dimensional image data forming different three-dimensional image data.
- the network structure is not limited to this structure as long as it is an architecture capable of processing image information in multi-scale.
- the Softmax process will be described using mathematical expressions.
- Let a_{i,j,k} denote the pixel value before Softmax processing.
- The pixel value p_{i,j,k} after Softmax processing is calculated based on the following equation (1).
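Assuming equation (1) is the standard Softmax, with i and j indexing the pixel position and k indexing one of K classes (K = 3 in the present embodiment; these index roles and the symbol K are assumptions, not taken from the text), it can be written as:

```latex
p_{i,j,k} = \frac{\exp\left(a_{i,j,k}\right)}{\sum_{k'=1}^{K} \exp\left(a_{i,j,k'}\right)} \tag{1}
```

Under this form, the values p_{i,j,k} at any one pixel are positive and sum to 1 over the classes.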
- each of the two-dimensional rough extracted images corresponding to each class is a two-dimensional likelihood map showing the likelihood of the region.
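As a minimal sketch of how per-class likelihood maps could be obtained from pre-Softmax values, the following assumes the values are held in a NumPy array of shape (H, W, K). The function name `softmax_likelihood_maps`, the array layout, and the sample values are illustrative assumptions, not part of the described apparatus.

```python
import numpy as np

def softmax_likelihood_maps(logits):
    """logits: array of shape (H, W, K) holding pre-Softmax values a[i, j, k].
    Returns p[i, j, k], normalized over the class axis k."""
    # Subtracting the per-pixel maximum improves numerical stability
    # and does not change the Softmax result.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical 2x2 image with K = 3 classes (liver, right kidney, other).
logits = np.array([[[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]],
                   [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]])
p = softmax_likelihood_maps(logits)

# One two-dimensional likelihood map per class.
liver_map, kidney_map, other_map = (p[..., k] for k in range(3))
```

Each of the three maps has the same height and width as the input, and the three values at any pixel sum to 1.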
- The rough extracted image output from the first classification unit 102 need not represent each pixel by a likelihood indicating class membership as described above.
- the value of each pixel may be represented by a binary value, or may be represented by a value in a different range for each class.
- the value obtained from the first classifying unit may be directly used, or a numerical value may be converted by setting a threshold value.
- the first classification unit 102 may be a CNN different from the 2D-FCN described above.
- the classifier is not limited to CNN and may be based on any two-dimensional segmentation method as long as it is a classifier that extracts a region of interest from each body-axis cross-sectional image.
- For example, a classifier based on machine learning other than a CNN, such as an SVM, k-means, boosting, or random forest, may be used.
- the first classifying unit may be composed of a plurality of classifiers, and the plurality of classifiers may be used in parallel or hierarchically.
- the first classifying unit 102 is not limited to the classifier that classifies three or more classes at the same time as described above.
- a classifier that classifies into two classes of a liver region and a region other than the liver region may be used.
- a plurality of classifiers for classifying into two classes may be prepared, and a rough extracted image may be obtained from each classifier.
- A result equivalent to the result of the above-described processing can be obtained.
- <Step S330: Process end determination step>
- The first classification unit 102 determines whether there is unprocessed two-dimensional image data in the three-dimensional image data that is the processing target. If there is unprocessed two-dimensional image data, the process returns to step S320, where rough extraction of the region of interest is performed. If there is none, the process proceeds to the next step.
- <Step S340: Three-dimensional rough extracted image generation step>
- The first classification unit 102, acting as the three-dimensional data generation means, performs stacking, interpolation, or integration processing on the two-dimensional rough extracted images corresponding to each class obtained by the processing up to step S330, and generates a three-dimensional rough extracted image. The first classification unit 102 then outputs the generated three-dimensional rough extracted image to the second classification unit 103. The three-dimensional rough extracted image corresponds to the three-dimensional classification result.
- The function of the three-dimensional data generation means that generates the three-dimensional rough extracted image may instead be taken on by the second classification unit 103, or by a calculation unit other than the first classification unit 102 and the second classification unit 103.
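The loop over body-axis slices (steps S320 and S330) followed by stacking (step S340) can be sketched as follows. This is an illustration under stated assumptions: `classify_slice` is a hypothetical stand-in for the trained 2D-FCN that simply returns equal likelihoods, and the (Z, H, W) array layout is chosen here for convenience.

```python
import numpy as np

def classify_slice(slice_2d, num_classes=3):
    """Hypothetical stand-in for the trained 2D-FCN.
    Returns (H, W, K) likelihoods; here every class gets the same value."""
    h, w = slice_2d.shape
    return np.full((h, w, num_classes), 1.0 / num_classes)

def rough_extract_volume(volume, num_classes=3):
    """volume: (Z, H, W) stack of body-axis cross-sectional images.
    Returns a list of K volumes of shape (Z, H, W), one per class."""
    per_slice = []
    # S320/S330: classify slices one by one until none is left unprocessed.
    for z in range(volume.shape[0]):
        per_slice.append(classify_slice(volume[z], num_classes))
    # S340: stack the 2-D outputs along the body axis -> (Z, H, W, K).
    stacked = np.stack(per_slice, axis=0)
    return [stacked[..., k] for k in range(num_classes)]

volume = np.zeros((4, 5, 5))
liver_3d, kidney_3d, other_3d = rough_extract_volume(volume)
```

Plain stacking is shown here; interpolation or integration between slices, which the text also allows, would replace the `np.stack` call.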
- The classification method performed by the second classification unit 103, described later, is a graph cut method that classifies each pixel into two classes: liver and other than liver.
- The graph cut method constructs a graph based on pixel information related to the extraction target region (foreground) and the region other than the extraction target region (background), and extracts the extraction target region so that a designed energy function is minimized (or maximized).
- the energy function E of the graph cut method is defined by the following equation (2).
- i and j respectively represent different pixel numbers in the image.
- λ is a constant parameter that adjusts the contribution of the data term E_1 and the smoothing term E_2.
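Assuming equation (2) follows the usual graph cut formulation, in which the data term is summed over pixels and the smoothing term over pairs of adjacent pixels, it can be written as shown here. The symbol N for the set of adjacent pixel pairs and the exact summation ranges are assumptions, not taken from the text.

```latex
E = \sum_{i} E_1(i) + \lambda \sum_{(i,j) \in N} E_2(i,j) \tag{2}
```

A larger λ gives the smoothing term more weight relative to the data term.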
- <Step S350: Pixel information setting step>
- the second classification unit 103 sets pixel information indicating the position of a pixel corresponding to at least one of the foreground and the background based on the three-dimensional rough extracted image.
- Setting of pixel information will be described with reference to FIG. 7.
- FIG. 7A is a roughly extracted image having 5 pixels on one side, and shows a roughly extracted image 720 of the liver region, a roughly extracted image 730 of the right kidney region, and a roughly extracted image 740 of regions other than the liver and the right kidney, respectively.
- The pixel value is a likelihood representing class membership, and therefore changes continuously. In the present embodiment, the rough extracted images corresponding to the three classes are compared pixel by pixel. Among the likelihood of the liver, the likelihood of the right kidney, and the likelihood of regions other than the liver and right kidney, a pixel whose liver likelihood is higher than those of the other classes is set as a foreground pixel (a pixel determined to be foreground). Similarly, a pixel whose liver likelihood is lower than those of the other classes is set as a background pixel (a pixel determined to be background).
- FIG. 7B shows a seed image 750 that combines the coarsely extracted images according to the method described above.
- the seed image 750 includes a foreground pixel 751, a background pixel 752, and an intermediate pixel 753.
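A minimal sketch of building such a seed image from the three likelihood maps is given here. The label values 1, 0, and 2 and the function name `make_seed_image` are arbitrary choices made for this illustration.

```python
import numpy as np

# Label values are arbitrary choices for this sketch.
FOREGROUND, BACKGROUND, INTERMEDIATE = 1, 0, 2

def make_seed_image(liver, kidney, other):
    """Each argument is a likelihood map of the same shape."""
    seed = np.full(liver.shape, INTERMEDIATE, dtype=np.uint8)
    highest = (liver > kidney) & (liver > other)  # liver beats every other class
    lowest = (liver < kidney) & (liver < other)   # liver loses to every other class
    seed[highest] = FOREGROUND
    seed[lowest] = BACKGROUND
    return seed

# Hypothetical likelihoods for three pixels.
liver = np.array([[0.8, 0.4, 0.1]])
kidney = np.array([[0.1, 0.5, 0.3]])
other = np.array([[0.1, 0.1, 0.6]])
seed = make_seed_image(liver, kidney, other)
```

In this example the first pixel becomes foreground, the last becomes background, and the middle pixel, where the liver likelihood is neither highest nor lowest, stays intermediate.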
- The energy function shown in equation (2) is defined based on the seed image 750.
- The data term E_1 sets the energy of each pixel according to a distance value based on the foreground pixels 751 and the background pixels 752 in the seed image 750.
- So that the foreground pixels 751 and the background pixels 752 remain foreground and background pixels (are not changed) even after region extraction by the graph cut method, the energy is set so as to give a sufficiently large cost to the corresponding edge (t-link) in the graph.
- The smoothing term E_2 sets the energy based on the difference in density value between adjacent pixels in the three-dimensional image data 410. By defining the energy function in this way, each intermediate pixel 753 is classified as either foreground or background. The pixels classified as foreground are set as the extraction target region and the pixels classified as background are set as the region other than the extraction target region, so that the second classification unit 103 divides the image into the extraction target region and the region other than the extraction target region.
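To make the roles of the two terms concrete, the sketch below evaluates an energy of this kind on a one-dimensional row of four pixels and finds the minimizing labeling by exhaustive search. Several simplifications are assumptions made for illustration only: the data term is zero for intermediate pixels (the text uses a distance value), the smoothing term is a Gaussian of the density difference, and exhaustive search replaces the max-flow/min-cut computation a real graph cut would use.

```python
import itertools
import numpy as np

FOREGROUND, BACKGROUND, INTERMEDIATE = 1, 0, 2
LARGE = 1e6  # "sufficiently large" t-link cost for seed pixels

def energy(labels, seed, intensity, lam=1.0, sigma=10.0):
    """E = sum_i E1(i) + lam * sum_(i,j) E2(i, j) on a 1-D row of pixels."""
    e1 = 0.0
    for lab, s in zip(labels, seed):
        if s != INTERMEDIATE and lab != s:
            e1 += LARGE  # seed pixels must keep their label
    e2 = 0.0
    for i in range(len(labels) - 1):
        if labels[i] != labels[i + 1]:  # cost only where the label changes
            diff = float(intensity[i]) - float(intensity[i + 1])
            e2 += np.exp(-diff ** 2 / (2 * sigma ** 2))
    return e1 + lam * e2

def minimize_brute_force(seed, intensity):
    """Exhaustive search over all labelings; feasible only for tiny inputs."""
    best = min(itertools.product([BACKGROUND, FOREGROUND], repeat=len(seed)),
               key=lambda labels: energy(labels, seed, intensity))
    return list(best)

seed = [FOREGROUND, INTERMEDIATE, INTERMEDIATE, BACKGROUND]
intensity = [100, 102, 40, 38]  # strong edge between pixels 1 and 2
labels = minimize_brute_force(seed, intensity)
```

The cheapest place to switch labels is the large density step between the second and third pixels, so the two intermediate pixels are assigned to the foreground and background sides of that edge respectively.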
- Any method may be used to set the pixel information as long as it determines at least one of the foreground pixels and the background pixels based on the rough extracted image acquired from the first classification unit 102.
- For example, the second classification unit may determine one of the two (foreground or background) based on the rough extracted image, and determine the other (background or foreground) based on the pixel values of the three-dimensional image data.
- the determination is performed based on the magnitude relation between the likelihood of the liver and the likelihood other than the liver, but the invention is not limited to this.
- a pixel in the foreground may be determined, and a pixel (distance value) separated from the determined foreground pixel by a predetermined distance or more may be set as the background pixel.
- threshold processing may be performed to determine the foreground (or background) pixel.
- In the above method, the energy for each pixel is set according to the distance value, but it may instead be set based on the likelihood of the rough extracted image. Also, in the above method, the energy is set so that the corresponding edge (t-link) in the graph is given a sufficiently large cost, so that the foreground pixels and background pixels are not changed by the graph cut method.
- Alternatively, the magnitude of the likelihood may be set as an energy that allows the foreground or background pixels to change, or the energy may be set based on the likelihood or the distance value.
- the foreground, background, and intermediate pixels in the seed image may be set based on the magnitude of likelihood.
- In the above description, the foreground and background pixels are set by first comparing the rough extracted images corresponding to each class pixel by pixel; a pixel at which the liver likelihood is the highest is set as foreground, and a pixel at which it is the lowest is set as background.
- However, when the likelihoods of the classes are about the same, simply comparing the likelihoods at each pixel to determine at least one of the foreground and the background may not give sufficient accuracy.
- a predetermined threshold may be set for the likelihood, at least one of the foreground and the background may be set when the predetermined threshold is satisfied, and a pixel that does not satisfy the predetermined threshold may be set as an intermediate pixel. Further, the predetermined threshold value for the likelihood may be combined with the result of the pixel-by-pixel comparison of the roughly extracted images corresponding to the classes described above.
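One way to combine a likelihood threshold with the pixel-by-pixel comparison is sketched here. The threshold values 0.9 and 0.1, the label values, and the function name `seeds_by_threshold` are illustrative assumptions; the text does not specify them.

```python
import numpy as np

FOREGROUND, BACKGROUND, INTERMEDIATE = 1, 0, 2

def seeds_by_threshold(liver, kidney, other, fg_thresh=0.9, bg_thresh=0.1):
    """Likelihood maps of equal shape. Thresholds are illustrative only."""
    seed = np.full(liver.shape, INTERMEDIATE, dtype=np.uint8)
    is_highest = (liver > kidney) & (liver > other)
    is_lowest = (liver < kidney) & (liver < other)
    # Foreground: liver wins the comparison AND is confidently high.
    seed[is_highest & (liver >= fg_thresh)] = FOREGROUND
    # Background: liver loses the comparison AND is confidently low.
    seed[is_lowest & (liver <= bg_thresh)] = BACKGROUND
    return seed

liver = np.array([0.95, 0.50, 0.20, 0.05])
kidney = np.array([0.03, 0.30, 0.50, 0.50])
other = np.array([0.02, 0.20, 0.30, 0.45])
seed = seeds_by_threshold(liver, kidney, other)
```

The second and third pixels win or lose the comparison but do not meet the thresholds, so they remain intermediate and are left for the graph cut to decide.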
- Otherwise, the pixel is set as an intermediate pixel. The information on the foreground and the background is then given to the data term E_1 in the energy function of the graph cut to perform the segmentation. At that time, pixels other than the foreground and background pixels are set as intermediate pixels and are classified by the graph cut.
- Foreground, background, and intermediate pixels are set according to at least one of the magnitude of the likelihood and the pixel-by-pixel likelihood comparison.
- a sufficiently large energy is set to the corresponding edge (t-link) for each pixel so that the pixel does not change by the graph cut method.
- the likelihood of the intermediate pixel is further added to the energy function of the graph cut.
- <Step S360: Step of extracting the region of interest by the graph cut method>
- the second classification unit 103 extracts the extraction target region from the three-dimensional image data by the three-dimensional segmentation method based on the pixel information acquired in step S350. Then, the second classification unit 103 outputs the extracted information on the extraction target area to the magnetic disk 203 or the external storage device 70 as a three-dimensional image of interest 420 as shown in FIG. 4B.
- the target image 420 is composed of a target cross-sectional image 421a, a target cross-sectional image 421b, and the like.
- With reference to FIG. 4, the difference between the three-dimensional rough extracted image 430 of the liver region, which is one of the outputs of the first classification unit 102, and the three-dimensional image of interest 420, which is the output of the second classification unit 103, will be described.
- FIG. 4D is an enlarged view of the rough extracted images.
- It shows the vicinity of the boundary between the liver region and the right kidney region in the two-dimensional rough extracted image 431b of the liver region and the two-dimensional rough extracted image 441b of the right kidney region.
- The region 432 and the region 442 are regions that both have high likelihoods.
- the area 432 and the area 442 are areas that represent the same area in the three-dimensional image data.
- Among the three three-dimensional rough extracted images output from the first classification unit 102, this region has the highest likelihood for the right kidney (light color) and the next-highest likelihood for the liver.
- this area corresponds to an intermediate pixel. Therefore, the area is classified into either the foreground or the background by the graph cut method. In the graph cut method, the region is easily divided along the nearby edge, so that the region 432 can be removed from the liver region (FIG. 4B).
- When the second classification unit uses the graph cut method, it may be configured to classify into two classes using the first energy (foreground), the second energy (background), and the third energy (intermediate pixel) set in step S350.
- The second classification unit 103 may use any three-dimensional segmentation method that is based on pixel information of at least one of the foreground and the background.
- For example, the region expansion (region growing) method requires the pixel position of the extraction target region to be given. In that case, for example, the positions of the foreground pixels 751 may be given as the pixel positions of the extraction target region.
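A minimal sketch of region growing seeded with foreground pixel positions follows. The growth criterion (density within a tolerance of the mean seed density), the 6-neighbor connectivity, and the name `region_grow` are assumptions chosen for illustration; the text does not prescribe them.

```python
from collections import deque
import numpy as np

def region_grow(volume, seeds, tolerance):
    """Grow a 3-D region from seed voxels.
    volume: (Z, Y, X) density values; seeds: list of (z, y, x) tuples.
    A neighbor joins the region when its density is within `tolerance`
    of the mean seed density (one simple criterion among many)."""
    reference = np.mean([volume[s] for s in seeds])
    region = np.zeros(volume.shape, dtype=bool)
    queue = deque()
    for s in seeds:
        region[s] = True
        queue.append(s)
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in offsets:
            n = (z + dz, y + dy, x + dx)
            inside = all(0 <= n[a] < volume.shape[a] for a in range(3))
            if inside and not region[n] and abs(volume[n] - reference) <= tolerance:
                region[n] = True
                queue.append(n)
    return region

volume = np.zeros((1, 3, 4))
volume[0, :, :2] = 100.0            # bright block standing in for the target organ
foreground_positions = [(0, 1, 0)]  # e.g. positions of foreground pixels
mask = region_grow(volume, foreground_positions, tolerance=10.0)
```

Starting from a single foreground position, the region spreads over the six voxels of the bright block and stops at the density step.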
- The snake method and the level set method require the coordinate data of the contour of the extraction target region to be given as an initial value. In that case, for example, the boundary pixels of the foreground pixels 751 in FIG. 7B may be given as the initial value.
- the contour may be extracted by a classifier that has learned the contour and other than the contour.
- the second classification unit 103 is a three-dimensional segmentation method that classifies each pixel into two classes, but a method that classifies each pixel into three or more classes may be used.
- In this way, the region of interest is roughly extracted from each of the two-dimensional body-axis cross-sectional image data forming the three-dimensional image data by a two-dimensional segmentation method using machine learning.
- A three-dimensional region of interest is then extracted by the three-dimensional segmentation method.
- In the above description, the same three-dimensional image data, namely the original three-dimensional image data, is input to the first classification unit 102 and the second classification unit 103.
- However, three-dimensional image data different from the original three-dimensional image data may be input.
- the acquisition unit 101 may include an arithmetic unit for that purpose, the arithmetic unit may be provided in a unit other than the acquisition unit, or may be composed of a plurality of arithmetic units.
- the calculation unit may be provided separately from the acquisition unit 101.
- When different three-dimensional image data is input, for example, the acquisition unit 101 may apply noise removal, density value normalization, spatial normalization, or resolution conversion to the original three-dimensional image data before it is input to the first classification unit and the second classification unit. These processes may be common to both classification units or may differ for each input to the classification units. In the former case, the first three-dimensional image data and the second three-dimensional image data are the same image; in the latter case, they are different images.
- When the first classification unit 102 performs classification based on machine learning, it may be preferable to reduce the resolution because of constraints such as calculation time, memory capacity, and the image size used during learning. In that case, the resolution of the first three-dimensional image data is lower than the resolution of the second three-dimensional image data. The above-described processing may be executed only for the input to one of the classification units. If the subject is the same, past incidental information may be referred to and three-dimensional image data captured at different times may be used.
- an image obtained by capturing the subject in a certain time phase is input to the first classifier, and an image captured in another time phase is input to the second classifier.
- the first three-dimensional image data and the second three-dimensional image data are configured as three-dimensional image data that are associated with the same subject and have different imaging times.
- The calculation unit or a change unit may additionally apply a process of removing components other than the maximum connected component to the rough extracted image of the region of interest output from the first classification unit 102 or to the three-dimensional image of interest output from the second classification unit 103.
- the maximum connected component refers to the maximum area among the areas having continuity between pixels.
- a small isolated area may be deleted by performing opening processing or closing processing, or processing of deleting areas other than the maximum connected area. By doing so, if there is only one region of interest drawn in the three-dimensional image data, unnecessary regions can be removed, and the precision of extraction of the region of interest improves.
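Keeping only the maximum connected component can be sketched as follows. Face connectivity (4 neighbors in 2-D, 6 neighbors in 3-D) is an assumption made here, as is the function name `keep_largest_component`; the text does not state which connectivity is used.

```python
from collections import deque
import numpy as np

def keep_largest_component(mask):
    """mask: boolean array of any dimension.
    Returns a mask containing only the largest face-connected component."""
    mask = np.asarray(mask, dtype=bool)
    visited = np.zeros(mask.shape, dtype=bool)
    best = []
    for start in zip(*np.nonzero(mask)):
        if visited[start]:
            continue
        # Breadth-first search collects one connected component.
        component, queue = [start], deque([start])
        visited[start] = True
        while queue:
            p = queue.popleft()
            for axis in range(mask.ndim):
                for step in (-1, 1):
                    n = list(p)
                    n[axis] += step
                    n = tuple(n)
                    if 0 <= n[axis] < mask.shape[axis] and mask[n] and not visited[n]:
                        visited[n] = True
                        component.append(n)
                        queue.append(n)
        if len(component) > len(best):
            best = component
    result = np.zeros(mask.shape, dtype=bool)
    for p in best:
        result[p] = True
    return result

mask = np.array([[1, 1, 0, 0],
                 [1, 0, 0, 1],
                 [0, 0, 0, 0]], dtype=bool)
cleaned = keep_largest_component(mask)
```

Here the three connected pixels in the upper left are kept and the isolated pixel on the right is removed.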
- other pre-processing and post-processing may be used in combination.
- the extraction target may be any organ other than the liver as long as it is a region represented on the image, or may be a cyst, a tumor, a nodule, or their contours.
- the teaching data and the extraction target of the first classification unit do not have to be the liver, the right kidney, and the classes other than the liver and the right kidney as in the present embodiment.
- For example, an organ that is close to the extraction target and has a similar CT value, as the right kidney is to the liver, may be a class, or a structure that is not an organ, such as bone, may be a class.
- When the second classifier classifies into liver and other than liver, the number of classes constituting the class group of the first classifier (liver, right kidney, and other than liver and right kidney) is greater than the number of classes constituting the class group of the second classifier.
- the teaching data and the extraction target of the first classification unit 102 may be changed depending on the configuration of the second classification unit 103 and the extraction target. For example, when the level set method or the snake method is used as the second classification unit 103, the boundary pixels of the attention area are taught to the first classification unit and the boundary pixels are roughly extracted from the first classification unit. Then, the second classification unit 103 extracts an image based on this rough extracted image.
- The learning data may be augmented or padded.
- Machine learning learns features from the learning data as described above, and exerts a classification ability based on those features.
- However, the amount of learning data may not be sufficient, in which case classification may not be performed with high accuracy.
- In that case, data augmentation such as density value shift, rotation, and translation, and data padding using a GAN (Generative Adversarial Network) can be considered.
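The three simple augmentations named above can be sketched as follows. This is a simplified illustration: rotation is restricted to multiples of 90 degrees, translation wraps around the image border, and the function name `augment` and its parameters are hypothetical.

```python
import numpy as np

def augment(image, shift=0.0, quarter_turns=0, dy=0, dx=0):
    """image: 2-D array. Rotation is limited to multiples of 90 degrees and
    translation wraps around the border; both are simplifications."""
    out = image + shift                               # density value shift
    out = np.rot90(out, k=quarter_turns)              # rotation
    out = np.roll(out, shift=(dy, dx), axis=(0, 1))   # translation
    return out

image = np.arange(9, dtype=float).reshape(3, 3)
samples = [augment(image, shift=s, quarter_turns=k, dx=d)
           for s in (-10.0, 0.0, 10.0) for k in (0, 1) for d in (0, 1)]
```

When such augmentation is applied to learning data, the same rotation and translation would also have to be applied to the corresponding correct-label image, while the density shift applies to the input image only.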
- the learning data having different shapes and density values may be learned by the same classifier as an image of a general organ or the like, or may be separately learned by a different classifier.
- A separate processing unit may be provided to determine which classifier to use: a classifier that has learned general organs and the like, or a classifier that has learned learning data with different shapes and density values. Alternatively, a hierarchical structure may be used in which different classifiers are applied according to the classification result of a classifier that has learned general organs and the like. For example, when livers with different shapes or density values and normal livers are classified by the same classifier, the correct labels may be different or the same.
- the label may be changed depending on the presence or nature of other classes.
- the class in the case of performing segmentation using a plurality of classifiers is not limited to those including the same organ.
- For example, the liver may be extracted first, and the right kidney may then be extracted based on the liver extraction result.
- the first classifying unit adopts the two-dimensional segmentation method in which each of the two-dimensional image data forming the three-dimensional image data is input.
- However, when the region of interest is extracted from two-dimensional image data, the extraction accuracy may decrease if the region of interest is depicted as a small area on the two-dimensional image data, or if another region around the region of interest has a density value similar to that of the region of interest.
- Further, when a two-dimensional segmentation method based on machine learning is used, the extraction accuracy may decrease when the region of interest has an unusual shape.
- the three-dimensional image data has an advantage that the voxels holding the information of the connecting portion between the images can be used.
- However, when classification is simply performed on the three-dimensional image data, the extraction accuracy may be insufficient if the appearance of the region of interest in the target image differs or the resolution differs. Further, especially for medical image data, the number of images is often insufficient.
- a three-dimensional space area having a predetermined size is used as teaching data for the classification unit.
- the predetermined size refers to a three-dimensional space area composed of two or more pieces of two-dimensional image data out of the three-dimensional image data.
- the two-dimensional image data forming the three-dimensional spatial region having a predetermined size may not be composed of continuous slices.
- teaching data may be obtained by thinning out a predetermined number of two-dimensional slices forming a three-dimensional spatial region having a higher resolution between three-dimensional image data having different resolutions.
- the teaching data having a predetermined size reduces the effort of processing (spatial normalization) for aligning the image size during learning and the image size during classification while retaining the number of learning data.
- Each of the three-dimensional spatial regions of a predetermined size in the three-dimensional image data (examples of the first three-dimensional spatial region and the second three-dimensional spatial region) is input to the first classification unit, and the region of interest is roughly extracted.
- The input size used when training the first classifier and the size of the data input to it are the same. Therefore, for example, a storage device may be included that stores at least one of the number of pixels, the number of voxels, and the number of slices indicating the input size of the images used during learning.
- the size of the input image to the classification unit is determined based on the stored input size at the time of learning. For example, when the image size of the classification target is larger than the input size at the time of learning, the images may be thinned out and used as the input image. On the other hand, when the image size of the classification target is smaller than the input size at the time of learning, a separate interpolation process is performed to obtain the input image.
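Matching the number of slices to the stored input size could look like the sketch below. It handles only the slice axis; the evenly spaced slice selection used for thinning and the linear interpolation are assumptions, as is the name `fit_slice_count`.

```python
import numpy as np

def fit_slice_count(volume, trained_slices):
    """volume: (Z, H, W). Returns a volume with exactly `trained_slices` slices.
    Larger inputs are thinned out; smaller inputs are linearly interpolated
    along the slice axis (the interpolation scheme is an assumption)."""
    z = volume.shape[0]
    if z == trained_slices:
        return volume
    positions = np.linspace(0, z - 1, trained_slices)
    if z > trained_slices:
        # Thinning: keep evenly spaced slices.
        return volume[np.round(positions).astype(int)]
    # Interpolation: blend the two nearest slices.
    lower = np.floor(positions).astype(int)
    upper = np.minimum(lower + 1, z - 1)
    weight = (positions - lower)[:, None, None]
    return (1 - weight) * volume[lower] + weight * volume[upper]

# Hypothetical volume whose slice z has constant density z.
volume = np.arange(5, dtype=float)[:, None, None] * np.ones((1, 2, 2))
thinned = fit_slice_count(volume, 3)            # 5 slices -> 3
interpolated = fit_slice_count(volume[:2], 3)   # 2 slices -> 3
```

The same idea extends to the in-plane pixel counts if the stored input size also fixes the height and width.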
- the three-dimensional rough extracted images are obtained by further stacking or interpolating/integrating these. Similar to the first embodiment, the region to be extracted is extracted by the three-dimensional segmentation method based on the three-dimensional image data and the three-dimensional rough extraction image obtained by the first classification unit. In this embodiment, since the input of the first classifying unit is a three-dimensional space area, 3D-FCN that processes three-dimensional image data is used. The difference from the first embodiment will be described below.
- the process performed by the acquisition unit 101 is the same as that of the acquisition unit 101 in the first embodiment.
- The first classification unit 102 performs three-dimensional processing on each of the three-dimensional spatial regions of a predetermined size that form the three-dimensional image data acquired from the acquisition unit 101, and obtains a rough extracted image for the three-dimensional spatial region corresponding to each class. The first classification unit then stacks these rough extracted images class by class, generates a three-dimensional rough extracted image corresponding to each class, and transmits it to the second classification unit.
- the process performed by the second classification unit 103 is the same as that of the second classification unit 103 in the first embodiment.
- <Step S810: Three-dimensional image data acquisition step>
- the process of step S810 is basically the same as the process of step S310 in the first embodiment, and thus the description thereof will be omitted.
- <Step S820: 3D-FCN classification step>
- The first classification unit 102 divides the three-dimensional image data into a plurality of three-dimensional spatial regions of a predetermined size, and roughly extracts the region of interest from each of the divided three-dimensional spatial regions by three-dimensional processing.
- the three-dimensional space area having a predetermined size is, for example, a group of continuous predetermined number of body-axis sectional images in the three-dimensional image data.
- the result of rough extraction of the attention area by the first classification unit 102 is output as a three-dimensional rough extraction image for the three-dimensional space area corresponding to each class of the first class group.
- the output by the first classification unit 102 may be a two-dimensional rough extracted image.
- The first classification unit 102 is a trained 3D-FCN that classifies each pixel into three classes: liver, right kidney, and other than the liver and right kidney.
- The three-dimensional spatial region of a predetermined size in the three-dimensional image data is an image formed by stacking three consecutive body-axis cross-sectional images (an example of the predetermined size) in the three-dimensional image data.
- the rough extracted image for the three-dimensional spatial area corresponding to each class is a three-dimensional rough extracted image having the same image size as the three-dimensional spatial area having a predetermined size.
- the three-dimensional image data 410 in FIG. 9 indicates the three-dimensional image data 410 acquired in step S810.
- the input of the 3D-FCN in this embodiment is, for example, each of the three-dimensional spatial regions shown by different hatching in FIG. 9A.
- the first three-dimensional space area 911 and the second three-dimensional space area 912 are each composed of three continuous body axis cross-section images.
- the first three-dimensional space area 911 and the second three-dimensional space area 912 do not have overlapping areas.
- the 3D-FCN executes each process as shown in FIG.
- the second three-dimensional rough extraction image 922 is output.
- The first three-dimensional rough extracted image 921 corresponding to the first three-dimensional spatial region 911 and the second three-dimensional rough extracted image 922 corresponding to the second three-dimensional spatial region 912 are stacked as a continuous three-dimensional rough extracted image.
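Dividing the volume into non-overlapping groups of three slices, classifying each group, and stacking the outputs can be sketched as follows. `classify_slab` is a hypothetical stand-in for the trained 3D-FCN, and the sketch assumes the number of slices is a multiple of the group depth.

```python
import numpy as np

def classify_slab(slab, num_classes=3):
    """Hypothetical stand-in for the trained 3D-FCN: returns (D, H, W, K)
    likelihoods with the same spatial size as the input slab."""
    return np.full(slab.shape + (num_classes,), 1.0 / num_classes)

def rough_extract_by_slabs(volume, depth=3):
    """Split (Z, H, W) into non-overlapping slabs of `depth` slices,
    classify each, and stack the outputs back in order.
    Assumes Z is a multiple of `depth`."""
    assert volume.shape[0] % depth == 0
    outputs = [classify_slab(volume[z:z + depth])
               for z in range(0, volume.shape[0], depth)]
    return np.concatenate(outputs, axis=0)  # (Z, H, W, K)

volume = np.zeros((6, 4, 4))
likelihood = rough_extract_by_slabs(volume)
```

With six slices and a depth of three, two slabs are processed, corresponding to the first and second three-dimensional spatial regions in the example above.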
- the three-dimensional space areas obtained by dividing the three-dimensional image data into predetermined sizes do not have overlapping areas, but overlapping areas may exist.
- the output of the first classification unit may be, for example, the result of integrating the overlapping portions of the rough extraction images.
- the integration of the overlapping region may be performed, for example, before the Softmax process, or after the Softmax process by the method described below (in step S840).
- when the integration process is performed before the Softmax process, it is performed, for example, by a Convolution process or a Pooling process for each pixel.
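The difference between integrating an overlapping pixel before and after the Softmax process can be illustrated with a minimal sketch. It assumes that each overlapping three-dimensional space area yields one vector of per-class scores for the pixel, and it uses average or maximum pooling as the pre-Softmax integration; the function names and the choice of pooling are assumptions for illustration and are not taken from the source.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of per-class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def integrate_before_softmax(score_a, score_b, mode="average"):
    """Pool the raw class scores of one overlapping pixel, then apply softmax."""
    if mode == "average":
        pooled = [(a + b) / 2.0 for a, b in zip(score_a, score_b)]
    elif mode == "max":
        pooled = [max(a, b) for a, b in zip(score_a, score_b)]
    else:
        raise ValueError("mode must be 'average' or 'max'")
    return softmax(pooled)


def integrate_after_softmax(score_a, score_b):
    """Apply softmax to each score vector, then average the likelihoods."""
    prob_a, prob_b = softmax(score_a), softmax(score_b)
    return [(a + b) / 2.0 for a, b in zip(prob_a, prob_b)]
```

Both variants return a likelihood vector that sums to one; they generally differ in value because Softmax is nonlinear.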
- Step S830: Step of determining unprocessed three-dimensional space area
- the first classification unit 102 determines whether the three-dimensional image data being processed contains an unprocessed three-dimensional space area. When it is determined that there is an unprocessed three-dimensional space area, the process of step S820 is performed on that area. If it is determined that there is no unprocessed three-dimensional space area, the process proceeds to step S840.
- Step S840: Creation of three-dimensional rough extraction image corresponding to three-dimensional segmentation
- the first classification unit 102 performs at least one of stacking, interpolation, and integration, class by class, on the three-dimensional rough extraction images corresponding to each class of the first class group.
- the first classification unit 102 generates, by this processing, a three-dimensional rough extraction image corresponding to each class for three-dimensional segmentation, and transmits it to the second classification unit 103.
- the first classification unit creates the three-dimensional rough extraction image to be input to the second classification unit 103 from the plurality of classification results that the first classification unit 102 produces for the three-dimensional space areas. The generation of the three-dimensional rough extraction image corresponding to each class can be realized in several variations.
- for the input to the first classification unit 102, for example, there are cases where there is no overlapping area, as shown in FIGS. 9A and 9C, and cases where there is an overlapping area, as shown in FIGS. 9B and 9D. Further, when there is no overlapping area, the first three-dimensional space area 911 and the second three-dimensional space area 912 may be continuous or discontinuous.
- for the output, two cases are conceivable: one in which the three-dimensional rough extraction image is generated based on all the rough extraction images output from the 3D-FCN, and one in which, as shown in FIG. 9C and FIG. 9D, it is generated using, for example, one rough extraction image selected from the rough extraction images corresponding to each input three-dimensional space area. That is, for an input image corresponding to a three-dimensional space area, either a three-dimensional rough extraction image is output, or fewer rough extraction results are output than the number of two-dimensional cross-sectional images forming the three-dimensional space area.
- the number of slices of one selected rough extracted image is not limited as long as it is smaller than the number of slices of the rough extracted image output from the 3D-FCN.
- integration processing may be used to generate one rough extraction image based on all the rough extraction images for the three-dimensional space area.
- a case will be described in which the first three-dimensional space area and the second three-dimensional space area, which are the inputs, are continuous regions without an overlapping region, and the three-dimensional rough extraction image is generated using all the rough extraction images output by the 3D-FCN.
- the three-dimensional rough extraction image is generated by simply stacking all the slices of the first three-dimensional rough extraction image 921 and the second three-dimensional rough extraction image 922, and is then input to the second classification unit 103.
- a three-dimensional rough extracted image is generated using a predetermined number of rough extracted images from the rough extracted images corresponding to each of the three-dimensional spatial regions having a predetermined size classified by the first classification unit 102.
- a method of generating a three-dimensional rough extracted image using each of the single rough extracted images obtained by selecting or integrating the rough extracted images for the input three-dimensional spatial region will be described.
- the center coarse extracted image may be used, or all the rough extracted images corresponding to the three-dimensional space area may be used.
- the average value integration may be performed by taking the average value of the pixel values between the roughly extracted images.
- one coarse extraction image may be generated by performing maximum value integration using the maximum pixel value among all the coarse extraction images corresponding to the three-dimensional spatial region. Further, the average value integration and the maximum value integration do not have to be performed on all the slices forming the roughly extracted image of the three-dimensional spatial area having a predetermined size.
- a plurality of rough extraction images may be integrated, or a plurality of rough extraction images may be subjected to different integration processing. Further, the integration processing may be performed a plurality of times on each of the plurality of roughly extracted images.
- the generated first two-dimensional rough extraction image 925 and second two-dimensional rough extraction image 926 are subjected to at least one of stacking, interpolation processing, and integration processing between the images to generate a three-dimensional rough extraction image.
- a case will be described in which the first three-dimensional spatial region and the second three-dimensional spatial region, which are the inputs to the first classification unit 102, are discontinuous regions with no overlapping region, and the output is either a three-dimensional rough extraction image corresponding to each class or a two-dimensional rough extraction image corresponding to a three-dimensional spatial region having a predetermined size.
- the rough extraction image corresponding to the first three-dimensional spatial region and the rough extraction image corresponding to the second three-dimensional spatial region are each subjected to at least one of stacking, interpolation, and integration processing to obtain a three-dimensional rough extraction image.
- the step of generating a three-dimensional rough extracted image from a two-dimensional rough extracted image corresponding to a three-dimensional spatial region having a predetermined size is the same as above.
- FIG. 9B is composed of a rough extraction image 921 corresponding to a first three-dimensional space area 911 having a predetermined size and a rough extraction image 923 corresponding to a second three-dimensional space area 913 having a predetermined size.
- the rough extracted image 921 and the rough extracted image 923 each have an overlapping area 924.
- average value integration or maximum value integration may be performed on the overlapping area 924.
- at least one of stacking, interpolation, and integration processing may be performed between the respective rough extracted images to form a three-dimensional rough extracted image.
- the first three-dimensional space area 911 and the second three-dimensional space area 913 have overlapping areas.
- the first three-dimensional spatial area 911 corresponds to the two-dimensional rough extracted image 927
- the second three-dimensional spatial area 913 corresponds to the two-dimensional rough extracted image 928.
- One rough extracted image is generated from each of the rough extracted images corresponding to the three-dimensional space area having a predetermined size.
- the number of rough extraction images generated from each of the rough extraction images corresponding to the three-dimensional space area is not limited to one, but may be smaller than the number of slices forming a three-dimensional space area having a predetermined size.
- the generated one rough extracted image may be a rough extracted image corresponding to a three-dimensional space area having a predetermined size that is interpolated, stacked, and integrated.
- the step of generating a three-dimensional rough extracted image from a two-dimensional rough extracted image corresponding to a three-dimensional spatial area having a predetermined size is the same as above.
- the connectivity between the three-dimensional spatial regions can be taken into account by adjusting the number of slices generated from each three-dimensional spatial region having a predetermined size and the number of non-overlapping slices.
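The stacking and overlap integration described for step S840 can be sketched as follows. This is a minimal illustration under stated assumptions: each three-dimensional space area is a list of slices, each slice is a two-dimensional list of likelihoods for one class, and `starts` gives the index of each area's first slice in the full volume. The function name `merge_slabs` and the decision to leave uncovered slices as `None` (interpolation is not shown) are hypothetical choices, not part of the source.

```python
def merge_slabs(slabs, starts, num_slices, mode="average"):
    """Merge per-area rough extraction outputs into one volume along the body axis.

    Slices covered by more than one area are integrated by averaging or by
    taking the maximum; slices covered by a single area are simply stacked.
    """
    if mode not in ("average", "max"):
        raise ValueError("mode must be 'average' or 'max'")

    # Collect every candidate slice that lands on each position of the volume.
    collected = [[] for _ in range(num_slices)]
    for slab, start in zip(slabs, starts):
        for offset, slice_2d in enumerate(slab):
            collected[start + offset].append(slice_2d)

    volume = []
    for candidates in collected:
        if not candidates:
            volume.append(None)  # gap between discontinuous areas
            continue
        rows, cols = len(candidates[0]), len(candidates[0][0])
        merged = []
        for r in range(rows):
            row = []
            for c in range(cols):
                values = [cand[r][c] for cand in candidates]
                if mode == "average":
                    row.append(sum(values) / len(values))
                else:
                    row.append(max(values))
            merged.append(row)
        volume.append(merged)
    return volume
```

With `starts=[0, 3]` and three-slice areas the call reduces to plain stacking (the case of FIG. 9A); with `starts=[0, 2]` the shared slice is integrated (the case of FIG. 9B).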
- Step S850: Pixel information setting step
- the processing of step S850 is basically the same as the processing of step S350 in the first embodiment, so description will be omitted.
- Step S860: Step of extracting region of interest by graph cut method
- the process of step S860 is basically the same as the process of step S360 in the first embodiment, and thus the description thereof will be omitted.
- the three-dimensional image data is divided into a plurality of three-dimensional space areas having a predetermined size, and the region of interest is roughly extracted from each of the three-dimensional space areas.
- the first classification unit 102 takes as input only a three-dimensional space area having a predetermined size in the three-dimensional image data, but other information about the region of interest may be input at the same time.
- here, input refers to both the training data used to train the classifier and the three-dimensional image data input to the first classification unit.
- the extraction result of the region of interest for an adjacent three-dimensional space area may be input at the same time, or an existence probability map of the region of interest estimated by another method, or a bounding box in which the region of interest exists, may be input.
- the first classification unit 102 can additionally use region-of-interest information from other areas and region-of-interest information estimated by other methods, so that the accuracy of its rough extraction result is improved. This in turn improves the accuracy with which the second classification unit 103 extracts the region of interest.
- the additional information is also effective in the second classification unit 103. By reflecting the additional information for the second classification unit in the pixel information to form the seed image, more precise extraction of the target region is expected. This additional information may be given to only the first classification unit, only the second classification unit, or both.
- the present modification is not limited to the second embodiment, but is also effective in the first embodiment.
- the present invention can also be realized by executing the following processing: software (a program) that realizes the functions of the above-described embodiments is supplied to a system or device via a network or various storage media, and the computer (or CPU, MPU, etc.) of the system or device reads and executes the program.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Multimedia (AREA)
- Computing Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Medical Informatics (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Databases & Information Systems (AREA)
- Public Health (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Radiology & Medical Imaging (AREA)
- Epidemiology (AREA)
- Primary Health Care (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Biodiversity & Conservation Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- High Energy & Nuclear Physics (AREA)
- Optics & Photonics (AREA)
- Pathology (AREA)
- Heart & Thoracic Surgery (AREA)
- Surgery (AREA)
- Animal Behavior & Ethology (AREA)
- Veterinary Medicine (AREA)
- Image Analysis (AREA)
Abstract
Description
... is included.
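The image processing apparatus according to the present embodiment roughly extracts (rough extraction) a region of interest from each piece of two-dimensional image data (slice) constituting spatial three-dimensional image data (a three-dimensional tomographic image) that includes an object, using a two-dimensional segmentation method. The object is, for example, a subject under examination. Hereinafter, spatial three-dimensional image data is referred to simply as three-dimensional image data. The two-dimensional segmentation method yields a two-dimensional rough extraction image of the region of interest corresponding to the input two-dimensional image data. The two-dimensional rough extraction images are then stacked, or subjected to interpolation or integration processing, to obtain a three-dimensional rough extraction image. Here, stacking refers to processing that arranges two or more rough extraction images into a continuous image. Integration processing refers to processing that merges the overlapping region between two or more rough extraction images into one. Then, based on the three-dimensional image data and the three-dimensional rough extraction image obtained by the two-dimensional segmentation method, a more accurate region of interest is extracted by a three-dimensional segmentation method. Here, extracting a region means classifying each pixel in the image into one of a predetermined group of classes. The classification only needs to identify the position of the extraction target: it may distinguish whether a pixel is inside an extraction target such as an organ or lesion, or whether it lies on the contour of the extraction target.
The functional configuration of the image processing apparatus according to the present embodiment is described below with reference to FIG. 1. As shown in the figure, the image processing apparatus 100 according to the present embodiment includes an acquisition unit 101, a first classification unit 102, and a second classification unit 103. The image processing system according to the present embodiment also includes a storage device 70 external to the image processing apparatus 100.
FIG. 2 is a diagram showing an example of the hardware configuration of the image processing apparatus 100. A CPU (Central Processing Unit) 201 mainly controls the operation of each component. A main memory 202 stores the control program executed by the CPU 201 and provides a work area when the CPU 201 executes a program. A magnetic disk 203 stores an OS (Operating System), device drivers for peripheral devices, and programs for implementing various application software, including programs for performing the processing described later. The functions (software) of the image processing apparatus 100 shown in FIG. 1 and the processing in the flowcharts described later are realized by the CPU 201 executing the programs stored in the main memory 202, the magnetic disk 203, and the like.
Next, the processing procedure of the image processing apparatus 100 in the present embodiment is described with reference to FIG. 3.
In step S310, the acquisition unit 101 acquires three-dimensional image data from the storage device 70.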
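In step S320, the first classification unit 102 extracts a two-dimensional region of interest from each of the two-dimensional body axis cross-sectional images constituting the three-dimensional image data, using a two-dimensional segmentation method. The two-dimensional region of interest is output as two-dimensional rough extraction images corresponding to each class of the first class group. The two-dimensional rough extraction images corresponding to each class are then stacked class by class to generate three-dimensional rough extraction images corresponding to each class of the first class group, which are transmitted to the second classification unit. In the present embodiment, the first classification unit 102 is a trained 2D-FCN that classifies each pixel into three classes: liver, right kidney, and other than liver and right kidney.
Note that the rough extraction image output by the first classification unit 102 need not express each pixel as a likelihood of belonging to the class, as described above. For example, each pixel value may be binary, or may be expressed by a range of values that differs for each class. The value of each pixel may be the value obtained from the first classification unit used directly, or may be converted numerically by applying a threshold.
In step S330, the first classification unit 102 determines whether the three-dimensional image data contains two-dimensional image data that is to be processed but has not yet been processed. If unprocessed two-dimensional image data exists, the process returns to step S320, where rough extraction of the region of interest is performed. If no unprocessed two-dimensional image data exists, the process proceeds to the next step.
In step S340, the first classification unit 102, acting as three-dimensional data generation means, stacks the two-dimensional rough extraction images corresponding to each class obtained in the processing up to step S330, or applies interpolation or integration processing to them, to generate three-dimensional rough extraction images. The first classification unit 102 then outputs the generated three-dimensional rough extraction images to the second classification unit 103. The three-dimensional rough extraction images correspond to the three-dimensional classification result. Note that the function of the three-dimensional data generation means, which generates the three-dimensional rough extraction images, may instead be performed by the second classification unit 103, or by a computing means other than the first classification unit 102 and the second classification unit 103.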
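The class-by-class stacking of steps S320 to S340 can be sketched as below. The sketch assumes that the two-dimensional classifier returns, for each slice, one likelihood map per class in a fixed order; the function name `stack_by_class` and the class labels passed to it are hypothetical and serve only as an illustration.

```python
def stack_by_class(per_slice_outputs, class_names):
    """Stack per-slice 2D rough extraction images into one 3D volume per class.

    per_slice_outputs[z][k] is the 2D likelihood map of class k for slice z.
    Returns a dict mapping each class name to a list of 2D maps ordered by z.
    """
    volumes = {name: [] for name in class_names}
    for slice_output in per_slice_outputs:
        if len(slice_output) != len(class_names):
            raise ValueError("each slice must provide one map per class")
        for name, class_map in zip(class_names, slice_output):
            volumes[name].append(class_map)
    return volumes
```

For the three classes of this embodiment, a call such as `stack_by_class(outputs, ["liver", "right_kidney", "other"])` would yield three volumes with the same number of slices as the input.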
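The second classification unit 103 sets, based on the three-dimensional rough extraction images, pixel information indicating the positions of pixels corresponding to at least one of the foreground and the background. The setting of pixel information is described with reference to FIG. 7. Here, for the case where the liver region is to be extracted, the way of giving pixel information for the foreground (the liver region) and the background (the region other than the liver) is described. For simplicity, the explanation uses two-dimensional images. FIG. 7A shows rough extraction images of five pixels per side: a rough extraction image 720 of the liver region, a rough extraction image 730 of the right kidney region, and a rough extraction image 740 of the region other than the liver and right kidney. Because the pixel values of the rough extraction images in the present embodiment are likelihoods of belonging to the respective class, the pixel values vary continuously. Therefore, in the present embodiment, the rough extraction images corresponding to the three classes are compared pixel by pixel, and among the likelihood of liver, the likelihood of right kidney, and the likelihood of other than liver and right kidney, a pixel whose liver likelihood is higher than that of the other classes is set as a foreground pixel (a pixel fixed as foreground). Similarly, a pixel whose liver likelihood is lower than that of the other classes is set as a background pixel (a pixel fixed as background). The remaining pixels (for example, pixels where the liver likelihood is the second highest) are set as intermediate pixels (pixels not yet fixed as foreground or background). FIG. 7B shows a seed image 750 obtained by integrating the rough extraction images by the above method. The seed image 750 contains foreground pixels 751, background pixels 752, and intermediate pixels 753. In the present embodiment, the energy function shown in Equation 2 is defined based on the seed image 750. Specifically, the data term E1 sets an energy for each pixel according to a distance value, based on the foreground pixels 751 and background pixels 752 in the seed image 750. The energy is set so that a sufficiently large cost is assigned to the corresponding edge (t-link) of each foreground pixel 751 and background pixel 752 in the graph, so that these pixels remain foreground and background pixels, respectively, after region extraction by the graph cut method. The smoothing term E2 sets an energy based on the difference in density values between adjacent pixels in the three-dimensional image data 410. Defining the energy function in this way classifies each intermediate pixel 753 as either foreground or background. Then, by treating the pixels classified as foreground as the extraction target region and the pixels classified as background as the region other than the extraction target, the second classification unit 103 separates the extraction target region from the rest.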
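The seed image rule described above (foreground where the liver likelihood is the highest, background where it is the lowest, intermediate otherwise) can be written as a short sketch. The label values and the function name `make_seed_image` are hypothetical, and treating ties as intermediate pixels is an assumption, since the source does not state how ties are handled.

```python
FOREGROUND, BACKGROUND, INTERMEDIATE = 1, 0, -1


def make_seed_image(liver, kidney, other):
    """Build a seed image from three same-sized 2D likelihood maps.

    A pixel is foreground when the liver likelihood is strictly the highest,
    background when it is strictly the lowest, and intermediate otherwise.
    """
    seed = []
    for row_l, row_k, row_o in zip(liver, kidney, other):
        seed_row = []
        for l, k, o in zip(row_l, row_k, row_o):
            if l > k and l > o:
                seed_row.append(FOREGROUND)
            elif l < k and l < o:
                seed_row.append(BACKGROUND)
            else:
                seed_row.append(INTERMEDIATE)  # decided later by graph cut
        seed.append(seed_row)
    return seed
```

Only the intermediate pixels are left for the graph cut to decide; the foreground and background labels play the role of fixed seeds.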
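The second classification unit 103 extracts the extraction target region from the three-dimensional image data by a three-dimensional segmentation method based on the pixel information acquired in step S350. The second classification unit 103 then outputs the information on the extracted region to the magnetic disk 203 or the external storage device 70 as a three-dimensional image of interest 420, as in FIG. 4B. The image of interest 420 is composed of a cross-sectional image of interest 421a, a cross-sectional image of interest 421b, and so on.
The second classification unit 103 may use any three-dimensional segmentation method that is based on pixel information for at least one of the foreground and the background. For example, the region growing method, the snake method, or the level set method may be used as other segmentation methods. The region growing method requires the pixel positions of the extraction target region; for example, the positions of the foreground pixels 751 may be given as those pixel positions. The snake method and the level set method require coordinate data of the contour of the extraction target region as an initial value; for example, the boundary pixels of the foreground pixels 751 in FIG. 7B may be given as the initial value. When a contour is to be extracted, it may be extracted by a classifier trained on contour and non-contour. In the present embodiment the second classification unit 103 uses a three-dimensional segmentation method that classifies each pixel into two classes, but a method that classifies each pixel into three or more classes may also be used.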
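As an illustration of how foreground pixel positions could seed the region growing method, the following is a minimal sketch. The acceptance criterion (intensity within a tolerance of the mean seed intensity) and the 6-connectivity are assumptions chosen for brevity; the source only states that the foreground pixel positions are supplied, and the function name `region_growing` is hypothetical.

```python
from collections import deque


def region_growing(volume, seeds, tolerance):
    """Grow a 3D region from seed voxels using 6-connectivity.

    volume[z][y][x] holds intensities; seeds is a list of (z, y, x) tuples.
    A neighbouring voxel joins the region when its intensity differs from
    the mean seed intensity by at most `tolerance`.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    reference = sum(volume[z][y][x] for z, y, x in seeds) / len(seeds)
    region = set(seeds)
    queue = deque(seeds)
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in neighbours:
            nz, ny, nx = z + dz, y + dy, x + dx
            if not (0 <= nz < depth and 0 <= ny < height and 0 <= nx < width):
                continue
            if (nz, ny, nx) in region:
                continue
            if abs(volume[nz][ny][nx] - reference) <= tolerance:
                region.add((nz, ny, nx))
                queue.append((nz, ny, nx))
    return region
```

Voxels that satisfy the intensity criterion but are not connected to any seed stay outside the region, which is what distinguishes this approach from simple thresholding.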
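As described above, the image processing apparatus 100 according to the first embodiment roughly extracts the region of interest from each piece of two-dimensional body axis cross-sectional image data constituting the three-dimensional image data, using a two-dimensional segmentation method based on machine learning. A three-dimensional region of interest is then extracted by a three-dimensional segmentation method based on the three-dimensional image data and the three-dimensional rough extraction image. With this configuration, pixel information on the region of interest and on regions other than the region of interest can be supplied to the three-dimensional segmentation method automatically, reducing the burden on the user of entering pixel information. Furthermore, because an image can be input in which the region of interest lies at roughly the same position as in the training images, the region of interest can be extracted with high accuracy.
In the image processing apparatus 100 according to the first embodiment described above, the same three-dimensional image data as the original three-dimensional image data was input to the first classification unit 102 and the second classification unit 103, but three-dimensional image data different from the original may be input. The acquisition unit 101 may include a computing device for this purpose, the computing device may be provided outside the acquisition unit, or it may be composed of a plurality of computing devices. A computing unit may also be provided separately from the acquisition unit 101. As an example of inputting different three-dimensional image data, the acquisition unit 101 may apply noise removal, density value normalization, spatial normalization, or resolution conversion to the original three-dimensional image data before it is input to the first classification unit and the second classification unit. These processes may be common to both classification units, or may differ for each classification unit's input. In the former case, the first three-dimensional image data and the second three-dimensional image data are the same image; in the latter case, they are different images. For example, when different spatial normalization is applied to the input images of the classification units, a changing unit is needed that normalizes the output of the first classification unit 102 to the input space of the second classification unit 103. With this configuration, three-dimensional image data suited to the characteristics of each classifier can be input, so improved extraction accuracy of the region of interest is expected. When the first classification unit 102 performs classification based on machine learning, it may be preferable to reduce the resolution because of constraints such as computation time, memory capacity, and the image size used in training. In that case, the resolution of the first three-dimensional image data is lower than that of the second three-dimensional image data. The above processing may be performed on the input to only one of the classification units. In addition, if the object is the same subject, past supplementary information may be referenced and three-dimensional image data captured at different times may be used. For example, an image of the subject captured at one time phase is input to the first classifier, and an image captured at another time phase is input to the second classifier. That is, the first three-dimensional image data and the second three-dimensional image data are three-dimensional image data associated with the same subject and captured at different times. With this configuration, extraction accuracy is expected to be higher, for example when different time phases are referenced, than when the region of interest is extracted from the information of a single imaging result alone.
In the image processing apparatus according to the first embodiment, the first classification unit adopted a two-dimensional segmentation method that takes as input each piece of two-dimensional image data constituting the three-dimensional image data. However, when the region of interest is extracted from two-dimensional image data, extraction accuracy may drop if the region of interest appears as a small region in the two-dimensional image data, or if another region near the region of interest has density values similar to it. Also, when a two-dimensional segmentation method based on machine learning is used, extraction accuracy may drop if the region of interest has an unusual shape. In this respect, three-dimensional image data has the advantage that voxels retaining information on the connections between images can be used. However, classification that simply targets three-dimensional image data may have insufficient extraction accuracy when the appearance of the region of interest differs among target images or when the resolution differs. Moreover, particularly for medical image data, the number of images is often insufficient.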
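Because the configuration of the image processing apparatus according to the present embodiment is the same as that of the image processing apparatus 100 according to the first embodiment, its functional configuration is described with reference to FIG. 1, omitting the parts that overlap with the first embodiment.
Next, the processing procedure of the image processing apparatus 100 in the present embodiment is described with reference to FIG. 8.
The processing of step S810 is basically the same as that of step S310 in the first embodiment, so its description is omitted.
In step S820, the first classification unit 102 divides the three-dimensional image data into a plurality of three-dimensional spatial regions having a predetermined size and roughly extracts the region of interest from each divided region by three-dimensional processing. A three-dimensional spatial region having a predetermined size is, for example, a set of a predetermined number of consecutive body axis cross-sectional images in the three-dimensional image data. The result of the rough extraction by the first classification unit 102 is output as a three-dimensional rough extraction image, for the three-dimensional spatial region, corresponding to each class of the first class group. The output of the first classification unit 102 may instead be a two-dimensional rough extraction image.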
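The division into three-dimensional spatial regions of a predetermined size can be sketched as an index computation along the body axis. The slab size of three slices follows the example given for this embodiment; the `stride` parameter, the function name `split_into_slabs`, and the choice to drop trailing slices that do not fill a whole region are assumptions made for this illustration.

```python
def split_into_slabs(num_slices, slab_size=3, stride=3):
    """Return (start, end) slice index pairs, end exclusive, for each region.

    stride == slab_size gives contiguous regions without overlap,
    stride < slab_size gives overlapping regions, and
    stride > slab_size gives discontinuous regions.
    """
    if slab_size <= 0 or stride <= 0:
        raise ValueError("slab_size and stride must be positive")
    slabs = []
    start = 0
    while start + slab_size <= num_slices:
        slabs.append((start, start + slab_size))
        start += stride
    return slabs
```

For a six-slice volume the defaults give two non-overlapping regions, while `stride=2` on a five-slice volume gives two regions that share one slice.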
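In step S830, the first classification unit 102 determines whether the three-dimensional image data being processed contains an unprocessed three-dimensional spatial region. If it is determined that an unprocessed three-dimensional spatial region exists, the processing of step S820 is performed on that region. If it is determined that none exists, the process proceeds to step S840.
The processing of step S850 is basically the same as that of step S350 in the first embodiment, so its description is omitted.
The processing of step S860 is basically the same as that of step S360 in the first embodiment, so its description is omitted.
As described above, the image processing apparatus 100 according to the second embodiment divides the three-dimensional image data into a plurality of three-dimensional spatial regions having a predetermined size and roughly extracts the region of interest from each of them. With this configuration, the first classification unit 102 can take three-dimensional connectivity into account, which improves the accuracy of the rough extraction result. This in turn improves the accuracy with which the second classification unit 103 extracts the region of interest.
In the image processing apparatus 100 according to the second embodiment described above, the first classification unit 102 took as input only a three-dimensional spatial region having a predetermined size in the three-dimensional image data, but other information about the region of interest may be input at the same time. Here, input refers to both the training data used to train the classifier and the three-dimensional image data input to the first classification unit. For example, the extraction result of the region of interest for an adjacent three-dimensional spatial region may be input at the same time, or an existence probability map of the region of interest estimated by another method, or a bounding box containing the region of interest, may be input. With this configuration, the first classification unit 102 can additionally use region-of-interest information from other regions and region-of-interest information estimated by other methods, which improves the accuracy of its rough extraction result. This in turn improves the accuracy with which the second classification unit 103 extracts the region of interest. This additional information is also effective in the second classification unit 103. By reflecting the additional information for the second classification unit in the pixel information to form the seed image, more precise extraction of the target region is expected. The additional information may be given to the first classification unit only, the second classification unit only, or both. This modification is not limited to the second embodiment and is also effective in the first embodiment.
The present invention can also be realized by executing the following processing: software (a program) that realizes the functions of the embodiments described above is supplied to a system or apparatus via a network or various storage media, and a computer (or CPU, MPU, or the like) of the system or apparatus reads and executes the program.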
Claims (29)
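- 1. An image processing apparatus comprising:
first classification means for classifying, by a trained classifier, a plurality of pixels in each of a plurality of pieces of two-dimensional image data constituting first three-dimensional image data including an object, into a first class group; and
second classification means for classifying, based on a classification result of the first classification means, a plurality of pixels in second three-dimensional image data including the object into a second class group including at least one class of the first class group.
- 2. The image processing apparatus according to claim 1, wherein the first classification means further comprises three-dimensional data generation means for generating a three-dimensional classification result by classifying the plurality of pixels in each of the plurality of pieces of two-dimensional image data constituting the first three-dimensional image data into the first class group, and
the second classification means classifies the plurality of pixels of the second three-dimensional image data into the second class group based on the three-dimensional classification result.
- 3. The image processing apparatus according to claim 2, wherein the three-dimensional data generation means generates the three-dimensional classification result by performing at least one of stacking, interpolation, and integration on the classification results of the first classification means for the respective pieces of two-dimensional image data.
- 4. The image processing apparatus according to any one of claims 1 to 3, wherein the two-dimensional image data is body axis cross-sectional image data.
- 5. An image processing apparatus comprising:
first classification means for classifying, by a trained classifier, each of a plurality of voxels corresponding to a first three-dimensional spatial region having a predetermined size in first three-dimensional image data including an object into a first class group, and
classifying each of a plurality of voxels corresponding to a second three-dimensional spatial region having the predetermined size into the first class group; and
second classification means for classifying, based on the classification result of the plurality of voxels corresponding to the first three-dimensional spatial region and the classification result of the plurality of voxels corresponding to the second three-dimensional spatial region, at least one voxel included in second three-dimensional image data including the object into a second class group including at least one class of the first class group.
- 6. The image processing apparatus according to claim 5, wherein a classification result of the first classification means for a region different from the first three-dimensional spatial region or the second three-dimensional spatial region is further given as an input to the first classification means.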
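- 7. The image processing apparatus according to claim 5 or 6, wherein the first three-dimensional spatial region and the second three-dimensional spatial region do not overlap each other.
- 8. The image processing apparatus according to any one of claims 5 to 7, further comprising three-dimensional data generation means for generating a three-dimensional classification result by performing at least one of stacking, interpolation, and integration on the classification result of the plurality of voxels corresponding to the first three-dimensional spatial region and the classification result of the plurality of voxels corresponding to the second three-dimensional spatial region,
wherein the second classification means classifies at least one voxel included in the second three-dimensional image data into the second class group based on the three-dimensional classification result.
- 9. The image processing apparatus according to claim 5 or 6, wherein the first three-dimensional spatial region and the second three-dimensional spatial region have a portion that overlaps each other.
- 10. The image processing apparatus according to claim 9, further comprising three-dimensional data generation means for generating a three-dimensional classification result by integrating the classification results of the first classification means corresponding to the overlapping portion,
wherein the second classification means classifies at least one voxel included in the second three-dimensional image data into the second class group based on the three-dimensional classification result.
- 11. The image processing apparatus according to any one of claims 5 to 10, wherein each of the first three-dimensional spatial region and the second three-dimensional spatial region is a region composed of a plurality of pieces of body axis cross-sectional image data.
- 12. The image processing apparatus according to any one of claims 1 to 11, wherein the first three-dimensional image data and the second three-dimensional image data are medical image data.
- 13. The image processing apparatus according to any one of claims 1 to 12, wherein the first class group includes a class indicating liver and a class indicating kidney.
- 14. The image processing apparatus according to any one of claims 1 to 12, wherein the second class group includes a class indicating liver.
- 15. The image processing apparatus according to any one of claims 1 to 14, wherein the second classification means classifies the plurality of pixels in the second three-dimensional image data into the second class group by a graph cut method.
- 16. The image processing apparatus according to any one of claims 1 to 15, wherein the classification result of the first classification means includes likelihoods of the classes constituting the first class group.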
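- 17. The image processing apparatus according to claim 16, further comprising acquisition means for acquiring, based on the likelihoods, region information indicating a foreground, a background, and an intermediate region corresponding to neither the foreground nor the background,
wherein the second classification means, based on the likelihoods and the region information,
sets a first energy for pixels corresponding to the foreground,
sets a second energy for pixels corresponding to the background,
sets, for pixels corresponding to the intermediate region, a third energy corresponding to the likelihood at each such pixel, and
classifies the plurality of pixels in the second three-dimensional image data into the second class group by a graph cut method using the first energy, the second energy, and the third energy.
- 18. The image processing apparatus according to any one of claims 1 to 17, wherein the first three-dimensional image data and the second three-dimensional image data are the same three-dimensional image data.
- 19. The image processing apparatus according to any one of claims 1 to 17, wherein the first three-dimensional image data and the second three-dimensional image data are three-dimensional image data different from each other.
- 20. The image processing apparatus according to claim 19, wherein the resolution of the first three-dimensional image data is lower than the resolution of the second three-dimensional image data.
- 21. The image processing apparatus according to claim 19 or 20, wherein the first three-dimensional image data and the second three-dimensional image data are three-dimensional image data associated with the same subject and captured at different times.
- 22. The image processing apparatus according to any one of claims 1 to 21, wherein the trained classifier includes at least one of a CNN, an SVM, and k-means.
- 23. The image processing apparatus according to any one of claims 1 to 22, wherein the trained classifier is a CNN having a network structure using an encoder-decoder.
- 24. The image processing apparatus according to claim 1 or 5, wherein the second classification means classifies the plurality of pixels in the second three-dimensional image data into the second class group by at least one of a graph cut method, a level set method, a region growing method, and a snake method.
- 25. An image processing method comprising:
classifying, by a trained classifier, a plurality of pixels in each of a plurality of pieces of two-dimensional image data constituting first three-dimensional image data including an object, into a first class group; and
classifying, based on the result of classification into the first class group, a plurality of pixels in second three-dimensional image data including the object into a second class group including at least one class of the first class group.
- 26. An image processing method comprising:
a step of classifying, by a trained classifier, each of a plurality of voxels corresponding to a first three-dimensional spatial region having a predetermined size in first three-dimensional image data including an object into a first class group;
a first classification step of classifying each of a plurality of voxels corresponding to a second three-dimensional spatial region having the predetermined size into the first class group; and
a second classification step of classifying, based on the classification result of the plurality of voxels corresponding to the first three-dimensional spatial region and the classification result of the plurality of voxels corresponding to the second three-dimensional spatial region, at least one voxel included in second three-dimensional image data including the object into a second class group including at least one class of the first class group.
- 27. A program for causing a computer to execute the image processing method according to claim 25 or 26.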
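- 28. An image processing system comprising:
first classification means for classifying, by a trained classifier, a plurality of pixels in each of a plurality of pieces of two-dimensional image data constituting first three-dimensional image data including an object, into a first class group; and
second classification means for classifying, based on a classification result of the first classification means, a plurality of pixels in second three-dimensional image data including the object into a second class group including at least one class of the first class group.
- 29. An image processing system comprising:
first classification means for classifying, by a trained classifier, each of a plurality of voxels corresponding to a first three-dimensional spatial region having a predetermined size in first three-dimensional image data including an object into a first class group, and
classifying each of a plurality of voxels corresponding to a second three-dimensional spatial region having the predetermined size into the first class group; and
second classification means for classifying, based on the classification result of the plurality of voxels corresponding to the first three-dimensional spatial region and the classification result of the plurality of voxels corresponding to the second three-dimensional spatial region, at least one voxel included in second three-dimensional image data including the object into a second class group including at least one class of the first class group.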
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201980080427.8A CN113164145B (zh) | 2018-12-28 | 2019-12-18 | 图像处理装置、图像处理系统、图像处理方法和存储介质 |
| EP19901802.9A EP3903681B1 (en) | 2018-12-28 | 2019-12-18 | Image processing device, image processing system, image processing method, and program |
| US17/341,140 US12086992B2 (en) | 2018-12-28 | 2021-06-07 | Image processing apparatus, image processing system, image processing method, and storage medium for classifying a plurality of pixels in two-dimensional and three-dimensional image data |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2018-247775 | 2018-12-28 | ||
| JP2018247775 | 2018-12-28 | ||
| JP2019-183345 | 2019-10-03 | ||
| JP2019183345A JP6716765B1 (ja) | 2018-12-28 | 2019-10-03 | 画像処理装置、画像処理システム、画像処理方法、プログラム |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US17/341,140 Continuation US12086992B2 (en) | 2018-12-28 | 2021-06-07 | Image processing apparatus, image processing system, image processing method, and storage medium for classifying a plurality of pixels in two-dimensional and three-dimensional image data |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2020137745A1 true WO2020137745A1 (ja) | 2020-07-02 |
Family
ID=71131569
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2019/049623 Ceased WO2020137745A1 (ja) | 2018-12-28 | 2019-12-18 | 画像処理装置、画像処理システム、画像処理方法、プログラム |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US12086992B2 (ja) |
| EP (1) | EP3903681B1 (ja) |
| JP (1) | JP6716765B1 (ja) |
| CN (1) | CN113164145B (ja) |
| WO (1) | WO2020137745A1 (ja) |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2022074841A (ja) * | 2020-11-05 | 2022-05-18 | 国立大学法人 東京大学 | 医用データ処理装置及び医用データ処理方法 |
| JP2023112511A (ja) * | 2022-02-01 | 2023-08-14 | 国立大学法人東京農工大学 | 医用画像処理装置、学習装置、医用画像処理方法、学習方法およびプログラム |
| JP2024018636A (ja) * | 2022-07-29 | 2024-02-08 | キヤノンメディカルシステムズ株式会社 | 医用画像処理装置、医用画像処理方法、及び、医用画像処理プログラム |
Families Citing this family (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP7410619B2 (ja) * | 2019-10-31 | 2024-01-10 | キヤノン株式会社 | 画像処理装置、画像処理方法およびプログラム |
| JP7486349B2 (ja) * | 2020-05-28 | 2024-05-17 | キヤノン株式会社 | ニューラルネットワーク、ニューラルネットワークの学習方法、プログラム、画像処理装置 |
| CN111951276B (zh) * | 2020-07-28 | 2025-03-28 | 上海联影智能医疗科技有限公司 | 图像分割方法、装置、计算机设备和存储介质 |
| WO2022025282A1 (ja) * | 2020-07-31 | 2022-02-03 | TechMagic株式会社 | 学習制御システム |
| JP7697794B2 (ja) * | 2021-01-29 | 2025-06-24 | 富士フイルム株式会社 | 情報処理装置、情報処理方法及び情報処理プログラム |
| CN115707429A (zh) | 2021-08-20 | 2023-02-21 | 佳能医疗系统株式会社 | 磁共振成像用脏器分割装置及其方法、磁共振成像装置 |
| JP2023081573A (ja) * | 2021-12-01 | 2023-06-13 | キヤノンメディカルシステムズ株式会社 | 医用画像処理装置、医用画像処理方法及び医用画像処理プログラム |
| EP4202825A1 (en) | 2021-12-21 | 2023-06-28 | Koninklijke Philips N.V. | Network architecture for 3d image processing |
| CN114662621B (zh) * | 2022-05-24 | 2022-09-06 | 灵枭科技(武汉)有限公司 | 基于机器学习的农机作业面积计算方法及系统 |
| KR102809592B1 (ko) * | 2023-08-29 | 2025-05-21 | 주식회사 온코소프트 | 의료 영상 분할 방법 및 이를 수행하는 장치 |
Citations (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPS647467B2 (ja) * | 1979-10-03 | 1989-02-08 | Ei Teii Ando Teii Tekunorojiizu Inc | |
| JP2007209761A (ja) * | 2006-02-11 | 2007-08-23 | General Electric Co <Ge> | 3次元画像中の構造を扱うシステム、方法および機器 |
| JP2009211138A (ja) * | 2008-02-29 | 2009-09-17 | Fujifilm Corp | 対象領域抽出方法および装置ならびにプログラム |
| JP2013206262A (ja) * | 2012-03-29 | 2013-10-07 | Kddi Corp | 複数の被写体領域を分離する方法およびプログラム |
| WO2014033792A1 (ja) * | 2012-08-31 | 2014-03-06 | 株式会社島津製作所 | 放射線断層画像生成装置、放射線断層撮影装置および放射線断層画像生成方法 |
| JP2014132392A (ja) * | 2013-01-04 | 2014-07-17 | Fujitsu Ltd | 画像処理装置、画像処理方法及びプログラム |
| JP2014137744A (ja) * | 2013-01-17 | 2014-07-28 | Fujifilm Corp | 領域分割装置、プログラムおよび方法 |
| JP2016522951A (ja) * | 2014-05-05 | 2016-08-04 | シャオミ・インコーポレイテッド | 画像分割方法、画像分割装置、プログラム、及び記録媒体 |
| JP2017189337A (ja) * | 2016-04-13 | 2017-10-19 | 富士フイルム株式会社 | 画像位置合わせ装置および方法並びにプログラム |
| JP2019183345A (ja) | 2018-04-13 | 2019-10-24 | ユニ・チャーム株式会社 | 使い捨てマスク |
Family Cites Families (27)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7729537B2 (en) * | 2005-08-01 | 2010-06-01 | Siemens Medical Solutions Usa, Inc. | Editing of presegemented images/volumes with the multilabel random walker or graph cut segmentations |
| DE102010028382A1 (de) * | 2010-04-29 | 2011-11-03 | Siemens Aktiengesellschaft | Verfahren und Computersystem zur Bearbeitung tomographischer Bilddaten aus einer Röntgen-CT-Untersuchung eines Untersuchungsobjektes |
| US9122950B2 (en) * | 2013-03-01 | 2015-09-01 | Impac Medical Systems, Inc. | Method and apparatus for learning-enhanced atlas-based auto-segmentation |
| US9721340B2 (en) * | 2013-08-13 | 2017-08-01 | H. Lee Moffitt Cancer Center And Research Institute, Inc. | Systems, methods and devices for analyzing quantitative information obtained from radiological images |
| US9830700B2 (en) * | 2014-02-18 | 2017-11-28 | Judy Yee | Enhanced computed-tomography colonography |
| US11238975B2 (en) * | 2014-04-02 | 2022-02-01 | University Of Louisville Research Foundation, Inc. | Computer aided diagnosis system for classifying kidneys |
| US9928347B2 (en) * | 2014-04-02 | 2018-03-27 | University Of Louisville Research Foundation, Inc. | Computer aided diagnostic system for classifying kidneys |
| US9373059B1 (en) * | 2014-05-05 | 2016-06-21 | Atomwise Inc. | Systems and methods for applying a convolutional network to spatial data |
| US9959486B2 (en) * | 2014-10-20 | 2018-05-01 | Siemens Healthcare Gmbh | Voxel-level machine learning with or without cloud-based support in medical imaging |
| US9990712B2 (en) * | 2015-04-08 | 2018-06-05 | Algotec Systems Ltd. | Organ detection and segmentation |
| WO2016175773A1 (en) * | 2015-04-29 | 2016-11-03 | Siemens Aktiengesellschaft | Method and system for semantic segmentation in laparoscopic and endoscopic 2d/2.5d image data |
| KR20170009601A (ko) * | 2015-07-17 | 2017-01-25 | 삼성전자주식회사 | 단층 촬영 장치 및 그에 따른 단층 영상 처리 방법 |
| US9801601B2 (en) * | 2015-12-29 | 2017-10-31 | Laboratoires Bodycad Inc. | Method and system for performing multi-bone segmentation in imaging data |
| US10482633B2 (en) * | 2016-09-12 | 2019-11-19 | Zebra Medical Vision Ltd. | Systems and methods for automated detection of an indication of malignancy in a mammographic image |
| US10417788B2 (en) * | 2016-09-21 | 2019-09-17 | Realize, Inc. | Anomaly detection in volumetric medical images using sequential convolutional and recurrent neural networks |
| GB2554435B (en) * | 2016-09-27 | 2019-10-23 | Univ Leicester | Image processing |
| US10452813B2 (en) * | 2016-11-17 | 2019-10-22 | Terarecon, Inc. | Medical image identification and interpretation |
| JP6657132B2 (ja) * | 2017-02-27 | 2020-03-04 | 富士フイルム株式会社 | 画像分類装置、方法およびプログラム |
| US10311312B2 (en) * | 2017-08-31 | 2019-06-04 | TuSimple | System and method for vehicle occlusion detection |
| EP3392832A1 (en) * | 2017-04-21 | 2018-10-24 | General Electric Company | Automated organ risk segmentation machine learning methods and systems |
| US10706535B2 (en) * | 2017-09-08 | 2020-07-07 | International Business Machines Corporation | Tissue staining quality determination |
| JP6407467B1 (ja) * | 2018-05-21 | 2018-10-17 | 株式会社Gauss | 画像処理装置、画像処理方法およびプログラム |
| WO2020172435A1 (en) * | 2019-02-20 | 2020-08-27 | The Regents Of The University Of California | System and method for tissue classification using quantitative image analysis of serial scans |
| JP2020170408A (ja) * | 2019-04-04 | 2020-10-15 | キヤノン株式会社 | 画像処理装置、画像処理方法、プログラム |
| JP7410619B2 (ja) * | 2019-10-31 | 2024-01-10 | キヤノン株式会社 | 画像処理装置、画像処理方法およびプログラム |
| EP3836078A1 (en) * | 2019-12-10 | 2021-06-16 | Koninklijke Philips N.V. | Medical image segmentation and atlas image selection |
| CA3267573A1 (en) * | 2022-09-13 | 2024-03-21 | Sloan-Kettering Institute For Cancer Research | FAST MOTION-RESOLVING MRI IMAGE RECONSTRUCTION USING K-SPACE DATA-COHERENCE-FREE SPACE-TIME-SPIRAL CONVOLUTIONAL NETWORKS (MOVIENET) |
2019
- 2019-10-03: JP application JP2019183345A, patent JP6716765B1 (ja), Active
- 2019-12-18: EP application EP19901802.9A, patent EP3903681B1 (en), Active
- 2019-12-18: CN application CN201980080427.8A, patent CN113164145B (zh), Active
- 2019-12-18: WO application PCT/JP2019/049623, publication WO2020137745A1 (ja), Ceased
2021
- 2021-06-07: US application US17/341,140, patent US12086992B2 (en), Active
Patent Citations (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPS647467B2 (ja) * | 1979-10-03 | 1989-02-08 | AT&T Technologies Inc | |
| JP2007209761A (ja) * | 2006-02-11 | 2007-08-23 | General Electric Co <Ge> | System, method, and apparatus for handling structures in three-dimensional images |
| JP2009211138A (ja) * | 2008-02-29 | 2009-09-17 | Fujifilm Corp | Target region extraction method, apparatus, and program |
| JP2013206262A (ja) * | 2012-03-29 | 2013-10-07 | Kddi Corp | Method and program for separating multiple subject regions |
| WO2014033792A1 (ja) * | 2012-08-31 | 2014-03-06 | 株式会社島津製作所 | Radiation tomographic image generation apparatus, radiation tomography apparatus, and radiation tomographic image generation method |
| JP2014132392A (ja) * | 2013-01-04 | 2014-07-17 | Fujitsu Ltd | Image processing apparatus, image processing method, and program |
| JP2014137744A (ja) * | 2013-01-17 | 2014-07-28 | Fujifilm Corp | Region segmentation apparatus, program, and method |
| JP2016522951A (ja) * | 2014-05-05 | 2016-08-04 | シャオミ・インコーポレイテッド | Image segmentation method, image segmentation apparatus, program, and recording medium |
| JP2017189337A (ja) * | 2016-04-13 | 2017-10-19 | 富士フイルム株式会社 | Image registration apparatus, method, and program |
| JP2019183345A (ja) | 2018-04-13 | 2019-10-24 | ユニ・チャーム株式会社 | Disposable mask |
Non-Patent Citations (3)
| Title |
|---|
| FUKUDA, KEITA ET AL.: "Automatic Segmentation of Object Region Using Graph Cuts Based on Saliency Maps and AdaBoost", THE 13TH IEEE INTERNATIONAL SYMPOSIUM ON CONSUMER ELECTRONICS (ISCE2009), 2009, pages 36-37, XP031484489 * |
| J. LONG ET AL.: "Fully Convolutional Networks for Semantic Segmentation", PROCEEDINGS OF THE IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2015, pages 3431-3440, XP055573743, DOI: 10.1109/CVPR.2015.7298965 |
| Y. BOYKOV, G. FUNKA-LEA: "Graph Cuts and Efficient N-D Image Segmentation", INTERNATIONAL JOURNAL OF COMPUTER VISION, vol. 70, no. 2, 2006, pages 109-131 |
Cited By (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
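| JP2022074841A (ja) * | 2020-11-05 | 2022-05-18 | 国立大学法人 東京大学 | Medical data processing apparatus and medical data processing method |
| JP7544567B2 (ja) | 2020-11-05 | 2024-09-03 | 国立大学法人 東京大学 | Medical data processing apparatus and medical data processing method |
| JP2023112511A (ja) * | 2022-02-01 | 2023-08-14 | 国立大学法人東京農工大学 | Medical image processing apparatus, learning apparatus, medical image processing method, learning method, and program |
| JP7832619B2 (ja) | 2022-02-01 | 2026-03-18 | 国立大学法人東京農工大学 | Medical image processing apparatus, learning apparatus, medical image processing method, learning method, and program |
| JP2024018636A (ja) * | 2022-07-29 | 2024-02-08 | キヤノンメディカルシステムズ株式会社 | Medical image processing apparatus, medical image processing method, and medical image processing program |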
Also Published As
| Publication number | Publication date |
|---|---|
| EP3903681A4 (en) | 2022-11-09 |
| EP3903681A1 (en) | 2021-11-03 |
| CN113164145A (zh) | 2021-07-23 |
| EP3903681B1 (en) | 2026-04-15 |
| JP6716765B1 (ja) | 2020-07-01 |
| JP2020109614A (ja) | 2020-07-16 |
| US12086992B2 (en) | 2024-09-10 |
| CN113164145B (zh) | 2025-03-14 |
| US20210295523A1 (en) | 2021-09-23 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP6716765B1 (ja) | | Image processing apparatus, image processing system, image processing method, and program |
| US12361543B2 (en) | | Automated detection of tumors based on image processing |
| Lessmann et al. | | Iterative fully convolutional neural networks for automatic vertebra segmentation and identification |
| Xu et al. | | Efficient multiple organ localization in CT image using 3D region proposal network |
| CN108022242B (zh) | | System for processing image analysis posed in a cost function minimization framework |
| JP7410619B2 (ja) | | Image processing apparatus, image processing method, and program |
| US11379985B2 (en) | | System and computer-implemented method for segmenting an image |
| Enokiya et al. | | Automatic liver segmentation using U-Net with Wasserstein GANs |
| CN112348908A (zh) | | Shape-based generative adversarial network for segmentation in medical imaging |
| Yan et al. | | A propagation-DNN: Deep combination learning of multi-level features for MR prostate segmentation |
| EP3188127A1 (en) | | Method and system for performing bone multi-segmentation in imaging data |
| CN111798424B (zh) | | Medical image-based nodule detection method, apparatus, and electronic device |
| JP7486349B2 (ja) | | Neural network, neural network training method, program, and image processing apparatus |
| JP6717049B2 (ja) | | Image analysis apparatus, image analysis method, and program |
| Nithila et al. | | Lung cancer diagnosis from CT images using CAD system: a review |
| CN117710317B (zh) | | Training method for detection model and detection method |
| Fernández et al. | | Exploring automatic liver tumor segmentation using deep learning |
| Yurchenko et al. | | Detection and classification of objects in three-dimensional images using deep learning methods |
| CN118898284B (zh) | | Medical image-based model training method, apparatus, and storage medium |
| El Joudi et al. | | U-Net-Based Fetal Head Circumference Segmentation With Synthetic-Driven Generation Data Augmentation |
| Jaffar et al. | | GA-SVM based lungs nodule detection and classification |
| Lumburovska | | Attention-enhanced 3D convolutional neural network for automated lung nodule malignancy assessment |
| Pour | | Development of Medical Image/Video Segmentation via Deep Learning Models |
| Hatt et al. | | A Convolutional Neural Network Approach to Automated Lung Bounding Box Estimation from Computed Tomography Scans |
| JP2026052582A (ja) | | Image processing apparatus, image processing method, and program |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 19901802; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2019901802; Country of ref document: EP; Effective date: 20210728 |
| | WWG | WIPO information: grant in national office | Ref document number: 201980080427.8; Country of ref document: CN |
| | WWG | WIPO information: grant in national office | Ref document number: 2019901802; Country of ref document: EP |

