WO2020202505A1 - Image processing apparatus, image processing method and non-transitoty computer readable medium - Google Patents
- Publication number
- WO2020202505A1 (PCT/JP2019/014832)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image processing
- change
- processing apparatus
- feature
- loss
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/248—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/254—Analysis of motion involving subtraction of images
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/2431—Multiple classes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/62—Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20076—Probabilistic image processing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20088—Trinocular vision calculations; trifocal tensor
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
Definitions
- The present disclosure relates to an image processing apparatus, an image processing method and an image processing program.
- Change detection is a widely researched topic in remote sensing and is considered an important preliminary analysis before any advanced analysis such as object recognition. Given a pair of images, it aims to infer the changes that have occurred between the two images over a period of time. With the advent of very high resolution sensors, it has become possible to capture changes due to small objects such as cars, humans and containers. Change detection of such small objects is of interest because it helps in effective monitoring of crowded and dynamic areas. Synthetic Aperture Radar (SAR) is an ideal source for monitoring such areas because of its ability to capture images even in bad weather and without sunlight.
- Neural networks can automatically extract features of an object that are robust to changes in orientation and noise.
- One type of neural network, called a siamese network, is well suited for the task of change detection because it can receive a pair of images as input, extract features from them, and then output a change class for each pixel.
- A related art that employs the siamese network for change detection is disclosed in PL 1 and shown in Fig. 11.
- The network involves three main steps: feature extraction, feature merging and classification. First, each branch (feature extractor unit) receives an image as input and extracts features. Second, the features are merged in the feature merger unit through concatenation to obtain a merged feature representation. Third, a classifier is trained with the extracted features and assigns each pixel a probability of belonging to a change class.
- A loss is computed between the predicted change class and the true change class, and this loss is back-propagated to the feature extraction step and the classification step until the network converges to a state in which the loss cannot be reduced further. In this state, the network is considered trained and can be used in operation.
- NPL 1: Francesca Bovolo, Carlo Marin, and Lorenzo Bruzzone. "A hierarchical approach to change detection in very high resolution SAR images for surveillance applications." IEEE Transactions on Geoscience and Remote Sensing 51.4 (2013): 2042-2054.
- Although the neural network disclosed in PL 1 can extract robust features for different objects automatically, it cannot detect the changes of the target object with high accuracy. For example, if a pair of images contains multiple objects such as cars, humans and an asphalt road, and the user is interested only in changes caused by the movement of cars, the related art cannot distinguish those changes from changes due to humans or asphalt road conditions.
- The network learns features of all the objects simultaneously. Even though the network is trained with change labels of only the target object, the SAR images are so noisy and so few in number that it becomes difficult for the network to differentiate between relevant and irrelevant features based solely on the change labels. As a result, the related art cannot perform well in the change detection task for the target object.
- The present invention has been made to solve the above-mentioned problems, and its objective is to provide an image processing apparatus, an image processing method and an image processing program capable of appropriately detecting changes of a target object.
- One aspect of the present disclosure is an image processing apparatus including: an object-driven feature extractor means to extract relevant features of a target object from input images; a feature merger means to merge the features extracted from the input images into a merged feature; a change classifier means to predict a probability of each change class based on the merged feature; an object classifier means to predict a probability of each object class based on the extracted features of each image; a multi-loss calculator means to calculate a combined loss from a change classification loss and an object classification loss; and a parameter updater means to update parameters of the object-driven feature extractor.
- Another aspect is an image processing method including: extracting object-driven features of a target object from input images; merging the features extracted from the input images into a merged feature; predicting a probability of each change class based on the merged feature; predicting a probability of each object class based on the extracted features of each image; calculating a combined loss from a change classification loss and an object classification loss; and updating parameters for extracting the object-driven features.
- Another aspect is a non-transitory computer readable medium storing an image processing program for causing a computer to execute an image processing method, the image processing method including: extracting object-driven features of a target object from input images; merging the features extracted from the input images into a merged feature; predicting a probability of each change class based on the merged feature; predicting a probability of each object class based on the extracted features of each image; calculating a combined loss from a change classification loss and an object classification loss; and updating parameters for extracting the object-driven features.
- According to the present disclosure, it is possible to provide an image processing apparatus, an image processing method and an image processing program capable of appropriately classifying the changes of the target object in two or more SAR images with high accuracy.
- Fig. 1 is a depiction showing the problem formulation of change detection.
- Fig. 2 is a block diagram showing a configuration example of an image processing apparatus according to the first embodiment in a training mode.
- Fig. 3 is a flowchart showing an example of an operation performed by the image processing apparatus according to the first embodiment in the training mode.
- Fig. 4 is a block diagram showing a configuration example of an image processing apparatus according to the first embodiment in an operational mode.
- Fig. 5 is a flowchart showing an example of an operation performed by the image processing apparatus according to the first embodiment in the operational mode.
- Fig. 6 is a block diagram showing a configuration example of an image processing apparatus according to the second embodiment.
- Fig. 7 is a flowchart showing an example of an operation performed by the image processing apparatus according to the second embodiment.
- Fig. 8 is a block diagram showing a configuration example of an image processing apparatus according to the third embodiment.
- Fig. 9 is a flowchart showing an example of an operation performed by the image processing apparatus according to the third embodiment.
- Fig. 10 is a depiction showing exemplary configurations of the object-driven feature extractor units.
- Fig. 11 is a block diagram showing a method described in PL 1.
- A change detection problem will be explained with reference to Fig. 1.
- The objective of change detection is to generate a change map representing the changes of the target objects that have occurred between the acquisition dates of the two images. It is to be noted that the present disclosure is not limited to binary change detection and also includes multiple change detection.
- A configuration example of an image processing apparatus in accordance with the first embodiment of the present disclosure will be explained with reference to the block diagrams shown in Fig. 2 and Fig. 4.
- The image processing apparatus in accordance with the first embodiment works in two modes: a training mode (image processing apparatus 1A) and an operational mode (image processing apparatus 1B).
- The image processing apparatus 1A can include an object-driven feature extractor unit 10A for image I 1 , an object-driven feature extractor unit 11A for image I 2 , a feature merger unit 12, a change classifier unit 13A, an object classifier unit 14 for image I 1 , an object classifier unit 15 for image I 2 , a multi-loss calculator unit 16, a parameter updater unit 17 and a storage unit 18.
- The image processing apparatus 1B can include the trained object-driven feature extractor unit 10B for image I 1 , the trained object-driven feature extractor unit 11B for image I 2 , the storage unit 18, the feature merger unit 12, a trained change classifier unit 13B and a thresholder unit 19.
- The image processing apparatus in accordance with the first embodiment can include an object-driven feature extractor unit 10 for image I 1 , an object-driven feature extractor unit 11 for image I 2 , an object classifier unit 14 for image I 1 , an object classifier unit 15 for image I 2 and a multi-loss calculator unit 16.
- The object-driven feature extractor units 10 and 11 can extract features specific to a target object from the images I 1 and I 2 , respectively.
- The object classifier units 14 and 15 can classify the pixels in the images I 1 and I 2 , respectively, into two classes: object or no-object.
- The multi-loss calculator unit 16 can calculate a combined loss function from a change classification loss and an object classification loss. Next, the functioning of these units along with the other units will be explained in detail.
- A pair of multi-temporal images I 1 and I 2 is input to train the object-driven feature extractor units 10A and 11A, respectively.
- A general way to input an image is to first divide the image into overlapping or non-overlapping patches, and then input those patches into the respective feature extractor units.
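- The patch division described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `extract_patches`, the square patch shape and the `stride` parameter are assumptions introduced here. A stride equal to the patch size gives non-overlapping patches, and a smaller stride gives overlapping ones.

```python
def extract_patches(image, patch_size, stride):
    """Split a 2-D image (a list of rows) into square patches.

    Hypothetical helper: stride == patch_size yields non-overlapping
    patches, stride < patch_size yields overlapping patches.
    Returns a list of ((top, left), patch) tuples.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    patches = []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(((top, left), patch))
    return patches


# Toy 4x4 image whose pixel values are simply 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
non_overlapping = extract_patches(image, patch_size=2, stride=2)
overlapping = extract_patches(image, patch_size=2, stride=1)
```

- In this sketch, the same patch positions would be used for both images of the pair so that each patch of I 1 is compared with the co-located patch of I 2 .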
- The feature extractor unit may be a series of neural-network layers which automatically extract features from the input image patches through non-linear operations. The Rectified Linear Unit (ReLU) is one promising non-linear operation used in a neural-network based feature extractor. Since there are two feature extractor units as shown in Fig. 1 (one for each image), a few exemplary configurations of the feature extractor units are shown in Fig. 10.
- One configuration example is called a siamese network, in which the feature extractor units have the same architecture and share the same weights, which means that the feature extractor units extract the features from the two patches using the same approach.
- This configuration is suitable if the input images are homogeneous, for example either both SAR images or both optical images.
- Another configuration example is called a pseudo-siamese network, which is similar to the siamese network configuration except that the weights are not shared.
- This configuration is suitable if the input images are non-homogeneous, for example one is a SAR image while the other is an optical image.
- Still another configuration example is called a 2-channel network, in which the two input patches are treated as a two-channel input and fed directly into the network.
- The present disclosure is not limited to any one configuration, and all the configurations are equally acceptable.
- The network architecture shown in Fig. 10 is merely an example, and the number and types of neural-network layers will depend on the object of interest.
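- The difference between the three configurations can be illustrated with a toy feature extractor. This is only a sketch under stated assumptions: a single linear layer followed by ReLU stands in for the series of layers, patches are flattened to lists, and all weight values and names are hypothetical.

```python
def relu(values):
    """Rectified Linear Unit applied element-wise."""
    return [max(0.0, v) for v in values]


def extract_features(patch, weights):
    """One linear layer followed by ReLU; `patch` is a flat list of pixels."""
    return relu([sum(w * x for w, x in zip(row, patch)) for row in weights])


# Hypothetical weights, chosen only to make the arithmetic easy to follow.
shared_weights = [[0.5, -0.5, 0.0, 0.0], [0.0, 0.0, 1.0, -1.0]]
other_weights = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]

patch_1 = [1.0, 3.0, 2.0, 0.0]  # flattened patch from image I1
patch_2 = [4.0, 0.0, 1.0, 1.0]  # flattened patch from image I2

# Siamese: same architecture and the same weights in both branches.
siamese_f1 = extract_features(patch_1, shared_weights)
siamese_f2 = extract_features(patch_2, shared_weights)

# Pseudo-siamese: same architecture, but each branch has its own weights.
pseudo_f1 = extract_features(patch_1, shared_weights)
pseudo_f2 = extract_features(patch_2, other_weights)

# 2-channel: both patches form one input to a single network
# (here approximated by concatenating the two flattened patches).
two_channel_weights = [[-0.5, 0.5, 0.0, 0.0, 0.5, -0.5, 0.0, 0.0]]
two_channel_f = extract_features(patch_1 + patch_2, two_channel_weights)
```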
- The object-driven feature extractor units 10A and 11A output feature vectors f 1 and f 2 for each pair of input patches.
- The feature merger unit 12 receives the feature vectors f 1 and f 2 as input, and outputs a combined feature vector f c for each pair of input patches.
- A few examples of ways to combine the features are explained next.
- One example is concatenation, in which the feature vectors are concatenated to form a combined feature vector.
- Another example is differencing, in which the feature vectors are subtracted element-wise and the obtained difference vector is the combined feature vector.
- Still another example is to compute an L1-distance between the feature vectors, and the obtained distance vector is the combined feature vector.
- Still another example is to compute an element-wise product of the feature vectors, and the obtained product vector is the combined feature vector. Note that the present disclosure is not limited to the above examples and other methods of feature merging can also be used.
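- The four merging examples above can be sketched as follows. The function names are hypothetical, and the L1-distance variant is interpreted here as the element-wise absolute difference, which is an assumption since the text only says that a distance vector is obtained.

```python
def merge_concatenate(f1, f2):
    """Concatenation: the combined vector has len(f1) + len(f2) elements."""
    return list(f1) + list(f2)


def merge_difference(f1, f2):
    """Differencing: element-wise subtraction f1 - f2."""
    return [a - b for a, b in zip(f1, f2)]


def merge_l1_distance(f1, f2):
    """L1-distance vector, taken here as the element-wise |f1 - f2|."""
    return [abs(a - b) for a, b in zip(f1, f2)]


def merge_elementwise_product(f1, f2):
    """Element-wise product of the two feature vectors."""
    return [a * b for a, b in zip(f1, f2)]


f1 = [1.0, 4.0, 2.0]
f2 = [3.0, 1.0, 2.0]
fc_concat = merge_concatenate(f1, f2)
fc_diff = merge_difference(f1, f2)
fc_l1 = merge_l1_distance(f1, f2)
fc_prod = merge_elementwise_product(f1, f2)
```

- Concatenation keeps all the information from both branches and doubles the vector length, whereas the other three keep the original length and encode the relation between the two patches directly.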
- The change classifier unit 13A can be any kind of classifier, either neural-network based or non-neural-network based.
- Cross-entropy loss is merely an exemplary loss, and other loss functions such as Kullback-Leibler divergence, contrastive loss, hinge loss and mean-squared error can also be used to compute the classification errors.
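- As an illustration of the exemplary cross-entropy loss, the following sketch computes the classification error for predicted class probabilities. The function names and the small `eps` guard against log(0) are assumptions introduced here, not taken from the disclosure.

```python
import math


def cross_entropy(predicted_probs, true_class, eps=1e-12):
    """Negative log-probability assigned to the true class of one sample."""
    return -math.log(max(predicted_probs[true_class], eps))


def mean_cross_entropy(batch_probs, batch_labels):
    """Average cross-entropy over a batch of patches or pixels."""
    losses = [cross_entropy(p, y) for p, y in zip(batch_probs, batch_labels)]
    return sum(losses) / len(losses)


# Two samples with classes [no-change, change]; true labels are 1 and 0.
batch_probs = [[0.2, 0.8], [0.5, 0.5]]
batch_labels = [1, 0]
change_loss = mean_cross_entropy(batch_probs, batch_labels)
```

- The same function could be applied to the object classifier outputs (object or no-object) to obtain the object classification loss.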
- The parameter updater unit 17 receives the loss E from the multi-loss calculator unit 16 and updates the parameters of the object-driven feature extractor units 10A and 11A so that the loss is minimized.
- The parameter updater unit 17 also updates the parameters of the change classifier unit 13A and the object classifier units 14 and 15 so that the loss is minimized.
- The minimization of the loss can be performed by an optimization algorithm such as gradient descent. The minimization is repeated until the loss converges to a state in which it cannot be reduced further. At this stage, the loss has converged and the feature extractor units 10A and 11A are trained.
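- The idea of repeating gradient descent until the loss stops decreasing can be sketched on a toy problem. Everything here is a stand-in: the quadratic `toy_loss`, the learning rate and the convergence tolerance are assumptions, and a real network would obtain its gradients by back-propagation.

```python
def gradient_descent(loss_fn, grad_fn, params, learning_rate=0.1,
                     tolerance=1e-10, max_steps=10_000):
    """Update params along the negative gradient until the loss converges."""
    previous_loss = loss_fn(params)
    for step in range(1, max_steps + 1):
        gradients = grad_fn(params)
        params = [p - learning_rate * g for p, g in zip(params, gradients)]
        current_loss = loss_fn(params)
        # Convergence check: the loss no longer decreases noticeably.
        if abs(previous_loss - current_loss) < tolerance:
            return params, current_loss, step
        previous_loss = current_loss
    return params, previous_loss, max_steps


def toy_loss(params):
    """Stand-in loss with its minimum at params == [3.0, -1.0]."""
    return (params[0] - 3.0) ** 2 + (params[1] + 1.0) ** 2


def toy_grad(params):
    """Analytic gradient of toy_loss."""
    return [2.0 * (params[0] - 3.0), 2.0 * (params[1] + 1.0)]


trained_params, final_loss, steps = gradient_descent(
    toy_loss, toy_grad, [0.0, 0.0])
```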
- The parameter updater unit 17 stores the parameters of the trained object-driven feature extractor units in the storage unit 18.
- The trained object-driven feature extractor units are denoted as 10B and 11B, as shown in Fig. 4.
- If the change classifier unit 13A is neural-network based, its parameters are also stored in the storage unit 18 after the loss has converged.
- The trained change classifier unit is denoted as 13B, as shown in Fig. 4.
- If the object classifier units 14 and 15 are neural-network based, their parameters are also stored in the storage unit 18 after the loss has converged.
- The image processing apparatus 1A receives a pair of multi-temporal SAR images as input (steps S101 and S102). Next, the image processing apparatus 1A extracts object-driven features from the first SAR image using the object-driven feature extractor unit 10A (step S103). Simultaneously, the image processing apparatus 1A extracts object-driven features from the second SAR image using the other feature extractor unit 11A (step S104). Next, the image processing apparatus 1A merges the features extracted by the two feature extractor units 10A and 11A using the feature merger unit 12 (step S105). Next, the image processing apparatus 1A estimates a change class probability in the image pair based on the merged features using the change classifier unit 13A (step S106).
- The image processing apparatus 1A estimates the object class probability in the first image based on the object-driven features of the first image using the object classifier unit 14 (step S107). Similarly, the image processing apparatus 1A estimates the object class probability in the second image based on the object-driven features of the second image using the object classifier unit 15 (step S108). Next, the image processing apparatus 1A calculates a multi-loss from a change classification loss and an object classification loss using the multi-loss calculator unit 16 (step S109).
- The change classification loss is calculated as a classification error between the true change class and the estimated change class.
- The object classification loss is calculated as a classification error between the true object class and the estimated object class.
- The image processing apparatus 1A updates the parameters of the feature extractor units 10A and 11A, the change classifier unit 13A and the object classifier units 14 and 15 using the parameter updater unit 17 so that the loss is minimized (step S110).
- The image processing apparatus 1A determines whether or not the loss has converged (step S111). When the image processing apparatus 1A determines that the loss has not yet converged (NO at step S111), it returns to the steps S103 and S104 and performs them again simultaneously. Then, the image processing apparatus 1A performs the processes in the steps S105 to S110 again.
- When the image processing apparatus 1A determines that the loss has converged (YES at step S111), it stores the trained feature extractor parameters, the trained change classifier parameters and the trained object classifier parameters in the storage unit 18 (step S112).
- The trained object-driven feature extractor units 10B and 11B receive a new pair of multi-temporal images (which has never been used in the training mode) as input, together with the parameters from the storage unit 18. Each trained feature extractor unit outputs robust and relevant feature vectors, f 1 and f 2 , for each patch pair of the input images.
- The feature merger unit 12 combines the feature vectors and outputs a combined feature vector f c .
- The trained change classifier unit 13B receives the combined feature vector f c and the parameters from the storage unit 18 as input, and outputs a probability of belonging to a change or no-change class for the patch pair.
- The thresholder unit 19 receives the probability value as input and automatically determines a threshold value.
- Examples of methods for automatically determining the threshold value include Expectation-Maximization and Markov Random Field. If the probability value is greater than the threshold value, the pixels in the patch are assigned a change class; otherwise they are assigned a no-change class. Finally, the decisions for all the patches are combined to generate a change map in which each pixel belongs to either the change or the no-change class. Note that the present disclosure is not limited to only two change classes and can be used for multiple change classes. The change map can represent binary or multiple changes depending on the application.
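- The thresholding and map assembly can be sketched as follows. For brevity the threshold is passed in as a fixed value, which is an assumption; the disclosure determines it automatically, for example by Expectation-Maximization or a Markov Random Field. The function name, the dictionary keyed by patch position and the non-overlapping patch layout are likewise hypothetical.

```python
def build_change_map(patch_probabilities, patch_size, height, width, threshold):
    """Assign change (1) or no-change (0) to every pixel of every patch.

    patch_probabilities maps the (top, left) corner of each patch to the
    change probability predicted for that patch.
    """
    change_map = [[0] * width for _ in range(height)]
    for (top, left), probability in patch_probabilities.items():
        label = 1 if probability > threshold else 0
        for row in range(top, min(top + patch_size, height)):
            for col in range(left, min(left + patch_size, width)):
                change_map[row][col] = label
    return change_map


# Four non-overlapping 2x2 patches covering a 4x4 image.
patch_probabilities = {(0, 0): 0.9, (0, 2): 0.2, (2, 0): 0.4, (2, 2): 0.7}
change_map = build_change_map(patch_probabilities, patch_size=2,
                              height=4, width=4, threshold=0.5)
```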
- The image processing apparatus 1B receives a new pair of multi-temporal SAR images as input (steps S201 and S202).
- The image processing apparatus 1B extracts object-driven features from the first SAR image using the trained object-driven feature extractor unit 10B, which reads the trained parameters from the storage unit 18 (step S203).
- The image processing apparatus 1B extracts object-driven features from the second SAR image using the trained object-driven feature extractor unit 11B, which reads the trained parameters from the storage unit 18 (step S204).
- The image processing apparatus 1B merges the features extracted by the two trained feature extractor units 10B and 11B using the feature merger unit 12 (step S205).
- The image processing apparatus 1B estimates the change class probability using the trained change classifier unit 13B, which reads the trained parameters from the storage unit 18 (step S206).
- The image processing apparatus 1B thresholds the probability values using the thresholder unit 19, which automatically determines a threshold value, to output a change map (step S207).
- The image processing apparatus (1A and 1B) in accordance with the first embodiment of the present disclosure can perform change detection using the object-driven feature extractor units 10 and 11, the object classifier units 14 and 15, and the multi-loss calculator unit 16.
- The present disclosure can learn two tasks simultaneously: the change detection task and the object classification task.
- The loss calculated by the multi-loss calculator unit 16 as a weighted combination of the change classification loss and the object classification loss focuses the attention of the feature extractor units on learning features specific to the target object.
- As a result, the object-driven feature extractor units 10 and 11 can distinguish between relevant and irrelevant features, and a better change detection system is obtained.
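- The weighted combination of losses mentioned above can be written as a short sketch. The weight names and default values, and the choice to sum the two object classification losses before weighting, are assumptions introduced for illustration; the disclosure only states that the combined loss is a weighted combination of the change classification loss and the object classification loss.

```python
def combined_loss(change_loss, object_loss_1, object_loss_2,
                  change_weight=1.0, object_weight=0.5):
    """Weighted multi-loss E from the change loss and the object losses.

    object_loss_1 and object_loss_2 are the object classification losses
    for images I1 and I2; the weights are hypothetical hyperparameters.
    """
    object_loss = object_loss_1 + object_loss_2
    return change_weight * change_loss + object_weight * object_loss


# Example: change loss 0.4, object losses 0.2 (I1) and 0.6 (I2).
loss_e = combined_loss(0.4, 0.2, 0.6)
```

- Setting `object_weight` to zero in this sketch would reduce the training signal to the change labels alone, as in the related art.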
- The image processing apparatus 2 in accordance with the second embodiment can include a trained object-driven feature extractor unit 10B for image I 1 , a trained object-driven feature extractor unit 11B for image I 2 , a feature merger unit 12, a trained change classifier unit 13B, a trained object classifier unit 21 for image I 1 , a trained object classifier unit 22 for image I 2 , a storage unit 18 and a thresholder unit 19.
- The image processing apparatus 2 in accordance with the second embodiment can include a trained object classifier unit 21 for image I 1 and a trained object classifier unit 22 for image I 2 .
- The trained object-driven feature extractor units 10B and 11B output robust and relevant features of the target object from each image, respectively, using the parameters from the storage unit 18.
- The trained object classifier unit 21 receives as input the feature vector f 1 of each patch of the image I 1 from the feature extractor unit 10B and the parameters from the storage unit 18, and outputs a probability of belonging to an object or a no-object class.
- The trained object classifier unit 22 receives as input the feature vector f 2 of each patch of the image I 2 from the feature extractor unit 11B and the parameters from the storage unit 18, and outputs a probability of belonging to an object or a no-object class.
- The probability values of each patch can be either thresholded or used directly.
- The probability values of all the patches of an image are combined to output a classification map in which each pixel belongs to either an object or a no-object class.
- Steps S301, S302, S303, S304, S305, S306 and S309 in Fig. 7 are similar to the steps S201, S202, S203, S204, S205, S206 and S207 in Fig. 5, and therefore their explanations are omitted.
- The image processing apparatus 2 in accordance with the second embodiment can also estimate the object class probabilities in the first image using the trained object classifier unit 21, which reads the trained parameters from the storage unit 18 (step S307). Simultaneously, the image processing apparatus 2 can estimate the object class probabilities in the second image using the trained object classifier unit 22, which reads the trained parameters from the storage unit 18 (step S308).
- The class probabilities can be either thresholded or used directly to output object classification maps of the respective images.
- The image processing apparatus 2 in accordance with the second embodiment of the present disclosure can provide a classification map as an additional output along with the change map. Since the features learnt by the object-driven feature extractor units are optimized for the multiple tasks of change detection and object classification, they are generic and can be used for object classification without re-training with additional data. Thus, the present disclosure can be extended to advanced analysis tasks such as object classification in SAR images.
- The image processing apparatus 3 in accordance with the third embodiment can include a trained object-driven feature extractor unit 10B for image I 1 , a trained object-driven feature extractor unit 11B for image I 2 , a feature merger unit 12, a trained change classifier unit 13B, an image processor unit 31 and a storage unit 18.
- The image processing apparatus 3 in accordance with the third embodiment replaces the thresholder unit 19 with an image processor unit 31.
- The image processor unit 31 receives the probability values from the trained change classifier unit 13B as input and outputs an image-processed change map, such as a density map, a distance map or a colorization map, by applying an image processing operator to the probability values.
- The type of the map depends on the application of the change detection system.
- Steps S401, S402, S403, S404, S405 and S406 in Fig. 9 are similar to the steps S201, S202, S203, S204, S205 and S206 in Fig. 5, and therefore their explanations are omitted.
- After obtaining the class probabilities from the trained change classifier unit 13B (step S406), the image processing apparatus 3 applies an image processing operation, such as a distance estimator or a density estimator, to the class probabilities using the image processor unit 31 to output an image-processed change map (step S407).
- The image processing apparatus 3 in accordance with the third embodiment of the present disclosure can provide different types of outputs by post-processing the probability values estimated by the trained change classifier unit 13B.
- These alternative outputs can provide additional information about the target object, depending on the application. For example, if the user wants to know the amount of change instead of only distinguishing change from no-change, a density map can be output after the post-processing. The density map highlights the amount of change: a low density value implies a small change and a high density value implies a large change.
- In this way, the change detection system can provide more detail about the changes of the target object and can be used for many applications.
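- One possible density estimator is sketched below. The disclosure does not specify the operator, so the local averaging window used here, the function name `density_map` and the `radius` parameter are all assumptions chosen only to illustrate how per-pixel change probabilities could be turned into a density value.

```python
def density_map(probabilities, radius=1):
    """Average the change probabilities in a square window around each pixel.

    Windows are clipped at the image border, so border pixels average
    over fewer neighbours.
    """
    height = len(probabilities)
    width = len(probabilities[0]) if height else 0
    result = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            window = [probabilities[rr][cc]
                      for rr in range(max(0, r - radius),
                                      min(height, r + radius + 1))
                      for cc in range(max(0, c - radius),
                                      min(width, c + radius + 1))]
            result[r][c] = sum(window) / len(window)
    return result


# Change probabilities concentrated in the lower-right corner.
probs = [[0.0, 0.0, 0.0],
         [0.0, 1.0, 1.0],
         [0.0, 1.0, 1.0]]
density = density_map(probs)
```

- In this toy example the density is highest in the lower-right corner, where all neighbouring pixels have a high change probability, and lowest in the upper-left corner.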
- Although the present disclosure is described as a hardware configuration in the above-described embodiments, the present disclosure is not limited to hardware configurations.
- The present disclosure can also be implemented by having a processor such as a CPU (Central Processing Unit) included in the image processing apparatus execute a computer program for performing each process in each of the above-described functions.
- The program can be stored in various types of non-transitory computer readable media and thereby supplied to computers.
- The non-transitory computer readable media include various types of tangible storage media.
- Examples of the non-transitory computer readable media include a magnetic recording medium (such as a flexible disk, a magnetic tape, and a hard disk drive), a magneto-optic recording medium (such as a magneto-optic disk), a CD-ROM (Read Only Memory), a CD-R, a CD-R/W, a DVD (Digital Versatile Disc), a BD (Blu-ray (registered trademark) Disc), and a semiconductor memory (such as a mask ROM, a PROM (Programmable ROM), an EPROM (Erasable PROM), a flash ROM, and a RAM (Random Access Memory)).
- the program can be supplied to computers by using various types of transitory computer readable media.
- Examples of the transitory computer readable media can include an electrical signal, an optical signal, and an electromagnetic wave.
- the transitory computer readable media can be used to supply programs to a computer through a wired communication path such as an electrical wire or an optical fiber, or through a wireless communication path.
- An image processing apparatus for a training method of change detection comprising: an object-driven feature extractor means to extract relevant features of target object from input images; a feature merger means to merge the features extracted from the input images into a merged feature; a change classifier means to predict a probability of each change class based on the merged feature; an object classifier means to predict a probability of each object class based on the extracted features of each image; a multi-loss calculator means to calculate a combined loss from a change classification loss and an object classification loss; and a parameter updater means to update parameters of the object-driven feature extractor means.
- the image processing apparatus for change detection method comprising, an object-driven feature extractor means to extract relevant features of target object from input images; a feature merger means to merge the features extracted from the input images into a merged feature; and a change classifier means to predict a probability of each change class based on the merged features, wherein the object-driven feature extractor means and the change classifier means use parameters trained using the training method according to any one of note 1 to note 6.
- the image processing apparatus according to note 12 wherein the neural-network based method is a siamese network, pseudo-siamese network or 2-channel network.
- the change classifier means uses a Decision Tree, Support Vector Machine, Neural Network, Gradient Boosting Machine, or an ensemble thereof.
- the object classifier means is a Decision Tree, Support Vector Machine, Neural Network, Gradient Boosting Machine, or an ensemble thereof.
- a non-transitory computer readable medium storing an image processing program for causing a computer to execute an image processing method, the image processing method comprising: extracting object-driven features of target object from input images; merging the features extracted from the input images into a merged feature; predicting a probability of each change class based on the merged feature; predicting a probability of each object class based on the extracted features of each image; calculating a combined loss from a change classification loss and an object classification loss; and updating parameters for extracting the object-driven feature.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
An object is to provide an image processing apparatus capable of appropriately detecting changes of a target object. An image processing apparatus (1A) may include: object-driven feature extractor means (10A, 11A) to extract relevant features of target object from input images; a feature merger means (12) to merge the features extracted from the input images into a merged feature; a change classifier means (13A) to predict a probability of each change class based on the merged feature; an object classifier means (14, 15) to predict a probability of each object class based on the extracted features of each image; a multi-loss calculator means (16) to calculate a combined loss from a change classification loss and an object classification loss; and a parameter updater means (17) to update the parameters of the object-driven feature extractor means.
Description
The present disclosure relates to an image processing apparatus, image processing method and image processing program.
Change detection is a widely researched topic in remote sensing and is considered an important preliminary analysis before any advanced analysis such as object recognition. Given a pair of images, it aims to infer changes which have occurred between the pair of images over a period of time. With the advent of very high resolution sensors, it has become possible to capture changes due to small objects such as cars, humans and containers. Change detection of such small objects is of interest because it helps in effective monitoring of crowded and dynamic areas. Synthetic Aperture Radar (SAR) is an ideal source for monitoring such areas because of its ability to capture images even in bad weather and in the absence of sunlight.
Traditional methods of change detection employ a pixel-to-pixel based difference between images in which each pixel of the first image is compared to the corresponding pixel of the second image. These methods, however, do not work well in very high resolution SAR images because the pixel is sensitive to SAR artifacts (shadow, layover and speckle noise) and may show a change even if there is no semantic meaning of that change. To tackle this, feature-to-feature based difference has been proposed where the features of the target object are manually modelled using domain knowledge. Such a method is disclosed in NPL1. A filter to extract features is applied directly to the images, and the two results are compared to detect the changes due to the object. However, the method has less industrial applicability because the manual features require domain knowledge and are not robust to changes in object orientation and noise.
Neural networks can automatically extract features of an object that are robust to changes in orientation and noise. One type of neural network, called a siamese network, is well suited for the task of change detection because it can receive a pair of images as input, extract features and then output a change class for each pixel. A related art employing the siamese network for change detection is disclosed in PL 1 and shown in Fig. 11. The network includes three main steps: feature extraction, feature merging and classification. First, each branch (feature extractor unit) receives the input of an image and extracts features. Second, the features are merged in the feature merger unit through concatenation to obtain a merged feature representation. Third, a classifier is trained with the extracted features and assigns each pixel a probability of belonging to a change class. While training the network, a loss is computed between the predicted change class and the true change class, and this loss is back-propagated to the feature extraction step and the classification step until the network converges to a state in which the loss cannot be reduced further. At this state, the network is considered trained and can be used in operation.
PL 1: CN108573276A
NPL 1: Francesca Bovolo, Carlo Marin, and Lorenzo Bruzzone. "A hierarchical approach to change detection in very high resolution SAR images for surveillance applications." IEEE Transactions on Geoscience and Remote Sensing 51.4 (2013): 2042-2054.
Although the neural network disclosed in PL 1 can extract robust features for different objects automatically, it cannot detect the changes of the target object with high accuracy. For example, if a pair of images contains multiple objects such as cars, humans and an asphalt road, and the user is interested only in changes caused by the movement of cars, the related art cannot distinguish those changes from changes due to humans or asphalt road conditions.
This is because in the feature extraction process of the related art, the network learns features of all the objects simultaneously. Even though the network is trained with change labels of only the target object, the SAR images are so noisy and so few in number that it becomes difficult for the network to differentiate between relevant and irrelevant features solely based on the change labels. As a result, the related art cannot perform well in the change detection task for the target object.
The present invention has been made to solve the above mentioned problems and the objective thereof is to provide an image processing apparatus, image processing method and image processing program capable of appropriately detecting changes of a target object.
In the first example aspect, an image processing apparatus including:
an object-driven feature extractor means to extract relevant features of target object from input images;
a feature merger means to merge the features extracted from the input images into a merged feature;
a change classifier means to predict a probability of each change class based on the merged feature;
an object classifier means to predict a probability of each object class based on the extracted features of each image;
a multi-loss calculator means to calculate a combined loss from a change classification loss and an object classification loss; and
a parameter updater means to update parameters of the object-driven feature extractor.
In the second example aspect, an image processing method including:
extracting object-driven features of target object from input images;
merging the features extracted from the input images into a merged feature;
predicting a probability of each change class based on the merged feature;
predicting a probability of each object class based on the extracted features of each image;
calculating a combined loss from a change classification loss and an object classification loss; and
updating parameters for extracting the object-driven feature.
In a third example aspect, a non-transitory computer readable medium storing an image processing program for causing a computer to execute an image processing method, the image processing method including:
extracting object-driven features of target object from input images;
merging the features extracted from the input images into a merged feature;
predicting a probability of each change class based on the merged feature;
predicting a probability of each object class based on the extracted features of each image;
calculating a combined loss from a change classification loss and an object classification loss; and
updating parameters for extracting the object-driven feature.
According to the present disclosure, it is possible to provide an image-processing apparatus, an image processing method and an image processing program capable of appropriately classifying the changes of the target object in two or more SAR images with high accuracy.
Embodiments of the present disclosure are explained in detail with reference to the drawings. The same components are denoted by the same symbols throughout the drawings, and duplicated explanations are omitted as necessary for clarifying the explanations.
Prior to explaining embodiments, a change detection problem will be explained with reference to Fig. 1. Given two multi-temporal SAR images I1 and I2 of the same area, as shown in Fig. 1, the objective of change detection is to generate a change map representing changes of the target objects that have occurred between the acquisition dates of the two images. It is to be noted that the present disclosure is not limited to binary change detection and also includes multiple change detection.
A configuration example of an image processing apparatus in accordance with the first embodiment of the present disclosure will be explained with reference to block diagrams shown in Fig. 2 and Fig. 4. The image processing apparatus in accordance with the first embodiment works in two modes - a training mode (image processing apparatus 1A) and an operational mode (image processing apparatus 1B).
In the training mode as shown in Fig. 2, the image processing apparatus 1A can include an object-driven feature extractor unit 10A for image I1, an object-driven feature extractor unit 11A for image I2, a feature merger unit 12, a change classifier unit 13A, an object classifier unit 14 for image I1, an object classifier unit 15 for image I2, a multi-loss calculator unit 16, a parameter updater unit 17 and a storage unit 18.
In the operational mode as shown in Fig. 4, the image processing apparatus 1B can include the trained object-driven feature extractor unit 10B for image I1, the trained object-driven feature extractor unit 11B for image I2, the storage unit 18, the feature merger unit 12, a trained classifier unit 13B and a thresholder unit 19.
As compared to the related art shown in Fig. 11, the image processing apparatus in accordance with the first embodiment can include an object-driven feature extractor unit 10 for image I1, an object-driven feature extractor unit 11 for image I2, an object classifier unit 14 for image I1, an object classifier unit 15 for image I2 and a multi-loss calculator unit 16. Object-driven feature extractor units 10 and 11 can extract features specific to a target object from the images I1 and I2, respectively. Object classifier units 14 and 15 can classify the pixels in the images I1 and I2, respectively, into two classes, object or no-object. Multi-loss calculator unit 16 can calculate a combined loss function from a change classification loss and an object classification loss. Next, the functioning of these units along with the other units will be explained in detail.
First, the training mode will be explained with reference to Fig. 2. A pair of multi-temporal images I1 and I2 are input to train the object-driven feature extractor units 10A and 11A, respectively. A general way to input an image is to first divide the image into overlapping or non-overlapping patches, and then input those patches into the respective feature extractor units. The feature extractor unit may be a series of neural-network layers which automatically extract features from the input image patches through non-linear operations. The Rectified Linear Unit (ReLU) is one promising non-linear operation used in a neural-network based feature extractor. Since there are two feature extractor units as shown in Fig. 2 (one for each image), a few exemplary configurations of the feature extractor units are shown in Fig. 10. One example configuration is called a siamese network, in which the feature extractor units have the same architecture and share the same weights, which means the feature extractor units extract the features from the two patches using the same approach. This configuration is suitable if the input images are homogeneous, for example either both SAR images or both optical images. Another configuration example is called a pseudo-siamese network, which is similar to the siamese network configuration except that the weights are not shared. This configuration is suitable if the input images are non-homogeneous, for example one is a SAR image while the other is an optical image. Still another configuration example is called a 2-channel network, in which the two input patches are considered as a two-channel input and directly fed into the network. The present disclosure is not limited to any one configuration and all the configurations are equally acceptable. Note that the network architecture shown in Fig. 10 is merely an example and the number and types of neural-network layers will depend on the object of interest.
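The patch input and the three branch configurations described above can be illustrated with a small NumPy sketch. It is only an illustration under stated assumptions: the function names `extract_patches` and `extract_features`, the single linear layer followed by ReLU, the random weights, and the 8x8 stand-in images are all hypothetical and are not taken from the disclosure or from Fig. 10.

```python
import numpy as np

def extract_patches(image, patch_size, stride):
    """Split a 2-D image into square patches.

    stride == patch_size gives non-overlapping patches;
    stride < patch_size gives overlapping patches.
    Border pixels that do not fill a whole patch are dropped (a simplification).
    """
    height, width = image.shape
    patches = []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return np.stack(patches)

def extract_features(patch, weights):
    """Toy one-layer extractor (assumption): linear map followed by ReLU."""
    return np.maximum(weights @ patch.ravel(), 0.0)

rng = np.random.default_rng(0)
image_1 = rng.random((8, 8))  # stand-in for image I1
image_2 = rng.random((8, 8))  # stand-in for image I2

patches_1 = extract_patches(image_1, patch_size=4, stride=4)    # non-overlapping
patches_2 = extract_patches(image_2, patch_size=4, stride=4)
overlapping = extract_patches(image_1, patch_size=4, stride=2)  # overlapping

feature_dim = 5
shared_weights = rng.standard_normal((feature_dim, 16))

# Siamese: both branches use the same architecture and the same weights.
f1_siamese = extract_features(patches_1[0], shared_weights)
f2_siamese = extract_features(patches_2[0], shared_weights)

# Pseudo-siamese: same architecture, but the second branch has its own weights.
branch_2_weights = rng.standard_normal((feature_dim, 16))
f2_pseudo = extract_features(patches_2[0], branch_2_weights)

# 2-channel: the patch pair is stacked into one two-channel input for a single network.
two_channel_input = np.stack([patches_1[0], patches_2[0]])  # shape (2, 4, 4)
two_channel_weights = rng.standard_normal((feature_dim, 32))
f_two_channel = extract_features(two_channel_input, two_channel_weights)
```

In a real system each branch would be a deeper trained network; the sketch only shows where weights are shared and where they are not.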
The object-driven feature extractor units 10A and 11A output feature vectors, f1 and f2, for each pair of input patches.
The feature merger unit 12 receives the input of the feature vectors f1 and f2, and outputs a combined feature vector fc for each pair of the input patches. A few examples of ways to combine the features are explained next. One example is concatenation, in which the feature vectors are concatenated to form a combined feature vector. Another example is differencing, wherein the feature vectors are subtracted element-wise and the obtained differential vector is the combined feature vector. Still another example is to compute an L1-distance between the feature vectors, and the obtained distance vector is the combined feature vector. Still another example is to compute an element-wise product of the feature vectors, and the obtained product vector is the combined feature vector. Note that the present disclosure is not limited to the above examples and other methods of feature merging can also be used.
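The four merging options listed above can be sketched in NumPy as follows. The function name `merge_features` and its `method` strings are hypothetical, and the L1-distance option is interpreted here as the element-wise absolute difference (so that the result is still a vector), which is an assumption about the intended meaning.

```python
import numpy as np

def merge_features(f1, f2, method="concatenate"):
    """Combine two feature vectors f1 and f2 into one merged vector fc."""
    if method == "concatenate":
        return np.concatenate([f1, f2])  # length doubles
    if method == "difference":
        return f1 - f2                   # signed element-wise difference
    if method == "l1":
        return np.abs(f1 - f2)           # element-wise absolute difference (assumed reading)
    if method == "product":
        return f1 * f2                   # element-wise product
    raise ValueError(f"unknown merge method: {method}")

f1 = np.array([1.0, 2.0, 3.0])
f2 = np.array([3.0, 2.0, 1.0])

fc_concat = merge_features(f1, f2, "concatenate")  # [1, 2, 3, 3, 2, 1]
fc_diff = merge_features(f1, f2, "difference")     # [-2, 0, 2]
fc_l1 = merge_features(f1, f2, "l1")               # [2, 0, 2]
fc_prod = merge_features(f1, f2, "product")        # [3, 4, 3]
```

Concatenation keeps all information from both branches at the cost of a longer vector, while the other three options keep the original length and emphasize how the two patches differ or agree.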
It is to be noted that the present disclosure is not limited to binary change detection and the same method can be applied for multiple change detection by those skilled in the art. The change classifier unit 13A can be any kind of classifier, either neural-network based or non-neural-network based.
Note that the cross-entropy loss is merely an exemplary loss and other loss functions such as Kullback-Leibler divergence, contrastive loss, hinge loss and mean-squared error can also be used to compute the classification errors.
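As a rough illustration, a combined loss built from cross-entropy terms might look like the sketch below. The description characterizes the multi-loss as a weighted combination of the change classification loss and the object classification loss; the specific form used here (one weight per task, with the two object losses summed), the weight values, and the names `cross_entropy` and `multi_loss` are assumptions made for illustration only.

```python
import numpy as np

def cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    probs = np.clip(probs, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(labels * np.log(probs) + (1 - labels) * np.log(1 - probs)))

def multi_loss(change_probs, change_labels,
               object1_probs, object1_labels,
               object2_probs, object2_labels,
               change_weight=1.0, object_weight=0.5):
    """Weighted sum of change loss and object losses (weights are hypothetical)."""
    change_loss = cross_entropy(change_probs, change_labels)
    object_loss = (cross_entropy(object1_probs, object1_labels)
                   + cross_entropy(object2_probs, object2_labels))
    return change_weight * change_loss + object_weight * object_loss

# Toy predictions for three patch pairs.
change_probs = np.array([0.9, 0.1, 0.8])
change_labels = np.array([1, 0, 1])
object1_probs = np.array([0.7, 0.2, 0.9])
object1_labels = np.array([1, 0, 1])
object2_probs = np.array([0.3, 0.1, 0.6])
object2_labels = np.array([0, 0, 1])

combined = multi_loss(change_probs, change_labels,
                      object1_probs, object1_labels,
                      object2_probs, object2_labels)
```

Raising `object_weight` pushes the feature extractor harder toward features that separate the target object from the background; setting it to zero reduces the sketch to a change-only loss.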
The parameter updater unit 17 receives the loss E from the multi-loss calculator unit 16 and updates the parameters of the object-driven feature extractor units 10A and 11A so that the loss is minimized. In the case that the change classifier unit 13A and the object classifier units 14 and 15 are neural-network based, the parameter updater unit 17 also updates the parameters of the change classifier unit 13A and the object classifier units 14 and 15 so that the loss is minimized. The minimization of the loss can be performed by an optimization algorithm such as gradient descent. The minimization of the loss is continued (or repeated) until the loss converges to a state in which it cannot be reduced further. At this stage, the loss has converged and the feature extractor units 10A and 11A are trained. After convergence, the parameter updater unit 17 stores the parameters of the trained object-driven feature extractor units into the storage unit 18. The trained object-driven feature extractor units are denoted as 10B and 11B as shown in Fig. 4. In the case that the change classifier unit 13A is neural-network based, its parameters are also stored in the storage unit 18 after the loss has converged. The trained change classifier unit is denoted as 13B as shown in Fig. 4. In the case that the object classifier units 14 and 15 are neural-network based, their parameters are also stored in the storage unit 18 after the loss has converged.
Next, an example of an operation performed by the image processing apparatus 1A according to the first embodiment in training mode will be explained with reference to a flowchart shown in Fig. 3.
Firstly, the image processing apparatus 1A receives the input of a pair of multi-temporal SAR images (steps S101 and S102). Next, the image processing apparatus 1A extracts object-driven features from the first SAR image using the object-driven feature extractor unit 10A (step S103). Simultaneously, the image processing apparatus 1A extracts object-driven features from the second SAR image using the other feature extractor unit 11A (step S104). Next, the image processing apparatus 1A merges the features extracted by the two feature extractor units 10A and 11A using the feature merger unit 12 (step S105). Next, the image processing apparatus 1A estimates a change class probability in the image-pair based on the merged features using the change classifier unit 13A (step S106). Simultaneously, the image processing apparatus 1A estimates the object class probability in the first image based on the object-driven features of the first image using the object classifier unit 14 (step S107). Similarly, the image processing apparatus 1A estimates the object class probability in the second image based on the object-driven features of the second image using the object classifier unit 15 (step S108). Next, the image processing apparatus 1A calculates a multi-loss from a change classification loss and an object classification loss using the multi-loss calculator unit 16 (step S109). Here, the change classification loss is calculated as a classification error between the true change class and the estimated change class, and the object classification loss is calculated as a classification error between the true object class and the estimated object class. Next, the image processing apparatus 1A updates the parameters of the feature extractor units 10A and 11A, the change classifier unit 13A and the object classifier units 14 and 15 using the parameter updater unit 17 so that the loss is minimized (step S110).
Next, the image processing apparatus 1A determines whether or not the loss has converged (step S111). When the image processing apparatus 1A determines that the loss has not converged yet (NO at step S111), the image processing apparatus 1A returns to the step S103 and the step S104. Then, the image processing apparatus 1A performs the step S103 and the step S104 again simultaneously. Then, the image processing apparatus 1A performs the processes in the steps S105 to S110 again. On the other hand, when the image processing apparatus 1A determines that the loss has converged (YES at step S111), the image processing apparatus 1A stores the trained feature extractor parameters, the trained change classifier parameters and the trained object classifier parameters into the storage unit 18 (step S112).
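The loop structure of steps S109 to S112 can be sketched with plain gradient descent. This is a minimal sketch, not the disclosed network: the quadratic `toy_loss_and_grad` stands in for the forward pass and multi-loss, the convergence test (change in loss below a tolerance) is one possible criterion chosen here as an assumption, and the dictionary `stored_parameters` is a hypothetical stand-in for the storage unit 18.

```python
import numpy as np

def train_until_converged(params, loss_and_grad, learning_rate=0.1,
                          tolerance=1e-8, max_iterations=10_000):
    """Repeat loss computation and parameter update until the loss stops changing."""
    previous_loss = np.inf
    loss = np.inf
    for _ in range(max_iterations):
        loss, grad = loss_and_grad(params)         # forward pass and loss (S103 to S109)
        params = params - learning_rate * grad     # parameter update (S110)
        if abs(previous_loss - loss) < tolerance:  # convergence check (S111)
            break
        previous_loss = loss
    return params, loss

def toy_loss_and_grad(params):
    """Hypothetical stand-in loss: squared distance to a fixed target."""
    target = np.array([1.0, -2.0])
    diff = params - target
    return float(np.sum(diff ** 2)), 2.0 * diff

trained_params, final_loss = train_until_converged(np.zeros(2), toy_loss_and_grad)

# After convergence the trained parameters are stored (S112).
stored_parameters = {"feature_extractor": trained_params}
```

The `max_iterations` cap is a practical safeguard added in this sketch so that the loop terminates even if the tolerance is never reached.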
Next, the operational mode will be explained with reference to Fig. 4. In the operational mode, the trained object-driven feature extractor units 10B and 11B receive the input of a new pair of multi-temporal images (which has never been used in the training mode) and the parameters from the storage unit 18. Each trained feature extractor unit outputs robust and relevant feature vectors, f1 and f2, for each patch pair of the input images. The feature merger unit 12 combines the feature vectors and outputs a combined feature vector fc. The trained change classifier unit 13B receives the input of the combined feature vector fc and the parameters from the storage unit 18, and outputs a probability of belonging to a change or no-change class for the patch pair. The thresholder unit 19 receives the input of the probability value and automatically determines a threshold value. Examples of methods for automatically determining the threshold value include Expectation-Maximization and Markov Random Field. If the probability value is more than the threshold value, pixels in the patch are assigned a change class, otherwise a no-change class. Finally, the decisions of all the patches are combined to generate a change map in which each pixel belongs to either the change or the no-change class. Note that the present disclosure is not limited to only two change classes and can be used for multiple change classes. The change map can represent binary or multiple changes depending on the application.
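The thresholding and map-assembly step can be sketched as follows for the binary case. The sketch assumes non-overlapping patches listed in row-major order and takes the threshold as a given argument; it does not implement the automatic threshold selection (for example by Expectation-Maximization) mentioned above. The function name `build_change_map` is hypothetical.

```python
import numpy as np

def build_change_map(patch_probs, grid_shape, patch_size, threshold):
    """Turn per-patch change probabilities into a per-pixel binary change map.

    patch_probs: one probability per patch, row-major over the patch grid.
    grid_shape:  (rows, cols) of the patch grid.
    """
    decisions = (np.asarray(patch_probs) > threshold).astype(np.uint8)
    decisions = decisions.reshape(grid_shape)
    # Every pixel in a patch inherits that patch's decision.
    return np.kron(decisions, np.ones((patch_size, patch_size), dtype=np.uint8))

patch_probs = [0.9, 0.2, 0.4, 0.7]  # a 2x2 grid of patches
change_map = build_change_map(patch_probs, grid_shape=(2, 2),
                              patch_size=2, threshold=0.5)
```

With overlapping patches, the per-pixel decisions would have to be aggregated (for example by averaging probabilities before thresholding), which this sketch does not cover.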
Next, an example of an operation performed by the image processing apparatus 1B according to the first embodiment in the operational mode will be explained with reference to a flowchart shown in Fig. 5.
Firstly, the image processing apparatus 1B receives the input of a new pair of multi-temporal SAR images (steps S201 and S202). Next, the image processing apparatus 1B extracts object-driven features from the first SAR image using the trained object-driven feature extractor unit 10B, which reads the trained parameters from the storage unit 18 (step S203). Simultaneously, the image processing apparatus 1B extracts object-driven features from the second SAR image using the trained object-driven feature extractor unit 11B, which reads the trained parameters from the storage unit 18 (step S204). Next, the image processing apparatus 1B merges the features extracted by the two trained feature extractor units 10B and 11B using the feature merger unit 12 (step S205). Next, the image processing apparatus 1B estimates the change class probability using the trained change classifier unit 13B, which reads the trained parameters from the storage unit 18 (step S206). Next, the image processing apparatus 1B thresholds the probability values using the thresholder unit 19 by automatically determining a threshold value to output a change map (step S207).
As described above, the image processing apparatus (1A and 1B) in accordance with the first embodiment of the present disclosure can perform change detection using the object-driven feature extractor units 10 and 11, the object classifier units 14 and 15, and the multi-loss calculator unit 16. Unlike the related art where the network learns only the single task of change detection, the present disclosure can learn two tasks simultaneously - the change detection task and the object classification task. The loss calculated by the multi-loss calculator unit 16 as a weighted combination of the change classification loss and the object classification loss focuses the attention of the feature extractor units on learning features specific to the target object. As a result, the object-driven feature extractor units 10 and 11 can distinguish between the relevant and irrelevant features and a better change detection system is obtained.
Next, a configuration example of an image processing apparatus 2 in accordance with the second embodiment of the present disclosure will be explained with reference to a block diagram shown in Fig. 6. The image processing apparatus 2 in accordance with the second embodiment can include a trained object-driven feature extractor unit 10B for image I1, a trained object-driven feature extractor unit 11B for image I2, a feature merger unit 12, a trained change classifier unit 13B, a trained object classifier unit 21 for image I1, a trained object classifier unit 22 for image I2, a storage unit 18 and a thresholder unit 19. Note that configurations of the trained object-driven feature extractor unit 10B for image I1, the trained object-driven feature extractor unit 11B for image I2, the feature merger unit 12, the trained change classifier unit 13B and the thresholder unit 19 are similar to those explained in the first embodiment of the present disclosure and therefore their explanations are omitted.
As compared to the first embodiment, the image processing apparatus 2 in accordance with the second embodiment can include a trained object classifier unit 21 for image I1 and a trained object classifier unit 22 for image I2.
As described in the first embodiment, in the operational mode a new pair of multi-temporal images (which has never been used for training) is input to the trained object-driven feature extractor units 10B and 11B in the form of patches. The trained object-driven feature extractor units 10B and 11B output robust and relevant features of the target object from each image, respectively, using the parameters from the storage unit 18. According to the second embodiment, the trained object classifier unit 21 receives the input of the feature vector f1 of each patch of the image I1 from the feature extractor unit 10B and parameters from the storage unit 18, and outputs a probability of belonging to an object or a no-object class. Simultaneously, the trained object classifier unit 22 receives the input of the feature vector f2 of each patch of the image I2 from the feature extractor unit 11B and parameters from the storage unit 18, and outputs a probability of belonging to an object or a no-object class. The probability values of each patch can be either thresholded or used directly. The probability values of all the patches of an image are combined to output a classification map where each pixel belongs to either an object or a no-object class.
Next, an example of an operation performed by the image processing apparatus 2 according to the second embodiment will be explained with reference to a flowchart shown in Fig. 7. Note that steps S301, S302, S303, S304, S305, S306 and S309 in Fig. 7 are similar to the steps S201, S202, S203, S204, S205, S206 and S207 in Fig. 5, and therefore their explanations are omitted.
In addition to estimating change class probabilities as explained in the first embodiment, the image processing apparatus 2 in accordance with the second embodiment can also estimate the object class probabilities in the first image using the trained object classifier unit 21, which reads the trained parameters from the storage unit 18 (step S307). Simultaneously, the image processing apparatus 2 can estimate the object class probabilities in the second image using the trained object classifier unit 22, which reads the trained parameters from the storage unit 18 (step S308). The class probabilities can be either thresholded or used directly to output object classification maps of the respective images.
As described above, the image processing apparatus 2 in accordance with the second embodiment of the present disclosure can provide an additional output of a classification map along with the change map. Since the features learnt by the object-driven feature extractor units are optimized for the multiple tasks of change detection and object classification, they are generic and can be used for object classification without re-training with additional data. Thus, the present disclosure can be extended to advanced analysis tasks such as object classification in SAR images.
Next, a configuration example of an image processing apparatus 3 in accordance with the third embodiment of the present disclosure will be explained with reference to a block diagram shown in Fig. 8. The image processing apparatus 3 in accordance with the third embodiment can include a trained object-driven feature extractor unit 10B for image I1, a trained object-driven feature extractor unit 11B for image I2, a feature merger unit 12, a trained change classifier unit 13B, an image processor unit 31 and a storage unit 18. Note that configurations of the trained object-driven feature extractor unit 10B for image I1, the trained object-driven feature extractor unit 11B for image I2, the trained change classifier unit 13B and the storage unit 18 are similar to those explained in the first embodiment of the present disclosure and therefore their explanations are omitted.
Compared to the first embodiment, the image processing apparatus 3 in accordance with the third embodiment replaces the thresholder unit 19 with an image processor unit 31. The image processor unit 31 receives the probability values from the trained change classifier unit 13B and outputs an image-processed change map, such as a density map, a distance map or a colorization map, by applying an image processing operator to the probability values. The type of map depends on the application of the change detection system.
Next, an example of an operation performed by the image processing apparatus 3 according to the third embodiment will be explained with reference to a flowchart shown in Fig. 9. Note that steps S401, S402, S403, S404, S405 and S406 in Fig. 9 are similar to the steps S201, S202, S203, S204, S205 and S206 in Fig. 5, and therefore their explanations are omitted.
After obtaining the class probabilities from the trained change classifier unit 13B (step S406), the image processing apparatus 3 uses the image processor unit 31 to apply an image processing operation, such as a distance estimator or a density estimator, to the class probabilities and outputs an image-processed change map (step S407).
As described above, the image processing apparatus 3 in accordance with the third embodiment of the present disclosure can provide different types of outputs by post-processing the probability values estimated by the trained change classifier unit 13B. These alternative outputs can provide additional information about the target object, depending on the application. For example, if the user wants to know the amount of change rather than only whether a change occurred, a density map can be output after the post-processing. In the density map, a low density value indicates a small change and a high density value indicates a large change. Thus, the change detection system can provide more detail about the changes of the target object and can be used for many applications.
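One way such a density map could be computed is sketched below. The sketch assumes a simple box kernel, that is, the local mean of the change probabilities in a square window; this is a stand-in chosen for brevity, not the estimator specified by the disclosure. The function name `density_map` and the `radius` parameter are hypothetical.

```python
from typing import List


def density_map(change_prob: List[List[float]], radius: int = 1) -> List[List[float]]:
    """Local density of change: mean change probability in a square window.

    Assumption: a box kernel of half-width `radius`, truncated at the image
    border. Low values indicate little change nearby, high values indicate
    a large amount of change.
    """
    rows, cols = len(change_prob), len(change_prob[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total, count = 0.0, 0
            # Sum the probabilities inside the window, clipped to the image.
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    total += change_prob[rr][cc]
                    count += 1
            out[r][c] = total / count
    return out


if __name__ == "__main__":
    probs = [[0.0, 0.0, 0.0],
             [0.0, 1.0, 0.0],
             [0.0, 0.0, 0.0]]
    for row in density_map(probs, radius=1):
        print([round(v, 3) for v in row])
```

A smoother kernel (for example a Gaussian) could replace the box window without changing the overall structure of the post-processing step.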
Further, although the present disclosure is described as a hardware configuration in the above-described embodiments, the present disclosure is not limited to hardware configurations. The present disclosure can also be implemented by having a processor such as a CPU (Central Processing Unit) included in the image processing apparatus execute a computer program for performing each process of the above-described functions.
In the above-described examples, the program can be stored in various types of non-transitory computer readable media and thereby supplied to computers. The non-transitory computer readable media include various types of tangible storage media. Examples of the non-transitory computer readable media can include a magnetic recording medium (such as a flexible disk, a magnetic tape, and a hard disk drive), a magneto-optic recording medium (such as a magneto-optic disk), a CD-ROM (Read Only Memory), a CD-R, a CD-R/W, a DVD (Digital Versatile Disc), a BD (Blu-ray (registered trademark) Disc), and a semiconductor memory (such as a mask ROM, a PROM (Programmable ROM), an EPROM (Erasable PROM), a flash ROM, and a RAM (Random Access Memory)). Further, the program can be supplied to computers by using various types of transitory computer readable media. Examples of the transitory computer readable media can include an electrical signal, an optical signal, and an electromagnetic wave. The transitory computer readable media can be used to supply programs to a computer through a wired communication path, such as an electrical wire or an optical fiber, or through a wireless communication path.
Although the present disclosure is explained above with reference to embodiments, the present disclosure is not limited to the above-described embodiments. Various modifications that can be understood by those skilled in the art can be made to the configuration and details of the present disclosure within the scope of the invention.
Part or all of the foregoing embodiments can be described as in the following supplementary notes, but the present invention is not limited thereto.
(Supplementary Note 1)
An image processing apparatus for a training method of change detection comprising:
an object-driven feature extractor means to extract relevant features of target object from input images;
a feature merger means to merge the features extracted from the input images into a merged feature;
a change classifier means to predict a probability of each change class based on the merged feature;
an object classifier means to predict a probability of each object class based on the extracted features of each image;
a multi-loss calculator means to calculate a combined loss from a change classification loss and an object classification loss; and
a parameter updater means to update parameters of the object-driven feature extractor means.
(Supplementary Note 2)
The image processing apparatus according to note 1, wherein the parameter updater means updates the parameters of the change classifier means and object classifier means.
(Supplementary Note 3)
The image processing apparatus according to note 1 or note 2, wherein the multi-loss calculator means calculates a weighted combination of a change classification loss and an object classification loss.
(Supplementary Note 4)
The image processing apparatus according to note 3, wherein the weights are determined using grid search or random search.
(Supplementary Note 5)
The image processing apparatus according to any one of note 1 to note 4, wherein the change classification loss and object classification loss are selected from the group consisting of cross-entropy, Kullback-Leibler divergence, contrastive loss, hinge loss and mean-squared error as a loss function.
(Supplementary Note 6)
The image processing apparatus according to any one of note 1 to note 5, wherein the input images are captured by Synthetic Aperture Radar.
(Supplementary Note 7)
The image processing apparatus for change detection method comprising,
an object-driven feature extractor means to extract relevant features of target object from input images;
a feature merger means to merge the features extracted from the input images into a merged feature; and
a change classifier means to predict a probability of each change class based on the merged features,
wherein the object-driven feature extractor means and the change classifier means use parameters trained using the training method according to any one of note 1 to note 6.
(Supplementary Note 8)
The image processing apparatus according to note 7, further comprising a thresholder means to threshold the predicted probability of each change class.
(Supplementary Note 9)
The image processing apparatus according to note 7, further comprising an image processor means to apply an image processing operation on the predicted probability of each change class.
(Supplementary Note 10)
The image processing apparatus according to note 9, wherein the image processor means is a kernel density estimator or a euclidean distance estimator.
(Supplementary Note 11)
The image processing apparatus for change detection method according to any one of note 7 to note 10, further comprising:
an object classifier means to predict a probability of each object class based on the extracted features of each image,
wherein the object classifier means uses parameters trained using the training method according to any one of note 1 to note 6.
(Supplementary Note 12)
The image processing apparatus according to any one of note 1 to note 11, wherein the object-driven feature extraction means use a neural-network based method.
(Supplementary Note 13)
The image processing apparatus according to note 12 wherein the neural-network based method is a siamese network, pseudo-siamese network or 2-channel network.
(Supplementary Note 14)
The image processing apparatus according to any one of note 1 to note 11, wherein the change classifier means uses a Decision Tree, Support Vector Machine, Neural Network, Gradient Boosting Machine, or an ensemble thereof.
(Supplementary Note 15)
The image processing apparatus according to any one of note 1 to note 11, wherein the object classifier means is a Decision Tree, Support Vector Machine, Neural Network, Gradient Boosting Machine, or an ensemble thereof.
(Supplementary Note 16)
The image processing apparatus according to any one of note 1 to note 11, wherein the feature merger means combines features by concatenation, absolute subtraction, mean-squared subtraction or dot-product, or a combination thereof.
(Supplementary Note 17)
An image processing method comprising:
extracting object-driven features of target object from input images;
merging the features extracted from the input images into a merged feature;
predicting a probability of each change class based on the merged feature;
predicting a probability of each object class based on the extracted features of each image;
calculating a combined loss from a change classification loss and an object classification loss; and
updating parameters for extracting the object-driven feature.
(Supplementary Note 18)
A non-transitory computer readable medium storing an image processing program for causing a computer to execute an image processing method, the image processing method comprising:
extracting object-driven features of target object from input images;
merging the features extracted from the input images into a merged feature;
predicting a probability of each change class based on the merged feature;
predicting a probability of each object class based on the extracted features of each image;
calculating a combined loss from a change classification loss and an object classification loss; and
updating parameters for extracting the object-driven feature.
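The merging options listed in Supplementary Note 16 can be illustrated with a minimal sketch operating on two feature vectors of equal length. The function name `merge_features` and the mode strings are hypothetical. "Mean-squared subtraction" is interpreted here as the element-wise squared difference, which is only one possible reading of the term.

```python
from typing import List, Sequence


def merge_features(f1: Sequence[float], f2: Sequence[float],
                   mode: str = "concat") -> List[float]:
    """Merge two feature vectors into one merged feature (illustrative only)."""
    if len(f1) != len(f2):
        raise ValueError("feature vectors must have the same length")
    if mode == "concat":
        # Concatenation keeps both feature vectors side by side.
        return list(f1) + list(f2)
    if mode == "abs_diff":
        # Absolute subtraction: element-wise |f1 - f2|.
        return [abs(a - b) for a, b in zip(f1, f2)]
    if mode == "squared_diff":
        # Assumed reading of "mean-squared subtraction": element-wise (f1 - f2)^2.
        return [(a - b) ** 2 for a, b in zip(f1, f2)]
    if mode == "dot":
        # Dot product reduces the pair to a single similarity value.
        return [sum(a * b for a, b in zip(f1, f2))]
    raise ValueError(f"unknown merge mode: {mode}")


if __name__ == "__main__":
    a, b = [1.0, 2.0], [3.0, 5.0]
    for mode in ("concat", "abs_diff", "squared_diff", "dot"):
        print(mode, merge_features(a, b, mode))
```

Concatenation preserves the most information for the change classifier, while the difference-based and dot-product modes yield a more compact merged feature that directly encodes dissimilarity or similarity.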
1A, 1B, 2, 3 Image Processing Apparatus
10, 11 Object-driven Feature Extraction Units
12 Feature Merger Unit
13A Change Classifier Unit
13B Trained Change Classifier Unit
14, 15 Object Classifier Units
16 Multi-loss Calculator Unit
17 Parameter Updater Unit
18 Storage Unit
19 Thresholder Unit
21, 22 Trained Object Classifier Units
31 Image Processor Unit
Claims (18)
- An image processing apparatus for a training method of change detection comprising:
an object-driven feature extractor means to extract relevant features of target object from input images;
a feature merger means to merge the features extracted from the input images into a merged feature;
a change classifier means to predict a probability of each change class based on the merged feature;
an object classifier means to predict a probability of each object class based on the extracted features of each image;
a multi-loss calculator means to calculate a combined loss from change classification loss and object classification loss; and
a parameter updater means to update parameters of the object-driven feature extractor means.
- The image processing apparatus according to claim 1, wherein the parameter updater means updates the parameters of the change classifier means and object classifier means.
- The image processing apparatus according to claim 1 or claim 2, wherein the multi-loss calculator means calculates a weighted combination of change classification loss and object classification loss.
- The image processing apparatus according to claim 3, wherein the weights of the change classification loss and object classification loss are determined using a grid search or random search.
- The image processing apparatus according to any one of claim 1 to claim 4, wherein the change classification loss and object classification loss are selected, as a loss function, from the group consisting of cross-entropy, Kullback-Leibler divergence, contrastive loss, hinge loss and mean-squared error.
- The image processing apparatus according to any one of claim 1 to claim 5, wherein the input images are captured by Synthetic Aperture Radar.
- The image processing apparatus for change detection method comprising,
an object-driven feature extractor means to extract relevant features of target object from input images;
a feature merger means to merge the features extracted from the input images into a merged feature; and
a change classifier means to predict a probability of each change class based on the merged features;
wherein the object-driven feature extractor means and the change classifier means use parameters trained using the training method according to any one of claim 1 to claim 6.
- The image processing apparatus according to claim 7, further comprising a thresholder means to threshold the predicted probability of each change class.
- The image processing apparatus according to claim 7, further comprising an image processor means to apply an image processing operation on the predicted probability of each change class.
- The image processing apparatus according to claim 9, wherein the image processor means is a kernel density estimator or a euclidean distance estimator.
- The image processing apparatus for change detection method according to any one of claim 7 to claim 10, further comprising:
an object classifier means to predict a probability of each object class based on the extracted features of each image,
wherein the object classifier means use parameters trained using the training method according to any one of claim 1 to claim 6.
- The image processing apparatus according to any one of claim 1 to claim 11, wherein the object-driven feature extraction means uses a neural-network based method.
- The image processing apparatus according to claim 12 wherein the neural-network based method is a siamese network, pseudo-siamese network or 2-channel network.
- The image processing apparatus according to any one of claim 1 to claim 11, wherein the change classifier means uses a Decision Tree, Support Vector Machine, Neural Network, Gradient Boosting Machine, or an ensemble thereof.
- The image processing apparatus according to any one of claim 1 to claim 11, wherein the object classifier means is a Decision Tree, Support Vector Machine, Neural Network, Gradient Boosting Machine, or an ensemble thereof.
- The image processing apparatus according to any one of claim 1 to claim 11, wherein the feature merger means combines features by concatenation, absolute subtraction, mean-squared subtraction or dot-product, or a combination thereof.
- An image processing method comprising:
extracting object-driven features of target object from input images;
merging the features extracted from the input images into a merged feature;
predicting a probability of each change class based on the merged feature;
predicting a probability of each object class based on the extracted features of each image;
calculating a combined loss from change classification loss and object classification loss; and
updating parameters for extracting the object-driven feature.
- A non-transitory computer readable medium storing an image processing program for causing a computer to execute an image processing method, the image processing method comprising:
extracting object-driven features of target object from input images;
merging the features extracted from the input images into a merged feature;
predicting a probability of each change class based on the merged feature;
predicting a probability of each object class based on the extracted features of each image;
calculating a combined loss from a change classification loss and an object classification loss; and
updating parameters for extracting the object-driven feature.
Priority Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP19923573.0A EP3948767A4 (en) | 2019-04-03 | 2019-04-03 | IMAGE PROCESSING DEVICE, IMAGE PROCESSING METHOD AND NON-VOLATILE COMPUTER READABLE MEDIA |
| JP2021556897A JP7306473B2 (en) | 2019-04-03 | 2019-04-03 | Image processing device, image processing method and image processing program |
| PCT/JP2019/014832 WO2020202505A1 (en) | 2019-04-03 | 2019-04-03 | Image processing apparatus, image processing method and non-transitoty computer readable medium |
| US17/600,288 US20220172378A1 (en) | 2019-04-03 | 2019-04-03 | Image processing apparatus, image processing method and non-transitory computer readable medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2019/014832 WO2020202505A1 (en) | 2019-04-03 | 2019-04-03 | Image processing apparatus, image processing method and non-transitoty computer readable medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2020202505A1 (en) | 2020-10-08 |
Family
ID=72667281
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2019/014832 Ceased WO2020202505A1 (en) | 2019-04-03 | 2019-04-03 | Image processing apparatus, image processing method and non-transitoty computer readable medium |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US20220172378A1 (en) |
| EP (1) | EP3948767A4 (en) |
| JP (1) | JP7306473B2 (en) |
| WO (1) | WO2020202505A1 (en) |
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112949388A (en) * | 2021-01-27 | 2021-06-11 | 上海商汤智能科技有限公司 | Image processing method and device, electronic equipment and storage medium |
| CN113112509A (en) * | 2021-04-12 | 2021-07-13 | 深圳思谋信息科技有限公司 | Image segmentation model training method and device, computer equipment and storage medium |
| CN113901877A (en) * | 2021-09-13 | 2022-01-07 | 广州市城市规划勘测设计研究院 | Method, device and equipment for detecting change of remote sensing image building and storage medium |
| CN115797670A (en) * | 2022-11-22 | 2023-03-14 | 华能伊敏煤电有限责任公司 | Bucket wheel performance monitoring method and system based on convolutional neural network |
| CN116596993A (en) * | 2023-05-18 | 2023-08-15 | 中国人民解放军火箭军工程大学 | A Target Recognition, Matching and Localization Method Based on Pseudo-Twin Network and Deep Features |
| JP2023167507A (en) * | 2022-05-12 | 2023-11-24 | キヤノン株式会社 | Information processing device, information processing method and program |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114255361B (en) * | 2020-09-10 | 2025-08-22 | 华为技术有限公司 | Neural network model training method, image processing method and device |
| KR20230043318A (en) * | 2021-09-24 | 2023-03-31 | 삼성전자주식회사 | Method and apparatus for classifying object in image |
| CN115620150B (en) * | 2022-12-05 | 2023-08-04 | 海豚乐智科技(成都)有限责任公司 | Method and device for multi-modal image ground building recognition based on twin Transformer |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109002792A (en) * | 2018-07-12 | 2018-12-14 | 西安电子科技大学 | SAR image change detection based on layering multi-model metric learning |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8977062B2 (en) * | 2013-02-25 | 2015-03-10 | Raytheon Company | Reduction of CFAR false alarms via classification and segmentation of SAR image clutter |
| JP6349937B2 (en) * | 2014-05-09 | 2018-07-04 | 日本電気株式会社 | Fluctuation detection apparatus, fluctuation detection method, and fluctuation detection program |
| EP3311194A4 (en) * | 2015-06-16 | 2018-06-13 | King Abdulaziz City for Science and Technology | Systems and methods for enhancing synthetic aperture radar imagery |
2019
- 2019-04-03 US US17/600,288 patent/US20220172378A1/en not_active Abandoned
- 2019-04-03 JP JP2021556897A patent/JP7306473B2/en active Active
- 2019-04-03 WO PCT/JP2019/014832 patent/WO2020202505A1/en not_active Ceased
- 2019-04-03 EP EP19923573.0A patent/EP3948767A4/en not_active Withdrawn
Non-Patent Citations (2)
| Title |
|---|
| See also references of EP3948767A4 * |
| SHARMA, SHREYA: "Comparative Study of Feature Extraction Approaches for Ship Classification in Moderate-Resolution SAR Imagery", IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2018, pages 6995 - 6998, XP033438109, DOI: 10.1109/IGARSS.2018.8518966 * |
Cited By (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112949388A (en) * | 2021-01-27 | 2021-06-11 | 上海商汤智能科技有限公司 | Image processing method and device, electronic equipment and storage medium |
| WO2022160753A1 (en) * | 2021-01-27 | 2022-08-04 | 上海商汤智能科技有限公司 | Image processing method and apparatus, and electronic device and storage medium |
| CN112949388B (en) * | 2021-01-27 | 2024-04-16 | 上海商汤智能科技有限公司 | Image processing method, device, electronic device and storage medium |
| CN113112509A (en) * | 2021-04-12 | 2021-07-13 | 深圳思谋信息科技有限公司 | Image segmentation model training method and device, computer equipment and storage medium |
| CN113901877A (en) * | 2021-09-13 | 2022-01-07 | 广州市城市规划勘测设计研究院 | Method, device and equipment for detecting change of remote sensing image building and storage medium |
| JP2023167507A (en) * | 2022-05-12 | 2023-11-24 | キヤノン株式会社 | Information processing device, information processing method and program |
| CN115797670A (en) * | 2022-11-22 | 2023-03-14 | 华能伊敏煤电有限责任公司 | Bucket wheel performance monitoring method and system based on convolutional neural network |
| CN116596993A (en) * | 2023-05-18 | 2023-08-15 | 中国人民解放军火箭军工程大学 | A Target Recognition, Matching and Localization Method Based on Pseudo-Twin Network and Deep Features |
Also Published As
| Publication number | Publication date |
|---|---|
| US20220172378A1 (en) | 2022-06-02 |
| JP2022528326A (en) | 2022-06-10 |
| JP7306473B2 (en) | 2023-07-11 |
| EP3948767A1 (en) | 2022-02-09 |
| EP3948767A4 (en) | 2022-04-20 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| WO2020202505A1 (en) | Image processing apparatus, image processing method and non-transitoty computer readable medium | |
| US10896351B2 (en) | Active machine learning for training an event classification | |
| US9767570B2 (en) | Systems and methods for computer vision background estimation using foreground-aware statistical models | |
| CN103729854B (en) | A kind of method for detecting infrared puniness target based on tensor model | |
| CN110349187A (en) | Method for tracking target, device and storage medium based on TSK Fuzzy Classifier | |
| Patil et al. | MsEDNet: Multi-scale deep saliency learning for moving object detection | |
| Yadav et al. | Moving object detection in real-time visual surveillance using background subtraction technique | |
| JP7279817B2 (en) | Image processing device, image processing method and image processing program | |
| Sahoo et al. | Adaptive feature fusion and spatio-temporal background modeling in KDE framework for object detection and shadow removal | |
| CN117830356A (en) | Target tracking method, device, equipment and medium | |
| CN110363165A (en) | Multi-target tracking method, device and storage medium based on TSK fuzzy system | |
| Amjad et al. | Kalman filter-based convolutional neural network for robust tracking of froth-middling interface in a primary separation vessel in presence of occlusions | |
| CN118799827A (en) | An intelligent detection and positioning method for sea surface ship targets using dual-polarization SAR images | |
| Singh et al. | Fixed point ICA based approach for maximizing the non-Gaussianity in remote sensing image classification | |
| KR101690050B1 (en) | Intelligent video security system | |
| KR101588648B1 (en) | A method on the pedestrian detection and tracking for intelligent video surveillance | |
| CN113963178B (en) | Method, device, equipment and medium for detecting small infrared targets under ground-to-air background | |
| CN120612346A (en) | A trajectory calculation method and storage medium based on small target recognition and tracking | |
| CN110751623A (en) | Joint feature-based defect detection method, device, equipment and storage medium | |
| Jadhav et al. | FPGA based object tracking system | |
| JP7056751B2 (en) | Ship detection systems, methods and programs | |
| US20240046512A1 (en) | Inference device, inference method, and non-transitory computer-readable medium | |
| Ren et al. | Survey on unsupervised change detection techniques in SAR images1 | |
| Ramola et al. | Versatile applications of pixel-and object-based change detection techniques in remote sensing | |
| CN115457511A (en) | Obstacle identification method, device, equipment and computer readable storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19923573; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2021556897; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2019923573; Country of ref document: EP; Effective date: 20211103 |




