WO2019052561A1 - Inspection method, inspection apparatus, and computer-readable medium - Google Patents
Inspection method, inspection apparatus, and computer-readable medium
- Publication number
- WO2019052561A1 (PCT/CN2018/106021)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- inspected
- neural network
- image
- semantic
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N23/00—Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00
- G01N23/02—Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00 by transmitting the radiation through the material
- G01N23/04—Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00 by transmitting the radiation through the material and forming images of the material
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01V—GEOPHYSICS; GRAVITATIONAL MEASUREMENTS; DETECTING MASSES OR OBJECTS; TAGS
- G01V5/00—Prospecting or detecting by the use of ionising radiation, e.g. of natural or induced radioactivity
- G01V5/20—Detecting prohibited goods, e.g. weapons, explosives, hazardous substances, contraband or smuggled objects
- G01V5/22—Active interrogation, i.e. by irradiating objects or goods using external radiation sources, e.g. using gamma rays or cosmic rays
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/768—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using context analysis, e.g. recognition aided by known co-occurring patterns
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/69—Microscopic objects, e.g. biological cells or cellular parts
- G06V20/695—Preprocessing, e.g. image segmentation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/191—Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06V30/19173—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/26—Techniques for post-processing, e.g. correcting the recognition result
- G06V30/262—Techniques for post-processing, e.g. correcting the recognition result using context analysis, e.g. lexical, syntactic or semantic context
- G06V30/274—Syntactic or semantic context, e.g. balancing
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N2223/00—Investigating materials by wave or particle radiation
- G01N2223/03—Investigating materials by wave or particle radiation by transmission
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N2223/00—Investigating materials by wave or particle radiation
- G01N2223/10—Different kinds of radiation or particles
- G01N2223/101—Different kinds of radiation or particles electromagnetic radiation
- G01N2223/1016—X-ray
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N2223/00—Investigating materials by wave or particle radiation
- G01N2223/40—Imaging
- G01N2223/401—Imaging image processing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/10—Recognition assisted with metadata
Definitions
- The present application relates to security inspection, and in particular to a radiographic-imaging-based inspection method, inspection apparatus, and computer-readable medium.
- Article 85 of China's Anti-Terrorism Law stipulates penalties for logistics operators in railway, highway, water and air freight, postal service, express delivery, and the like that fail to implement a security inspection system and verify customer identities, fail to perform security inspection or open-package inspection of items to be transported or delivered as required, or fail to implement a registration system for customer identities and item information. This requires express delivery companies to guarantee 100% inspection before sealing, 100% real-name registration for deliveries, and 100% X-ray security screening.
- Embodiments of the present disclosure provide an inspection method, an inspection apparatus, and a computer-readable medium that can automatically check an object to be inspected and greatly improve the speed of inspection while ensuring its accuracy.
- An inspection method comprising the steps of: scanning an object to be inspected with X-rays to obtain an image of the object to be inspected; processing the image with a first neural network to obtain a semantic description of the object to be inspected; reading the text information of the manifest of the object to be inspected; processing the text information of the manifest with a second neural network to obtain a semantic feature of the object to be inspected; and determining, based on the semantic description and the semantic feature, whether the object to be inspected is allowed to pass.
- The first neural network is a convolutional neural network, a candidate-region-based convolutional neural network, or a fast candidate-region-based convolutional neural network, and the second neural network is a recurrent neural network or a bidirectional recurrent neural network.
- The first neural network is trained using a pre-established set of image-semantic pairs.
- Before the image is processed by the first neural network, the method further includes the steps of: binarizing the image of the object to be inspected; calculating the average value of the binarized image; and subtracting the average value from each pixel value of the binarized image.
- The step of determining whether the object to be inspected is allowed to pass based on the semantic description and the semantic feature comprises: calculating the distance between a first vector representing the semantic description and a second vector representing the semantic feature; and allowing the object to be inspected to pass if the calculated distance is less than a threshold.
- During training of the first neural network, a plurality of region features of a sample image are associated with a plurality of words included in the manifest information of the sample image.
- A dot product between a feature vector representing a region feature and a semantic vector representing a word is used as the similarity between the region feature and the word, and a weighted sum of the similarities between the plurality of region features of the sample image and the plurality of words in its manifest information is used as the similarity between the sample image and its manifest information.
- An inspection apparatus comprising: a scanning device that scans an object to be inspected with X-rays to obtain a scanned image; an input device that inputs manifest information of the object to be inspected; and a processor configured to: process the image with a first neural network to obtain a semantic description of the object to be inspected; process the text information of the manifest of the object to be inspected with a second neural network to obtain a semantic feature of the object to be inspected; and determine, based on the semantic description and the semantic feature, whether the object to be inspected is allowed to pass.
- a computer readable medium stored with a computer program that, when executed by a processor, implements the following steps:
- Inspection accuracy can be ensured while inspection speed is greatly improved, so that the efficiency of security inspection is greatly improved.
- FIG. 1 shows a schematic diagram of an inspection apparatus according to an embodiment of the present disclosure
- Figure 2 is a diagram showing the internal structure of a computer for image processing in the embodiment shown in Figure 1;
- FIG. 3 is a schematic diagram depicting an artificial neural network used in an inspection apparatus and an inspection method of an embodiment of the present disclosure
- FIG. 4 is a schematic diagram showing another artificial neural network used in the inspection apparatus and the inspection method of the embodiment of the present disclosure
- FIG. 5 is a schematic diagram depicting a process of aligning images and semantics in accordance with an embodiment of the present disclosure
- FIG. 6 is a flow chart describing establishing an image-semantic model in an inspection apparatus and an inspection method according to an embodiment of the present invention
- FIG. 7 is a flowchart describing a process of performing a security check on an object to be inspected according to an inspection method according to an embodiment of the present disclosure.
- Embodiments of the present disclosure propose a deep-learning-based inspection technique that can intelligently compare the image produced by an item scanner with the declared manifest.
- With this technique, regions of the scanned image that are inconsistent with the customs declaration data for the item can be found. Such a region may indicate a false declaration or a concealed (undeclared) item.
- False declarations generally involve items prohibited in logistics or dangerous goods that threaten the safety of logistics and transport; to evade detection, they are described as safe goods on the customs declaration form.
- Concealed items are generally small in quantity and size; this practice, also called entrainment, is a common means of smuggling contraband.
- Handling a single box that contains multiple items is a problem that conventional image processing cannot solve. More precisely, it is an ambiguous, complex segmentation problem supervised by customs declaration data and affected by differences between devices. For example, the same algorithm will inevitably perform differently on different devices.
- The customs declaration data provides multiple supervision values (such as the number of kinds of goods, and the type and unit weight of each), and each pixel in the image may belong to more than one item.
- The method of the present disclosure addresses this problem with a deep-learning-based approach.
- FIG. 1 shows a schematic structural view of an inspection apparatus according to an embodiment of the present disclosure.
- the inspection apparatus 10 shown in Fig. 1 includes an X-ray source 11, a detector module 15, an acquisition circuit 16, a controller 17, a data processing computer 18, and the like.
- The radiation source 11 includes one or more X-ray generators that can perform a single-energy or dual-energy transmission scan.
- An object 14 to be inspected, such as a baggage item, is carried on the conveyor belt 13 through a scanning area between the radiation source 11 and the detector module 15.
- The detector module 15 and the acquisition circuit 16 are, for example, detectors and data collectors with an integral modular structure, such as multiple rows of detectors. They detect radiation transmitted through the object 14 to be inspected, obtain an analog signal, and convert the analog signal into a digital signal, thereby outputting an X-ray transmission image of the object 14 to be inspected.
- One row of detectors can be provided for high-energy rays and another for low-energy rays, or the same row of detectors can be used for both.
- The controller 17 controls the various parts of the system so that they work synchronously.
- The data processing computer 18 processes the data collected by the acquisition circuit 16, processes the image data, and outputs the results.
- The data processing computer 18 runs an image processing program that analyzes and learns from the scanned image to obtain a semantic description of the image, compares that description with the semantic features contained in the manifest information of the baggage item, and determines whether the declaration is consistent with the objects in the baggage item. If they are consistent, the baggage item is allowed to pass; otherwise an alarm is issued to alert security personnel that the baggage item may have a problem.
- the detector module 15 and the acquisition circuit 16 are used to acquire transmission data of the object 14 to be inspected.
- the acquisition circuit 16 includes a data amplification shaping circuit that operates in either (current) integration mode or pulse (count) mode.
- the data output cable of the acquisition circuit 16 is coupled to the controller 17 and the data processing computer 18, and the acquired data is stored in the data processing computer 18 in accordance with a trigger command.
- the detector module 15 includes a plurality of detection units that receive X-rays that penetrate the object under inspection 14.
- The acquisition circuit 16 is coupled to the detector module 15 and converts the signals generated by the detector module 15 into detection data.
- The controller 17 is connected to the radiation source 11 via a control line CTRL1, to the detector module 15 via a control line CTRL2, and to the acquisition circuit 16. It controls the one or more X-ray generators in the radiation source 11 to perform a single-energy or dual-energy scan of the object 14 to be inspected, so that X-rays pass through the object 14 as it moves.
- The controller 17 also controls the detector module 15 and the acquisition circuit 16 to obtain the corresponding transmission data, such as single-energy or dual-energy transmission data.
- The data processing computer 18 obtains an image of the object 14 to be inspected from the transmission data, processes the image, and determines, based on the manifest information of the object 14, whether the image and the manifest are consistent.
- FIG. 2 shows a block diagram of the structure of the data processing computer shown in FIG. 1.
- The data processing computer 20 includes a storage device 21, a read only memory (ROM) 22, a random access memory (RAM) 23, an input device 24, a processor 25, a display device 26, an interface unit 27, a bus 28, and so on.
- the data collected by the acquisition circuit 16 is stored in the storage device 21 via the interface unit 27 and the bus 28.
- Configuration information and a program of the computer data processor are stored in the read only memory (ROM) 22.
- a random access memory (RAM) 23 is used to temporarily store various data during the operation of the processor 25.
- a computer program for performing data processing is also stored in the storage device 21.
- The internal bus 28 connects the above-described storage device 21, read only memory 22, random access memory 23, input device 24, processor 25, display device 26, and interface unit 27.
- The instruction code of the computer program instructs the processor 25 to execute the data processing algorithm and, once the result is obtained, to display it on the display device 26, such as an LCD display, or to output it directly as a hard copy, for example by printing.
- the radiation source 11 may be a radioactive isotope (e.g., cobalt-60), or may be a low energy X-ray machine or a high energy X-ray accelerator or the like.
- By material, the detector module 15 may use gas detectors, scintillator detectors, solid-state detectors, or the like; by array structure, it may be single-row, double-row, or multi-row, and may use single-layer detectors or double-layer high- and low-energy detectors.
- the object 14 to be inspected such as a baggage item, is moved through the inspection area by the conveyor belt 13, but it will be appreciated by those skilled in the art that the object 14 to be inspected may be stationary and the source of radiation and the array of detectors may be moved to complete the scanning process.
- FIG. 3 is a schematic diagram showing a convolutional neural network 30 in accordance with an embodiment of the present disclosure.
- The convolutional neural network 30, as shown in FIG. 3, generally comprises a plurality of convolutional layers 32 and 34, each of which is a collection of small neurons that partially overlap one another (mathematically also called convolution kernels; the two terms are used interchangeably below unless otherwise stated).
- In the context of the present disclosure, for any two layers of the convolutional neural network 30, the layer closer to the input data (or input layer, such as input layer 31 of FIG. 3) is referred to as the "front" or "lower" layer, and the layer closer to the output data (or output layer, such as output layer 37 of FIG. 3) is referred to as the "rear" or "upper" layer.
- During training, verification, and/or use, the direction from the input layer (e.g., input layer 31 of FIG. 3) to the output layer (e.g., output layer 37 of FIG. 3) is referred to as forward, and the direction from the output layer to the input layer is referred to as backward.
- these small neurons can process various parts of the input image.
- the outputs of these small neurons are then combined into an output (referred to as a feature map, such as a square in the first convolutional layer 32) to obtain an output image that better represents certain features in the original image.
- the partially overlapping arrangement between adjacent neurons also causes the convolutional neural network 30 to have a degree of translational tolerance for features in the original image.
- the convolutional neural network 30 can correctly identify the feature even if the feature in the original image changes its position in a translational manner within a certain tolerance.
- The next layer is an optional pooling layer, the first pooling layer 33, which mainly downsamples the output of the preceding convolutional layer 32 while preserving its features, reducing the amount of computation and helping to prevent overfitting.
- The next layer is again a convolutional layer: the second convolutional layer 34 performs further feature sampling on the output produced by the first convolutional layer 32 and downsampled by the pooling layer 33.
- The features it learns are more global than those learned by the first convolutional layer.
- Likewise, each subsequent convolutional layer learns features that are more global than those of the layer before it.
- the convolutional layer (eg, the first and second convolutional layers 32 and 34) is the core building block of the CNN (eg, convolutional neural network 30).
- The parameters of this layer consist of a set of learnable convolution kernels, each with a small receptive field that nevertheless extends through the full depth of the input data.
- each convolution kernel is convolved along the width and height of the input data, the dot product between the elements of the convolution kernel and the input data is computed, and a two-dimensional activation map of the convolution kernel is generated.
- the network is able to learn the convolution kernel that can be activated when a particular type of feature is seen at a spatial location of the input.
- the activation maps of all convolution kernels are stacked in the depth direction to form the full output data of the convolutional layer.
- each element in the output data can be interpreted as an output of a convolution kernel that sees small regions in the input and shares parameters with other convolution kernels in the same activation map.
- The depth of the output data determines the number of convolution kernels in the layer that connect to the same region of the input data. For example, as shown in FIG. 3, the depth of the first convolutional layer 32 is 4 and the depth of the second convolutional layer 34 is 6. All of these convolution kernels learn to activate for different features of the input. For example, if the first convolutional layer 32 takes the original image as input, different convolution kernels along the depth dimension (i.e., the different squares in FIG. 3) may activate when edges of various orientations or gray-level blocks appear in the input data.
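- The convolution and pooling steps described above can be sketched as follows. This is a minimal NumPy illustration, not the network of FIG. 3: the 8x8 single-channel input, the 3x3 kernel size, the 2x2 max pooling, and the function names `conv2d_forward` and `max_pool2x2` are all assumptions made for the example. Only the output depth of 4 follows the text.

```python
import numpy as np

def conv2d_forward(x, kernels):
    """Slide each kernel over x and take dot products (no padding, stride 1).

    x:       input of shape (H, W, C)
    kernels: array of shape (K, kh, kw, C); each kernel spans the full depth C
    returns: stacked activation maps of shape (H-kh+1, W-kw+1, K)
    """
    H, W, C = x.shape
    K, kh, kw, kc = kernels.shape
    assert kc == C, "each kernel must extend over the full input depth"
    out = np.zeros((H - kh + 1, W - kw + 1, K))
    for k in range(K):                      # one activation map per kernel
        for i in range(H - kh + 1):
            for j in range(W - kw + 1):
                patch = x[i:i + kh, j:j + kw, :]
                out[i, j, k] = np.sum(patch * kernels[k])
    return out

def max_pool2x2(a):
    """Downsample each activation map with non-overlapping 2x2 max pooling."""
    H, W, K = a.shape
    a = a[:H - H % 2, :W - W % 2, :]        # drop an odd trailing row/column
    return a.reshape(H // 2, 2, W // 2, 2, K).max(axis=(1, 3))

rng = np.random.default_rng(0)
image = rng.random((8, 8, 1))               # hypothetical single-channel scan
kernels = rng.normal(0.0, 0.01, size=(4, 3, 3, 1))  # depth 4, as for layer 32
maps = conv2d_forward(image, kernels)       # shape (6, 6, 4)
pooled = max_pool2x2(maps)                  # shape (3, 3, 4)
```

- Stacking the four activation maps along the last axis gives the layer's output depth of 4; pooling halves the spatial size while leaving the depth unchanged.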
- the training process is a very important part of deep learning.
- a stochastic gradient descent method can be used.
- the Nesterov optimization algorithm can be used to solve.
- the initial learning rate can be set to start at 0.01 and gradually decrease until an optimal value is found.
- a Gaussian random process with a smaller variance can be used to initialize the weight values of the respective convolution kernels.
- The image training set may consist of labeled item images, each annotated with the locations of features in the image.
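- The training settings mentioned above (gradient descent with the Nesterov method, an initial learning rate of 0.01 that gradually decreases, and Gaussian initialization with a small variance) can be sketched as below. The momentum of 0.9, the per-step decay factor of 0.99, the step count, and the toy quadratic loss are assumptions for illustration only; the source does not specify them, and the function name `nesterov_sgd` is hypothetical.

```python
import numpy as np

def nesterov_sgd(grad_fn, w0, lr0=0.01, momentum=0.9, decay=0.99, steps=500):
    """Gradient descent with Nesterov momentum and a decaying learning rate."""
    w = np.array(w0, dtype=float)
    v = np.zeros_like(w)
    lr = lr0
    for _ in range(steps):
        g = grad_fn(w + momentum * v)   # gradient at the look-ahead point
        v = momentum * v - lr * g       # update the velocity
        w = w + v                       # apply the step
        lr *= decay                     # gradually reduce the learning rate
    return w

# Gaussian initialization with a small variance (std 0.01 is an assumed value).
rng = np.random.default_rng(0)
w_init = rng.normal(0.0, 0.01, size=3)

# Toy loss ||w - target||^2, standing in for the real network loss.
target = np.array([1.0, -2.0, 0.5])
w_final = nesterov_sgd(lambda w: 2.0 * (w - target), w_init)
```

- In a real network the gradient would come from backpropagation over mini-batches of labeled images rather than from a closed-form expression.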
- A dense image description is generated for the image using a CNN (convolutional neural network) as shown in FIG. 3, to characterize the items in the image produced by the item scanner.
- a candidate region-based convolutional neural network or a fast candidate region convolutional neural network (Faster-RCNN) extraction method may be employed.
- RCNN: Region-based Convolutional Neural Network
- RNN: Recurrent Neural Network
- The RNN contains input units, whose input set is denoted $\{x_0, x_1, \ldots, x_t, x_{t+1}, \ldots\}$, and output units, whose output set is denoted $\{y_0, y_1, \ldots, y_t, y_{t+1}, \ldots\}$.
- The RNN also contains hidden units, whose output set is denoted $\{s_0, s_1, \ldots, s_t, s_{t+1}, \ldots\}$; these hidden units do the most important work.
- The recurrent neural network can be unrolled into a full neural network.
- For a sentence of five words, for example, the unrolled network is a five-layer neural network, with each layer representing one word.
- A word vector represents a word as a real-valued vector v of a specified length. A one-hot vector can be used to represent a word, that is, to generate a vector of
- At every input step, each layer shares the same parameters U, V, and W. This reflects the fact that every step of the RNN performs the same operation on a different input, which greatly reduces the number of parameters the network has to learn.
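- The parameter sharing described above can be illustrated with a minimal unrolled forward pass. The update rule used here, s_t = tanh(U x_t + W s_{t-1}) with y_t = softmax(V s_t), is the common textbook formulation and is an assumption; the source names the shared parameters U, V, and W but does not give the equations. The vocabulary size, hidden size, and word indices are hypothetical.

```python
import numpy as np

def rnn_forward(xs, U, W, V):
    """Unrolled RNN forward pass; U, W, V are reused at every time step."""
    s = np.zeros(W.shape[0])            # initial hidden state
    states, outputs = [], []
    for x in xs:                        # one step (one "layer") per word
        s = np.tanh(U @ x + W @ s)      # hidden state s_t
        z = V @ s
        y = np.exp(z - z.max())         # numerically stable softmax
        y /= y.sum()
        states.append(s)
        outputs.append(y)
    return np.array(states), np.array(outputs)

vocab, hidden = 6, 4                    # hypothetical sizes
rng = np.random.default_rng(0)
U = rng.normal(0, 0.1, (hidden, vocab))
W = rng.normal(0, 0.1, (hidden, hidden))
V = rng.normal(0, 0.1, (vocab, hidden))

sentence = [0, 3, 5, 1, 2]              # five hypothetical word indices
xs = [np.eye(vocab)[i] for i in sentence]   # one-hot word vectors
states, outputs = rnn_forward(xs, U, W, V)
```

- The five-word input yields five hidden states, matching the five-layer unrolled view, while the number of learned parameters is independent of sentence length.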
- The Bidirectional RNN improves on the RNN in that the current output (the output of step t) depends not only on the preceding sequence but also on the following sequence. For example, predicting a missing word in a sentence requires the context on both sides.
- The Bidirectional RNN is a relatively simple RNN consisting of two RNNs stacked one on top of the other. The output is determined by the hidden-layer states of both RNNs.
- A bidirectional RNN (BRNN) is employed in an embodiment of the present disclosure.
- The BRNN takes as input a sequence of n words, each one-hot encoded, and converts each word into a fixed h-dimensional vector.
- Because a variable-length context around each word is used, the representation of the word is enriched.
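- A minimal sketch of such a bidirectional encoder follows. How the forward and backward hidden states are combined (here, a tanh over a projection of their sum) is an assumption, as are the parameter names in `params`, the vocabulary size, and h = 4; the source states only that each one-hot word becomes an h-dimensional vector informed by context on both sides.

```python
import numpy as np

def run_direction(xs, Wx, Wh):
    """Run a simple RNN over xs and return the hidden state at each position."""
    h = np.zeros(Wh.shape[0])
    out = []
    for x in xs:
        h = np.tanh(Wx @ x + Wh @ h)
        out.append(h)
    return out

def brnn_encode(word_ids, vocab_size, params):
    """Encode n one-hot words into n h-dimensional context-aware vectors."""
    xs = [np.eye(vocab_size)[i] for i in word_ids]
    fwd = run_direction(xs, params["Wx_f"], params["Wh_f"])
    # The backward RNN reads the sentence right to left; re-reverse to align.
    bwd = run_direction(xs[::-1], params["Wx_b"], params["Wh_b"])[::-1]
    return np.array([np.tanh(params["Wo"] @ (f + b)) for f, b in zip(fwd, bwd)])

rng = np.random.default_rng(0)
vocab_size, h = 8, 4                    # hypothetical sizes
params = {
    "Wx_f": rng.normal(0, 0.5, (h, vocab_size)),
    "Wh_f": rng.normal(0, 0.5, (h, h)),
    "Wx_b": rng.normal(0, 0.5, (h, vocab_size)),
    "Wh_b": rng.normal(0, 0.5, (h, h)),
    "Wo": rng.normal(0, 0.5, (h, h)),
}
encoded = brnn_encode([0, 3, 5], vocab_size, params)   # shape (3, 4)
```

- Because of the backward pass, changing the last word also changes the vector of the first word, which is the context dependence the text describes.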
- Those skilled in the art will appreciate that other artificial neural networks may be employed to identify and learn features in the manifest information. Embodiments of the present disclosure are not limited thereto.
- The inspection process involves three parts: 1) establishing a database of scanned images and the corresponding semantic information from customs declarations; 2) establishing an image model of the luggage items and a corresponding semantic model; and 3) establishing an image-manifest comparison model to perform intelligent image-manifest verification.
- Establishing the database of scanned images and corresponding customs declaration semantic information comprises three parts: image acquisition, image preprocessing, and sample preprocessing.
- The establishment process is as follows.
- (1) Image acquisition. A considerable number of images of items scanned by item scanners are collected, so that the image database contains images of a wide variety of items. Note that these images include both normal items and prohibited items.
- (2) Image preprocessing. The scanned images carry noise and therefore need to be preprocessed. Because binary images have a uniform physical resolution and can conveniently be combined with a variety of algorithms, the present application uses binary images. After the binary image is obtained, its mean gray value is calculated. To ensure the generalization performance of the model, the mean over all images is subtracted from each image, and the result is used as the image input of the model.
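- The image preprocessing step can be sketched as follows. The binarization threshold of 128, the use of a single scalar mean over all binarized images, and the function names are assumptions; the source says only that the images are binarized and that the mean is subtracted.

```python
import numpy as np

def binarize(image, threshold=128):
    """Map gray values to 0.0 / 1.0; the threshold is an assumed value."""
    return (np.asarray(image) >= threshold).astype(np.float32)

def preprocess(images, threshold=128):
    """Binarize every image, then subtract the mean over all binarized images."""
    binary = np.stack([binarize(im, threshold) for im in images])
    mean = binary.mean()                # scalar mean over the whole set
    return binary - mean, mean

rng = np.random.default_rng(0)
scans = rng.integers(0, 256, size=(3, 4, 4))   # three hypothetical 4x4 scans
centered, mean = preprocess(scans)
```

- The same mean would need to be stored and reused at inspection time so that new scans are shifted consistently with the training data.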
- (3) Sample preprocessing. Samples are annotated using the semantic information of the customs declaration.
- The image and its semantic information form a complete information pair to facilitate network training.
- The semantic information here also includes the item information described on the express delivery waybill.
- Non-key words in the item description are removed, for example non-item words such as "one" or "one type".
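- Removing non-item words from a manifest description might look like the sketch below. The word list `NON_ITEM_WORDS`, the whitespace tokenization, and the English example sentence are all hypothetical; real manifests would need language-appropriate word segmentation.

```python
# Hypothetical list of non-item words; a real list would be built from the corpus.
NON_ITEM_WORDS = {"one", "a", "an", "type", "kind", "of", "piece", "box"}

def extract_item_keywords(description):
    """Keep only tokens that are neither listed non-item words nor plain numbers."""
    tokens = description.lower().replace(",", " ").split()
    return [t for t in tokens if t not in NON_ITEM_WORDS and not t.isdigit()]

keywords = extract_item_keywords("One box of cotton shirts, 2 leather belts")
# keywords == ["cotton", "shirts", "leather", "belts"]
```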
- The process of establishing the image-manifest comparison model is shown in FIG. 5.
- The intrinsic correspondence between language and visual data is learned from a data set composed of pictures and their corresponding sentence descriptions.
- The method of the present disclosure is based on a new way of combining models, and aligns the inputs of the two modalities through a multimodal encoding model trained with a structured objective.
- The image features and semantic features are aligned by the multimodal encoding model, and the image-manifest comparison model, that is, RCNN 52 and BRNN 53, is finally obtained by training with stochastic gradient descent, as shown in FIG
- If the result of the calculation is that the image corresponds to the semantic information of the declared item, the item automatically passes through the item scanner.
- If the result of the calculation is that the image does not correspond to the semantic information of the declared item, the item scanner issues a warning, notifying staff of the anomaly so that they can handle it accordingly.
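- The pass-or-warn decision can be sketched with the distance test stated earlier for the first vector (semantic description of the image) and the second vector (semantic feature of the manifest). Euclidean distance, the threshold of 0.5, and the function name `allow_pass` are assumptions; the source specifies only "a distance" and "a threshold".

```python
import numpy as np

def allow_pass(image_vec, manifest_vec, threshold):
    """Return (decision, distance); pass only if the vectors are close enough."""
    diff = np.asarray(image_vec, dtype=float) - np.asarray(manifest_vec, dtype=float)
    distance = float(np.linalg.norm(diff))   # Euclidean distance (assumed metric)
    return distance < threshold, distance

ok, d = allow_pass([1.0, 0.0], [1.0, 0.1], threshold=0.5)       # consistent pair
alarm, d2 = allow_pass([1.0, 0.0], [0.0, 1.0], threshold=0.5)   # mismatched pair
```

- In practice the threshold would be tuned on held-out data to balance missed detections against false alarms.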
- The model should not depend on assumptions such as special hard-coded models, rules, or categories; it should learn only from the training corpus.
- The language description of the image is treated as a weak annotation.
- Contiguous segments of words in these sentences correspond to specific but unknown locations in the image.
- These "alignments" are inferred using neural networks 52 and 53 and applied to learning a description generation model.
- The present disclosure employs a deep neural network model that can infer the latent alignment between sentence fragments and the picture regions they describe. The model links the two modalities through a common multimodal encoding space and a structured objective.
- A multimodal recurrent neural network architecture is adopted: an image is input and a corresponding text description is generated; keywords of the generated description are matched against the annotation information of the image, and the similarity between the generated description and the annotation information is determined.
- The generated text descriptions are significantly better than those produced by retrieval-based methods.
- The model was trained and tested on a new, locally annotated data set.
- the image features and semantic features are aligned by a multi-mode coding model, and the graph comparison model is finally obtained by random gradient descent training.
- each of the small item machine scan pictures and corresponding description statements can be converted into a common h-dimensional vector set by the established RCNN 52 and BRNN 53, respectively.
- the supervised corpus is granular across the entire picture and the entire statement, but (picture-statement) can be seen as a function of (region-word) scoring.
- a (picture-statement) combination if one of its words can find sufficient object or attribute support in the picture, then they should get a higher matching score.
- a plurality of regional features included in a sample image are associated with a plurality of words included in manifest information of the sample picture. For example, a dot product between a feature vector representing a region feature and a semantic vector representing a word is used as a similarity between the region feature and the word, and a plurality of region features of the sample image and a plurality of the manifest information thereof are utilized. The weighted sum of similarities between words is used as the similarity between the sample image and its manifest information.
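The region-word scoring described above can be illustrated with a short sketch. It is not the disclosed implementation: the function names are hypothetical, the vectors are toy values, and because the text does not say how the weights of the weighted sum are chosen, uniform weights are assumed when none are supplied.

```python
from typing import List, Optional, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Dot product of two vectors in the shared h-dimensional space."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(u, v))


def region_word_similarities(regions: Sequence[Vector],
                             words: Sequence[Vector]) -> List[List[float]]:
    """sims[i][j] is the dot product of region feature i and word vector j."""
    return [[dot(r, w) for w in words] for r in regions]


def image_manifest_similarity(regions: Sequence[Vector],
                              words: Sequence[Vector],
                              weights: Optional[List[List[float]]] = None) -> float:
    """Weighted sum of all region-word similarities.

    ASSUMPTION: uniform weights (a plain average) are used when the caller
    does not provide a weight matrix; the source leaves the weights open.
    """
    if not regions or not words:
        raise ValueError("need at least one region and one word")
    sims = region_word_similarities(regions, words)
    if weights is None:
        uniform = 1.0 / (len(regions) * len(words))
        weights = [[uniform] * len(words) for _ in regions]
    return sum(weights[i][j] * sims[i][j]
               for i in range(len(regions))
               for j in range(len(words)))


# Toy data: two region features and two word vectors with h = 2.
regions = [[1.0, 0.0], [0.0, 1.0]]
words = [[1.0, 0.0], [0.0, 2.0]]
score = image_manifest_similarity(regions, words)
```

A caller who has learned or hand-tuned weights can pass them as a matrix with one row per region and one column per word.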
- In step S61, the labeled images prepared in advance are input to the convolutional neural network 30 for training.
- A trained image-semantic model, that is, a preliminarily trained convolutional neural network, is obtained in step S62.
- A test picture is input to test the network in step S63, and the prediction score and its difference from the label are calculated in step S64; for example, the test picture is input into the neural network to obtain a predicted semantic description and the difference between that description and the label.
- In step S65, it is judged whether the difference is smaller than a threshold. If the difference is larger than the threshold, the network parameters are updated in step S66 to adjust the network. If the difference is smaller than the threshold, the network model is established in step S67, that is, the training of the network ends.
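The test, compare, and update loop of steps S63 to S67 can be sketched as follows. This is only an illustration under strong assumptions: a single scalar parameter stands in for the network weights, the "difference" is taken to be a mean squared error, and plain full-batch gradient descent replaces whatever optimizer the real network uses. All names and default values are hypothetical.

```python
from typing import List, Tuple


def train_until_threshold(samples: List[Tuple[float, float]],
                          threshold: float = 1e-3,
                          learning_rate: float = 0.1,
                          max_iters: int = 10_000) -> Tuple[float, float, int]:
    """Fit y = w * x and stop once the difference drops below the threshold.

    Returns (w, final_difference, number_of_updates).
    """
    w = 0.0  # toy stand-in for the network parameters
    n = len(samples)
    for updates in range(max_iters):
        # S63/S64: run the model on the samples and measure the difference
        # between predictions and labels (here: mean squared error).
        diff = sum((w * x - y) ** 2 for x, y in samples) / n
        # S65: compare the difference with the threshold.
        if diff < threshold:
            return w, diff, updates  # S67: model established, training ends
        # S66: update the parameter by one gradient descent step.
        grad = sum(2.0 * (w * x - y) * x for x, y in samples) / n
        w -= learning_rate * grad
    raise RuntimeError("difference did not fall below the threshold")


# Toy labeled data generated by y = 2 * x.
w, diff, updates = train_until_threshold([(1.0, 2.0), (2.0, 4.0)])
```

The `max_iters` guard is an addition for safety; the flowchart itself simply loops until the threshold test passes.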
- FIG. 7 is a flowchart of a process of performing a security check on an object to be inspected using an inspection method according to an embodiment of the present disclosure.
- In step S71, the inspected object 14 is scanned using the inspection apparatus shown in FIG. 1 to obtain a transmission image of the inspected object.
- In step S74, the manifest information of the inspected object is input to the data processing computer 18 by manual entry, a barcode scanner, or the like.
- The transmission image is processed in step S72 using a first neural network, such as a convolutional neural network or an RCNN, and a semantic description of the inspected object is obtained in step S73.
- In step S75, the text information of the manifest is processed by a bidirectional recurrent neural network to obtain the semantic features of the inspected object.
- In step S76, it is judged whether the semantic features are consistent with the semantic description obtained from the image. If not, an alarm is issued in step S77. If they are consistent, the inspected object is allowed to pass in step S78.
- For example, a distance between a first vector representing the semantic description and a second vector representing the semantic features may be calculated, and the inspected object is allowed to pass if the calculated distance is less than a threshold.
- The distance between the two vectors can be expressed as the sum of the absolute values of the element-wise differences between the two vectors, or as the Euclidean distance between them. Embodiments of the present disclosure are not limited thereto.
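The pass/alarm decision of steps S76 to S78 reduces to a distance test between two vectors. The sketch below shows both distance measures named above; the function names, the example vectors, and the threshold value are assumptions made for illustration only.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def l1_distance(u: Vector, v: Vector) -> float:
    """Sum of the absolute values of the element-wise differences."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return sum(abs(a - b) for a, b in zip(u, v))


def euclidean_distance(u: Vector, v: Vector) -> float:
    """Square root of the sum of squared element-wise differences."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def allow_pass(description_vec: Vector,
               manifest_vec: Vector,
               threshold: float,
               metric: Callable[[Vector, Vector], float] = euclidean_distance) -> bool:
    """True (S78, pass) if the distance is below the threshold, else False (S77, alarm)."""
    return metric(description_vec, manifest_vec) < threshold


# Hypothetical vectors: the first pair nearly agrees, the second does not.
consistent = allow_pass([1.0, 2.0], [1.0, 2.1], threshold=0.5)
mismatch = allow_pass([1.0, 2.0], [4.0, 6.0], threshold=0.5)
```

In practice the threshold would have to be tuned on validation data, since the two metrics produce values on different scales.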
- The present disclosure enables intelligent checking of whether the declared items are consistent with the actual items. On the one hand, work efficiency can be greatly improved, achieving "pass-through"; on the other hand, the side effects of various subjective factors can be reduced, achieving "management". It is an important means of intelligent security inspection and has large market potential.
- Aspects of the embodiments disclosed herein may be implemented, in whole or in part, in integrated circuits, as one or more computer programs running on one or more computers (e.g., as one or more programs running on one or more computer systems), as one or more programs running on one or more processors (e.g., as one or more programs running on one or more microprocessors), as firmware, or as substantially any combination of the above, and those skilled in the art, in light of the present disclosure, will be capable of designing the circuitry and/or writing the software and/or firmware code.
- Signal-bearing media include, but are not limited to, recordable media such as floppy disks, hard drives, compact discs (CDs), digital versatile discs (DVDs), digital tapes, and computer memories; and transmission-type media such as digital and/or analog communication media (e.g., fiber optic cables, waveguides, wired communication links, wireless communication links, etc.).
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Analytical Chemistry (AREA)
- Pathology (AREA)
- Immunology (AREA)
- Biochemistry (AREA)
- Chemical & Material Sciences (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- General Engineering & Computer Science (AREA)
- Medical Informatics (AREA)
- General Life Sciences & Earth Sciences (AREA)
- Quality & Reliability (AREA)
- High Energy & Nuclear Physics (AREA)
- Geophysics (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- Image Analysis (AREA)
- Analysing Materials By The Use Of Radiation (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Description
Claims (15)
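- An inspection method, comprising the steps of: scanning an inspected object with X-rays to obtain an image of the inspected object; processing the image using a first neural network to obtain a semantic description of the inspected object; reading text information of a manifest of the inspected object; processing the text information of the manifest of the inspected object using a second neural network to obtain semantic features of the inspected object; and determining, based on the semantic description and the semantic features, whether the inspected object is allowed to pass.
- The inspection method according to claim 1, wherein the first neural network is a convolutional neural network, a candidate-region-based convolutional neural network, or a fast candidate-region-based convolutional neural network, and the second neural network is a recurrent neural network or a bidirectional recurrent neural network.
- The inspection method according to claim 1, wherein the first neural network is trained using a pre-established set of image-semantic pairs.
- The inspection method according to claim 1, further comprising, before processing the image using the first neural network, the steps of: binarizing the image of the inspected object; calculating an average value of the binarized image; and subtracting the average value from each pixel value of the binarized image.
- The inspection method according to claim 1, wherein the step of determining, based on the semantic description and the semantic features, whether the inspected object is allowed to pass comprises: calculating a distance between a first vector representing the semantic description and a second vector representing the semantic features; and allowing the inspected object to pass if the calculated distance is less than a threshold.
- The inspection method according to claim 1, wherein, during training of the first neural network, a correspondence is established between a plurality of region features contained in a sample image and a plurality of words included in manifest information of the sample image.
- The inspection method according to claim 6, wherein a dot product between a feature vector representing the region feature and a semantic vector representing the word is used as a similarity between the region feature and the word, and a weighted sum of the similarities between the plurality of region features of the sample image and the plurality of words included in its manifest information is used as a similarity between the sample image and its manifest information.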
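- An inspection device, comprising: a scanning apparatus that scans an inspected object with X-rays to obtain a scanned image; an input apparatus that inputs manifest information of the inspected object; and a processor configured to: process the image using a first neural network to obtain a semantic description of the inspected object; process text information of the manifest of the inspected object using a second neural network to obtain semantic features of the inspected object; and determine, based on the semantic description and the semantic features, whether the inspected object is allowed to pass.
- The inspection device according to claim 8, wherein the first neural network is a convolutional neural network, a candidate-region-based convolutional neural network, or a fast candidate-region-based convolutional neural network, and the second neural network is a recurrent neural network or a bidirectional recurrent neural network.
- The inspection device according to claim 8, wherein the first neural network is trained using a pre-established set of image-semantic pairs.
- The inspection device according to claim 8, wherein the processor is further configured to, before processing the image using the first neural network: binarize the image of the inspected object; calculate an average value of the binarized image; and subtract the average value from each pixel value of the binarized image.
- The inspection device according to claim 8, wherein the processor is further configured to: calculate a distance between a first vector representing the semantic description and a second vector representing the semantic features; and allow the inspected object to pass if the calculated distance is less than a threshold.
- The inspection device according to claim 8, wherein the processor is configured to establish, during training of the first neural network, a correspondence between a plurality of region features contained in a sample image and a plurality of words included in manifest information of the sample image.
- The inspection device according to claim 13, wherein a dot product between a feature vector representing the region feature and a semantic vector representing the word is used as a similarity between the region feature and the word, and a weighted sum of the similarities between the plurality of region features of the sample image and the plurality of words included in its manifest information is used as a similarity between the sample image and its manifest information.
- A computer-readable medium storing a computer program that, when executed by a processor, implements the following steps: processing an X-ray image of an inspected object using a first neural network to obtain a semantic description of the inspected object; processing text information of a manifest of the inspected object using a second neural network to obtain semantic features of the inspected object; and determining, based on the semantic description and the semantic features, whether the inspected object is allowed to pass.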
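The image preprocessing recited in the claims (binarize the image, compute the average of the binarized image, subtract that average from every pixel) can be sketched as below. The claims do not state a binarization threshold or a pixel range, so the cut-off of 128 on an 8-bit grayscale image is an assumption, as is the function name.

```python
from typing import List

Image = List[List[float]]


def preprocess(image: Image, binarize_threshold: float = 128.0) -> Image:
    """Binarize, then subtract the mean of the binarized image from each pixel.

    ASSUMPTION: pixels at or above `binarize_threshold` map to 1.0 and all
    others to 0.0; the source does not specify the binarization rule.
    """
    binary = [[1.0 if p >= binarize_threshold else 0.0 for p in row]
              for row in image]
    pixel_count = sum(len(row) for row in binary)
    if pixel_count == 0:
        raise ValueError("image must contain at least one pixel")
    mean = sum(sum(row) for row in binary) / pixel_count
    return [[p - mean for p in row] for row in binary]


# Toy 2x2 grayscale image with values in 0..255.
centered = preprocess([[0, 255], [200, 10]])
```

After this step the pixel values sum to zero, which is the usual motivation for mean subtraction before feeding an image to a neural network.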
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR1020197034843A KR102240058B1 (ko) | 2017-09-18 | 2018-09-17 | Inspection method and inspection device and computer-readable medium |
| EP18856136.9A EP3699579B1 (en) | 2017-09-18 | 2018-09-17 | Inspection method and inspection device and computer-readable medium |
| JP2019565493A JP7678661B2 (ja) | 2017-09-18 | 2018-09-17 | Inspection method, inspection device, and computer-readable medium |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201710845577.6 | 2017-09-18 | ||
| CN201710845577.6A CN109522913B (zh) | 2017-09-18 | 2017-09-18 | Inspection method and inspection device and computer-readable medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2019052561A1 (zh) | 2019-03-21 |
Family
ID=65722436
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2018/106021 Ceased WO2019052561A1 (zh) | Inspection method and inspection device and computer-readable medium | 2017-09-18 | 2018-09-17 |
Country Status (5)
| Country | Link |
|---|---|
| EP (1) | EP3699579B1 (zh) |
| JP (1) | JP7678661B2 (zh) |
| KR (1) | KR102240058B1 (zh) |
| CN (1) | CN109522913B (zh) |
| WO (1) | WO2019052561A1 (zh) |
Cited By (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
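| CN110674292A (zh) * | 2019-08-27 | 2020-01-10 | 腾讯科技(深圳)有限公司 | Human-computer interaction method, apparatus, device, and medium |
| CN111192252A (zh) * | 2019-12-30 | 2020-05-22 | 深圳大学 | Image segmentation result optimization method, apparatus, intelligent terminal, and storage medium |
| CN112633652A (zh) * | 2020-12-15 | 2021-04-09 | 北京交通大学 | Logistics security inspection method based on adaptive semantic risk recognition |
| CN112860889A (zh) * | 2021-01-29 | 2021-05-28 | 太原理工大学 | BERT-based multi-label classification method |
| CN113093308A (zh) * | 2021-05-08 | 2021-07-09 | 佳都科技集团股份有限公司 | Correction method, apparatus, device, and storage medium for X-ray baggage inspection equipment |
| CN116046810A (zh) * | 2023-04-03 | 2023-05-02 | 云南通衢工程检测有限公司 | Non-destructive testing method based on the failure load of RPC cover plates |
| CN117150097A (zh) * | 2023-08-31 | 2023-12-01 | 应急管理部大数据中心 | Automatic matching method for law enforcement inspection checklists |
| CN118887609A (zh) * | 2024-07-09 | 2024-11-01 | 内蒙古电力(集团)有限责任公司蒙电项目建管分公司 | Method and system for safety monitoring of park construction materials with real-time video stream processing |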
Families Citing this family (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
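| CN113836998A (zh) * | 2021-08-16 | 2021-12-24 | 北京澎湃信用管理有限公司 | Information processing apparatus and method for analyzing items to be inspected |
| CN116092096A (zh) * | 2021-11-05 | 2023-05-09 | 同方威视技术股份有限公司 | Method, system, device, and medium for verifying the authenticity of declaration information |
| CN115730878B (zh) * | 2022-12-15 | 2024-01-12 | 广东省电子口岸管理有限公司 | Cargo import and export inspection management method based on data recognition |
| CN118279578B (zh) * | 2022-12-30 | 2025-05-27 | 同方威视科技江苏有限公司 | CT image processing method and apparatus, and international express parcel inspection method and apparatus |
| JP2024118842A (ja) * | 2023-02-21 | 2024-09-02 | 株式会社東芝 | Information processing apparatus and program |
| KR102808878B1 (ko) * | 2023-11-27 | 2025-05-19 | 주식회사 딥노이드 | Apparatus for matching radiographic images of cargo with text information using a large multimodal model, and method therefor |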
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20110206240A1 (en) * | 2008-10-30 | 2011-08-25 | Baoming Hong | Detecting concealed threats |
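| CN103917862A (zh) * | 2011-09-07 | 2014-07-09 | 拉皮斯坎系统股份有限公司 | X-ray inspection system integrating manifest data with imaging/detection processing |
| CN104165896A (zh) * | 2014-08-18 | 2014-11-26 | 公安部第一研究所 | Method and apparatus for security inspection of liquid articles |
| CN106706677A (zh) * | 2015-11-18 | 2017-05-24 | 同方威视技术股份有限公司 | Method and system for inspecting cargo |
| CN107145910A (zh) * | 2017-05-08 | 2017-09-08 | 京东方科技集团股份有限公司 | Presentation generation system for medical images, training method thereof, and presentation generation method |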
Family Cites Families (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH07160665A (ja) * | 1993-12-13 | 1995-06-23 | Fujitsu Ltd | Object identification device |
| US6765981B2 (en) | 2002-07-31 | 2004-07-20 | Agilent Technologies, Inc. | Computed tomography |
| JP2009075744A (ja) * | 2007-09-19 | 2009-04-09 | Spirit21:Kk | Management system for board-mounted components |
| PT2639749T | 2012-03-15 | 2017-01-18 | Cortical Io Gmbh | Methods, apparatus and products for semantic processing of text |
| CN103901489B (zh) * | 2012-12-27 | 2017-07-21 | 清华大学 | Method for inspecting an object, display method, and device |
| CN105808555B (zh) | 2014-12-30 | 2019-07-26 | 清华大学 | Method and system for inspecting cargo |
| US10423874B2 (en) * | 2015-10-02 | 2019-09-24 | Baidu Usa Llc | Intelligent image captioning |
| CN106845499A (zh) * | 2017-01-19 | 2017-06-13 | 清华大学 | Image object detection method based on natural language semantics |
| CN108734183A (zh) * | 2017-04-14 | 2018-11-02 | 清华大学 | Inspection method and inspection device |
2017
- 2017-09-18: CN application CN201710845577.6A, patent CN109522913B (zh), status: Active
2018
- 2018-09-17: WO application PCT/CN2018/106021, patent WO2019052561A1 (zh), status: Ceased
- 2018-09-17: EP application EP18856136.9A, patent EP3699579B1 (en), status: Active
- 2018-09-17: KR application KR1020197034843A, patent KR102240058B1 (ko), status: Active
- 2018-09-17: JP application JP2019565493A, patent JP7678661B2 (ja), status: Active
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP3699579A4 * |
Cited By (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
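| CN110674292A (zh) * | 2019-08-27 | 2020-01-10 | 腾讯科技(深圳)有限公司 | Human-computer interaction method, apparatus, device, and medium |
| CN111192252A (zh) * | 2019-12-30 | 2020-05-22 | 深圳大学 | Image segmentation result optimization method, apparatus, intelligent terminal, and storage medium |
| CN111192252B (zh) * | 2019-12-30 | 2023-03-31 | 深圳大学 | Image segmentation result optimization method, apparatus, intelligent terminal, and storage medium |
| CN112633652A (zh) * | 2020-12-15 | 2021-04-09 | 北京交通大学 | Logistics security inspection method based on adaptive semantic risk recognition |
| CN112633652B (zh) * | 2020-12-15 | 2023-09-29 | 北京交通大学 | Logistics security inspection method based on adaptive semantic risk recognition |
| CN112860889A (zh) * | 2021-01-29 | 2021-05-28 | 太原理工大学 | BERT-based multi-label classification method |
| CN113093308A (zh) * | 2021-05-08 | 2021-07-09 | 佳都科技集团股份有限公司 | Correction method, apparatus, device, and storage medium for X-ray baggage inspection equipment |
| CN113093308B (zh) * | 2021-05-08 | 2024-04-16 | 佳都科技集团股份有限公司 | Correction method, apparatus, device, and storage medium for X-ray baggage inspection equipment |
| CN116046810A (zh) * | 2023-04-03 | 2023-05-02 | 云南通衢工程检测有限公司 | Non-destructive testing method based on the failure load of RPC cover plates |
| CN117150097A (zh) * | 2023-08-31 | 2023-12-01 | 应急管理部大数据中心 | Automatic matching method for law enforcement inspection checklists |
| CN117150097B (zh) * | 2023-08-31 | 2024-03-01 | 应急管理部大数据中心 | Automatic matching method for law enforcement inspection checklists |
| CN118887609A (zh) * | 2024-07-09 | 2024-11-01 | 内蒙古电力(集团)有限责任公司蒙电项目建管分公司 | Method and system for safety monitoring of park construction materials with real-time video stream processing |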
Also Published As
| Publication number | Publication date |
|---|---|
| CN109522913B (zh) | 2022-07-19 |
| EP3699579A4 (en) | 2021-05-19 |
| JP7678661B2 (ja) | 2025-05-16 |
| EP3699579B1 (en) | 2023-08-02 |
| JP2020534508A (ja) | 2020-11-26 |
| KR20200003011A (ko) | 2020-01-08 |
| EP3699579A1 (en) | 2020-08-26 |
| CN109522913A (zh) | 2019-03-26 |
| KR102240058B1 (ko) | 2021-04-15 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN109522913B (zh) | Inspection method and inspection device and computer-readable medium | |
| US10013615B2 (en) | Inspection methods and devices | |
| Zhong et al. | Multi-class geospatial object detection based on a position-sensitive balancing framework for high spatial resolution remote sensing imagery | |
| Shi et al. | Point-gnn: Graph neural network for 3d object detection in a point cloud | |
| EP3349050B1 (en) | Inspection devices and methods for detecting a firearm | |
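| CN105808555B (zh) | Method and system for inspecting cargo | |
| CN105809091B (zh) | Inspection method and system | |
| JP2017097853A (ja) | Cargo inspection method and system therefor | |
| CN108303748A (zh) | Inspection device and method for detecting firearms in luggage items | |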
| Li et al. | Tunnel crack detection using coarse‐to‐fine region localization and edge detection | |
| EP3349049B1 (en) | Inspection devices and methods for inspecting a container | |
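| KR102283197B1 (ko) | Method and device for determining the type of a product | |
| CN109557114B (zh) | Inspection method and inspection device and computer-readable medium | |
| CN114462487A (zh) | Object detection network training and detection method, apparatus, terminal, and storage medium | |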
| Jha | E-commerce product image classification using transfer learning | |
| CN113762029A (zh) | Dangerous goods identification method, apparatus, device, and storage medium | |
| Liu et al. | A Lightweight Dangerous Liquid Detection Method Based on Depthwise Separable Convolution for X‐Ray Security Inspection | |
| NL2034690B1 (en) | Method and apparatus of training radiation image recognition model online, and method and apparatus of recognizing radiation image | |
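| HK40004246A (zh) | Inspection method and inspection device and computer-readable medium | |
| HK40004246B (zh) | Inspection method and inspection device and computer-readable medium | |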
| Chopade et al. | Single shot detector application for image disease localization | |
| Paranjape et al. | Segmentation of handguns in dual energy X-ray imagery of passenger carry-on baggage | |
| KR102934312B1 (ko) | Apparatus for integrating and providing heterogeneous information in a customs clearance system, and method therefor | |
| Ansary | Transfer Learning With CLIP for Intelligent Concrete Crack Detection in Structural Health Monitoring | |
| Cao | Research on The Vehicle Detection Technology Based on The Yolo Model |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 18856136; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 20197034843; Country of ref document: KR; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 2019565493; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2018856136; Country of ref document: EP; Effective date: 20200420 |