Summary of the invention
The inventor has recognized that current deep-learning-based detection models generalize relatively poorly across different application scenes. As a result, in complex scenes a detection model can produce a large number of false detections, which lowers detection accuracy.
One technical problem to be solved by embodiments of the invention is how to improve the accuracy of an object detection method.
According to a first aspect of some embodiments of the invention, an object detection method is provided, comprising: inputting a sequence to be detected, which includes multiple frames of images, into a target detection model to obtain multiple images with detection boxes output by the target detection model; determining the pixels with motion optical flow in each detection box; retaining some or all of the detection boxes according to the number of pixels with motion optical flow in each detection box; and determining the objects in the retained detection boxes to be the target objects in the corresponding images.
In some embodiments, determining the pixels with motion optical flow in a detection box includes: determining the displacement of a pixel from the position of the pixel in the detection box of one frame and the position of the same pixel in the previous frame; and, when the displacement of the pixel is greater than a displacement threshold, determining that the pixel has motion optical flow.
In some embodiments, the object detection method further includes: detecting feature points in the detection box of an image, so as to calculate the displacement of each feature point in the detection box relative to the same feature point in the previous frame of the image; and, when the displacement of a feature point is greater than the displacement threshold, determining that the pixel corresponding to the feature point has motion optical flow.
In some embodiments, retaining some or all of the detection boxes according to the number of pixels with motion optical flow includes: deleting a detection box of an image when the number of pixels with motion optical flow in the detection box is less than a preset threshold and, in several frames preceding the image, the number of pixels with motion optical flow within the region corresponding to the detection box is also less than the preset threshold.
In some embodiments, the target detection model is a neural network model, and the object detection method further includes: training the neural network model with training images to obtain the target detection model, wherein the training images include positive sample images and negative sample images, each positive sample image carries labeled position information of a target object, no negative sample image contains a target object, and the negative sample images include images taken from detection boxes that the target detection model misidentified.
In some embodiments, the object detection method further includes: generating virtual images from acquired real images using a generative adversarial network; and using the virtual images as training images.
In some embodiments, the object detection method further includes: in response to the cabinet door of a vending device being opened, capturing a video or continuously capturing multiple images as the sequence to be detected, so as to detect the target object in the images, wherein the target object is an article being taken; and recognizing the image of the target object to determine the identifier of the article taken.
According to a second aspect of some embodiments of the invention, a target object detection device is provided, comprising: a detection box output module configured to input a sequence to be detected, which includes multiple frames of images, into a target detection model and obtain multiple images with detection boxes output by the target detection model; a motion optical flow determining module configured to determine the pixels with motion optical flow in each detection box; a detection box screening module configured to retain some or all of the detection boxes according to the number of pixels with motion optical flow in each detection box; and a target object determining module configured to determine the objects in the retained detection boxes to be the target objects in the corresponding images.
According to a third aspect of some embodiments of the invention, a target object detection device is provided, comprising: a memory; and a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, a target object detection method comprising the following operations: inputting a sequence to be detected, which includes multiple frames of images, into a target detection model to obtain multiple images with detection boxes output by the target detection model; determining the pixels with motion optical flow in each detection box; retaining some or all of the detection boxes according to the number of pixels with motion optical flow in each detection box; and determining the objects in the retained detection boxes to be the target objects in the corresponding images.
In some embodiments, determining the pixels with motion optical flow in a detection box includes: determining the displacement of a pixel from the position of the pixel in the detection box of one frame and the position of the same pixel in the previous frame; and, when the displacement of the pixel is greater than a displacement threshold, determining that the pixel has motion optical flow.
In some embodiments, the operations further include: detecting feature points in the detection box of an image, so as to calculate the displacement of each feature point in the detection box relative to the same feature point in the previous frame of the image; and, when the displacement of a feature point is greater than the displacement threshold, determining that the pixel corresponding to the feature point has motion optical flow.
In some embodiments, retaining some or all of the detection boxes according to the number of pixels with motion optical flow includes: deleting a detection box of an image when the number of pixels with motion optical flow in the detection box is less than a preset threshold and, in several frames preceding the image, the number of pixels with motion optical flow within the region corresponding to the detection box is also less than the preset threshold.
In some embodiments, the target detection model is a neural network model, and the operations further include: training the neural network model with training images to obtain the target detection model, wherein the training images include positive sample images and negative sample images, each positive sample image carries labeled position information of a target object, no negative sample image contains a target object, and the negative sample images include images taken from detection boxes that the target detection model misidentified.
In some embodiments, the operations further include: generating virtual images from acquired real images using a generative adversarial network; and using the virtual images as training images.
According to a fourth aspect of some embodiments of the invention, a target object detection system is provided, comprising: any one of the aforementioned target object detection devices, configured to input a sequence to be detected, which includes multiple frames of images, into a target detection model to obtain multiple images with detection boxes output by the target detection model, determine the pixels with motion optical flow in each detection box, retain some or all of the detection boxes according to the number of pixels with motion optical flow in each detection box, and determine the objects in the retained detection boxes to be the target objects in the corresponding images; and an imaging device configured to capture the sequence to be detected that includes multiple frames of images.
In some embodiments, the target object detection system further includes a vending device. The imaging device is located at the vending device and is further configured to, in response to the cabinet door of the vending device being opened, capture a video or continuously capture multiple images as the sequence to be detected. The target object detection device is further configured to detect the target object in the images and to recognize the image of the target object to determine the identifier of the article taken, wherein the target object is the article being taken.
According to a fifth aspect of some embodiments of the invention, a computer-readable storage medium is provided, on which a computer program is stored, wherein the program, when executed by a processor, implements any one of the aforementioned target object detection methods.
Some embodiments of the invention have the following advantages or beneficial effects: an embodiment can use the target detection model to identify, based on static image features, the target objects that may be present in a single frame, and then use motion optical flow, which is based on dynamic inter-frame features, to screen those candidate target objects and obtain the detection result. Even in complex scenes, embodiments of the invention can detect by means of this secondary screening, which improves the accuracy of target detection.
Other features and advantages of the invention will become apparent from the following detailed description of exemplary embodiments with reference to the drawings.
Detailed description
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings of the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. The following description of at least one exemplary embodiment is merely illustrative and in no way limits the invention, its application, or its use. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the invention without creative effort fall within the protection scope of the invention.
Unless specifically stated otherwise, the relative arrangement of the components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the invention.
Fig. 1 is a flow diagram of an object detection method according to some embodiments of the invention. As shown in Fig. 1, the object detection method of this embodiment includes steps S102 to S108.
In step S102, a sequence to be detected, which includes multiple frames of images, is input into the target detection model, and multiple images with detection boxes output by the target detection model are obtained. The target detection model regards the object in each detection box as a target object.
The sequence to be detected may be, for example, a video clip, or an image sequence composed of multiple frames continuously captured by an imaging device over a period of time. The sequence may be captured by a fixed-viewpoint imaging device, for example a surveillance camera that stays at a predetermined position with unchanged shooting angle and focal length throughout one capture session. For example, it may be a camera placed at an unmanned vending cabinet to record users taking articles.
In some embodiments, the target detection model is a neural network model, for example a model based on the MobileNet-SSD network framework (MobileNet is a neural network for mobile devices; SSD is the Single Shot MultiBox Detector). An image output by the target detection model may contain one or more detection boxes.
The inventor has recognized that the target detection model determines target objects from the image features within a single frame, which is a recognition approach based on static features. When the target object to be recognized is an object in motion, the output of the target detection model can be screened further to improve detection accuracy.
In step S104, the pixels with motion optical flow in each detection box are determined.
Motion optical flow reflects the motion of an object between adjacent frames. In embodiments of the invention, whether a pixel has motion optical flow can be determined, for example, from the change in the same pixel between different frames.
In step S106, some or all of the detection boxes are retained according to the number of pixels with motion optical flow in each detection box.
In some embodiments, some or all of the detection boxes in a given frame may be retained according to the number of pixels with motion optical flow in the detection boxes of that frame and of a preset number of preceding frames. When the optical flow count is zero or small, a moving object is unlikely to be present in the detection box; the detection is probably a misidentification, or what was detected is a background object rather than a moving target object.
In step S108, the objects in the retained detection boxes are determined to be the target objects in the corresponding images.
With the method of the above embodiment, the target detection model can identify, based on static image features, the target objects that may be present in a single frame, and motion optical flow, which is based on dynamic inter-frame features, can then be used to screen those candidate target objects and obtain the detection result. Even in complex scenes, embodiments of the invention can detect by means of this secondary screening, which improves the accuracy of target detection.
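The two-stage flow of steps S102 to S108 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the box layout `(x_min, y_min, x_max, y_max)`, the `count_moving_pixels` callback, and the `min_moving_pixels` parameter are names introduced here for illustration, and the sketch consults only the current frame, whereas the embodiments may also consult preceding frames.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[int, int, int, int]


def screen_detections(
    boxes_per_frame: Sequence[Sequence[Box]],
    count_moving_pixels: Callable[[int, Box], int],
    min_moving_pixels: int,
) -> List[List[Box]]:
    """Second-stage screening of detector output (steps S104 to S108).

    boxes_per_frame[i] holds the boxes the detection model produced for
    frame i (step S102). count_moving_pixels(i, box) is assumed to return
    the number of pixels in that box that carry motion optical flow.
    """
    kept: List[List[Box]] = []
    for frame_index, boxes in enumerate(boxes_per_frame):
        # Keep only boxes whose moving-pixel count reaches the threshold.
        kept.append(
            [
                box
                for box in boxes
                if count_moving_pixels(frame_index, box) >= min_moving_pixels
            ]
        )
    return kept
```

Passing the optical flow counter in as a callback keeps the screening logic independent of how the flow itself is computed.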
Embodiments of the method for determining pixels with motion optical flow are described below with reference to Figs. 2 and 3.
Fig. 2 is a flow diagram of a method for determining pixels with motion optical flow according to some embodiments of the invention. As shown in Fig. 2, the method of this embodiment includes steps S202 to S204.
In step S202, the displacement of a pixel is determined from the position of the pixel in the detection box of one frame and the position of the same pixel in the previous frame.
In step S204, when the displacement of the pixel is greater than the displacement threshold, the pixel is determined to have motion optical flow.
When the change in a pixel's position between different moments is greater than the displacement threshold, the pixel carries motion information, so the motion optical flow of that pixel reflects the motion of the object. With the method of this embodiment, the pixels with motion optical flow in a detection box can be detected accurately.
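Steps S202 and S204 can be illustrated with a short sketch. It assumes that a per-pixel displacement field is already available as `flow[y][x] = (dx, dy)` (how that field is estimated is outside the sketch), that displacement is measured as Euclidean length, and that boxes use a hypothetical `(x_min, y_min, x_max, y_max)` layout with exclusive maxima. None of these choices is specified by the text above.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical layouts, introduced only for this sketch.
Box = Tuple[int, int, int, int]                 # x_min, y_min, x_max, y_max (max exclusive)
Flow = Sequence[Sequence[Tuple[float, float]]]  # flow[y][x] = (dx, dy) since the previous frame


def moving_pixels_in_box(
    flow: Flow, box: Box, displacement_threshold: float
) -> List[Tuple[int, int]]:
    """Return (x, y) of pixels in the box whose displacement exceeds the threshold."""
    x_min, y_min, x_max, y_max = box
    moving: List[Tuple[int, int]] = []
    for y in range(y_min, y_max):
        for x in range(x_min, x_max):
            dx, dy = flow[y][x]
            # Step S204: strictly greater than the displacement threshold.
            if math.hypot(dx, dy) > displacement_threshold:
                moving.append((x, y))
    return moving
```

The length of the returned list is the pixel count that the screening step compares against its preset threshold.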
In some embodiments, the same pixel in different frames can be identified by means of feature points. Fig. 3 is a flow diagram of a method for determining pixels with motion optical flow according to other embodiments of the invention. As shown in Fig. 3, the method of this embodiment includes steps S302 to S306.
In step S302, feature points in the detection box of an image are detected. The feature points may be, for example, corner features or SIFT (Scale-Invariant Feature Transform) features, and a person skilled in the art may choose as needed.
In some embodiments, feature points may be detected over all pixels of each frame, and the feature points falling within the detection box are then selected; this allows feature points to be detected more comprehensively. In some embodiments, feature points may instead be detected only from the pixels within the detection box, which speeds up feature point detection.
In step S304, the displacement of each feature point in the detection box of the image relative to the same feature point in the previous frame is calculated.
In some embodiments, the image may be mapped into a coordinate system in which each pixel position corresponds to one coordinate point. The distance between the coordinate points of the same feature point in two adjacent frames can then be taken as the displacement of the feature point.
In step S306, when the displacement of a feature point is greater than the displacement threshold, the pixel corresponding to the feature point is determined to have motion optical flow.
With the method of the above embodiment, the same pixel can be located more accurately across different frames, which improves the accuracy of detecting pixels with motion optical flow.
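Steps S302 to S306 can be sketched as below. The sketch assumes that feature points have already been detected and matched across the two frames, so that each point carries a stable identifier; the dictionaries `previous` and `current`, the box layout, and the use of Euclidean distance are assumptions made for illustration. In practice a library routine (for example a corner detector combined with a sparse optical flow tracker) could supply the matched coordinates.

```python
import math
from typing import Dict, Hashable, List, Tuple

Point = Tuple[float, float]              # (x, y) in image coordinates
Box = Tuple[float, float, float, float]  # hypothetical: x_min, y_min, x_max, y_max


def moving_feature_points(
    previous: Dict[Hashable, Point],
    current: Dict[Hashable, Point],
    box: Box,
    displacement_threshold: float,
) -> List[Hashable]:
    """Return ids of feature points in the box that moved more than the threshold."""
    x_min, y_min, x_max, y_max = box
    moving: List[Hashable] = []
    for point_id, (x, y) in current.items():
        inside = x_min <= x <= x_max and y_min <= y <= y_max
        # Skip points outside the box or without a match in the previous frame.
        if not inside or point_id not in previous:
            continue
        prev_x, prev_y = previous[point_id]
        # Step S304: displacement as the distance between the two coordinate points.
        displacement = math.hypot(x - prev_x, y - prev_y)
        # Step S306: compare against the displacement threshold.
        if displacement > displacement_threshold:
            moving.append(point_id)
    return moving
```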
After the pixels with motion optical flow in the detection boxes have been determined, some or all of the detection boxes can be retained according to the number of such pixels. When deciding whether to delete a detection box in a frame, the decision may be based not only on the number of pixels with motion optical flow in the detection box of that frame, but also on the number in the detection boxes of several preceding frames. Embodiments of the detection box screening method of the invention are described below with reference to Figs. 4 and 5.
Fig. 4 is a flow diagram of a detection box screening method according to some embodiments of the invention. As shown in Fig. 4, the detection box screening method of this embodiment includes steps S402 to S404.
In step S402, the pixels with motion optical flow in the detection box are determined.
In step S404, a detection box of an image is deleted when the number of pixels with motion optical flow in the detection box is less than a preset threshold and, in several frames preceding the image, the number of pixels with motion optical flow within the region corresponding to the detection box is also less than the preset threshold.
If no moving object appears in the detection box over a period of time, the content of the detection box is probably a background object, so the detection box can be deleted.
Fig. 5 is a flow diagram of a detection box screening method according to other embodiments of the invention. As shown in Fig. 5, the detection box screening method of this embodiment includes steps S502 to S510.
In step S502, the multiple frames of images with detection boxes output by the target detection model are obtained.
In step S504, the pixels with motion optical flow in the detection boxes of the multiple frames are determined.
Steps S506 to S510 are an exemplary way of processing one detection box in one image to be processed among the multiple frames. Other detection boxes in the image to be processed, and other frames, can be processed by the same or similar means.
In step S506, it is judged whether the number of pixels with motion optical flow in the detection box to be processed of the image to be processed is less than the preset threshold. If not, it is determined that a moving object is present in the detection box to be processed, and the detection box is retained; if so, step S508 is executed.
In step S508, it is judged whether, in the frame preceding the image to be processed, the number of pixels with motion optical flow within the region corresponding to the detection box to be processed is less than the preset threshold. If not, then although the detection box to be processed may contain no moving object, a moving object was present at the same position in the previous frame; the detection box therefore probably contains an image captured when the target object stopped after a period of movement, and it can be retained. If so, step S510 is executed.
In step S510, it is judged whether, in the N frames (N being a positive integer) preceding the image to be processed, the number of pixels with motion optical flow within the region corresponding to the detection box to be processed is less than the preset threshold. If not, the detection box to be processed can be retained; if so, no moving object has been at the position of the detection box for a period of time, so the detection box probably contains a background object and can be deleted.
With the method of the above embodiment, the optical flow counts of multiple frames can be combined to judge whether a target object is present in a detection box, which improves the accuracy of target object detection.
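The cascade of steps S506 to S510 can be sketched as a single decision function. The sketch assumes that the moving-pixel counts for the same box region have already been computed for the current frame and for the frames before it; the argument names `moving_pixel_counts`, `preset_threshold`, and `lookback_frames` (the N of step S510) are hypothetical and chosen for this illustration.

```python
from typing import Sequence


def should_delete_box(
    moving_pixel_counts: Sequence[int],
    preset_threshold: int,
    lookback_frames: int,
) -> bool:
    """Decide whether a detection box should be deleted.

    moving_pixel_counts must be non-empty. Its last entry is the count for
    the frame being processed; earlier entries are the counts for the same
    box region in the preceding frames, oldest first.
    """
    # Step S506: enough moving pixels in the current frame, so keep the box.
    if moving_pixel_counts[-1] >= preset_threshold:
        return False
    # Steps S508 and S510: walk backwards through up to N preceding frames.
    previous = moving_pixel_counts[:-1]
    for offset, count in enumerate(reversed(previous), start=1):
        if offset > lookback_frames:
            break
        if count >= preset_threshold:
            # Motion was seen recently at this position, so keep the box.
            return False
    # No motion in the current frame or in the lookback window: delete.
    return True
```

With `lookback_frames` set to 1 the function reduces to the check of step S508 alone; larger values extend it to the N-frame check of step S510.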
To further improve the accuracy of the target detection model, the invention can also optimize the training of the target detection model. In some embodiments, the neural network model may be trained with training images to obtain the target detection model, wherein the training images include positive sample images and negative sample images, each positive sample image carries labeled position information of a target object, and no negative sample image contains a target object. The negative sample images include images taken from detection boxes that the target detection model misidentified. An embodiment of the target detection model training method of the invention is described below with reference to Fig. 6.
Fig. 6 is a flow diagram of a target detection model training method according to some embodiments of the invention. As shown in Fig. 6, the target detection model training method of this embodiment includes steps S602 to S608.
In step S602, training images are obtained; the training images include positive sample images and negative sample images.
In step S604, the training images are input into the neural network model and the output predicted images are obtained. Some of the predicted images carry detection boxes.
In step S606, the neural network model is adjusted according to its prediction accuracy.
In step S608, in response to a detection box being present in a predicted image that the neural network model output for a negative sample image, the image within that detection box is added to the training images as a new negative sample.
In this way, images misidentified during training are fed back into training, which improves the neural network model's ability to recognize hard examples (also called hard negatives or hard instances) and thereby improves the recognition accuracy of the target detection model.
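Step S608 can be sketched as follows. The sketch treats the detector as an opaque callback `detect(image)` that returns boxes, represents an image as rows of pixel values, and uses a `(x_min, y_min, x_max, y_max)` box layout with exclusive maxima; all three are assumptions for illustration, and the training step itself (S604 and S606) is not shown.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical representations, used only in this sketch.
Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # x_min, y_min, x_max, y_max (max exclusive)


def crop(image: Image, box: Box) -> Image:
    """Return the part of the image that lies inside the box."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max] for row in image[y_min:y_max]]


def mine_hard_negatives(
    negative_images: Sequence[Image],
    detect: Callable[[Image], Sequence[Box]],
) -> List[Image]:
    """Collect crops of boxes that the model wrongly reports on negative samples.

    A negative sample contains no target object, so every box returned by
    detect() for such an image is a false detection (step S608).
    """
    hard_negatives: List[Image] = []
    for image in negative_images:
        for box in detect(image):
            hard_negatives.append(crop(image, box))
    return hard_negatives
```

The returned crops would be appended to the negative samples before the next training round.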
In some embodiments, negative samples can also be enriched during use, so that the target detection model is updated continuously. For example, steps S610 to S614 may also be included.
In step S610, a sequence to be detected, which includes multiple frames of images, is input into the target detection model, and multiple images with detection boxes output by the target detection model are obtained.
In step S612, some or all of the detection boxes are retained based on motion optical flow, and the objects in the retained detection boxes are determined to be the target objects in the corresponding images.
In step S614, the images within the deleted detection boxes are added to the training images as new negative samples.
In some embodiments, the training images may include real images and may also include virtual images. The virtual images can be generated from real images. An embodiment of the training image generation method of the invention is described below with reference to Fig. 7.
Fig. 7 is a flow diagram of a training image generation method according to some embodiments of the invention. As shown in Fig. 7, the training image generation method of this embodiment includes steps S702 to S704.
In step S702, virtual images are generated from acquired real images using a generative adversarial network.
In some embodiments, multiple generative adversarial networks corresponding to multiple scenes may be trained in advance; these scenes may be, for example, the scenes in which the object detection method of the preceding embodiments may be applied. When an acquired real image is input into one of the generative adversarial networks, that network can generate a virtual image for the corresponding scene.
In step S704, the virtual images are used as training images. In addition to the virtual images, the training images may also include real images.
In this way, a large number of virtual images can be generated for training from a small number of real images, so that the trained target detection model adapts well to a variety of scenes, improving both training efficiency and detection accuracy.
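How per-scene generators multiply a small set of real images (steps S702 and S704) can be sketched as below. Each generator is modeled as an opaque callable standing in for a trained generative adversarial network; the `scene_generators` mapping, the scene names, and the `"real"` label are hypothetical and exist only to make the bookkeeping visible.

```python
from typing import Callable, Dict, List, Sequence, Tuple, TypeVar

ImageT = TypeVar("ImageT")


def build_training_images(
    real_images: Sequence[ImageT],
    scene_generators: Dict[str, Callable[[ImageT], ImageT]],
) -> List[Tuple[str, ImageT]]:
    """Combine real images with one virtual image per real image and scene.

    scene_generators maps a scene name to a callable that stands in for the
    generator of a GAN trained for that scene (an assumption of this sketch).
    """
    # Real images are kept and tagged so their origin stays traceable.
    training: List[Tuple[str, ImageT]] = [("real", image) for image in real_images]
    for scene, generate in scene_generators.items():
        for image in real_images:
            # Step S702: the scene's generator yields a virtual image.
            training.append((scene, generate(image)))
    return training
```

With R real images and S scene generators, the result holds R * (S + 1) training images.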
Embodiments of the invention can be applied, for example, to the vending scenario of a vending device. For example, when a user opens the cabinet door of an unmanned vending cabinet to pick up goods, a camera mounted on the cabinet can capture video or images of the user taking the goods. The object detection method of the invention can then identify the goods held in the user's hand. Because the goods are in motion most of the time while the user is taking them, after the target detection model has identified some goods in the images, the static surroundings and the goods left in the cabinet that the user has not taken can be screened out based on motion optical flow, so that the goods taken by the user can be determined. An embodiment of the vending method for an unmanned vending cabinet is described below with reference to Fig. 8.
Fig. 8 is a flow diagram of a vending method according to some embodiments of the invention. As shown in Fig. 8, the vending method of this embodiment includes steps S802 to S812.
In step S802, in response to the cabinet door of the vending cabinet being opened, a video is captured or multiple images are continuously captured as the sequence to be detected.
In some embodiments, the capture of video or images may be stopped in response to the user closing the cabinet door.
In step S804, the sequence to be detected is input into the target detection model, and multiple images with detection boxes output by the target detection model are obtained.
In step S806, the pixels with motion optical flow in each detection box are determined.
In step S808, some or all of the detection boxes are retained according to the number of pixels with motion optical flow in each detection box.
In step S810, the objects in the retained detection boxes are determined to be the target objects in the corresponding images. The target object is the article being taken.
In step S812, the image of the target object is recognized to determine the identifier of the article taken. Information such as the SKU (Stock Keeping Unit), name, price, and specification of the article taken by the user can thus be determined, so that the article can be settled and the automatic vending process is completed.
With the method of the above embodiment, the articles taken by a user can be identified accurately by exploiting the fact that an article is in motion most of the time while the user takes it from the vending machine, which improves the vending efficiency of the vending machine and the accuracy of settlement.
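The door-triggered capture of step S802, together with the optional stop on door closing, can be sketched as below. The sketch assumes the camera and door sensor are sampled together, yielding `(door_open, frame)` pairs; that pairing and the function name are assumptions of this illustration, not an interface described above.

```python
from typing import Iterable, List, Tuple, TypeVar

FrameT = TypeVar("FrameT")


def capture_sequence(samples: Iterable[Tuple[bool, FrameT]]) -> List[FrameT]:
    """Collect frames from the first door-open sample until the door closes.

    samples yields (door_open, frame) pairs in time order. Frames seen
    before the door first opens are ignored, and capture stops at the
    first sample taken after the door has closed again.
    """
    sequence: List[FrameT] = []
    door_was_open = False
    for door_open, frame in samples:
        if door_open:
            door_was_open = True
            sequence.append(frame)
        elif door_was_open:
            # The door has closed: one shopping session is complete.
            break
    return sequence
```

The returned list is the sequence to be detected that step S804 passes to the target detection model.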
An embodiment of the object detection device of the invention is described below with reference to Fig. 9.
Fig. 9 is a structural diagram of an object detection device according to some embodiments of the invention. As shown in Fig. 9, the object detection device 90 of this embodiment includes: a detection box output module 910 configured to input a sequence to be detected, which includes multiple frames of images, into a target detection model and obtain multiple images with detection boxes output by the target detection model; a motion optical flow determining module 920 configured to determine the pixels with motion optical flow in each detection box; a detection box screening module 930 configured to retain some or all of the detection boxes according to the number of pixels with motion optical flow in each detection box; and a target object determining module 940 configured to determine the objects in the retained detection boxes to be the target objects in the corresponding images.
In some embodiments, the motion optical flow determining module 920 is further configured to determine the displacement of a pixel from the position of the pixel in the detection box of one frame and the position of the same pixel in the previous frame, and, when the displacement of the pixel is greater than a displacement threshold, to determine that the pixel has motion optical flow.
In some embodiments, the object detection device 90 further includes a feature point detection module 950 configured to detect feature points in the detection box of an image, so as to calculate the displacement of each feature point in the detection box relative to the same feature point in the previous frame of the image, and, when the displacement of a feature point is greater than the displacement threshold, to determine that the pixel corresponding to the feature point has motion optical flow.
In some embodiments, the detection box screening module 930 is further configured to delete a detection box of an image when the number of pixels with motion optical flow in the detection box is less than a preset threshold and, in several frames preceding the image, the number of pixels with motion optical flow within the region corresponding to the detection box is also less than the preset threshold.
In some embodiments, the target detection model is a neural network model, and the object detection device 90 further includes a training module 960 configured to train the neural network model with training images to obtain the target detection model, wherein the training images include positive sample images and negative sample images, each positive sample image carries labeled position information of a target object, no negative sample image contains a target object, and the negative sample images include images taken from detection boxes that the target detection model misidentified.
In some embodiments, the object detection device 90 further includes a virtual image generation module 970 configured to generate virtual images from acquired real images using a generative adversarial network and to use the virtual images as training images.
An embodiment of the target object detection system of the invention is described below with reference to Fig. 10.
Fig. 10 is a structural diagram of a target object detection system according to some embodiments of the invention. As shown in Fig. 10, the target object detection system 100 of this embodiment includes a target object detection device 1010 and an imaging device 1020. The imaging device 1020 is configured to capture a sequence to be detected that includes multiple frames of images. For specific implementations of the target object detection device 1010, reference may be made to the object detection device 90 in the embodiment of Fig. 9, which is not repeated here.
Figure 11 is a structural schematic diagram of a target object detection device according to other
embodiments of the present invention. As shown in Figure 11, the target object detection device 110 of
this embodiment includes a memory 1110 and a processor 1120 coupled to the memory 1110. The processor
1120 is configured to execute the target object detection method of any one of the aforementioned
embodiments based on instructions stored in the memory 1110.
The memory 1110 may include, for example, a system memory, a fixed non-volatile storage medium, and the
like. The system memory stores, for example, an operating system, application programs, a boot loader,
and other programs.
Figure 12 is a structural schematic diagram of a target object detection device according to still
other embodiments of the present invention. As shown in Figure 12, the target object detection device
120 of this embodiment includes a memory 1210 and a processor 1220, and may also include an
input/output interface 1230, a network interface 1240, a storage interface 1250, and the like. These
interfaces 1230, 1240, 1250, the memory 1210, and the processor 1220 may be connected, for example, by
a bus 1260. The input/output interface 1230 provides a connection interface for input/output devices
such as a display, a mouse, a keyboard, and a touch screen. The network interface 1240 provides a
connection interface for various networked devices. The storage interface 1250 provides a connection
interface for external storage devices such as an SD card and a USB flash drive.
Embodiments of the present invention also provide a computer-readable storage medium on which a
computer program is stored, characterized in that the program, when executed by a processor, implements
any one of the aforementioned target object detection methods.
Those skilled in the art should understand that embodiments of the present invention may be provided as
a method, a system, or a computer program product. Accordingly, the present invention may take the form
of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining
software and hardware aspects. Moreover, the present invention may take the form of a computer program
product implemented on one or more computer-usable non-transitory storage media (including but not
limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program
code.
The present invention is described with reference to flowcharts and/or block diagrams of methods,
devices (systems), and computer program products according to embodiments of the present invention. It
should be understood that each process and/or block in the flowcharts and/or block diagrams, and
combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by
computer program instructions. These computer program instructions may be provided to a processor of a
general-purpose computer, a special-purpose computer, an embedded processor, or another programmable
data processing device to produce a machine, so that the instructions executed by the processor of the
computer or other programmable data processing device produce means for implementing the functions
specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of
directing a computer or other programmable data processing device to operate in a particular manner, so
that the instructions stored in the computer-readable memory produce an article of manufacture
including instruction means, the instruction means implementing the functions specified in one or more
processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data
processing device, so that a series of operation steps is executed on the computer or other
programmable device to produce computer-implemented processing, whereby the instructions executed on
the computer or other programmable device provide steps for implementing the functions specified in one
or more processes of the flowcharts and/or one or more blocks of the block diagrams.
The foregoing is merely the preferred embodiments of the present invention and is not intended to limit
the present invention. Any modification, equivalent replacement, improvement, and the like made within
the spirit and principles of the present invention shall be included within the protection scope of the
present invention.