WO2019214201A1 - Living body detection method and apparatus, system, electronic device, and storage medium - Google Patents

Living body detection method and apparatus, system, electronic device, and storage medium

Info

Publication number
WO2019214201A1
Authority
WO
WIPO (PCT)
Prior art keywords
target object
information
sensor
feature information
key point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2018/115499
Other languages
English (en)
French (fr)
Inventor
杨凯
暴天鹏
张瑞
吴立威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sensetime Technology Development Co Ltd
Original Assignee
Beijing Sensetime Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sensetime Technology Development Co Ltd
Priority to JP2019515661A (JP6852150B2, ja)
Priority to EP18839510.7A (EP3584745A4, en)
Priority to KR1020197019442A (KR20190129826A, ko)
Priority to US16/234,434 (US10930010B2, en)
Publication of WO2019214201A1 (zh)
Legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/64 Three-dimensional [3D] objects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/166 Detection; Localisation; Normalisation using acquisition arrangements
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G06F21/32 User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/10 Image acquisition
    • G06V10/12 Details of acquisition arrangements; Constructional details thereof
    • G06V10/14 Optical characteristics of the device performing the acquisition or on the illumination arrangements
    • G06V10/143 Sensing or illuminating at different wavelengths
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40 Spoof detection, e.g. liveness detection
    • G06V40/45 Detection of the body part being alive

Definitions

  • the present disclosure relates to the field of computer vision technology, and in particular, to a living body detection method and device, system, electronic device, and storage medium.
  • face recognition technology has been widely used in scenes such as face unlocking, face payment, unmanned supermarkets and video surveillance.
  • face recognition technology is vulnerable to attacks that use a prosthetic face, such as a physical photograph of a face, an electronic photograph of a face, or a video containing a face. Living body detection is therefore an indispensable part of face recognition.
  • the embodiment of the present disclosure proposes a living body detecting method and apparatus.
  • a living body detecting method includes: acquiring depth information of a target object sensed by a first sensor and a target image sensed by a second sensor; performing key point detection on the target image to obtain key point information of the target object; and obtaining a living body detection result of the target object based on the depth information of the target object and the key point information of the target object.
  • the target object is a human face.
  • the second sensor is an image sensor, for example, the second sensor is an RGB sensor or a near infrared sensor.
  • the first sensor is a depth sensor, e.g., the first sensor is a time-of-flight (ToF) sensor or a structured light sensor.
  • the first sensor and the second sensor are integrated in the same device, such as integrated in a 3D camera.
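  • before performing key point detection on the target image, the method further includes: aligning the depth information of the target object and the target image according to parameters of the first sensor and parameters of the second sensor.
  • obtaining the living body detection result of the target object based on the depth information of the target object and the key point information of the target object includes: obtaining first feature information based on the depth information of the target object and the key point information of the target object; obtaining second feature information based on the key point information of the target object; and determining the living body detection result of the target object based on the first feature information and the second feature information.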
  • obtaining the first feature information based on the depth information of the target object and the key point information of the target object includes: inputting the depth information of the target object and the key point information of the target object into a first neural network for processing, to obtain the first feature information;
  • and obtaining the second feature information based on the key point information of the target object includes: inputting the target image and the key point information of the target object into a second neural network for processing, to obtain the second feature information.
  • the first neural network and the second neural network have the same network structure.
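  • obtaining the first feature information based on the depth information of the target object and the key point information of the target object includes: performing convolution processing on the depth information of the target object and the key point information of the target object to obtain a first convolution result; performing downsampling processing on the first convolution result to obtain a first downsampling result; and obtaining the first feature information based on the first downsampling result.
  • obtaining the second feature information based on the key point information of the target object includes: performing convolution processing on the target image and the key point information of the target object to obtain a second convolution result; performing downsampling processing on the second convolution result to obtain a second downsampling result; and obtaining the second feature information based on the second downsampling result.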
  • determining the living body detection result of the target object based on the first feature information and the second feature information includes: performing fusion processing on the first feature information and the second feature information to obtain third feature information; and determining the living body detection result of the target object according to the third feature information.
  • determining the living body detection result according to the third feature information includes: obtaining, based on the third feature information, a probability that the target object is a living body; and determining the living body detection result of the target object according to the probability that the target object is a living body.
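  • as a minimal sketch of the fusion and decision steps described above, the following Python snippet fuses two feature vectors with a fully connected layer, turns the result into a probability, and compares it with a threshold. Concatenating the two vectors before the fully connected layer, the two-class softmax, the 0.5 threshold, and all function names are assumptions made for illustration; the disclosure only states that fusion yields third feature information from which a probability is obtained.

```python
import math


def fuse_features(first, second, weights, bias):
    """Fully connected fusion: concatenate both vectors, then compute W x + b."""
    x = list(first) + list(second)
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def liveness_probability(third):
    """Softmax over two scores ordered [prosthesis, living body]; returns P(living body)."""
    m = max(third)  # subtract the maximum for numerical stability
    exps = [math.exp(t - m) for t in third]
    return exps[1] / sum(exps)


def detect(first, second, weights, bias, threshold=0.5):
    """Return the detection label and the probability behind it."""
    third = fuse_features(first, second, weights, bias)  # third feature information
    p = liveness_probability(third)
    label = "living body" if p > threshold else "prosthesis"
    return label, p
```

  • in practice the weights and bias would come from training, and the threshold would be tuned to the acceptable rate of false accepts and false rejects.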
  • a living body detecting apparatus includes:
  • an acquiring module configured to acquire depth information of a target object sensed by a first sensor and a target image sensed by a second sensor;
  • a detection module configured to perform key point detection on the target image to obtain key point information of the target object; and
  • a determining module configured to obtain a living body detection result of the target object based on the depth information of the target object and the key point information of the target object.
  • the target object is a human face.
  • the second sensor is an image sensor, for example, the second sensor is an RGB sensor or a near infrared sensor.
  • the first sensor is a depth sensor, e.g., the first sensor is a time-of-flight (ToF) sensor or a structured light sensor.
  • the first sensor and the second sensor are integrated in the same device, such as integrated in a 3D camera.
  • the apparatus further includes an alignment module configured to align the depth information of the target object and the target image based on parameters of the first sensor and parameters of the second sensor.
  • the determining module includes: a first determining submodule configured to obtain first feature information based on the depth information of the target object and the key point information of the target object; a second determining submodule configured to obtain second feature information based on the key point information of the target object; and a third determining submodule configured to determine a living body detection result of the target object based on the first feature information and the second feature information.
  • the first determining submodule is configured to: input depth information of the target object and key point information of the target object into the first neural network for processing, to obtain first feature information;
  • the second determining sub-module is configured to: input the target image and the key point information of the target object into the second neural network for processing, to obtain second feature information.
  • the first neural network and the second neural network have the same network structure.
  • the first determining submodule includes: a first convolution unit configured to perform convolution processing on the depth information of the target object and the key point information of the target object to obtain a first convolution result; a first downsampling unit configured to perform downsampling processing on the first convolution result to obtain a first downsampling result;
  • and a first determining unit configured to obtain the first feature information based on the first downsampling result.
  • the second determining submodule includes: a second convolution unit configured to perform convolution processing on the target image and the key point information of the target object to obtain a second convolution result;
  • the second downsampling unit is configured to perform a downsampling process on the second convolution result to obtain a second down sampling result.
  • the second determining unit is configured to obtain second feature information based on the second down sampling result.
  • the third determining submodule includes: a fully connected unit configured to perform fusion processing on the first feature information and the second feature information to obtain third feature information; and a third determining unit configured to determine a living body detection result of the target object according to the third feature information.
  • the third determining unit includes: a first determining subunit configured to obtain, according to the third feature information, a probability that the target object is a living body; and a second determining subunit configured to determine the living body detection result of the target object according to the probability that the target object is a living body.
  • the living body detecting apparatus provided by the embodiment of the present disclosure is for performing the living body detecting method in any of the above embodiments, and includes modules and units for performing the steps and/or processes of any of the possible living body detecting methods described above.
  • a living body detecting apparatus comprising: a processor; a memory configured to store processor-executable instructions; wherein the processor is configured to perform the above method.
  • a non-transitory computer readable storage medium having stored thereon computer program instructions, wherein the computer program instructions are executed by a processor to implement the above method.
  • a living body detecting system including the above-described living body detecting device, the first sensor, and the second sensor is provided.
  • a living body detecting system including the above-described nonvolatile computer readable storage medium, the first sensor, and the second sensor is provided.
  • an electronic device including:
  • a first sensor configured to detect depth information of the target object
  • a second sensor configured to acquire a target image including the target object
  • a processor configured to perform key point detection on the target image collected by the second sensor to obtain key point information of the target object, and to obtain a living body detection result of the target object based on the depth information of the target object detected by the first sensor and the key point information of the target object.
  • the second sensor is an RGB sensor or a near infrared sensor.
  • the first sensor is a time-of-flight (ToF) sensor or a structured light sensor.
  • the processor is further configured to: align the depth information of the target object and the target image according to parameters of the first sensor and parameters of the second sensor.
  • the processor is configured to: obtain first feature information based on the depth information of the target object and the key point information of the target object; obtain second feature information based on the key point information of the target object; and determine a living body detection result of the target object based on the first feature information and the second feature information.
  • the processor is configured to: input the depth information of the target object and the key point information of the target object into the first neural network for processing, to obtain the first feature information;
  • and input the target image and the key point information of the target object into the second neural network for processing, to obtain the second feature information.
  • the processor is configured to: perform convolution processing on the depth information of the target object and the key point information of the target object to obtain a first convolution result; perform downsampling processing on the first convolution result to obtain a first downsampling result; and obtain the first feature information based on the first downsampling result.
  • the processor is configured to: perform convolution processing on the target image and the key point information of the target object to obtain a second convolution result; perform downsampling processing on the second convolution result to obtain a second downsampling result; and obtain the second feature information based on the second downsampling result.
  • the processor is configured to: perform fusion processing on the first feature information and the second feature information to obtain third feature information; and determine the living body detection result of the target object according to the third feature information.
  • the processor is configured to: obtain a probability that the target object is a living body based on the third feature information; and determine the living body detection result of the target object according to the probability that the target object is a living body.
  • the living body detection method of each aspect of the present disclosure performs living body detection by combining the depth information of the target object with the target image, so that the detection uses both the depth information of the target object and the key point information of the target object in the target image, thereby improving the accuracy of living body detection.
  • FIG. 1 illustrates a flow chart of a living body detecting method according to an embodiment of the present disclosure.
  • FIG. 2 illustrates an exemplary flowchart of a living body detecting method according to an embodiment of the present disclosure.
  • FIG. 3 illustrates an exemplary flowchart of the step S13 of the living body detecting method according to an embodiment of the present disclosure.
  • FIG. 4A illustrates a block diagram of a living body detecting device applied to a human face in accordance with an embodiment of the present disclosure.
  • FIG. 4B shows a block diagram of the data pre-processing module of FIG. 4A, in accordance with an embodiment of the present disclosure.
  • FIG. 4C shows a block diagram of the deep neural network module of FIG. 4A, in accordance with an embodiment of the present disclosure.
  • FIG. 5 illustrates an exemplary flowchart of the living body detecting method step S131 according to an embodiment of the present disclosure.
  • FIG. 6 illustrates an exemplary flowchart of the living body detecting method step S132 according to an embodiment of the present disclosure.
  • FIG. 7 illustrates an exemplary flowchart of the living body detecting method step S133 according to an embodiment of the present disclosure.
  • FIG. 8 illustrates an exemplary flowchart of the living body detecting method step S1332 according to an embodiment of the present disclosure.
  • FIG. 9 illustrates a block diagram of a living body detecting apparatus according to an embodiment of the present disclosure.
  • FIG. 10 illustrates an exemplary block diagram of a living body detecting apparatus according to an embodiment of the present disclosure.
  • FIG. 11 is a block diagram of a living body detecting apparatus 800, according to an exemplary embodiment.
  • FIG. 1 illustrates a flow chart of a living body detecting method according to an embodiment of the present disclosure.
  • the method can be applied to a terminal device having a face recognition function such as a mobile phone, a tablet computer, a digital camera or an access control device.
  • the method can be applied to scenes such as face unlocking, face payment, unmanned supermarket and video surveillance. As shown in FIG. 1, the method includes steps S11 to S13.
  • in step S11, the depth information of the target object sensed by the first sensor and the target image sensed by the second sensor are acquired.
  • the target object is a human face.
  • the first sensor is a three-dimensional sensor.
  • the first sensor can be a ToF (Time of Flight) sensor, a structured light sensor, a binocular sensor, or other type of depth sensor.
  • the embodiment of the present disclosure performs living body detection using depth information of the target object, and can fully exploit that depth information, thereby improving the accuracy of living body detection.
  • the embodiment of the present disclosure performs the living body detection by using the depth information including the face, and can fully exploit the depth information of the face data, thereby improving the accuracy of the living face detection.
  • although the first sensor is described above using a ToF sensor, a structured light sensor, and a binocular sensor as examples, those skilled in the art will understand that the embodiments of the present disclosure are not limited thereto.
  • a person skilled in the art can flexibly select the type of the first sensor according to actual application scenario requirements and/or personal preferences, as long as the depth information of the target object can be sensed by the first sensor.
  • the depth information of the target object may be any information that can reflect the depth of the target object.
  • the specific implementation of the depth information of the target object is not limited in the embodiment of the present disclosure.
  • the depth information of the target object may be a depth image of the target object.
  • the depth information of the target object may be a point cloud of the target object.
  • the point cloud of the target object can record the three-dimensional coordinates of each point of the target object.
  • the depth information of the target object may be a table or other type of file that records the depth of various points of the target object.
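  • as a minimal sketch of how two of the forms above relate, the following Python snippet converts a depth image into a point cloud by back-projecting each valid pixel through a pinhole camera model. The pinhole model and the names `fx`, `fy`, `cx`, `cy` (focal lengths and principal point of the depth sensor, in pixels) are assumptions made for illustration; the disclosure does not prescribe a conversion.

```python
def depth_image_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (rows of depth values) into a list of (X, Y, Z) points.

    Pixels with a depth of zero or less are treated as missing measurements and skipped.
    """
    points = []
    for v, row in enumerate(depth):      # v: pixel row index
        for u, z in enumerate(row):      # u: pixel column index, z: measured depth
            if z <= 0:
                continue
            x = (u - cx) * z / fx        # horizontal offset scaled by depth
            y = (v - cy) * z / fy        # vertical offset scaled by depth
            points.append((x, y, z))
    return points
```

  • each returned tuple records the three-dimensional coordinates of one point of the target object, expressed in the coordinate frame of the depth sensor.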
  • the second sensor can be an RGB (red, green, blue) sensor or a near-infrared sensor. If the second sensor is an RGB sensor or another type of image sensor, the target image sensed by the second sensor is an RGB image. If the second sensor is a near-infrared sensor, the target image sensed by the second sensor is a near-infrared image.
  • the near-infrared image may be a near-infrared image with a light spot or a near-infrared image without a light spot, and the like. It should be noted that although the second sensor is described above using an RGB sensor and a near-infrared sensor as examples, those skilled in the art will understand that the embodiments of the present disclosure are not limited thereto. A person skilled in the art can flexibly select the type of the second sensor according to actual application scenario requirements and/or personal preferences, as long as the key point information of the target object can be obtained from the target image sensed by the second sensor.
  • the depth map and the target image are acquired by a 3D camera, wherein the 3D camera includes an image sensor for acquiring an image and a depth sensor for acquiring depth information.
  • the terminal device collects three-dimensional information of the target object through its own built-in 3D camera.
  • alternatively, the depth information and the target image are acquired from another device, for example by receiving a living body detection request sent by a terminal device, where the request carries the depth information and the target image.
  • in step S12, key point detection is performed on the target image to obtain key point information of the target object.
  • the key point information of the target object may include location information of a key point of the target object.
  • the key points of the target object may include one or more of an eye key point, an eyebrow key point, a nose key point, a mouth key point, and a face contour key point.
  • the eye key points may include one or more of an eye contour key point, an eye corner key point, and a pupil key point.
  • in step S13, the living body detection result of the target object is obtained based on the depth information of the target object and the key point information of the target object.
  • the result of the living body detection of the target object may be that the target object is a living body or the target object is a prosthesis.
  • the living body detection result of the target object may be that the target object is a living face or the target object is a prosthetic face.
  • the embodiment of the present disclosure performs living body detection by combining the depth information of the target object with the target image, so that the detection uses both the depth information of the target object and the key point information of the target object in the target image, thereby improving the accuracy of living body detection.
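  • as a minimal sketch of the control flow of steps S11 to S13, the following Python snippet wires the three steps together with caller-supplied functions. Every name here is hypothetical, and `toy_relief_classifier` is only a stand-in that checks whether depth varies across the key points; it is not the neural-network method of the disclosure.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[int, int]              # (x, y) pixel position of one key point
Image = Sequence[Sequence[float]]    # rows of pixel or depth values


def living_body_detection(
    read_depth: Callable[[], Image],
    read_image: Callable[[], Image],
    detect_key_points: Callable[[Image], List[Point]],
    classify: Callable[[Image, List[Point]], bool],
) -> str:
    depth = read_depth()                    # S11: depth information from the first sensor
    image = read_image()                    # S11: target image from the second sensor
    key_points = detect_key_points(image)   # S12: key point information
    is_live = classify(depth, key_points)   # S13: decision from depth and key points
    return "living body" if is_live else "prosthesis"


def toy_relief_classifier(depth: Image, key_points: List[Point], min_relief: float = 0.05) -> bool:
    """Illustrative stand-in: a flat photo shows almost no depth variation at the key points."""
    values = [depth[y][x] for x, y in key_points]
    return max(values) - min(values) > min_relief
```

  • passing the steps in as functions keeps the flow independent of any particular sensor driver, key point detector, or classifier.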
  • FIG. 2 illustrates an exemplary flowchart of a living body detecting method according to an embodiment of the present disclosure. As shown in FIG. 2, the method may include steps S21 to S24.
  • in step S21, the depth information of the target object sensed by the first sensor and the target image sensed by the second sensor are acquired.
  • in step S22, the depth information of the target object and the target image are aligned according to the parameters of the first sensor and the parameters of the second sensor.
  • the depth information of the target object may be converted so that the converted depth information is aligned with the target image. For example, if the depth information of the target object is a depth image of the target object, a conversion matrix from the parameter matrix of the first sensor to the parameter matrix of the second sensor is determined according to the two parameter matrices, and the depth image of the target object is converted according to the conversion matrix.
  • alternatively, the target image may be converted so that the converted target image is aligned with the depth information. For example, if the depth information of the target object is a depth image of the target object, a conversion matrix from the parameter matrix of the second sensor to the parameter matrix of the first sensor is determined according to the two parameter matrices, and the target image is converted according to the conversion matrix.
  • the parameters of the first sensor may include internal (intrinsic) parameters and/or external (extrinsic) parameters of the first sensor.
  • the parameters of the second sensor may include internal (intrinsic) parameters and/or external (extrinsic) parameters of the second sensor.
  • if the depth information of the target object is a depth image of the target object,
  • aligning the depth information of the target object and the target image makes corresponding parts of the depth image and the target image occupy the same position in the two images.
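  • as a minimal sketch of one way to align a depth image with the target image, the following Python snippet back-projects each depth pixel to a 3D point, moves it into the frame of the second sensor, and re-projects it. The pinhole model, the tuple layout `(fx, fy, cx, cy)` for the intrinsic parameters, the rotation and translation standing in for the extrinsic parameters, and the function name are all assumptions made for illustration; the disclosure speaks only of a conversion matrix derived from the two sensors' parameter matrices.

```python
def align_depth_to_image(depth, k_depth, k_image, rotation, translation, out_shape):
    """Re-project a depth image into the pixel grid of the image sensor.

    k_depth, k_image: (fx, fy, cx, cy) intrinsics of the depth and image sensors.
    rotation (3x3 nested list) and translation (length 3): depth frame to image frame.
    out_shape: (height, width) of the target image. Pixels with no data stay 0.0.
    """
    fx_d, fy_d, cx_d, cy_d = k_depth
    fx_i, fy_i, cx_i, cy_i = k_image
    height, width = out_shape
    aligned = [[0.0] * width for _ in range(height)]
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # missing measurement
            # 3D point in the depth sensor frame
            p = ((u - cx_d) * z / fx_d, (v - cy_d) * z / fy_d, z)
            # Same point in the image sensor frame: q = R p + t
            q = [sum(r * c for r, c in zip(rot_row, p)) + t
                 for rot_row, t in zip(rotation, translation)]
            if q[2] <= 0:
                continue  # behind the image sensor
            u2 = int(round(fx_i * q[0] / q[2] + cx_i))
            v2 = int(round(fy_i * q[1] / q[2] + cy_i))
            if 0 <= u2 < width and 0 <= v2 < height:
                # If several points land on one pixel, keep the nearest surface
                if aligned[v2][u2] == 0.0 or q[2] < aligned[v2][u2]:
                    aligned[v2][u2] = q[2]
    return aligned
```

  • after this step, a key point detected at pixel (x, y) of the target image can be looked up at the same (x, y) in the aligned depth image.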
  • in step S23, key point detection is performed on the target image to obtain key point information of the target object.
  • for step S23, refer to the description of step S12 above.
  • in step S24, the living body detection result of the target object is obtained based on the depth information of the target object and the key point information of the target object.
  • for step S24, refer to the description of step S13 above.
  • FIG. 3 illustrates an exemplary flowchart of the step S13 of the living body detecting method according to an embodiment of the present disclosure. As shown in FIG. 3, step S13 may include steps S131 to S133.
  • in step S131, first feature information is obtained based on the depth information of the target object and the key point information of the target object.
  • obtaining the first feature information based on the depth information of the target object and the key point information of the target object includes: inputting the depth information of the target object and the key point information of the target object into the first neural network for processing, to obtain the first feature information.
  • the first neural network may include a convolutional layer, a downsampling layer, and a fully connected layer.
  • for example, the first neural network may include one level of convolutional layers, one level of downsampling layers, and one level of fully connected layers,
  • where the level of convolutional layers may include one or more convolutional layers,
  • the level of downsampling layers may include one or more downsampling layers,
  • and the level of fully connected layers may include one or more fully connected layers.
  • alternatively, the first neural network may include multiple levels of convolutional layers, multiple levels of downsampling layers, and one level of fully connected layers,
  • where each level of convolutional layers may include one or more convolutional layers,
  • each level of downsampling layers may include one or more downsampling layers,
  • and the level of fully connected layers may include one or more fully connected layers.
  • the i-th level of convolutional layers is followed by the i-th level of downsampling layers,
  • the i-th level of downsampling layers is followed by the (i+1)-th level of convolutional layers,
  • and the n-th level of downsampling layers is followed by the fully connected layers, where i and n are both positive integers, 1 ≤ i ≤ n, and n is the number of levels of convolutional layers and of downsampling layers in the first neural network.
  • The first neural network may include a convolutional layer, a downsampling layer, a normalization layer, and a fully connected layer.
  • The first neural network may include one level of convolutional layers, one normalization layer, one level of downsampling layers, and one level of fully connected layers.
  • The level of convolutional layers may include one or more convolutional layers.
  • The level of downsampling layers may include one or more downsampling layers.
  • The level of fully connected layers may include one or more fully connected layers.
  • The first neural network may include multiple levels of convolutional layers, multiple normalization layers, multiple levels of downsampling layers, and one level of fully connected layers.
  • Each level of convolutional layers may include one or more convolutional layers.
  • Each level of downsampling layers may include one or more downsampling layers.
  • The level of fully connected layers may include one or more fully connected layers.
  • The i-th level of convolutional layers is followed by the i-th normalization layer.
  • The i-th normalization layer is followed by the i-th level of downsampling layers.
  • The i-th level of downsampling layers is followed by the (i+1)-th level of convolutional layers.
  • The n-th level of downsampling layers is followed by the fully connected layers, where i and n are both positive integers, 1 ≤ i ≤ n, and n represents the number of levels of convolutional layers and of downsampling layers, and the number of normalization layers, in the first neural network.
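  • The cascade order described above can be sketched as follows. This is only an illustration: the function name `cascade_order` and the layer labels are hypothetical, and the sketch lists the order of the levels without implementing any layer.

```python
def cascade_order(n, with_norm=False):
    """Return the layer order for n cascaded levels (labels are hypothetical)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    layers = []
    for i in range(1, n + 1):
        layers.append(f"conv_{i}")       # i-th level of convolutional layers
        if with_norm:
            layers.append(f"norm_{i}")   # i-th normalization layer (optional variant)
        layers.append(f"down_{i}")       # i-th level of downsampling layers
    layers.append("fc")                  # the n-th downsampling level feeds the fully connected layers
    return layers


print(cascade_order(2))
print(cascade_order(2, with_norm=True))
```

  • With `with_norm=True`, each convolution level is followed by a normalization layer before downsampling, matching the second variant of the network.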
  • In step S132, second feature information is obtained based on the key point information of the target object.
  • Obtaining the second feature information based on the key point information of the target object includes: inputting the target image and the key point information of the target object into a second neural network for processing to obtain the second feature information.
  • The second neural network may include a convolutional layer, a downsampling layer, and a fully connected layer.
  • The second neural network may include one level of convolutional layers, one level of downsampling layers, and one level of fully connected layers.
  • The level of convolutional layers may include one or more convolutional layers.
  • The level of downsampling layers may include one or more downsampling layers.
  • The level of fully connected layers may include one or more fully connected layers.
  • The second neural network may include multiple levels of convolutional layers, multiple levels of downsampling layers, and one level of fully connected layers.
  • Each level of convolutional layers may include one or more convolutional layers.
  • Each level of downsampling layers may include one or more downsampling layers.
  • The level of fully connected layers may include one or more fully connected layers.
  • The j-th level of convolutional layers is followed by the j-th level of downsampling layers.
  • The j-th level of downsampling layers is followed by the (j+1)-th level of convolutional layers.
  • The m-th level of downsampling layers is followed by the fully connected layers.
  • Here j and m are both positive integers, and 1 ≤ j ≤ m.
  • m represents the number of levels of convolutional layers and of downsampling layers in the second neural network.
  • The second neural network may include a convolutional layer, a downsampling layer, a normalization layer, and a fully connected layer.
  • The second neural network may include one level of convolutional layers, one normalization layer, one level of downsampling layers, and one level of fully connected layers.
  • The level of convolutional layers may include one or more convolutional layers.
  • The level of downsampling layers may include one or more downsampling layers.
  • The level of fully connected layers may include one or more fully connected layers.
  • The second neural network may include multiple levels of convolutional layers, multiple normalization layers, multiple levels of downsampling layers, and one level of fully connected layers.
  • Each level of convolutional layers may include one or more convolutional layers.
  • Each level of downsampling layers may include one or more downsampling layers.
  • The level of fully connected layers may include one or more fully connected layers.
  • The j-th level of convolutional layers is followed by the j-th normalization layer.
  • The j-th normalization layer is followed by the j-th level of downsampling layers.
  • The j-th level of downsampling layers is followed by the (j+1)-th level of convolutional layers.
  • The m-th level of downsampling layers is followed by the fully connected layers, where j and m are both positive integers, 1 ≤ j ≤ m, and m represents the number of levels of convolutional layers and of downsampling layers, and the number of normalization layers, in the second neural network.
  • The first neural network and the second neural network have the same network structure.
  • In step S133, the living body detection result of the target object is determined based on the first feature information and the second feature information.
  • Step S131 may be performed before step S132, step S132 may be performed before step S131, or steps S131 and S132 may be performed simultaneously.
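  • Because steps S131 and S132 do not depend on each other, they can run in either order or at the same time. The sketch below runs two placeholder branches concurrently with Python's standard `concurrent.futures`; the functions `first_branch`, `second_branch`, and `run_branches` and their toy computations are hypothetical stand-ins for the two neural networks, not the disclosed method.

```python
from concurrent.futures import ThreadPoolExecutor


def first_branch(depth, keypoints):
    # Placeholder for step S131: depth information + key point information.
    return [sum(depth) / len(depth), float(len(keypoints))]


def second_branch(image, keypoints):
    # Placeholder for step S132: target image + key point information.
    return [max(image), float(len(keypoints))]


def run_branches(depth, image, keypoints):
    """Run both feature extraction steps at the same time and collect both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(first_branch, depth, keypoints)
        second = pool.submit(second_branch, image, keypoints)
        return first.result(), second.result()


first_features, second_features = run_branches([1.0, 3.0], [0.2, 0.8], [(0, 0), (1, 1)])
print(first_features, second_features)
```

  • Step S133 only needs both results to be available, so it is called after both futures have completed.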
  • FIG. 5 illustrates an exemplary flowchart of step S131 of the living body detecting method according to an embodiment of the present disclosure. As shown in FIG. 5, step S131 may include steps S1311 through S1313.
  • In step S1311, convolution processing is performed on the depth information of the target object and the key point information of the target object to obtain a first convolution result.
  • In step S1312, downsampling processing is performed on the first convolution result to obtain a first downsampling result.
  • The depth information of the target object and the key point information of the target object may be subjected to convolution processing and downsampling processing by one level of convolutional layers and one level of downsampling layers.
  • The level of convolutional layers may include one or more convolutional layers.
  • The level of downsampling layers may include one or more downsampling layers.
  • The depth information of the target object and the key point information of the target object may be subjected to convolution processing and downsampling processing by multiple levels of convolutional layers and multiple levels of downsampling layers.
  • Each level of convolutional layers may include one or more convolutional layers.
  • Each level of downsampling layers may include one or more downsampling layers.
  • Performing downsampling processing on the first convolution result to obtain the first downsampling result may include: normalizing the first convolution result to obtain a first normalization result, and performing downsampling processing on the first normalization result to obtain the first downsampling result.
  • In step S1313, the first feature information is obtained based on the first downsampling result.
  • The first downsampling result may be input to the fully connected layer, which performs fusion processing (e.g., a fully connected operation) on the first downsampling result to obtain the first feature information.
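  • Steps S1311 to S1313 can be illustrated with a minimal pure-Python sketch on a single small 2D array. Several things here are assumptions made for illustration only: the text does not specify how depth and key point information are encoded, so `net_input` is just a stand-in array; the kernel, the 2x2 max pooling, the zero-mean/unit-variance normalization, and the fully connected weights are arbitrary choices; and all function names are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding), as deep learning layers do."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def normalize(fmap, eps=1e-5):
    """Shift and scale a feature map to zero mean and unit variance (assumed form)."""
    values = [v for row in fmap for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = (var + eps) ** 0.5
    return [[(v - mean) / scale for v in row] for row in fmap]


def max_pool_2x2(fmap):
    """Downsample by keeping the maximum of each non-overlapping 2x2 window."""
    return [
        [
            max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
            for c in range(0, len(fmap[0]) - 1, 2)
        ]
        for r in range(0, len(fmap) - 1, 2)
    ]


def extract_features(net_input, kernel, fc_weights, fc_bias, use_norm=False):
    conv_result = conv2d_valid(net_input, kernel)   # step S1311
    if use_norm:
        conv_result = normalize(conv_result)        # optional normalization before downsampling
    down_result = max_pool_2x2(conv_result)         # step S1312
    flat = [v for row in down_result for v in row]
    # Step S1313: each output is a weighted sum of all downsampled values plus a bias.
    return [sum(w * x for w, x in zip(row, flat)) + b for row, b in zip(fc_weights, fc_bias)]


net_input = [[5 * r + c for c in range(5)] for r in range(5)]
kernel = [[1, 0], [0, 1]]
fc_weights = [[1, 0, 0, 0], [0.25, 0.25, 0.25, 0.25]]
print(extract_features(net_input, kernel, fc_weights, [0.0, 0.0]))  # [18.0, 30.0]
```

  • The same three stages apply to the second branch (steps S1321 to S1323), with the target image in place of the depth information.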
  • FIG. 6 illustrates an exemplary flowchart of step S132 of the living body detecting method according to an embodiment of the present disclosure. As shown in FIG. 6, step S132 may include steps S1321 through S1323.
  • In step S1321, convolution processing is performed on the target image and the key point information of the target object to obtain a second convolution result.
  • In step S1322, downsampling processing is performed on the second convolution result to obtain a second downsampling result.
  • The target image and the key point information of the target object may be subjected to convolution processing and downsampling processing by one level of convolutional layers and one level of downsampling layers.
  • The level of convolutional layers may include one or more convolutional layers.
  • The level of downsampling layers may include one or more downsampling layers.
  • The target image and the key point information of the target object may be subjected to convolution processing and downsampling processing by multiple levels of convolutional layers and multiple levels of downsampling layers.
  • Each level of convolutional layers may include one or more convolutional layers.
  • Each level of downsampling layers may include one or more downsampling layers.
  • Performing downsampling processing on the second convolution result to obtain the second downsampling result may include: normalizing the second convolution result to obtain a second normalization result, and performing downsampling processing on the second normalization result to obtain the second downsampling result.
  • In step S1323, the second feature information is obtained based on the second downsampling result.
  • The second downsampling result may be input to the fully connected layer, which performs fusion processing (e.g., a fully connected operation) on the second downsampling result to obtain the second feature information.
  • FIG. 7 illustrates an exemplary flowchart of step S133 of the living body detecting method according to an embodiment of the present disclosure. As shown in FIG. 7, step S133 may include step S1331 and step S1332.
  • In step S1331, fusion processing (for example, a fully connected operation) is performed on the first feature information and the second feature information to obtain third feature information.
  • The first feature information and the second feature information may be concatenated (e.g., stacked along the channel dimension) or added to obtain the third feature information.
  • Alternatively, the first feature information and the second feature information may be fused by a fully connected layer to obtain the third feature information.
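  • The fusion options named in step S1331 can be sketched on plain Python lists. The function names and the example weights are hypothetical; feature vectors in a real network would be tensors, and the fully connected weights would be learned.

```python
def fuse_concat(first, second):
    """Concatenate the two feature vectors; the result is longer than either input."""
    return list(first) + list(second)


def fuse_add(first, second):
    """Add the two feature vectors element by element; both must have the same length."""
    if len(first) != len(second):
        raise ValueError("element-wise addition needs equal-length features")
    return [a + b for a, b in zip(first, second)]


def fuse_fully_connected(first, second, weights, bias):
    """Concatenate, then mix every input value into every output value."""
    joined = fuse_concat(first, second)
    return [sum(w * x for w, x in zip(row, joined)) + b for row, b in zip(weights, bias)]


first_feature = [1.0, 2.0]
second_feature = [3.0, 4.0]
print(fuse_concat(first_feature, second_feature))
print(fuse_add(first_feature, second_feature))
print(fuse_fully_connected(first_feature, second_feature, [[1.0, 1.0, 1.0, 1.0]], [0.5]))
```

  • Concatenation keeps the two modalities separate for the next layer to weigh, while addition requires the two branches to output features of the same size.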
  • In step S1332, the living body detection result of the target object is determined based on the third feature information.
  • FIG. 8 illustrates an exemplary flowchart of step S1332 of the living body detecting method according to an embodiment of the present disclosure. As shown in FIG. 8, step S1332 may include step S13321 and step S13322.
  • In step S13321, the probability that the target object is a living body is obtained based on the third feature information.
  • The third feature information may be input into a Softmax layer, and the probability that the target object is a living body is obtained through the Softmax layer.
  • The Softmax layer may include two neurons, where one neuron represents the probability that the target object is a living body and the other neuron represents the probability that the target object is a prosthesis.
  • In step S13322, the living body detection result of the target object is determined according to the probability that the target object is a living body.
  • Determining the living body detection result of the target object according to the probability that the target object is a living body includes: if the probability that the target object is a living body is greater than a first threshold, determining that the target object is a living body; if the probability that the target object is a living body is less than or equal to the first threshold, determining that the target object is a prosthesis.
  • Alternatively, the probability that the target object is a prosthesis may be obtained based on the third feature information, and the living body detection result of the target object is determined according to the probability that the target object is a prosthesis. In this implementation, if the probability that the target object is a prosthesis is greater than a second threshold, the target object is determined to be a prosthesis; if the probability that the target object is a prosthesis is less than or equal to the second threshold, the target object is determined to be a living body.
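  • Steps S13321 and S13322 can be sketched as a two-neuron softmax followed by a threshold comparison. The function names, the example inputs, and the default threshold of 0.5 are assumptions for illustration; the text does not fix the value of the first threshold.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1 (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def liveness_decision(logits, first_threshold=0.5):
    """logits[0] scores 'living body', logits[1] scores 'prosthesis' (assumed ordering)."""
    p_living, _p_prosthesis = softmax(logits)
    # Strictly greater than the threshold means living body; otherwise prosthesis.
    result = "living body" if p_living > first_threshold else "prosthesis"
    return result, p_living


print(liveness_decision([2.0, 0.0]))
print(liveness_decision([0.0, 0.0]))
```

  • Note that a probability exactly equal to the threshold is classified as a prosthesis, following the "less than or equal to" rule in the text.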
  • The embodiments of the present disclosure perform living body detection by combining the depth information of the target object with the target image, so that living body detection uses both the depth information of the target object and the key point information of the target object in the target image, which improves the accuracy of living body detection. The computational complexity is low, and an accurate living body detection result can still be obtained when the camera shakes or vibrates slightly.
  • With the development of face recognition technology, the accuracy of face recognition can now surpass that of fingerprint recognition, so it is widely used in scenarios such as video surveillance, face unlocking, and face payment.
  • However, face recognition is at risk of being spoofed, so living body detection is an essential part of face recognition applications.
  • Monocular living body detection takes an image captured by an ordinary camera as input, and has the drawback that it is easily defeated by a high-definition, seamless spoof image.
  • Binocular living body detection takes images from two cameras (ordinary RGB cameras or ordinary near-infrared cameras) as input, and performs better than monocular living body detection.
  • However, computing the depth distribution of the face by binocular matching has high computational complexity and yields depth information of low accuracy, and shaking or vibration can change the camera parameters and cause the computation to fail.
  • In recent years, three-dimensional (3D) sensor technology has advanced rapidly, including time-of-flight (TOF) sensors, structured light sensors, and binocular sensors, enabling users to easily obtain high-precision depth information directly from the sensors.
  • Embodiments of the present disclosure take 3D data and near-infrared data or RGB data as input, use the near-infrared image or RGB image to obtain face key point information, and then use a deep learning model to fuse one or more of the face 3D depth map, the near-infrared or RGB image, the face key point information, eye corner features, pupil features, and the like, so that real faces can be distinguished from spoofs more effectively.
  • FIG. 4A illustrates a schematic block diagram of a living body detecting device applied to a human face according to an embodiment of the present disclosure.
  • The living body detecting device includes an input module 41, a data preprocessing module 42, a deep neural network module 43, and a detection result output module 44.
  • The input module 41 accepts data input from different hardware modules.
  • The data input to the input module includes one or more of the following: a depth map, a pure near-infrared image, a near-infrared image with spots, an RGB image, and the like.
  • The specific combination is determined by the hardware scheme.
  • The data preprocessing module 42 is configured to preprocess the data input by the input module to obtain the data required by the deep neural network.
  • FIG. 4B shows an exemplary block diagram of one implementation of the data preprocessing module 42 of FIG. 4A according to an embodiment of the present disclosure, wherein the input of the data preprocessing module includes a depth map obtained by the depth sensor and an image obtained by the image sensor (a pure near-infrared image, an infrared image with spots, an RGB image, etc.).
  • In this example, the depth map 421 and the near-infrared image 422 are used as inputs to the data preprocessing module 42.
  • The data preprocessing module processes the input data with image alignment correction 423 and face key point detection 424, wherein face key point detection can be implemented using a face key point model.
  • In the image alignment correction 423, if the depth map and the near-infrared image (or RGB image) are not synchronously aligned, the input depth map and near-infrared image need to be aligned and corrected according to the camera parameter matrices to achieve image alignment.
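  • One common way to align a depth map to a second camera using camera parameter matrices is to back-project each depth pixel to 3D, apply the rotation and translation between the two cameras, and project the point into the second image. The sketch below does this for a single pixel under a pinhole camera model without lens distortion. This is an assumed formulation, not one stated in the text, and the function name, parameter layout `(fx, fy, cx, cy)`, and example values are all hypothetical.

```python
def register_depth_pixel(u, v, depth, k_depth, k_image, rotation, translation):
    """Map depth pixel (u, v) with metric depth to pixel coordinates in the other camera."""
    fx, fy, cx, cy = k_depth
    # Back-project the pixel into a 3D point in the depth camera frame.
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    z = depth
    # Move the point into the image camera frame: p = R * [x, y, z] + t.
    p = [
        rotation[r][0] * x + rotation[r][1] * y + rotation[r][2] * z + translation[r]
        for r in range(3)
    ]
    # Project the 3D point onto the image camera's pixel grid.
    fx2, fy2, cx2, cy2 = k_image
    return fx2 * p[0] / p[2] + cx2, fy2 * p[1] / p[2] + cy2


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
intrinsics = (500.0, 500.0, 320.0, 240.0)  # made-up example values
# A 5 cm horizontal offset between cameras shifts a point at 1 m depth by about 25 pixels.
print(register_depth_pixel(320, 240, 1.0, intrinsics, intrinsics, identity, [0.05, 0.0, 0.0]))
```

  • Applying this mapping to every valid depth pixel yields a depth map in the coordinate frame of the near-infrared or RGB image, so that corresponding face regions occupy the same position in both.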
  • In the face key point detection 424, the near-infrared image (or RGB image) is input to the face key point model for face key point detection, and face key point information 425 is obtained.
  • The output of the data preprocessing module corresponds to its input, and includes the aligned and corrected face depth map (corresponding to the input depth map 421), the face near-infrared image (corresponding to the input near-infrared image 422), and the face key point information.
  • The deep neural network module 43 is a binary classification model. For example, the classification label is 0 for a real face and 1 for a spoofed face; as another example, the classification label is 1 for a real face and 0 for a spoofed face, and so on.
  • FIG. 4C illustrates a block diagram of one example of the deep neural network module of FIG. 4A according to an embodiment of the present disclosure. As shown in FIG. 4C, the input of the deep neural network module includes the face depth map 431 obtained from the data preprocessing module, the face near-infrared image 432 (or another form of two-dimensional face image), and the face key point information 433.
  • The output of the deep neural network module includes a discriminant score, i.e., the probability of being judged a real person or a spoof.
  • The final decision is binary: the output score is compared with a preset threshold, and the threshold can be adjusted according to the required precision and recall. For example, if the output score of the neural network is greater than the threshold, the input is judged to be a spoof, and if it is smaller than the threshold, the input is judged to be a living body.
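  • The trade-off behind adjusting the threshold can be made concrete by computing precision and recall for spoof detection at several thresholds. The function name, the convention of returning 0.0 when a ratio is undefined, and the scores and labels below are made-up illustrations, not data from the disclosure.

```python
def precision_recall(scores, is_spoof, threshold):
    """Treat 'score > threshold' as a spoof prediction and evaluate it against labels."""
    predicted = [s > threshold for s in scores]
    true_pos = sum(p and a for p, a in zip(predicted, is_spoof))
    false_pos = sum(p and not a for p, a in zip(predicted, is_spoof))
    false_neg = sum((not p) and a for p, a in zip(predicted, is_spoof))
    # Return 0.0 when a ratio is undefined (no predictions or no spoofs); an arbitrary convention.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


scores = [0.1, 0.4, 0.6, 0.9]            # made-up network output scores
is_spoof = [False, True, False, True]    # made-up ground-truth labels
for threshold in (0.3, 0.5, 0.7):
    print(threshold, precision_recall(scores, is_spoof, threshold))
```

  • In this toy data, lowering the threshold catches every spoof at the cost of rejecting a real face, while raising it rejects no real faces but misses a spoof.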
  • The deep neural network is a multi-branch model, and the number of branches is determined by the number of input images.
  • FIG. 4C takes a face depth map and a face near-infrared image as an example. The deep neural network includes two branches, and each branch includes a plurality of convolutional layers 434, downsampling layers 435, and a fully connected layer 436. The face depth map 431 and the face key point information 433 are input to the first branch for feature extraction, and the face near-infrared image 432 and the face key point information 433 are input to the second branch for feature extraction. The features extracted by the branches are then concatenated and input to the fully connected layer 437, and the result is finally processed by the Softmax layer 438.
  • The output layer has 2 neurons, representing the probabilities of a real person and of a spoof, respectively.
  • The inputs of both branches in FIG. 4C include the face key point information.
  • The fully connected layer 437 uses the face key point information to fuse the feature information output by the fully connected layers 436 of the two branches.
  • The output of the fully connected layer 436 in the first branch is the first feature information, the output of the fully connected layer 436 in the second branch is the second feature information, and the fully connected layer 437 uses the face key point information to fuse the first feature information and the second feature information together through a fully connected operation.
  • In this way, the face key point information is used to fuse the face depth map and the face near-infrared image to obtain the final output result.
  • For a real face, the output label is 0; for a spoofed face, the output label is 1. The embodiments of the present disclosure are not limited in this respect.
  • The embodiments of the present disclosure use a 3D sensor that provides depth information, together with other auxiliary images such as near-infrared images and RGB images.
  • The proposed framework can be applied to a variety of 3D sensor input forms, including: a 3D depth map plus a near-infrared image provided by a TOF camera; a 3D depth map plus a near-infrared image with spots provided by a structured light camera; a 3D depth map plus an RGB image; a 3D depth map plus a near-infrared image plus an RGB image; and other forms that include a 3D depth map and a near-infrared or RGB image.
  • Approaches based mainly on ordinary cameras and binocular cameras do not fully exploit the depth information of face data and are easily attacked by high-definition spoofs, whereas the embodiments of the present disclosure use the face depth map collected by a 3D sensor to prevent planar spoof attacks.
  • 3D depth information, near-infrared or RGB data, face key point information, and eye corner and pupil features are fused, and real people are distinguished from spoofs by training a deep learning model.
  • Approaches dominated by a single type of data do not exploit the correlation and complementarity between multimodal data. In addition, the conventional binocular depth calculation method has high computational complexity and low precision, whereas the embodiments of the present disclosure can make effective use of current 3D sensing technology to obtain a more accurate 3D face data distribution.
  • A multi-branch model is adopted. The multi-branch model can fully fuse multimodal data, is compatible with multiple data forms, and can learn the features of real faces through a neural network.
  • The embodiments of the present disclosure combine face depth information, near-infrared or RGB face information, face key point information, and multi-dimensional biometric features such as eye corners, eyes, and pupils, to make up for the weakness of any single technique, which is easily spoofed.
  • FIG. 9 illustrates a block diagram of a living body detecting apparatus according to an embodiment of the present disclosure.
  • The apparatus includes: an obtaining module 91 configured to acquire depth information of a target object sensed by a first sensor and a target image sensed by a second sensor; a detecting module 92 configured to perform key point detection on the target image to obtain key point information of the target object; and a determining module 93 configured to obtain the living body detection result of the target object based on the depth information of the target object and the key point information of the target object.
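  • The way the three modules hand data to one another can be sketched as a small Python class that wires three callables together. The class name, field names, and stub functions are hypothetical, and the determining step also receives the target image because its second submodule consumes it; real modules would wrap sensor drivers and neural networks.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class LivingBodyDetector:
    acquire: Callable[[], Tuple[Any, Any]]        # obtaining module 91: returns (depth, image)
    detect_keypoints: Callable[[Any], Any]        # detecting module 92: image -> key points
    determine: Callable[[Any, Any, Any], str]     # determining module 93: final result

    def run(self) -> str:
        depth, image = self.acquire()
        keypoints = self.detect_keypoints(image)
        return self.determine(depth, image, keypoints)


# Stub modules that only show the order in which data flows.
detector = LivingBodyDetector(
    acquire=lambda: ("depth-map", "nir-image"),
    detect_keypoints=lambda image: f"keypoints({image})",
    determine=lambda depth, image, keypoints: f"decision from {depth}, {image}, {keypoints}",
)
print(detector.run())
```

  • Keeping the modules as separate callables mirrors the block structure of FIG. 9 and lets each module be replaced independently, for example when the sensor hardware changes.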
  • The target object is a human face.
  • The second sensor is an RGB sensor or a near-infrared sensor.
  • FIG. 10 illustrates an exemplary block diagram of a living body detecting apparatus according to an embodiment of the present disclosure. As shown in FIG. 10:
  • The apparatus further includes an alignment module 94 configured to align the depth information of the target object with the target image based on the parameters of the first sensor and the parameters of the second sensor.
  • The determining module 93 includes: a first determining submodule 931 configured to obtain first feature information based on the depth information of the target object and the key point information of the target object; a second determining submodule 932 configured to obtain second feature information based on the key point information of the target object; and a third determining submodule 933 configured to determine the living body detection result of the target object based on the first feature information and the second feature information.
  • The first determining submodule 931 is configured to input the depth information of the target object and the key point information of the target object into the first neural network for processing to obtain the first feature information; the second determining submodule 932 is configured to input the target image and the key point information of the target object into the second neural network for processing to obtain the second feature information.
  • The first neural network and the second neural network have the same network structure.
  • The first determining submodule 931 includes: a first convolution unit configured to perform convolution processing on the depth information of the target object and the key point information of the target object to obtain a first convolution result; a first downsampling unit configured to perform downsampling processing on the first convolution result to obtain a first downsampling result; and a first determining unit configured to obtain the first feature information based on the first downsampling result.
  • The second determining submodule 932 includes: a second convolution unit configured to perform convolution processing on the target image and the key point information of the target object to obtain a second convolution result; a second downsampling unit configured to perform downsampling processing on the second convolution result to obtain a second downsampling result; and a second determining unit configured to obtain the second feature information based on the second downsampling result.
  • The third determining submodule 933 includes: a fully connected unit configured to perform fusion processing (e.g., a fully connected operation) on the first feature information and the second feature information to obtain third feature information; and a third determining unit configured to determine the living body detection result of the target object according to the third feature information.
  • The third determining unit includes: a first determining subunit configured to obtain the probability that the target object is a living body based on the third feature information; and a second determining subunit configured to determine the living body detection result of the target object according to the probability that the target object is a living body.
  • The embodiments of the present disclosure perform living body detection by combining the depth information of the target object with the target image, so that living body detection uses both the depth information of the target object and the key point information of the target object in the target image, thereby improving the accuracy of living body detection and preventing prosthesis image attacks.
  • FIG. 11 is a block diagram of a living body detecting apparatus 800, according to an exemplary embodiment.
  • Device 800 can be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a gaming console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
  • Device 800 can include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
  • Processing component 802 typically controls the overall operation of device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
  • Processing component 802 can include one or more processors 820 to execute instructions to perform all or part of the steps of the above described methods.
  • In addition, processing component 802 can include one or more modules to facilitate interaction between processing component 802 and other components.
  • For example, processing component 802 can include a multimedia module to facilitate interaction between multimedia component 808 and processing component 802.
  • Memory 804 is configured to store various types of data to support operation at device 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phone book data, messages, pictures, videos, and the like.
  • The memory 804 can be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
  • Power component 806 provides power to various components of device 800. Power component 806 can include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for device 800.
  • The multimedia component 808 includes a screen that provides an output interface between the device 800 and the user.
  • The screen can include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen can be implemented as a touch screen to receive input signals from the user.
  • The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation.
  • The multimedia component 808 includes a front camera and/or a rear camera. When the device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data.
  • The audio component 810 is configured to output and/or input audio signals.
  • For example, the audio component 810 includes a microphone (MIC) that is configured to receive an external audio signal when the device 800 is in an operational mode, such as a call mode, a recording mode, or a voice recognition mode.
  • The received audio signal may be further stored in memory 804 or transmitted via communication component 816.
  • The audio component 810 also includes a speaker for outputting audio signals.
  • The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, or the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
  • Sensor component 814 includes one or more sensors for providing device 800 with status assessments of various aspects.
  • For example, sensor component 814 can detect the open/closed state of device 800 and the relative positioning of components, such as the display and keypad of device 800. Sensor component 814 can also detect a change in position of device 800 or of one component of device 800, the presence or absence of user contact with device 800, the orientation or acceleration/deceleration of device 800, and temperature changes of device 800.
  • Sensor component 814 can include a proximity sensor configured to detect the presence of nearby objects without any physical contact.
  • Sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • Sensor component 814 can also include an acceleration sensor, a gyro sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • Communication component 816 is configured to facilitate wired or wireless communication between device 800 and other devices.
  • the device 800 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof.
  • communication component 816 receives broadcast signals or broadcast associated information from an external broadcast management system via a broadcast channel.
  • the communication component 816 also includes a near field communication (NFC) module to facilitate short range communication.
  • the NFC module can be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
  • the device 800 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above methods.
  • a non-volatile computer readable storage medium is also provided, such as a memory 804 comprising computer program instructions that are executable by the processor 820 of the device 800 to perform the above method.
  • Embodiments of the present disclosure may be systems, methods, and/or computer program products.
  • the computer program product can comprise a computer readable storage medium having computer readable program instructions embodied thereon for causing a processor to implement various aspects of the present disclosure.
  • the computer readable storage medium can be a tangible device that can hold and store the instructions used by the instruction execution device.
  • the computer readable storage medium can be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • more specific examples (a non-exhaustive list) of computer readable storage media include: portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read only memory (CD-ROM), digital versatile disks (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove with instructions stored thereon, and any suitable combination of the above.
  • a computer readable storage medium as used herein is not to be interpreted as a transient signal itself, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (e.g., a light pulse through a fiber optic cable), or an electrical signal transmitted through a wire.
  • the computer readable program instructions described herein can be downloaded from a computer readable storage medium to various computing/processing devices or downloaded to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers, and/or edge servers.
  • a network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
  • the computer program instructions for performing the operations of the embodiments of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine related instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object oriented programming languages such as Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the computer readable program instructions can execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • in the case involving a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or wide area network (WAN), or can be connected to an external computer (e.g., through the Internet using an Internet service provider).
  • an electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), can be personalized by utilizing state information of the computer readable program instructions.
  • the electronic circuit can execute the computer readable program instructions to implement various aspects of the disclosed embodiments.
  • the computer readable program instructions can be provided to a processor of a general purpose computer, a special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • the computer readable program instructions can also be stored in a computer readable storage medium and cause a computer, programmable data processing device, and/or other device to operate in a particular manner, such that the computer readable medium storing the instructions comprises an article of manufacture that includes instructions for implementing various aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • the computer readable program instructions can also be loaded onto a computer, other programmable data processing device, or other device, so that a series of operational steps is performed on the computer, other programmable data processing device, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing device, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • each block in the flowchart or block diagram can represent a module, a program segment, or a portion of an instruction that includes one or more executable instructions for implementing the specified logical functions.
  • in some alternative implementations, the functions noted in the blocks can also occur in a different order than that illustrated in the drawings. For example, two consecutive blocks may be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Collating Specific Patterns (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

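Embodiments of the present disclosure relate to a living body detection method and apparatus, a device, and a storage medium. The method includes: acquiring depth information of a target object sensed by a first sensor and a target image sensed by a second sensor; performing key point detection on the target image to obtain key point information of the target object; and obtaining a living body detection result of the target object based on the depth information of the target object and the key point information of the target object.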

Description

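Living Body Detection Method and Apparatus, System, Electronic Device, and Storage Medium
Cross-Reference to Related Applications
The present application is based on, and claims priority to, Chinese Patent Application No. 201810444105.4, filed on May 10, 2018, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of computer vision technologies, and in particular to a living body detection method and apparatus, a system, an electronic device, and a storage medium.
Background
At present, face recognition technology is widely used in scenarios such as face unlocking, face payment, unmanned supermarkets, and video surveillance. However, face recognition is vulnerable to attacks by fake faces in the form of physical photos of a face, electronic photos of a face, or videos containing a face. Living body detection is therefore an indispensable part of face recognition.
Summary
In view of this, embodiments of the present disclosure provide a living body detection method and apparatus.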
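According to one aspect of the embodiments of the present disclosure, a living body detection method is provided, including: acquiring depth information of a target object sensed by a first sensor and a target image sensed by a second sensor; performing key point detection on the target image to obtain key point information of the target object; and obtaining a living body detection result of the target object based on the depth information of the target object and the key point information of the target object.
In some embodiments, the target object is a face.
In some embodiments, the second sensor is an image sensor; for example, the second sensor is an RGB sensor or a near-infrared sensor.
In some embodiments, the first sensor is a depth sensor; for example, the first sensor is a time-of-flight (TOF) sensor or a structured light sensor.
In some embodiments, the first sensor and the second sensor are integrated in the same device, for example in a 3D camera.
In some embodiments, before the key point detection is performed on the target image, the method further includes:
aligning the depth information of the target object with the target image according to parameters of the first sensor and parameters of the second sensor.
In some embodiments, obtaining the living body detection result of the target object based on the depth information of the target object and the key point information of the target object includes:
obtaining first feature information based on the depth information of the target object and the key point information of the target object;
obtaining second feature information based on the key point information of the target object; and
determining the living body detection result of the target object based on the first feature information and the second feature information.
In some embodiments, obtaining the first feature information based on the depth information of the target object and the key point information of the target object includes: inputting the depth information of the target object and the key point information of the target object into a first neural network for processing to obtain the first feature information;
and obtaining the second feature information based on the key point information of the target object includes: inputting the target image and the key point information of the target object into a second neural network for processing to obtain the second feature information.
In some embodiments, the first neural network and the second neural network have the same network structure.
In some embodiments, obtaining the first feature information based on the depth information of the target object and the key point information of the target object includes: performing convolution processing on the depth information of the target object and the key point information of the target object to obtain a first convolution result; performing downsampling processing on the first convolution result to obtain a first downsampling result; and obtaining the first feature information based on the first downsampling result.
In some embodiments, obtaining the second feature information based on the key point information of the target object includes:
performing convolution processing on the target image and the key point information of the target object to obtain a second convolution result;
performing downsampling processing on the second convolution result to obtain a second downsampling result; and
obtaining the second feature information based on the second downsampling result.
In some embodiments, determining the living body detection result of the target object based on the first feature information and the second feature information includes: performing fusion processing on the first feature information and the second feature information to obtain third feature information; and determining the living body detection result of the target object according to the third feature information.
In some embodiments, determining the living body detection result according to the third feature information includes:
obtaining, based on the third feature information, the probability that the target object is a living body; and
determining the living body detection result of the target object according to the probability that the target object is a living body.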
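According to another aspect of the embodiments of the present disclosure, a living body detection apparatus is provided, including:
an acquisition module configured to acquire depth information of a target object sensed by a first sensor and a target image sensed by a second sensor;
a detection module configured to perform key point detection on the target image to obtain key point information of the target object; and
a determining module configured to obtain a living body detection result of the target object based on the depth information of the target object and the key point information of the target object.
In some embodiments, the target object is a face.
In some embodiments, the second sensor is an image sensor; for example, the second sensor is an RGB sensor or a near-infrared sensor.
In some embodiments, the first sensor is a depth sensor; for example, the first sensor is a time-of-flight (TOF) sensor or a structured light sensor.
In some embodiments, the first sensor and the second sensor are integrated in the same device, for example in a 3D camera.
In some embodiments, the apparatus further includes: an alignment module configured to align the depth information of the target object with the target image according to parameters of the first sensor and parameters of the second sensor.
In some embodiments, the determining module includes: a first determining submodule configured to obtain first feature information based on the depth information of the target object and the key point information of the target object; a second determining submodule configured to obtain second feature information based on the key point information of the target object; and a third determining submodule configured to determine the living body detection result of the target object based on the first feature information and the second feature information.
In some embodiments, the first determining submodule is configured to: input the depth information of the target object and the key point information of the target object into a first neural network for processing to obtain the first feature information;
and the second determining submodule is configured to: input the target image and the key point information of the target object into a second neural network for processing to obtain the second feature information.
In some embodiments, the first neural network and the second neural network have the same network structure.
In some embodiments, the first determining submodule includes: a first convolution unit configured to perform convolution processing on the depth information of the target object and the key point information of the target object to obtain a first convolution result; a first downsampling unit configured to perform downsampling processing on the first convolution result to obtain a first downsampling result; and a first determining unit configured to obtain the first feature information based on the first downsampling result.
In some embodiments, the second determining submodule includes: a second convolution unit configured to perform convolution processing on the target image and the key point information of the target object to obtain a second convolution result; a second downsampling unit configured to perform downsampling processing on the second convolution result to obtain a second downsampling result; and a second determining unit configured to obtain the second feature information based on the second downsampling result.
In some embodiments, the third determining submodule includes: a fully connected unit configured to perform fusion processing on the first feature information and the second feature information to obtain third feature information; and a third determining unit configured to determine the living body detection result of the target object according to the third feature information.
In some embodiments, the third determining unit includes: a first determining subunit configured to obtain, based on the third feature information, the probability that the target object is a living body; and a second determining subunit configured to determine the living body detection result of the target object according to the probability that the target object is a living body.
The living body detection apparatus provided by the embodiments of the present disclosure is used to perform the living body detection method of any of the above embodiments, and includes modules and units for performing the steps and/or procedures of any possible living body detection method described above.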
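According to another aspect of the embodiments of the present disclosure, a living body detection apparatus is provided, including: a processor; and a memory configured to store processor-executable instructions; wherein the processor is configured to perform the above method.
According to another aspect of the embodiments of the present disclosure, a non-volatile computer readable storage medium is provided, on which computer program instructions are stored, wherein the computer program instructions implement the above method when executed by a processor.
According to another aspect of the embodiments of the present disclosure, a living body detection system is provided, including: the above living body detection apparatus, the first sensor, and the second sensor.
According to another aspect of the embodiments of the present disclosure, a living body detection system is provided, including: the above non-volatile computer readable storage medium, the first sensor, and the second sensor.
According to another aspect of the embodiments of the present disclosure, an electronic device is provided, including:
a first sensor configured to detect depth information of a target object;
a second sensor configured to capture a target image that includes the target object; and
a processor configured to perform key point detection on the target object captured by the second sensor to obtain key point information of the target object, and to obtain a living body detection result of the target object based on the depth information of the target object detected by the first sensor and the key point information of the target object.
In some embodiments, the second sensor is an RGB sensor or a near-infrared sensor.
In some embodiments, the first sensor is a time-of-flight (TOF) sensor or a structured light sensor.
In some embodiments, the processor is further configured to: align the depth information of the target object with the target image according to parameters of the first sensor and parameters of the second sensor.
In some embodiments, the processor is configured to: obtain first feature information based on the depth information of the target object and the key point information of the target object; obtain second feature information based on the key point information of the target object; and determine the living body detection result of the target object based on the first feature information and the second feature information.
In some embodiments, the processor is configured to: input the depth information of the target object and the key point information of the target object into a first neural network for processing to obtain the first feature information;
and obtaining the second feature information based on the key point information of the target object includes: inputting the target image and the key point information of the target object into a second neural network for processing to obtain the second feature information.
In some embodiments, the processor is configured to: perform convolution processing on the depth information of the target object and the key point information of the target object to obtain a first convolution result; perform downsampling processing on the first convolution result to obtain a first downsampling result; and obtain the first feature information based on the first downsampling result.
In some embodiments, the processor is configured to: perform convolution processing on the target image and the key point information of the target object to obtain a second convolution result; perform downsampling processing on the second convolution result to obtain a second downsampling result; and obtain the second feature information based on the second downsampling result.
In some embodiments, the processor is configured to: perform fusion processing on the first feature information and the second feature information to obtain third feature information; and determine the living body detection result of the target object according to the third feature information.
In some embodiments, the processor is configured to: obtain, based on the third feature information, the probability that the target object is a living body; and determine the living body detection result of the target object according to the probability that the target object is a living body.
The living body detection method of the various aspects of the present disclosure performs living body detection by combining the depth information of the target object with the target image. It can therefore use both the depth information of the target object and the key point information of the target object in the target image, which improves the accuracy of living body detection. Other features and aspects of the present disclosure will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.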
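Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features, and aspects of the present disclosure together with the specification, and serve to explain the principles of the present disclosure.
FIG. 1 shows a flowchart of a living body detection method according to an embodiment of the present disclosure.
FIG. 2 shows an exemplary flowchart of a living body detection method according to an embodiment of the present disclosure.
FIG. 3 shows an exemplary flowchart of step S13 of the living body detection method according to an embodiment of the present disclosure.
FIG. 4A shows a block diagram of a living body detection apparatus applied to faces according to an embodiment of the present disclosure.
FIG. 4B shows a block diagram of the data preprocessing module in FIG. 4A according to an embodiment of the present disclosure.
FIG. 4C shows a block diagram of the deep neural network module in FIG. 4A according to an embodiment of the present disclosure.
FIG. 5 shows an exemplary flowchart of step S131 of the living body detection method according to an embodiment of the present disclosure.
FIG. 6 shows an exemplary flowchart of step S132 of the living body detection method according to an embodiment of the present disclosure.
FIG. 7 shows an exemplary flowchart of step S133 of the living body detection method according to an embodiment of the present disclosure.
FIG. 8 shows an exemplary flowchart of step S1332 of the living body detection method according to an embodiment of the present disclosure.
FIG. 9 shows a block diagram of a living body detection apparatus according to an embodiment of the present disclosure.
FIG. 10 shows an exemplary block diagram of a living body detection apparatus according to an embodiment of the present disclosure.
FIG. 11 is a block diagram of a living body detection device 800 according to an exemplary embodiment.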
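Detailed Description
Various exemplary embodiments, features, and aspects of the present disclosure are described in detail below with reference to the accompanying drawings. The same reference numerals in the drawings denote elements with the same or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated. The word "exemplary" as used here means "serving as an example, embodiment, or illustration". Any embodiment described here as "exemplary" is not necessarily to be construed as superior to or better than other embodiments. In addition, numerous specific details are given in the detailed description below in order to better explain the present disclosure. A person skilled in the art will understand that the embodiments of the present disclosure can be practiced without some of these specific details. In some instances, methods, means, elements, and circuits well known to a person skilled in the art are not described in detail, so as to highlight the gist of the embodiments of the present disclosure.
FIG. 1 shows a flowchart of a living body detection method according to an embodiment of the present disclosure. The method can be applied to terminal devices with a face recognition function, such as mobile phones, tablet computers, digital cameras, or access control devices. The method can be used in scenarios such as face unlocking, face payment, unmanned supermarkets, and video surveillance. As shown in FIG. 1, the method includes steps S11 to S13.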
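In step S11, depth information of a target object sensed by a first sensor and a target image sensed by a second sensor are acquired.
In some embodiments, the target object is a face. In some embodiments, the first sensor is a three-dimensional sensor. For example, the first sensor may be a ToF (Time of Flight) sensor, a structured light sensor, a binocular sensor, or another type of depth sensor. Acquiring the depth information of the target object with a three-dimensional sensor yields high-precision depth information. The embodiments of the present disclosure perform living body detection using depth information that contains the target object, which makes full use of that depth information and thereby improves the accuracy of living body detection. For example, when the target object is a face, the embodiments of the present disclosure perform living body detection using depth information that contains the face, which makes full use of the depth information of the face data and thereby improves the accuracy of live face detection.
It should be noted that although the first sensor is described above using the ToF sensor, the structured light sensor, and the binocular sensor as examples, a person skilled in the art will understand that the embodiments of the present disclosure are not limited thereto. The type of the first sensor may be chosen flexibly according to the requirements of the actual application scenario and/or personal preference, as long as the depth information of the target object can be sensed by the first sensor.
In the embodiments of the present disclosure, the depth information of the target object may be any information that reflects the depth of the target object; the embodiments of the present disclosure do not limit its specific implementation. In some embodiments, the depth information of the target object may be a depth image of the target object. In other embodiments, the depth information of the target object may be a point cloud of the target object, where the point cloud records the three-dimensional coordinates of the points of the target object. In still other embodiments, the depth information of the target object may be a table or another type of file that records the depth of each point of the target object.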
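To illustrate how two of the depth representations mentioned above (a depth image and a point cloud) relate to each other, the following minimal sketch back-projects a depth image into a list of 3D points using a pinhole camera model. The pinhole model, the intrinsic parameters `fx`, `fy`, `cx`, `cy`, and the convention that a non-positive value marks a missing measurement are assumptions made for this example; the disclosure does not specify them.

```python
def depth_image_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image into a list of (x, y, z) points.

    depth: list of rows, each a list of depth values (hypothetical unit: metres).
    fx, fy, cx, cy: assumed pinhole intrinsics of the depth sensor.
    Pixels with depth <= 0 are treated as missing and skipped (assumption).
    """
    points = []
    for v, row in enumerate(depth):          # v: pixel row index
        for u, z in enumerate(row):          # u: pixel column index
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points
```

Either representation carries the same per-point depth, so a downstream step could in principle accept whichever form the sensor delivers.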
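In some embodiments, the second sensor may be an RGB (Red, Green, Blue) sensor or a near-infrared sensor. If the second sensor is an RGB sensor or another type of image sensor, the target image sensed by the second sensor is an RGB image. If the second sensor is a near-infrared sensor, the target image sensed by the second sensor is a near-infrared image, which may be a near-infrared image with light spots or a near-infrared image without light spots, and so on. It should be noted that although the second sensor is described above using the RGB sensor and the near-infrared sensor as examples, a person skilled in the art will understand that the embodiments of the present disclosure are not limited thereto. The type of the second sensor may be chosen flexibly according to the requirements of the actual application scenario and/or personal preference, as long as the key point information of the target object can be obtained from the target image sensed by the second sensor.
In some embodiments, the depth map and the target image are captured by a 3D camera, where the 3D camera includes an image sensor for capturing images and a depth sensor for capturing depth information. For example, a terminal device captures three-dimensional information of the target object with its built-in 3D camera. In other embodiments, the depth information and the target image are obtained from another device; for example, a living body detection request sent by a terminal device is received, and the request carries the depth information and the target image.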
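In step S12, key point detection is performed on the target image to obtain key point information of the target object.
The key point information of the target object may include position information of the key points of the target object.
In the embodiments of the present disclosure, if the target object is a face, the key points of the target object may include one or more of eye key points, eyebrow key points, nose key points, mouth key points, face contour key points, and the like. The eye key points may include one or more of eye contour key points, eye corner key points, pupil key points, and the like.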
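The disclosure states that key point information can consist of key point positions but does not say how those positions are encoded before being fed to later processing. One possible encoding, shown here purely as an assumption, is a binary mask with the same size as the target image in which each key point position is set to 1. The function name `keypoints_to_mask` and the `(x, y)` pixel convention are hypothetical.

```python
def keypoints_to_mask(keypoints, height, width):
    """Render key point positions into a binary mask (one possible encoding).

    keypoints: iterable of (x, y) pixel positions, x along columns, y along rows.
    Positions falling outside the image are ignored.
    """
    mask = [[0.0] * width for _ in range(height)]
    for x, y in keypoints:
        col, row = int(round(x)), int(round(y))
        if 0 <= row < height and 0 <= col < width:
            mask[row][col] = 1.0
    return mask
```

A mask of this kind can be stacked with an image or depth map as an extra channel, which is one way to give a convolutional layer access to both inputs at once.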
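In step S13, a living body detection result of the target object is obtained based on the depth information of the target object and the key point information of the target object.
The living body detection result of the target object may be that the target object is a living body or that the target object is a fake body; for example, the result may be that the target object is a live face or a fake face.
The embodiments of the present disclosure perform living body detection by combining the depth information of the target object with the target image. They can therefore use both the depth information of the target object and the key point information of the target object in the target image, which improves the accuracy of living body detection.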
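FIG. 2 shows an exemplary flowchart of a living body detection method according to an embodiment of the present disclosure. As shown in FIG. 2, the method may include steps S21 to S24.
In step S21, depth information of a target object sensed by a first sensor and a target image sensed by a second sensor are acquired. For step S21, see the description of step S11 above. In step S22, the depth information of the target object and the target image are aligned according to parameters of the first sensor and parameters of the second sensor.
In some embodiments, the depth information of the target object may be transformed so that the transformed depth information is aligned with the target image. For example, if the depth information of the target object is a depth image of the target object, a transformation matrix from the parameter matrix of the first sensor to the parameter matrix of the second sensor is determined from the two parameter matrices, and the depth image of the target object is transformed according to this transformation matrix.
In other embodiments, the target image may be transformed so that the transformed target image is aligned with the depth information. For example, if the depth information of the target object is a depth image of the target object, a transformation matrix from the parameter matrix of the second sensor to the parameter matrix of the first sensor is determined from the two parameter matrices, and the target image is transformed according to this transformation matrix.
In the embodiments of the present disclosure, the parameters of the first sensor may include intrinsic parameters and/or extrinsic parameters of the first sensor, and the parameters of the second sensor may include intrinsic parameters and/or extrinsic parameters of the second sensor.
In the embodiments of the present disclosure, when the depth information of the target object is a depth image of the target object, aligning the depth information of the target object with the target image makes corresponding parts of the depth image and the target image occupy the same positions in the two images.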
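The alignment in step S22 can be sketched as a reprojection of each depth pixel into the image sensor's pixel grid. The sketch below is one common way to do this and is not taken from the disclosure: it assumes pinhole intrinsics `(fx, fy, cx, cy)` for both sensors and a rigid transform (`rotation`, `translation`) from the depth sensor frame to the image sensor frame. All parameter names are hypothetical.

```python
def align_depth_to_image(depth, k_depth, k_image, rotation, translation, out_h, out_w):
    """Reproject a depth image into the image sensor's pixel grid.

    k_depth, k_image: (fx, fy, cx, cy) intrinsics of each sensor (assumed pinhole).
    rotation: 3x3 nested list; translation: length-3 list; both map points from
    the depth sensor frame to the image sensor frame (assumed extrinsics).
    Returns an out_h x out_w depth map; 0.0 marks pixels with no depth.
    """
    fx_d, fy_d, cx_d, cy_d = k_depth
    fx_i, fy_i, cx_i, cy_i = k_image
    aligned = [[0.0] * out_w for _ in range(out_h)]
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # skip missing measurements
            # Back-project the depth pixel to a 3D point in the depth sensor frame.
            p = ((u - cx_d) * z / fx_d, (v - cy_d) * z / fy_d, z)
            # Move the point into the image sensor frame.
            q = [sum(rotation[r][c] * p[c] for c in range(3)) + translation[r]
                 for r in range(3)]
            if q[2] <= 0:
                continue
            # Project into the image sensor's pixel grid (nearest pixel).
            ui = int(round(fx_i * q[0] / q[2] + cx_i))
            vi = int(round(fy_i * q[1] / q[2] + cy_i))
            if 0 <= ui < out_w and 0 <= vi < out_h:
                # Keep the closest surface when several points land on one pixel.
                if aligned[vi][ui] == 0.0 or q[2] < aligned[vi][ui]:
                    aligned[vi][ui] = q[2]
    return aligned
```

With an identity rotation, zero translation, and identical intrinsics, the function returns the input unchanged, which is a quick sanity check for the parameter conventions.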
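In step S23, key point detection is performed on the target image to obtain key point information of the target object. For step S23, see the description of step S12 above.
In step S24, a living body detection result of the target object is obtained based on the depth information of the target object and the key point information of the target object. For step S24, see the description of step S13 above.
FIG. 3 shows an exemplary flowchart of step S13 of the living body detection method according to an embodiment of the present disclosure. As shown in FIG. 3, step S13 may include steps S131 to S133.
In step S131, first feature information is obtained based on the depth information of the target object and the key point information of the target object.
In some embodiments, obtaining the first feature information based on the depth information of the target object and the key point information of the target object includes: inputting the depth information of the target object and the key point information of the target object into a first neural network for processing to obtain the first feature information. As one example of this implementation, the first neural network may include convolutional layers, downsampling layers, and fully connected layers. For example, the first neural network may include one stage of convolutional layers, one stage of downsampling layers, and one stage of fully connected layers, where the convolutional stage may include one or more convolutional layers, the downsampling stage may include one or more downsampling layers, and the fully connected stage may include one or more fully connected layers. As another example, the first neural network may include multiple stages of convolutional layers, multiple stages of downsampling layers, and one stage of fully connected layers, where each convolutional stage may include one or more convolutional layers, each downsampling stage may include one or more downsampling layers, and the fully connected stage may include one or more fully connected layers. The i-th convolutional stage is followed by the i-th downsampling stage, the i-th downsampling stage is followed by the (i+1)-th convolutional stage, and the n-th downsampling stage is followed by the fully connected layers, where i and n are positive integers, 1≤i≤n, and n denotes the number of convolutional stages and downsampling stages in the first neural network.
As another example of this implementation, the first neural network may include convolutional layers, downsampling layers, normalization layers, and fully connected layers. For example, the first neural network may include one stage of convolutional layers, one normalization layer, one stage of downsampling layers, and one stage of fully connected layers, where the convolutional stage may include one or more convolutional layers, the downsampling stage may include one or more downsampling layers, and the fully connected stage may include one or more fully connected layers.
As another example, the first neural network may include multiple stages of convolutional layers, multiple normalization layers, multiple stages of downsampling layers, and one stage of fully connected layers, where each convolutional stage may include one or more convolutional layers, each downsampling stage may include one or more downsampling layers, and the fully connected stage may include one or more fully connected layers. The i-th convolutional stage is followed by the i-th normalization layer, the i-th normalization layer is followed by the i-th downsampling stage, the i-th downsampling stage is followed by the (i+1)-th convolutional stage, and the n-th downsampling stage is followed by the fully connected layers, where i and n are positive integers, 1≤i≤n, and n denotes the number of convolutional stages, downsampling stages, and normalization layers in the first neural network.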
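The cascade order described above (convolution stage i, then normalization layer i, then downsampling stage i, repeated n times and followed by the fully connected stage) can be made concrete with a small helper that lists the layers in order. The function name, the layer labels, and the keyword parameters are hypothetical and serve only to illustrate the ordering.

```python
def build_stage_sequence(n, convs_per_stage=1, downs_per_stage=1,
                         fcs=1, use_norm=True):
    """List layer labels in cascade order for an n-stage network.

    Stage i contributes its convolutional layers, optionally one
    normalization layer, then its downsampling layers. The fully
    connected layers follow the n-th downsampling stage.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    layers = []
    for i in range(1, n + 1):
        layers += [f"conv{i}_{k}" for k in range(1, convs_per_stage + 1)]
        if use_norm:
            layers.append(f"norm{i}")
        layers += [f"down{i}_{k}" for k in range(1, downs_per_stage + 1)]
    layers += [f"fc_{k}" for k in range(1, fcs + 1)]
    return layers
```

For instance, `build_stage_sequence(2)` lists two conv/norm/down stages followed by a single fully connected layer, and `use_norm=False` gives the variant without normalization layers.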
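In step S132, second feature information is obtained based on the key point information of the target object.
In some embodiments, obtaining the second feature information based on the key point information of the target object includes: inputting the target image and the key point information of the target object into a second neural network for processing to obtain the second feature information.
As one example of this implementation, the second neural network may include convolutional layers, downsampling layers, and fully connected layers.
For example, the second neural network may include one stage of convolutional layers, one stage of downsampling layers, and one stage of fully connected layers, where the convolutional stage may include one or more convolutional layers, the downsampling stage may include one or more downsampling layers, and the fully connected stage may include one or more fully connected layers. As another example, the second neural network may include multiple stages of convolutional layers, multiple stages of downsampling layers, and one stage of fully connected layers, where each convolutional stage may include one or more convolutional layers, each downsampling stage may include one or more downsampling layers, and the fully connected stage may include one or more fully connected layers. The j-th convolutional stage is followed by the j-th downsampling stage, the j-th downsampling stage is followed by the (j+1)-th convolutional stage, and the m-th downsampling stage is followed by the fully connected layers, where j and m are positive integers, 1≤j≤m, and m denotes the number of convolutional stages and downsampling stages in the second neural network.
As another example of this implementation, the second neural network may include convolutional layers, downsampling layers, normalization layers, and fully connected layers. For example, the second neural network may include one stage of convolutional layers, one normalization layer, one stage of downsampling layers, and one stage of fully connected layers, where the convolutional stage may include one or more convolutional layers, the downsampling stage may include one or more downsampling layers, and the fully connected stage may include one or more fully connected layers. As another example, the second neural network may include multiple stages of convolutional layers, multiple normalization layers, multiple stages of downsampling layers, and one stage of fully connected layers, where each convolutional stage may include one or more convolutional layers, each downsampling stage may include one or more downsampling layers, and the fully connected stage may include one or more fully connected layers. The j-th convolutional stage is followed by the j-th normalization layer, the j-th normalization layer is followed by the j-th downsampling stage, the j-th downsampling stage is followed by the (j+1)-th convolutional stage, and the m-th downsampling stage is followed by the fully connected layers, where j and m are positive integers, 1≤j≤m, and m denotes the number of convolutional stages, downsampling stages, and normalization layers in the second neural network.
In some embodiments, the first neural network and the second neural network have the same network structure.
In step S133, the living body detection result of the target object is determined based on the first feature information and the second feature information.
It should be noted that the embodiments of the present disclosure do not limit the order in which steps S131 and S132 are performed, as long as both are performed before step S133. For example, step S131 may be performed before step S132, step S132 may be performed before step S131, or the two steps may be performed at the same time.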
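FIG. 5 shows an exemplary flowchart of step S131 of the living body detection method according to an embodiment of the present disclosure. As shown in FIG. 5, step S131 may include steps S1311 to S1313.
In step S1311, convolution processing is performed on the depth information of the target object and the key point information of the target object to obtain a first convolution result.
In step S1312, downsampling processing is performed on the first convolution result to obtain a first downsampling result.
In some embodiments, the convolution processing and downsampling processing may be performed on the depth information of the target object and the key point information of the target object by one stage of convolutional layers and one stage of downsampling layers, where the convolutional stage may include one or more convolutional layers and the downsampling stage may include one or more downsampling layers.
In another possible implementation, the convolution processing and downsampling processing may be performed on the depth information of the target object and the key point information of the target object by multiple stages of convolutional layers and multiple stages of downsampling layers, where each convolutional stage may include one or more convolutional layers and each downsampling stage may include one or more downsampling layers.
In some embodiments, performing downsampling processing on the first convolution result to obtain the first downsampling result may include: performing normalization processing on the first convolution result to obtain a first normalization result; and performing downsampling processing on the first normalization result to obtain the first downsampling result.
In step S1313, the first feature information is obtained based on the first downsampling result.
In some embodiments, the first downsampling result may be input into a fully connected layer, which performs fusion processing (for example, a fully connected operation) on the first downsampling result to obtain the first feature information.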
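Steps S1311 to S1313 can be sketched as a single-stage forward pass in plain Python. Several choices below are assumptions made for the example and are not fixed by the disclosure: the key point information is supplied as a map with the same size as the depth map, the convolution has one output channel and no padding, normalization is a per-map zero-mean/unit-variance rescaling, and downsampling is 2x2 max pooling. All weights are placeholders.

```python
def conv2d_valid(channels, kernels, bias=0.0):
    """Single-output-channel 'valid' convolution over stacked input channels."""
    h, w = len(channels[0]), len(channels[0][0])
    kh, kw = len(kernels[0]), len(kernels[0][0])
    out = []
    for y in range(h - kh + 1):
        row = []
        for x in range(w - kw + 1):
            s = bias
            for ch, ker in zip(channels, kernels):
                for dy in range(kh):
                    for dx in range(kw):
                        s += ch[y + dy][x + dx] * ker[dy][dx]
            row.append(s)
        out.append(row)
    return out

def normalize(fmap, eps=1e-5):
    """Rescale a feature map to zero mean and (approximately) unit variance."""
    vals = [v for row in fmap for v in row]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    scale = (var + eps) ** -0.5
    return [[(v - mean) * scale for v in row] for row in fmap]

def max_pool_2x2(fmap):
    """Downsample by taking the maximum of each non-overlapping 2x2 window."""
    return [[max(fmap[y][x], fmap[y][x + 1], fmap[y + 1][x], fmap[y + 1][x + 1])
             for x in range(0, len(fmap[0]) - 1, 2)]
            for y in range(0, len(fmap) - 1, 2)]

def fully_connected(vector, weights, biases):
    """One fully connected layer: weights is a list of rows, one per output."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, biases)]

def first_branch(depth, keypoint_map, kernels, fc_weights, fc_biases):
    conv = conv2d_valid([depth, keypoint_map], kernels)      # step S1311
    pooled = max_pool_2x2(normalize(conv))                   # step S1312
    flat = [v for row in pooled for v in row]
    return fully_connected(flat, fc_weights, fc_biases)      # step S1313
```

The second branch (steps S1321 to S1323) has the same shape under these assumptions, with the target image taking the place of the depth map.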
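FIG. 6 shows an exemplary flowchart of step S132 of the living body detection method according to an embodiment of the present disclosure. As shown in FIG. 6, step S132 may include steps S1321 to S1323.
In step S1321, convolution processing is performed on the target image and the key point information of the target object to obtain a second convolution result.
In step S1322, downsampling processing is performed on the second convolution result to obtain a second downsampling result.
In some embodiments, the convolution processing and downsampling processing may be performed on the target image and the key point information of the target object by one stage of convolutional layers and one stage of downsampling layers, where the convolutional stage may include one or more convolutional layers and the downsampling stage may include one or more downsampling layers. In another possible implementation, the convolution processing and downsampling processing may be performed on the target image and the key point information of the target object by multiple stages of convolutional layers and multiple stages of downsampling layers, where each convolutional stage may include one or more convolutional layers and each downsampling stage may include one or more downsampling layers. In some embodiments, performing downsampling processing on the second convolution result to obtain the second downsampling result may include: performing normalization processing on the second convolution result to obtain a second normalization result; and performing downsampling processing on the second normalization result to obtain the second downsampling result.
In step S1323, the second feature information is obtained based on the second downsampling result.
In some embodiments, the second downsampling result may be input into a fully connected layer, which performs fusion processing (for example, a fully connected operation) on the second downsampling result to obtain the second feature information.
FIG. 7 shows an exemplary flowchart of step S133 of the living body detection method according to an embodiment of the present disclosure. As shown in FIG. 7, step S133 may include steps S1331 and S1332.
In step S1331, fusion processing (for example, a fully connected operation) is performed on the first feature information and the second feature information to obtain third feature information.
In some embodiments, the first feature information and the second feature information may be concatenated (for example, by channel stacking) or added to obtain the third feature information. In one example, a fully connected layer performs a fully connected operation on the first feature information and the second feature information to obtain the third feature information.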
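The two fusion options named above, concatenation and addition, can be sketched on flat feature vectors as follows. Treating the feature information as a flat list of numbers, and the function and parameter names, are assumptions made for this example.

```python
def fuse_features(first, second, mode="concat"):
    """Fuse two feature vectors by concatenation or element-wise addition."""
    if mode == "concat":
        # Concatenation keeps both vectors intact; output length is the sum.
        return list(first) + list(second)
    if mode == "add":
        # Addition requires equal lengths; output length is unchanged.
        if len(first) != len(second):
            raise ValueError("addition requires feature vectors of equal length")
        return [a + b for a, b in zip(first, second)]
    raise ValueError(f"unknown fusion mode: {mode}")
```

Concatenation leaves it to a following layer to learn how the two modalities interact, while addition presumes the two branches already produce features in a shared space; this is one reason identical branch structures can be convenient.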
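In step S1332, the living body detection result of the target object is determined according to the third feature information.
FIG. 8 shows an exemplary flowchart of step S1332 of the living body detection method according to an embodiment of the present disclosure. As shown in FIG. 8, step S1332 may include steps S13321 and S13322.
In step S13321, the probability that the target object is a living body is obtained based on the third feature information.
In some embodiments, the third feature information may be input into a Softmax layer, and the probability that the target object is a living body is obtained through the Softmax layer. As one example of this implementation, the Softmax layer may include two neurons, one representing the probability that the target object is a living body and the other representing the probability that the target object is a fake body.
In step S13322, the living body detection result of the target object is determined according to the probability that the target object is a living body.
In some embodiments, determining the living body detection result of the target object according to the probability that the target object is a living body includes: if the probability that the target object is a living body is greater than a first threshold, determining that the living body detection result is that the target object is a living body; and if the probability is less than or equal to the first threshold, determining that the living body detection result is that the target object is a fake body. It should be noted that although the implementation of step S1332 is described above using the flow shown in FIG. 8, a person skilled in the art will understand that the embodiments of the present disclosure are not limited thereto. In another possible implementation, the probability that the target object is a fake body may be obtained based on the third feature information, and the living body detection result may be determined according to that probability. In this implementation, if the probability that the target object is a fake body is greater than a second threshold, the living body detection result is that the target object is a fake body; if the probability is less than or equal to the second threshold, the living body detection result is that the target object is a living body.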
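Steps S13321 and S13322 amount to a two-way softmax followed by a threshold comparison. The sketch below assumes that the Softmax layer receives two scores, ordered as (living body, fake body), and uses 0.5 as a placeholder for the first threshold; the disclosure does not give a value.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def liveness_result(scores, first_threshold=0.5):
    """Return 'living body' or 'fake body' from two scores (live, fake).

    The target is judged a living body only if its probability is strictly
    greater than the first threshold, matching the rule in step S13322.
    """
    p_live, _p_fake = softmax(scores)
    return "living body" if p_live > first_threshold else "fake body"
```

Because the comparison is strict, a probability exactly equal to the threshold is classified as a fake body, which is the conservative choice for an anti-spoofing check.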
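The embodiments of the present disclosure perform living body detection by combining the depth information of the target object with the target image. They can therefore use both the depth information of the target object and the key point information of the target object in the target image, which improves the accuracy of living body detection. The computational complexity is low, and a fairly accurate living body detection result can still be obtained when the camera shakes or vibrates slightly.
With the development of face recognition technology, the accuracy of face recognition can now exceed that of fingerprint recognition, and it is therefore widely used in scenarios such as video surveillance, face unlocking, and face payment. However, face recognition is at risk of being easily hacked (attacked), so living body detection is an indispensable part of face recognition applications.
Monocular living body detection takes images captured by an ordinary camera as input and has the drawback of being easily passed by high-definition seamless hack images. Binocular living body detection takes two cameras (ordinary RGB cameras or ordinary near-infrared cameras) as input and performs better than monocular living body detection. However, computing the depth distribution of a face through binocular matching is computationally expensive and yields low-precision depth information, and camera shake or vibration can easily change the camera parameters and invalidate the computation. In recent years, three-dimensional (3D) sensor technology has advanced rapidly, including time-of-flight (TOF) sensors, structured light sensors, and binocular sensors, so that users can conveniently obtain high-precision depth information directly from the sensor. The embodiments of the present disclosure take 3D data together with near-infrared data or RGB data as input, obtain face key point information from the near-infrared image or RGB image, and then fuse one or more of the face 3D depth map, the near-infrared or RGB image, the face key point information, eye corner features, pupil features, and the like. With a deep learning model, real faces and hacks can then be distinguished more effectively.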
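FIG. 4A shows a schematic block diagram of a living body detection apparatus applied to faces according to an embodiment of the present disclosure. As shown in FIG. 4A, the living body detection apparatus includes an input module 41, a data preprocessing module 42, a deep neural network module 43, and a detection result output module 44.
The input module 41 accepts data from different hardware modules. Its input takes one or more of the following forms: a depth map, a pure near-infrared image, a near-infrared image with light spots, an RGB image, and so on. The specific combination is determined by the hardware solution.
The data preprocessing module 42 preprocesses the data supplied by the input module to obtain the data required by the deep neural network. FIG. 4B shows an exemplary block diagram of one implementation of the data preprocessing module 42 in FIG. 4A according to an embodiment of the present disclosure. The input of the data preprocessing module includes the depth map obtained by the depth sensor and the image obtained by the image sensor (a pure near-infrared image, a near-infrared image with light spots, an RGB image, and so on). In the example shown in FIG. 4B, a depth map 421 and a near-infrared image 422 are the inputs of the data preprocessing module 42. In some possible implementations, the data preprocessing module processes the input data through image alignment correction 423 and face key point detection 424, where the face key point detection may be implemented with a face key point model.
In the image alignment correction 423, if the depth map and the near-infrared image (or RGB image) are not synchronously aligned, the input depth map and near-infrared image need to be aligned/corrected according to the parameter matrices of the cameras to achieve image alignment.
In the face key point detection 424, the near-infrared image (or RGB image) is input into the face key point model for face key point detection, yielding face key point information 425.
The output of the data preprocessing module corresponds in form to its input and includes the aligned and corrected face depth map (corresponding to the input depth map 421), the face near-infrared image (corresponding to the input near-infrared image 422), and the face key point information. In some embodiments, the deep neural network module 43 is a binary classification model. For example, the classification label is 0 for a real face and 1 for a hack face; alternatively, the label is 1 for a real face and 0 for a hack face, and so on. FIG. 4C shows a block diagram of one example of the deep neural network module in FIG. 4A according to an embodiment of the present disclosure. As shown in FIG. 4C, the input of the deep neural network module includes the face depth map 431 obtained from the data preprocessing module, the face near-infrared image 432 (or another form of two-dimensional face image), and the face key point information 433. In some embodiments, the output of the deep neural network module includes a discrimination score, that is, the probability of being judged a real person or a hack. The final output of the deep neural network is a binary value obtained by comparing the output score with a preset threshold, where the threshold can be tuned according to precision and recall. For example, if the output score of the neural network is greater than the threshold, the input is judged to be a hack; if it is less than the threshold, the input is judged to be a living body; and so on.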
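The statement that the threshold can be tuned according to precision and recall can be illustrated with a small helper that evaluates candidate thresholds on labelled scores. The convention that label 1 means hack and that a score above the threshold is judged a hack follows one of the examples in the text; the selection rule (highest precision subject to a minimum recall) and all names are assumptions for illustration only.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall of hack detection at a given threshold.

    labels: 1 for hack, 0 for real face. A score > threshold is judged a hack.
    Precision/recall default to 1.0 when their denominator is zero.
    """
    tp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s <= threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall

def pick_threshold(scores, labels, candidates, min_recall):
    """Return (threshold, precision, recall) with the highest precision among
    candidates whose recall is at least min_recall, or None if none qualifies."""
    best = None
    for t in candidates:
        p, r = precision_recall(scores, labels, t)
        if r >= min_recall and (best is None or p > best[1]):
            best = (t, p, r)
    return best
```

Raising the threshold generally trades recall of hacks for fewer false rejections of real faces, so the minimum recall acts as the security requirement in this sketch.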
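In the example shown in FIG. 4C, the deep neural network is a multi-branch model in which the number of branches is determined by the number of input images. FIG. 4C takes a face depth map and a face near-infrared image as an example: the deep neural network includes two branches, each containing multiple convolutional layers 434, downsampling layers 435, and a fully connected layer 436. The face depth map 431 and the face key point information 433 are input to the first branch for feature extraction, and the face near-infrared image 432 and the face key point information 433 are input to the second branch for feature extraction. The features extracted by the branches are then concatenated and input to a fully connected layer 437, and finally processed by a Softmax layer 438 to produce the output. The output layer has two neurons, representing the probabilities of a real person and of a hack respectively. It should be noted that the inputs of both branches in FIG. 4C include the face key point information, and the fully connected layer 437 uses the face key point information to fuse the feature information output by the fully connected layers 436 of the two branches. Suppose the fully connected layer 436 of the first branch outputs the first feature information and the fully connected layer 436 of the second branch outputs the second feature information; the fully connected layer 437 then uses the face key point information to fuse the first feature information and the second feature information through a fully connected operation. In other words, in the embodiments of the present disclosure, the face key point information is used to fuse the face depth map and the face near-infrared image to obtain the final output.
The detection result output module 44 can output its result in various ways. In one example, the output is 0 for a real face and 1 for a hack face, but the embodiments of the present disclosure are not limited in this respect.
The technical solutions provided by the embodiments of the present disclosure have at least one of the following characteristics:
1) In some embodiments, on the one hand, a 3D sensor that provides depth information is combined with other auxiliary images, such as near-infrared images and RGB images; that is, several new kinds of 3D data serve as the basis for the depth distribution of the face data. On the other hand, the proposed framework is applicable to the input forms of multiple 3D sensors, including a 3D depth map plus a near-infrared image provided by a TOF camera; a 3D depth map plus a near-infrared image with light spots provided by a structured light camera; a 3D depth map plus an RGB image; a 3D depth map plus a near-infrared image plus an RGB image; and other forms that contain a 3D depth map and a near-infrared or RGB image. The related art relies mainly on ordinary cameras and binocular setups, does not fully exploit the depth information of face data, and is easily passed by high-definition seamless hack attacks. The embodiments of the present disclosure use the face depth map captured by a 3D sensor and can therefore prevent planar hack attacks.
2) In some embodiments, 3D depth information, near-infrared or RGB data, face key point information, and eye corner and pupil features are fused, and a deep learning model is trained to distinguish real people from hacks. Related detection methods rely mainly on a single kind of data and do not exploit the correlation and complementarity between multi-modal data. Moreover, ordinary binocular depth computation suffers from high computational complexity and low precision, whereas the embodiments of the present disclosure can make effective use of current 3D sensing technology to obtain a more accurate 3D face data distribution.
3) In some embodiments, a multi-branch model is used. The multi-branch model can fully fuse multi-modal data, is compatible with multiple data forms, and can learn the features of real faces through the neural network. The embodiments of the present disclosure fuse multi-dimensional biometric features, including face depth information, near-infrared or RGB face information, face key point information, and eye corner, eye, and pupil features, which compensates for the vulnerability of any single technique to targeted hacks.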
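FIG. 9 shows a block diagram of a living body detection apparatus according to an embodiment of the present disclosure. As shown in FIG. 9, the apparatus includes: an acquisition module 91 configured to acquire depth information of a target object sensed by a first sensor and a target image sensed by a second sensor; a detection module 92 configured to perform key point detection on the target image to obtain key point information of the target object; and a determining module 93 configured to obtain a living body detection result of the target object based on the depth information of the target object and the key point information of the target object. In some embodiments, the target object is a face. In some embodiments, the second sensor is an RGB sensor or a near-infrared sensor.
FIG. 10 shows an exemplary block diagram of a living body detection apparatus according to an embodiment of the present disclosure. As shown in FIG. 10:
In some embodiments, the apparatus further includes: an alignment module 94 configured to align the depth information of the target object with the target image according to parameters of the first sensor and parameters of the second sensor.
In some embodiments, the determining module 93 includes: a first determining submodule 931 configured to obtain first feature information based on the depth information of the target object and the key point information of the target object; a second determining submodule 932 configured to obtain second feature information based on the key point information of the target object; and a third determining submodule 933 configured to determine the living body detection result of the target object based on the first feature information and the second feature information.
In some embodiments, the first determining submodule 931 is configured to input the depth information of the target object and the key point information of the target object into a first neural network for processing to obtain the first feature information, and the second determining submodule 932 is configured to input the target image and the key point information of the target object into a second neural network for processing to obtain the second feature information. In some embodiments, the first neural network and the second neural network have the same network structure.
In some embodiments, the first determining submodule 931 includes: a first convolution unit configured to perform convolution processing on the depth information of the target object and the key point information of the target object to obtain a first convolution result; a first downsampling unit configured to perform downsampling processing on the first convolution result to obtain a first downsampling result; and a first determining unit configured to obtain the first feature information based on the first downsampling result.
In some embodiments, the second determining submodule 932 includes: a second convolution unit configured to perform convolution processing on the target image and the key point information of the target object to obtain a second convolution result; a second downsampling unit configured to perform downsampling processing on the second convolution result to obtain a second downsampling result; and a second determining unit configured to obtain the second feature information based on the second downsampling result.
In some embodiments, the third determining submodule 933 includes: a fully connected unit configured to perform fusion processing (for example, a fully connected operation) on the first feature information and the second feature information to obtain third feature information; and a third determining unit configured to determine the living body detection result of the target object according to the third feature information.
In some embodiments, the third determining unit includes: a first determining subunit configured to obtain, based on the third feature information, the probability that the target object is a living body; and a second determining subunit configured to determine the living body detection result of the target object according to the probability that the target object is a living body. The embodiments of the present disclosure perform living body detection by combining the depth information of the target object with the target image. They can therefore use both the depth information of the target object and the key point information of the target object in the target image, which improves the accuracy of living body detection and prevents attacks using fake-body images.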
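FIG. 11 is a block diagram of a living body detection device 800 according to an exemplary embodiment. For example, the device 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like. Referring to FIG. 11, the device 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls the overall operation of the device 800, such as operations associated with display, telephone calls, data communication, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions so as to complete all or part of the steps of the above method. In addition, the processing component 802 may include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 may include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation of the device 800. Examples of such data include instructions for any application or method operated on the device 800, contact data, phone book data, messages, pictures, videos, and so on. The memory 804 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read only memory (EEPROM), erasable programmable read only memory (EPROM), programmable read only memory (PROM), read only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk. The power component 806 supplies power to the various components of the device 800. The power component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or slide action but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability. The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a microphone (MIC) that is configured to receive external audio signals when the device 800 is in an operation mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signal may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, the audio component 810 also includes a speaker for outputting audio signals. The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
The sensor component 814 includes one or more sensors for providing the device 800 with status assessments of various aspects. For example, the sensor component 814 can detect the open/closed state of the device 800 and the relative positioning of components, such as the display and keypad of the device 800. The sensor component 814 can also detect a change in position of the device 800 or of one component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a temperature change of the device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the device 800 and other devices. The device 800 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 816 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 816 also includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 800 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above method. In an exemplary embodiment, a non-volatile computer readable storage medium is also provided, such as the memory 804 including computer program instructions, which are executable by the processor 820 of the device 800 to complete the above method.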
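The embodiments of the present disclosure may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium carrying computer readable program instructions for causing a processor to implement various aspects of the present disclosure. The computer readable storage medium may be a tangible device that can hold and store instructions used by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer readable storage media include: portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read only memory (CD-ROM), digital versatile disks (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove with instructions stored thereon, and any suitable combination of the above. A computer readable storage medium as used here is not to be interpreted as a transient signal itself, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (for example, a light pulse through a fiber optic cable), or an electrical signal transmitted through a wire.
The computer readable program instructions described here can be downloaded from a computer readable storage medium to respective computing/processing devices, or downloaded to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards them for storage in a computer readable storage medium in the respective computing/processing device.
The computer program instructions for performing the operations of the embodiments of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object oriented programming languages such as Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), is personalized by utilizing state information of the computer readable program instructions, and the electronic circuit can execute the computer readable program instructions to implement various aspects of the embodiments of the present disclosure.
Various aspects of the embodiments of the present disclosure are described here with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to the embodiments of the present disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, a special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer readable program instructions may also be stored in a computer readable storage medium and cause a computer, a programmable data processing apparatus, and/or other devices to operate in a particular manner, such that the computer readable medium storing the instructions comprises an article of manufacture that includes instructions for implementing various aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device, so that a series of operational steps is performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the drawings show the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to multiple embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of an instruction, which contains one or more executable instructions for implementing the specified logical functions. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions.
The embodiments of the present disclosure have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to a person of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terms used here were chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies in the market, or to enable other persons of ordinary skill in the art to understand the embodiments disclosed here.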

Claims (36)

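  1. A living body detection method, comprising: acquiring depth information of a target object sensed by a first sensor and a target image sensed by a second sensor; performing key point detection on the target image to obtain key point information of the target object; and obtaining a living body detection result of the target object based on the depth information of the target object and the key point information of the target object.
  2. The method according to claim 1, wherein the target object is a face.
  3. The method according to claim 1 or 2, wherein the second sensor is an RGB sensor or a near-infrared sensor.
  4. The method according to any one of claims 1 to 3, wherein the first sensor is a time-of-flight (TOF) sensor or a structured light sensor.
  5. The method according to any one of claims 1 to 4, wherein before the key point detection is performed on the target image, the method further comprises: aligning the depth information of the target object with the target image according to parameters of the first sensor and parameters of the second sensor.
  6. The method according to any one of claims 1 to 5, wherein obtaining the living body detection result of the target object based on the depth information of the target object and the key point information of the target object comprises: obtaining first feature information based on the depth information of the target object and the key point information of the target object; obtaining second feature information based on the key point information of the target object; and determining the living body detection result of the target object based on the first feature information and the second feature information.
  7. The method according to claim 6, wherein obtaining the first feature information based on the depth information of the target object and the key point information of the target object comprises: inputting the depth information of the target object and the key point information of the target object into a first neural network for processing to obtain the first feature information;
    and obtaining the second feature information based on the key point information of the target object comprises: inputting the target image and the key point information of the target object into a second neural network for processing to obtain the second feature information.
  8. The method according to claim 6 or 7, wherein obtaining the first feature information based on the depth information of the target object and the key point information of the target object comprises: performing convolution processing on the depth information of the target object and the key point information of the target object to obtain a first convolution result; performing downsampling processing on the first convolution result to obtain a first downsampling result; and obtaining the first feature information based on the first downsampling result.
  9. The method according to any one of claims 6 to 8, wherein obtaining the second feature information based on the key point information of the target object comprises: performing convolution processing on the target image and the key point information of the target object to obtain a second convolution result; performing downsampling processing on the second convolution result to obtain a second downsampling result; and obtaining the second feature information based on the second downsampling result.
  10. The method according to any one of claims 6 to 9, wherein determining the living body detection result of the target object based on the first feature information and the second feature information comprises: performing fusion processing on the first feature information and the second feature information to obtain third feature information; and determining the living body detection result of the target object according to the third feature information.
  11. The method according to claim 10, wherein determining the living body detection result according to the third feature information comprises: obtaining, based on the third feature information, the probability that the target object is a living body;
    and determining the living body detection result of the target object according to the probability that the target object is a living body.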
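  12. A living body detection apparatus, comprising: an acquisition module configured to acquire depth information of a target object sensed by a first sensor and a target image sensed by a second sensor; a detection module configured to perform key point detection on the target image to obtain key point information of the target object; and a determining module configured to obtain a living body detection result of the target object based on the depth information of the target object and the key point information of the target object.
  13. The apparatus according to claim 12, wherein the target object is a face.
  14. The apparatus according to claim 12 or 13, wherein the second sensor is an RGB sensor or a near-infrared sensor.
  15. The apparatus according to any one of claims 12 to 14, wherein the first sensor is a time-of-flight (TOF) sensor or a structured light sensor.
  16. The apparatus according to any one of claims 12 to 15, further comprising: an alignment module configured to align the depth information of the target object with the target image according to parameters of the first sensor and parameters of the second sensor.
  17. The apparatus according to any one of claims 12 to 16, wherein the determining module comprises: a first determining submodule configured to obtain first feature information based on the depth information of the target object and the key point information of the target object; a second determining submodule configured to obtain second feature information based on the key point information of the target object; and a third determining submodule configured to determine the living body detection result of the target object based on the first feature information and the second feature information.
  18. The apparatus according to claim 17, wherein the first determining submodule is configured to: input the depth information of the target object and the key point information of the target object into a first neural network for processing to obtain the first feature information; and the second determining submodule is configured to: input the target image and the key point information of the target object into a second neural network for processing to obtain the second feature information.
  19. The apparatus according to claim 17 or 18, wherein the first determining submodule comprises: a first convolution unit configured to perform convolution processing on the depth information of the target object and the key point information of the target object to obtain a first convolution result; a first downsampling unit configured to perform downsampling processing on the first convolution result to obtain a first downsampling result; and a first determining unit configured to obtain the first feature information based on the first downsampling result.
  20. The apparatus according to any one of claims 17 to 19, wherein the second determining submodule comprises: a second convolution unit configured to perform convolution processing on the target image and the key point information of the target object to obtain a second convolution result; a second downsampling unit configured to perform downsampling processing on the second convolution result to obtain a second downsampling result; and a second determining unit configured to obtain the second feature information based on the second downsampling result.
  21. The apparatus according to any one of claims 17 to 20, wherein the third determining submodule comprises: a fully connected unit configured to perform fusion processing on the first feature information and the second feature information to obtain third feature information; and a third determining unit configured to determine the living body detection result of the target object according to the third feature information.
  22. The apparatus according to claim 21, wherein the third determining unit comprises: a first determining subunit configured to obtain, based on the third feature information, the probability that the target object is a living body; and a second determining subunit configured to determine the living body detection result of the target object according to the probability that the target object is a living body.
  23. A living body detection apparatus, comprising: a memory configured to store computer readable instructions; and a processor configured to execute the computer readable instructions stored in the memory, wherein execution of the computer readable instructions causes the processor to perform the method according to any one of claims 1 to 11.
  24. A non-volatile computer readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 11.
  25. A living body detection system, comprising: the living body detection apparatus according to claim 23, the first sensor, and the second sensor.
  26. A living body detection system, comprising: the non-volatile computer readable storage medium according to claim 24, the first sensor, and the second sensor.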
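  27. An electronic device, comprising: a first sensor configured to detect depth information of a target object; a second sensor configured to capture a target image that includes the target object; and a processor configured to perform key point detection on the target object captured by the second sensor to obtain key point information of the target object, and to obtain a living body detection result of the target object based on the depth information of the target object detected by the first sensor and the key point information of the target object.
  28. The electronic device according to claim 27, wherein the second sensor is an RGB sensor or a near-infrared sensor.
  29. The electronic device according to claim 27 or 28, wherein the first sensor is a time-of-flight (TOF) sensor or a structured light sensor.
  30. The electronic device according to any one of claims 27 to 29, wherein the processor is further configured to align the depth information of the target object with the target image according to parameters of the first sensor and parameters of the second sensor.
  31. The electronic device according to any one of claims 27 to 30, wherein the processor is configured to: obtain first feature information based on the depth information of the target object and the key point information of the target object; obtain second feature information based on the key point information of the target object; and determine the living body detection result of the target object based on the first feature information and the second feature information.
  32. The electronic device according to claim 31, wherein the processor is configured to: input the depth information of the target object and the key point information of the target object into a first neural network for processing to obtain the first feature information; and
    to obtain the second feature information based on the key point information of the target object, input the target image and the key point information of the target object into a second neural network for processing.
  33. The electronic device according to claim 31 or 32, wherein the processor is configured to: perform convolution processing on the depth information of the target object and the key point information of the target object to obtain a first convolution result; perform down-sampling processing on the first convolution result to obtain a first down-sampling result; and obtain the first feature information based on the first down-sampling result.
  34. The electronic device according to any one of claims 31 to 33, wherein the processor is configured to: perform convolution processing on the target image and the key point information of the target object to obtain a second convolution result; perform down-sampling processing on the second convolution result to obtain a second down-sampling result; and obtain the second feature information based on the second down-sampling result.
  35. The electronic device according to any one of claims 31 to 34, wherein the processor is configured to: perform fusion processing on the first feature information and the second feature information to obtain third feature information; and determine the living body detection result of the target object according to the third feature information.
  36. The electronic device according to claim 35, wherein the processor is configured to:
    obtain, based on the third feature information, a probability that the target object is a living body; and
    determine the living body detection result of the target object according to the probability that the target object is a living body.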
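
The two-branch processing recited in the claims (convolution, down-sampling, fusion, probability, decision) can be illustrated with a minimal sketch. It is not the applicant's implementation. Everything below is an assumption made for illustration: the function names, the encoding of key point information as a binary mask channel, the single 3x3 convolution with ReLU, the 2x2 max pooling used as the down-sampling step, fusion by concatenation followed by one fully connected layer, the sigmoid, the 0.5 threshold, and the random (untrained) weights. The target image is assumed to be single-channel, for example a near-infrared image.

```python
import math
import random

def keypoint_mask(h, w, keypoints):
    """Rasterise (row, col) key points into a binary h x w channel (assumed encoding)."""
    mask = [[0.0] * w for _ in range(h)]
    for r, c in keypoints:
        mask[r][c] = 1.0
    return mask

def conv_relu(channels, kernel):
    """'Valid' 3x3 cross-correlation summed over input channels, followed by ReLU."""
    h, w = len(channels[0]), len(channels[0][0])
    out = []
    for i in range(h - 2):
        row = []
        for j in range(w - 2):
            s = sum(kernel[c][a][b] * channels[c][i + a][j + b]
                    for c in range(len(channels))
                    for a in range(3) for b in range(3))
            row.append(max(0.0, s))
        out.append(row)
    return out

def max_pool2(fmap):
    """2x2 max pooling with stride 2, standing in for the down-sampling step."""
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

def branch(channels, kernel):
    """Convolution, then down-sampling, then a flattened feature vector."""
    pooled = max_pool2(conv_relu(channels, kernel))
    return [v for row in pooled for v in row]

def liveness(depth, image, keypoints, k1, k2, fc_w, fc_b, threshold=0.5):
    """Return (probability of living body, decision) for aligned depth and image."""
    mask = keypoint_mask(len(depth), len(depth[0]), keypoints)
    f1 = branch([depth, mask], k1)   # first feature: depth + key points
    f2 = branch([image, mask], k2)   # second feature: image + key points
    fused = f1 + f2                  # third feature: concatenated branches
    logit = sum(wt * x for wt, x in zip(fc_w, fused)) + fc_b
    prob = 1.0 / (1.0 + math.exp(-logit))
    return prob, prob > threshold

def rand_kernel(n_channels):
    """Random 3x3 kernel per input channel (untrained placeholder weights)."""
    return [[[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
            for _ in range(n_channels)]

random.seed(0)
depth = [[random.random() for _ in range(6)] for _ in range(6)]
image = [[random.random() for _ in range(6)] for _ in range(6)]
k1, k2 = rand_kernel(2), rand_kernel(2)
fc_w = [random.uniform(-1, 1) for _ in range(8)]  # 4 features per branch
prob, is_live = liveness(depth, image, [(2, 2), (2, 4), (4, 3)], k1, k2, fc_w, 0.0)
print(f"probability of living body: {prob:.3f}, decision: {is_live}")
```

With a 6x6 input, each branch yields a 4x4 convolution result and a 2x2 down-sampling result, so the fused vector has eight entries. In a real system the kernels and fully connected weights would be learned, and the depth information would first be aligned with the target image using the sensor parameters.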
PCT/CN2018/115499 2018-05-10 2018-11-14 Method and apparatus for detecting living body, system, electronic device, and storage medium Ceased WO2019214201A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
JP2019515661A JP6852150B2 (ja) 2018-05-10 2018-11-14 Method and apparatus for detecting living body, system, electronic device, and storage medium
EP18839510.7A EP3584745A4 (en) 2018-05-10 2018-11-14 Method and apparatus for detecting living body, system, electronic device, and storage medium
KR1020197019442A KR20190129826A (ko) 2018-05-10 2018-11-14 Method and apparatus for detecting living body, system, electronic device, and storage medium
US16/234,434 US10930010B2 (en) 2018-05-10 2018-12-27 Method and apparatus for detecting living body, system, electronic device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810444105.4A CN108764069B (zh) 2018-05-10 2018-05-10 Method and apparatus for detecting living body
CN201810444105.4 2018-05-10

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/234,434 Continuation US10930010B2 (en) 2018-05-10 2018-12-27 Method and apparatus for detecting living body, system, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
WO2019214201A1 (zh) 2019-11-14

Family

ID=64010059

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/115499 Ceased WO2019214201A1 (zh) 2018-05-10 2018-11-14 Method and apparatus for detecting living body, system, electronic device, and storage medium

Country Status (6)

Country Link
US (1) US10930010B2 (zh)
EP (1) EP3584745A4 (zh)
JP (1) JP6852150B2 (zh)
KR (1) KR20190129826A (zh)
CN (1) CN108764069B (zh)
WO (1) WO2019214201A1 (zh)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
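CN112926367A (zh) * 2019-12-06 2021-06-08 杭州海康威视数字技术股份有限公司 Device and method for living body detection
CN113052034A (zh) * 2021-03-15 2021-06-29 上海商汤智能科技有限公司 Living body detection method based on binocular camera and related apparatus
CN113128429A (zh) * 2021-04-24 2021-07-16 新疆爱华盈通信息技术有限公司 Living body detection method based on stereo vision and related device
CN113486829A (zh) * 2021-07-15 2021-10-08 京东科技控股股份有限公司 Face living body detection method and apparatus, electronic device, and storage medium
CN113989813A (zh) * 2021-10-29 2022-01-28 北京百度网讯科技有限公司 Method for extracting image features, image classification method, apparatus, device, and medium
CN114333078A (zh) * 2021-12-01 2022-04-12 马上消费金融股份有限公司 Living body detection method and apparatus, electronic device, and storage medium
CN115436893A (zh) * 2021-06-01 2022-12-06 富士通株式会社 Key point correction apparatus and method based on wireless radar signals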

Families Citing this family (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
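CN110008783A (zh) * 2018-01-04 2019-07-12 杭州海康威视数字技术股份有限公司 Face living body detection method and apparatus based on neural network model, and electronic device
CN108764069B (zh) * 2018-05-10 2022-01-14 北京市商汤科技开发有限公司 Living body detection method and apparatus
CN110852134A (zh) * 2018-07-27 2020-02-28 北京市商汤科技开发有限公司 Living body detection method, apparatus and system, electronic device, and storage medium
CN109784149B (zh) * 2018-12-06 2021-08-20 苏州飞搜科技有限公司 Method and system for detecting human skeleton key points
CN111368601B (zh) * 2018-12-26 2021-11-16 北京市商汤科技开发有限公司 Living body detection method and apparatus, electronic device, and computer-readable storage medium
WO2020159437A1 (en) * 2019-01-29 2020-08-06 Agency For Science, Technology And Research Method and system for face liveness detection
CN111507131B (zh) * 2019-01-31 2023-09-19 北京市商汤科技开发有限公司 Living body detection method and apparatus, electronic device, and storage medium
CN111783501B (zh) * 2019-04-03 2025-01-07 北京地平线机器人技术研发有限公司 Living body detection method and apparatus, and corresponding electronic device
US20200342291A1 (en) * 2019-04-23 2020-10-29 Apical Limited Neural network processing
CN111860078B (zh) * 2019-04-30 2024-05-14 北京眼神智能科技有限公司 Silent face living body detection method and apparatus, readable storage medium, and device
CN110287900B (zh) * 2019-06-27 2023-08-01 深圳市商汤科技有限公司 Verification method and verification apparatus
KR102766550B1 (ko) * 2019-11-21 2025-02-12 삼성전자주식회사 Liveness test method and apparatus, and biometric authentication method and apparatus
CN110942032B (zh) * 2019-11-27 2022-07-15 深圳市商汤科技有限公司 Living body detection method and apparatus, and storage medium
CN111881706B (zh) * 2019-11-27 2021-09-03 马上消费金融股份有限公司 Living body detection, image classification, and model training methods, apparatus, device, and medium
CN111325114B (zh) * 2020-02-03 2022-07-19 重庆特斯联智慧科技股份有限公司 Security inspection image processing method and apparatus for artificial intelligence recognition and classification
CN113255400A (zh) * 2020-02-10 2021-08-13 深圳市光鉴科技有限公司 Training and recognition methods, system, device, and medium for a living face recognition model
FR3109688B1 (fr) * 2020-04-24 2022-04-29 Idemia Identity & Security France Method for authenticating or identifying an individual
CN111582381B (zh) * 2020-05-09 2024-03-26 北京市商汤科技开发有限公司 Method and apparatus for determining performance parameters, electronic device, and storage medium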
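CN113761983B (zh) * 2020-06-05 2023-08-22 杭州海康威视数字技术股份有限公司 Method and apparatus for updating a face living body detection model, and image capture device
EP3965071B1 (en) * 2020-09-08 2025-01-15 Samsung Electronics Co., Ltd. Method and apparatus for pose identification
CN112052830B (zh) * 2020-09-25 2022-12-20 北京百度网讯科技有限公司 Face detection method and apparatus, and computer storage medium
CN112395963B (zh) * 2020-11-04 2021-11-12 北京嘀嘀无限科技发展有限公司 Object recognition method and apparatus, electronic device, and storage medium
US11532182B2 (en) * 2021-03-04 2022-12-20 Lenovo (Singapore) Pte. Ltd. Authentication of RGB video based on infrared and depth sensing
CN112926489A (zh) * 2021-03-17 2021-06-08 北京市商汤科技开发有限公司 Living body detection method, apparatus, device, medium, system, and vehicle
CN113191189A (zh) * 2021-03-22 2021-07-30 深圳市百富智能新技术有限公司 Face living body detection method, terminal device, and computer-readable storage medium
CN113128428B (zh) * 2021-04-24 2023-04-07 新疆爱华盈通信息技术有限公司 Living body detection method based on depth map prediction and related device
CN113361349B (zh) * 2021-05-25 2023-08-04 北京百度网讯科技有限公司 Face living body detection method and apparatus, electronic device, and storage medium
CN113239887B (zh) * 2021-06-04 2024-10-01 Oppo广东移动通信有限公司 Living body detection method and apparatus, computer-readable storage medium, and electronic device
CN113449623B (zh) * 2021-06-21 2022-06-28 浙江康旭科技有限公司 Lightweight living body detection method based on deep learning
CN113469036A (zh) * 2021-06-30 2021-10-01 北京市商汤科技开发有限公司 Living body detection method and apparatus, electronic device, and storage medium
CN113505682B (zh) * 2021-07-02 2024-07-02 杭州萤石软件有限公司 Living body detection method and apparatus
US20240289695A1 (en) * 2021-09-17 2024-08-29 Kyocera Corporation Trained model generation method, inference apparatus, and trained model generation apparatus
CN113869212B (zh) * 2021-09-28 2024-06-21 平安科技(深圳)有限公司 Multimodal living body detection method and apparatus, computer device, and storage medium
CN114999003B (zh) * 2022-05-20 2026-01-23 深圳市联洲国际技术有限公司 Method and apparatus for identifying a target living body, and computer-readable storage medium
CN116091875B (zh) * 2023-04-11 2023-08-29 合肥的卢深视科技有限公司 Model training method, living body detection method, electronic device, and storage medium
CN118711258B (zh) * 2024-08-29 2025-02-18 浙江大华技术股份有限公司 Living body detection method, device, and storage medium
CN119229510B (zh) * 2024-12-05 2025-05-16 山东科技大学 Facial expression recognition method based on multi-stream attention interaction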

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
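CN105956518A (zh) * 2016-04-21 2016-09-21 腾讯科技(深圳)有限公司 Face recognition method, apparatus, and system
CN107590430A (zh) * 2017-07-26 2018-01-16 百度在线网络技术(北京)有限公司 Living body detection method, apparatus, device, and storage medium
CN108764069A (zh) * 2018-05-10 2018-11-06 北京市商汤科技开发有限公司 Living body detection method and apparatus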

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101700595B1 (ko) 2010-01-05 2017-01-31 삼성전자주식회사 Face recognition apparatus and method thereof
JP2013156680A (ja) * 2012-01-26 2013-08-15 Kumamoto Univ Face tracking method, face tracker, and vehicle
KR101444538B1 (ko) * 2012-10-26 2014-09-24 주식회사 에스원 Three-dimensional face recognition system and face recognition method thereof
JP6214334B2 (ja) * 2013-10-23 2017-10-18 日本放送協会 Electronic device, determination method, and program
GB2532003A (en) * 2014-10-31 2016-05-11 Nokia Technologies Oy Method for alignment of low-quality noisy depth map to the high-resolution colour image
KR20170000748A (ko) 2015-06-24 2017-01-03 삼성전자주식회사 Face recognition method and apparatus
CN105518711B (zh) * 2015-06-29 2019-11-29 北京旷视科技有限公司 Living body detection method, living body detection system, and computer program product
CN105956572A (zh) * 2016-05-15 2016-09-21 北京工业大学 Living face detection method based on convolutional neural network
CN107451510B (zh) * 2016-05-30 2023-07-21 北京旷视科技有限公司 Living body detection method and living body detection system
US10282530B2 (en) * 2016-10-03 2019-05-07 Microsoft Technology Licensing, Llc Verifying identity based on facial dynamics
EP3534328B1 (en) * 2016-10-31 2024-10-02 Nec Corporation Image processing device, image processing method, facial recognition system, program, and recording medium
CN107368778A (zh) * 2017-06-02 2017-11-21 深圳奥比中光科技有限公司 Facial expression capture method and apparatus, and storage device
CN112861760B (zh) * 2017-07-25 2024-12-27 虹软科技股份有限公司 Method and apparatus for expression recognition
CN107506696A (zh) * 2017-07-29 2017-12-22 广东欧珀移动通信有限公司 Anti-counterfeiting processing method and related products
US10679443B2 (en) * 2017-10-13 2020-06-09 Alcatraz AI, Inc. System and method for controlling access to a building with facial recognition
CN108876833A (zh) * 2018-03-29 2018-11-23 北京旷视科技有限公司 Image processing method, image processing apparatus, and computer-readable storage medium
US10733762B2 (en) * 2018-04-04 2020-08-04 Motorola Mobility Llc Dynamically calibrating a depth sensor

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3584745A4 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
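CN112926367A (zh) * 2019-12-06 2021-06-08 杭州海康威视数字技术股份有限公司 Device and method for living body detection
CN113052034A (zh) * 2021-03-15 2021-06-29 上海商汤智能科技有限公司 Living body detection method based on binocular camera and related apparatus
CN113128429A (zh) * 2021-04-24 2021-07-16 新疆爱华盈通信息技术有限公司 Living body detection method based on stereo vision and related device
CN115436893A (zh) * 2021-06-01 2022-12-06 富士通株式会社 Key point correction apparatus and method based on wireless radar signals
CN113486829A (zh) * 2021-07-15 2021-10-08 京东科技控股股份有限公司 Face living body detection method and apparatus, electronic device, and storage medium
CN113486829B (zh) * 2021-07-15 2023-11-07 京东科技控股股份有限公司 Face living body detection method and apparatus, electronic device, and storage medium
CN113989813A (zh) * 2021-10-29 2022-01-28 北京百度网讯科技有限公司 Method for extracting image features, image classification method, apparatus, device, and medium
CN114333078A (zh) * 2021-12-01 2022-04-12 马上消费金融股份有限公司 Living body detection method and apparatus, electronic device, and storage medium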

Also Published As

Publication number Publication date
JP2020522764A (ja) 2020-07-30
EP3584745A1 (en) 2019-12-25
US20190347823A1 (en) 2019-11-14
CN108764069B (zh) 2022-01-14
JP6852150B2 (ja) 2021-03-31
CN108764069A (zh) 2018-11-06
EP3584745A4 (en) 2019-12-25
KR20190129826A (ko) 2019-11-20
US10930010B2 (en) 2021-02-23

Similar Documents

Publication Publication Date Title
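WO2019214201A1 (zh) Method and apparatus for detecting living body, system, electronic device, and storage medium
JP7026225B2 (ja) Living body detection method, apparatus and system, electronic device, and storage medium
CN110688951B (zh) Image processing method and apparatus, electronic device, and storage medium
JP7110412B2 (ja) Living body detection method and apparatus, electronic device, and storage medium
TWI717146B (zh) Image processing method and apparatus, electronic device, and storage medium
CN110674719B (zh) Target object matching method and apparatus, electronic device, and storage medium
CN109934275B (zh) Image processing method and apparatus, electronic device, and storage medium
CN111340766A (zh) Target object detection method, apparatus, device, and storage medium
CN110532956B (zh) Image processing method and apparatus, electronic device, and storage medium
CN115035596B (zh) Behavior detection method and apparatus, electronic device, and storage medium
WO2020010927A1 (zh) Image processing method and apparatus, electronic device, and storage medium
CN110717399A (zh) Face recognition method and electronic terminal device
CN111435422B (zh) Action recognition method, control method and apparatus, electronic device, and storage medium
CN112270288A (zh) Living body recognition and access control device control methods and apparatus, and electronic device
CN108197585A (zh) Face recognition method and apparatus
CN111626086A (zh) Living body detection method, apparatus and system, electronic device, and storage medium
CN112597944B (zh) Key point detection method and apparatus, electronic device, and storage medium
CN109034150B (zh) Image processing method and apparatus
CN112613447A (zh) Key point detection method and apparatus, electronic device, and storage medium
CN114627356A (zh) Network training and image processing methods and apparatus, electronic device, and storage medium
CN107977636B (zh) Face detection method and apparatus, terminal, and storage medium
CN110781842A (zh) Image processing method and apparatus, electronic device, and storage medium
CN111507131A (zh) Living body detection method and apparatus, electronic device, and storage medium
CN113807369A (zh) Target re-identification method and apparatus, electronic device, and storage medium
HK40016326A (zh) Living body detection method and apparatus, electronic device, and storage medium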

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2019515661

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2018839510

Country of ref document: EP

Effective date: 20190204

ENP Entry into the national phase

Ref document number: 20197019442

Country of ref document: KR

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: KR1020197019442

Country of ref document: KR

NENP Non-entry into the national phase

Ref country code: DE