WO2021143101A1 - 人脸识别方法和人脸识别装置 - Google Patents
人脸识别方法和人脸识别装置 Download PDFInfo
- Publication number
- WO2021143101A1 WO2021143101A1 PCT/CN2020/105772 CN2020105772W WO2021143101A1 WO 2021143101 A1 WO2021143101 A1 WO 2021143101A1 CN 2020105772 W CN2020105772 W CN 2020105772W WO 2021143101 A1 WO2021143101 A1 WO 2021143101A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- face
- feature
- image
- feature point
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Links
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/165—Detection; Localisation; Normalisation using facial parts and geometric relationships
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/171—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
Definitions
- This application relates to the field of artificial intelligence, and in particular to a face recognition method and face recognition device.
- Face recognition is a kind of biometric technology based on the facial feature information of the person to identify the identity, and it is an important application in the field of artificial intelligence (AI). Face recognition uses analysis and comparison of facial visual feature information for identity authentication. With the rapid development of computer and network technology, face recognition technology has been widely used in many industries and fields such as smart access control, smart door locks, mobile terminals, public safety, entertainment, and military.
- Face recognition includes collecting face images, detecting faces in the images, preprocessing the detected faces, and then extracting image features and identifying and matching.
- the extraction of image features refers to the extraction of image features including structure and texture from the image, and subsequent recognition and matching are completed based on the image features.
- the integrity of the image features is a key factor that affects the success or failure of face recognition.
- the face recognition effect depends on the integrity of the image features.
- the face image is disturbed by the outside world, such as uneven light, or hats, scarves, and masks
- some image features will be lost, and the image features will be incomplete, resulting in a low success rate of face recognition.
- the embodiment of the present application provides a face recognition method for recognizing a face image, especially a face image with occlusion, which can improve the success rate of face recognition.
- the first aspect of the embodiments of the present application provides a face recognition method, including: acquiring a face image to be recognized; extracting features of the face image through a pre-trained feature extraction network according to the face image; extracting the Multiple facial geometric feature points in the face image to determine multiple feature point sets, each feature point set in the multiple feature point sets corresponds to a face part, and the feature point set includes at least one facial geometry Feature points; acquiring face topological structure features according to the multiple feature point sets, where the face topological structure features are used to determine the relative positional relationship between the multiple feature point sets; according to the face topological structure features And the face image feature is matched in a preset face database to obtain a face recognition result.
- the facial topological structure features are also extracted.
- the facial topological structure features are used to characterize the topological structure between face parts, that is, the relative position. Relations, the topological structure has a low degree of dependence on the integrity of the image. Therefore, the combination of facial image features and facial topological structure features for matching recognition can improve the success rate of face recognition during occlusion.
- the face topology feature includes: a feature vector set, and the feature vector in the feature vector set is used to indicate any two of the multiple feature point sets.
- the face recognition method provided by the embodiment of the present application can directly obtain the topological structure features of the face according to multiple feature point sets, and express them in two forms: a feature vector set and a feature matrix.
- obtaining the features of the face topology structure according to the plurality of feature point sets of face parts includes: constructing the plurality of feature point sets and multiple feature point sets of a standard face The mapping relationship is used to determine the relative position relationship between the multiple feature point sets; the mapping relationship is input to a pre-trained face topology feature extraction network to obtain the person Features of face topology.
- the face recognition method provided by the embodiment of the application can indirectly acquire the features in the face image by constructing the corresponding relationship between the feature points of the face image and the standard face, and according to the mapping relationship data and the known standard face structure information
- the relative positional relationship between points, and the relative positional relationship between multiple feature point sets, the mapping relationship data is input to the pre-training network, and the topological structure features of the face can be extracted.
- the mapping relationship includes a distance and/or an angle between the multiple feature point sets and the multiple feature point sets of the standard face.
- the face topology structure feature extraction network is obtained after the first network training, and the method further includes: extracting multiple facial geometric feature points from the face image training sample To determine multiple sample feature point sets, each feature point set in the multiple sample feature point sets corresponds to a face part of the training sample, and the sample feature point set includes at least one facial geometric feature point; Obtain the mapping relationship between the sample feature point set and the feature point set of the standard face, and input it into the first network for training to obtain a first loss value; update the first loss value according to the first loss value The weight parameters in the network are used to obtain the face topology feature extraction network.
- the face recognition method provided by the embodiment of the present application can obtain a face topology feature extraction network by inputting topological structure data of a face image training sample for training.
- extracting the face image features includes: inputting the face image into the pre-trained face The overall feature extraction network to extract the overall features of the face.
- the features of the face image may include the overall feature of the face.
- the overall feature of the face is the global feature, such as the image color feature or the image texture feature.
- the extraction of the global feature depends on the integrity of the image. , When part of the face in the face image is occluded, the success rate of recognition based on the extracted global features is low.
- the topological structure features of the face are combined and used for feature matching, which can provide the success rate of face recognition.
- the method further includes: extracting a first face part image from the face image; according to the face image, using a pre-trained feature extraction network to extract The face image feature includes: inputting the first face part image to a pre-trained first part feature extraction network to extract the first part feature, and the first part feature is used in the face database Perform matching to obtain the face recognition result.
- the face image feature may include the image feature of a face part, that is, the part feature, which can provide another form of face image feature, which improves the diversification of the solution.
- the face part is, for example, the eye part, the nose part, or the mouth part.
- the first part feature extraction network is obtained after training by the second network, and the method further includes: face part images extracted from face image training samples Input to the second network for training to obtain a second loss value; update the weight parameter in the second network according to the second loss value to obtain the first part feature extraction network.
- the method further includes: extracting a plurality of face part images from the face image; and extracting a person according to the face image through a pre-trained feature extraction network
- the face image feature includes: inputting the multiple face part images into a pre-trained multiple part feature extraction network to extract multiple part features; determining the target part feature of the face image according to the multiple part features .
- the face recognition method provided by the embodiment of the present application extracts multiple face parts, and can separately extract multiple part features through the feature extraction network of each face part, which can improve the success rate of face recognition.
- the target part feature is determined according to a weighted average of the multiple part features, and the weight of the multiple part features is a preset value.
- the weights of features of different parts can be preset. Distinguish the importance of different face parts to improve the success rate of face recognition.
- the method further includes: detecting whether a face part in the multiple face part images is occluded; if the first face part image in the first face part image is occluded; The face part is occluded, and the second face part in the second face part image is not occluded, and the second face part is a symmetrical part of the first face part, then The horizontally flipped image of the second face part image is determined to be the restored image of the first face part, and the restored image is used to input the part feature extraction network to extract the part feature.
- the method further includes: based on the first face part being occluded, updating the weight of the part feature of the first face part, and the updated first face part A weight value is less than the preset first weight value of the first face part.
- the face recognition method provided by the embodiments of the present application can reduce the weight of the feature of the occluded face, thereby effectively distinguishing the importance of the occluded part from the unoccluded part, and can improve the performance of the occluded scene. Success rate of face recognition.
- the method further includes: preprocessing the face image to obtain a preprocessed face image, where the preprocessing includes face alignment, and The preprocessed face image is used to extract the face image features and extract the multiple facial geometric feature points.
- preprocessing before the face image is used for feature extraction, preprocessing may be performed first to improve the feature extraction efficiency and feature accuracy.
- the second aspect of the embodiments of the present application provides a face recognition device, including: an acquisition module for acquiring a face image to be recognized; an extraction module for extracting a network based on the face image through pre-trained features , Extracting facial image features; a determining module, used to extract multiple facial geometric feature points in the face image to determine multiple feature point sets, each of the multiple feature point sets corresponds to one For a human face part, the feature point set includes at least one facial geometric feature point; the acquiring module is further configured to acquire a face topological structure feature according to the multiple feature point sets, and the face topological structure feature is used for Determine the relative positional relationship between the multiple feature point sets; a matching module, configured to perform matching in a preset face database according to the face topology structure feature and the face image feature to obtain a face Recognition results.
- the face topology feature includes: a feature vector set, and the feature vector in the feature vector set is used to indicate any two of the multiple feature point sets.
- the determining module is further configured to: construct a mapping relationship between the multiple feature point sets and the multiple feature point sets of a standard face, and the mapping relationship is To determine the relative positional relationship between the multiple feature point sets; the acquisition module is specifically configured to: input the mapping relationship into a pre-trained face topology feature extraction network to acquire the face topology feature.
- the mapping relationship includes a distance and/or an angle between the multiple feature point sets and the multiple feature point sets of the standard face.
- the face topology structure feature extraction network is obtained after the first network is trained; the extraction module is also used to extract multiple facial geometry from the face image training sample Feature points to determine multiple sample feature point sets, each feature point set in the multiple sample feature point sets corresponds to a face part of the training sample, and the sample feature point set includes at least one facial geometric feature
- the acquisition module is also used to acquire the mapping relationship between the sample feature point set and the feature point set of the standard face, and input it into the first network for training to obtain a first loss value; the The obtaining module is further configured to update the weight parameter in the first network according to the first loss value to obtain the face topology structure feature extraction network.
- the extraction module is specifically configured to: input the face image into a pre-trained overall face feature extraction network to extract overall face features.
- the extraction module is specifically configured to: extract a first face part image from the face image; and input the first face part image into a pre-
- the trained first part feature extraction network is used to extract the first part feature, and the first part feature is used for matching in the face database to obtain the face recognition result.
- the first part feature extraction network is obtained after training by the second network, and the acquisition module is further used for: the face extracted from the training sample of the face image The bit image is input to the second network for training to obtain a second loss value; the acquisition module is further configured to update the weight parameter in the second network according to the second loss value to obtain the first Location feature extraction network.
- the extraction module is further configured to: extract multiple face part images from the face image; the extraction module is specifically configured to combine the multiple faces The part images are respectively input to a plurality of pre-trained part feature extraction networks to extract a plurality of part features; the determining module is further configured to determine the target part feature of the face image according to the multiple part features.
- the target part feature is determined according to a weighted average of the multiple part features, and the weight of the multiple part features is a preset value.
- the face recognition device further includes: a detection module, configured to detect whether the face parts in the multiple face part images are occluded; the determination module It is also used for, if the first face part in the first face part image is occluded, and the second face part in the second face part image is not occluded, the second face part Is a symmetrical part of the first face part, the horizontally flipped image of the second face part image is determined as the restored image of the first face part, and the restored image is used for input
- the part feature extraction network is used to extract the part feature.
- the face recognition device further includes: an update module, configured to update the position of the first face part based on the occlusion of the first face part The weight value of the feature, the updated first weight value is less than the preset first weight value of the first face part.
- the acquisition module is further configured to: preprocess the face image to acquire a preprocessed face image, where the preprocessing includes face alignment,
- the preprocessed face image is used to extract the face image features and extract the multiple facial geometric feature points.
- the second aspect of the embodiments of the present application provides a face recognition device, including a processor and a memory, the processor and the memory are connected to each other, wherein the memory is used to store a computer program, and the computer program includes a program Instruction, the processor is used to call the program instruction to execute the method described in any one of the foregoing first aspect and various possible implementation manners.
- the third aspect of the embodiments of the present application provides a computer program product containing instructions, which is characterized in that when it runs on a computer, the computer executes any one of the above-mentioned first aspect and various possible implementation manners. The method described in the item.
- the fourth aspect of the embodiments of the present application provides a computer-readable storage medium, including instructions, characterized in that, when the instructions are run on a computer, the computer executes the first aspect and various possible implementation manners. Any of the methods.
- a fifth aspect of the embodiments of the present application provides a chip including a processor.
- the processor is used to read and execute the computer program stored in the memory to execute the method in any possible implementation manner of any one of the foregoing aspects.
- the chip should include a memory, and the memory and the processor are connected to the memory through a circuit or a wire.
- the chip further includes a communication interface, and the processor is connected to the communication interface.
- the communication interface is used to receive data and/or information that needs to be processed, and the processor obtains the data and/or information from the communication interface, processes the data and/or information, and outputs the processing result through the communication interface.
- the communication interface can be an input and output interface.
- a face image is input to a pre-trained feature extraction network to obtain face image features.
- multiple facial collection feature points in the face image are extracted to determine that they correspond to multiple people
- a set of feature points of multiple face parts of the face and the topological structure features of the face are obtained according to the set of feature points of multiple face parts; feature matching is performed on the face data set according to the features of the topological structure of the face and the features of the face image, and finally Obtain the face recognition results. Since the face topology is constructed by the relative position relationship of multiple facial feature point sets, more structured information can be extracted, and the dependence on the integrity of the overall face image is reduced, which can effectively reduce The influence of occlusion on face recognition.
- FIG. 1 is a schematic diagram of an artificial intelligence main body framework provided by an embodiment of this application.
- Figure 2 is a schematic diagram of an application environment provided by an embodiment of the application.
- FIG. 3 is a schematic diagram of a convolutional neural network structure provided by an embodiment of the application.
- FIG. 4 is a schematic diagram of another convolutional neural network structure provided by an embodiment of the application.
- Figure 5 is a system architecture diagram of an embodiment of the application
- FIG. 6 is a schematic diagram of an embodiment of a face recognition method in an embodiment of this application.
- FIG. 7 is a schematic diagram of feature points of a target set in an embodiment of this application.
- FIG. 8 is a schematic diagram of a face topology structure in an embodiment of the application.
- FIG. 9 is a schematic diagram of a standard face structure in an embodiment of the application.
- FIG. 10 is a schematic diagram of a face part image and occlusion flip processing in an embodiment of the application.
- FIG. 11 is a schematic diagram of an embodiment of a training method of a feature extraction network in an embodiment of the application.
- FIG. 12 is a schematic diagram of a feature extraction network architecture in an embodiment of this application.
- FIG. 13 is a schematic diagram of an embodiment of a face recognition device in an embodiment of the application.
- FIG. 14 is a diagram of a chip hardware structure provided by an embodiment of the application.
- FIG. 15 is a schematic diagram of another embodiment of a face recognition device in an embodiment of this application.
- the embodiment of the present application provides a face recognition method for recognizing a face image, especially a face image with occlusion, which can improve the success rate of face recognition.
- Face image an image containing face information
- the human face is composed of parts such as eyes, nose, and mouth.
- the geometric description of the shape and structure of these parts can be used as an important feature of face recognition. These features are the geometric features of the face.
- Facial geometric feature points The human face is composed of eyes, nose, mouth and other parts. Through the detection of face images, the feature points used to characterize each face part can be extracted, that is, facial geometric feature points.
- Face part image refers to an image that includes characteristic local areas in a face image, usually an image of the eyes, eyebrows, nose, or mouth.
- Topology is a method of abstracting entities into "points” that have nothing to do with their size and shape, and abstracting lines connecting entities into “lines”, and then expressing the relationship between these points and lines in the form of graphs. Its purpose is to study the connection between these points and lines.
- a graph showing the relationship between points and lines is called a topological structure graph.
- Topological structure and geometric structure belong to two different mathematical concepts. In geometric structure, what we want to investigate is the positional relationship between points and lines, or the geometric structure emphasizes the shape and size of points and lines.
- trapezoids, squares, parallelograms, and circles belong to different geometric structures, but from the perspective of topological structure, because the connection relationship between points and lines is the same, they have the same topological structure, that is, ring structure. In other words, different geometric structures may have the same topological structure.
- the topological structure of a human face includes the connections between various parts of the human face.
- Figure 1 shows a schematic diagram of an artificial intelligence main framework, which describes the overall workflow of the artificial intelligence system and is suitable for general artificial intelligence field requirements.
- Intelligent Information Chain reflects a series of processes from data acquisition to processing. For example, it can be the general process of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, intelligent execution and output. In this process, the data has gone through the condensing process of "data-information-knowledge-wisdom".
- the infrastructure provides computing power support for the artificial intelligence system, realizes communication with the outside world, and realizes support through the basic platform.
- smart chips hardware acceleration chips such as CPU, NPU, GPU, ASIC, FPGA
- basic platforms include distributed computing frameworks and network related platform guarantees and support, which can include cloud storage and Computing, interconnection network, etc.
- sensors communicate with the outside to obtain data, and these data are provided to the smart chip in the distributed computing system provided by the basic platform for calculation.
- the data in the upper layer of the infrastructure is used to represent the data source in the field of artificial intelligence.
- the data involves graphics, images, voice, and text, as well as the Internet of Things data of traditional devices, including business data of existing systems and sensory data such as force, displacement, liquid level, temperature, and humidity.
- machine learning and deep learning can symbolize and formalize data for intelligent information modeling, extraction, preprocessing, training, etc.
- Reasoning refers to the process of simulating human intelligent reasoning in a computer or intelligent system, using formal information to conduct machine thinking and solving problems based on reasoning control strategies.
- the typical function is search and matching.
- Decision-making refers to the process of making decisions after intelligent information is reasoned, and usually provides functions such as classification, ranking, and prediction.
- some general capabilities can be formed based on the results of the data processing, such as an algorithm or a general system, for example, translation, text analysis, computer vision processing, speech recognition, image Recognition and so on.
- Intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields. It is an encapsulation of the overall solution of artificial intelligence, productizing intelligent information decision-making and realizing landing applications. Its application fields mainly include: intelligent manufacturing, intelligent transportation, Smart home, smart medical, smart security, autonomous driving, safe city, smart terminal, etc.
- an embodiment of the present application provides a system architecture 200.
- the data collection device 260 is used to collect face image data and store it in the database 230, and the training device 220 generates a target model/rule 201 based on the face image data maintained in the database 230.
- the following will describe in more detail how the training device 220 obtains the target model/rule 201 based on the face image data.
- the target model/rule 201 can be used in application scenarios such as face recognition, image classification, and virtual reality.
- the target model/rule 201 may be obtained based on a deep neural network, and the deep neural network will be introduced below.
- the work of each layer in the deep neural network can be expressed in mathematical expressions To describe: From the physical level, the work of each layer in the deep neural network can be understood as the transformation of the input space to the output space (that is, the row space of the matrix to the column Space), these five operations include: 1. Dimension Up/Down; 2. Enlarge/Reduce; 3. Rotate; 4. Translation; 5. "Bend”. The operations of 1, 2, and 3 are determined by Completed, the operation of 4 is completed by +b, and the operation of 5 is realized by a(). The reason why the word "space” is used here is because the object to be classified is not a single thing, but a class of things. Space refers to the collection of all individuals of this kind of thing.
- W is a weight vector, and each value in the vector represents the weight value of a neuron in the layer of neural network.
- This vector W determines the spatial transformation from the input space to the output space described above, that is, the weight W of each layer controls how the space is transformed.
- the purpose of training a deep neural network is to finally obtain the weight matrix of all layers of the trained neural network (the weight matrix formed by the vector W of many layers). Therefore, the training process of the neural network is essentially the way of learning to control the space transformation, and more specifically, the learning of the weight matrix.
- the weight vector of the network (of course, there is usually an initialization process before the first update, which is to pre-configure parameters for each layer in the deep neural network). For example, if the predicted value of the network is high, adjust the weight vector to make it The prediction is lower and keep adjusting until the neural network can predict the target value you really want. Therefore, it is necessary to predefine "how to compare the difference between the predicted value and the target value".
- This is the loss function or objective function, which is used to measure the difference between the predicted value and the target value. Important equation. Among them, take the loss function as an example. The higher the output value (loss) of the loss function, the greater the difference. Then the training of the deep neural network becomes a process of reducing this loss as much as possible.
- the target model/rule obtained by the training device 220 can be applied to different systems or devices.
- the execution device 210 is configured with an I/O interface 212 to perform data interaction with external devices.
- the "user" can input data to the I/O interface 212 through the client device 240.
- the execution device 210 can call data, codes, etc. in the data storage system 250, and can also store data, instructions, etc. in the data storage system 250.
- the calculation module 211 uses the target model/rule 201 to process the input data. Taking face image recognition as an example, the calculation module 211 can analyze the input face image to obtain image features such as texture information in the face image.
- the correlation function module 213 may preprocess the image data in the calculation module 211, for example, perform face image preprocessing, including face alignment.
- the correlation function module 214 can preprocess the image data in the calculation module 211, for example, perform face image preprocessing, including face alignment.
- the I/O interface 212 returns the processing result to the client device 240 and provides it to the user.
- the training device 220 can generate corresponding target models/rules 201 based on different data for different targets, so as to provide users with better results.
- the user can manually specify the input data in the execution device 210, for example, to operate in the interface provided by the I/O interface 212.
- the client device 240 can automatically input data to the I/O interface 212 and obtain the result. If the client device 240 automatically inputs data and needs the user's authorization, the user can set the corresponding authority in the client device 240.
- the user can view the result output by the execution device 210 on the client device 240, and the specific presentation form may be a specific manner such as display, sound, and action.
- the client device 240 may also serve as a data collection terminal to store the collected training data in the database 230.
- Fig. 2 is only a schematic diagram of a system architecture provided by an embodiment of the present application, and the positional relationship among the devices, devices, modules, etc. shown in the figure does not constitute any limitation.
- the data storage system 250 is an external memory relative to the execution device 210. In other cases, the data storage system 250 may also be placed in the execution device 210.
- Convolutional neural network (convolutional neural network, CNN) is a deep neural network with a convolutional structure. It is a deep learning architecture.
- the deep learning architecture refers to the use of machine learning algorithms in different abstractions. There are multiple levels of learning at different levels.
- CNN is a feed-forward artificial neural network. Taking image processing as an example, each neuron in the feed-forward artificial neural network responds to overlapping areas in the input image. .
- a convolutional neural network (CNN) 100 may include an input layer 110, a convolutional layer/pooling layer 120, where the pooling layer is optional, and a neural network layer 130.
- the convolutional layer/pooling layer 120 may include layers 121-126 as in the examples.
- layer 121 is a convolutional layer
- layer 122 is a pooling layer
- layer 123 is a convolutional layer
- 124 is a pooling layer
- 121 and 122 are convolutional layers
- 123 is a pooling layer
- 124 and 125 are convolutional layers
- 126 is a convolutional layer.
- Pooling layer That is, the output of the convolutional layer can be used as the input of the subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
- the convolutional layer 121 can include many convolution operators.
- the convolution operator is also called a kernel. Its role in image processing is equivalent to a filter that extracts specific information from the input image matrix.
- the convolution operator can be a weight matrix. This weight matrix is usually predefined. In the process of convolution on the image, the weight matrix is usually one pixel after another pixel in the horizontal direction on the input image ( Or two pixels followed by two pixels...It depends on the value of stride) to complete the work of extracting specific features from the image.
- the size of the weight matrix should be related to the size of the image. It should be noted that the depth dimension of the weight matrix and the depth dimension of the input image are the same.
- the weight matrix will extend to Enter the entire depth of the image. Therefore, convolution with a single weight matrix will produce a convolution output of a single depth dimension, but in most cases, a single weight matrix is not used, but multiple weight matrices with the same dimension are applied. The output of each weight matrix is stacked to form the depth dimension of the convolutional image. Different weight matrices can be used to extract different features in the image. For example, one weight matrix is used to extract edge information of the image, another weight matrix is used to extract specific colors of the image, and another weight matrix is used to eliminate unwanted noise in the image. Fuzzy... the dimensions of the multiple weight matrices are the same, and the dimension of the feature map extracted by the weight matrix of the same dimension is also the same, and then the extracted feature maps of the same dimension are merged to form the output of the convolution operation .
- weight values in these weight matrices need to be obtained through a lot of training in practical applications, and each weight matrix formed by the weight values obtained through training can extract information from the input image, thereby helping the convolutional neural network 100 to make correct predictions.
- the initial convolutional layer (such as 121) often extracts more general features, which can also be called low-level features; with the convolutional neural network
- the subsequent convolutional layers for example, 126
- features such as high-level semantics
- the pooling layer can also be a multi-layer convolutional layer followed by one or more pooling layers.
- the sole purpose of the pooling layer is to reduce the size of the image space.
- the pooling layer may include an average pooling operator and/or a maximum pooling operator for sampling the input image to obtain an image with a smaller size.
- the average pooling operator can calculate the pixel values in the image within a specific range to generate an average value.
- the maximum pooling operator can take the pixel with the largest value within a specific range as the result of the maximum pooling.
- the operators in the pooling layer should also be related to the image size.
- the size of the image output after processing by the pooling layer can be smaller than the size of the image of the input pooling layer, and each pixel in the image output by the pooling layer represents the average value or the maximum value of the corresponding sub-region of the image input to the pooling layer.
- the convolutional neural network 100 After processing by the convolutional layer/pooling layer 120, the convolutional neural network 100 is not enough to output the required output information. Because as mentioned above, the convolutional layer/pooling layer 120 only extracts features and reduces the parameters brought by the input image. However, in order to generate the final output information (required class information or other related information), the convolutional neural network 100 needs to use the neural network layer 130 to generate one or a group of required classes of output. Therefore, the neural network layer 130 may include multiple hidden layers (131, 132 to 13n as shown in FIG. 3) and an output layer 140. The parameters contained in the multiple hidden layers may be based on specific task types. The relevant training data of the, for example, the task type can include image recognition, image classification, image super-resolution reconstruction and so on.
- the output layer 140 After the multiple hidden layers in the neural network layer 130, that is, the final layer of the entire convolutional neural network 100 is the output layer 140.
- the output layer 140 has a loss function similar to the classification cross entropy, which is specifically used to calculate the prediction error.
- the convolutional neural network 100 shown in FIG. 3 is only used as an example of a convolutional neural network.
- the convolutional neural network may also exist in the form of other network models, for example,
- the multiple convolutional layers/pooling layers shown in FIG. 4 are in parallel, and the respectively extracted features are input to the full neural network layer 130 for processing.
- the face recognition method provided in the embodiments of this application is suitable for face recognition in various scenarios such as home and security, including robots, smart phones, desktop computers, tablet computers, TVs, home or public security surveillance cameras, cameras, access control, Use scenarios such as identity verification, personalized customization, and facial expression simulation of door locks, attendance machines, smart glasses and other products.
- the process of face recognition can be completed by the above-mentioned entities, or it can be connected to a dedicated server through the network and completed by the server, and the details are not limited here.
- the network includes one or more of multiple types of wireless or partial wireless communication networks, such as local area network (LAN), wireless local area network (WLAN), personal area network (PAN), and wide area network (WAN), Intranet, Internet, peer-to-peer network, peer-to-peer network or mesh network, etc.
- LAN local area network
- WLAN wireless local area network
- PAN personal area network
- WAN wide area network
- Intranet Internet
- peer-to-peer network peer-to-peer network or mesh network, etc.
- the specifics are not limited here.
- the face recognition device obtains a face image.
- the face recognition device captures an image through a built-in or external camera, and then detects a face image including face information from the captured image.
- the robot collects an image through a camera and detects that a human face is included in the image to obtain a human face image.
- the face recognition device preprocesses the face image. Due to various conditions and random interference, the original face image cannot be used directly. It must be gray-scale correction and noise filtering in the early stage of image processing. Image preprocessing, to obtain preprocessed images for subsequent feature extraction.
- the face image preprocessing may include: face alignment, light compensation, gray scale transformation, histogram equalization, normalization processing, geometric correction, median filtering, sharpening, etc.
- the specific processing flow is not limited here.
- the normalization process is used to obtain standardized face images with the same size and the same gray value range, and the median filter can be used to smooth the image to eliminate noise.
- face alignment processing is performed on the face image, and faces of different scales and directions are normalized to a uniform scale according to the positions of the face feature points to obtain images with correct face positions.
- Face rotation because the person in the face image may have different postures, for example, the detected face may be a front face or a side face.
- the face can be rotated through face alignment, and the faces of different poses Rotate to the same angle as much as possible for easy identification.
- Face alignment can reduce the influence of distance or pose on subsequent feature extraction, and recognize faces on a uniform scale.
- the face recognition device detects facial geometric feature points in a face image to be recognized, and the facial geometric feature points include feature points used to characterize various parts of the face, such as eyes, nose, mouth, and facial contours.
- the face recognition device may detect facial geometric feature points by using a preset algorithm, and the specific type of the preset algorithm is not limited here.
- the number of facial geometric feature points to be extracted is not limited, 68 feature points or 128 feature points can be extracted.
- FIG. 7, is a schematic diagram of facial geometric feature points in an embodiment of this application. 68 facial geometric feature points are shown.
- multiple facial feature point sets can be determined.
- the face feature point set composed of facial geometric feature points 18 to 22 is used to indicate the left eyebrow part 701; facial geometric feature points
- the face feature point set consisting of 23 to 27 is used to indicate the right eyebrow part 702;
- the face feature point set consisting of facial geometric feature points 37 to 42 is used to indicate the left eye part 703; facial geometric feature points 43
- the face feature point set consisting of to 48 is used to indicate the right eye part 704;
- the face feature point set consisting of facial geometric feature points 28 to 36 is used to indicate the nose part 705; facial geometric feature points 49 to 68
- the formed feature point set of the face part is used to indicate the mouth part 706.
- the connection relationship between the feature point sets of each face part is calculated to construct the face Topological structure, to obtain the topological structure features of the face.
- the face topological structure feature is expressed as a feature vector set, and the feature vector in the feature vector set is used to indicate the relative position between any two face part feature point sets in the plurality of face part feature point sets relation.
- FIG. 8 is a schematic diagram of a face topology structure in an embodiment of this application.
- P represents a set of feature points of a face part
- vectors ci, j represent the connection relationship between a set of feature points of a face part i and j.
- the face topology is represented as [c0,1,c0,2,c0,5,c1,2,c1,3,c3,4,c2,4,c2,5,c1,5], exemplary, vector c0 ,1 can be obtained according to the position coordinate of P0 and the position coordinate of P1.
- the vector notation only lists the characteristic relationships that have the connection relationship. It is understandable that the connection relationship of two face parts that do not directly have the connection relationship can also be obtained indirectly according to the topology information.
- Normalize the facial structure features refers to the use of a unified measurement unit and measurement standard for the features. For example, if the nose is used as the origin, the distance from the eyes to the nose is C0,1. This distance can be the image coordinate system To calculate the system, the distance can also be further normalized to between 0 and 1.
- connection relationships such as Euclidean distance, curvature, or angle, can also be defined, which are not specifically limited here.
- mapping relationship between the multiple face part feature point sets and the face part feature point set of a standard face construct a mapping relationship between the multiple face part feature point sets and the face part feature point set of a standard face, and input the mapping relationship into a pre-trained face topology feature extraction network , In order to obtain the topological structure features of the face.
- the mapping relationship between each face part in the face image and the corresponding face part in the standard face can be expressed by a metric relationship such as distance and/or angle, which is used to indicate the characteristics of the multiple face parts The relative positional relationship between point sets.
- the standard face is used as a reference standard face image. Please refer to Figure 9.
- the middle standard face at least one feature point is marked for the eyes, nose, eyebrows, mouth and other parts.
- the nose part can be used
- One feature point can be labeled, or multiple feature points can be used for labeling.
- the number of feature points labeled for each part is not specifically limited and can be specified by the developer.
- the mapping relationship can be expressed by measurement relationships such as distance and/or angle. For example, among the facial geometric feature points detected in the face image to be recognized, a feature point in the nose part feature point set is compared with the nose part feature point of the standard face. The metric relationship between a feature point in the set is a.
- a feature point can be identified by two-dimensional data.
- the data amount of the positional relationship between a feature point in the face image to be recognized and the 68 points in the standard face image is 1*68*2, and the 68 feature points in the face image to be recognized are compared with those in the standard face image.
- the data amount of the positional relationship between 68 points is a data block of 68*68*2.
- the data representing the mapping relationship is input into the pre-trained face topology feature extraction network, and the face topology feature can be obtained.
- the face recognition device can determine each face part from the face image.
- the face parts extracted from the image include: mouth, nose, left eye, left eyebrow, right eye, and right eyebrow.
- the face parts to be distinguished can be preset and are not limited here.
- FIG. 10 is a schematic diagram of the image of the human face and the occlusion flip processing in the embodiment of the application.
- each face part determines whether each face part is occluded according to a preset part discrimination model for each face part. If the occluded part is one of the paired parts, and the other of the paired parts is not covered, for example, as shown in Figure 10, the eyebrows or eyes are covered by one, then the other part of the paired part that is not covered Flip horizontally as a restored image of the currently blocked part.
- step 604 may be executed first, and then step 604 may be executed, or step 604 may be executed first, and then step 603 may be executed.
- the face part image obtained in step 604 is input into the pre-trained part feature extraction network for feature extraction, and the target part feature is output; it should be noted that each face part can be preset with a corresponding feature extraction network. For multiple parts of the face, different part feature extraction networks can be preset for feature extraction, and then the individual part features extracted for each part can be synthesized to obtain the target part feature.
- the target part feature is a weighted average of the multiple part features, and the weight of the multiple part features is a preset value.
- the face recognition device detects whether the face parts in the multiple face part images are occluded. For the occluded parts, the weight can be reduced on the basis of the initial preset weight of the part, by Therefore, the influence of part occlusion on face recognition can be reduced.
- the face topological structure feature extraction network For the training process of the overall face feature extraction network, the face topological structure feature extraction network, and the part feature extraction network, please refer to the embodiment corresponding to FIG. 11, which will not be described in detail here.
- the feature matching module matches the features of the current face with the face features in the face database, and obtains the result of face recognition according to the similarity measurement between the features.
- Extract features for parts reduce the impact of low-response areas, enhance the receptive fields of high-response areas, and learn more texture information to enhance the judgment and robustness of features.
- the strategy with weight distribution is adopted for part recognition to reduce the influence of low-response parts on the final discrimination result and increase the influence of high-noise areas, which is conducive to face recognition;
- the face part to be matched is turned over, features are extracted, and the occlusion discriminator is added to reduce the influence of the occlusion part on face recognition.
- Propose a feature extraction unit that specializes in learning the topological structure of a face, learn the difference between a face and a standard face, and the link relationship between each topological node, and obtain more structure by linking the information between each layer This information can reduce the impact of occlusion on face recognition without affecting normal face recognition.
- the following describes the training method of the feature extraction network in the face recognition method provided by this application. Please refer to Fig. 11.
- the method includes:
- the face recognition device obtains a face image from a face database.
- the face database can be downloaded from the Internet, or a part of the database can be built by yourself, and the details are not limited here. Commonly used face databases include LFW dataset, Yale series, etc.
- the face database includes occluded face images, or occlusion processing is performed on the face images in the face database to generate occluded face images, which is not specifically limited here.
- Steps 1102 to 1103 are similar to the methods of step 602 and step 604 in the embodiment corresponding to FIG. 6, and will not be repeated here.
- step 603 in the embodiment corresponding to FIG. 6 to construct the mapping relationship between the multiple face feature point sets and the face feature point set of the standard face, and obtain data representing the mapping relationship, representing a person Face topology.
- FIG. 12 is a schematic diagram of a feature extraction network architecture in an embodiment of this application.
- the first network can be various existing face extraction networks, which are not specifically limited here.
- the input image data is a face image of H*W*3, where H is the height of the image, W is the width of the image, and 3 represents 3 channels of RGB (red green blue).
- H is the height of the image
- W is the width of the image
- 3 represents 3 channels of RGB (red green blue).
- the image is used as the input of training, which can make the image robust at multiple scales.
- the input image passes through 9 convolutional layers, each block in Figure 12 includes 3 convolutional layers, and the output [T*1] vector.
- the fully connected layer is followed by the softmax layer (not shown in the figure).
- the input of this layer is to input the [T*1] vector into the softmax layer, and the output is also the [T*1] vector. Normalize each output vector to [0,1].
- the vector output by softmax here is the probability that the sample belongs to each class.
- the first network loss function L1 is as follows:
- Ls is the softmax loss of the inter-class difference
- Lc is the center loss of the individual difference within the class
- m is the batch size (batch_size)
- Xi is the i-th value of the vector of [T*1]
- Xj is the j-th value
- wi Is the weight of the convolution, learned from back propagation
- bi is the corresponding bias.
- the input image passes through the network 1, and the feature matrix (1*W dimension) is obtained; the score corresponding to each image block is a real number in the range of [0,1].
- the training process of the part feature extraction network obtained by the training of network 2 corresponds to the parts 2 and 3 in Fig. 12.
- the input of each part network is the H*W*3 part image.
- the picture of each part is processed, and the result of whether each face part is occluded is obtained.
- the above figure shows an example of a part network.
- the part network can use a small network to extract features for the part, reduce the influence of the low response area, enhance the receptive field of the high response area, and learn more high-resolution texture information to enhance the features Judgment and robustness.
- the initial weights are set for each part, which can be defined by the product developer.
- the weight of eyes is 0.4
- the weight of nose and mouth is 0.25
- the weight of eyebrows is 0.1.
- the image data of the current part processed by the part discriminator and the result of whether the part is occluded are obtained through the above-mentioned part network to obtain an N*1 dimensional feature vector and the final part weight wi of each part.
- the part is occluded, perform a drop adjustment based on the initial weight. For example, if the eye is occluded, the weight of the eye part is adjusted to 0 after passing through the part network.
- the loss function is as follows:
- wi is the weight of the convolution
- k represents the number of extracted face parts
- Li is the calculated loss function for each part.
- ⁇ j,i is the angle between vectors W j and x i , ⁇ i,j ⁇ (0, ⁇ )
- cosine represents the angle cosine value of the feature vector and the weight vector.
- the meaning of the parameter m is to make the distance between classes large enough and the distance within classes small enough.
- the face topology feature extraction network obtained by the training of network 3 corresponds to the part marked 4 in Figure 12;
- the network 3 training can obtain the face topology feature extraction network, with 68 facial feature points in 2 dimensions to measure the relationship distance, the network input is a data block of 68*68*2 size, after the network convolutional layer and the fully connected layer, The differential feature between the constructed face and the standard face is extracted, and the output is an M*1 dimensional feature vector.
- the loss function L3 of network 3 is as follows:
- m is the batch size (batch_size)
- n is XX
- Xi is the i-th value of the vector of [T*1]
- Xj is the j-th value
- wi is the weight of the convolution
- wj represents XX.
- Bi is the bias corresponding to the i-th value
- bj is the bias corresponding to the j-th value.
- the training process may not train the part features (network 2).
- the overall face features and the face topological structure features are used to complete face recognition.
- the training process may not train the overall face features (network 1).
- the face part features and the face topological structure features are used to complete the face recognition.
- FIG. 13 is a schematic diagram of an embodiment of the face recognition device in the embodiment of the application.
- the face recognition device includes:
- the obtaining module 1301 is used to obtain the face image to be recognized
- the extraction module 1302 is configured to extract features of the face image through a pre-trained feature extraction network according to the face image;
- the determining module 1303 is used to extract multiple facial geometric feature points in the face image to determine multiple feature point sets.
- Each feature point set in the multiple feature point sets corresponds to a face part.
- the set includes at least one facial geometric feature point;
- the obtaining module 1301 is further configured to obtain the topological structure feature of the face according to the plurality of feature point sets, and the topological structure feature of the face is used to determine the relative position relationship between the plurality of feature point sets;
- the matching module 1304 is configured to perform matching in a preset face database according to the face topological structure feature and the face image feature to obtain a face recognition result.
- the features of the face topology include:
- a feature vector set where the feature vector in the feature vector set is used to indicate the relative positional relationship between any two feature point sets in the multiple feature point sets; or,
- Feature matrix the elements in the feature matrix are used to indicate the relative positional relationship between any two feature point sets in the multiple feature point sets.
- the determining module 1303 is also used to:
- mapping relationship Constructing a mapping relationship between the multiple feature point sets and multiple feature point sets of a standard face, and the mapping relationship is used to determine the relative position relationship between the multiple feature point sets;
- the obtaining module 1301 is specifically configured to input the mapping relationship into a pre-trained face topology structure feature extraction network to obtain the face topology structure feature.
- the mapping relationship includes the distance and/or angle between the multiple feature point sets and the multiple feature point sets of the standard face.
- the face topology feature extraction network is obtained after the first network is trained
- the extraction module 1302 is also used to extract multiple facial geometric feature points from the face image training samples to determine multiple sample feature point sets.
- Each feature point set in the multiple sample feature point sets corresponds to the training sample A face part, the sample feature point set includes at least one facial geometric feature point;
- the acquiring module 1301 is further configured to acquire the mapping relationship between the sample feature point set and the feature point set of the standard face, and input it into the first network for training to obtain the first loss value;
- the obtaining module 1301 is further configured to update the weight parameter in the first network according to the first loss value to obtain the face topology structure feature extraction network.
- the extraction module 1302 is specifically used for:
- the face image is input to the pre-trained overall face feature extraction network to extract the overall face feature.
- the extraction module 1302 is specifically used for:
- the first part feature extraction network is obtained after training of the second network, and the acquisition module 1301 is further configured to:
- the obtaining module 1301 is further configured to update the weight parameter in the second network according to the second loss value to obtain the first part feature extraction network.
- the extraction module 1302 is also used to:
- the extraction module 1302 is specifically configured to input the multiple face part images into a pre-trained multiple part feature extraction network to extract multiple part features;
- the determining module 1303 is further configured to determine the target part feature of the face image according to the multiple part features.
- the target part feature is determined according to a weighted average of the multiple part features, and the weight of the multiple part features is a preset value.
- the face recognition device further includes:
- the detection module 1305 is configured to detect whether the face parts in the multiple face part images are occluded;
- the determining module 1303 is also used to: if the first face part in the first face part image is occluded, and the second face part in the second face part image is not occluded, the second face part image
- the face part is a symmetrical part of the first face part, then the horizontally flipped image of the second face part image is determined as the restored image of the first face part, and the restored image is used to input the
- the part feature extraction network is used to extract the part feature.
- the face recognition device further includes:
- the update module 1306 is configured to update the weight value of the part feature of the first face part based on the occlusion of the first face part, and the updated first weight value is less than the preset first face part of the first face part. One weight.
- the acquisition module 1301 is also used to:
- the face image is preprocessed to obtain a preprocessed face image, the preprocessing includes face alignment, and the preprocessed face image is used to extract features of the face image and extract the multiple facial geometry Feature points.
- FIG. 14 is a diagram of a chip hardware structure provided by an embodiment of the application.
- the algorithm based on the convolutional neural network shown in FIG. 3 and FIG. 4 can be implemented in the NPU chip shown in FIG. 14.
- Neural Network Processor NPU 50 The NPU is mounted as a coprocessor to the host CPU (Host CPU), and the Host CPU assigns tasks.
- the core part of the NPU is the arithmetic circuit 50.
- the arithmetic circuit 503 is controlled by the controller 504 to extract matrix data from the memory and perform multiplication operations.
- the arithmetic circuit 503 includes multiple processing units (process engines, PE). In some implementations, the arithmetic circuit 503 is a two-dimensional systolic array. The arithmetic circuit 503 may also be a one-dimensional systolic array or other electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 503 is a general-purpose matrix processor.
- the arithmetic circuit fetches the corresponding data of matrix B from the weight memory 502 and caches it on each PE in the arithmetic circuit.
- the arithmetic circuit takes the matrix A data and matrix B from the input memory 501 to perform matrix operations, and the partial result or final result of the obtained matrix is stored in the accumulator 508.
- the unified memory 506 is used to store input data and output data.
- the weight data is directly transferred to the weight memory 502 through the storage unit access controller 505 (direct memory access controller, DMAC).
- the input data is also transferred to the unified memory 506 through the DMAC.
- the BIU is the Bus Interface Unit, that is, the bus interface unit 510, which is used for the interaction between the AXI bus and the DMAC and the instruction fetch buffer 509.
- the bus interface unit 510 (bus interface unit, BIU for short) is used for the instruction fetch memory 509 to obtain instructions from an external memory, and is also used for the storage unit access controller 505 to obtain the original data of the input matrix A or the weight matrix B from the external memory.
- the DMAC is mainly used to transfer the input data in the external memory DDR to the unified memory 506 or to transfer the weight data to the weight memory 502 or to transfer the input data to the input memory 501.
- the vector calculation unit 507 may include multiple arithmetic processing units, if necessary, further processing the output of the arithmetic circuit, such as vector multiplication, vector addition, exponential operation, logarithmic operation, size comparison and so on.
- the vector calculation unit 507 can store the processed output vector in the unified buffer 506.
- the vector calculation unit 507 may apply a nonlinear function to the output of the arithmetic circuit 503, such as a vector of accumulated values, to generate the activation value.
- the vector calculation unit 507 generates a normalized value, a combined value, or both.
- the processed output vector can be used as an activation input to the arithmetic circuit 503, for example for use in a subsequent layer in a neural network.
- the instruction fetch buffer 509 connected to the controller 504 is used to store instructions used by the controller 504;
- the unified memory 506, the input memory 501, the weight memory 502, and the fetch memory 509 are all On-Chip memories.
- the external memory is private to the NPU hardware architecture.
- each layer in the convolutional neural network shown in FIG. 3 and FIG. 4 may be executed by the matrix calculation unit 212 or the vector calculation unit 507.
- FIG. 15 is a schematic diagram of another embodiment of a face recognition device in an embodiment of the application.
- the face recognition apparatus provided in this embodiment may include a terminal or a server, and the specific device form is not limited in the embodiment of the present application.
- the face recognition device 1500 may have relatively large differences due to different configurations or performances, and may include one or more processors 1501 and a memory 1502, and the memory 1502 stores programs or data.
- the memory 1502 may be volatile storage or non-volatile storage.
- the processor 1501 is one or more central processing units (CPU, Central Processing Unit, the CPU may be a single-core CPU or a multi-core CPU.
- the processor 1501 may communicate with the memory 1502, in the face recognition device 1500 executes a series of instructions in the memory 1502.
- the face recognition device 1500 also includes one or more wired or wireless network interfaces 1503, such as an Ethernet interface.
- the face recognition apparatus 1500 may also include one or more power supplies; one or more input and output interfaces, which may be used to connect a display, a mouse, a keyboard, a touch screen device,
- the input and output interfaces are optional components, which may or may not exist, and are not limited here.
- the disclosed system, device, and method can be implemented in other ways.
- the device embodiments described above are only illustrative.
- the division of the units is only a logical function division, and there may be other divisions in actual implementation, for example, multiple units or components may be combined or It can be integrated into another system, or some features can be ignored or not implemented.
- the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
- the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
- the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer readable storage medium.
- the technical solution of this application essentially or the part that contributes to the existing technology or all or part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium.
- a computer device which may be a personal computer, a server, or a network device, etc.
- the aforementioned storage media include: U disk, mobile hard disk, read-only memory (read-only memory, ROM), random access memory (random access memory, RAM), magnetic disks or optical disks and other media that can store program codes. .
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Oral & Maxillofacial Surgery (AREA)
- General Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Geometry (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Image Analysis (AREA)
Abstract
Description
Claims (29)
- 一种人脸识别方法,其特征在于,包括:获取待识别的人脸图像;根据所述人脸图像,通过预训练的特征提取网络,提取人脸图像特征;提取所述人脸图像中的多个面部几何特征点,以确定多个特征点集合,所述多个特征点集合中每个特征点集合对应一个人脸部位,所述特征点集合包括至少一个面部几何特征点;根据所述多个特征点集合获取人脸拓扑结构特征,所述人脸拓扑结构特征用于确定所述多个特征点集合之间的相对位置关系;根据所述人脸拓扑结构特征和所述人脸图像特征在预设的人脸数据库中进行匹配,以获取人脸识别结果。
- 根据权利要求1所述的方法,其特征在于,所述人脸拓扑结构特征包括:特征向量集合,所述特征向量集合中的特征向量用于指示所述多个特征点集合中任意两个所述特征点集合之间的相对位置关系;或者,特征矩阵,所述特征矩阵中的元素用于指示所述多个特征点集合中任意两个所述特征点集合之间的相对位置关系。
- 根据权利要求1所述的方法,其特征在于,根据所述多个人脸部位特征点集合获取人脸拓扑结构特征包括:构建所述多个特征点集合与标准人脸的多个特征点集合之间的映射关系,所述映射关系用于确定所述多个特征点集合之间的相对位置关系;将所述映射关系输入到预训练的人脸拓扑结构特征提取网络,以获取所述人脸拓扑结构特征。
- 根据权利要求3所述的方法,其特征在于,所述映射关系包括所述多个特征点集合与所述标准人脸的多个特征点集合之间的距离和/或角度。
- 根据权利要求3或4所述的方法,其特征在于,所述人脸拓扑结构特征提取网络为第一网络训练后得到,所述方法还包括:从人脸图像训练样本中提取多个面部几何特征点,以确定多个样本特征点集合,所述多个样本特征点集合中每个特征点集合对应所述训练样本的一个人脸部位,所述样本特征点集合包括至少一个面部几何特征点;获取所述样本特征点集合与标准人脸的特征点集合之间的映射关系,并输入到所述第一网络进行训练,获取第一损失值;根据所述第一损失值更新所述第一网络中的权重参数,以获取所述人脸拓扑结构特征提取网络。
- 根据权利要求1至5中任一项所述的方法,其特征在于,根据所述人脸图像,通过预训练特征提取网络,提取人脸图像特征包括:将述人脸图像输入到预训练的人脸整体特征提取网络,以提取人脸整体特征。
- 根据权利要求1至6中任一项所述的方法,其特征在于,所述方法还包括:从所述人脸图像中提取第一人脸部位图像;根据所述人脸图像,通过预训练特征提取网络,提取人脸图像特征包括:将所述第一人脸部位图像输入到预训练的第一部位特征提取网络,以提取第一部位特征,所述第一部位特征用于在所述人脸数据库中进行匹配,以获取所述人脸识别结果。
- 根据权利要求7所述的方法,其特征在于,所述第一部位特征提取网络为第二网络训练后得到,所述方法还包括:将从人脸图像训练样本中提取的人脸部位图像输入到所述第二网络进行训练,获取第二损失值;根据所述第二损失值更新所述第二网络中的权重参数,以获取所述第一部位特征提取网络。
- 根据权利要求1至6中任一项所述的方法,其特征在于,所述方法还包括:从所述人脸图像中提取多个人脸部位图像;根据所述人脸图像,通过预训练特征提取网络,提取人脸图像特征包括:将所述多个人脸部位图像分别输入预训练的多个部位特征提取网络,以提取多个部位特征;根据所述多个部位特征确定所述人脸图像的目标部位特征。
- 根据权利要求9所述的方法,其特征在于,所述目标部位特征根据所述多个部位特征的加权平均值确定,所述多个部位特征的权值为预设值。
- 根据权利要求9或10所述的方法,其特征在于,所述方法还包括:检测所述多个人脸部位图像中的人脸部位是否被遮挡;若第一人脸部位图像中的第一人脸部位被遮挡,且第二人脸部位图像中的第二人脸部位未被遮挡,所述第二人脸部位为所述第一人脸部位的对称部位,则将所述第二人脸部位图像的水平翻转图像确定为所述第一人脸部位的恢复图像,所述恢复图像用于输入所述部位特征提取网络以提取所述部位特征。
- 根据权利要求11所述的方法,其特征在于,所述方法还包括:基于所述第一人脸部位被遮挡,更新所述第一人脸部位的部位特征的权值,更新的第一权值小于所述第一人脸部位的预设第一权值。
- 根据权利要求1至12中任一项所述的方法,其特征在于,所述方法还包括:对所述人脸图像进行预处理,以获取预处理后的人脸图像,所述预处理包括人脸对齐,所述预处理后的人脸图像用于提取所述人脸图像特征和提取所述多个面部几何特征点。
- 一种人脸识别装置,其特征在于,包括:获取模块,用于获取待识别的人脸图像;提取模块,用于根据所述人脸图像,通过预训练的特征提取网络,提取人脸图像特征;确定模块,用于提取所述人脸图像中的多个面部几何特征点,以确定多个特征点集合,所述多个特征点集合中每个特征点集合对应一个人脸部位,所述特征点集合包括至少一个面部几何特征点;所述获取模块,还用于根据所述多个特征点集合获取人脸拓扑结构特征,所述人脸拓扑结构特征用于确定所述多个特征点集合之间的相对位置关系;匹配模块,用于根据所述人脸拓扑结构特征和所述人脸图像特征在预设的人脸数据库中 进行匹配,以获取人脸识别结果。
- 根据权利要求14所述的人脸识别装置,其特征在于,所述人脸拓扑结构特征包括:特征向量集合,所述特征向量集合中的特征向量用于指示所述多个特征点集合中任意两个所述特征点集合之间的相对位置关系;或者,特征矩阵,所述特征矩阵中的元素用于指示所述多个特征点集合中任意两个所述特征点集合之间的相对位置关系。
- 根据权利要求14所述的人脸识别装置,其特征在于,所述确定模块还用于:构建所述多个特征点集合与标准人脸的多个特征点集合之间的映射关系,所述映射关系用于确定所述多个特征点集合之间的相对位置关系;所述获取模块具体用于:将所述映射关系输入到预训练的人脸拓扑结构特征提取网络,以获取所述人脸拓扑结构特征。
- 根据权利要求16所述的人脸识别装置,其特征在于,所述映射关系包括所述多个特征点集合与所述标准人脸的多个特征点集合之间的距离和/或角度。
- 根据权利要求16或17所述的人脸识别装置,其特征在于,所述人脸拓扑结构特征提取网络为第一网络训练后得到;所述提取模块还用于,从人脸图像训练样本中提取多个面部几何特征点,以确定多个样本特征点集合,所述多个样本特征点集合中每个特征点集合对应所述训练样本的一个人脸部位,所述样本特征点集合包括至少一个面部几何特征点;所述获取模块还用于,获取所述样本特征点集合与标准人脸的特征点集合之间的映射关系,并输入到所述第一网络进行训练,获取第一损失值;所述获取模块还用于,根据所述第一损失值更新所述第一网络中的权重参数,以获取所述人脸拓扑结构特征提取网络。
- 根据权利要求14至18中任一项所述的人脸识别装置,其特征在于,所述提取模块具体用于:将述人脸图像输入到预训练的人脸整体特征提取网络,以提取人脸整体特征。
- 根据权利要求14至19中任一项所述的人脸识别装置,其特征在于,所述提取模块具体用于:从所述人脸图像中提取第一人脸部位图像;将所述第一人脸部位图像输入到预训练的第一部位特征提取网络,以提取第一部位特征,所述第一部位特征用于在所述人脸数据库中进行匹配,以获取所述人脸识别结果。
- 根据权利要求20所述的人脸识别装置,其特征在于,所述第一部位特征提取网络为第二网络训练后得到,所述获取模块还用于:将从人脸图像训练样本中提取的人脸部位图像输入到所述第二网络进行训练,获取第二损失值;所述获取模块还用于,根据所述第二损失值更新所述第二网络中的权重参数,以获取所述第一部位特征提取网络。
- 根据权利要求14至19中任一项所述的人脸识别装置,其特征在于,所述提取模块还用于:从所述人脸图像中提取多个人脸部位图像;所述提取模块具体用于,将所述多个人脸部位图像分别输入预训练的多个部位特征提取网络,以提取多个部位特征;所述确定模块还用于,根据所述多个部位特征确定所述人脸图像的目标部位特征。
- 根据权利要求22所述的人脸识别装置,其特征在于,所述目标部位特征根据所述多个部位特征的加权平均值确定,所述多个部位特征的权值为预设值。
- 根据权利要求22或23所述的人脸识别装置,其特征在于,所述人脸识别装置还包括:检测模块,用于检测所述多个人脸部位图像中的人脸部位是否被遮挡;所述确定模块还用于,若第一人脸部位图像中的第一人脸部位被遮挡,且第二人脸部位图像中的第二人脸部位未被遮挡,所述第二人脸部位为所述第一人脸部位的对称部位,则将所述第二人脸部位图像的水平翻转图像确定为所述第一人脸部位的恢复图像,所述恢复图像用于输入所述部位特征提取网络以提取所述部位特征。
- 根据权利要求24所述的人脸识别装置,其特征在于,所述人脸识别装置还包括:更新模块,用于基于所述第一人脸部位被遮挡,更新所述第一人脸部位的部位特征的权值,更新的第一权值小于所述第一人脸部位的预设第一权值。
- 根据权利要求14至25中任一项所述的人脸识别装置,其特征在于,所述获取模块还用于:对所述人脸图像进行预处理,以获取预处理后的人脸图像,所述预处理包括人脸对齐,所述预处理后的人脸图像用于提取所述人脸图像特征和提取所述多个面部几何特征点。
- 一种人脸识别装置,其特征在于,包括处理器和存储器,所述处理器和所述存储器相互连接,其中,所述存储器用于存储计算机程序,所述计算机程序包括程序指令,所述处理器用于调用所述程序指令,执行如权利要求1至13中任一项所述的方法。
- 一种包含指令的计算机程序产品,其特征在于,当其在计算机上运行时,使得所述计算机执行如权利要求1至13中任一项所述的方法。
- 一种计算机可读存储介质,包括指令,其特征在于,当所述指令在计算机上运行时,使得计算机执行如权利要求1至13中任一项所述的方法。
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/758,946 US12307815B2 (en) | 2020-01-16 | 2020-07-30 | Face recognition method and face recognition apparatus |
| EP20914749.5A EP4075324B1 (en) | 2020-01-16 | 2020-07-30 | Face recognition method and face recognition device |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010051193.9A CN111274916B (zh) | 2020-01-16 | 2020-01-16 | 人脸识别方法和人脸识别装置 |
| CN202010051193.9 | 2020-01-16 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2021143101A1 true WO2021143101A1 (zh) | 2021-07-22 |
Family
ID=70997337
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2020/105772 Ceased WO2021143101A1 (zh) | 2020-01-16 | 2020-07-30 | 人脸识别方法和人脸识别装置 |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US12307815B2 (zh) |
| EP (1) | EP4075324B1 (zh) |
| CN (1) | CN111274916B (zh) |
| WO (1) | WO2021143101A1 (zh) |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113688849A (zh) * | 2021-08-30 | 2021-11-23 | 中国空空导弹研究院 | 一种用于卷积神经网络的灰度图像序列特征提取方法 |
| CN115424353A (zh) * | 2022-09-07 | 2022-12-02 | 杭银消费金融股份有限公司 | 基于ai模型的业务用户特征识别方法及系统 |
| CN117173758A (zh) * | 2022-12-23 | 2023-12-05 | 南昌工学院 | 一种基于多维度特征融合网络的学习注意力状态评估方法 |
Families Citing this family (20)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111274916B (zh) * | 2020-01-16 | 2024-02-02 | 华为技术有限公司 | 人脸识别方法和人脸识别装置 |
| CN111783601B (zh) * | 2020-06-24 | 2024-04-26 | 北京百度网讯科技有限公司 | 人脸识别模型的训练方法、装置、电子设备及存储介质 |
| CN111931628B (zh) * | 2020-08-04 | 2023-10-24 | 腾讯科技(深圳)有限公司 | 人脸识别模型的训练方法、装置及相关设备 |
| CN111985360A (zh) * | 2020-08-05 | 2020-11-24 | 上海依图网络科技有限公司 | 一种人脸识别的方法、装置、设备和介质 |
| US11741751B2 (en) * | 2020-10-01 | 2023-08-29 | Vinai AI Application and Research Joint Stock Co. | Masked face recognition method |
| CN112307947A (zh) * | 2020-10-29 | 2021-02-02 | 北京沃东天骏信息技术有限公司 | 用于生成信息的方法和装置 |
| CN114639134A (zh) * | 2020-12-16 | 2022-06-17 | 富士通株式会社 | 用于人脸识别的方法和装置 |
| CN115035560B (zh) * | 2021-03-04 | 2025-12-23 | 武汉Tcl集团工业研究院有限公司 | 口罩佩戴识别方法、装置、终端设备及存储介质 |
| CN113657227A (zh) * | 2021-08-06 | 2021-11-16 | 姜政毫 | 人脸识别方法及基于深度学习算法的人脸识别系统 |
| CN115546856A (zh) * | 2021-11-30 | 2022-12-30 | 青岛海尔智能家电科技有限公司 | 用于人脸识别的方法、装置及电子设备 |
| CN114463806A (zh) * | 2021-12-29 | 2022-05-10 | 四川新网银行股份有限公司 | 一种基于模型可视化的新特征自动预警方法 |
| CN115393482A (zh) * | 2022-08-08 | 2022-11-25 | 网易(杭州)网络有限公司 | 一种表情动画重定向方法、装置及电子设备 |
| TWI818824B (zh) * | 2022-12-07 | 2023-10-11 | 財團法人工業技術研究院 | 用於計算遮蔽人臉影像的人臉擺動方向的裝置及方法 |
| CN116645715A (zh) * | 2023-05-30 | 2023-08-25 | 中国工商银行股份有限公司 | 面部检测方法、装置、设备、介质及产品 |
| CN117238018B (zh) * | 2023-09-20 | 2024-06-21 | 华南理工大学 | 基于多粒度的可增量深宽网络活体检测方法、介质及设备 |
| CN117391968A (zh) * | 2023-10-16 | 2024-01-12 | 南京邮电大学 | 一种人脸图像复原方法、系统、存储介质及设备 |
| CN117369349B (zh) * | 2023-12-08 | 2024-02-23 | 如特数字科技(苏州)有限公司 | 一种远程监测智能机器人的管理系统 |
| CN117636446B (zh) * | 2024-01-25 | 2024-05-07 | 江汉大学 | 脸部穴位定位方法、针灸方法、针灸机器人及存储介质 |
| CN118334758B (zh) * | 2024-06-12 | 2024-08-23 | 广东博科电子科技有限公司 | 应用于楼宇门禁系统的图像识别方法及系统 |
| CN120126201B (zh) * | 2025-05-12 | 2025-07-25 | 华中科技大学 | 基于非线性映射的面部机器人表情协同控制方法及系统 |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106022317A (zh) * | 2016-06-27 | 2016-10-12 | 北京小米移动软件有限公司 | 人脸识别方法及装置 |
| CN108898068A (zh) * | 2018-06-06 | 2018-11-27 | 腾讯科技(深圳)有限公司 | 一种人脸图像的处理方法和装置以及计算机可读存储介质 |
| CN110263673A (zh) * | 2019-05-31 | 2019-09-20 | 合肥工业大学 | 面部表情识别方法、装置、计算机设备及存储介质 |
| CN111274916A (zh) * | 2020-01-16 | 2020-06-12 | 华为技术有限公司 | 人脸识别方法和人脸识别装置 |
Family Cites Families (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7236615B2 (en) * | 2004-04-21 | 2007-06-26 | Nec Laboratories America, Inc. | Synergistic face detection and pose estimation with energy-based models |
| CN101980305A (zh) * | 2010-10-21 | 2011-02-23 | 青岛科技大学 | 基于人脸识别与3g通信的家庭智能门禁系统 |
| CN105684038B (zh) * | 2013-10-28 | 2019-06-11 | 谷歌有限责任公司 | 用于替换图像的部分的图像缓存 |
| CN103927554A (zh) * | 2014-05-07 | 2014-07-16 | 中国标准化研究院 | 一种基于拓扑结构的图像稀疏表征面部表情特征提取系统和方法 |
| US11544962B2 (en) * | 2014-07-15 | 2023-01-03 | FaceChecks LLC | Multi-algorithm-based face recognition system and method with optimal dataset partitioning for a cloud environment |
| JP6754619B2 (ja) * | 2015-06-24 | 2020-09-16 | 三星電子株式会社Samsung Electronics Co.,Ltd. | 顔認識方法及び装置 |
| CN105069622B (zh) * | 2015-08-03 | 2018-08-31 | 福州海景科技开发有限公司 | 一种面向移动终端的人脸识别支付系统和方法 |
| CN106570822B (zh) * | 2016-10-25 | 2020-10-16 | 宇龙计算机通信科技(深圳)有限公司 | 一种人脸贴图方法及装置 |
| CN106686353A (zh) * | 2016-12-30 | 2017-05-17 | 佛山亚图信息技术有限公司 | 基于云端的安防控制系统及其方法 |
| CN106998444B (zh) * | 2017-02-14 | 2020-02-18 | 广东中科人人智能科技有限公司 | 一种大数据人脸监控系统 |
| CN109583332B (zh) * | 2018-11-15 | 2021-07-27 | 北京三快在线科技有限公司 | 人脸识别方法、人脸识别系统、介质及电子设备 |
| CN110222566A (zh) * | 2019-04-30 | 2019-09-10 | 北京迈格威科技有限公司 | 一种人脸特征的获取方法、装置、终端及存储介质 |
| CN110675489B (zh) * | 2019-09-25 | 2024-01-23 | 北京达佳互联信息技术有限公司 | 一种图像处理方法、装置、电子设备和存储介质 |
| TWI783199B (zh) * | 2019-12-25 | 2022-11-11 | 亞旭電腦股份有限公司 | 臉部辨識的處理方法及電子裝置 |
-
2020
- 2020-01-16 CN CN202010051193.9A patent/CN111274916B/zh active Active
- 2020-07-30 EP EP20914749.5A patent/EP4075324B1/en active Active
- 2020-07-30 US US17/758,946 patent/US12307815B2/en active Active
- 2020-07-30 WO PCT/CN2020/105772 patent/WO2021143101A1/zh not_active Ceased
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106022317A (zh) * | 2016-06-27 | 2016-10-12 | 北京小米移动软件有限公司 | 人脸识别方法及装置 |
| CN108898068A (zh) * | 2018-06-06 | 2018-11-27 | 腾讯科技(深圳)有限公司 | 一种人脸图像的处理方法和装置以及计算机可读存储介质 |
| CN110263673A (zh) * | 2019-05-31 | 2019-09-20 | 合肥工业大学 | 面部表情识别方法、装置、计算机设备及存储介质 |
| CN111274916A (zh) * | 2020-01-16 | 2020-06-12 | 华为技术有限公司 | 人脸识别方法和人脸识别装置 |
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4075324A4 |
Cited By (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113688849A (zh) * | 2021-08-30 | 2021-11-23 | 中国空空导弹研究院 | 一种用于卷积神经网络的灰度图像序列特征提取方法 |
| CN113688849B (zh) * | 2021-08-30 | 2023-10-24 | 中国空空导弹研究院 | 一种用于卷积神经网络的灰度图像序列特征提取方法 |
| CN115424353A (zh) * | 2022-09-07 | 2022-12-02 | 杭银消费金融股份有限公司 | 基于ai模型的业务用户特征识别方法及系统 |
| CN115424353B (zh) * | 2022-09-07 | 2023-05-05 | 杭银消费金融股份有限公司 | 基于ai模型的业务用户特征识别方法及系统 |
| CN117173758A (zh) * | 2022-12-23 | 2023-12-05 | 南昌工学院 | 一种基于多维度特征融合网络的学习注意力状态评估方法 |
Also Published As
| Publication number | Publication date |
|---|---|
| US12307815B2 (en) | 2025-05-20 |
| EP4075324A4 (en) | 2023-06-14 |
| EP4075324A1 (en) | 2022-10-19 |
| US20230080031A1 (en) | 2023-03-16 |
| CN111274916A (zh) | 2020-06-12 |
| CN111274916B (zh) | 2024-02-02 |
| EP4075324B1 (en) | 2025-07-30 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2021143101A1 (zh) | 人脸识别方法和人脸识别装置 | |
| CN112766158B (zh) | 基于多任务级联式人脸遮挡表情识别方法 | |
| WO2021218238A1 (zh) | 图像处理方法和图像处理装置 | |
| US11232286B2 (en) | Method and apparatus for generating face rotation image | |
| CN112215180B (zh) | 一种活体检测方法及装置 | |
| CN112800903B (zh) | 一种基于时空图卷积神经网络的动态表情识别方法及系统 | |
| CN110728209B (zh) | 一种姿态识别方法、装置、电子设备及存储介质 | |
| CN112446398B (zh) | 图像分类方法以及装置 | |
| CN110222717B (zh) | 图像处理方法和装置 | |
| US20210012198A1 (en) | Method for training deep neural network and apparatus | |
| CN106951867B (zh) | 基于卷积神经网络的人脸识别方法、装置、系统及设备 | |
| CN111368672A (zh) | 一种用于遗传病面部识别模型的构建方法及装置 | |
| WO2021077984A1 (zh) | 对象识别方法、装置、电子设备及可读存储介质 | |
| WO2021175050A1 (zh) | 三维重建方法和三维重建装置 | |
| WO2021227726A1 (zh) | 面部检测、图像检测神经网络训练方法、装置和设备 | |
| WO2021043168A1 (zh) | 行人再识别网络的训练方法、行人再识别方法和装置 | |
| WO2022179587A1 (zh) | 一种特征提取的方法以及装置 | |
| CN117437522B (zh) | 一种人脸识别模型训练方法、人脸识别方法及装置 | |
| CN110222718A (zh) | 图像处理的方法及装置 | |
| WO2022246612A1 (zh) | 活体检测方法、活体检测模型的训练方法及其装置和系统 | |
| WO2022179599A1 (zh) | 一种感知网络及数据处理方法 | |
| CN114037838A (zh) | 神经网络的训练方法、电子设备及计算机程序产品 | |
| CN120472530A (zh) | 一种换衣场景下的行人重识别方法 | |
| CN114120386A (zh) | 人脸识别方法、装置、设备及存储介质 | |
| CN119049089B (zh) | 一种基于近距离单目灰度摄像头的双手拇指关键点实时检测方法 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20914749 Country of ref document: EP Kind code of ref document: A1 |
|
| ENP | Entry into the national phase |
Ref document number: 2020914749 Country of ref document: EP Effective date: 20220714 |
|
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| WWG | Wipo information: grant in national office |
Ref document number: 17758946 Country of ref document: US |
|
| WWG | Wipo information: grant in national office |
Ref document number: 2020914749 Country of ref document: EP |



