WO2018233438A1 - Face feature point tracking method, apparatus, storage medium, and device - Google Patents

Face feature point tracking method, apparatus, storage medium, and device

Info

Publication number
WO2018233438A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature point
sample
image
face feature
facial feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2018/088070
Other languages
English (en)
French (fr)
Inventor
林梦然
王新亮
李斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to EP18819678.6A (patent EP3644219B1, en)
Publication of WO2018233438A1 (zh)
Priority to US16/542,005 (patent US10943091B2, en)

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/248 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/167 Detection; Localisation; Normalisation using comparisons between temporally consecutive images
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/169 Holistic features and representations, i.e. based on the facial image taken as a whole
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30201 Face

Definitions

  • The embodiments of the present invention relate to the field of image recognition technologies, and in particular to a face feature point tracking method, apparatus, storage medium, and device.
  • Image recognition technology processes, analyzes, and understands images by computer and is an important field of artificial intelligence. It is widely used to track face feature points, filter spam images, and match terrain.
  • Reference face feature points can be obtained from a large number of sample images with labeled face feature points, and a feature point tracking model can be obtained from the reference face feature points, so that the model reflects the relationship between the face feature points in an image and the reference face feature points. The face feature points of the current image can then be obtained based on the feature point tracking model.
  • The face feature points of consecutive frame images in a video usually differ and change continuously.
  • However, when the prior art tracks the face feature points of consecutive frame images, the face feature points of every frame image are obtained from the reference face feature points. This severely limits tracking, and the tracked face feature points cannot accurately represent the real face features.
  • The embodiments of the present invention provide a face feature point tracking method, apparatus, storage medium, and device, which address the problem that obtaining the face feature points of every frame image from the reference face feature points severely limits tracking, so that the tracked face feature points cannot accurately represent the real face features.
  • The technical solution is as follows:
  • A method for tracking face feature points, for use in an electronic device, the method comprising:
  • the coordinates of the point, where the preset error model is trained on the face feature points of multiple pairs of adjacent frame images and indicates the relationship between the pixel points of the later frame image in a pair of adjacent frame images and the face feature point error.
  • A face feature point tracking apparatus, comprising:
  • a first acquiring module, configured to acquire the face feature points in the previous frame image of a to-be-tracked frame image;
  • a second acquiring module, configured to acquire the face feature point error between the to-be-tracked frame image and the previous frame image based on a preset error model and pixel points in the to-be-tracked frame image, where the face feature point error is the difference between a first coordinate and a second coordinate, the first coordinate being the coordinate of a face feature point in the to-be-tracked frame image and the second coordinate being the coordinate of the face feature point at the corresponding position in the previous frame image, and where the preset error model is trained on the face feature points of multiple pairs of adjacent frame images and indicates the relationship between the pixel points of the later frame image in a pair of adjacent frame images and the face feature point error; and
  • a tracking module, configured to obtain the face feature points of the to-be-tracked frame image based on the face feature points of the previous frame image and the face feature point error.
  • A computer-readable storage medium storing at least one instruction, the instruction being loaded and executed by a processor to implement the above face feature point tracking method.
  • An electronic device, comprising one or more processors and a memory, the memory storing one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for performing the following operations:
  • the coordinates of the point, where the preset error model is trained on the face feature points of multiple pairs of adjacent frame images and indicates the relationship between the pixel points of the later frame image in a pair of adjacent frame images and the face feature point error.
  • By acquiring the face feature points in the previous frame image, the face feature point error between the to-be-tracked frame image and the previous frame image can be obtained based on the preset error model, and the face feature points in the to-be-tracked frame image are then obtained from the face feature points of the previous frame image and this error. Because the face feature points of adjacent frame images change continuously, using the face feature points of the previous frame image as the reference allows the face feature points of the to-be-tracked frame image to be estimated more accurately.
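  • The tracking step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `track_frame` and the sample numbers are hypothetical, and the error values stand in for whatever the preset error model would output.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def track_frame(prev_points: List[Point], point_errors: List[Point]) -> List[Point]:
    """Return previous-frame points shifted by the per-point error."""
    if len(prev_points) != len(point_errors):
        raise ValueError("need exactly one error vector per feature point")
    # error = current coordinate - previous coordinate,
    # so current coordinate = previous coordinate + error
    return [(px + ex, py + ey) for (px, py), (ex, ey) in zip(prev_points, point_errors)]


previous = [(10.0, 20.0), (30.0, 40.0)]  # feature points of the previous frame (made up)
errors = [(1.5, -0.5), (-2.0, 1.0)]      # made-up output of the error model
tracked = track_frame(previous, errors)
print(tracked)
```

  • Because the error is defined as the first coordinate minus the second coordinate, adding it to the previous frame's coordinates recovers the coordinates in the to-be-tracked frame.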
  • FIG. 1 is a schematic diagram of an implementation environment of a face feature point tracking provided by an embodiment of the present application
  • FIG. 2 is a flowchart of a method for tracking a face feature point according to an embodiment of the present application
  • FIG. 3 is a schematic diagram of a feature point of a face provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a data structure provided by an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of a face feature point tracking according to an embodiment of the present application.
  • FIG. 6 is a block diagram of a face feature point tracking apparatus according to an embodiment of the present application.
  • FIG. 7 is a block diagram of a face feature point tracking apparatus according to an embodiment of the present application.
  • FIG. 8 is a block diagram of a face feature point tracking apparatus according to an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a terminal according to an embodiment of the present disclosure.
  • FIG. 10 is a schematic structural diagram of a server according to an embodiment of the present application.
  • FIG. 1 is a schematic diagram of an implementation environment of face feature point tracking provided by an embodiment of the present application.
  • The implementation environment includes a server 101 and a terminal 102.
  • The server 101 may be a server that provides an image service,
  • and the terminal 102 may be a terminal of a user served by that server.
  • The terminal 102 can install an image application, a social application, or a game application provided by the server 101, so that the terminal 102 can interact with the server 101 through the installed application.
  • The server 101 may be configured to acquire a preset error model and send it to the terminal 102, so that the terminal 102 can store the preset error model and, when face feature points are tracked in the application, obtain the face feature points by tracking based on the preset error model.
  • The server 101 can also be configured with at least one database, such as a face image database and a user database.
  • The face image database stores face images, the face feature points labeled in the face images, and the simulated face feature points of the previous frame image of each face image.
  • The user database stores personal data, such as the user names and passwords, of the users served by the server 101.
  • FIG. 2 is a flowchart of a method for tracking a face feature point according to an embodiment of the present application.
  • The method can be applied to any electronic device, such as a server or a terminal; the following description takes a terminal as the executing entity.
  • The method may include a model training process and a model application process: steps 201-205 train the preset error model on multiple pairs of adjacent frame images, and steps 205-207 track face feature points by applying the preset error model:
  • By collecting a sample set and training a model on it, the embodiment of the present application can find how face feature points change between the preceding and following frame images.
  • A face feature point is a point in an image that represents a feature of the face, such as a point that represents a facial feature, and is usually represented in the form of coordinates.
  • Samples can be obtained in several ways. For example, to ensure the reliability of the sample set and improve the accuracy of the preset error model, multiple pairs of adjacent frame images can be extracted from a video containing a face, and the face feature points manually labeled on each pair of adjacent frame images are taken as a sample, where a pair of adjacent frame images consists of two adjacent images in the video. Alternatively, to save labor costs and collect the sample set more efficiently, a single face image can be obtained together with the face feature points manually labeled on it. The face feature points of the previous frame image of that single face image are then simulated based on the distribution of the face feature points in the single face image, which yields the face feature points of a pair of adjacent frame images as a sample. The method of simulating the face feature points may be, but is not limited to, the Monte Carlo method.
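  • One possible way to simulate previous-frame feature points by random sampling is sketched below. The source names the Monte Carlo method but does not specify the sampling distribution, so the shared uniform shift, the per-point Gaussian jitter, and the parameters `max_shift` and `jitter` are all assumptions made for illustration.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def simulate_previous_points(points: List[Point], rng: random.Random,
                             max_shift: float = 3.0, jitter: float = 0.5) -> List[Point]:
    """Draw one random 'previous frame' layout near the labeled points."""
    # Assumption: one shared random translation models whole-face motion between frames.
    dx = rng.uniform(-max_shift, max_shift)
    dy = rng.uniform(-max_shift, max_shift)
    # Assumption: small independent Gaussian noise models per-point change.
    return [(x + dx + rng.gauss(0.0, jitter), y + dy + rng.gauss(0.0, jitter))
            for x, y in points]


rng = random.Random(0)  # fixed seed so the draw is repeatable
labeled = [(100.0, 120.0), (140.0, 118.0), (120.0, 150.0)]  # made-up labels
# One sample: (simulated first-image points, labeled second-image points)
sample = (simulate_previous_points(labeled, rng), labeled)
```

  • Repeating the draw with different random states would produce many simulated adjacent-frame pairs from a single labeled image.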
  • FIG. 3 is a schematic diagram of face feature points provided by an embodiment of the present application. FIG. 3(a) shows a second image, FIG. 3(b) shows the face feature points of the second image, and FIG. 3(c) shows the face feature points of the first image simulated from the face feature points of the second image.
  • The number of coordinates used to represent the complete set of face feature points is not limited. If, for example, 30 are used, then training the model or tracking face feature points amounts to processing the coordinates of 30 face feature points to represent how the face feature points change.
  • The face feature points of the adjacent frame images may be stored in the face image database of the server, and at the beginning of training the terminal obtains from the face image database a sample set containing the face feature points of adjacent frame images.
  • The preset threshold is the criterion for segmenting the sample set: samples whose face feature points differ are segmented into different classes of samples, and samples whose face feature points are similar are segmented into the same class of samples.
  • The selected position refers to the positions of any two pixel points within a selected region of the second image.
  • The embodiment of the present application does not limit the size of the selected region or its position in the second image, provided that the selected region is no larger than the second image.
  • The pair of pixel points at the selected position in each second image can be used as the feature of the sample, and the sample set is classified according to this feature. Because every sample contains at least a second image taken from a single image, the segmentation can be based on the pixel points of the second image.
  • The difference between the gray values of the pair of pixel points may be taken as the feature of the sample.
  • The terminal may acquire pixel point information of the second image, which includes at least the position and gray value of each pixel point, and determine the grayscale difference of the pair of pixel points of the second image at the selected position.
  • The grayscale difference corresponding to the second image is compared with the preset threshold of the n-th level of segmentation. If it is smaller than the preset threshold, the sample containing the second image is segmented into one class of the n-th level. If it is not smaller than the preset threshold, the sample is segmented into the other class of the n-th level.
  • Neither the number of segmentation levels nor the preset thresholds are limited. The segmentation level indicates how fine the classification is, and each level of segmentation further segments the classes produced by the previous level. For the configured number of levels, the terminal uses the classes produced by the last level of segmentation as the multiple classes of samples.
  • For example, the terminal may perform the first-level segmentation: the grayscale difference of the pair of pixel points at the selected position in a second image is compared with a preset threshold a. If it is less than the preset threshold a, the sample containing the second image is segmented into class 1; if it is not less than the preset threshold a, the sample is segmented into class 2.
  • Once the first-level segmentation has produced class 1 and class 2, the second-level segmentation follows. The grayscale difference corresponding to the second image of each sample in class 1 is compared with a preset threshold b: if it is less than the preset threshold b, the sample is segmented into class 11, and if it is not less than the preset threshold b, the sample is segmented into class 12. Likewise, the grayscale difference corresponding to the second image of each sample in class 2 is compared with a preset threshold c, so that the samples in class 2 are segmented into class 21 and class 22.
  • Through the second-level segmentation, the terminal obtains class 11, class 12, class 21, and class 22.
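  • The two-level segmentation in this example can be sketched as follows. The function name `two_level_split` and the numeric values are hypothetical, and the rule that a difference below threshold c goes to class 21 is an assumption that mirrors the rule stated for thresholds a and b.

```python
from typing import Dict, List


def two_level_split(gray_diffs: List[float], a: float, b: float, c: float) -> Dict[str, List[int]]:
    """Assign each sample index to class 11, 12, 21, or 22 by its grayscale difference."""
    classes: Dict[str, List[int]] = {"11": [], "12": [], "21": [], "22": []}
    for index, diff in enumerate(gray_diffs):
        if diff < a:
            # First level put the sample in class 1; second level uses threshold b.
            label = "11" if diff < b else "12"
        else:
            # First level put the sample in class 2; second level uses threshold c
            # (assumed: below c goes to class 21, otherwise class 22).
            label = "21" if diff < c else "22"
        classes[label].append(index)
    return classes


diffs = [5.0, 20.0, 35.0, 50.0]  # made-up grayscale differences, one per sample
result = two_level_split(diffs, a=30.0, b=10.0, c=40.0)
print(result)
```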
  • The data structure of the preset error model may be a random forest, and the segmentation process may serve as the process of generating a regression tree in the random forest.
  • FIG. 4 is a schematic diagram of a data structure provided in the embodiment of the present application.
  • In FIG. 4, the preset error model is built from T random forests (this step takes T as an example), and a random forest may be composed of at least one regression tree.
  • A regression tree can have multiple nodes. Every node other than a leaf node can correspond to a preset threshold, and one class of samples is obtained at each leaf node of a regression tree.
  • The terminal first compares the grayscale difference corresponding to each second image with the preset threshold at the root node of a regression tree. If the difference is less than the preset threshold, the sample is segmented to the left child node (or right child node); if it is not less than the preset threshold, the sample is segmented to the right child node (or left child node). This completes the first-level segmentation. The process continues until the leaf nodes of the tree are reached, and the samples segmented to each leaf node form one class of samples. Because every sample in a class goes through the same segmentation process according to its grayscale difference, the face feature points of the samples in a class are similar to some degree.
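  • Routing a sample through such a regression tree can be sketched as below. The `Node` class, the helper names, the tiny image, and the thresholds are all hypothetical; the sketch assumes the "less than goes left" convention, which the source says may equally be reversed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Internal nodes hold a threshold; leaf nodes hold a class index."""
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    leaf_class: Optional[int] = None


def gray_difference(image: List[List[int]], pos_a: Tuple[int, int], pos_b: Tuple[int, int]) -> int:
    """Grayscale difference of the pixel pair at the selected position (row, col)."""
    (ra, ca), (rb, cb) = pos_a, pos_b
    return image[ra][ca] - image[rb][cb]


def route_to_leaf(diff: float, node: Node) -> int:
    """Walk from the root to a leaf, going left when diff is below the node threshold."""
    while node.leaf_class is None:
        node = node.left if diff < node.threshold else node.right
    return node.leaf_class


# A depth-2 tree with four leaves (thresholds are made up).
tree = Node(100.0,
            Node(40.0, Node(leaf_class=0), Node(leaf_class=1)),
            Node(160.0, Node(leaf_class=2), Node(leaf_class=3)))
image = [[10, 200], [50, 90]]                      # made-up 2x2 grayscale image
diff = gray_difference(image, (0, 1), (1, 0))      # 200 - 50
leaf = route_to_leaf(diff, tree)
print(diff, leaf)
```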
  • Different pairs of pixel points differ in how representative they are. For example, one pair may consist of a pixel point at the center of the image and a pixel point at the edge of the image,
  • and that pair may be more representative than another pair.
  • To make the samples in different classes easier to tell apart and the samples in the same class more similar under the segmentation manner corresponding to a pair of pixel points, and thereby improve the accuracy of the preset error model, a segmentation manner with higher segmentation purity can be selected from multiple segmentation manners. Segmentation purity indicates how similar the samples within each class are under a given segmentation manner:
  • the higher the purity, the more similar the samples within each class.
  • The segmentation manner may be selected as follows. Based on the preset threshold and multiple pairs of pixel points at different selected positions in the selected region of the second image of each sample, the terminal segments the sample set in different manners and obtains the multiple classes of samples under each segmentation manner. It then determines the segmentation purity of each segmentation manner from the face feature points of the classes of samples under that manner, and selects the segmentation manner whose segmentation purity meets a preset condition. The classes of samples under the selected manner are the finally obtained classes of samples, and the position of the corresponding pair of pixel points is taken as the selected position.
  • The preset condition may be, but is not limited to, having the highest segmentation purity.
  • The terminal may randomly select multiple pairs of pixel points from the selected region. For example, if the selected region contains position 1 to position 10, the pixel points at position 1 and position 3 may be selected as one pair, the pixel points at position 2 and position 6 as another pair, and so on.
  • The terminal then performs segmentation based on each pair of pixel points, in the same way as the segmentation based on the selected position described above. The embodiment of the present application does not limit how the segmentation purity is obtained.
  • For example, the segmentation purity can be obtained from the variance of the face feature points of the samples in each class under the current segmentation manner: the more similar the samples in each class are, the smaller the corresponding variance, and the higher the segmentation purity.
  • The segmentation purity can be obtained with reference to Formula 2 below,
  • where $r_i$ is the difference between the face feature points of the second image and the face feature points of the first image in one sample,
  • $Q_{\theta,S}$ is the number of samples,
  • and $\mu_{\theta,S}$ is the mean face feature point difference over the samples in that class of samples.
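  • Selecting the pixel pair by segmentation purity can be sketched as follows. The exact form of Formula 2 is not recoverable from the text, so this sketch assumes a variance-style cost: the sum of squared deviations of each sample's difference vector from its class mean, with a lower cost meaning higher purity. The function names and toy data are hypothetical.

```python
from typing import List, Sequence, Tuple

PixelPair = Tuple[Tuple[int, int], Tuple[int, int]]


def squared_deviation(vectors: List[Sequence[float]]) -> float:
    """Sum of squared deviations from the class mean (0.0 for an empty class)."""
    if not vectors:
        return 0.0
    dim = len(vectors[0])
    mean = [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]
    return sum((v[k] - mean[k]) ** 2 for v in vectors for k in range(dim))


def choose_split(images, residuals, candidate_pairs: List[PixelPair], threshold: float):
    """Return the candidate pixel pair with the lowest cost (assumed purity measure)."""
    best_pair, best_cost = None, float("inf")
    for (ra, ca), (rb, cb) in candidate_pairs:
        left, right = [], []
        for image, residual in zip(images, residuals):
            diff = image[ra][ca] - image[rb][cb]
            (left if diff < threshold else right).append(residual)
        cost = squared_deviation(left) + squared_deviation(right)
        if cost < best_cost:
            best_pair, best_cost = ((ra, ca), (rb, cb)), cost
    return best_pair, best_cost


# Four made-up 2x2 second images and their difference vectors r_i.
images = [[[0, 100], [0, 0]], [[0, 90], [0, 0]], [[0, 10], [50, 0]], [[0, 5], [60, 0]]]
residuals = [[1.0, 1.0], [1.0, 1.0], [-1.0, -1.0], [-1.0, -1.0]]
pairs = [((0, 0), (1, 1)), ((0, 1), (0, 0))]
best_pair, best_cost = choose_split(images, residuals, pairs, threshold=50.0)
print(best_pair, best_cost)
```

  • In this toy data the second pair separates the two groups of difference vectors perfectly, so its cost is zero and it is chosen over the first pair, which leaves all samples in one class.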
  • In the embodiment of the present application, the face feature points of the first image serve as the initial estimate of the face feature points of the second image, and the preset error model is obtained from the difference between the face feature points of the two images.
  • The terminal may use the fourth coordinates of the face feature points in the first image as the estimated face feature point coordinates of the corresponding second image, and analyze the general difference between the third coordinates of the face feature points in the second image and the estimated face feature point coordinates to obtain the initial face feature point error. The initial face feature point error may be the average, over all second images, of the difference between the third coordinate of each face feature point in the second image and the fourth coordinate of the face feature point at the corresponding position in the first image. It can be calculated by Formula 1:
  • Formula 1: $f_{01} = \frac{1}{N} \sum_{i=1}^{N} \left( S_{i2} - S_{i1} \right)$
  • where $f_{01}$ represents the initial face feature point error, $N$ represents the number of second images, $i$ indexes the $i$-th pair of adjacent frame images, $S_{i2}$ represents the face feature points of the $i$-th second image, and $S_{i1}$ represents the face feature points of the $i$-th first image.
  • Coordinates with the same index are the coordinates of a pair of face feature points at corresponding positions. For example, a1 and b1 are the coordinates of such a pair, and their difference is (X1-Y1, X2-Y2). When f01 is computed, the X-axis values of all the differences are averaged to give the X-axis value of f01, and the Y-axis values of all the differences are averaged to give the Y-axis value of f01, which yields the coordinates of f01. Face feature point coordinates are handled in the same way elsewhere in the embodiment of the present application.
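  • The per-axis averaging described here can be sketched as below. The function name `initial_error` and the coordinates are hypothetical; each sample is assumed to be a list of (x, y) tuples in the same point order for the first and second image.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def initial_error(first_points: List[List[Point]], second_points: List[List[Point]]) -> List[Point]:
    """Average (second - first) over all pairs, separately per point and per axis."""
    n = len(second_points)
    num_points = len(second_points[0])
    return [
        (sum(second_points[i][k][0] - first_points[i][k][0] for i in range(n)) / n,
         sum(second_points[i][k][1] - first_points[i][k][1] for i in range(n)) / n)
        for k in range(num_points)
    ]


# Two made-up pairs of adjacent frames, two feature points each.
first = [[(10.0, 10.0), (20.0, 20.0)], [(12.0, 8.0), (22.0, 18.0)]]
second = [[(11.0, 12.0), (21.0, 22.0)], [(15.0, 9.0), (25.0, 19.0)]]
f01 = initial_error(first, second)
print(f01)
```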
  • Step 202 is an optional step of the embodiment of the present application.
  • The reconstructed face feature point error may also be determined from the segmented samples and the face feature points of the first image,
  • and the preset error model obtained from it, which likewise improves the accuracy of the face feature point tracking process.
  • The estimated face feature point coordinates can be determined in several ways. For example, the fourth coordinates of the face feature points of the first image can be used directly as the estimated face feature point coordinates of the corresponding second image. Alternatively, the terminal may combine the acquired initial face feature point error with the fourth coordinates of the face feature points of the first image in a pair of adjacent frame images to determine the estimated face feature point coordinates of the second image in that pair, for example by adding each coordinate of the face feature points of the first image to the corresponding coordinate difference in the initial face feature point error.
  • Estimated face feature points obtained by shifting the face feature points of the first image by the initial face feature point error are closer to the real face feature points of the second image,
  • which makes the preset error model obtained from the estimated face feature points more accurate.
  • In this step, each class of samples is analyzed to determine the difference between the face feature points and the estimated face feature points of the second images in that class.
  • The terminal may take, over the second images in the class of samples,
  • the average of the difference between the third coordinates of the face feature points and the estimated face feature point coordinates as the reconstructed face feature point error. It can be calculated by Formula 3:
  • Formula 3: $X_n = \frac{1}{A} \sum_{a=1}^{A} \left( S_{ar} - S_{ae} \right)$
  • where $X_n$ represents the reconstructed face feature point error corresponding to the $n$-th class of samples, $A$ represents the number of second images in the $n$-th class of samples, $S_{ar}$ represents the third coordinates of the face feature points in the $a$-th second image, and $S_{ae}$ represents the estimated face feature point coordinates of the $a$-th second image.
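  • The per-class average of the difference between real and estimated coordinates can be sketched as follows. The function name `reconstructed_errors`, the class labels, and the coordinates are hypothetical; the class of each sample is assumed to come from an earlier segmentation step.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def reconstructed_errors(true_points: List[List[Point]],
                         estimated_points: List[List[Point]],
                         sample_classes: List[str]) -> Dict[str, List[Point]]:
    """For each class, average (true - estimated) per feature point and per axis."""
    diffs_by_class: Dict[str, List[List[Point]]] = {}
    for truth, estimate, cls in zip(true_points, estimated_points, sample_classes):
        diff = [(tx - ex, ty - ey) for (tx, ty), (ex, ey) in zip(truth, estimate)]
        diffs_by_class.setdefault(cls, []).append(diff)
    errors: Dict[str, List[Point]] = {}
    for cls, diffs in diffs_by_class.items():
        count = len(diffs)
        errors[cls] = [
            (sum(d[k][0] for d in diffs) / count, sum(d[k][1] for d in diffs) / count)
            for k in range(len(diffs[0]))
        ]
    return errors


# Three made-up samples with one feature point each; two fall in class "11".
truth = [[(12.0, 10.0)], [(14.0, 12.0)], [(9.0, 9.0)]]
estimate = [[(10.0, 10.0)], [(10.0, 10.0)], [(10.0, 10.0)]]
x_by_class = reconstructed_errors(truth, estimate, ["11", "11", "12"])
print(x_by_class)
```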
  • In this way, the reconstructed face feature point error corresponding to each leaf node of a regression tree can be obtained.
  • The preset error model is obtained based on the initial face feature point error and the reconstructed face feature point error corresponding to each class of samples.
  • The preset error model obtained through training can take several forms.
  • For example, the preset error model is a weighted form of the initial face feature point error and the reconstructed face feature point errors, and can be used to obtain the face feature point error between a frame image and its previous frame image.
  • Alternatively, the preset error model combines the initial face feature point error, the reconstructed face feature point errors, and the face feature points of the previous frame image of a frame image (the independent variable to be input into the preset error model)
  • This step is an optional way of obtaining the preset error model based on the reconstructed face feature point error corresponding to each class of samples.
  • The preset error model is obtained from the reconstructed face feature point error corresponding to each class of samples.
  • The training process may also be continued based on the reconstructed face feature point errors obtained in step 203.
  • The implementation may follow steps (1)-(3) below:
  • For each class of samples, the estimated face feature points of each second image in the class are updated based on the reconstructed face feature point error corresponding to that class.
  • The terminal may add the estimated face feature point coordinates of each second image to the reconstructed face feature point error corresponding to the class of samples containing that second image, obtaining the updated estimated face feature points of the second image.
  • Step (1) may also be expressed as: updating the estimated face feature points of each second image based on a first reconstructed face feature point error, where the first reconstructed face feature point error is the reconstructed face feature point error corresponding to each class of samples segmented by the pair of pixel points at the selected position in the selected region of the second image of each sample in the sample set.
  • The terminal may take the reconstructed face feature point error corresponding to each class of samples obtained in step 203 as the first reconstructed face feature point error,
  • and add the estimated face feature point coordinates of each second image to the reconstructed face feature point error corresponding to the class of samples containing that second image, obtaining the updated estimated face feature points of the second image.
  • Step (2) is equivalent to updating the selected position used in step 201.
  • For example, if the selected region contains positions 1-10 and positions 1 and 3 were chosen as the selected position in step 201, then step (2) can choose two of the remaining 8 positions as the selected position, such as position 2 and position 7.
  • Step (3) can be understood as continuing to perform steps 201 and 203 until the reconstructed face feature point errors corresponding to the classes of samples segmented by the pairs of pixel points at every selected position have been determined. If the terminal performs step 202 between steps 201 and 203, step (3) may instead be: continuing to perform the steps of segmenting the sample set into multiple classes of samples based on the preset threshold and a pair of pixel points at a selected position in the selected region of the second image of each sample, determining the initial face feature point error corresponding to the selected region of the second image, and determining the reconstructed face feature point error corresponding to each class of samples, until the reconstructed face feature point errors corresponding to the classes of samples segmented by the pairs of pixel points at every selected position have been determined.
  • To distinguish them, the selected position chosen the first time may be called the first position,
  • and the selected position chosen the second time may be called the second position.
  • Step (3) may then be expressed as: segmenting the sample set into multiple classes of samples based on the preset threshold and the pair of pixel points at the second position in the selected region of the second image of each sample, and determining a second reconstructed face feature point error, and so on until the reconstructed face feature point errors corresponding to the classes of samples segmented by the pixel points at every position in the selected region have been determined. The second reconstructed face feature point error is
  • the reconstructed face feature point error corresponding to each class of samples segmented by the pair of pixel points at the second position.
  • The implementation of step (3) is similar to that of step 203, except that in step (3) the terminal determines the reconstructed face feature point error based on the updated estimated face feature points. After determining the second reconstructed face feature point error, the terminal may continue to determine reconstructed face feature point errors according to steps (1)-(3): for example, it updates the estimated face feature points of the second image based on the second reconstructed face feature point error, divides the sample set based on the preset threshold and a pair of pixel points at the third position, and determines the third reconstructed face feature point error, and so on, until a preset number of reconstructed face feature point errors have been determined from the preset number of selected positions in the selected area. The preset error model can then be obtained based on (the initial face feature point error and) the reconstructed face feature point errors corresponding to each type of sample divided by each pair of pixel points.
  • The estimated face feature points of the second image are updated continuously as the reconstructed face feature point errors are determined. Each time a reconstructed face feature point error is to be determined, the estimated face feature points are first updated based on the previous reconstructed face feature point error, and the new reconstructed face feature point error is then obtained based on the updated estimated face feature points.
  • The preset error model obtained through steps (1)-(3) can be a weighted form of the initial face feature point error and each reconstructed face feature point error, or a weighted form of the initial face feature point error, each reconstructed face feature point error, and the face feature points of the frame image preceding a given frame image. For example, the preset error model can be Equation 4.
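  • K represents the number of selected positions used to divide the samples in the selected area
  • k represents the label of a selected position in the selected area
  • g_k(I) represents the reconstructed face feature point error corresponding to the type of sample into which a frame image I is divided based on the pair of pixel points at the k-th selected position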
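  • As a sketch of the form Equation 4 takes under the definitions above: writing f(I) for the face feature point error predicted for a frame image I and f_0 for the initial face feature point error (both symbols are assumed here for illustration), and taking every weight as 1, a consistent reading is:

$$
f(I) = f_0 + \sum_{k=1}^{K} g_k(I)
$$

  • Each g_k(I) contributes the reconstructed face feature point error selected by the pixel pair at the k-th selected position, so the sum accumulates one correction per regression tree.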
  • The training process above makes the obtained face feature point error continuously approach the difference between the face feature points of two adjacent frame images.
  • The training process is equivalent to updating the estimated face feature points of the second image based on the reconstructed face feature point error obtained from the previous regression tree in the random forest, and generating the current regression tree based on the updated estimated face feature points.
  • A regression tree can be generated from the pixel points at one selected position; the regression tree is used to divide an image into one of the multiple types of samples according to the preset threshold and the pixel points at that selected position.
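  • The division performed by one such regression tree can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiments: the function name `split_and_fit`, the array layouts, the rule that a sample falls into the first class when the intensity difference of the pixel pair is below the threshold, and the use of the class mean as the reconstructed error are all hypothetical choices made for the example.

```python
import numpy as np

def split_and_fit(images, residuals, pos_a, pos_b, threshold):
    """Split samples by a pixel-pair test and return the mean residual of each class.

    images:    (N, H, W) grayscale second images (hypothetical layout)
    residuals: (N, P, 2) true minus estimated landmark coordinates
    pos_a/b:   (row, col) selected positions inside the selected area
    """
    images = np.asarray(images, dtype=np.float64)
    residuals = np.asarray(residuals, dtype=np.float64)
    # Intensity difference of the pixel pair in every second image.
    diff = images[:, pos_a[0], pos_a[1]] - images[:, pos_b[0], pos_b[1]]
    in_left = diff < threshold  # assumed rule: difference below the threshold
    errors = {}
    for name, mask in (("left", in_left), ("right", ~in_left)):
        # Assumed reconstructed error of a class: mean residual of its samples;
        # an empty class falls back to a zero error.
        if mask.any():
            errors[name] = residuals[mask].mean(axis=0)
        else:
            errors[name] = np.zeros(residuals.shape[1:])
    return in_left, errors
```

  • After such a split, the estimated face feature points of every sample in a class would be moved by that class's error before the next tree is built, which matches the update-then-split order described above.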
  • The preset error model obtained above is based on one random forest. In a possible application scenario, the preset error model can also be obtained based on multiple random forests.
  • In that case, the acquisition process can follow on from step (3) above and continue with the following steps (4)-(5):
  • Step (4) is equivalent to updating the selected area of step 201. For example, the selected area in step 201 is the central area of the second image, and the selected area in step (4) is an edge area of the second image.
  • Step (5): continue performing the steps of dividing the sample set into multiple types of samples based on the preset threshold and a pair of pixel points of the second image of each sample in the sample set at the selected positions of the selected area, and determining the reconstructed face feature point error corresponding to each type of sample, until the reconstructed face feature point error corresponding to each type of sample divided by the pair of pixel points at the selected positions in each selected area has been determined.
  • Step (5) can be understood as continuing to perform steps 201 and 203 until the reconstructed face feature point error corresponding to each type of sample divided by the pair of pixel points at the selected positions in each selected area has been determined. Note that if the terminal performs step 202 between steps 201 and 203, step (5) may instead be: continue performing the steps of dividing the sample set into multiple types of samples based on the preset threshold and a pair of pixel points of the second image of each sample in the sample set at the selected positions of the selected area, determining the initial face feature point error corresponding to the selected area of the second image, and determining the reconstructed face feature point error corresponding to each type of sample, until the reconstructed face feature point error corresponding to each type of sample divided by the pair of pixel points at the selected positions in each selected area has been determined.
  • Taking the area selected the first time as the first area and the area selected the second time as the second area, multiple reconstructed face feature point errors corresponding to the second area may be determined based on the pixel points of the second image of each sample in the sample set at the selected positions of the second area (an area other than the first area), until the reconstructed face feature point errors corresponding to the pixel points in each selected area of the second image have been determined.
  • The terminal may divide the sample set into multiple types of samples based on each pair of pixel points in the second area and determine the reconstructed face feature point error corresponding to each type of sample; the division process is the same as step 201, and the determination process is the same as step 203.
  • The terminal may then determine the reconstructed face feature point errors corresponding to a third area based on the pixel points at the selected positions in the third area, and so on. Once the reconstructed face feature point errors corresponding to a preset number of selected areas have been determined, the preset error model can be obtained based on the initial face feature point errors and the reconstructed face feature point errors corresponding to the selected areas.
  • The terminal may choose multiple selected positions in each area and determine a reconstructed face feature point error for each selected position; see the description of steps (1)-(3).
  • For each selected area, the initial face feature point error corresponding to that area may be determined, and the determination process is the same as step 202. However, for a selected area after the first one (for example, the second selected area), the estimated face feature points have already been updated continuously while the reconstructed face feature point errors corresponding to the previous selected area were being determined. Therefore, when determining the initial face feature point error corresponding to such a selected area, the estimated face feature points of the second image may first be updated based on the reconstructed face feature point error obtained from the last pair of pixel points in the previous selected area (equivalent to updating based on the reconstructed face feature point error obtained from the last regression tree in the previous random forest). That is, the estimated face feature points are obtained according to the fourth coordinate of the face feature points in the first image and the reconstructed face feature point error corresponding to each type of sample in which the sample is located. For example, the reconstructed face feature point error obtained from the last pair of pixel points in the first area is added to the previously updated estimated face feature point coordinates of the second image to obtain the currently updated estimated face feature point coordinates, and the initial face feature point error is then obtained based on the face feature points of the second image and the currently updated estimated face feature points.
  • The initial face feature point error corresponding to a given selected area can be calculated in the same way as in Equation 1, as shown in Equation 5, where:
  • f_0t represents the initial face feature point error corresponding to the t-th selected area, that is, the initial face feature point error of the t-th random forest
  • N represents the number of second images
  • i represents the i-th second image
  • S_ir represents the face feature points of the i-th second image
  • S_ie represents the estimated face feature points of the i-th second image.
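  • Under these definitions, and assuming that the initial error is the average residual over the N second images (the averaging is an assumption of this sketch), Equation 5 can be read as:

$$
f_{0t} = \frac{1}{N} \sum_{i=1}^{N} \left( S_{ir} - S_{ie} \right)
$$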
  • The preset error model can be a weighted form of each initial face feature point error and each reconstructed face feature point error, or a weighted form of each initial face feature point error, each reconstructed face feature point error, and the face feature points of the frame image preceding a given frame image. For example, the preset error model can be Equation 6.
  • T represents the number of selected areas, which is also the number of random forests
  • t represents the label of a selected area, which is also the label of a random forest
  • f_0t represents the initial face feature point error corresponding to the selected area labeled t, which is also the initial face feature point error corresponding to the t-th random forest
  • K represents the number of selected positions used to divide the samples in one selected area, which is also the number of regression trees in one random forest
  • k represents the label of a selected position in a selected area, which is also the label of the k-th regression tree
  • g_k(I) represents the reconstructed face feature point error corresponding to the type of sample into which a frame image I is divided based on the pair of pixel points at the k-th selected position in the t-th selected area
  • the parameters in each of the foregoing preset error models may be set with corresponding weights.
  • the weight of each parameter is 1 in the embodiment of the present application.
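  • With every weight equal to 1, one way to write Equation 6 that is consistent with the definitions of T, t, f_0t, K, k and g_k(I) is the following sketch, in which f(I) (the predicted face feature point error for a frame image I) and the superscript (t) marking the selected area are notation assumed here:

$$
f(I) = \sum_{t=1}^{T} \left( f_{0t} + \sum_{k=1}^{K} g_k^{(t)}(I) \right)
$$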
  • steps 201-204 are described by taking an example of calculating a preset error model in real time.
  • the embodiment of the present application does not limit the timing of acquiring the preset error model.
  • the face feature point tracking may also be performed based on the preset error model that has been acquired in advance, and the previously acquired preset error model may be obtained by referring to the above steps 201-204.
  • steps 205-207 are the face feature point tracking process when the preset error model is applied:
  • There are various ways to acquire the face feature points of the previous frame image; for example, they may be obtained by tracking with the face feature point tracking method of the embodiments of the present application.
  • The face feature points can also be obtained with a face feature point tracking method such as the supervised descent method or the incremental learning method.
  • 206. Obtain a face feature point error between the to-be-tracked frame image and the previous frame image based on the preset error model and the pixel points in the to-be-tracked frame image, where the face feature point error refers to the difference between a first coordinate and a second coordinate, the first coordinate is the coordinate of a face feature point in the to-be-tracked frame image, and the second coordinate is the coordinate of the face feature point at the corresponding position in the previous frame image. The preset error model is trained on the face feature points of multiple pairs of adjacent frame images and is used to indicate the relationship between the pixel points of the later frame image of a pair of adjacent frame images and the face feature point error.
  • the frame image to be tracked refers to any frame image after the first frame image acquired by the terminal.
  • the to-be-tracked frame image refers to a frame image currently captured by the terminal, or one frame image of any video currently played by the terminal, or one frame image of a certain video stored by the terminal.
  • Since each type of sample corresponds to its own reconstructed face feature point error, the terminal needs to determine, according to the preset error model, the multiple selected positions used for dividing the samples (such as the first position and the second position) in the to-be-tracked frame image. For each selected position, it determines, based on the pair of pixel points at that position and the preset threshold, which type of sample the to-be-tracked frame image is divided into, and selects the reconstructed face feature point error corresponding to that type of sample. Based on the pixel points at the respective selected positions, the terminal thus selects multiple reconstructed face feature point errors, and weights each initial face feature point error and the selected reconstructed face feature point errors to obtain the face feature point error between the to-be-tracked frame image and the previous frame image. The resulting face feature point error therefore expresses the difference between the face feature points of the to-be-tracked frame image and those of the previous frame image.
  • The terminal may take the pixel points in the to-be-tracked frame image as the independent variable I of the preset error model, input them into the preset error model, and obtain the face feature point error that the model outputs for I.
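  • The evaluation described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiments: the `Stump` and `Forest` containers, the `predict_error` function, and the rule that a frame falls into the first class when the intensity difference of the pixel pair is below the threshold are all assumptions made for the example, and every weight is taken as 1.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Stump:
    """One regression tree reduced to a single pixel-pair test (hypothetical)."""
    pos_a: tuple            # (row, col) of the first pixel of the pair
    pos_b: tuple            # (row, col) of the second pixel of the pair
    threshold: float        # preset threshold
    err_left: np.ndarray    # (P, 2) reconstructed error when diff < threshold
    err_right: np.ndarray   # (P, 2) reconstructed error otherwise

@dataclass
class Forest:
    """One selected area: an initial error plus its regression trees."""
    initial_error: np.ndarray  # (P, 2)
    stumps: list

def predict_error(image, forests):
    """Sum initial and selected reconstructed errors over all forests."""
    image = np.asarray(image, dtype=np.float64)
    total = np.zeros_like(forests[0].initial_error, dtype=np.float64)
    for forest in forests:
        total += forest.initial_error
        for s in forest.stumps:
            # Decide which class the frame falls into at this selected position.
            diff = image[s.pos_a] - image[s.pos_b]
            total += s.err_left if diff < s.threshold else s.err_right
    return total
```

  • The returned array has one (dx, dy) row per face feature point, so it can be added directly to the coordinates of the previous frame image.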
  • The terminal may first determine the face area in the to-be-tracked frame image and then track the face feature points. To improve the efficiency of face feature point tracking and keep the tracking process real-time, the terminal can exploit the continuity between the to-be-tracked frame image and the previous frame image: it determines the face area in the to-be-tracked frame image based on the second coordinates of the face feature points in the previous frame image, and obtains the face feature point error between the to-be-tracked frame image and the previous frame image based on the pixel points in that face area and the preset error model.
  • When determining the face area, the terminal may determine, according to the second coordinates of the face feature points in the previous frame image, the boundary of the area enclosed by those coordinates in the previous frame image, and determine the face area based on that boundary. For example, it determines the center position within the boundary and uses a square area of a preset size centered on that position as the face area in the to-be-tracked frame image.
  • The embodiments of the present application do not limit the preset size.
  • For example, the preset size is a size that differs from the size of the area boundary by a preset value, or is a fixed size.
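  • The face-area construction above can be sketched as follows. The function name `face_region`, the `margin` parameter (standing in for the preset value by which the square differs from the boundary size), and the choice of the longer boundary side as the boundary size are assumptions of this example.

```python
import numpy as np

def face_region(prev_points, margin=10.0):
    """Return (x0, y0, x1, y1) of a square face area for the next frame.

    prev_points: (P, 2) array-like of (x, y) face feature point coordinates
                 from the previous frame image.
    margin:      hypothetical preset value added to the boundary size.
    """
    pts = np.asarray(prev_points, dtype=np.float64)
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    # Center of the boundary enclosing the previous-frame feature points.
    cx, cy = (x_min + x_max) / 2.0, (y_min + y_max) / 2.0
    # Square side: the longer boundary side plus the preset margin.
    half = (max(x_max - x_min, y_max - y_min) + margin) / 2.0
    return (cx - half, cy - half, cx + half, cy + half)
```

  • In practice the returned box would also need clipping to the image bounds before pixels are read from it; that step is omitted here.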
  • The terminal may also detect the face area in the to-be-tracked frame image using other face detection algorithms, such as a neural-network-based face detection algorithm or an active-contour-based face detection algorithm.
  • The above method of determining the face area based on the previous frame image uses the face feature points that have just been acquired. It not only determines the face area of the current frame simply and in real time, but also makes full use of the data obtained during tracking, which improves data utilization.
  • 207. Obtain the face feature points of the to-be-tracked frame image based on the face feature points of the previous frame image and the face feature point error.
  • The terminal may determine, according to the face feature point error obtained in step 206, the offset of the first coordinate of each face feature point in the to-be-tracked frame image relative to the second coordinate of that face feature point in the previous frame image, and obtain the first coordinate of the face feature point in the to-be-tracked frame image based on the second coordinate of the face feature point in the previous frame image and the determined offset.
  • The terminal may add the second coordinate of each face feature point of the previous frame image to the difference coordinate at the corresponding position in the face feature point error to obtain the first coordinate of each face feature point in the to-be-tracked frame image. For example, coordinates that carry the same label in the face feature point error and in the previous frame image are treated as coordinates at corresponding positions. Assuming that the difference coordinate labeled 1 in the face feature point error is (X3-X4, Y3-Y4) and the second coordinate of the face feature point labeled 1 in the previous frame image is (X4, Y4), adding the two gives (X3, Y3), which is the first coordinate of the face feature point labeled 1 in the to-be-tracked frame image. By analogy, the first coordinate of every face feature point of the to-be-tracked frame image can be obtained.
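  • The label-wise addition above amounts to an element-wise sum of two coordinate arrays. The sketch below assumes that row i of both arrays carries the same label; the function name `track_points` is hypothetical.

```python
import numpy as np

def track_points(prev_points, point_error):
    """Add the per-label difference coordinates to the previous-frame coordinates.

    prev_points: (P, 2) second coordinates from the previous frame image.
    point_error: (P, 2) difference coordinates (first minus second coordinate).
    """
    prev = np.asarray(prev_points, dtype=np.float64)
    err = np.asarray(point_error, dtype=np.float64)
    if prev.shape != err.shape:
        raise ValueError("coordinates and errors must have the same shape")
    return prev + err
```

  • For instance, a previous coordinate (4, 5) with a difference coordinate (-1, 2) yields the first coordinate (3, 7).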
  • The embodiments of the present application may also track face feature points in combination with any existing face feature point tracking method. For example, the face feature points of the to-be-tracked frame image are obtained with each face feature point tracking method, and the final face feature points of the to-be-tracked frame image are determined from these results and the weight corresponding to each face feature point tracking method.
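  • One way to combine several tracking methods by weight is a weighted average of their estimates, sketched below. The function name `fuse_estimates` and the normalization of the weights so that they sum to 1 are assumptions of this example.

```python
import numpy as np

def fuse_estimates(estimates, weights):
    """Weighted average of face feature point estimates from several methods.

    estimates: (M, P, 2) array-like, one (P, 2) estimate per tracking method.
    weights:   (M,) array-like of non-negative method weights.
    """
    est = np.asarray(estimates, dtype=np.float64)
    w = np.asarray(weights, dtype=np.float64)
    if est.shape[0] != w.shape[0]:
        raise ValueError("one weight is required per tracking method")
    if w.sum() <= 0:
        raise ValueError("weights must sum to a positive value")
    w = w / w.sum()  # assumed: weights are normalized before averaging
    # Contract the method axis: result has shape (P, 2).
    return np.tensordot(w, est, axes=1)
```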
  • Images may also be pre-processed or post-processed during face feature point tracking so that the tracked face feature points are more accurate; for example, noise reduction is applied to the image before tracking, or the image is smoothed after tracking using a smoothing algorithm such as Kalman filtering or the optical flow method.
  • the embodiments of the present application can be applied to various scenarios in which face feature point tracking is required.
  • Take the monitoring of face images captured by the terminal in real time as an example.
  • The embodiments of the present application provide a schematic flowchart of face feature point tracking.
  • The tracking process is described using the tracking of face feature points across adjacent frame images in a video as an example.
  • When the terminal tracks the first frame image, face detection is first performed to obtain the face area within the face detection box, and the face feature points in the face area are estimated using a single-frame aligner. The single-frame aligner can be any face feature point tracking model built with a face feature point tracking method that works on a single frame image.
  • The terminal may then obtain the boundary enclosed by the estimated face feature points.
  • When tracking a frame image after the first frame, the terminal may perform face detection based on the face feature point boundary of the previous frame image, update the face feature point boundary in the to-be-tracked frame image, determine the face area in the to-be-tracked frame image based on the updated boundary, and estimate the face feature points of the to-be-tracked frame image based on the preset error model. After estimating the face feature points in a frame image, the terminal may determine whether to continue tracking based on a preset tracking condition; if so, it continues to track the face feature points of the next frame image based on the face feature point boundary that has been obtained.
  • the preset tracking condition is used as a condition for whether to continue tracking.
  • the preset tracking condition is not limited in the embodiment of the present application.
  • the preset tracking condition may be a preset tracking duration. If the tracking duration does not reach the preset tracking duration, the tracking continues. If the tracking duration has reached the preset tracking duration, the tracking process ends.
  • The preset error model can also be trained by a server, which then sends the obtained preset error model to the terminal.
  • By obtaining the face feature point error between the to-be-tracked frame image and the previous frame image, the face feature points in the to-be-tracked frame image can be obtained. Since the face feature points of adjacent frame images change continuously, using the face feature points of the previous frame image as a reference allows the face feature points of the to-be-tracked frame image to be estimated more accurately.
  • Because the face feature points of the to-be-tracked frame image remain consistent with those of the previous frame image, accurate face feature points can be tracked, and the tracking method therefore has good robustness.
  • FIG. 6 is a block diagram of a face feature point tracking apparatus according to an embodiment of the present application.
  • the apparatus includes:
  • a first acquiring module 601 configured to acquire a facial feature point in a previous frame image of the to-be-tracked frame image
  • the second obtaining module 602 is configured to obtain a face feature point error between the to-be-tracked frame image and the previous frame image based on the preset error model and the pixel points in the to-be-tracked frame image, where the face feature point error refers to the difference between a first coordinate and a second coordinate, the first coordinate is the coordinate of a face feature point in the to-be-tracked frame image, and the second coordinate is the coordinate of the face feature point at the corresponding position in the previous frame image; the preset error model is trained on the face feature points of multiple pairs of adjacent frame images and is used to indicate the relationship between the pixel points of the later frame image of a pair of adjacent frame images and the face feature point error;
  • the tracking module 603 is configured to obtain a facial feature point of the to-be-tracked frame image based on the facial feature point and the facial feature point error of the previous frame image.
  • the tracking module 603 is configured to: determine, according to a facial feature point error, an offset of a first coordinate of each facial feature point with respect to a second coordinate; based on a face in the image of the previous frame The second coordinate of the feature point and the determined offset amount obtain the first coordinate of the face feature point in the frame image to be tracked.
  • the device further includes:
  • the segmentation module 604 is configured to divide the sample set into multiple types of samples based on a preset threshold and a pair of pixel points of the second image of each sample in the sample set at the selected positions of the selected area, where each sample in the sample set includes the face feature points of a first image, the earlier of two adjacent frame images, and the face feature points of a second image, the later of the two;
  • the first determining module 605 is configured to determine the reconstructed face feature point error corresponding to each type of sample, where the reconstructed face feature point error corresponding to a type of sample is used to indicate the difference between the third coordinate of the face feature points of the second images in that type of sample and the estimated face feature point coordinates, and the estimated face feature point coordinates are determined based on the face feature points of the first images in that type of sample;
  • the third obtaining module 606 is configured to obtain a preset error model based on the reconstructed facial feature point error corresponding to each type of sample.
  • the device further includes:
  • an update module configured to: before the third obtaining module 606 obtains the preset error model based on the reconstructed face feature point error corresponding to each type of sample, update, for each type of sample, the estimated face feature points of each second image in that type of sample based on the reconstructed face feature point error corresponding to that type of sample;
  • a first selection module for reselecting a location in the selected area as the selected location
  • a first loop module configured to continue performing the operations of dividing the sample set into multiple types of samples based on the preset threshold and a pair of pixel points of the second image of each sample in the sample set at the selected positions of the selected area, and determining the reconstructed face feature point error corresponding to each type of sample, until the reconstructed face feature point error corresponding to each type of sample divided by the pair of pixel points at each selected position has been determined.
  • the device further includes:
  • a second selecting module configured to reselect an area in the sample as the selected area before the third obtaining module 606 obtains the preset error model based on the reconstructed facial feature point error corresponding to each type of sample;
  • a second loop module configured to continue performing the operations of dividing the sample set into multiple types of samples based on the preset threshold and a pair of pixel points of the second image of each sample in the sample set at the selected positions of the selected area, and determining the reconstructed face feature point error corresponding to each type of sample, until the reconstructed face feature point error corresponding to each type of sample divided by the pair of pixel points at the selected positions in each selected area has been determined.
  • the device further includes:
  • the second determining module 607 is configured to determine an initial facial feature point error corresponding to the selected area of the second image, where the initial facial feature point error is used to indicate the third coordinate of the facial feature point of the second image and the estimated face The difference between the feature point coordinates;
  • the third obtaining module 606 is configured to obtain a preset error model based on the initial facial feature point error and the reconstructed facial feature point error corresponding to each sample.
  • the segmentation module 604 is further configured to: divide the sample set in different ways based on the preset threshold and multiple pairs of pixel points of the second image of each sample at different positions in the selected area, to obtain the multiple types of samples under each division method;
  • determine the segmentation purity of each division method based on the face feature points of the multiple types of samples under that division method, where the segmentation purity is used to indicate the similarity between the samples within each type of sample under a division method;
  • select a division method whose segmentation purity meets a preset condition, use the multiple types of samples under that division method as the finally obtained multiple types of samples, and use the positions of the pair of pixel points corresponding to that division method as the selected positions.
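  • The selection among division methods can be sketched as follows. The purity measure used here (the negative within-class sum of squared deviations of the face feature point data, so that more similar classes score higher), the preset condition (highest purity), the below-threshold split rule, and the names `purity` and `best_split` are all assumptions of this example rather than definitions taken from the embodiments.

```python
import numpy as np

def purity(targets, mask):
    """Negative within-class sum of squared deviations (assumed purity measure)."""
    total = 0.0
    for part in (targets[mask], targets[~mask]):
        if len(part):
            total += ((part - part.mean(axis=0)) ** 2).sum()
    return -total

def best_split(images, targets, candidate_pairs, threshold):
    """Try every candidate pixel pair and keep the split with the highest purity.

    images:          (N, H, W) second images (hypothetical layout)
    targets:         (N, P, 2) face feature point data compared within a class
    candidate_pairs: iterable of ((row, col), (row, col)) positions
    Returns (score, (pos_a, pos_b), mask), or None if there are no candidates.
    """
    images = np.asarray(images, dtype=np.float64)
    targets = np.asarray(targets, dtype=np.float64)
    best = None
    for pos_a, pos_b in candidate_pairs:
        diff = images[:, pos_a[0], pos_a[1]] - images[:, pos_b[0], pos_b[1]]
        mask = diff < threshold  # assumed rule: difference below the threshold
        score = purity(targets, mask)
        if best is None or score > best[0]:
            best = (score, (pos_a, pos_b), mask)
    return best
```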
  • the second obtaining module 602 is further configured to: determine a face region in the to-be-tracked frame image based on the second coordinate of the face feature point in the previous frame image; based on the preset error model and A pixel point in the face area obtains a face feature point error of the image to be tracked and the image of the previous frame.
  • When the face feature point tracking apparatus provided by the foregoing embodiment tracks face feature points, the division into the functional modules above is used only as an example. In practical applications, the functions may be allocated to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above.
  • the face feature point tracking device and the face feature point tracking method embodiment are provided in the same concept, and the implementation process thereof is described in detail in the method embodiment, and details are not described herein again.
  • FIG. 9 is a schematic structural diagram of a terminal according to an embodiment of the present disclosure.
  • the terminal may be used to perform the method for tracking a face feature point provided in the foregoing embodiments.
  • the terminal 900 may include an RF (Radio Frequency) circuit 110, a memory 120 including one or more computer readable storage media, an input unit 130, a display unit 140, a sensor 150, an audio circuit 160, a WiFi (Wireless Fidelity) module 170, a processor 180 including one or more processing cores, a power supply 190, and other components. It will be understood by those skilled in the art that the terminal structure shown in FIG. 9 does not constitute a limitation on the terminal, which may include more or fewer components than those illustrated, combine some components, or use a different component arrangement. Specifically:
  • the RF circuit 110 can be used to receive and transmit signals while sending and receiving information or during a call. Specifically, after receiving downlink information from a base station, it passes the information to one or more processors 180 for processing; in addition, it sends uplink data to the base station.
  • the RF circuit 110 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a Subscriber Identity Module (SIM) card, a transceiver, a coupler, an LNA (Low Noise Amplifier), a duplexer, etc.
  • RF circuitry 110 can also communicate with the network and other devices via wireless communication.
  • the wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System of Mobile communication), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), e-mail, SMS (Short Messaging Service), and the like.
  • the memory 120 can be used to store software programs and modules, and the processor 180 executes various functional applications and data processing by running software programs and modules stored in the memory 120.
  • the memory 120 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may be stored according to The data created by the use of the terminal 900 (such as audio data, phone book, etc.) and the like.
  • memory 120 can include high speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, memory 120 may also include a memory controller to provide access to memory 120 by processor 180 and input unit 130.
  • the input unit 130 can be configured to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function controls.
  • the input unit 130 can include a touch-sensitive surface 131 and other input devices 132.
  • The touch-sensitive surface 131, also referred to as a touch display or trackpad, can collect touch operations performed by the user on or near it (such as operations performed with a finger, a stylus, or any other suitable object or accessory on or near the touch-sensitive surface 131) and drive the corresponding connected device according to a preset program.
  • The touch-sensitive surface 131 can include two parts: a touch detection device and a touch controller.
  • The touch detection device detects the touch position of the user, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends the coordinates to the processor 180, and can receive and execute commands from the processor 180.
  • the touch-sensitive surface 131 can be implemented in various types such as resistive, capacitive, infrared, and surface acoustic waves.
  • the input unit 130 can also include other input devices 132.
  • the other input devices 132 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackballs, mice, joysticks, and the like.
  • the display unit 140 can be used to display information entered by the user or information provided to the user and various graphical user interfaces of the terminal 900, which can be composed of graphics, text, icons, video, and any combination thereof.
  • the display unit 140 may include a display panel 141.
  • the display panel 141 may be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or the like.
  • The touch-sensitive surface 131 may cover the display panel 141. When the touch-sensitive surface 131 detects a touch operation on or near it, it passes the operation to the processor 180 to determine the type of the touch event, and the processor 180 then provides a corresponding visual output on the display panel 141 according to the type of the touch event.
  • Although the touch-sensitive surface 131 and the display panel 141 are implemented as two separate components to provide the input and output functions, in some embodiments the touch-sensitive surface 131 can be integrated with the display panel 141 to provide the input and output functions.
  • Terminal 900 can also include at least one type of sensor 150, such as a light sensor, motion sensor, and other sensors.
  • The light sensor may include an ambient light sensor and a proximity sensor, where the ambient light sensor may adjust the brightness of the display panel 141 according to the brightness of the ambient light, and the proximity sensor may turn off the display panel 141 and/or the backlight when the terminal 900 is moved to the ear.
  • The gravity acceleration sensor can detect the magnitude of acceleration in all directions (usually three axes). When it is stationary, it can detect the magnitude and direction of gravity.
  • The terminal 900 can also be configured with gyroscopes, barometers, hygrometers, thermometers, infrared sensors, and other sensors, which are not described in detail here.
  • the audio circuit 160, the speaker 161, and the microphone 162 can provide an audio interface between the user and the terminal 900.
  • The audio circuit 160 can convert received audio data into an electrical signal and transmit it to the speaker 161, which converts it into a sound signal for output; conversely, the microphone 162 converts a collected sound signal into an electrical signal, which the audio circuit 160 receives and converts into audio data. The audio data is then output to the processor 180 for processing and sent, for example, to another terminal via the RF circuit 110, or output to the memory 120 for further processing.
  • the audio circuit 160 may also include an earbud jack to provide communication of the peripheral earphones with the terminal 900.
  • WiFi is a short-range wireless transmission technology
  • the terminal 900 can help users to send and receive emails, browse web pages, and access streaming media through the WiFi module 170, which provides wireless broadband Internet access for users.
  • Although FIG. 9 shows the WiFi module 170, it is not an essential component of the terminal 900 and may be omitted as needed without changing the essence of the application.
  • The processor 180 is the control center of the terminal 900. It connects the various parts of the entire mobile phone through various interfaces and lines, and performs the functions of the terminal 900 and processes data by running or executing the software programs and/or modules stored in the memory 120 and calling the data stored in the memory 120, thereby monitoring the mobile phone as a whole.
  • the processor 180 may include one or more processing cores; preferably, the processor 180 may integrate an application processor and a modem processor, where the application processor mainly processes an operating system, a user interface, an application, and the like.
  • the modem processor primarily handles wireless communications. It can be understood that the above modem processor may not be integrated into the processor 180.
  • the terminal 900 also includes a power source 190 (such as a battery) for powering various components.
  • the power source can be logically coupled to the processor 180 through a power management system to manage functions such as charging, discharging, and power management through the power management system.
  • Power supply 190 may also include any one or more of a DC or AC power source, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
  • the terminal 900 may further include a camera, a Bluetooth module, and the like, and details are not described herein again.
  • The display unit of the terminal is a touch screen display, and the terminal further includes a memory and one or more programs, where the one or more programs are stored in the memory and configured to be executed by one or more processors.
  • The one or more programs include instructions for: acquiring the face feature points in the previous frame image of a frame image to be tracked; acquiring, based on a preset error model and the pixels in the frame image to be tracked, the face feature point error between the frame image to be tracked and the previous frame image, where the face feature point error is the difference between first coordinates and second coordinates, the first coordinates are the coordinates of the face feature points in the frame image to be tracked, the second coordinates are the coordinates of the face feature points at the corresponding positions in the previous frame image, and the preset error model is trained from the face feature points of multiple pairs of adjacent frame images and indicates the relationship between the pixels of the later image of a pair of adjacent frame images and the face feature point error; and obtaining the face feature points of the frame image to be tracked based on the face feature points of the previous frame image and the face feature point error.
  • The one or more programs further include instructions for: determining, based on the face feature point error, the offset of the first coordinates of each face feature point relative to the second coordinates; and obtaining the first coordinates of the face feature points in the frame image to be tracked based on the second coordinates of the face feature points in the previous frame image and the determined offsets.
  • The one or more programs further include instructions for: splitting the sample set into multiple classes of samples based on a preset threshold and a pair of pixels at a selected position in a selected region of the second image in each sample of the sample set, where each sample in the sample set includes the face feature points of a first image, which is the earlier image of a pair of adjacent frame images, and the face feature points of a second image, which is the later image; determining the reconstructed face feature point error corresponding to each class of samples, where the reconstructed face feature point error indicates the difference between the third coordinates of the face feature points of the second images in a class and the estimated face feature point coordinates, and the estimated face feature point coordinates are determined based on the face feature points of the first images in that class; and obtaining the preset error model based on the reconstructed face feature point errors corresponding to the classes of samples.
  • The one or more programs further include instructions for: before obtaining the preset error model based on the reconstructed face feature point errors corresponding to the classes of samples, for each class of samples, updating the estimated face feature points of each second image in the class based on the reconstructed face feature point error corresponding to that class; selecting a new position in the selected region as the selected position; and continuing to perform the steps of splitting the sample set into multiple classes of samples based on the preset threshold and a pair of pixels at the selected position in the selected region of the second image in each sample, and determining the reconstructed face feature point error corresponding to each class, stopping once the reconstructed face feature point errors of the classes split on the pair of pixels at each selected position have been determined.
  • The one or more programs further include instructions for: before obtaining the preset error model based on the reconstructed face feature point errors corresponding to the classes of samples, selecting a new region in the sample as the selected region; and continuing to perform the steps of splitting the sample set into multiple classes of samples based on the preset threshold and a pair of pixels at the selected position in the selected region of the second image in each sample, and determining the reconstructed face feature point error corresponding to each class, stopping once the reconstructed face feature point errors of the classes split on the pair of pixels at the selected positions in each selected region have been determined.
  • The one or more programs further include instructions for: determining the initial face feature point error corresponding to the selected region of the second image, where the initial face feature point error indicates the difference between the third coordinates of the face feature points of the second image and the estimated face feature point coordinates; in this case, obtaining the preset error model based on the reconstructed face feature point errors corresponding to the classes of samples includes: obtaining the preset error model based on the initial face feature point error and the reconstructed face feature point errors corresponding to the classes of samples.
  • The one or more programs further include instructions for: splitting the sample set in different ways, based on the preset threshold and multiple pairs of pixels at different selected positions within one selected region of the second image in each sample, to obtain the multiple classes of samples under each split; determining the split purity of each split based on the face feature points of the classes under that split, where the split purity indicates the similarity among the samples within each class under a split; and selecting a split whose split purity meets a preset condition, taking the classes under that split as the final multiple classes of samples and the position of the pair of pixels corresponding to that split as the selected position.
  • The one or more programs further include instructions for: determining a face region in the frame image to be tracked based on the second coordinates of the face feature points in the previous frame image; and acquiring, based on the preset error model and the pixels in the face region, the face feature point error between the frame image to be tracked and the previous frame image.
  • FIG. 10 is a schematic structural diagram of a server according to an embodiment of the present application.
  • the server includes a processing component 1022 that further includes one or more processors, and memory resources represented by memory 1032 for storing instructions executable by processing component 1022, such as an application.
  • the application stored in the memory 1032 may include one or more programs.
  • processing component 1022 is configured to execute instructions.
  • the server may also include a power component 1026 configured to perform power management of the server, a wired or wireless network interface 1050 configured to connect the server to the network, and an input/output (I/O) interface 1058.
  • the server can operate based on an operating system stored in the memory 1032, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM or the like.
  • The one or more programs stored in the memory 1032 include instructions for the same operations as the one or more programs of the terminal 900 described above: acquiring the face feature points in the previous frame image of the frame image to be tracked; acquiring the face feature point error based on the preset error model and the pixels in the frame image to be tracked; obtaining the face feature points of the frame image to be tracked based on the face feature points of the previous frame image and the face feature point error; and, further, the offset determination, sample set splitting, reconstructed and initial face feature point error determination, split purity selection, and face region determination operations.
  • a computer readable storage medium having stored therein at least one instruction executable by a processor to perform the above-described facial feature point tracking method.
  • The computer readable storage medium may be, for example, a ROM (Read Only Memory), a RAM (Random Access Memory), a CD-ROM (Compact Disc Read Only Memory), a magnetic tape, a floppy disk, or an optical data storage device.
  • A person skilled in the art may understand that all or some of the steps of the above embodiments may be implemented by hardware, or by a program instructing related hardware, and the program may be stored in a computer readable storage medium.
  • the storage medium mentioned may be a read only memory, a magnetic disk or an optical disk or the like.
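The tracking operations recited above (deriving a face region from the previous frame's second coordinates, then adding the per-point offsets to obtain the first coordinates) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of this application: the names `face_region` and `apply_offsets`, the axis-aligned bounding box with a `margin`, and the list-of-tuples layout are all hypothetical choices made for the example.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def face_region(prev_points: List[Point], margin: float = 0.0,
                image_size: Optional[Tuple[int, int]] = None) -> Box:
    """Bounding box of the previous frame's points, grown by `margin` pixels.

    Assumption of this sketch: the face region is an axis-aligned box.
    If image_size=(width, height) is given, the box is clipped to the image.
    """
    xs = [x for x, _ in prev_points]
    ys = [y for _, y in prev_points]
    left, top = min(xs) - margin, min(ys) - margin
    right, bottom = max(xs) + margin, max(ys) + margin
    if image_size is not None:
        width, height = image_size
        left, top = max(0, left), max(0, top)
        right, bottom = min(width, right), min(height, bottom)
    return (left, top, right, bottom)


def apply_offsets(prev_points: List[Point], offsets: List[Point]) -> List[Point]:
    """First coordinates = second coordinates + offset, point by point."""
    if len(prev_points) != len(offsets):
        raise ValueError("one offset is required per face feature point")
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(prev_points, offsets)]


prev_points = [(10.0, 20.0), (30.0, 40.0), (20.0, 50.0)]
region = face_region(prev_points, margin=5.0, image_size=(32, 100))
offsets = [(1.0, -2.0), (0.5, 0.0), (0.0, 0.25)]  # made-up error values
tracked = apply_offsets(prev_points, offsets)
```

In this sketch the offsets are supplied by hand; in the described method they would come from the preset error model evaluated on pixels inside the face region.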

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

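This application discloses a face feature point tracking method, apparatus, storage medium, and device, belonging to the field of image recognition technology. The method includes: acquiring the face feature points in the previous frame image of a frame image to be tracked; acquiring a face feature point error based on a preset error model and the pixels in the frame image to be tracked, where the face feature point error is the difference between first coordinates and second coordinates, the first coordinates are the coordinates of the face feature points in the frame image to be tracked, the second coordinates are the coordinates of the face feature points at the corresponding positions in the previous frame image, and the preset error model indicates the relationship between the pixels of the later image of a pair of adjacent frame images and the face feature point error; and obtaining the face feature points of the frame image to be tracked based on the face feature points of the previous frame image and the face feature point error. By using the face feature points of the previous frame image as a reference, this application can estimate the face feature points of the frame image to be tracked more accurately.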

Description

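Face feature point tracking method, apparatus, storage medium, and device
This application claims priority to Chinese Patent Application No. 201710473506.8, entitled "Face feature point tracking method and apparatus" and filed with the China National Intellectual Property Administration on June 21, 2017, which is incorporated herein by reference in its entirety.
Technical Field
Embodiments of this application relate to the field of image recognition technology, and in particular to a face feature point tracking method, apparatus, storage medium, and device.
Background
Image recognition technology emerged to simulate the way humans recognize images. It is a technology in which a computer processes, analyzes, and understands images; it is an important field of artificial intelligence and is widely used in scenarios such as tracking face feature points, filtering spam images, and matching landforms and terrain.
Taking face feature point tracking as an example, reference face feature points can be obtained from a large number of sample images with annotated face feature points, and a feature point tracking model can be obtained from the reference face feature points, so that the model reflects the relationship between the face feature points in any image and the reference face feature points. The face feature points of the current image can then be obtained based on the feature point tracking model.
In practice, the face feature points in consecutive frames of a video usually differ and change continuously. However, when the related art tracks face feature points across consecutive frames, the face feature points of every frame are derived from the reference face feature points. As a result, tracking is severely limited, and the tracked face feature points cannot accurately express the real facial features.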
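Summary
Embodiments of this application provide a face feature point tracking method, apparatus, storage medium, and device, which can solve the problem that, because the face feature points of every frame are derived from reference face feature points, tracking is severely limited and the tracked face feature points cannot accurately express the real facial features. The technical solutions are as follows:
In one aspect, a face feature point tracking method is provided for use in an electronic device, the method including:
acquiring the face feature points in the previous frame image of a frame image to be tracked;
acquiring, based on a preset error model and the pixels in the frame image to be tracked, the face feature point error between the frame image to be tracked and the previous frame image, where the face feature point error is the difference between first coordinates and second coordinates, the first coordinates are the coordinates of the face feature points in the frame image to be tracked, the second coordinates are the coordinates of the face feature points at the corresponding positions in the previous frame image, and the preset error model is trained from the face feature points of multiple pairs of adjacent frame images and indicates the relationship between the pixels of the later image of a pair of adjacent frame images and the face feature point error; and
obtaining the face feature points of the frame image to be tracked based on the face feature points of the previous frame image and the face feature point error.
In one aspect, a face feature point tracking apparatus is provided, the apparatus including:
a first acquisition module, configured to acquire the face feature points in the previous frame image of a frame image to be tracked;
a second acquisition module, configured to acquire, based on a preset error model and the pixels in the frame image to be tracked, the face feature point error between the frame image to be tracked and the previous frame image, where the face feature point error is the difference between first coordinates and second coordinates, the first coordinates are the coordinates of the face feature points in the frame image to be tracked, the second coordinates are the coordinates of the face feature points at the corresponding positions in the previous frame image, and the preset error model is trained from the face feature points of multiple pairs of adjacent frame images and indicates the relationship between the pixels of the later image of a pair of adjacent frame images and the face feature point error; and
a tracking module, configured to obtain the face feature points of the frame image to be tracked based on the face feature points of the previous frame image and the face feature point error.
In one aspect, a computer-readable storage medium is provided, storing at least one instruction that is loaded and executed by a processor to implement the face feature point tracking method described above.
In one aspect, an electronic device is provided, including: one or more processors; and a memory; the memory stores one or more programs configured to be executed by the one or more processors, and the one or more programs include instructions for:
acquiring the face feature points in the previous frame image of a frame image to be tracked;
acquiring, based on a preset error model and the pixels in the frame image to be tracked, the face feature point error between the frame image to be tracked and the previous frame image, where the face feature point error is the difference between first coordinates and second coordinates, the first coordinates are the coordinates of the face feature points in the frame image to be tracked, the second coordinates are the coordinates of the face feature points at the corresponding positions in the previous frame image, and the preset error model is trained from the face feature points of multiple pairs of adjacent frame images and indicates the relationship between the pixels of the later image of a pair of adjacent frame images and the face feature point error; and
obtaining the face feature points of the frame image to be tracked based on the face feature points of the previous frame image and the face feature point error.
In the embodiments of this application, the face feature points in the previous frame image are acquired; the face feature point error between the frame image to be tracked and the previous frame image is obtained based on the preset error model and the pixels in the frame image to be tracked; and the face feature points in the frame image to be tracked are obtained based on the face feature points of the previous frame image and the face feature point error. Because the face feature points of adjacent frame images change continuously, using the face feature points of the previous frame image as a reference allows the face feature points of the frame image to be tracked to be estimated more accurately.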
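Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of this application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an implementation environment for face feature point tracking according to an embodiment of this application;
FIG. 2 is a flowchart of a face feature point tracking method according to an embodiment of this application;
FIG. 3 is a schematic diagram of face feature points according to an embodiment of this application;
FIG. 4 is a schematic diagram of a data structure according to an embodiment of this application;
FIG. 5 is a schematic flow diagram of face feature point tracking according to an embodiment of this application;
FIG. 6 is a block diagram of a face feature point tracking apparatus according to an embodiment of this application;
FIG. 7 is a block diagram of a face feature point tracking apparatus according to an embodiment of this application;
FIG. 8 is a block diagram of a face feature point tracking apparatus according to an embodiment of this application;
FIG. 9 is a schematic structural diagram of a terminal according to an embodiment of this application;
FIG. 10 is a schematic structural diagram of a server according to an embodiment of this application.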
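Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the implementations of this application are described in further detail below with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of an implementation environment for face feature point tracking according to an embodiment of this application. Referring to FIG. 1, the implementation environment includes:
at least one server 101 and at least one terminal 102. The server 101 may be a server that provides an image service, and the terminal 102 may be a terminal of a user served by the server. In one possible application scenario, the terminal 102 may install an image application, social application, game application, or the like provided by the server 101, so that the terminal 102 can interact with the server 101 through the installed application.
In the embodiments of this application, the server 101 may obtain a preset error model and send it to the terminal 102, so that the terminal 102 can store the preset error model and, when using the face feature point tracking function of the application, track face feature points based on the preset error model.
In addition, the server 101 may be configured with at least one database, such as a face image database and a user database. The face image database stores face images, annotated face feature points in the face images, simulated face feature points of the previous frame image of a face image, and the like; the user database stores personal data, such as user names and passwords, of the users served by the server 101.
FIG. 2 is a flowchart of a face feature point tracking method according to an embodiment of this application. Referring to FIG. 2, the method can be applied to any electronic device, such as a server or a terminal. Taking a terminal as the execution body, the method may include the following model training process and model application process, where steps 201-204 are the process of training the preset error model from multiple pairs of adjacent frame images, and steps 205-207 are the face feature point tracking process in which the preset error model is applied: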
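201. Split the sample set into multiple classes of samples based on a preset threshold and a pair of pixels at a selected position in a selected region of each sample in the sample set, where each sample in the sample set includes the face feature points of a first image, which is the earlier image of a pair of adjacent frame images, and the face feature points of a second image, which is the later image.
To track face feature points accurately later on, the embodiments of this application can collect a sample set and train a model on it to find how face feature points change between consecutive frames. Face feature points are points in an image that express facial features, such as points that express the facial organs or the face contour, and are usually represented as coordinates. Samples can be obtained in several ways. For example, to ensure the reliability of the sample set and improve the accuracy of the preset error model, multiple pairs of adjacent frame images can be extracted from a video containing a face, and the face feature points manually annotated on those pairs can be used as samples, where a pair of adjacent frame images consists of two adjacent images in the video. Alternatively, to save labor and obtain the sample set more efficiently, a single face image can be obtained together with its manually annotated face feature points; the face feature points of the previous frame image of that single face image are then simulated based on the distribution of the face feature points in the single image, which yields the face feature points of a pair of adjacent frame images as one sample. Methods for simulating the face feature points include, but are not limited to, the Monte Carlo method.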
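The simulation of previous-frame face feature points from a single annotated image can be illustrated with a small sketch. The application names the Monte Carlo method only as one option and does not specify the sampling distribution, so the independent Gaussian jitter, the `sigma` parameter, and the function name `simulate_previous_points` below are assumptions made purely for illustration.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def simulate_previous_points(points: List[Point], sigma: float = 2.0,
                             seed: int = 0) -> List[Point]:
    """Draw one simulated previous-frame shape by jittering each annotated point.

    The Gaussian noise model and sigma (in pixels) are assumptions of this
    sketch, not values taken from the application.
    """
    rng = random.Random(seed)
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
            for x, y in points]


# Annotated feature points of a single face image (the "second image").
second_image_points = [(120.0, 80.0), (160.0, 82.0), (140.0, 130.0)]
# Simulated feature points of its previous frame (the "first image").
first_image_points = simulate_previous_points(second_image_points, sigma=2.0, seed=42)
sample = (first_image_points, second_image_points)  # one training sample
```

Drawing several simulated shapes per annotated image, with different seeds, would give several samples from one manual annotation.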
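Note that "first" and "second" do not compare the number of face feature points; they indicate the temporal order of the images to which the face feature points belong. The face feature points of the second image may be those of a single face image, or those of a frame after the first frame in a video containing a face. The face feature points of the first image may be those of the frame preceding the second image in the video, or those of a previous frame simulated from the face feature points of the second image. For example, FIG. 3 is a schematic diagram of face feature points according to an embodiment of this application: part (a) of FIG. 3 shows the second image, part (b) shows the face feature points of the second image, and part (c) shows the face feature points of the first image simulated from those of the second image. The face feature points shown in parts (b) and (c) can both be represented by their coordinates. For example, the face feature points of the second image are S1 = (a1, a2, a3, ...) and those of the first image are S2 = (b1, b2, b3, ...), where each element of S1 or S2 is the coordinate of one face feature point, such as a1 = (X1, Y1) and b1 = (X2, Y2), so the elements of S1 or S2 together represent the complete set of face feature points. The embodiments of this application do not limit the number of coordinates needed to represent the complete face feature points. If, for example, 30 are used, then training the model or tracking the face feature points amounts to processing the coordinates of 30 face feature points to represent the changing face feature points.
In one possible application scenario, to reduce the use of the terminal's storage resources, the face feature points of adjacent frame images can be stored in the face image database of the server, and the terminal obtains the sample set containing the face feature points of adjacent frame images from that database when training starts.
In this step, the preset threshold serves as the criterion for splitting the sample set. By setting the preset threshold, samples whose face feature points differ greatly can be split into different classes, and samples whose face feature points are similar can be placed in the same class. The selected position refers to the positions of any two pixels in the selected region of the second image. The embodiments of this application do not limit the size of the selected region or its position in the second image, provided that the selected region is no larger than the second image. Because pixels at the same position usually differ between images, the pair of pixels at the selected position of each second image can be used as the feature of the sample, and the sample set can be classified according to that feature. Because every sample is derived from at least a single second image, the split can be based on the pixels of the second image.
When splitting based on the pixels of the second image, the difference between the gray values of the pair of pixels (the gray difference for short) can be used as the feature of the sample. In this step, for each second image, the terminal can obtain the pixel information of the second image, which includes at least the positions and gray values of the pixels, and determine the gray difference of the pair of pixels at the selected position of the second image. The terminal compares this gray difference with the preset threshold of the n-th split level: if the gray difference is less than the preset threshold, the sample containing the second image is assigned to one class of the n-th split level; otherwise, it is assigned to the other class of that level. The embodiments of this application do not limit the number of split levels or preset thresholds. The split level indicates how fine the classification is, and each level splits again every class of samples produced by the previous level. Based on the configured number of split levels, the terminal can take the classes produced by the last level as the multiple classes of samples.
For example, with two split levels, the terminal performs the first-level split: it compares the gray difference of the pair of pixels at the selected position of a second image with preset threshold a; if the gray difference is less than a, the sample containing the second image is assigned to class 1, and otherwise to class 2. Based on classes 1 and 2 from the first-level split, the terminal performs the second-level split: it compares the gray difference of the second image of each sample in class 1 with preset threshold b, assigning the sample to class 11 if the gray difference is less than b and to class 12 otherwise; and it compares the gray difference of the second image of each sample in class 2 with preset threshold c, assigning the sample to class 21 if the gray difference is less than c and to class 22 otherwise. The terminal thus obtains classes 11, 12, 21, and 22.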
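The two-level example above can be written out as a short sketch. The function names `gray_difference` and `split_two_levels`, the (row, column) pixel addressing, and the sample values are hypothetical; only the comparison rule (less than the threshold goes to one class, otherwise to the other) comes from the text.

```python
from typing import Dict, List, Sequence, Tuple

Position = Tuple[int, int]  # (row, column) of a pixel


def gray_difference(image: Sequence[Sequence[int]],
                    pos_a: Position, pos_b: Position) -> int:
    """Gray value at pos_a minus gray value at pos_b."""
    return image[pos_a[0]][pos_a[1]] - image[pos_b[0]][pos_b[1]]


def split_two_levels(gray_diffs: List[int], a: float, b: float,
                     c: float) -> Dict[str, List[int]]:
    """Assign each sample index to class 11, 12, 21, or 22.

    Level 1 compares the gray difference with threshold a; level 2 compares
    it with b (inside class 1) or c (inside class 2).
    """
    classes: Dict[str, List[int]] = {"11": [], "12": [], "21": [], "22": []}
    for index, diff in enumerate(gray_diffs):
        if diff < a:
            classes["11" if diff < b else "12"].append(index)
        else:
            classes["21" if diff < c else "22"].append(index)
    return classes


image = [[10, 200],
         [50, 60]]
diff = gray_difference(image, (0, 0), (0, 1))  # 10 - 200
classes = split_two_levels([-30, -5, 10, 40], a=0, b=-10, c=20)
```

Each list in `classes` holds the indices of the samples that ended up in that class, so the per-class errors of step 203 can later be computed over those indices.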
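The data structure of the preset error model can be a random forest, in which case the splitting process serves as the process of generating a regression tree of the random forest. FIG. 4 is a schematic diagram of a data structure according to an embodiment of this application, in which the preset error model consists of T random forests (this step takes T = 1 as an example). A random forest can consist of at least one regression tree, and a regression tree can have multiple nodes. Every node other than a leaf node can correspond to a preset threshold, and one class of samples is obtained at each leaf node of a regression tree. During splitting, the terminal first compares the gray difference of each second image with the preset threshold at the root node of a regression tree: if the gray difference is less than the threshold, the sample goes to the left child node (or the right child node); otherwise it goes to the right child node (or the left child node). This completes one split level, and the process repeats until a leaf node of the tree is reached, which yields the class of samples assigned to each leaf node. Because all samples in a class have gone through the same splitting path according to their gray differences, the face feature points of the samples within a class are similar to some degree.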
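A regression tree of this kind can be represented with a small node type and a routing loop. The sketch below is an illustration only: the `Node` fields, the `route` function, and the convention that "less than the threshold" goes left are assumptions (the text allows either direction), and the leaf labels stand in for the classes of samples.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Internal nodes carry a threshold; leaf nodes carry a class label."""
    threshold: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    leaf_class: Optional[str] = None


def route(root: Node, gray_diff: float) -> str:
    """Walk from the root to a leaf, going left when gray_diff < threshold."""
    node = root
    while node.leaf_class is None:
        node = node.left if gray_diff < node.threshold else node.right
    return node.leaf_class


# A depth-2 tree: the root threshold, then one threshold per child.
tree = Node(
    threshold=0,
    left=Node(threshold=-10, left=Node(leaf_class="11"), right=Node(leaf_class="12")),
    right=Node(threshold=20, left=Node(leaf_class="21"), right=Node(leaf_class="22")),
)
leaves = [route(tree, d) for d in (-30, -5, 10, 40)]
```

Storing a reconstructed face feature point error at each leaf, instead of a label, would turn this structure into one regression tree of the model.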
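An image contains a large number of pixels, and pixels at different positions are not equally representative. For example, if one pair of pixels consists of a pixel at the center of the image and a pixel at its edge, while another pair consists of two pixels at the edge, the first pair may be more representative than the second. To choose a more representative pair of pixels at a selected position in the image, so that under the corresponding split the samples in different classes are more clearly separated and the samples within a class are more similar, which improves the accuracy of the preset error model, the split with the higher split purity can be chosen from several candidate splits. Split purity indicates the similarity among the samples within each class under a given split.
The split can be chosen as follows: based on the preset threshold and multiple pairs of pixels at different selected positions within one selected region of the second image in each sample, the terminal splits the sample set in different ways to obtain the multiple classes of samples under each split; determines the split purity of each split based on the face feature points of the classes under that split; and selects a split whose purity meets a preset condition, taking the classes under that split as the final multiple classes of samples and the position of the corresponding pair of pixels as the selected position.
In this selection, the preset condition includes, but is not limited to, being the split with the highest split purity. The terminal can randomly select multiple pairs of pixels from the selected region. For example, if the selected region contains positions 1 to 10, the pixels at positions 1 and 3 can be chosen as one pair, the pixels at positions 2 and 6 as another pair, and so on. The terminal then splits the sample set based on each pair, in the same way as the split based on the selected position described above. The embodiments of this application do not limit how the split purity is obtained. For example, it can be derived from the variance of the face feature points in each class under the current split: the more similar the samples in a class, the smaller the variance of that class and the higher the split purity. The split purity can be obtained with reference to Formula 2 below.
Formula 2: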
Figure PCTCN2018088070-appb-000001
where
Figure PCTCN2018088070-appb-000002
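r_i is the difference between the face feature points of the second image and those of the first image in a sample, Q_{θ,S} is the number of samples in a class, μ_S is the mean of the face feature point differences of the samples in a class, θ denotes the current split, i is the index of a sample within its class, S denotes a class of samples, r denotes the right node, and l denotes the left node.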
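Choosing among candidate pixel pairs by split purity can be sketched as below. The sketch measures purity by the within-class scatter of the differences r_i (the sum of squared deviations from the class mean μ_S, lower being purer), which is one reading of the variance-based description above and is not claimed to reproduce Formula 2 exactly. The names `within_class_scatter` and `choose_split`, and the data, are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]


def within_class_scatter(residuals: Sequence[Vector]) -> float:
    """Sum of squared deviations from the class mean (0.0 for an empty class)."""
    if not residuals:
        return 0.0
    n, dim = len(residuals), len(residuals[0])
    mean = [sum(r[d] for r in residuals) / n for d in range(dim)]
    return sum((r[d] - mean[d]) ** 2 for r in residuals for d in range(dim))


def choose_split(gray_diffs_by_pair: Dict[str, List[float]],
                 residuals: List[Vector], threshold: float) -> Tuple[str, float]:
    """Return the candidate pixel pair whose split has the lowest total scatter."""
    best_pair, best_scatter = "", float("inf")
    for pair, diffs in gray_diffs_by_pair.items():
        left = [r for d, r in zip(diffs, residuals) if d < threshold]
        right = [r for d, r in zip(diffs, residuals) if d >= threshold]
        scatter = within_class_scatter(left) + within_class_scatter(right)
        if scatter < best_scatter:
            best_pair, best_scatter = pair, scatter
    return best_pair, best_scatter


# r_i for four samples (one feature point each, for brevity).
residuals = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
candidates = {
    "A": [-1.0, -1.0, 1.0, 1.0],  # groups similar residuals together
    "B": [-1.0, 1.0, -1.0, 1.0],  # mixes dissimilar residuals
}
best = choose_split(candidates, residuals, threshold=0.0)
```

Pair "A" wins here because it separates the two clusters of residuals, which is the behavior the purity criterion is meant to reward.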
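202. Determine the initial face feature point error corresponding to the selected region of the second image, where the initial face feature point error indicates the difference between the third coordinates of the face feature points of the second image and the estimated face feature point coordinates.
Because the face feature points of adjacent frame images change continuously, and face feature points that are consecutive in time are highly correlated, the embodiments of this application use the face feature points of the first image as the initial estimate of the face feature points of the second image, and obtain the preset error model from the difference between the face feature points of the two images. This improves the accuracy of tracking the face feature points in adjacent frame images.
In this step, the terminal can take the fourth coordinates of the face feature points in the first image as the estimated face feature point coordinates of the corresponding second image, and analyze the general difference between the third coordinates of the face feature points in the second image and the estimated coordinates to obtain the initial face feature point error. The initial face feature point error can be the mean, over all samples, of the difference between the third coordinates of the face feature points in each second image and the fourth coordinates of the face feature points at the corresponding positions in the first image. It can be calculated with reference to Formula 1 below.
Formula 1:
$$f_{01} = \frac{1}{N}\sum_{i=1}^{N}\left(S_{i2} - S_{i1}\right)$$
where f_{01} denotes the initial face feature point error, N denotes the number of second images, i denotes the i-th pair of adjacent frame images, S_{i2} denotes the face feature points of the i-th second image, and S_{i1} denotes the face feature points of the i-th first image. Taking S1 and S2 in FIG. 3 as an example, coordinates with the same index are the coordinates of a pair of face feature points at corresponding positions, so a1 and b1 are such a pair, and their difference is (X1-X2, Y1-Y2). To obtain f_{01}, the X-axis values of the differences are averaged to give the X-axis value of f_{01}, and the Y-axis values are averaged to give its Y-axis value, which yields the coordinates of f_{01}. Face feature point coordinates are processed in the same way throughout the embodiments of this application.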
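The averaging described for the initial face feature point error can be sketched as follows. The function name `initial_error` and the nested-list layout (one list of (x, y) points per image) are assumptions of this example; the computation itself is the per-point, per-axis mean of second-image minus first-image coordinates.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Shape = List[Point]  # all face feature points of one image


def initial_error(first_shapes: List[Shape], second_shapes: List[Shape]) -> Shape:
    """Per-point mean of (second image - first image) over all N samples."""
    n = len(second_shapes)
    num_points = len(second_shapes[0])
    f01 = []
    for p in range(num_points):
        dx = sum(s2[p][0] - s1[p][0] for s1, s2 in zip(first_shapes, second_shapes)) / n
        dy = sum(s2[p][1] - s1[p][1] for s1, s2 in zip(first_shapes, second_shapes)) / n
        f01.append((dx, dy))
    return f01


# Two samples, two feature points each (made-up coordinates).
first_shapes = [[(0.0, 0.0), (10.0, 10.0)], [(2.0, 2.0), (12.0, 12.0)]]
second_shapes = [[(1.0, 2.0), (11.0, 10.0)], [(5.0, 2.0), (12.0, 14.0)]]
f01 = initial_error(first_shapes, second_shapes)
```

Adding `f01` point by point to a first image's coordinates gives one possible initial estimate of the second image's face feature points.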
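Note that step 202 is optional in the embodiments of this application. Even if the initial face feature point error is not determined, the reconstructed face feature point error can still be determined from the classified samples and the face feature points of the first image to obtain the preset error model, which improves the accuracy of face feature point tracking.
203. Determine the reconstructed face feature point error corresponding to each class of samples, where the reconstructed face feature point error indicates the difference between the third coordinates of the face feature points of the second images in a class and the estimated face feature point coordinates, and the estimated face feature point coordinates are determined based on the face feature points of the first images in that class.
The estimated face feature point coordinates can be determined in several ways. For example, the fourth coordinates of the face feature points of the first image can be used as the estimated face feature point coordinates of the corresponding second image. Alternatively, using the obtained initial face feature point error together with the fourth coordinates of the face feature points of the first image of a pair of adjacent frame images, the terminal can determine the estimated face feature point coordinates of the second image of that pair, for example by adding to each coordinate of the face feature points of the first image the corresponding coordinate difference in the initial face feature point error. By combining the face feature points of the first image with the initial face feature point error, the first estimate of the face feature points of the second image is shifted by the initial error and is therefore closer to the true face feature points of the second image, which makes the preset error model obtained from this estimate more accurate.
Although the estimated face feature points of the second image have been obtained, they still differ somewhat from the true face feature points. To bring the estimate closer to the true face feature points, this step analyzes each class of samples and determines the difference between the face feature points of the second images in the class and the estimated face feature points.
When doing so, the terminal can take the mean of the differences between the third coordinates of the face feature points of the second images in a class and the estimated face feature point coordinates as the reconstructed face feature point error. The calculation can follow Formula 3 below.
Formula 3:
$$X_n = \frac{1}{A}\sum_{a=1}^{A}\left(S_{ar} - S_{ae}\right)$$
where X_n denotes the reconstructed face feature point error of the n-th class of samples, A denotes the number of second images in the n-th class, S_{ar} denotes the third coordinates of the face feature points in the a-th second image, and S_{ae} denotes the estimated face feature point coordinates of the a-th second image.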
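The per-class mean of Formula 3 can be sketched as follows. The function name `errors_by_class`, the string class labels, and the data are hypothetical; the sketch assumes each sample already carries the class it was split into and an estimated shape.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Shape = List[Point]


def errors_by_class(class_ids: List[str], true_shapes: List[Shape],
                    est_shapes: List[Shape]) -> Dict[str, Shape]:
    """For each class, average (true - estimated) coordinates over its samples."""
    members: Dict[str, List[int]] = defaultdict(list)
    for index, class_id in enumerate(class_ids):
        members[class_id].append(index)
    result: Dict[str, Shape] = {}
    for class_id, indices in members.items():
        count = len(indices)
        num_points = len(true_shapes[indices[0]])
        result[class_id] = [
            (sum(true_shapes[i][p][0] - est_shapes[i][p][0] for i in indices) / count,
             sum(true_shapes[i][p][1] - est_shapes[i][p][1] for i in indices) / count)
            for p in range(num_points)
        ]
    return result


class_ids = ["11", "11", "22"]                          # class of each sample
true_shapes = [[(2.0, 2.0)], [(4.0, 6.0)], [(0.0, 0.0)]]  # third coordinates
est_shapes = [[(1.0, 1.0)], [(1.0, 1.0)], [(1.0, 2.0)]]   # estimated coordinates
errors = errors_by_class(class_ids, true_shapes, est_shapes)
```

Each entry of `errors` corresponds to the value that would be stored at one leaf node of a regression tree.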
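With this method, the reconstructed face feature point error at each leaf node of a regression tree can be obtained.
204. Obtain the preset error model based on the initial face feature point error and the reconstructed face feature point errors of the classes of samples.
In this step, the preset error model obtained by training can take several forms. For example, the preset error model is a weighted combination of the initial face feature point error and the reconstructed face feature point error, in which case applying the model yields the difference between the face feature points of a frame image and those of its previous frame image. The model can be: E(I) = f_{01} + g_1(I), where E denotes the face feature point error between a frame image and its previous frame, f_{01} denotes the initial face feature point error, I denotes the pixels of the frame image at the selected position used for splitting the samples, and g_1(I) denotes the reconstructed face feature point error of the class to which the frame image is assigned based on the pixels at the selected position.
As another example, the preset error model is a weighted combination of the initial face feature point error, the reconstructed face feature point error, and the face feature points of the previous frame image of a frame image (the independent variable to be supplied to the model), in which case applying the model yields the estimated face feature points of the frame image. The model can be: S_t = S_{t-1} + f_{01} + g_1(I), where S_t denotes the estimated face feature points of a frame image, S_{t-1} denotes the face feature points of its previous frame image, and f_{01} and g_1(I) are as defined above.
Note that this step is one optional way of obtaining the preset error model based on the reconstructed face feature point errors of the classes of samples. If the terminal has not determined the initial face feature point error, the preset error model can still be obtained from the reconstructed face feature point errors of the classes; for example, the model is a weighted combination of those errors and can be expressed as: S_t = S_{t-1} + g_1(I).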
在一种可能的应用场景中,为使预设误差模型更为准确,也可以基于步骤203得到的重建人脸特征点误差,继续进行训练过程,实现过程可以参考下述步骤(1)-(3):
(1)、对于每类样本,基于该一类样本对应的重建人脸特征点误差,更新该一类样本中各个第二图像的估计人脸特征点。
该步骤(1)中,以接续步骤203为例进行说明,终端可以将每个第二图像的估计人脸特征点坐标与该第二图像所在的一类样本对应的重建人脸特征点误差相加,得到该第二图像更新后的估计人脸特征点。
若将第一次计算得到的重建人脸特征点误差作为第一重建人脸特征点误差,则该步骤(1)可以表示为:基于第一重建人脸特征点误差,更新各个第二图像的估计人脸特征点, 该第一重建人脸特征点误差是指样本集中每个样本中第二图像在选定位置上的一对像素点所分割的各类样本对应的重建人脸特征点误差。
仍然以接续步骤203为例进行说明,终端可以将步骤203得到的各类样本对应的重建人脸特征点误差作为该第一重建人脸特征点误差,并将每个第二图像的估计人脸特征点坐标与该第二图像所在的一类样本对应的重建人脸特征点误差相加,得到该第二图像更新后的估计人脸特征点。
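(2) Reselect a position in the selected region as the selected position.
Step (2) is equivalent to updating the selected position of step 201. For example, if the selected region includes positions 1 to 10 and step 201 chose positions 1 and 3 as the selected position, step (2) may choose two of the remaining eight positions, such as positions 2 and 7, as the selected position.
(3) Continue to perform the steps of splitting the sample set into multiple classes of samples based on the preset threshold and the pair of pixels at the selected position in the selected region of the second image in each sample, and determining the reconstructed facial feature point error of each class, and stop once the reconstructed facial feature point errors of the classes split by the pair of pixels at every selected position have been determined.
Step (3) can be understood as continuing to perform steps 201 and 203 until the reconstructed facial feature point errors of the classes split by the pair of pixels at every selected position have been determined. It should be noted that if the terminal performed step 202 between steps 201 and 203, step (3) may be replaced with: continue to perform the steps of splitting the sample set into multiple classes of samples based on the preset threshold and the pair of pixels at the selected position in the selected region of the second image in each sample, determining the initial facial feature point error corresponding to the selected region of the second image, and determining the reconstructed facial feature point error of each class, and stop once the reconstructed facial feature point errors of the classes split by the pair of pixels at every selected position have been determined.
In a possible application scenario, the selected position chosen the first time may be called the first position, the one chosen the second time the second position, and so on. Step (3) may then be replaced with: split the sample set into multiple classes of samples based on the preset threshold and the pair of pixels at the second position in the selected region of the second image in each sample; determine the second reconstructed facial feature point error; and continue until the reconstructed facial feature point errors of the classes split by the pixels at each position in the selected region have been determined. The second reconstructed facial feature point error refers to the reconstructed facial feature point errors of the classes split by the pair of pixels at the second position.
The implementation of step (3) is similar to that of step 203. The difference is that in step (3) the terminal determines the reconstructed facial feature point error based on the updated estimated facial feature points. After determining the second reconstructed facial feature point error, the terminal may continue to determine reconstructed facial feature point errors by following steps (1) to (3): for example, it updates the estimated facial feature points of the second images based on the second reconstructed facial feature point error, splits the sample set based on the preset threshold and the pair of pixels at a third position, and determines the third reconstructed facial feature point error, until a set number of reconstructed facial feature point errors have been determined from the pixels at a set number of selected positions in the selected region. The preset error model can then be obtained based on (the initial facial feature point error and) the reconstructed facial feature point errors of the classes split by each pair of pixels.
As this process shows, the estimated facial feature points of the second images are updated continuously as the reconstructed facial feature point errors are determined one by one. To determine a reconstructed facial feature point error, the estimated facial feature points are first updated based on the previous reconstructed facial feature point error, and the new error is then obtained from the updated estimates.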
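The iteration over steps (1) to (3) can be sketched as a short training loop. The Python code below is a simplified illustration under stated assumptions: each pixel pair yields a two-class split by comparing the gray-level difference with the threshold, and the pixel pairs are given in advance rather than chosen by split purity. The names `train_trees`, `split_class`, and `mean_residual` are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Shape = List[Point]
Image = List[List[int]]
PixelPair = Tuple[Tuple[int, int], Tuple[int, int]]


def split_class(image: Image, pair: PixelPair, threshold: int) -> int:
    """Assumed split rule: pixel-pair difference compared with the threshold."""
    (r1, c1), (r2, c2) = pair
    return 0 if image[r1][c1] - image[r2][c2] < threshold else 1


def mean_residual(true_shapes: List[Shape], est_shapes: List[Shape]) -> Shape:
    """Reconstructed error of one class: mean of (true - estimated)."""
    n = len(true_shapes)
    return [(sum(t[p][0] - e[p][0] for t, e in zip(true_shapes, est_shapes)) / n,
             sum(t[p][1] - e[p][1] for t, e in zip(true_shapes, est_shapes)) / n)
            for p in range(len(true_shapes[0]))]


def train_trees(images: List[Image], true_shapes: List[Shape], est_shapes: List[Shape],
                pairs: List[PixelPair], threshold: int):
    est = [list(s) for s in est_shapes]
    trees: List[Tuple[PixelPair, Dict[int, Shape]]] = []
    for pair in pairs:  # step (2): one newly selected position per pass
        classes = [split_class(img, pair, threshold) for img in images]
        leaf_errors: Dict[int, Shape] = {}
        for label in sorted(set(classes)):  # error of each class, from current estimates
            idx = [i for i, c in enumerate(classes) if c == label]
            leaf_errors[label] = mean_residual([true_shapes[i] for i in idx],
                                               [est[i] for i in idx])
        for i, label in enumerate(classes):  # step (1): update the estimates
            est[i] = [(x + dx, y + dy)
                      for (x, y), (dx, dy) in zip(est[i], leaf_errors[label])]
        trees.append((pair, leaf_errors))
    return trees, est


images = [[[0, 100]], [[100, 0]]]
true_shapes = [[(5.0, 5.0)], [(8.0, 2.0)]]
start = [[(4.0, 4.0)], [(4.0, 4.0)]]
pairs = [((0, 0), (0, 1)), ((0, 1), (0, 0))]
trees, final_est = train_trees(images, true_shapes, start, pairs, threshold=0)
```

In this toy data the first tree already separates the two samples, so the second tree stores zero errors and the final estimates equal the true feature points.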
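The preset error model obtained through steps (1) to (3) may be a weighted form of the initial facial feature point error and the reconstructed facial feature point errors, or a weighted form of the initial facial feature point error, the reconstructed facial feature point errors, and the facial feature points of the previous frame. Taking the latter form as an example, the model may be Formula 4.
Formula 4:
$$
S_t = S_{t-1} + f_{01} + \sum_{k=1}^{K} g_k(I)
$$
where $K$ denotes the number of selected positions used for splitting the samples in the selected region, $k$ denotes the index of a selected position in the selected region, $g_k(I)$ denotes the reconstructed facial feature point error of the class of samples into which a frame is split based on the pixels at the k-th selected position, and the other parameters are the same as in the preset error models above.
By repeating estimation, updating, and difference computation, this training process makes the obtained facial feature point error approach the facial feature point difference between two adjacent frames. If the data structure of the preset error model is a random forest, the training process is equivalent to updating the estimated facial feature points of the second images based on the reconstructed facial feature point error obtained from the previous regression tree in the random forest, and generating the current regression tree from the updated estimates. Referring to the data structure in FIG. 4, one regression tree can be generated from the pixels at one selected position; the regression tree assigns an image to one of the multiple classes of samples according to the preset threshold and the pixels at that selected position.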
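The preset error models above are all described using one random forest as an example. In a possible application scenario, the preset error model may also be obtained from multiple random forests. The process may continue from step (3) with steps (4) and (5) below:
(4) Reselect a region in the sample as the selected region.
Step (4) is equivalent to updating the selected region of step 201. For example, the selected region in step 201 is the central region of the second image, and the selected region in step (4) is an edge region of the second image.
(5) Continue to perform the steps of splitting the sample set into multiple classes of samples based on the preset threshold and the pair of pixels at the selected position in the selected region of the second image in each sample, and determining the reconstructed facial feature point error of each class, and stop once the reconstructed facial feature point errors of the classes split by the pair of pixels at the selected positions in every selected region have been determined.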
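Step (5) can be understood as continuing to perform steps 201 and 203 until the reconstructed facial feature point errors of the classes split by the pair of pixels at the selected positions in every selected region have been determined. It should be noted that if the terminal performed step 202 between steps 201 and 203, step (5) may be replaced with: continue to perform the steps of splitting the sample set into multiple classes of samples based on the preset threshold and the pair of pixels at the selected position in the selected region of the second image in each sample, determining the initial facial feature point error corresponding to the selected region of the second image, and determining the reconstructed facial feature point error of each class, and stop once the reconstructed facial feature point errors of the classes split by the pair of pixels at the selected positions in every selected region have been determined.
In a possible application scenario, the region selected the first time may be called the first region, the one selected the second time the second region, and so on. Step (5) may then be replaced with: determine multiple reconstructed facial feature point errors corresponding to the second region based on the pixels at the selected positions in the second region, which is a region other than the first region, of the second image in each sample; and continue until the reconstructed facial feature point errors corresponding to the pixels in every selected region of the second image have been determined.
In step (5), the terminal may split the sample set into multiple classes of samples based on each pair of pixels in the second region and determine the reconstructed facial feature point error of each class; the splitting is the same as in step 201 and the determination is the same as in step 203. After determining the multiple reconstructed facial feature point errors corresponding to the second region, the terminal may determine the reconstructed facial feature point errors corresponding to a third region based on the pixels at the selected positions in the third region, and so on. Once the reconstructed facial feature point errors corresponding to a preset number of selected regions have been obtained, the preset error model can be obtained based on (the initial facial feature point errors corresponding to the selected regions and) the reconstructed facial feature point errors corresponding to the selected regions.
It should be noted that, when determining the reconstructed facial feature point errors for a selected region, the terminal may choose multiple selected positions in the region and determine one reconstructed facial feature point error for each selected position; see the description of steps (1) to (3) for details.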
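In the embodiments of this application, before the reconstructed facial feature point errors are determined from the pixels of the sample set in a selected region, the initial facial feature point error corresponding to that selected region may be determined in the same way as in step 202. For a selected region after the first one (for example, the second selected region), however, the estimated facial feature points have already been updated continuously while the reconstructed facial feature point errors of the previous selected region were determined. Therefore, when determining the initial facial feature point error of such a region, the terminal may update the estimated facial feature points of the second images based on the reconstructed facial feature point error obtained from the last pair of pixels in the previous selected region, obtaining estimated facial feature points that reflect all reconstructed facial feature point errors of the previous selected region (equivalent to using the reconstructed facial feature point error of the last regression tree in the previous random forest). In other words, the estimate is obtained from the fourth coordinates of the facial feature points in the first image and the reconstructed facial feature point errors of the classes into which the sample was split. For example, the reconstructed facial feature point error obtained from the last pair of pixels of the first region is added to the previously updated estimated facial feature point coordinates of the second image to obtain the currently updated estimated coordinates, and the initial facial feature point error is obtained from the facial feature points of the second image and the currently updated estimated facial feature points.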
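The initial facial feature point error corresponding to a region may be calculated in the same way as in Formula 1, as shown in Formula 5 below.
Formula 5:
$$
f_{0t} = \frac{1}{N}\sum_{i=1}^{N}\left(S_{ir} - S_{ie}\right)
$$
where $f_{0t}$ denotes the initial facial feature point error corresponding to the t-th selected region, which is equivalent to the initial facial feature point error of the t-th random forest; $N$ denotes the number of second images; $i$ indexes the i-th second image; $S_{ir}$ denotes the facial feature points of the i-th second image; and $S_{ie}$ denotes the estimated facial feature points of the i-th second image.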
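As a purely illustrative check of Formula 5, read as the mean over all $N$ second images of the difference between the true and the estimated facial feature points, consider made-up values with $N = 2$ second images and a single feature point each: $S_{1r} = (10, 10)$, $S_{1e} = (9, 8)$, $S_{2r} = (12, 14)$, $S_{2e} = (11, 10)$. These numbers are assumptions chosen for the example and do not come from the application.

$$
f_{0t} = \frac{1}{2}\Big[\big((10,10) - (9,8)\big) + \big((12,14) - (11,10)\big)\Big]
       = \frac{1}{2}\Big[(1,2) + (1,4)\Big] = (1, 3)
$$

Unlike the reconstructed error of Formula 3, which is averaged within one class of samples, this value is averaged over the whole sample set.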
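The preset error model may be a weighted form of the initial facial feature point errors and the reconstructed facial feature point errors, or a weighted form of the initial facial feature point errors, the reconstructed facial feature point errors, and the facial feature points of the previous frame. Taking the former as an example, the model may be Formula 6.
Formula 6:
$$
E(I) = \sum_{t=1}^{T}\left(f_{0t} + \sum_{k=1}^{K} g_k(I)\right)
$$
Taking the latter as an example, the model may be Formula 7.
Formula 7:
$$
S_t = S_{t-1} + \sum_{t=1}^{T}\left(f_{0t} + \sum_{k=1}^{K} g_k(I)\right)
$$
where $T$ denotes the number of selected regions, which equals the number of random forests; $t$ denotes the index of a selected region, which equals the index of a random forest; $f_{0t}$ denotes the initial facial feature point error corresponding to the selected region with index $t$, which equals the initial facial feature point error of the t-th random forest; $K$ denotes the number of selected positions used for splitting the samples in one selected region, which equals the number of regression trees in one random forest; $k$ denotes the index of a selected position in a selected region, which equals the index of the k-th regression tree; $g_k(I)$ denotes the reconstructed facial feature point error of the class of samples into which a frame is split based on the pixels at the k-th selected position in the t-th selected region; and $I$ denotes the pixels of a frame at the k-th selected position in the t-th selected region.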
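Each parameter in the preset error models above may be assigned a corresponding weight; the embodiments of this application are described with all weights equal to 1.
It should be noted that steps 201 to 204 are described using the example of computing and obtaining the preset error model in real time. The embodiments of this application do not limit when the preset error model is obtained. For example, facial feature point tracking may also be performed with a preset error model obtained in advance by following steps 201 to 204.
Steps 205 to 207 below describe the facial feature point tracking process that applies the preset error model:
205. Obtain the facial feature points in the previous frame image of the to-be-tracked frame image.
In this step, the facial feature points of the previous frame image may be obtained in several ways, for example, by tracking with the facial feature point tracking method of the embodiments of this application. If the previous frame image is the first frame obtained by the terminal, its facial feature points may be obtained with a facial feature point tracking method such as the supervised descent method or an incremental learning method.
206. Based on the preset error model and the pixels in the to-be-tracked frame image, obtain the facial feature point error between the to-be-tracked frame image and the previous frame image. The facial feature point error is the difference between first coordinates and second coordinates, where the first coordinates are the coordinates of the facial feature points in the to-be-tracked frame image and the second coordinates are the coordinates of the facial feature points at the corresponding positions in the previous frame image. The preset error model is trained from the facial feature points of multiple pairs of adjacent frame images and indicates the relationship between the pixels of the later frame of a pair of adjacent frame images and the facial feature point error.
The to-be-tracked frame image is any frame after the first frame obtained by the terminal. For example, it is a frame currently being captured by the terminal, a frame of any video currently being played by the terminal, or a frame of a video stored on the terminal.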
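Because each class of samples has its own reconstructed facial feature point error, the terminal needs to determine, according to the preset error model, the multiple selected positions in the to-be-tracked frame image that were used for splitting the samples, such as the first position and the second position above. Based on the pair of pixels at each such selected position and the preset threshold, the terminal determines which class of samples the to-be-tracked frame image is split into and selects the reconstructed facial feature point error of that class. From the pixels at all the determined selected positions, the terminal thus selects multiple reconstructed facial feature point errors, and weights the initial facial feature point errors together with the selected reconstructed facial feature point errors to obtain the facial feature point error between the to-be-tracked frame image and the previous frame image. The resulting error therefore expresses the difference between the facial feature points of the two frames.
Taking Formula 6 above as an example, the terminal may input the pixels of the to-be-tracked frame image into the preset error model as the independent variable $I$ and obtain the facial feature point error that the model outputs for $I$.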
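Evaluating the model of Formula 6 on one frame can be sketched as follows: for each of the T forests, add its initial error, then for each of its K trees, look up the class of the frame and add that class's reconstructed error. The tuple layout of `Tree` and `Forest`, the single shared threshold, and the two-class split rule are assumptions made for this Python illustration.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Shape = List[Point]
Image = List[List[int]]
Pixel = Tuple[int, int]
Tree = Tuple[Pixel, Pixel, Dict[int, Shape]]  # pixel pair and per-class errors g_k
Forest = Tuple[Shape, List[Tree]]             # initial error f_0t and its K trees


def add(a: Shape, b: Shape) -> Shape:
    return [(ax + bx, ay + by) for (ax, ay), (bx, by) in zip(a, b)]


def predict_error(image: Image, forests: List[Forest], threshold: int) -> Shape:
    """E(I) = sum over t of (f_0t + sum over k of g_k(I)), all weights equal to 1."""
    total: Shape = [(0.0, 0.0)] * len(forests[0][0])
    for initial_error, trees in forests:
        total = add(total, initial_error)
        for pos_a, pos_b, leaf_errors in trees:
            diff = image[pos_a[0]][pos_a[1]] - image[pos_b[0]][pos_b[1]]
            label = 0 if diff < threshold else 1  # assumed split rule
            total = add(total, leaf_errors[label])
    return total


image = [[10, 50], [200, 90]]
forests: List[Forest] = [
    ([(1.0, 1.0)], [((0, 0), (0, 1), {0: [(0.5, 0.0)], 1: [(9.0, 9.0)]}),
                    ((1, 0), (1, 1), {0: [(9.0, 9.0)], 1: [(0.0, -0.5)]})]),
    ([(0.25, 0.25)], [((0, 1), (1, 1), {0: [(0.25, 0.25)], 1: [(9.0, 9.0)]})]),
]
error = predict_error(image, forests, threshold=0)
print(error)  # [(2.0, 1.0)]
```

Adding `error` to the facial feature points of the previous frame then gives the estimate of Formula 7.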
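In a possible application scenario, because an image often contains more than a face, the terminal may first determine the face region in the to-be-tracked frame image and then track the facial feature points, which avoids interference from other content in the image and yields more accurate facial feature points. To improve tracking efficiency and keep the process real-time, and because the to-be-tracked frame image is continuous with the previous frame image, the terminal may determine the face region in the to-be-tracked frame image based on the second coordinates of the facial feature points in the previous frame image, and then obtain the facial feature point error between the two frames based on the pixels in the face region and the preset error model.
When determining the face region, the terminal may determine, from the second coordinates of the facial feature points in the previous frame image, the boundary of the region that these coordinates enclose in the previous frame image, and determine the face region based on this boundary. For example, the terminal determines the center position within the boundary and takes a square region of a preset size centered at that position as the face region in the to-be-tracked frame image. The embodiments of this application do not limit the preset size. For example, the preset size differs from the size of the region boundary by a preset value, or is a fixed size.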
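One way to realize this face region computation is sketched below in Python. It assumes that the region boundary is the axis-aligned bounding box of the previous frame's feature points and that the region is returned as (left, top, right, bottom); clipping to the image bounds is omitted. The function name `face_region` is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def face_region(prev_shape: List[Point], size: float) -> Box:
    """Square of side `size` centered on the bounding box of the previous feature points."""
    xs = [x for x, _ in prev_shape]
    ys = [y for _, y in prev_shape]
    center_x = (min(xs) + max(xs)) / 2
    center_y = (min(ys) + max(ys)) / 2
    half = size / 2
    return (center_x - half, center_y - half, center_x + half, center_y + half)


prev_shape = [(30.0, 40.0), (70.0, 40.0), (50.0, 80.0)]
region = face_region(prev_shape, size=100.0)
print(region)  # (0.0, 10.0, 100.0, 110.0)
```

A size derived from the bounding box plus a preset margin would be an equally valid reading of the "preset size" described above.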
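It should be noted that, besides determining the face region from the facial feature points of the previous frame image, the terminal may also detect the face region in the to-be-tracked frame image with another face detection algorithm, such as a neural-network-based or active-contour-based face detection algorithm. Compared with the complex face detection algorithms in the related art, however, the method based on the previous frame's facial feature points uses the feature points that have just been obtained. It determines the face region of the current frame simply and in real time, and makes full use of the data produced during tracking, which improves data utilization.
207. Obtain the facial feature points of the to-be-tracked frame image based on the facial feature points of the previous frame image and the facial feature point error.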
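In this step, based on the facial feature point error obtained in step 206, the terminal may determine the offset of the first coordinates of each facial feature point in the to-be-tracked frame image relative to the second coordinates of that feature point in the previous frame image, and obtain the first coordinates from the second coordinates and the determined offset. Specifically, the terminal may add the second coordinates of each facial feature point of the previous frame image to the difference coordinates at the corresponding position in the facial feature point error, where coordinates that have the same index in the error and in the previous frame image are treated as corresponding. For example, if the difference coordinates with index 1 in the facial feature point error are (X3-X4, Y3-Y4) and the second coordinates of the feature point with index 1 in the previous frame image are (X4, Y4), their sum (X3, Y3) is taken as the first coordinates of the feature point with index 1 in the to-be-tracked frame image. The first coordinates of all facial feature points in the to-be-tracked frame image are obtained in the same way.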
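The index-wise addition in step 207 can be illustrated with a few lines of Python. The function name `apply_offsets` and the sample coordinates are made up for the example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def apply_offsets(prev_shape: List[Point], error: List[Point]) -> List[Point]:
    """First coordinates = second coordinates + difference coordinates with the same index."""
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(prev_shape, error)]


prev_shape = [(4.0, 7.0), (10.0, 10.0)]  # second coordinates (previous frame)
error = [(3.0, -2.0), (0.5, 0.5)]        # facial feature point error from step 206
tracked = apply_offsets(prev_shape, error)
print(tracked)  # [(7.0, 5.0), (10.5, 10.5)]
```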
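It should be noted that the embodiments of this application may also be combined with any existing facial feature point tracking method. For example, the facial feature points of the to-be-tracked frame image may be determined from the facial feature points obtained by each tracking method and the weight assigned to each method. In addition, the image may be preprocessed or postprocessed during tracking to make the tracked facial feature points more accurate, for example, by denoising the image before tracking, or by applying a smoothing algorithm such as Kalman filtering or an optical flow method after tracking.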
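The weighted combination of several tracking methods is not specified further in the text. A minimal sketch, assuming a normalized weighted average of the feature point coordinates produced by each method, could look like this in Python (the name `fuse_shapes` and the weights are illustrative):

```python
from typing import List, Tuple

Point = Tuple[float, float]
Shape = List[Point]


def fuse_shapes(shapes: List[Shape], weights: List[float]) -> Shape:
    """Weighted average of the feature points produced by several tracking methods."""
    total_weight = sum(weights)
    fused = []
    for p in range(len(shapes[0])):
        x = sum(w * s[p][0] for s, w in zip(shapes, weights)) / total_weight
        y = sum(w * s[p][1] for s, w in zip(shapes, weights)) / total_weight
        fused.append((x, y))
    return fused


# Two trackers, one feature point each, weighted 0.75 and 0.25.
fused = fuse_shapes([[(10.0, 10.0)], [(14.0, 18.0)]], [0.75, 0.25])
print(fused)  # [(11.0, 12.0)]
```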
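The embodiments of this application can be applied to various scenarios that require facial feature point tracking, such as surveillance. Taking a terminal that tracks face images captured in real time as an example, and referring to FIG. 5, an embodiment of this application provides a schematic flowchart of facial feature point tracking. The flow is described using the tracking of facial feature points in adjacent frames of a video. When the terminal tracks the first frame, it first performs face detection to obtain the face region in the face detection box, and uses a single-frame aligner to estimate the facial feature points in that face region; the single-frame aligner may be a facial feature point tracking model built with any facial feature point tracking method that works on a single frame. The terminal may obtain the boundary enclosed by the estimated facial feature points. When tracking the facial feature points in a frame after the first frame, the terminal may perform face detection based on the feature point boundary of the previous frame, update the feature point boundary in the to-be-tracked frame image, determine the face region in the to-be-tracked frame image from the updated boundary, and estimate the facial feature points of the to-be-tracked frame image with the preset error model. After estimating the facial feature points of a frame, the terminal may decide, based on a preset tracking condition, whether to continue tracking. If so, it continues to track the facial feature points of the next frame based on the obtained feature point boundary; if not, it ends the tracking process (or treats the next frame as a new first frame and restarts the tracking process from it). The preset tracking condition determines whether tracking continues and is not limited in the embodiments of this application. For example, it may be a preset tracking duration: tracking continues while the elapsed tracking time is below the preset duration, and one round of tracking ends once the preset duration is reached.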
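The flow of FIG. 5 can be summarized as a loop. The Python sketch below is a simplified illustration: the first-frame aligner and the error model are passed in as callables, the face region update is folded into `predict_error`, and the preset tracking condition is modeled as a maximum number of frames. All of these names and simplifications are assumptions, not the application's implementation.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Shape = List[Point]


def track_video(frames: Sequence[object],
                align_first_frame: Callable[[object], Shape],
                predict_error: Callable[[object, Shape], Shape],
                max_frames: int) -> List[Shape]:
    """Track feature points frame by frame until the preset condition is reached."""
    shapes: List[Shape] = []
    for index, frame in enumerate(frames):
        if index >= max_frames:  # preset tracking condition (assumed: frame budget)
            break
        if index == 0:
            # First frame: face detection plus single-frame aligner.
            shape = align_first_frame(frame)
        else:
            # Later frames: error model applied relative to the previous frame.
            error = predict_error(frame, shapes[-1])
            shape = [(x + dx, y + dy) for (x, y), (dx, dy) in zip(shapes[-1], error)]
        shapes.append(shape)
    return shapes


frames = ["frame0", "frame1", "frame2", "frame3", "frame4"]
result = track_video(frames,
                     align_first_frame=lambda frame: [(0.0, 0.0)],
                     predict_error=lambda frame, prev: [(1.0, 2.0)],
                     max_frames=3)
print(result)  # [[(0.0, 0.0)], [(1.0, 2.0)], [(2.0, 4.0)]]
```

With stub callables the loop stops after three of the five frames, mirroring a round of tracking that ends when the preset duration is reached.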
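In a possible application scenario, because training the preset error model is computationally expensive, a server may train the preset error model and send it to the terminal, which saves the terminal's computing resources.
In the embodiments of this application, the facial feature points in the previous frame image are obtained; the facial feature point error between the to-be-tracked frame image and the previous frame image is obtained based on the preset error model and the pixels in the to-be-tracked frame image; and the facial feature points in the to-be-tracked frame image are obtained based on the facial feature points of the previous frame image and the facial feature point error. Because the facial feature points of adjacent frames change continuously, using the facial feature points of the previous frame as a reference allows the facial feature points of the to-be-tracked frame image to be estimated more accurately. Moreover, even if illumination changes or occlusion occur between adjacent frames, accurate facial feature points can still be tracked because the facial feature points of the to-be-tracked frame image remain coherent with those of the previous frame. The tracking method is therefore robust.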
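FIG. 6 is a block diagram of a facial feature point tracking apparatus provided in an embodiment of this application. Referring to FIG. 6, the apparatus includes:
a first obtaining module 601, configured to obtain the facial feature points in the previous frame image of a to-be-tracked frame image;
a second obtaining module 602, configured to obtain, based on a preset error model and the pixels in the to-be-tracked frame image, the facial feature point error between the to-be-tracked frame image and the previous frame image, where the facial feature point error is the difference between first coordinates and second coordinates, the first coordinates are the coordinates of the facial feature points in the to-be-tracked frame image, the second coordinates are the coordinates of the facial feature points at the corresponding positions in the previous frame image, and the preset error model is trained from the facial feature points of multiple pairs of adjacent frame images and indicates the relationship between the pixels of the later frame of a pair of adjacent frame images and the facial feature point error; and
a tracking module 603, configured to obtain the facial feature points of the to-be-tracked frame image based on the facial feature points of the previous frame image and the facial feature point error.
In a possible implementation, the tracking module 603 is configured to: determine, based on the facial feature point error, the offset of the first coordinates of each facial feature point relative to the second coordinates; and obtain the first coordinates of the facial feature points in the to-be-tracked frame image based on the second coordinates of the facial feature points in the previous frame image and the determined offset.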
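In a possible implementation, based on the apparatus composition of FIG. 6 and referring to FIG. 7, the apparatus further includes:
a splitting module 604, configured to split a sample set into multiple classes of samples based on a preset threshold and the pair of pixels at a selected position in a selected region of the second image in each sample of the sample set, where each sample in the sample set includes the facial feature points of the earlier first image and the facial feature points of the later second image of a pair of adjacent frame images;
a first determining module 605, configured to determine the reconstructed facial feature point error of each class of samples, where the reconstructed facial feature point error indicates the difference between the third coordinates of the facial feature points of the second images in a class of samples and the estimated facial feature point coordinates, and the estimated facial feature point coordinates are determined based on the facial feature points of the first images in that class; and
a third obtaining module 606, configured to obtain the preset error model based on the reconstructed facial feature point errors of the classes of samples.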
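In a possible implementation, the apparatus further includes:
an updating module, configured to, before the third obtaining module 606 obtains the preset error model based on the reconstructed facial feature point errors of the classes of samples, update, for each class of samples, the estimated facial feature points of each second image in the class based on the reconstructed facial feature point error of that class;
a first selecting module, configured to reselect a position in the selected region as the selected position; and
a first loop module, configured to continue to perform the operations of splitting the sample set into multiple classes of samples based on the preset threshold and the pair of pixels at the selected position in the selected region of the second image in each sample of the sample set, and determining the reconstructed facial feature point error of each class, and to stop once the reconstructed facial feature point errors of the classes split by the pair of pixels at every selected position have been determined.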
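In a possible implementation, the apparatus further includes:
a second selecting module, configured to, before the third obtaining module 606 obtains the preset error model based on the reconstructed facial feature point errors of the classes of samples, reselect a region in the sample as the selected region; and
a second loop module, configured to continue to perform the operations of splitting the sample set into multiple classes of samples based on the preset threshold and the pair of pixels at the selected position in the selected region of the second image in each sample of the sample set, and determining the reconstructed facial feature point error of each class, and to stop once the reconstructed facial feature point errors of the classes split by the pair of pixels at the selected positions in every selected region have been determined.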
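In a possible implementation, based on the apparatus composition of FIG. 7 and referring to FIG. 8, the apparatus further includes:
a second determining module 607, configured to determine the initial facial feature point error corresponding to the selected region of the second image, where the initial facial feature point error indicates the difference between the third coordinates of the facial feature points of the second image and the estimated facial feature point coordinates; and
the third obtaining module 606 is configured to obtain the preset error model based on the initial facial feature point error and the reconstructed facial feature point errors of the classes of samples.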
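In a possible implementation, the splitting module 604 is further configured to: split the sample set in different ways based on the preset threshold and multiple pairs of pixels at different selected positions in one selected region of the second image in each sample, obtaining multiple classes of samples for each way of splitting;
determine the split purity of each way of splitting based on the facial feature points of the multiple classes of samples under that way of splitting, where the split purity indicates the similarity between the samples within each class under a way of splitting; and
select a way of splitting whose split purity meets a preset condition, use the multiple classes of samples under that way of splitting as the final multiple classes of samples, and use the positions of the pair of pixels corresponding to that way of splitting as the selected position.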
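The selection of a pixel pair by split purity can be sketched as follows. This passage does not define how purity is computed, so the Python code assumes one plausible measure: the total within-class squared deviation of a scalar target per sample (for instance, the x offset of one feature point), with the lowest spread treated as the purest split. The names `choose_pair` and `within_class_spread`, the two-class split rule, and the data are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]
PixelPair = Tuple[Tuple[int, int], Tuple[int, int]]


def within_class_spread(values: List[float]) -> float:
    """Sum of squared deviations from the class mean (lower means more similar)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def choose_pair(images: List[Image], targets: List[float],
                candidate_pairs: List[PixelPair], threshold: int) -> PixelPair:
    """Return the candidate pair whose split has the lowest total within-class spread."""
    best_pair = candidate_pairs[0]
    best_spread = None
    for pair in candidate_pairs:
        (r1, c1), (r2, c2) = pair
        groups: Dict[int, List[float]] = {0: [], 1: []}
        for image, target in zip(images, targets):
            label = 0 if image[r1][c1] - image[r2][c2] < threshold else 1
            groups[label].append(target)
        spread = sum(within_class_spread(g) for g in groups.values() if g)
        if best_spread is None or spread < best_spread:
            best_pair, best_spread = pair, spread
    return best_pair


images = [[[0, 9, 0]], [[0, 9, 9]], [[9, 0, 0]], [[9, 0, 9]]]
targets = [1.0, 1.0, 5.0, 5.0]
pair_a = ((0, 0), (0, 1))
pair_b = ((0, 2), (0, 1))
best = choose_pair(images, targets, [pair_b, pair_a], threshold=0)
```

Here `pair_a` separates the samples with target 1.0 from those with target 5.0 perfectly, so it is selected even though it is listed second.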
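In a possible implementation, the second obtaining module 602 is further configured to: determine the face region in the to-be-tracked frame image based on the second coordinates of the facial feature points in the previous frame image; and obtain the facial feature point error between the to-be-tracked frame image and the previous frame image based on the preset error model and the pixels in the face region.
All of the optional technical solutions above may be combined in any manner to form optional embodiments of this application, which are not described one by one here.
It should be noted that the division into the functional modules above is only an example of how the facial feature point tracking apparatus provided in the embodiments tracks facial feature points. In practice, the functions may be assigned to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the facial feature point tracking apparatus and the facial feature point tracking method provided in the embodiments belong to the same concept; for the implementation of the apparatus, see the method embodiments.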
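FIG. 9 is a schematic structural diagram of a terminal provided in an embodiment of this application. Referring to FIG. 9, the terminal 900 may be used to perform the facial feature point tracking method provided in the embodiments above.
The terminal 900 may include components such as an RF (Radio Frequency) circuit 110, a memory 120 including one or more computer-readable storage media, an input unit 130, a display unit 140, a sensor 150, an audio circuit 160, a WiFi (Wireless Fidelity) module 170, a processor 180 including one or more processing cores, and a power supply 190. A person skilled in the art will understand that the terminal structure shown in FIG. 9 does not limit the terminal, which may include more or fewer components than shown, combine some components, or arrange the components differently. Specifically: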
RF电路110可用于收发信息或通话过程中,信号的接收和发送,特别地,将基站的下行信息接收后,交由一个或者一个以上处理器180处理;另外,将涉及上行的数据发送给基站。通常,RF电路110包括但不限于天线、至少一个放大器、调谐器、一个或多个振荡器、用户身份模块(SIM)卡、收发信机、耦合器、LNA(Low Noise Amplifier,低噪声放大器)、双工器等。此外,RF电路110还可以通过无线通信与网络和其他设备通信。所述无线通信可以使用任一通信标准或协议,包括但不限于GSM(Global System of Mobile communication,全球移动通讯系统)、GPRS(General Packet Radio Service,通用分组无线服务)、CDMA(Code Division Multiple Access,码分多址)、WCDMA(Wideband Code Division Multiple Access,宽带码分多址)、LTE(Long Term Evolution,长期演进)、电子邮件、SMS(Short Messaging Service,短消息服务)等。
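The memory 120 may be used to store software programs and modules. The processor 180 executes various functional applications and data processing by running the software programs and modules stored in the memory 120. The memory 120 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function); the data storage area may store data created through use of the terminal 900 (such as audio data or a phone book). In addition, the memory 120 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. Accordingly, the memory 120 may further include a memory controller to provide the processor 180 and the input unit 130 with access to the memory 120.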
输入单元130可用于接收输入的数字或字符信息,以及产生与用户设置以及功能控制有关的键盘、鼠标、操作杆、光学或者轨迹球信号输入。其中,输入单元130可包括触敏表面131以及其他输入设备132。触敏表面131,也称为触摸显示屏或者触控板,可收集用户在其上或附近的触摸操作(比如用户使用手指、触笔等任何适合的物体或附件在触敏表面131上或在触敏表面131附近的操作),并根据预先设定的程式驱动相应的连接装置。可选的,触敏表面131可包括触摸检测装置和触摸控制器两个部分。其中,触摸检测装置检测用户的触摸方位,并检测触摸操作带来的信号,将信号传送给触摸控制器;触摸控制器从触摸检测装置上接收触摸信息,并将它转换成触点坐标,再送给处理器180,并能接收处理器180发来的命令并加以执行。此外,可以采用电阻式、电容式、红外线以及表面声波等多种类型实现触敏表面131。除了触敏表面131,输入单元130还可以包括其他输入设备132。其中,其他输入设备132可以包括但不限于物理键盘、功能键(比如音量控制按键、开关按键等)、轨迹球、鼠标、操作杆等中的一种或多种。
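The display unit 140 may be used to display information input by the user or provided to the user, as well as the various graphical user interfaces of the terminal 900, which may be composed of graphics, text, icons, video, and any combination thereof. The display unit 140 may include a display panel 141, which may optionally be configured as an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode) display, or the like. Further, the touch-sensitive surface 131 may cover the display panel 141. When the touch-sensitive surface 131 detects a touch operation on or near it, it passes the operation to the processor 180 to determine the type of the touch event, and the processor 180 then provides the corresponding visual output on the display panel 141 according to the type of the touch event. Although in FIG. 9 the touch-sensitive surface 131 and the display panel 141 are two separate components that implement the input and output functions, in some embodiments the touch-sensitive surface 131 and the display panel 141 may be integrated to implement the input and output functions.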
终端900还可包括至少一种传感器150,比如光传感器、运动传感器以及其他传感器。其中,光传感器可包括环境光传感器及接近传感器,其中,环境光传感器可根据环境光线的明暗来调节显示面板141的亮度,接近传感器可在终端900移动到耳边时,关闭显示面板141和/或背光。作为运动传感器的一种,重力加速度传感器可检测各个方向上(一般为三轴)加速度的大小,静止时可检测出重力的大小及方向,可用于识别手机姿态的应用(比如横竖屏切换、相关游戏、磁力计姿态校准)、振动识别相关功能(比如计步器、敲击)等;至于终端900还可配置的陀螺仪、气压计、湿度计、温度计、红外线传感器等其他传感器,在此不再赘述。
音频电路160、扬声器161,传声器162可提供用户与终端900之间的音频接口。音频电路160可将接收到的音频数据转换后的电信号,传输到扬声器161,由扬声器161转换为声音信号输出;另一方面,传声器162将收集的声音信号转换为电信号,由音频电路160接收后转换为音频数据,再将音频数据输出处理器180处理后,经RF电路110以发送给比如另一终端,或者将音频数据输出至存储器120以便进一步处理。音频电路160还可能包括耳塞插孔,以提供外设耳机与终端900的通信。
WiFi属于短距离无线传输技术,终端900通过WiFi模块170可以帮助用户收发电子邮件、浏览网页和访问流式媒体等,它为用户提供了无线的宽带互联网访问。虽然图9示出了WiFi模块170,但是可以理解的是,其并不属于终端900的必须构成,完全可以根据需要在不改变申请的本质的范围内而省略。
处理器180是终端900的控制中心,利用各种接口和线路连接整个手机的各个部分,通过运行或执行存储在存储器120内的软件程序和/或模块,以及调用存储在存储器120内的数据,执行终端900的各种功能和处理数据,从而对手机进行整体监控。可选的,处理器180可包括一个或多个处理核心;优选的,处理器180可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主要处理无线通信。可以理解的是,上述调制解调处理器也可以不集成到处理器180中。
终端900还包括给各个部件供电的电源190(比如电池),优选的,电源可以通过电源管理系统与处理器180逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。电源190还可以包括一个或一个以上的直流或交流电源、再充电系统、电源故障检测电路、电源转换器或者逆变器、电源状态指示器等任意组件。
尽管未示出,终端900还可以包括摄像头、蓝牙模块等,在此不再赘述。在本实施例中,终端的显示单元是触摸屏显示器,终端还包括有存储器,以及一个或者一个以上的程序,其中一个或者一个以上程序存储于存储器中,且经配置以由一个或者一个以上处理器执行。
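The one or more programs contain instructions for performing the following operations: obtaining the facial feature points in the previous frame image of a to-be-tracked frame image; obtaining, based on a preset error model and the pixels in the to-be-tracked frame image, the facial feature point error between the to-be-tracked frame image and the previous frame image, where the facial feature point error is the difference between first coordinates and second coordinates, the first coordinates are the coordinates of the facial feature points in the to-be-tracked frame image, the second coordinates are the coordinates of the facial feature points at the corresponding positions in the previous frame image, and the preset error model is trained from the facial feature points of multiple pairs of adjacent frame images and indicates the relationship between the pixels of the later frame of a pair of adjacent frame images and the facial feature point error; and obtaining the facial feature points of the to-be-tracked frame image based on the facial feature points of the previous frame image and the facial feature point error.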
所述一个或者一个以上程序还包含用于执行以下操作的指令:基于所述人脸特征点误差,确定每个人脸特征点的第一坐标相对于第二坐标的偏移量;基于所述前一帧图像中的人脸特征点的第二坐标与所确定的偏移量,得到所述待跟踪帧图像中的人脸特征点的第一坐标。
所述一个或者一个以上程序还包含用于执行以下操作的指令:基于预设阈值和样本集中每个样本中的第二图像在选定区域的选定位置上的一对像素点,将所述样本集分割为多类样本,所述样本集中的每个样本包括相邻帧图像中在前的第一图像的人脸特征点和在后的第二图像的人脸特征点;确定每类样本对应的重建人脸特征点误差,所述重建人脸特征点误差用于指示一类样本中第二图像的人脸特征点的第三坐标与估计人脸特征点坐标之间的差异,所述估计人脸特征点坐标基于所述一类样本中第一图像的人脸特征点确定;基于各类样本对应的重建人脸特征点误差得到所述预设误差模型。
所述一个或者一个以上程序还包含用于执行以下操作的指令:在所述基于各类样本对应的重建人脸特征点误差得到所述预设误差模型之前,对于每类样本,基于所述一类样本对应的重建人脸特征点误差,更新所述一类样本中各个第二图像的估计人脸特征点;在所述选定区域内重新选定一个位置作为所述选定位置;继续执行所述基于预设阈值和样本集中每个样本中的第二图像在选定区域的选定位置上的一对像素点,将所述样本集分割为多类样本,确定每类样本对应的重建人脸特征点误差的步骤,直到确定出基于各个选定位置上的一对像素点所分割的每类样本对应的重建人脸特征点误差后停止。
所述一个或者一个以上程序还包含用于执行以下操作的指令:在所述基于各类样本对应的重建人脸特征点误差得到所述预设误差模型之前,在所述样本中重新选定一个区域作为所述选定区域;继续执行所述基于预设阈值和样本集中每个样本中的第二图像在选定区域的选定位置上的一对像素点,将所述样本集分割为多类样本,确定每类样本对应的重建人脸特征点误差的步骤,直到确定出基于各个选定区域中选定位置上的一对像素点所分割的每类样本对应的重建人脸特征点误差后停止。
所述一个或者一个以上程序还包含用于执行以下操作的指令:确定所述第二图像的选定区域对应的初始人脸特征点误差,所述初始人脸特征点误差用于指示所述第二图像的人脸特征点的第三坐标与估计人脸特征点坐标之间的差异;所述基于各类样本对应的重建人脸特征点误差得到所述预设误差模型,包括:基于所述初始人脸特征点误差以及各类样本对应的重建人脸特征点误差,得到所述预设误差模型。
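The one or more programs further contain instructions for performing the following operations: splitting the sample set in different ways based on the preset threshold and multiple pairs of pixels at different selected positions in one selected region of the second image in each sample, obtaining multiple classes of samples for each way of splitting; determining the split purity of each way of splitting based on the facial feature points of the multiple classes of samples under that way of splitting, where the split purity indicates the similarity between the samples within each class under a way of splitting; and selecting a way of splitting whose split purity meets a preset condition, using the multiple classes of samples under that way of splitting as the final multiple classes of samples, and using the positions of the pair of pixels corresponding to that way of splitting as the selected position.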
所述一个或者一个以上程序还包含用于执行以下操作的指令:基于所述前一帧图像中的人脸特征点的第二坐标,在所述待跟踪帧图像中确定人脸区域;基于所述预设误差模型以及所述人脸区域中的像素点,获取所述待跟踪帧图像与所述前一帧图像的人脸特征点误差。
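FIG. 10 is a schematic structural diagram of a server provided in an embodiment of this application. Referring to FIG. 10, the server includes a processing component 1022, which further includes one or more processors, and memory resources represented by a memory 1032 for storing instructions executable by the processing component 1022, such as application programs. The application programs stored in the memory 1032 may include one or more programs. In addition, the processing component 1022 is configured to execute the instructions.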
服务器还可以包括一个电源组件1026被配置为执行服务器的电源管理,一个有线或无线网络接口1050被配置为将服务器连接到网络,和一个输入输出(I/O)接口1058。服务器可以操作基于存储在存储器1032的操作系统,例如Windows ServerTM,Mac OS XTM,UnixTM,LinuxTM,FreeBSDTM或类似。
所述一个或者一个以上程序包含用于执行以下操作的指令:获取待跟踪帧图像的前一帧图像中的人脸特征点;基于预设误差模型以及待跟踪帧图像中的像素点,获取待跟踪帧图像与前一帧图像的人脸特征点误差,人脸特征点误差是指第一坐标与第二坐标之间的差值,所述第一坐标是待跟踪帧图像中的人脸特征点的坐标,第二坐标是前一帧图像中相应位置的人脸特征点的坐标,预设误差模型根据多对相邻帧图像的人脸特征点训练得到,用于指示相邻帧图像中在后的一帧图像的像素点与人脸特征点误差之间的关系;基于前一帧图像的人脸特征点和人脸特征点误差,得到待跟踪帧图像的人脸特征点。
所述一个或者一个以上程序还包含用于执行以下操作的指令:基于所述人脸特征点误差,确定每个人脸特征点的第一坐标相对于第二坐标的偏移量;基于所述前一帧图像中的人脸特征点的第二坐标与所确定的偏移量,得到所述待跟踪帧图像中的人脸特征点的第一坐标。
所述一个或者一个以上程序还包含用于执行以下操作的指令:基于预设阈值和样本集中每个样本中的第二图像在选定区域的选定位置上的一对像素点,将所述样本集分割为多类样本,所述样本集中的每个样本包括相邻帧图像中在前的第一图像的人脸特征点和在后的第二图像的人脸特征点;确定每类样本对应的重建人脸特征点误差,所述重建人脸特征点误差用于指示一类样本中第二图像的人脸特征点的第三坐标与估计人脸特征点坐标之间的差异,所述估计人脸特征点坐标基于所述一类样本中第一图像的人脸特征点确定;基于各类样本对应的重建人脸特征点误差得到所述预设误差模型。
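The one or more programs further contain instructions for performing the following operations: before obtaining the preset error model based on the reconstructed facial feature point errors of the classes of samples, updating, for each class of samples, the estimated facial feature points of each second image in the class based on the reconstructed facial feature point error of that class; reselecting a position in the selected region as the selected position; and continuing to perform the steps of splitting the sample set into multiple classes of samples based on the preset threshold and the pair of pixels at the selected position in the selected region of the second image in each sample of the sample set, and determining the reconstructed facial feature point error of each class, and stopping once the reconstructed facial feature point errors of the classes split by the pair of pixels at every selected position have been determined.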
所述一个或者一个以上程序还包含用于执行以下操作的指令:在所述基于各类样本对应的重建人脸特征点误差得到所述预设误差模型之前,在所述样本中重新选定一个区域作为所述选定区域;继续执行所述基于预设阈值和样本集中每个样本中的第二图像在选定区域的选定位置上的一对像素点,将所述样本集分割为多类样本,确定每类样本对应的重建人脸特征点误差的步骤,直到确定出基于各个选定区域中选定位置上的一对像素点所分割的每类样本对应的重建人脸特征点误差后停止。
所述一个或者一个以上程序还包含用于执行以下操作的指令:确定所述第二图像的选定区域对应的初始人脸特征点误差,所述初始人脸特征点误差用于指示所述第二图像的人脸特征点的第三坐标与估计人脸特征点坐标之间的差异;所述基于各类样本对应的重建人脸特征点误差得到所述预设误差模型,包括:基于所述初始人脸特征点误差以及各类样本对应的重建人脸特征点误差,得到所述预设误差模型。
所述一个或者一个以上程序还包含用于执行以下操作的指令:基于所述预设阈值和每个样本中的第二图像在一个选定区域内不同选定位置上的多对像素点,对所述样本集进行不同方式的分割,得到每种分割方式下的多类样本;基于每种分割方式下的多类样本的人脸特征点,确定每种分割方式的分割纯度,所述分割纯度用于指示一种分割方式下的每一类样本中的各个样本之间的相似度;选择分割纯度符合预设条件的一种分割方式,将所述分割方式下的多类样本作为最终得到的所述多类样本,将所述分割方式对应的一对像素点的位置作为所述选定位置。
所述一个或者一个以上程序还包含用于执行以下操作的指令:基于所述前一帧图像中的人脸特征点的第二坐标,在所述待跟踪帧图像中确定人脸区域;基于所述预设误差模型以及所述人脸区域中的像素点,获取所述待跟踪帧图像与所述前一帧图像的人脸特征点误差。
在示例性实施例中,还提供了一种计算机可读存储介质,所述计算机可读存储介质中存储有至少一条指令,上述指令可由处理器执行以完成上述人脸特征点跟踪方法。例如,该计算机可读存储介质可以是ROM(Read Only Memory,只读存储器)、RAM(Random Access Memory,随机存取存储器)、CD-ROM(Compact Disc Read Only Memory,光盘只读存储器)、磁带、软盘和光数据存储设备等。
本领域普通技术人员可以理解实现上述实施例的全部或部分步骤可以通过硬件来完成,也可以通过程序来指令相关的硬件完成,所述的程序可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器,磁盘或光盘等。
以上所述仅为本申请的可选实施例,并不用以限制本申请,凡在本申请的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。

Claims (25)

  1. 一种人脸特征点跟踪方法,其特征在于,用于电子设备中,所述方法包括:
    获取待跟踪帧图像的前一帧图像中的人脸特征点;
    基于预设误差模型以及所述待跟踪帧图像中的像素点,获取所述待跟踪帧图像与所述前一帧图像的人脸特征点误差,所述人脸特征点误差是指第一坐标与第二坐标之间的差值,所述第一坐标是所述待跟踪帧图像中的人脸特征点的坐标,所述第二坐标是所述前一帧图像中相应位置的人脸特征点的坐标,所述预设误差模型根据多对相邻帧图像的人脸特征点训练得到,用于指示所述相邻帧图像中在后的一帧图像的像素点与人脸特征点误差之间的关系;
    基于所述前一帧图像的人脸特征点和所述人脸特征点误差,得到所述待跟踪帧图像的人脸特征点。
  2. 根据权利要求1所述的方法,其特征在于,所述基于所述前一帧图像的人脸特征点和所述人脸特征点误差,得到所述待跟踪帧图像的人脸特征点,包括:
    基于所述人脸特征点误差,确定每个人脸特征点的第一坐标相对于第二坐标的偏移量;
    基于所述前一帧图像中的人脸特征点的第二坐标与所确定的偏移量,得到所述待跟踪帧图像中的人脸特征点的第一坐标。
  3. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    基于预设阈值和样本集中每个样本中的第二图像在选定区域的选定位置上的一对像素点,将所述样本集分割为多类样本,所述样本集中的每个样本包括相邻帧图像中在前的第一图像的人脸特征点和在后的第二图像的人脸特征点;
    确定每类样本对应的重建人脸特征点误差,所述重建人脸特征点误差用于指示一类样本中第二图像的人脸特征点的第三坐标与估计人脸特征点坐标之间的差异,所述估计人脸特征点坐标基于所述一类样本中第一图像的人脸特征点确定;
    基于各类样本对应的重建人脸特征点误差得到所述预设误差模型。
  4. 根据权利要求3所述的方法,其特征在于,在所述基于各类样本对应的重建人脸特征点误差得到所述预设误差模型之前,所述方法还包括:
    对于每类样本,基于所述一类样本对应的重建人脸特征点误差,更新所述一类样本中各个第二图像的估计人脸特征点;
    在所述选定区域内重新选定一个位置作为所述选定位置;
    继续执行所述基于预设阈值和样本集中每个样本中的第二图像在选定区域的选定位置上的一对像素点,将所述样本集分割为多类样本,确定每类样本对应的重建人脸特征点误差的步骤,直到确定出基于各个选定位置上的一对像素点所分割的每类样本对应的重建人脸特征点误差后停止。
  5. 根据权利要求3所述的方法,其特征在于,在所述基于各类样本对应的重建人脸特征点误差得到所述预设误差模型之前,所述方法还包括:
    在所述样本中重新选定一个区域作为所述选定区域;
    继续执行所述基于预设阈值和样本集中每个样本中的第二图像在选定区域的选定位置上的一对像素点,将所述样本集分割为多类样本,确定每类样本对应的重建人脸特征点误差的步骤,直到确定出基于各个选定区域中选定位置上的一对像素点所分割的每类样本对应的重建人脸特征点误差后停止。
  6. 根据权利要求5所述的方法,其特征在于,所述方法还包括:
    确定所述第二图像的选定区域对应的初始人脸特征点误差,所述初始人脸特征点误差用于指示所述第二图像的人脸特征点的第三坐标与估计人脸特征点坐标之间的差异;
    所述基于各类样本对应的重建人脸特征点误差得到所述预设误差模型,包括:
    基于所述初始人脸特征点误差以及各类样本对应的重建人脸特征点误差,得到所述预设误差模型。
  7. 根据权利要求3至6中任一项所述的方法,其特征在于,所述基于预设阈值和样本集中每个样本在选定区域的选定位置上的一对像素点,将所述样本集分割为多类样本,包括:
    基于所述预设阈值和每个样本中的第二图像在一个选定区域内不同选定位置上的多对像素点,对所述样本集进行不同方式的分割,得到每种分割方式下的多类样本;
    基于每种分割方式下的多类样本的人脸特征点,确定每种分割方式的分割纯度,所述分割纯度用于指示一种分割方式下的每一类样本中的各个样本之间的相似度;
    选择分割纯度符合预设条件的一种分割方式,将所述分割方式下的多类样本作为最终得到的所述多类样本,将所述分割方式对应的一对像素点的位置作为所述选定位置。
  8. 根据权利要求1所述的方法,其特征在于,所述基于预设误差模型以及待跟踪帧图像中的像素点,获取所述待跟踪帧图像与所述前一帧图像的人脸特征点误差,包括:
    基于所述前一帧图像中的人脸特征点的第二坐标,在所述待跟踪帧图像中确定人脸区域;
    基于所述预设误差模型以及所述人脸区域中的像素点,获取所述待跟踪帧图像与所述前一帧图像的人脸特征点误差。
  9. 一种人脸特征点跟踪装置,其特征在于,所述装置包括:
    第一获取模块,用于获取待跟踪帧图像的前一帧图像中的人脸特征点;
    第二获取模块,用于基于预设误差模型以及所述待跟踪帧图像中的像素点,获取所述待跟踪帧图像与所述前一帧图像的人脸特征点误差,所述人脸特征点误差是指第一坐标与第二坐标之间的差值,所述第一坐标是所述待跟踪帧图像中的人脸特征点的坐标,所述第二坐标是所述前一帧图像中相应位置的人脸特征点的坐标,所述预设误差模型根据多对相邻帧图像的人脸特征点训练得到,用于指示所述相邻帧图像中在后的一帧图像的像素点与人脸特征点误差之间的关系;
    跟踪模块,用于基于所述前一帧图像的人脸特征点和所述人脸特征点误差,得到所述待跟踪帧图像的人脸特征点。
  10. 根据权利要求9所述的装置,其特征在于,所述跟踪模块用于:基于所述人脸特征点误差,确定每个人脸特征点的第一坐标相对于第二坐标的偏移量;基于所述前一帧图像中的人脸特征点的第二坐标与所确定的偏移量,得到所述待跟踪帧图像中的人脸特征点的第一坐标。
  11. 根据权利要求9所述的装置,其特征在于,所述装置还包括:
    分割模块,用于基于预设阈值和样本集中每个样本中的第二图像在选定区域的选定位置上的一对像素点,将所述样本集分割为多类样本,所述样本集中的每个样本包括相邻帧图像中在前的第一图像的人脸特征点和在后的第二图像的人脸特征点;
    第一确定模块,用于确定每类样本对应的重建人脸特征点误差,所述重建人脸特征点误差用于指示一类样本中第二图像的人脸特征点的第三坐标与估计人脸特征点坐标之间的差异,所述估计人脸特征点坐标基于所述一类样本中第一图像的人脸特征点确定;
    第三获取模块,用于基于各类样本对应的重建人脸特征点误差得到所述预设误差模型。
  12. 根据权利要求11所述的装置,其特征在于,所述装置还包括:
    更新模块,用于在所述第三获取模块基于各类样本对应的重建人脸特征点误差得到所述预设误差模型之前,对于每类样本,基于所述一类样本对应的重建人脸特征点误差,更新所述一类样本中各个第二图像的估计人脸特征点;
    第一选择模块,用于在所述选定区域内重新选定一个位置作为所述选定位置;
    第一循环模块,用于继续执行所述基于预设阈值和样本集中每个样本中的第二图像在选定区域的选定位置上的一对像素点,将所述样本集分割为多类样本,确定每类样本对应的重建人脸特征点误差的操作,直到确定出基于各个选定位置上的一对像素点所分割的每类样本对应的重建人脸特征点误差后停止。
  13. 根据权利要求11所述的装置,其特征在于,所述装置还包括:
    第二选择模块,用于在所述第三获取模块基于各类样本对应的重建人脸特征点误差得到所述预设误差模型之前,在所述样本中重新选定一个区域作为所述选定区域;
    第二循环模块,用于继续执行所述基于预设阈值和样本集中每个样本中的第二图像在选定区域的选定位置上的一对像素点,将所述样本集分割为多类样本,确定每类样本对应的重建人脸特征点误差的操作,直到确定出基于各个选定区域中选定位置上的一对像素点所分割的每类样本对应的重建人脸特征点误差后停止。
  14. 根据权利要求13所述的装置,其特征在于,所述装置还包括:
    第二确定模块,用于确定所述第二图像的选定区域对应的初始人脸特征点误差,所述初始人脸特征点误差用于指示所述第二图像的人脸特征点的第三坐标与估计人脸特征点坐标之间的差异;
    所述第三获取模块,用于基于所述初始人脸特征点误差以及各类样本对应的重建人脸特征点误差,得到所述预设误差模型。
  15. 根据权利要求11至14中任一项所述的装置,其特征在于,所述分割模块,还用于:基于所述预设阈值和每个样本中的第二图像在一个选定区域内不同选定位置上的多对像素点,对所述样本集进行不同方式的分割,得到每种分割方式下的多类样本;基于每种分割方式下的多类样本的人脸特征点,确定每种分割方式的分割纯度,所述分割纯度用于指示一种分割方式下的每一类样本中的各个样本之间的相似度;选择分割纯度符合预设条件的一种分割方式,将所述分割方式下的多类样本作为最终得到的所述多类样本,将所述分割方式对应的一对像素点的位置作为所述选定位置。
  16. 根据权利要求9所述的装置,其特征在于,所述第二获取模块,还用于:基于所述前一帧图像中的人脸特征点的第二坐标,在所述待跟踪帧图像中确定人脸区域;基于所述预设误差模型以及所述人脸区域中的像素点,获取所述待跟踪帧图像与所述前一帧图像的人脸特征点误差。
  17. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质中存储有至少一条指令,所述指令由处理器加载并执行以实现如权利要求1至权利要求8中任一项所述的人脸特征点跟踪方法。
  18. 一种电子设备,其特征在于,所述电子设备包括:一个或多个处理器;和,存储器;所述存储器存储有一个或多个程序,所述一个或者一个以上程序被配置成由所述一个或多个处理器执行,所述一个或者一个以上程序包含用于执行以下操作的指令:
    获取待跟踪帧图像的前一帧图像中的人脸特征点;
    基于预设误差模型以及所述待跟踪帧图像中的像素点,获取所述待跟踪帧图像与所述前一帧图像的人脸特征点误差,所述人脸特征点误差是指第一坐标与第二坐标之间的差值,所述第一坐标是所述待跟踪帧图像中的人脸特征点的坐标,所述第二坐标是所述前一帧图像中相应位置的人脸特征点的坐标,所述预设误差模型根据多对相邻帧图像的人脸特征点训练得到,用于指示所述相邻帧图像中在后的一帧图像的像素点与人脸特征点误差之间的关系;
    基于所述前一帧图像的人脸特征点和所述人脸特征点误差,得到所述待跟踪帧图像的人脸特征点。
  19. 根据权利要求18所述的电子设备,其特征在于,所述一个或者一个以上程序还包含用于执行以下操作的指令:
    基于所述人脸特征点误差,确定每个人脸特征点的第一坐标相对于第二坐标的偏移量;
    基于所述前一帧图像中的人脸特征点的第二坐标与所确定的偏移量,得到所述待跟踪帧图像中的人脸特征点的第一坐标。
  20. 根据权利要求18所述的电子设备,其特征在于,所述一个或者一个以上程序还包含用于执行以下操作的指令:
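    splitting a sample set into multiple classes of samples based on a preset threshold and the pair of pixels at a selected position in a selected region of the second image in each sample of the sample set, where each sample in the sample set includes the facial feature points of the earlier first image and the facial feature points of the later second image of a pair of adjacent frame images;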
    确定每类样本对应的重建人脸特征点误差,所述重建人脸特征点误差用于指示一类样本中第二图像的人脸特征点的第三坐标与估计人脸特征点坐标之间的差异,所述估计人脸特征点坐标基于所述一类样本中第一图像的人脸特征点确定;
    基于各类样本对应的重建人脸特征点误差得到所述预设误差模型。
  21. 根据权利要求20所述的电子设备,其特征在于,所述一个或者一个以上程序还包含用于执行以下操作的指令:
    在所述基于各类样本对应的重建人脸特征点误差得到所述预设误差模型之前,对于每类样本,基于所述一类样本对应的重建人脸特征点误差,更新所述一类样本中各个第二图像的估计人脸特征点;
    在所述选定区域内重新选定一个位置作为所述选定位置;
    继续执行所述基于预设阈值和样本集中每个样本中的第二图像在选定区域的选定位置上的一对像素点,将所述样本集分割为多类样本,确定每类样本对应的重建人脸特征点误差的步骤,直到确定出基于各个选定位置上的一对像素点所分割的每类样本对应的重建人脸特征点误差后停止。
  22. 根据权利要求20所述的电子设备,其特征在于,所述一个或者一个以上程序还包含用于执行以下操作的指令:
    在所述基于各类样本对应的重建人脸特征点误差得到所述预设误差模型之前,在所述样本中重新选定一个区域作为所述选定区域;
    继续执行所述基于预设阈值和样本集中每个样本中的第二图像在选定区域的选定位置上的一对像素点,将所述样本集分割为多类样本,确定每类样本对应的重建人脸特征点误差的步骤,直到确定出基于各个选定区域中选定位置上的一对像素点所分割的每类样本对应的重建人脸特征点误差后停止。
  23. 根据权利要求22所述的电子设备,其特征在于,所述一个或者一个以上程序还包含用于执行以下操作的指令:
    确定所述第二图像的选定区域对应的初始人脸特征点误差,所述初始人脸特征点误差用于指示所述第二图像的人脸特征点的第三坐标与估计人脸特征点坐标之间的差异;
    所述基于各类样本对应的重建人脸特征点误差得到所述预设误差模型,包括:
    基于所述初始人脸特征点误差以及各类样本对应的重建人脸特征点误差,得到所述预设误差模型。
  24. 根据权利要求20至23中任一项所述的电子设备,其特征在于,所述一个或者一个以上程序还包含用于执行以下操作的指令:
    基于所述预设阈值和每个样本中的第二图像在一个选定区域内不同选定位置上的多对像素点,对所述样本集进行不同方式的分割,得到每种分割方式下的多类样本;
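    determining the split purity of each way of splitting based on the facial feature points of the multiple classes of samples under that way of splitting, where the split purity indicates the similarity between the samples within each class under a way of splitting;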
    选择分割纯度符合预设条件的一种分割方式,将所述分割方式下的多类样本作为最终得到的所述多类样本,将所述分割方式对应的一对像素点的位置作为所述选定位置。
  25. 根据权利要求18所述的电子设备,其特征在于,所述一个或者一个以上程序还包含用于执行以下操作的指令:
    基于所述前一帧图像中的人脸特征点的第二坐标,在所述待跟踪帧图像中确定人脸区域;
    基于所述预设误差模型以及所述人脸区域中的像素点,获取所述待跟踪帧图像与所述前一帧图像的人脸特征点误差。
PCT/CN2018/088070 2017-06-21 2018-05-23 人脸特征点跟踪方法、装置、存储介质及设备 Ceased WO2018233438A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP18819678.6A EP3644219B1 (en) 2017-06-21 2018-05-23 Human face feature point tracking method, device, storage medium and apparatus
US16/542,005 US10943091B2 (en) 2017-06-21 2019-08-15 Facial feature point tracking method, apparatus, storage medium, and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710473506.8A CN108304758B (zh) 2017-06-21 2017-06-21 人脸特征点跟踪方法及装置
CN201710473506.8 2017-06-21

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/542,005 Continuation US10943091B2 (en) 2017-06-21 2019-08-15 Facial feature point tracking method, apparatus, storage medium, and device

Publications (1)

Publication Number Publication Date
WO2018233438A1 true WO2018233438A1 (zh) 2018-12-27

Family

ID=62872622

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/088070 Ceased WO2018233438A1 (zh) 2017-06-21 2018-05-23 人脸特征点跟踪方法、装置、存储介质及设备

Country Status (5)

Country Link
US (1) US10943091B2 (zh)
EP (1) EP3644219B1 (zh)
CN (1) CN108304758B (zh)
MA (1) MA49468A (zh)
WO (1) WO2018233438A1 (zh)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101169827A (zh) * 2007-12-03 2008-04-30 北京中星微电子有限公司 一种对图像中的特征点进行跟踪的方法及装置
JP2012123676A (ja) * 2010-12-09 2012-06-28 Canon Inc 個人認証装置
CN103310204A (zh) * 2013-06-28 2013-09-18 中国科学院自动化研究所 基于增量主成分分析的特征与模型互匹配人脸跟踪方法
CN103400395A (zh) * 2013-07-24 2013-11-20 佳都新太科技股份有限公司 一种基于haar特征检测的光流跟踪方法
CN104182718A (zh) * 2013-05-21 2014-12-03 腾讯科技(深圳)有限公司 一种人脸特征点定位方法及装置
CN105678702A (zh) * 2015-12-25 2016-06-15 北京理工大学 一种基于特征跟踪的人脸图像序列生成方法及装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3644219A4

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110097586A (zh) * 2019-04-30 2019-08-06 青岛海信网络科技股份有限公司 一种人脸检测追踪方法及装置
CN110097586B (zh) * 2019-04-30 2023-05-30 青岛海信网络科技股份有限公司 一种人脸检测追踪方法及装置

Also Published As

Publication number Publication date
CN108304758A (zh) 2018-07-20
US10943091B2 (en) 2021-03-09
CN108304758B (zh) 2020-08-25
EP3644219A1 (en) 2020-04-29
US20190370530A1 (en) 2019-12-05
EP3644219A4 (en) 2021-03-17
MA49468A (fr) 2020-04-29
EP3644219B1 (en) 2026-04-15

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18819678

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2018819678

Country of ref document: EP

Effective date: 20200121

WWG Wipo information: grant in national office

Ref document number: 2018819678

Country of ref document: EP