WO2021204086A1 - Identity authentication method, and method and apparatus for training an identity authentication model - Google Patents

Identity authentication method, and method and apparatus for training an identity authentication model

Info

Publication number
WO2021204086A1
Authority
WO
WIPO (PCT)
Prior art keywords
operation behavior
sample data
model
data
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2021/085319
Other languages
English (en)
French (fr)
Inventor
陈栋
李基�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to EP21784667.4A (published as EP4120105A4)
Publication of WO2021204086A1
Priority to US17/958,746 (published as US20230027527A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G06F21/32 User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G06F21/316 User authentication by observing the pattern of computer usage, e.g. typical user behaviour
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0499 Feedforward networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/0985 Hyperparameter optimisation; Meta-learning; Learning-to-learn
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Definitions

  • This application relates to the field of artificial intelligence, and more specifically, to an identity authentication method, an identity authentication model training method and device.
  • Artificial intelligence is a theory, method, technology and application system that uses digital computers or machines controlled by digital computers to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • In other words, artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence.
  • Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
  • Research in the field of artificial intelligence includes robotics, natural language processing, computer vision, decision-making and reasoning, human-computer interaction, recommendation and search, and basic AI theories.
  • Behavioral identity authentication refers to the use of machine learning algorithms to model the user's past behaviors to identify recent behaviors and achieve identity authentication.
  • However, existing behavioral identity authentication methods fall short of expectations in terms of anti-attack capability and the recognition rate for the authenticated user.
  • In an actual behavioral identity authentication process, if the ability to recognize the authenticated user is poor, the authenticated user will often be mistaken for a non-authenticated user, resulting in a poor user experience; if the anti-attack capability is poor, a non-authenticated user will be mistaken for the authenticated user, which renders the authentication useless.
  • The present application provides an identity authentication method, an identity authentication model training method, and a device, which can improve the accuracy of identity authentication.
  • An identity authentication method is provided. The method includes: obtaining first operation behavior data of a user to be authenticated; obtaining second operation behavior data of the user to be authenticated; inputting the first operation behavior data into a first authentication model to obtain a first recognition result output by the first authentication model; and inputting the second operation behavior data into a second authentication model to obtain a second recognition result output by the second authentication model, wherein the first and second authentication models are respectively an anomaly detection model and a classification model.
  • The first recognition result and the second recognition result are input into a decision fusion model to obtain the output identity authentication result, wherein the decision fusion model is used to determine the identity authentication result according to the weight parameters of the first recognition result and the second recognition result.
  • The identity authentication model includes a first authentication model, a second authentication model, and a decision fusion model. It should be understood that the identity authentication model may include multiple authentication models and a decision fusion model. The recognition results output by the multiple authentication models are input into the decision fusion model, and the decision fusion model outputs the identity authentication result.
  • The first operation behavior data and the second operation behavior data may be the same or different.
  • The first recognition result and/or the second recognition result indicates that the user to be authenticated is an authenticated user or that the user to be authenticated is a non-authenticated user.
  • The first recognition result and/or the second recognition result includes the probability that the user to be authenticated is an authenticated user or the probability that the user to be authenticated is a non-authenticated user.
  • The identity authentication result is an authenticated user.
  • The identity authentication result is a non-authenticated user.
  • The identity authentication result output by the decision fusion model may indicate whether the user to be authenticated is an authenticated user or a non-authenticated user.
  • The identity authentication result output by the decision fusion model may include the probability that the user to be authenticated is an authenticated user or the probability that the user to be authenticated is a non-authenticated user.
  • The recognition results of the two authentication models are input into the decision fusion model for decision fusion to obtain the identity authentication result, which can improve the accuracy of identity authentication.
  • Performing identity verification using the user's operation behavior data does not require the user to change how they use the device, which facilitates imperceptible (frictionless) verification.
  • The embodiments of the present application adopt an anomaly detection model and a classification model.
  • On the one hand, the anomaly detection model improves the recognition rate for authenticated users; on the other hand, the classification model improves the anti-attack capability. Together they improve the overall recognition capability of the identity authentication model.
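The flow described above (two authentication models feeding a weighted decision fusion model) can be sketched as follows. This is a minimal illustration only: the source does not specify the form of the fusion model, so a logistic combination is assumed here, and the names `fuse_results`, `authenticate`, and `accept_at` are hypothetical.

```python
import math
from typing import Callable, Sequence, Tuple

# A model is assumed to map feature data to a score in [0, 1] (hypothetical interface).
Scorer = Callable[[Sequence[float]], float]


def fuse_results(results: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Combine recognition results with weight parameters (assumed logistic form)."""
    z = bias + sum(w * r for w, r in zip(weights, results))
    return 1.0 / (1.0 + math.exp(-z))


def authenticate(
    first_data: Sequence[float],
    second_data: Sequence[float],
    anomaly_model: Scorer,
    classification_model: Scorer,
    weights: Sequence[float],
    bias: float,
    accept_at: float = 0.5,  # assumed decision cut-off, not from the source
) -> Tuple[bool, float]:
    """Return (is_authenticated_user, fused_probability)."""
    first_result = anomaly_model(first_data)
    second_result = classification_model(second_data)
    probability = fuse_results([first_result, second_result], weights, bias)
    return probability >= accept_at, probability
```

In practice the two scorers would be the trained anomaly detection and classification models; here they can be any callables that return a score.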
  • The first operation behavior data and/or the second operation behavior data are data collected by sensors.
  • The sensors include motion sensors and/or touch screen sensors.
  • The operation behavior data of the user to be authenticated may be data obtained by processing the raw data collected by the sensors.
  • The raw data collected by the touch screen sensor includes: timestamp, X/Y-axis coordinates of the touch point, touch area, touch pressure, action, and screen direction.
  • The raw data collected by the motion sensor includes: timestamp, accelerometer X/Y/Z-axis data, gyroscope X/Y/Z-axis data, and so on. It should be understood that the above is merely illustrative; the data collected by the sensors may include any one or more of the above, and may also include other data.
  • The model parameters of the first authentication model are obtained by training based on first sample data. The first sample data includes first operation behavior sample data and a label corresponding to the first operation behavior sample data, and the label is used to indicate that the user corresponding to the first operation behavior sample data is an authenticated user or a non-authenticated user.
  • The model parameters of the second authentication model are obtained by training based on fifth sample data. The fifth sample data includes fifth operation behavior sample data and a label corresponding to the fifth operation behavior sample data, and the label is used to indicate that the user corresponding to the fifth operation behavior sample data is an authenticated user or a non-authenticated user.
  • The anomaly detection model may adopt a one-class support vector machine (SVM) or an isolation forest.
  • The classification model may use an SVM or a neural network.
  • Both the anomaly detection model and the classification model can use lightweight algorithms.
  • For example, the anomaly detection model can use a one-class SVM, and the classification model can use a two-class SVM.
  • The algorithm overhead is small, and no additional hardware support is needed.
  • The identity authentication model can be trained on the client side, which keeps the data securely stored on the client and avoids the privacy and security issues caused by uploading it to the cloud.
  • The weight parameters of the decision fusion model are obtained as follows: the second operation behavior sample data in the second sample data is input into the first authentication model and the second authentication model to obtain the recognition results of the second operation behavior sample data output by the first authentication model and the second authentication model; the decision fusion model is then trained with the recognition results of the second operation behavior sample data as its input and the label corresponding to the second operation behavior sample data as its target output value.
  • The second sample data includes the second operation behavior sample data and a label corresponding to the second operation behavior sample data, and the label is used to indicate that the user corresponding to the second operation behavior sample data is an authenticated user or a non-authenticated user.
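As a hedged illustration of how such weight parameters might be learned, the sketch below fits a logistic-regression fusion model by plain gradient descent, using recognition results as inputs and labels (1 = authenticated user, 0 = non-authenticated user) as target outputs. The source does not name the learning algorithm; logistic regression, the learning rate, and the function names `train_fusion` and `predict` are assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple


def _sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights: Sequence[float], bias: float, row: Sequence[float]) -> float:
    """Fused probability that the sample belongs to the authenticated user."""
    return _sigmoid(bias + sum(w * v for w, v in zip(weights, row)))


def train_fusion(
    rows: Sequence[Sequence[float]],  # recognition results from both models, per sample
    labels: Sequence[int],            # target outputs: 1 or 0
    lr: float = 0.5,                  # assumed learning rate
    epochs: int = 2000,               # assumed number of passes
) -> Tuple[List[float], float]:
    """Batch gradient descent on the mean cross-entropy loss."""
    n, dim = len(rows), len(rows[0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for row, label in zip(rows, labels):
            err = predict(weights, bias, row) - label
            for j, v in enumerate(row):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias
```

Each row could also carry the threshold-based matching results and score features described elsewhere in this document; the training loop is unchanged because it only depends on the row length.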
  • The first recognition result includes a matching score corresponding to the first operation behavior data and/or a matching result corresponding to the first operation behavior data; the second recognition result includes a matching score corresponding to the second operation behavior data and/or a matching result corresponding to the second operation behavior data.
  • The matching score corresponding to the first operation behavior data is used to indicate the probability that the user to be authenticated is recognized as an authenticated user; the matching result corresponding to the first operation behavior data is used to indicate that the user to be authenticated is recognized as an authenticated user or a non-authenticated user, and includes at least two matching results determined based on at least two thresholds and the matching score corresponding to the first operation behavior data.
  • The matching score corresponding to the second operation behavior data is used to indicate the probability that the user to be authenticated is recognized as an authenticated user; the matching result corresponding to the second operation behavior data is used to indicate that the user to be authenticated is recognized as an authenticated user or a non-authenticated user.
  • The following examples illustrate the first recognition result and the second recognition result.
  • The recognition results output by the two authentication models include the recognition result output by the anomaly detection model and the recognition result output by the classification model.
  • The recognition result output by the anomaly detection model includes a matching score A corresponding to the first operation behavior data and a matching result A corresponding to the first operation behavior data.
  • The at least two thresholds include a first threshold A and a second threshold A, and the first threshold A is greater than the second threshold A.
  • The matching result A corresponding to the first operation behavior data output by the anomaly detection model includes: a matching result determined based on the first threshold A and a matching result determined based on the second threshold A.
  • When the matching score A is greater than or equal to the first threshold A, the matching result determined based on the first threshold A is an authenticated user; when the matching score A is less than the first threshold A, the matching result determined based on the first threshold A is a non-authenticated user.
  • When the matching score A is greater than or equal to the second threshold A, the matching result determined based on the second threshold A is an authenticated user; when the matching score A is less than the second threshold A, the matching result determined based on the second threshold A is a non-authenticated user.
  • The recognition result output by the classification model includes a matching score B corresponding to the second operation behavior data and a matching result B corresponding to the second operation behavior data.
  • The at least two thresholds include a first threshold B and a second threshold B, and the first threshold B is greater than the second threshold B.
  • The matching result B corresponding to the second operation behavior data output by the classification model includes: a matching result determined based on the first threshold B and a matching result determined based on the second threshold B.
  • When the matching score B is greater than or equal to the first threshold B, the matching result determined based on the first threshold B is an authenticated user; when the matching score B is less than the first threshold B, the matching result determined based on the first threshold B is a non-authenticated user.
  • When the matching score B is greater than or equal to the second threshold B, the matching result determined based on the second threshold B is an authenticated user; when the matching score B is less than the second threshold B, the matching result determined based on the second threshold B is a non-authenticated user.
  • The at least two thresholds may be determined according to the accuracy of the recognition result.
  • The recognition result output by each authentication model includes a matching score and at least two matching results.
  • Different thresholds can be used to tune different aspects of the authentication model's performance so that it meets expectations; for example, different thresholds can adjust the model's anti-attack capability and the recognition rate for the owner, balancing the performance of the authentication model.
  • Compared with an authentication model that provides only one recognition result, this solution provides more features for the decision fusion model, which helps improve the accuracy of the decision fusion model and thereby the accuracy of identity authentication.
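The two-threshold rule above can be sketched in a few lines. The encoding of the matching result as 1 (authenticated user) and 0 (non-authenticated user), and the names `RecognitionResult` and `recognize`, are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognitionResult:
    score: float           # matching score output by an authentication model
    result_at_first: int   # matching result under the first (higher) threshold
    result_at_second: int  # matching result under the second (lower) threshold


def recognize(score: float, first_threshold: float, second_threshold: float) -> RecognitionResult:
    """Derive two matching results from one score; 1 = authenticated, 0 = non-authenticated."""
    if first_threshold <= second_threshold:
        raise ValueError("the first threshold must be greater than the second threshold")
    return RecognitionResult(
        score=score,
        result_at_first=int(score >= first_threshold),
        result_at_second=int(score >= second_threshold),
    )
```

A score that falls between the two thresholds yields disagreeing matching results (0 under the first threshold, 1 under the second), which is exactly the extra information the decision fusion model can exploit.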
  • Inputting the first recognition result and the second recognition result into the decision fusion model to obtain the output identity authentication result includes: obtaining a first score feature according to the matching score corresponding to the first operation behavior data; obtaining a second score feature according to the matching score corresponding to the second operation behavior data; and inputting the first score feature, the second score feature, the first recognition result, and the second recognition result into the decision fusion model to obtain the output identity authentication result.
  • Obtaining the score feature according to the matching score corresponding to the first operation behavior data includes: performing a mathematical operation on the matching score to obtain the score feature.
  • Performing feature extraction on the matching score provides further features for the decision fusion model, which helps improve the accuracy of the decision fusion model and thereby the accuracy of identity authentication.
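The source states only that a mathematical operation is performed on the matching score; it does not say which one. The sketch below therefore picks three common transforms (square, square root, and log1p) purely as assumptions, and the function name `score_features` is hypothetical.

```python
import math
from typing import List


def score_features(score: float) -> List[float]:
    """Derive extra fusion inputs from one matching score (assumed transforms)."""
    s = min(max(score, 0.0), 1.0)  # clamp to [0, 1] so sqrt is always defined
    return [
        s * s,          # emphasizes high-confidence scores
        math.sqrt(s),   # spreads out low scores
        math.log1p(s),  # smooth monotone compression
    ]
```

The resulting values would be appended to the matching score and matching results before they are passed to the decision fusion model.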
  • The first operation behavior data and/or the second operation behavior data include at least one of the following data: touch point X/Y-axis coordinates, touch area, touch pressure, touch screen speed, touch screen acceleration, touch screen trajectory slope, touch screen length, touch screen displacement, touch screen angle, touch screen direction, accelerometer X/Y/Z-axis data, or gyroscope X/Y/Z-axis data.
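Several of the listed quantities can be derived from a sequence of timestamped touch points. The following sketch shows one plausible derivation; the exact definitions used in the application are not given, so the formulas, the `TouchPoint` type, and the `stroke_features` function are assumptions.

```python
import math
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class TouchPoint:
    t: float  # timestamp in seconds
    x: float  # X-axis coordinate
    y: float  # Y-axis coordinate


def stroke_features(points: Sequence[TouchPoint]) -> Dict[str, float]:
    """Compute length, displacement, angle, slope, and mean speed of one stroke."""
    if len(points) < 2:
        raise ValueError("a stroke needs at least two touch points")
    # Path length: sum of distances between consecutive points.
    length = sum(math.hypot(b.x - a.x, b.y - a.y) for a, b in zip(points, points[1:]))
    dx = points[-1].x - points[0].x
    dy = points[-1].y - points[0].y
    duration = points[-1].t - points[0].t
    return {
        "length": length,
        "displacement": math.hypot(dx, dy),  # straight-line start-to-end distance
        "angle": math.atan2(dy, dx),         # radians
        "slope": dy / dx if dx != 0 else math.inf,
        "speed": length / duration if duration > 0 else 0.0,
    }
```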
  • The model parameters of the authentication model are obtained by training based on third operation behavior sample data in third sample data and the identity authentication result corresponding to the third operation behavior sample data.
  • The identity authentication result corresponding to the third operation behavior sample data is obtained by inputting the third operation behavior sample data into the first authentication model and the second authentication model to obtain the recognition results of the third operation behavior sample data, and then inputting those recognition results into the decision fusion model.
  • The third sample data includes the third operation behavior sample data.
  • The authentication model is retrained according to the identity authentication result output by the decision fusion model, so that the output serves as feedback. This can further improve the accuracy of the authentication model and thereby the accuracy of the identity authentication model.
  • A method for training an identity authentication model is provided. The identity authentication model includes a first authentication model, a second authentication model, and a decision fusion model, wherein the first and second authentication models are respectively an anomaly detection model and a classification model.
  • The method includes: acquiring second sample data, the second sample data including second operation behavior sample data and a label corresponding to the second operation behavior sample data, where the label is used to indicate that the user corresponding to the second operation behavior sample data is an authenticated user or a non-authenticated user; and inputting the second operation behavior sample data in the second sample data into the first authentication model and the second authentication model to obtain the recognition result of the second operation behavior sample data.
  • The model parameters of the first authentication model are obtained by training based on first sample data. The first sample data includes first operation behavior sample data and a label corresponding to the first operation behavior sample data, and the label is used to indicate that the user corresponding to the first operation behavior sample data is an authenticated user or a non-authenticated user.
  • The model parameters of the second authentication model are obtained by training based on fifth sample data. The fifth sample data includes fifth operation behavior sample data and a label corresponding to the fifth operation behavior sample data, and the label is used to indicate that the user corresponding to the fifth operation behavior sample data is an authenticated user or a non-authenticated user.
  • The decision fusion model is trained with the recognition result of the second operation behavior sample data as its input and the label corresponding to the second operation behavior sample data as its target output value, to obtain a trained decision fusion model.
  • The operation behavior data of the authenticated user may be referred to as a positive sample, and the operation behavior data of a non-authenticated user may be referred to as a negative sample.
  • The sample data can be determined based on the raw data collected by the sensor. Specifically, feature data can be extracted from the raw data to obtain the sample data.
  • The sensor may include a touch screen sensor and/or a motion sensor.
  • The sample data may also include preset data.
  • For example, the positive samples may be determined based on the raw data collected by the sensor, and the negative samples may be preset data.
  • The recognition results of the two authentication models are input into the decision fusion model for decision fusion to obtain the identity authentication result, which can improve the accuracy of identity authentication.
  • Performing identity verification using the user's operation behavior data does not require the user to change how they use the device, which facilitates imperceptible (frictionless) verification.
  • Using an anomaly detection model together with a classification model improves, on the one hand, the recognition rate for authenticated users through the anomaly detection model and, on the other hand, the anti-attack capability through the classification model, improving the overall recognition capability of the identity authentication model.
  • Both the anomaly detection model and the classification model can use lightweight algorithms.
  • For example, the anomaly detection model can use a one-class support vector machine (SVM), and the classification model can use a two-class SVM.
  • The algorithm overhead is small, and no additional hardware support is needed.
  • The identity authentication model can be trained on the client side, which keeps the data securely stored on the client and avoids the privacy and security issues caused by uploading it to the cloud.
  • The first sample data and the second sample data may be the same or different.
  • The recognition result of the second operation behavior sample data includes the recognition result output by the first authentication model and the recognition result output by the second authentication model.
  • The recognition result output by the first authentication model includes: the matching score corresponding to the second operation behavior sample data output by the first authentication model and/or the matching result corresponding to the second operation behavior sample data output by the first authentication model.
  • The recognition result output by the second authentication model includes: the matching score corresponding to the second operation behavior sample data output by the second authentication model and/or the matching result corresponding to the second operation behavior sample data output by the second authentication model.
  • The matching score corresponding to the second operation behavior sample data is used to indicate the probability that the user corresponding to the second operation behavior sample data is identified as an authenticated user. The matching result corresponding to the second operation behavior sample data is used to indicate that the user corresponding to the second operation behavior sample data is identified as an authenticated user or a non-authenticated user, and includes at least two matching results determined based on at least two thresholds and the matching score corresponding to the second operation behavior sample data.
  • The at least two thresholds of the two authentication models may be different.
  • The two authentication models are an anomaly detection model and a classification model, respectively.
  • The recognition result of the second operation behavior sample data includes: the recognition result of the second operation behavior sample data output by the anomaly detection model and the recognition result of the second operation behavior sample data output by the classification model.
  • The recognition result of the second operation behavior sample data output by the anomaly detection model includes: a matching score A corresponding to the second operation behavior sample data and a matching result A corresponding to the second operation behavior sample data.
  • The at least two thresholds include a first threshold A and a second threshold A, and the first threshold A is greater than the second threshold A.
  • The matching result A corresponding to the second operation behavior sample data output by the anomaly detection model includes: a matching result determined based on the first threshold A and a matching result determined based on the second threshold A.
  • When the matching score A is greater than or equal to the first threshold A, the matching result determined based on the first threshold A is an authenticated user; when the matching score A is less than the first threshold A, the matching result determined based on the first threshold A is a non-authenticated user.
  • When the matching score A is greater than or equal to the second threshold A, the matching result determined based on the second threshold A is an authenticated user; when the matching score A is less than the second threshold A, the matching result determined based on the second threshold A is a non-authenticated user.
  • The recognition result output by the classification model includes a matching score B corresponding to the second operation behavior sample data and a matching result B corresponding to the second operation behavior sample data.
  • The at least two thresholds include a first threshold B and a second threshold B, and the first threshold B is greater than the second threshold B.
  • The matching result B corresponding to the second operation behavior sample data output by the classification model includes: a matching result determined based on the first threshold B and a matching result determined based on the second threshold B.
  • When the matching score B is greater than or equal to the first threshold B, the matching result determined based on the first threshold B is an authenticated user; when the matching score B is less than the first threshold B, the matching result determined based on the first threshold B is a non-authenticated user.
  • When the matching score B is greater than or equal to the second threshold B, the matching result determined based on the second threshold B is an authenticated user; when the matching score B is less than the second threshold B, the matching result determined based on the second threshold B is a non-authenticated user.
  • The first threshold A and the first threshold B may be the same or different.
  • The second threshold A and the second threshold B may be the same or different.
  • The at least two thresholds may be determined according to the accuracy of the recognition results corresponding to the second operation behavior sample data output by the two authentication models.
  • The recognition results output by the two authentication models include a matching score and at least two matching results, unlike an existing authentication model, which can provide only one recognition result.
  • This solution provides more features for the decision fusion model, which helps train a better decision fusion model and improves the accuracy of the decision fusion model, thereby improving the accuracy of identity authentication.
  • The at least two thresholds include a first threshold, and the method further includes: obtaining fourth sample data, where the fourth sample data includes fourth operation behavior sample data and a label corresponding to the fourth operation behavior sample data, and the label is used to indicate that the user corresponding to the fourth operation behavior sample data is an authenticated user or a non-authenticated user; inputting the fourth operation behavior sample data into the first authentication model to obtain the matching score corresponding to the fourth operation behavior sample data output by the first authentication model, where the matching score is used to indicate the probability that the user corresponding to the fourth operation behavior sample data is recognized as an authenticated user; determining, based on a plurality of candidate thresholds, a plurality of candidate matching results corresponding to the matching score of the fourth operation behavior sample data, where the candidate matching results are used to indicate that the user corresponding to the fourth operation behavior sample data is identified as an authenticated user or a non-authenticated user; and determining, from the plurality of candidate thresholds, the first threshold that satisfies a preset condition.
  • The preset condition may be that the accuracy of the candidate matching result reaches a set threshold.
  • The first threshold of the second authentication model can also be determined in the above manner, which is not repeated here.
  • The first threshold that satisfies the preset condition is determined from the candidate thresholds, and different first thresholds can be used to tune different aspects of the authentication model's performance so that it meets expectations; for example, different thresholds can adjust the model's anti-attack capability and the recognition rate for the owner. This helps train a better decision fusion model and improves its accuracy, thereby improving the accuracy of identity authentication.
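Selecting thresholds from a set of candidates can be sketched as below, using the stated preset condition that the accuracy of the candidate matching result reaches a set value. The function names, the use of overall accuracy as the only criterion, and the label encoding (1 = authenticated user, 0 = non-authenticated user) are assumptions.

```python
from typing import List, Sequence


def accuracy_at(threshold: float, scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of samples whose candidate matching result equals the label."""
    if not labels:
        raise ValueError("at least one labelled sample is required")
    hits = sum(int(score >= threshold) == label for score, label in zip(scores, labels))
    return hits / len(labels)


def select_thresholds(
    candidates: Sequence[float],
    scores: Sequence[float],   # matching scores of the fourth operation behavior sample data
    labels: Sequence[int],
    min_accuracy: float,       # the "set threshold" on accuracy (value is an assumption)
) -> List[float]:
    """Keep the candidate thresholds whose matching results satisfy the preset condition."""
    return [t for t in candidates if accuracy_at(t, scores, labels) >= min_accuracy]
```

A stricter variant could evaluate the false accept rate and false reject rate separately, which matches the stated goal of trading anti-attack capability against the owner's recognition rate.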
  • the target output value of the model is trained.
  • feature extraction is performed on the matching scores to provide more features for the decision fusion model, which is conducive to training a better decision fusion model and improves the accuracy of the decision fusion model, thereby improving the accuracy of identity authentication.
  • the second operation behavior sample data includes at least one of the following data: touch point X/Y axis coordinates, touch area, touch pressure, touch screen speed, touch screen acceleration, slope of touch screen trajectory, touch screen length, touch screen displacement, touch screen angle, touch screen direction, acceleration X/Y/Z axis data or gyroscope X/Y/Z axis data.
  • the first operation behavior sample data, the third operation behavior sample data, the fourth operation behavior sample data, or the fifth operation behavior sample data may also include at least one of the foregoing data.
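  • to make a few of the listed quantities concrete, the sketch below derives the length, displacement, angle, and average speed of one slide from a sequence of touch points. The `TouchPoint` layout and the feature names are assumptions; the source does not specify how each quantity is computed.

```python
import math
from typing import Dict, List, Tuple

TouchPoint = Tuple[float, float, float]  # (x, y, timestamp in seconds), assumed layout


def swipe_features(points: List[TouchPoint]) -> Dict[str, float]:
    """Compute simple trajectory features for one slide on the touch screen."""
    if len(points) < 2:
        raise ValueError("a slide needs at least two touch points")
    (x0, y0, t0), (x1, y1, t1) = points[0], points[-1]
    # Path length: sum of the distances between consecutive touch points.
    length = sum(
        math.hypot(bx - ax, by - ay)
        for (ax, ay, _), (bx, by, _) in zip(points, points[1:])
    )
    # Displacement: straight-line distance from the first to the last point.
    displacement = math.hypot(x1 - x0, y1 - y0)
    duration = t1 - t0
    return {
        "length": length,
        "displacement": displacement,
        "angle_deg": math.degrees(math.atan2(y1 - y0, x1 - x0)),
        "speed": length / duration if duration > 0 else 0.0,
    }


features = swipe_features([(0.0, 0.0, 0.0), (3.0, 4.0, 0.1), (6.0, 8.0, 0.2)])
print(features)
```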
  • the second sample data is obtained by filtering according to the sliding duration of the user on the touch screen and/or the number of touch points of the user on the touch screen.
  • the raw data can be preprocessed according to the user's sliding duration on the touch screen and/or the number of touch points of the user on the touch screen to filter out effective operation behavior data. Then, feature data is extracted from the effective operation behavior data to obtain the second sample data.
  • the operation behavior data in which the number of touch points of the user is greater than the preset threshold is filtered from the original data as the effective operation behavior data.
  • feature data is extracted from the effective operation behavior data to obtain the second sample data.
  • the original data is filtered to eliminate the user's abnormal operation behavior data, which improves the accuracy of the training samples and thereby the accuracy of model training.
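  • the filtering step can be sketched as follows. The `Swipe` record, the duration bounds, and the touch point threshold are hypothetical values chosen for illustration; the source only states that sliding duration and/or the number of touch points are used.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Swipe:
    duration_s: float   # how long the finger slid on the touch screen
    touch_points: int   # number of touch points recorded for the slide


def filter_effective(
    swipes: List[Swipe],
    min_duration_s: float = 0.05,   # assumed lower bound on sliding duration
    max_duration_s: float = 2.0,    # assumed upper bound on sliding duration
    min_touch_points: int = 5,      # assumed preset threshold
) -> List[Swipe]:
    """Keep only slides that look like normal operation behavior."""
    return [
        s
        for s in swipes
        if min_duration_s <= s.duration_s <= max_duration_s
        and s.touch_points > min_touch_points
    ]


raw = [Swipe(0.3, 20), Swipe(0.01, 2), Swipe(5.0, 40), Swipe(0.4, 5)]
effective = filter_effective(raw)
print(effective)
```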
  • the method further includes: obtaining third sample data, where the third sample data includes third operation behavior sample data; inputting the third operation behavior sample data into the first authentication model to obtain the recognition result of the third operation behavior sample data output by the first authentication model; inputting the third operation behavior sample data into the second authentication model to obtain the recognition result of the third operation behavior sample data output by the second authentication model; inputting the recognition result output by the first authentication model and the recognition result output by the second authentication model into the trained decision fusion model to obtain the identity authentication result corresponding to the third operation behavior sample data; and training the first authentication model and/or the second authentication model according to the third operation behavior sample data and the identity authentication result corresponding to the third operation behavior sample data.
  • the two authentication models can be trained again according to the third operation behavior sample data and the identity authentication results corresponding to the third operation behavior sample data.
  • the first operation behavior sample data and the third operation behavior sample data may include the same sample data. In this case, training the two authentication models according to the third operation behavior sample data and the corresponding identity authentication results includes: screening the first sample data according to the label corresponding to the first operation behavior sample data and the identity authentication result corresponding to the third operation behavior sample data, and training the at least two authentication models again based on the screened first sample data.
  • sample data whose label differs from the identity authentication result of the corresponding third operation behavior sample data may be excluded from the first sample data.
  • the authentication models are retrained according to the identity authentication result of the decision fusion model, so that the output serves as feedback. This can further improve the accuracy of the authentication models, thereby further improving the accuracy of the identity authentication model.
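  • the screening step can be sketched as below, assuming the first and third sample data are the same samples in the same order. The `Sample` layout and the function name are assumptions; the retraining itself is left out because the source does not fix a training procedure.

```python
from typing import List, Tuple

Sample = Tuple[List[float], int]  # (feature data, label: 1 = authenticated user)


def screen_samples(samples: List[Sample], fused_results: List[int]) -> List[Sample]:
    """Keep samples whose label agrees with the decision fusion result.

    fused_results[i] is the identity authentication result that the trained
    decision fusion model produced for samples[i].
    """
    return [
        sample
        for sample, fused in zip(samples, fused_results)
        if sample[1] == fused
    ]


samples = [([0.1], 1), ([0.2], 0), ([0.3], 1)]
fused_results = [1, 1, 1]  # hypothetical fusion outputs
kept = screen_samples(samples, fused_results)
print(kept)  # the authentication models would then be retrained on `kept`
```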
  • a device for identity authentication includes a module or unit for executing the method in the first aspect and any one of the implementation manners in the first aspect.
  • a training device for an identity authentication model includes a module or unit for executing the method in any one of the foregoing second aspect and the second aspect.
  • an identity authentication device which includes an input and output interface, a processor, and a memory.
  • the processor is used to control the input and output interface to send and receive information
  • the memory is used to store a computer program
  • the processor is used to call and run the computer program from the memory, so that the device executes the method in the first aspect or any one of the implementation manners of the first aspect.
  • the foregoing device may be a terminal device/server, or a chip in the terminal device/server.
  • the aforementioned memory may be located inside the processor, for example, may be a cache in the processor.
  • the above-mentioned memory may also be located outside the processor so as to be independent of the processor, for example, the internal memory (memory) of the device.
  • a training device for an identity authentication model which includes an input and output interface, a processor, and a memory.
  • the processor is used to control the input and output interface to send and receive information
  • the memory is used to store a computer program
  • the processor is used to call and run the computer program from the memory, so that the training device executes the method in the second aspect or any one of the implementation manners of the second aspect.
  • the above-mentioned training device may be a terminal device/server, or a chip in the terminal device/server.
  • the aforementioned memory may be located inside the processor, for example, may be a cache in the processor.
  • the above-mentioned memory may also be located outside the processor so as to be independent of the processor, for example, the internal memory (memory) of the training device.
  • a computer program product includes: computer program code, which when the computer program code runs on a computer, causes the computer to execute the methods in the above aspects.
  • the foregoing computer program code may be stored in whole or in part on a first storage medium, where the first storage medium may be packaged with the processor or may be packaged separately with the processor. There is no specific limitation.
  • a computer-readable medium stores program code, and when the computer program code runs on a computer, the computer executes the methods in the above-mentioned aspects.
  • in a ninth aspect, a chip is provided. The chip includes a processor and a data interface, and the processor reads instructions stored in a memory through the data interface and executes the methods in the foregoing aspects.
  • the chip may further include a memory in which instructions are stored, and the processor is configured to execute instructions stored on the memory.
  • the processor is configured to execute the method in any one of the implementation manners of the first aspect or the second aspect.
  • the aforementioned chip may specifically be a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).
  • FIG. 1 is a schematic diagram of an artificial intelligence main body framework provided by an embodiment of the present application
  • FIG. 2 is a schematic structural diagram of a system architecture provided by an embodiment of the application.
  • FIG. 3 is a schematic structural diagram of another system architecture provided by an embodiment of this application.
  • FIG. 4 is a schematic structural diagram of yet another system architecture provided by an embodiment of this application.
  • FIG. 5 is a schematic structural diagram of an identity authentication module provided by an embodiment of the application.
  • FIG. 6 is a schematic flowchart of a method for training an identity authentication model provided by an embodiment of the application
  • FIG. 7 is a schematic flowchart of another method for training an identity authentication model provided by an embodiment of the application.
  • FIG. 8 is a schematic flowchart of an identity authentication method provided by an embodiment of the application.
  • FIG. 9 is a schematic flowchart of a decision fusion provided by an embodiment of this application.
  • FIG. 10 is a schematic flowchart of another decision fusion provided by an embodiment of this application.
  • FIG. 11 is a schematic flowchart of an application method provided by an embodiment of this application.
  • FIG. 12 is a schematic diagram of an application scenario provided by an embodiment of the application.
  • FIG. 13 is a schematic diagram of another application scenario provided by an embodiment of the present application.
  • FIG. 14 is a schematic diagram of another application scenario provided by an embodiment of the present application.
  • FIG. 15 is a schematic block diagram of a training device for an identity authentication model provided by an embodiment of the present application.
  • FIG. 16 is a schematic block diagram of an identity authentication device provided by an embodiment of the present application.
  • FIG. 17 is a schematic block diagram of a training device for an identity authentication model provided by an embodiment of the present application.
  • FIG. 18 is a schematic block diagram of an identity authentication device provided by an embodiment of the present application.
  • FIG. 19 is a schematic block diagram of an identity authentication device and an identity authentication model training device provided by an embodiment of the present application.
  • Figure 1 shows a schematic diagram of an artificial intelligence main framework, which describes the overall workflow of the artificial intelligence system and is suitable for general artificial intelligence field requirements.
  • Intelligent Information Chain reflects a series of processes from data acquisition to processing. For example, it can be the general process of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, intelligent execution and output. In this process, the data has gone through the condensing process of "data-information-knowledge-wisdom".
  • the infrastructure provides computing power support for the artificial intelligence system, realizes communication with the outside world, and realizes support through the basic platform.
  • the infrastructure can communicate with the outside through sensors, and the computing power of the infrastructure can be provided by smart chips.
  • the smart chip here can be a hardware acceleration chip such as a central processing unit (CPU), a neural-network processing unit (NPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), or a field-programmable gate array (FPGA).
  • the basic platform of infrastructure can include distributed computing framework and network related platform guarantee and support, and can include cloud storage and computing, interconnection network, etc.
  • data can be obtained through sensors and external communication, and then these data can be provided to the smart chip in the distributed computing system provided by the basic platform for calculation.
  • the data in the upper layer of the infrastructure is used to represent the data source in the field of artificial intelligence.
  • the data involves graphics, images, voice, text, and IoT data of traditional devices, including business data of existing systems and sensory data such as force, displacement, liquid level, temperature, and humidity.
  • the above-mentioned data processing usually includes data training, machine learning, deep learning, search, reasoning, decision-making and other processing methods.
  • machine learning and deep learning can symbolize and formalize data for intelligent information modeling, extraction, preprocessing, training, etc.
  • Reasoning refers to the process of simulating human intelligent reasoning in a computer or intelligent system, using formal information to conduct machine thinking and solving problems based on reasoning control strategies.
  • the typical function is search and matching.
  • Decision-making refers to the process of making decisions after intelligent information is reasoned, and usually provides functions such as classification, ranking, and prediction.
  • some general capabilities can be formed based on the results of the data processing, such as an algorithm or a general system, for example, translation, text analysis, computer vision processing, speech recognition, image recognition, and so on.
  • Intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields. They encapsulate the overall artificial intelligence solution, productize intelligent information decision-making, and realize practical deployment. The application fields mainly include: intelligent manufacturing, intelligent transportation, smart home, smart medical, smart security, autonomous driving, safe city, smart terminal, etc.
  • the embodiments of this application can be applied to scenarios that require identity authentication.
  • the method provided in the embodiments of the present application can be applied to scenarios that require identity authentication, such as smart terminal unlocking, application software (application, APP) login, and secure payment.
  • a terminal device for example, a mobile phone
  • Unlocking the screen through identity authentication can improve system security and protect the user's property, privacy, and the like.
  • Using the identity authentication method of the embodiment of the present application to unlock the screen can more accurately identify authenticated users and non-authenticated users, and improve the security of the system.
  • the neural network can be used to classify the pictures, so that different categories of pictures can be labeled, which is convenient for users to view and find.
  • the classification tags of these pictures can also be provided to the album management system for classification management, saving users management time, improving the efficiency of album management, and enhancing user experience.
  • a terminal device for example, a mobile phone
  • the user needs to pass identity authentication to perform the payment operation, so as to protect the user's property, privacy, and the like.
  • Support vector machine (SVM)
  • A support vector machine is a two-class classification model whose purpose is to find a hyperplane that separates the sample data.
  • the learning strategy of the SVM is to maximize the margin, which can be formalized as a convex quadratic programming problem.
  • a neural network can be composed of neural units.
  • a neural unit can refer to an arithmetic unit that takes $x_s$ and an intercept of 1 as inputs.
  • the output of the arithmetic unit can be: $h_{W,b}(x) = f\left(\sum_{s=1}^{n} W_s x_s + b\right)$
  • where $s = 1, 2, \ldots, n$, and $n$ is a natural number greater than 1
  • $W_s$ is the weight of $x_s$
  • $b$ is the bias of the neural unit.
  • f is the activation function of the neural unit, which is used to introduce nonlinear characteristics into the neural network to convert the input signal in the neural unit into an output signal.
  • the output signal of the activation function can be used as the input of the next convolutional layer, and the activation function can be a sigmoid function.
  • a neural network is a network formed by connecting multiple above-mentioned single neural units together, that is, the output of one neural unit can be the input of another neural unit.
  • the input of each neural unit can be connected with the local receptive field of the previous layer to extract the characteristics of the local receptive field.
  • the local receptive field can be a region composed of several neural units.
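  • the neural unit described above can be sketched in a few lines: a weighted sum of the inputs plus the bias, passed through a sigmoid activation. The function names and the example weights are illustrative assumptions.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Sigmoid activation function f."""
    return 1.0 / (1.0 + math.exp(-z))


def neural_unit(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Output f(sum_s W_s * x_s + b) of a single neural unit."""
    z = sum(w_s * x_s for w_s, x_s in zip(w, x)) + b
    return sigmoid(z)


# Example weights chosen so that the weighted sum is exactly zero.
out = neural_unit(x=[1.0, 2.0], w=[0.5, -0.25], b=0.0)
print(out)  # 0.5, because sigmoid(0) = 0.5
```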
  • A deep neural network (DNN), also known as a multi-layer neural network, can be understood as a neural network with multiple hidden layers.
  • the DNN is divided according to the positions of different layers.
  • the neural network inside the DNN can be divided into three categories: input layer, hidden layer, and output layer.
  • the first layer is the input layer
  • the last layer is the output layer
  • the layers in the middle are all hidden layers.
  • the layers are fully connected, that is to say, any neuron in the i-th layer must be connected to any neuron in the i+1th layer.
  • although the DNN looks complicated, the work of each layer is not. Simply put, each layer computes the following linear relationship expression: $\vec{y} = \alpha(W\vec{x} + \vec{b})$, where $\vec{x}$ is the input vector, $\vec{y}$ is the output vector, $\vec{b}$ is the offset vector, $W$ is the weight matrix (also called the coefficients), and $\alpha()$ is the activation function.
  • each layer simply applies this operation to the input vector $\vec{x}$ to obtain the output vector $\vec{y}$. Because a DNN has many layers, the numbers of coefficients $W$ and offset vectors $\vec{b}$ are also large.
  • these parameters are defined in the DNN as follows, taking the coefficient $W$ as an example: suppose that in a three-layer DNN, the linear coefficient from the fourth neuron in the second layer to the second neuron in the third layer is defined as $W^{3}_{24}$. The superscript 3 represents the layer in which the coefficient $W$ is located, and the subscript corresponds to the output third-layer index 2 and the input second-layer index 4.
  • the coefficient from the $k$-th neuron in the $(L-1)$-th layer to the $j$-th neuron in the $L$-th layer is defined as $W^{L}_{jk}$.
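  • the per-layer operation can be sketched as a small forward pass. In this sketch `W[j][k]` plays the role of the coefficient from the k-th neuron of the previous layer to the j-th neuron of the current layer, and the sigmoid stands in for the activation function; the layer sizes and weights are made-up values.

```python
import math
from typing import List, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weight matrix W, offset vector b)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(W: List[List[float]], b: List[float], x: List[float]) -> List[float]:
    """Compute y = alpha(W x + b) for one fully connected layer."""
    return [
        sigmoid(sum(w_jk * x_k for w_jk, x_k in zip(row, x)) + b_j)
        for row, b_j in zip(W, b)
    ]


def dnn_forward(layers: List[Layer], x: List[float]) -> List[float]:
    """Feed the input vector through every layer in order."""
    for W, b in layers:
        x = layer_forward(W, b, x)
    return x


# A tiny network: 2 inputs -> 3 hidden units -> 1 output (illustrative weights).
layers: List[Layer] = [
    ([[0.2, -0.1], [0.4, 0.3], [-0.5, 0.1]], [0.0, 0.1, -0.1]),
    ([[0.3, -0.2, 0.5]], [0.05]),
]
output = dnn_forward(layers, [1.0, 2.0])
print(output)
```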
  • the loss function is an important equation for measuring the difference between the predicted value and the target value. Taking the loss function as an example, the higher the output value (loss) of the loss function, the greater the difference, so training the deep neural network becomes a process of reducing this loss as much as possible.
  • the neural network can use the error back propagation (BP) algorithm to correct the parameters in the neural network model during the training process, so that the reconstruction error loss of the neural network model becomes smaller and smaller. Specifically, passing the input signal forward to the output produces an error loss, and the parameters in the neural network model are updated by backpropagating the error loss information, so that the error loss converges.
  • the backpropagation algorithm is a backpropagation motion dominated by error loss, and aims to obtain the optimal parameters of the neural network model, such as the weight matrix.
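  • the loss-driven update can be illustrated on the smallest possible case: a single sigmoid unit trained with a cross-entropy loss, where the chain rule gives the gradient directly. This is a one-layer reduction of backpropagation written for illustration; the learning rate, epoch count, and toy data are assumptions.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_unit(
    samples: List[Tuple[List[float], int]], lr: float = 0.5, epochs: int = 200
) -> Tuple[List[float], float, List[float]]:
    """Train one sigmoid unit by gradient descent; return weights, bias, losses."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    losses = []
    eps = 1e-12  # keeps log() away from zero
    for _ in range(epochs):
        total = 0.0
        for x, y in samples:
            # Forward pass: produce the prediction and its error loss.
            y_hat = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            y_hat = min(max(y_hat, eps), 1.0 - eps)
            total += -(y * math.log(y_hat) + (1 - y) * math.log(1.0 - y_hat))
            # Backward pass: for sigmoid + cross-entropy, dLoss/dz = y_hat - y.
            err = y_hat - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
        losses.append(total / len(samples))
    return w, b, losses


# Toy data: the label equals the first input.
data = [([0.0, 0.0], 0), ([0.0, 1.0], 0), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]
w, b, losses = train_unit(data)
print(losses[0], losses[-1])
```

  • in a multi-layer network the same error term is propagated backwards layer by layer, and each weight matrix receives its own gradient.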
  • the system architecture 100 includes an execution device 110, a training device 120, a database 130, a client device 140, a data storage system 150, and a data collection device 160.
  • the execution device 110 includes a calculation module 111, an I/O interface 112, a preprocessing module 113, and a preprocessing module 114.
  • the calculation module 111 may include the target model/rule 101, and the preprocessing module 113 and the preprocessing module 114 are optional.
  • a data collection device 160 is used to collect training data.
  • the training data may include the user's operation behavior data and the identity authentication result corresponding to the user's operation behavior data.
  • users include authenticated users and non-authenticated users
  • the identity authentication result corresponding to the user's operation behavior data includes authenticated users or non-authenticated users.
  • the data collection device 160 stores the training data in the database 130, and the training device 120 trains to obtain the target model/rule 101 based on the training data maintained in the database 130.
  • the training device 120 processes the input operation behavior data of the user and compares the output identity authentication result with the real identity authentication result corresponding to that operation behavior data, until the difference between the identity authentication result output by the training device 120 and the real identity authentication result is less than a certain threshold, thereby completing the training of the target model/rule 101.
  • the above-mentioned target model/rule 101 can be used for identity authentication.
  • the target model/rule 101 in the embodiment of the present application may specifically include a neural network or SVM, etc.
  • the training data maintained in the database 130 may not all come from the collection of the data collection device 160, and may also be received from other devices.
  • the training device 120 does not necessarily train the target model/rule 101 entirely on the training data maintained in the database 130. It may also obtain training data from the cloud or other places for model training. The above description should not be construed as a limitation on the embodiments of this application.
  • the target model/rule 101 trained by the training device 120 can be applied to different systems or devices, such as the execution device 110 shown in FIG. 2. The execution device 110 can be a terminal, such as a mobile phone terminal, a tablet computer, a notebook computer, an augmented reality (AR)/virtual reality (VR) device, or a vehicle-mounted terminal, or can be a server or a cloud.
  • the execution device 110 is configured with an input/output (input/output, I/O) interface 112 for data interaction with external devices.
  • the user can input data to the I/O interface 112 through the client device 140.
  • the input data may include the user's operation behavior data input by the client device in the embodiment of the present application.
  • the client device 140 here may specifically be a terminal device.
  • the preprocessing module 113 and the preprocessing module 114 are used to perform preprocessing according to the input data received by the I/O interface 112.
  • the preprocessing module 113 may be used to extract characteristic data from the operation behavior data of the user to be authenticated.
  • the execution device 110 may call data, code, etc. in the data storage system 150 for corresponding processing.
  • the data, instructions, etc. obtained by corresponding processing may also be stored in the data storage system 150.
  • the I/O interface 112 returns the processing result, such as the identity authentication result obtained above, to the client device 140, so as to provide it to the user.
  • the aforementioned identity authentication result may include successful identity authentication or identity authentication failure.
  • Successful identity authentication means that the user to be authenticated is an authenticated user
  • identity authentication failure means that the user to be authenticated is a non-authenticated user.
  • the identity authentication result may include unlocking success or unlocking failure.
  • the identity authentication result may include login success or login failure. It should be understood that the above is only an illustration, and in different application scenarios, specific identity authentication results may include different forms. The embodiment of the application does not limit this.
  • the training device 120 can generate corresponding target models/rules 101 based on different training data for different goals or tasks, and the corresponding target models/rules 101 can be used to achieve the above goals or complete the above tasks, thereby providing users with the desired results.
  • the user can manually set input data, and the manual setting can be operated through the interface provided by the I/O interface 112.
  • the client device 140 can automatically send input data to the I/O interface 112. If automatic sending by the client device 140 requires the user's authorization, the user can set the corresponding permission in the client device 140.
  • the user can view the result output by the execution device 110 on the client device 140, and the specific presentation form may be a specific manner such as display, sound, and action.
  • the client device 140 can also be used as a data collection terminal to collect the input data input to the I/O interface 112 and the output result output from the I/O interface 112, as shown in the figure, as new sample data and store it in the database 130.
  • alternatively, the I/O interface 112 may directly store the input data input to the I/O interface 112 and the output result output from the I/O interface 112, as shown in the figure, in the database 130 as new sample data.
  • FIG. 2 is only a schematic diagram of a system architecture provided by an embodiment of the present application, and the positional relationship between the devices, components, modules, etc. shown in the figure does not constitute any limitation.
  • in FIG. 2, the data storage system 150 is an external memory relative to the execution device 110. In other cases, the data storage system 150 may also be placed in the execution device 110.
  • an embodiment of the present application provides a system architecture 300.
  • the system architecture includes a local device 301, a local device 302, an execution device 310, and a data storage system 350.
  • the local device 301 and the local device 302 are connected to the execution device 310 through a communication network.
  • the execution device 310 may be implemented by one or more servers.
  • the execution device 310 can be used in conjunction with other computing devices, such as data storage, routers, load balancers and other devices.
  • the execution device 310 may be arranged on one physical site or distributed on multiple physical sites.
  • the execution device 310 can use the data in the data storage system 350 or call the program code in the data storage system 350 to implement the identity authentication method and the identity authentication model training method in the embodiments of the present application.
  • the data storage system 350 may be deployed in the local device 301 or the local device 302.
  • the data storage system 350 may be used to store training samples.
  • the execution device 310 may execute the following process:
  • obtaining second sample data, where the second sample data includes second operation behavior sample data and a label corresponding to the second operation behavior sample data, and the label indicates whether the user corresponding to the second operation behavior sample data is an authenticated user or a non-authenticated user;
  • the recognition result of the second operation behavior sample data is used as the input of the decision fusion model, and the label corresponding to the second operation behavior sample data is used as the target output value of the decision fusion model for training, to obtain a trained decision fusion model.
  • an identity authentication model can be obtained, which improves the accuracy of identity authentication.
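  • the training of the decision fusion model can be sketched as follows, assuming the recognition results are matching scores from two authentication models and the fusion model is a logistic regression. The source does not state which model family is used for fusion, so the model choice, function names, scores, and hyperparameters here are all assumptions.

```python
import math
from typing import List, Tuple

Params = Tuple[float, float, float]  # (weight for model 1, weight for model 2, bias)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_fusion(
    results: List[Tuple[float, float]],
    labels: List[int],
    lr: float = 0.5,
    epochs: int = 1000,
) -> Params:
    """Fit a logistic-regression fusion model by batch gradient descent.

    results[i] holds the two recognition results for sample i (model inputs);
    labels[i] is the target output value (1 = authenticated user).
    """
    w1 = w2 = b = 0.0
    n = len(labels)
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for (s1, s2), y in zip(results, labels):
            err = sigmoid(w1 * s1 + w2 * s2 + b) - y
            g1 += err * s1
            g2 += err * s2
            gb += err
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        b -= lr * gb / n
    return w1, w2, b


def fuse(params: Params, s1: float, s2: float) -> bool:
    """Return True when the fused decision is "authenticated user"."""
    w1, w2, b = params
    return sigmoid(w1 * s1 + w2 * s2 + b) >= 0.5


# Hypothetical matching scores from the first and second authentication models.
results = [(0.9, 0.8), (0.8, 0.7), (0.7, 0.9), (0.2, 0.1), (0.3, 0.2), (0.1, 0.3)]
labels = [1, 1, 1, 0, 0, 0]
params = train_fusion(results, labels)
print(fuse(params, 0.9, 0.9), fuse(params, 0.1, 0.1))
```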
  • Each local device can represent any computing device, such as personal computers, computer workstations, smart phones, tablets, smart cameras, smart cars or other types of cellular phones, media consumption devices, wearable devices, set-top boxes, game consoles, etc.
  • Each user's local device can interact with the execution device 310 through a communication network of any communication mechanism/communication standard.
  • the communication network can be a wide area network, a local area network, a point-to-point connection, or any combination thereof.
  • the local device 301 and the local device 302 obtain the relevant parameters of the identity authentication model from the execution device 310, deploy the identity authentication model on the local device 301 and the local device 302, and use the identity authentication model to perform identity authentication.
  • an identity authentication model can be directly deployed on the execution device 310.
  • the execution device 310 obtains the user behavior to be processed from the local device 301 and the local device 302, and uses the identity authentication model to perform identity authentication on the user behavior to be processed.
  • the foregoing execution device 310 may also be a cloud device. In this case, the execution device 310 may be deployed in the cloud; or, the foregoing execution device 310 may also be a terminal device. In this case, the execution device 310 may be deployed on the user terminal side. This is not limited.
  • the data storage system 350 may be deployed in the local device 301 or the local device 302.
  • the data storage system 350 may be used to store training samples.
  • the data storage system 350 may be independent of the local device 301 or the local device 302 and deployed on a storage device.
  • the storage device may interact with the local device to obtain user behavior logs in the local device and store it in the storage device.
  • Figure 4 shows a schematic diagram of a system architecture provided by an embodiment of the present application.
  • the system 400 may include an APP 410, a data collection module 420, an identity authentication module 430, and a storage module 440.
  • the application layer includes multiple APPs 410, and an APP 410 can request the identity authentication service in a specific scenario, such as unlocking an application lock or making an in-app payment. In this way, imperceptible identity authentication can be realized.
  • Application lock unlocking can also be referred to as application unlocking, that is, the application is opened after identity authentication.
  • the APP 410 may be an APP integrated with an identity authentication software development kit (SDK).
  • the framework layer provides a data collection module 420 and an identity authentication module 430, which can be used by the upper-level APP.
  • the data collection module 420 is used to monitor sensor data.
  • the sensors include: ambient light sensors, health sensors, sound sensors, touch screen sensors, and motion sensors.
  • A motion sensor refers to a sensor that can monitor the movement status of the device in real time and can be embedded in the device, for example, an accelerometer, a gyroscope, or a magnetometer.
  • the sound sensor may include a microphone, a speaker, or the like.
  • the data collection module can monitor sensor data through a sensor manager at the native layer.
  • the sensor manager is the general manager of sensor events and is used for reading events, distributing events, and so on. For example, the sensor manager can create listeners to listen for events from a certain sensor.
  • the sensor manager interacts with the sensor through the sensor driver in the kernel.
  • the identity authentication module 430 is used for identity authentication.
  • the identity authentication module 430 executes the identity authentication method in the embodiment of the present application to realize identity authentication.
  • the identity authentication module 430 is also used for modeling training of identity authentication models.
  • the identity authentication module 430 executes the training method of the identity authentication model in the embodiment of the present application to obtain a trained identity authentication model. It should be understood that the modeling training of the identity authentication model completed by the identity authentication module 430 in FIG. 4 is only for illustration. Alternatively, the modeling training of the identity authentication model can also be completed by other devices.
  • the trained identity authentication model may be stored in the storage module 440.
  • the storage module 440 is used to store the user's operation behavior data and the identity authentication model to achieve safe storage. As shown in Figure 4, the user's operation behavior data can be stored in the operation behavior database.
  • FIG. 5 shows a schematic block diagram of an identity authentication module 500 in an embodiment of the present application.
  • FIG. 5 can be used as an example of the identity authentication module 430 in FIG. 4.
  • the identity authentication module 500 can implement functions such as feature extraction 510, behavior modeling 520, behavior matching 530, incremental learning 540, model upgrade 550, and anti-counterfeiting detection 560.
  • the identity authentication module 500 can be used to implement feature extraction 510.
  • the feature data is extracted from the raw data, and the feature data can be input into the identity authentication model to realize identity authentication.
  • Raw data refers to the raw data collected by the sensor.
  • feature extraction also includes other preprocessing operations on feature data, for example, removing invalid feature data from the extracted feature data.
  • the identity authentication module 500 may establish an identity authentication model based on the characteristic data, or in other words, implement modeling training of the identity authentication model based on the characteristic data, that is, the behavior modeling 520 in FIG. 5.
  • the identity authentication module 500 may execute the method 700 in FIG. 6 or the method 730 in FIG. 7 to train an identity authentication model.
  • the identity authentication module 500 can be used for identity authentication/behavior matching 530. Specifically, the characteristic data is input into the identity authentication model to realize identity authentication. Exemplarily, the identity authentication module 500 may execute the method 800 in FIG. 8 to realize identity authentication. It should be understood that the identity authentication model may be trained by the identity authentication module 500, that is, the identity authentication module 500 may be used for modeling to obtain an identity authentication model, and then use the identity authentication model to implement identity authentication. Alternatively, the identity authentication module 500 can also use identity authentication models trained by other devices to implement identity authentication.
  • the user's operation behavior data is unstable.
  • the identity authentication module 500 may also be used for incremental learning 540.
  • Incremental learning refers to a learning system that can continuously learn new knowledge from new samples and can save most of the previously learned knowledge.
  • the identity authentication module 500 can continuously collect the user's operation behavior data, that is, newly added operation behavior data of the user. Through incremental learning, the identity authentication model can be updated, on the basis of the original database, only for the changes caused by the newly added operation behavior data.
  • Incremental learning can continuously optimize the identity authentication model, adapt to changing user behaviors, and enhance recognition capabilities.
  • the identity authentication module 500 may also be used to implement a model upgrade 550. Specifically, it is used to upgrade the structure of the identity authentication model, realize interaction with cloud services, and so on.
  • the identity authentication module 500 may obtain the structural parameters of the upgraded identity authentication model to update the structural parameters of the original identity authentication model. That is to upgrade the structure of the identity authentication model.
  • the structural parameters of the identity authentication model can be artificially set.
  • the identity authentication module 500 can train the upgraded identity authentication model. For example, feature data is extracted from the original data, and the upgraded identity authentication model is trained based on the feature data.
  • the identity authentication module 500 interacts with the cloud service. Specifically, the structural parameters of the upgraded identity authentication model can be sent to the cloud service.
  • the identity authentication module 500 can also be used to implement anti-counterfeiting detection 560. Specifically, it is used to verify the legitimacy of the identity authentication module 500 and so on, for example, to ensure that the identity authentication model will not be tampered with and to achieve secure storage.
  • FIG. 6 shows a method 700 for training an identity authentication model according to an embodiment of the present application.
  • the method 700 includes steps S710 to S720.
  • the method 700 may be executed by the training device 120 in FIG. 2.
  • the training device can be a cloud service device or a mobile terminal, for example, a computer, a server, and other devices that can be used to train an identity authentication model.
  • Steps S710 to S720 will be described in detail below.
  • the first sample data includes the first operation behavior sample data and the label corresponding to the first operation behavior sample data. The label corresponding to the first operation behavior sample data indicates whether the user corresponding to the first operation behavior sample data is an authenticated user or a non-authenticated user.
  • step S710 further includes: acquiring fifth sample data.
  • the fifth sample data includes the fifth operation behavior sample data and the label corresponding to the fifth operation behavior sample data. The label corresponding to the fifth operation behavior sample data indicates whether the user corresponding to the fifth operation behavior sample data is an authenticated user or a non-authenticated user.
  • An authenticated user can also be understood as the owner, and a non-authenticated user can also be understood as an attacker.
  • the data of authenticated users can be referred to as positive samples, and the data of non-authenticated users can be referred to as negative samples.
  • That is, the operation behavior sample data of the authenticated user is a positive sample, and the operation behavior sample data of the non-authenticated user is a negative sample.
  • the first sample data may be determined according to the raw data collected by the sensor.
  • the sensor includes a touch screen sensor and/or a motion sensor.
  • the first sample data may also include preset data.
  • the positive sample may be determined based on the raw data collected by the sensor, and the negative sample may be preset data.
  • collecting data through sensors may include: when the screen is unlocked, registering listeners for the touch screen sensor and/or the motion sensor; when the screen is locked, unregistering the listeners for the touch screen sensor and/or the motion sensor.
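  • The register-on-unlock and unregister-on-lock behavior described above can be sketched as follows. This is only an illustration: `SensorManager`, `DataCollector`, the sensor names, and the event fields are hypothetical stand-ins and are not the interfaces of any real platform or of the embodiment.

```python
from typing import Callable, Dict, List

Listener = Callable[[dict], None]


class SensorManager:
    """Hypothetical stand-in for the native-layer sensor manager."""

    def __init__(self) -> None:
        self._listeners: Dict[str, List[Listener]] = {}

    def register(self, sensor: str, listener: Listener) -> None:
        self._listeners.setdefault(sensor, []).append(listener)

    def unregister(self, sensor: str) -> None:
        self._listeners.pop(sensor, None)

    def dispatch(self, sensor: str, event: dict) -> None:
        # Distribute an event to every listener registered for this sensor.
        for listener in self._listeners.get(sensor, []):
            listener(event)


class DataCollector:
    """Collects raw events only while the screen is unlocked."""

    SENSORS = ("touch_screen", "motion")  # assumed sensor names

    def __init__(self, manager: SensorManager) -> None:
        self.manager = manager
        self.raw_events: List[dict] = []

    def on_screen_unlocked(self) -> None:
        for sensor in self.SENSORS:
            self.manager.register(sensor, self.raw_events.append)

    def on_screen_locked(self) -> None:
        for sensor in self.SENSORS:
            self.manager.unregister(sensor)


manager = SensorManager()
collector = DataCollector(manager)
collector.on_screen_unlocked()
manager.dispatch("touch_screen", {"timestamp": 1, "x": 10.0, "y": 20.0})
collector.on_screen_locked()
# This event is dropped because no listener is registered any more.
manager.dispatch("touch_screen", {"timestamp": 2, "x": 11.0, "y": 21.0})
```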
  • the raw data collected by the touch screen sensor includes: timestamp, X/Y axis coordinates of the touch point, touch area, touch pressure, action, and screen direction.
  • the raw data collected by the motion sensor includes: time stamp, acceleration X/Y/Z axis data, gyroscope X/Y/Z axis data, etc.
  • the data collected by the sensor may include any one or several of the above, and may also include other data.
  • Action refers to the touch screen event.
  • For example, if the touch screen behavior is "press and lift", the touch screen event is a user's "click" operation; if the touch screen behavior is "press, slide, and lift", the touch screen event is a user's "slide" operation.
  • the screen orientation includes landscape or portrait orientation.
  • feature data can be extracted from the original data to obtain the first sample data.
  • the first operation behavior sample data includes at least one of the following data: touch point X/Y axis coordinates, touch area, touch pressure, touch screen speed, touch screen acceleration, slope of the touch screen trajectory, touch screen length, touch screen displacement, touch screen angle, touch screen direction, acceleration X/Y/Z axis data, or gyroscope X/Y/Z axis data.
  • the first operation behavior sample data may include statistical values of any one or several of the above features.
  • the statistical values may include: start value, end value, average value, standard deviation, 20% quantile, 50% quantile, and 80% quantile.
  • For example, the first sample data may include: the start value and end value of the X/Y axis coordinates of the touch point; the start value, end value, average value, standard deviation, 20% quantile, 50% quantile, and 80% quantile of the touch area; and the average value and standard deviation of the acceleration X/Y/Z axis data.
  • the start value and end value can be defined in advance.
  • For example, the data collected by the sensor includes n touch points. The start value of the touch screen speed refers to the speed between the first touch point and the second touch point, and the end value of the touch screen speed refers to the speed between the (n-1)th touch point and the nth touch point.
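  • A minimal sketch of how such statistical values might be computed for the touch screen speed is given below. The `(timestamp, x, y)` point layout, the function names, and the linear-interpolation quantile are assumptions made for illustration; the embodiment does not specify a particular quantile definition.

```python
import math
import statistics
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]  # assumed layout: (timestamp, x, y)


def quantile(values: Sequence[float], q: float) -> float:
    """Linear-interpolation quantile (one of several common definitions)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def touch_speeds(points: Sequence[Point]) -> List[float]:
    """Speed between consecutive touch points; timestamps must increase."""
    return [
        math.hypot(x1 - x0, y1 - y0) / (t1 - t0)
        for (t0, x0, y0), (t1, x1, y1) in zip(points, points[1:])
    ]


def summarize(values: Sequence[float]) -> Dict[str, float]:
    """Start, end, mean, standard deviation, and 20/50/80% quantiles."""
    return {
        "start": values[0],
        "end": values[-1],
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
        "q20": quantile(values, 0.2),
        "q50": quantile(values, 0.5),
        "q80": quantile(values, 0.8),
    }


points = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0), (2.0, 3.0, 4.0), (3.0, 9.0, 12.0)]
speeds = touch_speeds(points)   # speed between each pair of neighbours
features = summarize(speeds)    # start value is the first speed, end value the last
```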
  • the original data can be preprocessed to filter out effective operation behavior data.
  • feature data is extracted from the effective operation behavior data to obtain the first operation behavior sample data.
  • the effective operation behavior data is the data collected when the effective touch screen behavior occurs, and the definition method can be set as needed. For example, if the number of touch points of the user is greater than a preset threshold, it is an effective touch screen behavior, and the operation behavior data satisfying the number of touch points of the user greater than the preset threshold is filtered from the original data as the effective operation behavior data.
  • a touch point represents a touch screen event.
  • the number of touch points greater than the preset threshold means that the touch screen event exceeds the preset threshold.
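  • The filtering rule in the example above can be sketched as follows. The threshold value of 5 and the representation of one touch screen behavior as a list of touch points are assumptions for illustration only.

```python
from typing import List, Sequence

MIN_TOUCH_POINTS = 5  # hypothetical preset threshold


def filter_effective(
    behaviors: Sequence[Sequence[dict]], min_points: int = MIN_TOUCH_POINTS
) -> List[Sequence[dict]]:
    """Keep only behaviors whose number of touch points exceeds the threshold."""
    return [b for b in behaviors if len(b) > min_points]


short_tap = [{"x": 1, "y": 1}] * 2      # 2 touch points: filtered out
long_swipe = [{"x": 1, "y": 1}] * 12    # 12 touch points: kept
effective = filter_effective([short_tap, long_swipe])
```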
  • Before step S720, it can be determined whether to start establishing an identity authentication model.
  • the identity authentication model starts to be established.
  • the first sample data may include the user's operation behavior data collected by the sensor, and the time period during which the first sample data is collected may include the time period during which the sensor is collected.
  • the identity authentication model is started to be established.
  • For example, the preset duration is one week and the preset number is 2000. If the authenticated user has used the device for more than one week, that is, the collection period of positive samples exceeds one week, and the number of effective touch screen behaviors reaches 2000, that is, the amount of effective operation behavior data reaches 2000, then the anomaly detection model starts to be established.
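  • The start condition in this example can be written as a small check. The function name is hypothetical, and treating both limits as inclusive is an assumption, since the text uses both "reaches" and "exceeds".

```python
from datetime import timedelta

PRESET_DURATION = timedelta(weeks=1)  # preset duration from the example
PRESET_COUNT = 2000                   # preset number from the example


def should_start_modeling(collection_period: timedelta, effective_count: int) -> bool:
    """Start modeling only when both the duration and the count are sufficient."""
    return collection_period >= PRESET_DURATION and effective_count >= PRESET_COUNT
```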
  • In this way, the sample data can more truly reflect the habits of the authenticated user, so an identity authentication model that is more in line with user habits can be obtained, which improves the accuracy of identity authentication.
  • the identity authentication model includes at least two authentication models, the first authentication model can be trained based on the first sample data, and the second authentication model can be trained based on the fifth sample data.
  • the first authentication model and the second authentication model are respectively an anomaly detection model and a classification model.
  • the training method of the identity authentication model will be described below.
  • step S720 includes step S721 and step S722.
  • the anomaly detection model is used to detect abnormal data, and to determine the authenticated user and the non-authenticated user by detecting the data of the non-authenticated user.
  • the anomaly detection model can use one-class SVM or isolation forest. It should be understood that the embodiment of the present application does not limit the algorithm used in the anomaly detection model.
  • step S721 includes setting parameters of the anomaly detection model, for example, data anomaly rate.
  • the data anomaly rate helps ensure the accuracy of authenticated-user detection. For example, if the data anomaly rate is set to 0.1, then in theory the recognition rate of authenticated users can reach more than 90%.
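  • The effect of the data anomaly rate can be illustrated with a toy detector. The class below is a deliberately simple stand-in (distance to the centroid of the positive samples) and is not a one-class SVM or an isolation forest; all names are hypothetical. With an anomaly rate of 0.1, the 10% of training samples farthest from the centroid fall outside the accepted region, so about 90% of the authenticated user's training data is accepted.

```python
import math
from typing import List, Sequence


class CentroidAnomalyDetector:
    """Toy anomaly detector used only to illustrate the anomaly rate."""

    def __init__(self, anomaly_rate: float = 0.1) -> None:
        self.anomaly_rate = anomaly_rate
        self.center: List[float] = []
        self.radius = 0.0

    def _distance(self, sample: Sequence[float]) -> float:
        return math.dist(sample, self.center)

    def fit(self, samples: Sequence[Sequence[float]]) -> "CentroidAnomalyDetector":
        dims = len(samples[0])
        self.center = [sum(s[d] for s in samples) / len(samples) for d in range(dims)]
        distances = sorted(self._distance(s) for s in samples)
        # Treat the farthest `anomaly_rate` share of training samples as anomalies.
        keep = max(1, len(distances) - int(len(distances) * self.anomaly_rate))
        self.radius = distances[keep - 1]
        return self

    def is_authenticated(self, sample: Sequence[float]) -> bool:
        return self._distance(sample) <= self.radius


# Positive samples only; the last one is an unusual behavior of the owner.
train = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0], [8.0], [20.0]]
detector = CentroidAnomalyDetector(anomaly_rate=0.1).fit(train)
accepted = sum(detector.is_authenticated(s) for s in train)  # 9 of 10 samples
```

  • In common libraries the same role is played by parameters such as `contamination` (isolation forest) or `nu` (one-class SVM) in scikit-learn.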
  • the degree of feature discrimination affects the recognition ability of the anomaly detection model.
  • poor feature discrimination will lead to more "similar behaviors", that is, different operation behavior data are recognized as the same operation behavior data, which reduces the overall behavior recognition ability.
  • step S721 includes selecting an optimal feature combination.
  • the best feature combination can be selected through feature engineering.
  • For example, in each round several features are selected from the features in step S710 to form a feature combination, the feature combination is input into the anomaly detection model, and the accuracy of the anomaly detection model is evaluated. The feature combination corresponding to the anomaly detection model with the highest accuracy is regarded as the best feature combination.
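  • One possible way to search for the best feature combination is an exhaustive search over candidate combinations, sketched below. The `evaluate` callback stands for "train the anomaly detection model on this combination and measure its accuracy"; here it is replaced by a made-up scoring function, so the numbers carry no meaning beyond the illustration.

```python
from itertools import combinations
from typing import Callable, Iterable, Sequence, Tuple


def select_best_features(
    features: Sequence[str],
    evaluate: Callable[[Tuple[str, ...]], float],
    sizes: Iterable[int],
) -> Tuple[Tuple[str, ...], float]:
    """Try every combination of the given sizes and keep the most accurate one."""
    best_combo: Tuple[str, ...] = ()
    best_acc = float("-inf")
    for k in sizes:
        for combo in combinations(features, k):
            acc = evaluate(combo)
            if acc > best_acc:
                best_combo, best_acc = combo, acc
    return best_combo, best_acc


# Made-up "usefulness" values standing in for a real train-and-test run.
usefulness = {"touch_area": 0.20, "touch_pressure": 0.15,
              "touch_speed": 0.10, "screen_direction": 0.01}


def toy_accuracy(combo: Tuple[str, ...]) -> float:
    return 0.5 + sum(usefulness[f] for f in combo) - 0.03 * len(combo)


best_combo, best_acc = select_best_features(list(usefulness), toy_accuracy, range(1, 5))
```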
  • the first sample data may be preprocessed, for example, the first sample data may be standardized.
  • the fifth sample data includes multiple sample data.
  • the sample data used to train the anomaly detection model and the sample data used to train the classification model may be the same or different.
  • the classification model can be a two-classification model, and authenticated users and non-authenticated users are judged by classification.
  • the classification model can use SVM or neural network. It should be understood that the embodiment of the present application does not limit the algorithm used in the classification model.
  • step S722 includes determining the optimal hyperparameters of the classification model.
  • the optimal hyperparameters of the classification model can be determined by the grid method. Specifically, each time the hyperparameters of the classification model are selected and the accuracy of the classification model is tested, the hyperparameters corresponding to the classification model with the highest accuracy are used as the optimal hyperparameters.
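  • The grid method can be sketched as a loop over every combination of candidate hyperparameter values. The hyperparameter names `C` and `gamma` and the scoring function are assumptions for illustration; in practice `evaluate` would train the classification model with the given hyperparameters and return its measured accuracy.

```python
import math
from itertools import product
from typing import Callable, Dict, Sequence, Tuple


def grid_search(
    grid: Dict[str, Sequence[float]],
    evaluate: Callable[[Dict[str, float]], float],
) -> Tuple[Dict[str, float], float]:
    """Evaluate every point of the grid and return the most accurate one."""
    names = list(grid)
    best_params: Dict[str, float] = {}
    best_acc = float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        acc = evaluate(params)
        if acc > best_acc:
            best_params, best_acc = params, acc
    return best_params, best_acc


def toy_accuracy(params: Dict[str, float]) -> float:
    # Made-up score that peaks at C=10, gamma=0.1; not a real model evaluation.
    return (1.0
            - 0.1 * abs(math.log10(params["C"]) - 1)
            - 0.1 * abs(math.log10(params["gamma"]) + 1))


grid = {"C": [0.1, 1.0, 10.0, 100.0], "gamma": [0.01, 0.1, 1.0]}
best_params, best_acc = grid_search(grid, toy_accuracy)
```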
  • the probability that the behavior of a non-authenticated user is misidentified as that of an authenticated user is 1/(N+1), where N is the number of categories of non-authenticated users.
  • Increasing the category of non-authenticated users can effectively reduce the false acceptance rate (FAR), but it will increase the false rejection rate (FRR).
  • A lower FAR improves the ability to resist attacks, whereas a higher FRR lowers the recognition rate of authenticated users.
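  • For reference, FAR and FRR can be computed from labeled test results as follows. This is a generic sketch of the standard definitions; the function name and the boolean encoding (True for authenticated user) are chosen for illustration.

```python
from typing import Sequence, Tuple


def far_frr(labels: Sequence[bool], predictions: Sequence[bool]) -> Tuple[float, float]:
    """Return (FAR, FRR); True means 'authenticated user'."""
    false_accepts = sum(1 for y, p in zip(labels, predictions) if not y and p)
    false_rejects = sum(1 for y, p in zip(labels, predictions) if y and not p)
    negatives = sum(1 for y in labels if not y)
    positives = sum(1 for y in labels if y)
    far = false_accepts / negatives if negatives else 0.0
    frr = false_rejects / positives if positives else 0.0
    return far, frr


labels = [True, True, True, True, False, False, False, False, False]
predictions = [True, True, True, False, False, False, False, False, True]
far, frr = far_frr(labels, predictions)  # one of 5 attackers accepted, one of 4 owner samples rejected
```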
  • the first sample data may be preprocessed, for example, the first sample data may be standardized.
  • step S721 and step S722 are in no particular order.
  • step S721 and step S722 can be performed at the same time.
  • step S721 is executed first, and step S722 is executed afterwards.
  • step S722 is performed first, and step S721 is performed after.
  • Through step S721 and step S722, a trained anomaly detection model and a trained classification model can be obtained.
  • the recognition result of the anomaly detection model and the recognition result of the classification model can be input into the decision fusion model, and the fusion result can be used as the identity authentication result.
  • the recognition result in the embodiment of the present application may include that the input operation behavior data is recognized as an authenticated user or a non-authenticated user.
  • the recognition result may include the probability of the inputted operation behavior data being recognized as an authenticated user or the probability of a non-authenticated user.
  • the decision fusion model is used to perform weighted calculation on the identification results output by the at least two authentication models according to the weights corresponding to the at least two authentication models, and determine the identity authentication result according to the result of the weighted calculation.
  • the model parameters of the decision fusion model for example, the weights corresponding to the two authentication models, may be preset. Alternatively, the weight can be obtained through training.
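  • The weighted calculation performed by the decision fusion model can be sketched as follows, assuming each authentication model outputs a matching score in [0,1]. The weights and the decision threshold below are arbitrary illustrative values, not values from the embodiment.

```python
from typing import Sequence, Tuple


def fuse(
    scores: Sequence[float], weights: Sequence[float], threshold: float
) -> Tuple[float, bool]:
    """Weighted sum of the model outputs, then a threshold decision."""
    if len(scores) != len(weights):
        raise ValueError("one weight is required per authentication model")
    weighted = sum(w * s for w, s in zip(weights, scores))
    return weighted, weighted >= threshold


# Anomaly detection model scores 0.9, classification model scores 0.4.
weighted, authenticated = fuse([0.9, 0.4], weights=[0.6, 0.4], threshold=0.6)
```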
  • the method 700 further includes establishing a decision fusion model.
  • FIG. 7 shows a schematic flowchart of a method 730 for training a decision fusion model according to an embodiment of the present application.
  • the identity authentication model in the embodiment of the present application includes at least two authentication models and a decision fusion model, and the training of the decision fusion model is also the training of the identity authentication model.
  • the training method of the decision fusion model can also be understood as the training method of the identity authentication model.
  • the method 730 includes steps S731 to S733.
  • the second sample data includes the second operation behavior sample data and the label corresponding to the second operation behavior sample data.
  • the label corresponding to the second operation behavior sample data indicates whether the user corresponding to the second operation behavior sample data is an authenticated user or a non-authenticated user.
  • S732 Input the second operation behavior sample data in the second sample data into the first authentication model and the second authentication model to obtain the identification result of the second operation behavior sample data.
  • the recognition result of the second operation behavior sample data includes the recognition result output by the first authentication model and the recognition result output by the second authentication model.
  • the recognition result output by the first authentication model includes: the matching score corresponding to the second operation behavior sample data output by the first authentication model and/or the matching result corresponding to the second operation behavior sample data output by the first authentication model. The recognition result output by the second authentication model includes: the matching score corresponding to the second operation behavior sample data output by the second authentication model and/or the matching result corresponding to the second operation behavior sample data output by the second authentication model.
  • the matching score corresponding to the second operation behavior sample data indicates the probability that the user corresponding to the second operation behavior sample data is recognized as an authenticated user, and the matching result corresponding to the second operation behavior sample data indicates whether the user corresponding to the second operation behavior sample data is identified as an authenticated user or a non-authenticated user.
  • the matching result corresponding to the second operation behavior sample data includes at least two matching results determined based on at least two thresholds and a matching score corresponding to the second operation behavior sample data.
  • the second operation behavior sample data is input into the authentication model to obtain a matching score. According to the matching score and at least two thresholds, it can be determined that the user corresponding to the second operation behavior sample data is identified as an authenticated user or a non-authenticated user.
  • the at least two thresholds can be set as required.
  • the at least two thresholds may be determined according to the accuracy rates of the recognition results corresponding to the second operation behavior sample data output by the two authentication models.
  • the at least two thresholds may include a first threshold and a second threshold.
  • the first threshold is greater than the second threshold.
  • the matching result obtained based on the first threshold may be referred to as the matching result corresponding to the first threshold.
  • the matching result obtained based on the second threshold may be referred to as the matching result corresponding to the second threshold.
  • the confidence of the matching result corresponding to the first threshold is higher.
  • When the matching score is greater than or equal to the first threshold, the user corresponding to the second operation behavior sample data is identified as an authenticated user; when the matching score is less than the first threshold, the user corresponding to the second operation behavior sample data is identified as a non-authenticated user.
  • When the matching score is greater than or equal to the second threshold, the user corresponding to the second operation behavior sample data is identified as an authenticated user; when the matching score is less than the second threshold, the user corresponding to the second operation behavior sample data is identified as a non-authenticated user.
  • the second threshold may be a default threshold.
  • For example, the value range of the matching score is [0,1], and the second threshold is 0.5. When the matching score is greater than or equal to 0.5, the user corresponding to the second operation behavior sample data is identified as an authenticated user; when the matching score is less than 0.5, the user corresponding to the second operation behavior sample data is identified as a non-authenticated user.
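  • The two matching results derived from one matching score can be sketched as follows. The first threshold of 0.8 is an arbitrary illustrative value; 0.5 is the default second threshold from the example above.

```python
from typing import Dict, Union


def match_results(
    score: float, first_threshold: float, second_threshold: float = 0.5
) -> Dict[str, Union[float, bool]]:
    """Return the matching score plus one matching result per threshold."""
    if first_threshold <= second_threshold:
        raise ValueError("the first threshold must be greater than the second")
    return {
        "score": score,
        "match_first": score >= first_threshold,    # stricter, higher confidence
        "match_second": score >= second_threshold,  # default threshold
    }


result = match_results(0.7, first_threshold=0.8)
```

  • A score of 0.7 therefore yields "non-authenticated user" under the first threshold and "authenticated user" under the second threshold.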
  • the following takes anomaly detection model and classification model as examples to illustrate the recognition result of the second operation behavior sample data.
  • the recognition result of the second operation behavior sample data includes: the recognition result of the second operation behavior sample data output by the anomaly detection model and the recognition result of the second operation behavior sample data output by the classification model.
  • the recognition result of the second operation behavior sample data output by the anomaly detection model includes: a matching score A corresponding to the second operation behavior sample data output by the anomaly detection model and a matching result A corresponding to the second operation behavior sample data.
  • the at least two thresholds include a first threshold A and a second threshold A, and the first threshold A is greater than the second threshold A.
  • the matching result A corresponding to the second operation behavior sample data output by the anomaly detection model includes: a matching result determined based on the first threshold A and a matching result determined based on the second threshold A.
  • When the matching score A is greater than or equal to the first threshold A, the matching result determined based on the first threshold A is an authenticated user; when the matching score A is less than the first threshold A, the matching result determined based on the first threshold A is a non-authenticated user.
  • When the matching score A is greater than or equal to the second threshold A, the matching result determined based on the second threshold A is an authenticated user; when the matching score A is less than the second threshold A, the matching result determined based on the second threshold A is a non-authenticated user.
  • the recognition result output by the classification model includes the matching score B corresponding to the second operation behavior sample data output by the classification model and the matching result B corresponding to the second operation behavior sample data.
  • the at least two thresholds include a first threshold B and a second threshold B, and the first threshold B is greater than the second threshold B.
  • the matching result B corresponding to the second operation behavior sample data output by the classification model includes: a matching result determined based on the first threshold B and a matching result determined based on the second threshold B.
  • When the matching score B is greater than or equal to the first threshold B, the matching result determined based on the first threshold B is an authenticated user; when the matching score B is less than the first threshold B, the matching result determined based on the first threshold B is a non-authenticated user.
  • When the matching score B is greater than or equal to the second threshold B, the matching result determined based on the second threshold B is an authenticated user; when the matching score B is less than the second threshold B, the matching result determined based on the second threshold B is a non-authenticated user.
  • first threshold A and the first threshold B may be different or the same.
  • the second threshold A and the second threshold B may be the same or different.
  • The sample data in the test sample set is input into the trained authentication model to obtain the matching scores corresponding to the sample data.
  • Based on the matching scores and multiple candidate thresholds, multiple candidate matching results corresponding to the multiple candidate thresholds are obtained.
  • the candidate threshold corresponding to the candidate matching result that meets the preset condition among the multiple candidate matching results is determined as the first threshold.
  • the first threshold of the first authentication model can be determined through the following steps.
  • the label corresponding to the fourth operation behavior sample data is used to indicate that the user corresponding to the fourth operation behavior sample data is an authenticated user or a non-authenticated user.
  • the first threshold of the second authentication model can also be determined in the above-mentioned manner, and will not be repeated here.
  • the following takes anomaly detection model and classification model as examples to illustrate the method of determining the first threshold.
  • the first threshold corresponding to the anomaly detection model is called the first threshold A, and the fourth operation behavior sample data used to determine the first threshold A is called the fourth operation behavior sample data A; the first threshold corresponding to the classification model is called the first threshold B, and the fourth operation behavior sample data used to determine the first threshold B is called the fourth operation behavior sample data B.
  • the parameters of the anomaly detection model include the data anomaly rate. Since authenticated users may consciously or unconsciously produce abnormal behaviors, adjusting the data anomaly rate can improve the accuracy of authenticated-user detection.
  • In step S720, the data anomaly rate can be set to 0.5, and then the anomaly detection model is trained.
  • the fourth operation behavior sample data A is input into the trained anomaly detection model, and the matching score A corresponding to the fourth operation behavior sample data A is obtained. Based on the multiple candidate thresholds, multiple candidate matching results A corresponding to the multiple candidate thresholds are obtained. Among the multiple candidate matching results A, the candidate threshold corresponding to a candidate matching result A whose accuracy rate is higher than 90% is used as the first threshold A.
  • the fourth operation behavior sample data B is input into the trained classification model, and the matching score B corresponding to the fourth operation behavior sample data B is obtained. Based on the multiple candidate thresholds, multiple candidate matching results B corresponding to the multiple candidate thresholds are obtained.
  • Among the multiple candidate matching results B, the candidate threshold whose FAR meets the preset condition is used as the first threshold B.
  • Different first thresholds can adjust different aspects of the performance of the authentication model so that the performance of the authentication model meets expectations. For example, different thresholds can be used to adjust the anti-attack ability of the model and to balance it against the recognition rate of the owner.
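  • A sketch of selecting a first threshold from candidate thresholds under an FAR condition is shown below. The preset condition ("FAR not greater than `max_far`"), the rule of taking the lowest qualifying candidate, and the sample scores are all assumptions for illustration; the embodiment leaves the preset condition open.

```python
from typing import Optional, Sequence


def far_at(scores: Sequence[float], labels: Sequence[bool], threshold: float) -> float:
    """Share of non-authenticated samples whose score passes the threshold."""
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not negatives:
        raise ValueError("at least one non-authenticated sample is required")
    return sum(s >= threshold for s in negatives) / len(negatives)


def select_first_threshold(
    scores: Sequence[float],
    labels: Sequence[bool],
    candidates: Sequence[float],
    max_far: float,
) -> Optional[float]:
    """Lowest candidate whose FAR meets the condition, or None if none does."""
    for threshold in sorted(candidates):
        if far_at(scores, labels, threshold) <= max_far:
            return threshold
    return None


# Matching scores of labeled samples (True = authenticated user).
scores = [0.95, 0.9, 0.85, 0.6, 0.7, 0.55, 0.3, 0.2, 0.1]
labels = [True, True, True, True, False, False, False, False, False]
first_threshold = select_first_threshold(
    scores, labels, candidates=[0.5, 0.6, 0.7, 0.8, 0.9], max_far=0.0
)
```

  • Choosing the lowest qualifying candidate keeps the FRR as low as the FAR condition allows, which reflects the trade-off between anti-attack ability and owner recognition rate.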
  • the recognition results output by the two authentication models include a matching score and at least two matching results, unlike an existing authentication model, which can provide only one recognition result.
  • This solution can provide more features for the decision fusion model, which is conducive to training a better decision fusion model and improves the accuracy of decision fusion model authentication, thereby improving the accuracy of identity authentication.
  • S733 Use the recognition result of the second operation behavior sample data as the input of the decision fusion model, and use the label corresponding to the second operation behavior sample data as the target output value of the decision fusion model to train the decision fusion model to obtain the trained decision fusion model.
  • the decision fusion model may be SVM.
  • the descriptions of the first sample data are also applicable to the second sample data.
  • the first sample data and the second sample data can be the same or different.
  • the following takes anomaly detection model and classification model as examples.
  • the second operation behavior sample data is input into the anomaly detection model, and the first recognition result is output; the second operation behavior sample data is input into the classification model, and the second recognition result is output.
  • the first recognition result and the second recognition result are input to the decision fusion model, and the label corresponding to the second operation behavior sample data is used as the target output value of the decision fusion model to train the decision fusion model to obtain a trained decision fusion model.
  • the recognition result of the anomaly detection model and the classification model may be a matching score, that is, the probability that the user corresponding to the second operation behavior sample data is recognized as an authenticated user.
  • the higher the matching score the higher the probability that the user corresponding to the second operation behavior sample data is recognized as an authenticated user.
  • inputting the first recognition result and the second recognition result into the decision fusion model to obtain the identity authentication result includes: performing a weighted calculation according to the weights corresponding to the first recognition result and the second recognition result to obtain a weighted result, and obtaining the identity authentication result according to the weighted result. For example, when the weighted result is greater than or equal to the set threshold A, the authentication is successful, that is, the user corresponding to the second operation behavior sample data is identified as an authenticated user; when the weighted result is less than the set threshold A, the authentication fails, that is, the user corresponding to the second operation behavior sample data is identified as a non-authenticated user.
  • Alternatively, when the weighted result is greater than the set threshold A, the authentication is successful, that is, the user corresponding to the second operation behavior sample data is identified as an authenticated user; when the weighted result is less than or equal to the set threshold A, the authentication fails, that is, the user corresponding to the second operation behavior sample data is identified as a non-authenticated user.
  • training the decision fusion model includes training the weights corresponding to the at least two authentication models. That is, the weights corresponding to the recognition results of the anomaly detection model and the classification model are trained to obtain the best weight ratio.
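  • The weighted fusion and weight training described above can be sketched as follows. This is a minimal illustration, not the method of the application: the function names are hypothetical, the two weights are assumed to sum to 1, and a simple grid search stands in for whatever training procedure is actually used.

```python
from typing import Sequence, Tuple


def fuse_scores(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the per-model matching scores."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores))


def authenticate(scores: Sequence[float], weights: Sequence[float],
                 threshold_a: float) -> bool:
    """True (authenticated user) when the weighted result reaches threshold A."""
    return fuse_scores(scores, weights) >= threshold_a


def fit_weight(samples: Sequence[Tuple[float, float]], labels: Sequence[bool],
               threshold_a: float = 0.5, steps: int = 11) -> Tuple[float, float]:
    """Grid-search the weight w of model 1 (model 2 gets 1 - w).

    Assumption: accuracy on labeled score pairs is the training objective.
    Returns (best weight, accuracy at that weight).
    """
    best_w, best_acc = 0.0, -1.0
    for i in range(steps):
        w = i / (steps - 1)
        correct = sum(
            authenticate(s, (w, 1.0 - w), threshold_a) == y
            for s, y in zip(samples, labels)
        )
        acc = correct / len(samples)
        if acc > best_acc:  # keep the first weight that reaches the best accuracy
            best_w, best_acc = w, acc
    return best_w, best_acc
```

  • Here each sample is the pair of matching scores output by the two authentication models for one piece of second operation behavior sample data, and the label says whether that data came from the authenticated user.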
  • step S733 further includes: performing feature extraction on the matching score corresponding to the second operation behavior sample data to obtain a score feature; taking the score feature and the recognition result of the second operation behavior sample data as the input of the decision fusion model, and taking the label corresponding to the second operation behavior sample data as the target output value of the decision fusion model for training.
  • performing feature extraction on the matching score of the second sample data includes: performing a mathematical operation on the score, such as addition, subtraction, multiplication, and division, and the operation result is used as the score feature.
  • the following takes an anomaly detection model and a classification model as an example to illustrate step S733.
  • the recognition result output by the anomaly detection model includes the matching score A corresponding to the second operation behavior sample data output by the anomaly detection model and the matching result A corresponding to the second operation behavior sample data.
  • the matching result A includes a matching result corresponding to the first threshold A and a matching result corresponding to the second threshold A.
  • the recognition result output by the classification model includes the matching score B corresponding to the second operation behavior sample data output by the classification model and the matching result B corresponding to the second operation behavior sample data.
  • the matching result B includes a matching result corresponding to the first threshold B and a matching result corresponding to the second threshold B.
  • the matching score corresponding to the anomaly detection model and the matching score corresponding to the classification model are used as two score features.
  • the recognition results output by the anomaly detection model and the classification model, together with some or all of the four score features, can be used as the input of the decision fusion model, and the label corresponding to the second operation behavior sample data can be used as the target output value of the decision fusion model for training.
  • feature extraction is performed on the matching scores to provide more features for the decision fusion model, which is conducive to training a better decision fusion model, improves the authentication accuracy of the decision fusion model, and thus improves the accuracy of identity authentication.
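  • As a hedged illustration of score feature extraction, the sketch below derives four features from the two matching scores using addition, subtraction, multiplication, and division. Treating these four operations as the "four score features" is an assumption, as are the function names, the feature ordering, and the small `eps` guard against division by zero.

```python
from typing import Dict, List, Sequence


def score_features(score_a: float, score_b: float, eps: float = 1e-9) -> Dict[str, float]:
    """Arithmetic combinations of matching score A and matching score B."""
    return {
        "sum": score_a + score_b,
        "difference": score_a - score_b,
        "product": score_a * score_b,
        "quotient": score_a / (score_b + eps),  # eps avoids division by zero
    }


def fusion_input(score_a: float, results_a: Sequence[bool],
                 score_b: float, results_b: Sequence[bool]) -> List[float]:
    """Concatenate scores, per-threshold matching results, and score features."""
    features = score_features(score_a, score_b)
    return [
        score_a,
        score_b,
        *(float(r) for r in results_a),  # matching results of model A
        *(float(r) for r in results_b),  # matching results of model B
        *features.values(),
    ]
```

  • The resulting vector is what a decision fusion model would consume during training (with the sample label as target) and during authentication.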
  • step S720 further includes:
  • the third sample data includes third operation behavior sample data
  • the third operation behavior sample data is input into the second authentication model, and the recognition result of the third operation behavior sample data output by the second authentication model is obtained.
  • the third sample data and the first sample data may be the same or different.
  • the third sample data and the second sample data may be the same or different.
  • incorrect data may be introduced by the user's conscious or unconscious behavior, causing errors in the labels corresponding to the sample data.
  • the authenticated user exhibits abnormal behavior
  • the corresponding sample label is the authenticated user
  • the authentication model is trained based on the sample data and the corresponding label
  • the identity authentication result obtained through the trained decision fusion model is highly accurate; feeding this output back and training the authentication model again can further improve the accuracy of the authentication model, thereby further improving the accuracy of the identity authentication model.
  • the following takes the identity authentication model including the anomaly detection model and the classification model as an example for description.
  • the third operation behavior sample data is input into the anomaly detection model, and the first recognition result is output; the third operation behavior sample data is input into the classification model, and the second recognition result is output.
  • the first recognition result and the second recognition result are input into the trained decision fusion model to obtain the identity authentication result corresponding to the third operation behavior sample data.
  • the first operation behavior sample data and the third operation behavior sample data may include the same sample data, in which case training the anomaly detection model and the classification model according to the third operation behavior sample data and the identity authentication result corresponding to the third operation behavior sample data includes:
  • screening the first sample data, and training the anomaly detection model and the classification model based on the screened first sample data.
  • for example, sample data whose label differs from the identity authentication result obtained for the corresponding third operation behavior sample data may be excluded from the first sample data.
  • the sample data A is one piece of sample data among the first operation behavior sample data and the third operation behavior sample data.
  • the label corresponding to the sample data A is a non-authenticated user.
  • Input the sample data A into the identity authentication model, and the obtained identity authentication result is the authenticated user.
  • the sample data A can be excluded from the first sample data, and the anomaly detection model and the classification model can be trained based on the filtered first sample data.
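  • The screening step can be sketched as a filter that drops every sample whose label disagrees with the identity authentication result. This is only a sketch: `screen_samples` and the `authenticate` callable are hypothetical names, and the callable stands in for the trained identity authentication model.

```python
from typing import Callable, List, Sequence, Tuple

Features = Sequence[float]


def screen_samples(
    samples: Sequence[Features],
    labels: Sequence[bool],
    authenticate: Callable[[Features], bool],
) -> List[Tuple[Features, bool]]:
    """Keep only samples whose label agrees with the authentication result.

    labels[i] is True for the authenticated user, False otherwise.
    """
    return [(x, y) for x, y in zip(samples, labels) if authenticate(x) == y]
```

  • In the example above, sample data A is labeled as a non-authenticated user but authenticated by the model, so it would be removed before the anomaly detection model and the classification model are retrained.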
  • training the anomaly detection model and classification model according to the third operation behavior sample data and the identity authentication result corresponding to the third operation behavior sample data includes:
  • the third operation behavior sample data is used as the input of the classification model, and the identity authentication result corresponding to the third operation behavior sample data is used as the target output value of the classification model to train the classification model.
  • the data volume of the training sample can be expanded.
  • the identity authentication result corresponding to the third operation behavior sample data can also be used as the label corresponding to the third operation behavior sample data to train the authentication model, which improves the accuracy of the authentication model and thereby the overall accuracy of the identity authentication model.
  • the method 700 further includes updating the identity authentication model through incremental learning.
  • Incremental learning refers to a learning system that can continuously learn new knowledge from new samples and can save most of the previously learned knowledge.
  • the user's operational behavior data that is, the newly-added user's operational behavior data
  • the original database can only be used.
  • the identity authentication model is updated for the changes caused by the newly added user's operation behavior data.
  • Incremental learning can continuously optimize the identity authentication model, adapt to changing user behaviors, and enhance recognition capabilities.
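  • To illustrate the idea of incremental learning only, the sketch below keeps a per-feature mean and variance that is updated one sample at a time with Welford's algorithm, so earlier samples need not be stored or reprocessed. This is a deliberately simple stand-in, not the identity authentication model of the application; the class and method names are hypothetical.

```python
import math
from typing import List, Sequence


class RunningProfile:
    """Per-feature mean/variance of a user's behavior, updated incrementally."""

    def __init__(self, n_features: int) -> None:
        self.n = 0
        self.mean = [0.0] * n_features
        self.m2 = [0.0] * n_features  # running sum of squared deviations

    def update(self, x: Sequence[float]) -> None:
        """Fold one new sample into the profile (Welford's algorithm)."""
        self.n += 1
        for i, xi in enumerate(x):
            delta = xi - self.mean[i]
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (xi - self.mean[i])

    def variance(self) -> List[float]:
        """Sample variance per feature (0.0 until two samples are seen)."""
        if self.n < 2:
            return [0.0] * len(self.mean)
        return [m / (self.n - 1) for m in self.m2]

    def distance(self, x: Sequence[float]) -> float:
        """Mean absolute z-score of x; larger means less like the profile."""
        var = self.variance()
        return sum(
            abs(xi - mu) / math.sqrt(v + 1e-9)
            for xi, mu, v in zip(x, self.mean, var)
        ) / len(x)
```

  • Calling `update` on newly added operation behavior data adapts the profile to gradual changes in user behavior while retaining what was learned from earlier data.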
  • the identification results of at least two authentication models are input into the decision fusion model for decision fusion, and the identity authentication result is obtained.
  • the accuracy of identity authentication can be improved.
  • performing identity verification through the user's operation behavior data does not change the way the user uses the device, which helps achieve imperceptible verification.
  • the anomaly detection model and the classification model are combined to obtain the identity authentication model.
  • the anomaly detection model improves the recognition rate of authenticated users
  • the classification model improves the anti-attack ability, which improves the overall identification ability of the identity authentication model.
  • both the anomaly detection model and the classification model can use small-scale algorithms.
  • the anomaly detection model can use a one-class SVM
  • the classification model can use a two-class SVM, which can be implemented by introducing the libsvm library.
  • the algorithm overhead is small and no additional hardware support is needed.
  • the identity authentication model can be trained on the client side, which can realize the secure storage of data on the client side and avoid the privacy and security issues caused by uploading to the cloud.
  • FIG. 8 shows an identity authentication method 800 according to an embodiment of the present application.
  • the method 800 includes steps S810 to S830.
  • the method 800 may be executed by an apparatus or device capable of performing identity authentication, such as a terminal device, a computer, or a server, for example, the execution device 110 in FIG. 2.
  • the steps S810 to S830 will be introduced below.
  • the identity authentication model used in the identity authentication method 800 in FIG. 8 may be constructed by the above-mentioned method in FIG. 6 or FIG. 7. In order to avoid unnecessary repetition, repetitive descriptions are appropriately omitted when the method 800 is introduced below.
  • S810: Acquire first operation behavior data of the user to be authenticated, and acquire second operation behavior data of the user to be authenticated.
  • the operation behavior data of the user to be authenticated may include data collected by sensors.
  • the sensors include motion sensors and/or touch screen sensors.
  • the operation behavior data of the user to be authenticated may be data obtained after processing the raw data collected by the sensor.
  • the raw data collected by the touch screen sensor includes: time stamp, X/Y axis coordinates of the touch point, touch area, touch pressure, action, and screen direction.
  • the data collected by the motion sensor includes: time stamp, acceleration X/Y/Z axis data, gyroscope X/Y/Z axis data, etc. It should be understood that the above is merely illustrative, and the data collected by the sensor may include any one or several of the above, and may also include other data.
  • the operation behavior data of the user to be authenticated includes at least one of the following data: touch point X/Y axis coordinates, touch area, touch pressure, touch screen speed, touch screen acceleration, slope of touch screen trajectory, touch screen length, touch screen displacement, touch screen angle, touch screen direction, acceleration X/Y/Z axis data, or gyroscope X/Y/Z axis data.
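  • A minimal sketch of how some of these derived quantities might be computed from raw touch points is given below. The exact definitions are not given in the text, so the formulas here (path length, straight-line displacement, average speed, end-to-end angle and slope) and the function name are assumptions for illustration.

```python
import math
from typing import Dict, List, Tuple

TouchPoint = Tuple[float, float, float]  # (timestamp in seconds, x, y)


def stroke_features(points: List[TouchPoint]) -> Dict[str, float]:
    """Derive simple per-stroke features from time-ordered touch points."""
    if len(points) < 2:
        raise ValueError("a stroke needs at least two touch points")
    t0, x0, y0 = points[0]
    t1, x1, y1 = points[-1]
    dx, dy = x1 - x0, y1 - y0
    # Path length: sum of distances between consecutive touch points.
    length = sum(
        math.hypot(bx - ax, by - ay)
        for (_, ax, ay), (_, bx, by) in zip(points, points[1:])
    )
    duration = t1 - t0
    return {
        "length": length,
        "displacement": math.hypot(dx, dy),  # straight line, start to end
        "speed": length / duration if duration > 0 else 0.0,
        "angle_deg": math.degrees(math.atan2(dy, dx)),
        "slope": dy / dx if dx != 0 else math.inf,
    }
```

  • Features like these, together with touch area and pressure or the motion sensor readings, would form the operation behavior data passed to the authentication models.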
  • S820: Input the first operation behavior data into the first authentication model to obtain the first recognition result output by the first authentication model; input the second operation behavior data into the second authentication model to obtain the second recognition result output by the second authentication model.
  • the first and second authentication models are anomaly detection model and classification model respectively
  • the model parameters of the two authentication models are obtained by training based on the first sample data and the fifth sample data, respectively.
  • for the specific training process, refer to the aforementioned method 700.
  • the model parameters of the two authentication models are obtained by training based on the first sample data, the fifth sample data, and the third sample data, where the third sample data includes third operation behavior sample data.
  • for the specific training process, refer to the aforementioned method 700.
  • the anomaly detection model is used to detect abnormal data, and determine the authenticated user and the non-authenticated user by detecting the data of the non-authenticated user.
  • the classification model can be a two-category model, which is used to determine authenticated users and non-authenticated users through classification
  • the first operation behavior data is input to the anomaly detection model 910 to output the first recognition result; the second operation behavior data is input to the classification model 920 to output the second recognition result.
  • the first recognition result and the second recognition result are input into the decision fusion model 930 to obtain the identity authentication result.
  • the identification result includes that the user to be authenticated is an authenticated user or the user to be authenticated is a non-authenticated user.
  • if the recognition results of the anomaly detection model and the classification model are both authenticated users, then the identity authentication result is an authenticated user. If the recognition results of the anomaly detection model and the classification model are both non-authenticated users, then the identity authentication result is a non-authenticated user. If one of the recognition results of the anomaly detection model and the classification model is an authenticated user and the other is a non-authenticated user, the identity authentication result is unrecognizable.
  • alternatively, if at least one of the recognition results of the anomaly detection model and the classification model is an authenticated user, then the identity authentication result is an authenticated user. If the recognition results of the anomaly detection model and the classification model are both non-authenticated users, then the identity authentication result is a non-authenticated user.
  • alternatively, if at least one of the recognition results of the anomaly detection model and the classification model is a non-authenticated user, then the identity authentication result is a non-authenticated user. If the recognition results of the anomaly detection model and the classification model are both authenticated users, then the identity authentication result is an authenticated user.
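  • The rule-based fusion of two binary recognition results can be sketched as below. The policy names are hypothetical: "unanimous" returns unrecognizable when the models disagree, while "any" and "all" are one reading of the two alternative rules, in which a disagreement is resolved in favor of the authenticated or the non-authenticated user respectively.

```python
from enum import Enum


class Verdict(Enum):
    AUTHENTICATED = "authenticated user"
    NON_AUTHENTICATED = "non-authenticated user"
    UNRECOGNIZABLE = "unrecognizable"


def fuse_results(first_ok: bool, second_ok: bool, policy: str = "unanimous") -> Verdict:
    """Combine the binary results of the two authentication models."""
    if policy not in ("unanimous", "any", "all"):
        raise ValueError(f"unknown policy: {policy}")
    if first_ok and second_ok:
        return Verdict.AUTHENTICATED
    if not first_ok and not second_ok:
        return Verdict.NON_AUTHENTICATED
    # The two models disagree; the policy decides the outcome.
    if policy == "unanimous":
        return Verdict.UNRECOGNIZABLE
    if policy == "any":
        return Verdict.AUTHENTICATED
    return Verdict.NON_AUTHENTICATED  # policy == "all"
```

  • A stricter policy favors anti-attack ability, whereas a more lenient one favors the recognition rate of the authenticated user.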
  • the decision fusion model is used to perform weighted calculation on the identification results output by the at least two authentication models according to the weights corresponding to the at least two authentication models, and determine the identity authentication result according to the result of the weighted calculation.
  • the weight values can be adjusted as needed, which further improves the accuracy of identity authentication.
  • the weight parameter of the decision fusion model is obtained as follows: the second operation behavior sample data in the second sample data is input into the first authentication model and the second authentication model to obtain the recognition results of the second operation behavior sample data output by the first authentication model and the second authentication model; the recognition results of the second operation behavior sample data are used as the input of the decision fusion model, and the label corresponding to the second operation behavior sample data is used as the target output value of the decision fusion model for training.
  • the second sample data includes the second operation behavior sample data and the label corresponding to the second operation behavior sample data, and the label is used to indicate that the user corresponding to the second operation behavior sample data is an authenticated user or a non-authenticated user.
  • for the specific training process, refer to method 730.
  • the operation behavior data of the user to be authenticated is input into the anomaly detection model 1010, and the first matching score is output; the operation behavior data of the user to be authenticated is input into the classification model 1020, and the second matching score is output.
  • the first matching score and the second matching score are input to the decision fusion model 1030, weighted calculation is performed according to the weights corresponding to the at least two authentication models, and the identity authentication result is determined according to the result of the weighted calculation.
  • the matching score in Figure 10 is the identification result output by the authentication model, that is, the probability that the user to be authenticated is an authenticated user.
  • for example, when the result of the weighted calculation is greater than or equal to the set threshold A, the authentication is successful, that is, the user to be authenticated is identified as an authenticated user; when the result of the weighted calculation is less than the set threshold A, the authentication fails, that is, the user to be authenticated is identified as a non-authenticated user.
  • as another example, when the result of the weighted calculation is greater than the set threshold A, the authentication is successful, that is, the user to be authenticated is identified as an authenticated user; when the result of the weighted calculation is less than or equal to the set threshold A, the authentication fails, that is, the user to be authenticated is identified as a non-authenticated user.
  • as another example, when the result of the weighted calculation is greater than or equal to the set threshold A, the authentication is successful, that is, the user to be authenticated is identified as an authenticated user; when the result of the weighted calculation is less than the set threshold B, the authentication fails, that is, the user to be authenticated is identified as a non-authenticated user; when the result of the weighted calculation is greater than the set threshold B and less than the set threshold A, the user cannot be identified.
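  • The two-threshold variant can be sketched as a three-way decision on the weighted result. How the boundaries are handled is an assumption here: a result equal to threshold A counts as success, and a result equal to threshold B falls into the unrecognizable band. The function name and return strings are hypothetical.

```python
def decide(weighted: float, threshold_a: float, threshold_b: float) -> str:
    """Three-way decision on the weighted result, with threshold A >= threshold B."""
    if threshold_b > threshold_a:
        raise ValueError("threshold B must not exceed threshold A")
    if weighted >= threshold_a:
        return "authenticated user"
    if weighted < threshold_b:
        return "non-authenticated user"
    # Between the two thresholds the evidence is inconclusive.
    return "unrecognizable"
```

  • An unrecognizable outcome could, for instance, be followed by collecting more operation behavior data before deciding; that follow-up is a design choice not specified here.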
  • the first recognition result includes: a matching score corresponding to the first operation behavior data and/or a matching result corresponding to the first operation behavior data; the second recognition result includes: a matching score corresponding to the second operation behavior data and/or a matching result corresponding to the second operation behavior data.
  • the matching score corresponding to the first operation behavior data is used to indicate the probability that the user to be authenticated is recognized as an authenticated user
  • the matching result corresponding to the first operation behavior data is used to indicate that the user to be authenticated is recognized as an authenticated user or a non-authenticated user
  • the matching result corresponding to the first operation behavior data includes at least two matching results determined based on the at least two thresholds and the matching score corresponding to the first operation behavior data.
  • the matching score corresponding to the second operation behavior data is used to indicate the probability that the user to be authenticated is identified as an authenticated user
  • the matching result corresponding to the second operation behavior data is used to indicate that the user to be authenticated is identified as an authenticated user or a non-authenticated user.
  • the matching result corresponding to the second operation behavior data includes at least two matching results determined based on at least two thresholds and the matching score corresponding to the second operation behavior data.
  • the following takes the recognition result output by the anomaly detection model and the recognition result output by the classification model as an example for description.
  • the recognition result output by the anomaly detection model includes the matching score A corresponding to the first operation behavior data output by the anomaly detection model and the matching result A corresponding to the first operation behavior data.
  • the at least two thresholds include a first threshold A and a second threshold A, and the first threshold A is greater than the second threshold A.
  • the matching result A corresponding to the first operation behavior data output by the anomaly detection model includes: a matching result determined based on the first threshold A and a matching result determined based on the second threshold A.
  • when the matching score A is greater than or equal to the first threshold A, the matching result determined based on the first threshold A is an authenticated user; when the matching score A is less than the first threshold A, the matching result determined based on the first threshold A is a non-authenticated user.
  • when the matching score A is greater than or equal to the second threshold A, the matching result determined based on the second threshold A is an authenticated user; when the matching score A is less than the second threshold A, the matching result determined based on the second threshold A is a non-authenticated user.
  • the recognition result output by the classification model includes: a matching score B corresponding to the second operation behavior data output by the classification model and a matching result B corresponding to the second operation behavior data.
  • the at least two thresholds include a first threshold B and a second threshold B, and the first threshold B is greater than the second threshold B.
  • the matching result B corresponding to the second operation behavior data output by the classification model includes: a matching result determined based on the first threshold B and a matching result determined based on the second threshold B.
  • when the matching score B is greater than or equal to the first threshold B, the matching result determined based on the first threshold B is an authenticated user; when the matching score B is less than the first threshold B, the matching result determined based on the first threshold B is a non-authenticated user.
  • when the matching score B is greater than or equal to the second threshold B, the matching result determined based on the second threshold B is an authenticated user; when the matching score B is less than the second threshold B, the matching result determined based on the second threshold B is a non-authenticated user.
  • the recognition results output by the two authentication models include matching scores and at least two matching results.
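  • A recognition result that carries a matching score plus one matching result per threshold can be sketched as follows. The type and field names are hypothetical; the comparison uses "greater than or equal to", as in the description of the first threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognitionResult:
    score: float        # matching score output by the authentication model
    match_first: bool   # result against the first (higher) threshold
    match_second: bool  # result against the second (lower) threshold


def recognize(score: float, first_threshold: float,
              second_threshold: float) -> RecognitionResult:
    """Turn a matching score into a score plus two matching results."""
    if first_threshold <= second_threshold:
        raise ValueError("the first threshold must be greater than the second")
    return RecognitionResult(
        score=score,
        match_first=score >= first_threshold,
        match_second=score >= second_threshold,
    )
```

  • For example, a score of 0.6 with thresholds 0.8 and 0.5 yields a non-authenticated result for the stricter threshold and an authenticated result for the looser one, giving the decision fusion model more to work with than the score alone.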
  • different thresholds can be used to adjust different aspects of the authentication model's performance so that the performance meets expectations; for example, different thresholds can be used to adjust the model's anti-attack ability and the owner recognition rate to balance the performance of the authentication model.
  • this solution can provide more features for the decision fusion model, which is conducive to improving the accuracy of the decision fusion model authentication, thereby improving the accuracy of identity authentication.
  • inputting the first recognition result and the second recognition result into the decision fusion model to obtain the output identity authentication result includes: obtaining the first score feature according to the matching score corresponding to the first operation behavior data; obtaining the second score feature according to the matching score corresponding to the second operation behavior data; and inputting the first score feature, the second score feature, the first recognition result, and the second recognition result into the decision fusion model to obtain the output identity authentication result.
  • performing feature extraction on the matching score corresponding to the operation behavior data of the user to be authenticated includes: performing a mathematical operation on the score, such as addition, subtraction, multiplication, and division, and the operation result is used as the score feature.
  • the following takes an anomaly detection model and a classification model as an example to illustrate step S830.
  • the recognition result output by the anomaly detection model includes the matching score A corresponding to the first operation behavior data output by the anomaly detection model and the matching result A corresponding to the first operation behavior data.
  • the matching result A includes a matching result corresponding to the first threshold A and a matching result corresponding to the second threshold A.
  • the recognition result output by the classification model includes a matching score B corresponding to the second operation behavior data output by the classification model and a matching result B corresponding to the second operation behavior data.
  • the matching result B includes a matching result corresponding to the first threshold B and a matching result corresponding to the second threshold B.
  • the matching score corresponding to the anomaly detection model and the matching score corresponding to the classification model are used as two score features.
  • the recognition results output by the anomaly detection model and the classification model and some or all of the four score features may be input to the decision fusion model to obtain the identity authentication result.
  • feature extraction is performed on the matching score to further provide more features for the decision fusion model, which is beneficial to improving the accuracy of the decision fusion model authentication, thereby improving the accuracy of identity authentication.
  • the method 800 further includes: inputting s pieces of operation behavior data of the user to be authenticated into the identity authentication model to obtain s identity authentication results; and obtaining the final identity authentication result according to the s identity authentication results.
  • the s identity authentication results may be s weighted results, an average value is calculated for the s weighted results, and the final identity authentication result is determined according to the average value. In this way, the accuracy of the authentication result is further improved.
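  • Averaging the s weighted results can be sketched as below. The function name is hypothetical, and comparing the average against threshold A with "greater than or equal to" is an assumption carried over from the single-sample case.

```python
from statistics import fmean
from typing import Sequence


def final_result(weighted_results: Sequence[float], threshold_a: float) -> bool:
    """Average s weighted results and compare the mean with threshold A.

    Returns True when the user to be authenticated is accepted.
    """
    if not weighted_results:
        raise ValueError("at least one weighted result is required")
    return fmean(weighted_results) >= threshold_a
```

  • Averaging over several operations smooths out a single atypical swipe or tap, which is why it can further improve accuracy.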
  • Table 1 shows the simulation results of using different algorithms for identity authentication. It can be seen from Table 1 that the identity authentication method provided by the embodiment of the present application can improve the accuracy of identity authentication.
  • the identification results of at least two authentication models are input into the decision fusion model for decision fusion to obtain the identity authentication result, which can improve the accuracy of identity authentication.
  • performing identity verification through the user's operation behavior data does not change the way the user uses the device, which helps achieve imperceptible verification.
  • combining the anomaly detection model and the classification model improves, on the one hand, the recognition rate of authenticated users through the anomaly detection model and, on the other hand, the anti-attack ability through the classification model, thereby improving the identification ability of the identity authentication model as a whole.
  • both the anomaly detection model and the classification model can use small-scale algorithms.
  • the anomaly detection model can use a one-class SVM
  • the classification model can use a two-class SVM, which can be implemented by introducing the libsvm library.
  • the algorithm overhead is small and no additional hardware support is needed.
  • the identity authentication model can be trained on the client side, which can realize the secure storage of data on the client side and avoid privacy security issues caused by uploading to the cloud.
  • Fig. 11 shows a schematic diagram of the application process of the identity authentication method and identity authentication model training method provided by the embodiments of the present application. The following uses the scenario shown in FIG. 11 as an example to describe the application process of the identity authentication method and the identity authentication model training method of the embodiment of the present application.
  • the process of identity authentication can include two stages: a training stage and an authentication stage.
  • the training phase refers to the process of generating an identity authentication model based on the user's operational behavior data
  • the authentication phase refers to the process of verifying the matching between the operational behavior data of the user to be authenticated and the identity authentication model, and finally gives the result of identity authentication.
  • the steps of these two stages are described below.
  • the data collection module obtains the user's operation behavior data.
  • Operational behavior data may also be referred to as operational habit behavior data.
  • the data collection module registers a sensor listener and collects the user's operation behavior data through the sensor.
  • the data collection module may be the data collection module 420 in FIG. 4.
  • the sensor includes a touch screen sensor and/or a motion sensor.
  • the data collected by the touch screen sensor includes: timestamp, X/Y axis coordinates of the touch point, touch area, touch pressure, action, and screen direction.
  • the data collected by the motion sensor includes: time stamp, acceleration X/Y/Z axis data, gyroscope X/Y/Z axis data, etc.
  • the data collected by the sensor may include any one or several of the above, and may also include other data.
  • when the screen is unlocked, the sensor listener can be registered; when the screen is locked, the sensor listener can be deregistered.
  • S1120 Perform data preprocessing on the operation behavior data collected by the sensor to obtain a first training sample.
  • preprocessing includes: removing abnormal data, filtering effective operation behavior data, or extracting characteristic data, etc., to obtain the first training sample.
  • the first training sample may be stored in the storage module 440 shown in FIG. 4.
  • the data obtained after preprocessing can be used as the positive sample in the first training sample
  • the preset operation behavior data of the non-authenticated user can be used as the negative sample in the first training sample.
  • the execution of step (A2) by the data collection module in FIG. 11 is only an example.
  • for example, step (A2) can also be executed by the identity authentication module.
  • the identity authentication module trains an identity authentication model.
  • the identity authentication module may be the identity authentication module 430 in FIG. 4. This step corresponds to S720 in the method 700.
  • the identity authentication module can also determine whether to start training the identity authentication model.
  • the identity authentication model starts to be established.
  • if step (A2) is executed by the data collection module, the data collection module can determine whether to start training the identity authentication model. For example, as shown in step S1121 of FIG. 11, when the time period for collecting the first sample data exceeds the preset time period and/or the data amount of the first sample data exceeds the preset number, a modeling instruction is sent to the identity authentication module to notify the identity authentication module to start training the identity authentication model.
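  • The trigger condition for sending the modeling instruction can be sketched as a small predicate. The parameter names are hypothetical, and the `require_both` flag is an assumption used to cover both the "and" and the "or" reading of the condition.

```python
def should_start_training(
    collect_seconds: float,
    sample_count: int,
    preset_seconds: float,
    preset_count: int,
    require_both: bool = False,
) -> bool:
    """Decide whether enough first sample data has been collected.

    require_both=False: either condition suffices ("or").
    require_both=True: both conditions must hold ("and").
    """
    time_ok = collect_seconds > preset_seconds
    count_ok = sample_count > preset_count
    return (time_ok and count_ok) if require_both else (time_ok or count_ok)
```

  • When the predicate returns True, the data collection module would send the modeling instruction to the identity authentication module.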
  • the training method of the identity authentication model may adopt the training method of the identity authentication model in the embodiment of the present application.
  • S1130 also includes verifying the validity of the identity authentication model.
  • when the indicators of the identity authentication model meet the preset indicators, the identity authentication model is valid, the modeling phase ends, and a trained identity authentication model is obtained. For example, the modeling can be marked as complete.
  • the validity verification method of the identity authentication model can be set according to the application scenario. That is, in different application scenarios, the verification methods for the validity of the identity authentication model can be different.
  • the indicators of the identity authentication model satisfying the preset indicators may mean that the anti-attack ability of the identity authentication model meets the requirements.
  • the trained identity authentication model can be stored in the storage module 440 shown in FIG. 4. There may be one or more identity authentication models stored in the storage module 440.
  • the identity authentication model can be applied to identity authentication in applications.
  • the APP may be an APP integrated with the identity authentication SDK.
  • the APP may be APP410 in FIG. 4.
  • when the identity authentication model is applied to unlocking an application lock and the user to be authenticated needs to use the APP, the identity authentication model can be notified to start identity authentication.
  • This APP may be called an APP to be authenticated.
  • the APP can prompt the user for permission to obtain the user's operation behavior data. If the user agrees, identity authentication is started.
  • the identity authentication module can notify the data collection module to monitor the data of the sensor. Specifically, as shown in step S1141 in FIG. 11, the identity authentication module may feed back the context information to the data collection module.
  • the context information is used to indicate the application environment. For example, as shown in FIG. 11, the context information includes the type of APP to be authenticated, such as payment type, news type, and so on. Or the context information may also include information such as time or location.
  • the data collection module monitors the operation behavior data of the user to be authenticated collected by the sensor.
  • the sensor data monitored by the data collection module refers to the user's operation behavior data corresponding to the application scenario. As shown in FIG. 11, the data collection module monitors the user's operation behavior data corresponding to the APP to be authenticated.
  • the data collection module can determine the user's operational behavior data that needs to be collected based on the contextual environment information.
  • the identity authentication model can perform identity authentication through different user operation behavior data.
  • for example, in one application scenario the data collection module monitors user operation behavior data 1 and inputs user operation behavior data 1 into the identity authentication model; when the identity authentication model is applied to payment, the data collection module monitors user operation behavior data 2 and inputs user operation behavior data 2 into the identity authentication model.
  • the parameters of the user operation behavior data 1 may be less than the parameters of the user operation behavior data 2.
  • the user operation behavior data 1 may include data collected by a touch screen sensor.
  • User operation behavior data 2 may include data collected by touch screen sensors and motion sensors. In this way, for different application scenarios, monitoring different data can achieve targeted identity authentication, which further ensures the security of the system.
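  • the scenario-dependent choice of monitored data can be sketched as a lookup from context information to sensors. The sketch below is an assumption-laden illustration: the key `app_type`, the table `SENSORS_BY_APP_TYPE`, the pairing of the news type with touch-screen-only data, and the stricter default for unknown types are all hypothetical; the application only states that payment uses touch screen and motion sensor data and that another scenario may use fewer parameters.

```python
# Hypothetical mapping from APP type (taken from the context information)
# to the sensors whose data the data collection module monitors.
SENSORS_BY_APP_TYPE = {
    "news": ("touch_screen",),              # assumed lower-risk scenario
    "payment": ("touch_screen", "motion"),  # payment uses both sensor kinds
}

# Assumption: unknown APP types fall back to the stricter sensor set.
DEFAULT_SENSORS = ("touch_screen", "motion")


def sensors_for_context(context):
    """Pick the sensors to monitor from a context-information dict."""
    return SENSORS_BY_APP_TYPE.get(context.get("app_type"), DEFAULT_SENSORS)
```

  • keeping the mapping in a table rather than in branching code makes it straightforward to add further context fields, such as time or location, later.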
  • S1160 Initiate an identity authentication request.
  • the identity authentication model can be applied to identity authentication in applications.
  • the APP to be authenticated may initiate an identity authentication request, for example, an identity authentication request to the identity authentication module 430 in FIG. 4.
  • S1170 Match the operation behavior data of the user to be authenticated.
  • the operation behavior data of the user to be authenticated is determined according to the sensor data monitored by the data collection module in step (B2).
  • the operation behavior data of the user to be authenticated is input into the identity authentication model to identify whether it is an authenticated user. Further, as shown in step S1171 of FIG. 11, the recognition result is fed back to the application to be authenticated.
  • the identity authentication method in the embodiment of the present application can be used to match the operation behavior data of the user to be authenticated.
  • the identity authentication method may be method 800 in FIG. 8.
  • the identification results of at least two authentication models are input into the decision fusion model for decision fusion to obtain the identity authentication result, which can improve the accuracy of identity authentication.
  • performing identity verification through the user's operation behavior data does not change the user's usage habits, which is conducive to verification that the user does not perceive.
  • the identity authentication method provided in the embodiments of the present application can be applied to continuous identity authentication.
  • the user who is instructed to wake up the smart terminal can be authenticated.
  • when the smart terminal is in the awake state, that is, the unlocked state, the user can be authenticated again after a preset time interval has elapsed or after the user to be authenticated issues an instruction.
  • authenticating the user again can effectively prevent the situation in which the user leaves after waking up the smart terminal and any other user who touches the smart terminal can use it, resulting in security issues such as leakage of the user's private data in the smart terminal.
  • the user to be authenticated can be authenticated.
  • the user to be authenticated can be authenticated again after the user to be authenticated issues an instruction or after the preset time interval has elapsed. This can effectively prevent the situation in which the user leaves after logging in to the APP and any other user who touches the smart terminal can use the APP, resulting in security issues such as leakage of the user's private data in the smart terminal.
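  • the re-authentication condition for continuous identity authentication can be sketched as below. The function name, its parameters, and the use of plain seconds for timestamps are assumptions made for illustration only.

```python
def needs_reauthentication(now, last_auth_time, preset_interval, user_requested=False):
    """Decide whether the user to be authenticated should be checked again.

    Re-authentication is due when the user issues an instruction
    (user_requested) or when more than preset_interval seconds have passed
    since the last successful authentication.
    """
    interval_elapsed = (now - last_auth_time) > preset_interval
    return bool(user_requested) or interval_elapsed
```

  • in practice the timestamps could come from a monotonic clock such as Python's `time.monotonic()`, so that changes to the wall clock do not affect the interval.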
  • the identity authentication method provided in the embodiment of the present application can be combined with other identity authentication methods.
  • the identity authentication method provided in the embodiments of the present application is used as an auxiliary authentication method, and combined with other authentication methods for identity authentication, which improves the security and reliability of the system.
  • the first identity authentication is performed through face recognition or fingerprint recognition. If it succeeds, the identity authentication method provided in the embodiment of this application is used for the second identity authentication. If the second authentication also succeeds, the authentication passes. If either authentication fails, the user is prompted for password authentication.
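  • the two-stage flow with a password fallback can be sketched as follows. The three checks are passed in as zero-argument callables so that a later stage runs only when it is needed; the function name, the callable interface, and the "passed"/"rejected" return values are hypothetical, and what happens after the password prompt is an assumption, since the text only states that the user is prompted.

```python
def authenticate(biometric_check, behavior_check, password_check):
    """Run biometric then behavioral authentication, with a password fallback.

    Each argument is a zero-argument callable returning True on success.
    """
    # Stage 1: face or fingerprint recognition.
    # Stage 2: behavior-based authentication, run only if stage 1 succeeds.
    if biometric_check() and behavior_check():
        return "passed"
    # Any failure leads to password authentication (outcome is an assumption).
    return "passed" if password_check() else "rejected"
```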
  • the identity authentication method provided in the embodiment of the present application is used as a continuous protection method, and is combined with other authentication methods for identity authentication, which improves the security and reliability of the system.
  • the identity authentication method is performed on the user to be authenticated who logs in to the APP.
  • the identity authentication method provided by the embodiment of the application adopts a behavior-based authentication method, which can realize authentication that the user does not perceive.
  • the identity authentication method provided in the embodiments of the present application is used for payment risk control, and is combined with other authentication methods for identity authentication, which improves the security and reliability of the system.
  • the identity authentication method provided in the embodiment of the present application is used for identity authentication, and the result of identity authentication is input into the business risk control system as behavioral risk control information for risk control. If the risk control is qualified, the payment operation is performed; if the risk control is not qualified, the user can be prompted to use other methods for identity authentication.
  • the training device and the identity authentication device of the embodiment of the present application are described in detail below with reference to the accompanying drawings. It should be understood that the training device of the identity authentication model described below can execute the aforementioned training method of the identity authentication model of the embodiment of the present application, and the identity authentication device can perform the aforementioned identity authentication method of the embodiment of the present application. To avoid unnecessary repetition, repeated descriptions are appropriately omitted below when introducing the identity authentication device and the training device of the identity authentication model in the embodiment of the present application.
  • FIG. 15 is a schematic block diagram of a training device for an identity authentication model provided by an embodiment of the present application.
  • the training device 1500 of the identity authentication model shown in FIG. 15 includes an acquiring unit 1510 and a processing unit 1520.
  • the identity authentication model includes a first authentication model, a second authentication model, and a decision fusion model, wherein the first and second authentication models are respectively an anomaly detection model and a classification model.
  • the acquiring unit 1510 and the processing unit 1520 may be used to execute the training method of the identity authentication model of the embodiment of the present application. Specifically, the acquiring unit 1510 may execute the foregoing step S731, and the processing unit 1520 may execute the foregoing steps S732 and S733.
  • the acquiring unit 1510 is configured to acquire second sample data, the second sample data including second operation behavior sample data and a label corresponding to the second operation behavior sample data, and the label corresponding to the second operation behavior sample data is used to indicate whether the user corresponding to the second operation behavior sample data is an authenticated user or a non-authenticated user.
  • the processing unit 1520 is configured to: input the second operation behavior sample data in the second sample data into the first authentication model and the second authentication model to obtain the recognition result of the second operation behavior sample data; and use the recognition result of the second operation behavior sample data as the input of the decision fusion model, and the label corresponding to the second operation behavior sample data as the target output value of the decision fusion model, for training.
  • the model parameters of the first authentication model are obtained by training based on the first sample data. The first sample data includes first operation behavior sample data and a label corresponding to the first operation behavior sample data, and the label is used to indicate whether the user corresponding to the first operation behavior sample data is an authenticated user or a non-authenticated user.
  • the model parameters of the second authentication model are obtained by training based on the fifth sample data. The fifth sample data includes fifth operation behavior sample data and a label corresponding to the fifth operation behavior sample data, and the label is used to indicate whether the user corresponding to the fifth operation behavior sample data is an authenticated user or a non-authenticated user.
  • the recognition result of the second operation behavior sample data includes the recognition result output by the first authentication model and the recognition result output by the second authentication model. The recognition result output by the first authentication model includes the matching score corresponding to the second operation behavior sample data output by the first authentication model and/or the matching result corresponding to the second operation behavior sample data output by the first authentication model. The recognition result output by the second authentication model includes the matching score corresponding to the second operation behavior sample data output by the second authentication model and/or the matching result corresponding to the second operation behavior sample data output by the second authentication model.
  • the matching score corresponding to the second operation behavior sample data is used to indicate the probability that the user corresponding to the second operation behavior sample data is identified as an authenticated user. The matching result corresponding to the second operation behavior sample data is used to indicate whether the user corresponding to the second operation behavior sample data is identified as an authenticated user or a non-authenticated user, and includes at least two matching results determined based on at least two thresholds and the matching score corresponding to the second operation behavior sample data.
  • the at least two thresholds include the first threshold.
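  • how a single matching score yields at least two matching results can be sketched as below. The function `recognition_result`, the flat list layout, and the convention that a score at or above a threshold counts as "authenticated user" are assumptions introduced only for illustration.

```python
def recognition_result(matching_score, thresholds):
    """Build one model's recognition result from its matching score.

    Returns the score followed by one matching result per threshold,
    encoded as 1.0 (identified as authenticated user) or 0.0 (identified
    as non-authenticated user). The comparison direction is an assumption.
    """
    matching_results = [1.0 if matching_score >= t else 0.0 for t in thresholds]
    return [matching_score] + matching_results
```

  • with a score of 0.6 and the thresholds 0.5 and 0.7, the result is `[0.6, 1.0, 0.0]`: the same score is accepted under the looser threshold and rejected under the stricter one, which gives the decision fusion model more than one view of each score.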
  • the obtaining unit 1510 is further configured to obtain fourth sample data, where the fourth sample data includes fourth operation behavior sample data and a label corresponding to the fourth operation behavior sample data, and the label corresponding to the fourth operation behavior sample data is used to indicate whether the user corresponding to the fourth operation behavior sample data is an authenticated user or a non-authenticated user.
  • the processing unit 1520 is further configured to: input the fourth operation behavior sample data into the first authentication model to obtain a matching score corresponding to the fourth operation behavior sample data output by the first authentication model, where the matching score corresponding to the fourth operation behavior sample data is used to indicate the probability that the user corresponding to the fourth operation behavior sample data is recognized as an authenticated user; determine, based on multiple candidate thresholds, multiple candidate matching results from the matching score corresponding to the fourth operation behavior sample data, where the multiple candidate matching results are used to indicate whether the user corresponding to the fourth operation behavior sample data is identified as an authenticated user or a non-authenticated user; and determine, as the first threshold, the candidate threshold corresponding to the candidate matching result whose accuracy meets the preset condition.
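  • the selection of the first threshold from candidate thresholds can be sketched as follows. The preset condition is not specified in this passage; the sketch assumes it means "accuracy at least `min_accuracy`, preferring the most accurate candidate", and the function name and parameters are hypothetical.

```python
def select_first_threshold(scores, labels, candidate_thresholds, min_accuracy):
    """Pick a threshold whose candidate matching results are accurate enough.

    scores: matching scores output by the first authentication model for
            the fourth operation behavior sample data.
    labels: True for authenticated user, False for non-authenticated user.
    Returns (threshold, accuracy), or None if no candidate qualifies.
    """
    best = None
    for threshold in candidate_thresholds:
        # Candidate matching results under this threshold.
        predicted = [score >= threshold for score in scores]
        correct = sum(p == y for p, y in zip(predicted, labels))
        accuracy = correct / len(labels)
        if accuracy >= min_accuracy and (best is None or accuracy > best[1]):
            best = (threshold, accuracy)
    return best
```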
  • the processing unit 1520 is specifically configured to: obtain the first score feature of the second operation behavior sample data according to the matching score corresponding to the second operation behavior sample data output by the first authentication model; obtain the second score feature of the second operation behavior sample data according to the matching score corresponding to the second operation behavior sample data output by the second authentication model; and use the first score feature of the second operation behavior sample data, the second score feature of the second operation behavior sample data, and the recognition result of the second operation behavior sample data as the input of the decision fusion model, and the label corresponding to the second operation behavior sample data as the target output value of the decision fusion model, for training.
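  • training the decision fusion model with recognition results as input and labels as target output can be sketched as below. The form of the decision fusion model is not stated in this passage, so logistic regression trained by stochastic gradient descent is used purely as a stand-in; `train_fusion`, `fusion_probability`, and the hyperparameters are assumptions, and each input vector is taken to hold whatever score features and recognition results are available.

```python
import math


def fusion_probability(weights, bias, features):
    """Probability that the features belong to an authenticated user."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train_fusion(inputs, labels, epochs=500, learning_rate=0.5):
    """Fit fusion weights so that outputs approach the labels.

    inputs: one feature vector per sample (scores, matching results, ...).
    labels: 1 for authenticated user, 0 for non-authenticated user.
    """
    weights = [0.0] * len(inputs[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(inputs, labels):
            error = fusion_probability(weights, bias, features) - label
            # Gradient step of the logistic loss for this sample.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias
```

  • the learned weights play the role of the weight parameters of the first and second recognition results; any other trainable classifier could be substituted for the stand-in without changing the input and target arrangement.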
  • the second operation behavior sample data includes at least one of the following data: touch point X/Y axis coordinates, touch area, touch pressure, touch screen speed, touch screen acceleration, touch screen trajectory slope, touch screen length, touch screen displacement, touch screen angle, touch screen direction, acceleration X/Y/Z axis data, or gyroscope X/Y/Z axis data.
  • the second sample data is obtained by filtering according to the sliding duration of the user on the touch screen and/or the user's sliding time on the touch screen.
  • the obtaining unit 1510 is further configured to: obtain third sample data, where the third sample data includes: third operation behavior sample data.
  • the processing unit 1520 is further configured to: input the third operation behavior sample data into the first authentication model to obtain the recognition result of the third operation behavior sample data output by the first authentication model; input the third operation behavior sample data into the second authentication model to obtain the recognition result of the third operation behavior sample data output by the second authentication model; input the recognition result of the third operation behavior sample data output by the first authentication model and the recognition result of the third operation behavior sample data output by the second authentication model into the trained decision fusion model to obtain the identity authentication result corresponding to the third operation behavior sample data; and train the first authentication model and/or the second authentication model according to the third operation behavior sample data and the identity authentication result corresponding to the third operation behavior sample data.
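  • the update step above, in which the fused identity authentication result serves as the training target for unlabeled third sample data, can be sketched as follows. The models are represented by plain callables; `pseudo_label` and its parameter names are hypothetical, and the retraining itself is left to whatever training routine the authentication models use.

```python
def pseudo_label(third_samples, first_model, second_model, fusion_model):
    """Attach the fused identity authentication result to each sample.

    first_model / second_model: callables mapping a sample to a
        recognition result.
    fusion_model: callable mapping the two recognition results to an
        identity authentication result.
    Returns (sample, result) pairs usable as training data for the first
    and/or second authentication model.
    """
    training_pairs = []
    for sample in third_samples:
        first_result = first_model(sample)
        second_result = second_model(sample)
        result = fusion_model(first_result, second_result)
        training_pairs.append((sample, result))
    return training_pairs
```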
  • FIG. 16 is a schematic block diagram of an identity authentication apparatus 1600 provided by an embodiment of the present application.
  • the identity authentication device 1600 shown in FIG. 16 includes an obtaining unit 1610 and a processing unit 1620.
  • the obtaining unit 1610 and the processing unit 1620 may be used to perform the identity authentication method of the embodiment of the present application. Specifically, the obtaining unit 1610 may perform the foregoing step S810, and the processing unit 1620 may perform the foregoing steps S820 and S830.
  • the obtaining unit 1610 is used to obtain the first operation behavior data of the user to be authenticated and the second operation behavior data of the user to be authenticated.
  • the processing unit 1620 is configured to: input the first operation behavior data into the first authentication model to obtain the first recognition result output by the first authentication model; input the second operation behavior data into the second authentication model to obtain the second recognition result output by the second authentication model, where the first and second authentication models are respectively an anomaly detection model and a classification model; and input the first recognition result and the second recognition result into the decision fusion model to obtain the output identity authentication result, where the decision fusion model is used to determine the identity authentication result according to the weight parameters of the first recognition result and the second recognition result.
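  • the weighted decision fusion at authentication time can be sketched as below. The sketch assumes that each recognition result is a single matching score in the range 0 to 1 and that the fusion is a weighted sum passed through a sigmoid; the function `fuse`, the bias term, and the acceptance level are all hypothetical and are not taken from the application.

```python
import math


def fuse(first_score, second_score, weights, bias=0.0, accept_at=0.5):
    """Combine two recognition results into one identity authentication result.

    weights: (weight of first recognition result, weight of second).
    Returns (is_authenticated_user, fused_probability).
    """
    z = weights[0] * first_score + weights[1] * second_score + bias
    probability = 1.0 / (1.0 + math.exp(-z))
    return probability >= accept_at, probability
```

  • unequal weights let the result lean on whichever of the two models proved more reliable during training of the decision fusion model.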
  • the first operation behavior data and/or the second operation behavior data are data collected by sensors.
  • the weight parameters of the decision fusion model are obtained as follows: the second operation behavior sample data in the second sample data is input into the first authentication model and the second authentication model to obtain the recognition result of the second operation behavior sample data output by the first authentication model and the second authentication model; the recognition result of the second operation behavior sample data is then used as the input of the decision fusion model, and the label corresponding to the second operation behavior sample data is used as the target output value of the decision fusion model, for training. The second sample data includes the second operation behavior sample data and the label corresponding to the second operation behavior sample data, and the label is used to indicate whether the user corresponding to the second operation behavior sample data is an authenticated user or a non-authenticated user.
  • the first recognition result includes a matching score corresponding to the first operation behavior data and/or a matching result corresponding to the first operation behavior data; the second recognition result includes a matching score corresponding to the second operation behavior data and/or a matching result corresponding to the second operation behavior data.
  • the matching score corresponding to the first operation behavior data is used to indicate the probability that the user to be authenticated is identified as an authenticated user. The matching result corresponding to the first operation behavior data is used to indicate whether the user to be authenticated is identified as an authenticated user or a non-authenticated user, and includes at least two matching results determined based on at least two thresholds and the matching score corresponding to the first operation behavior data.
  • the matching score corresponding to the second operation behavior data is used to indicate the probability that the user to be authenticated is identified as an authenticated user. The matching result corresponding to the second operation behavior data is used to indicate whether the user to be authenticated is identified as an authenticated user or a non-authenticated user.
  • the processing unit 1620 is specifically configured to: obtain a first score feature according to the matching score corresponding to the first operation behavior data; obtain a second score according to the matching score corresponding to the second operation behavior data Features; the first score feature, the second score feature, the first recognition result, and the second recognition result are input into the decision fusion model to obtain the output identity authentication result.
  • the first operation behavior data and/or the second operation behavior data include at least one of the following data: touch point X/Y axis coordinates, touch area, touch pressure, touch screen speed, touch screen acceleration, touch screen trajectory slope, touch screen length, touch screen displacement, touch screen angle, touch screen direction, acceleration X/Y/Z axis data, or gyroscope X/Y/Z axis data.
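  • several of the listed touch screen quantities can be derived from raw touch points, as sketched below. The exact definitions are not given in this passage, so the sketch assumes that length is the path length of the swipe, displacement is the straight-line distance between its end points, speed is length divided by duration, and angle is the direction of the displacement vector; the function and key names are hypothetical.

```python
import math


def swipe_features(points):
    """Derive swipe features from (x, y, t) touch points in time order.

    Requires at least two points; t is in seconds.
    """
    (x0, y0, t0), (x1, y1, t1) = points[0], points[-1]
    # Path length: sum of the distances between consecutive touch points.
    length = sum(
        math.hypot(bx - ax, by - ay)
        for (ax, ay, _), (bx, by, _) in zip(points, points[1:])
    )
    duration = t1 - t0
    return {
        "length": length,
        "displacement": math.hypot(x1 - x0, y1 - y0),
        "speed": length / duration if duration > 0 else 0.0,
        "angle_degrees": math.degrees(math.atan2(y1 - y0, x1 - x0)),
    }
```

  • for a straight swipe through (0, 0), (3, 4), and (6, 8) over one second, the length and displacement are both 10 and the speed is 10 units per second.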
  • the above-mentioned training device 1500 and the identity authentication device 1600 are embodied in the form of functional units.
  • the term "unit” herein can be implemented in the form of software and/or hardware, which is not specifically limited.
  • a "unit” may be a software program, a hardware circuit, or a combination of the two that realizes the above-mentioned functions.
  • the hardware circuit may include an application specific integrated circuit (ASIC), an electronic circuit, a processor for executing one or more software or firmware programs (such as a shared processor, a dedicated processor, or a group processor) and memory, a merged logic circuit, and/or other suitable components that support the described functions.
  • the units of the examples described in the embodiments of the present application can be implemented by electronic hardware or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Professionals and technicians can use different methods for each specific application to implement the described functions, but such implementation should not be considered beyond the scope of this application.
  • FIG. 17 is a schematic diagram of the hardware structure of an identity authentication model training device provided by an embodiment of the present application.
  • the training device 900 shown in FIG. 17 includes a memory 901, a processor 902, a communication interface 903, and a bus 904.
  • the memory 901, the processor 902, and the communication interface 903 implement communication connections between each other through the bus 904.
  • the memory 901 may be a read only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM).
  • the memory 901 may store a program.
  • the processor 902 is configured to execute each step of the training method of the identity authentication model of the embodiment of the present application, for example, the steps shown in FIG. 6 or FIG. 7.
  • the training device shown in the embodiment of the present application may be a server, for example, it may be a server in the cloud, or may also be a chip configured in a server in the cloud.
  • the device shown in the embodiment of the present application may be a smart terminal, or may also be a chip configured in the smart terminal.
  • the processor 902 may adopt a general central processing unit (CPU), a microprocessor, an application specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits to execute related programs to realize the training method of the identity authentication model in the method embodiment of the present application.
  • the processor 902 may also be an integrated circuit chip with signal processing capability.
  • each step of the training method of the identity authentication model of the present application can be completed by the integrated logic circuit of hardware in the processor 902 or instructions in the form of software.
  • the aforementioned processor 902 may also be a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present application can be implemented or executed.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a storage medium that is mature in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register.
  • the storage medium is located in the memory 901; the processor 902 reads the information in the memory 901 and, in combination with its hardware, completes the functions required by the units included in the training device shown in FIG. 17 in the embodiments of this application, or executes the training method of the identity authentication model shown in FIG. 6 or FIG. 7 in the method embodiments of this application.
  • the communication interface 903 uses a transceiver device such as but not limited to a transceiver to implement communication between the training device 900 and other devices or communication networks.
  • the bus 904 may include a path for transferring information between various components of the training device 900 (for example, the memory 901, the processor 902, and the communication interface 903).
  • FIG. 18 is a schematic diagram of the hardware structure of the identity authentication device provided by an embodiment of the present application.
  • the identity authentication apparatus 1000 shown in FIG. 18 (the apparatus 1000 may specifically be a computer device) includes a memory 1001, a processor 1002, a communication interface 1003, and a bus 1004. Among them, the memory 1001, the processor 1002, and the communication interface 1003 communicate with each other through the bus 1004.
  • the memory 1001 may be a read only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM).
  • the memory 1001 may store a program.
  • the processor 1002 is configured to execute each step of the identity authentication method of the embodiment of the present application, for example, execute each step shown in FIG. 8.
  • the device shown in the embodiment of the present application may be a smart terminal, or may also be a chip configured in the smart terminal.
  • the processor 1002 may adopt a general central processing unit (CPU), a microprocessor, an application specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits to execute related programs to implement the identity authentication method in the method embodiment of the present application.
  • the processor 1002 may also be an integrated circuit chip with signal processing capability.
  • each step of the identity authentication method of the present application can be completed by the integrated logic circuit of hardware in the processor 1002 or instructions in the form of software.
  • the aforementioned processor 1002 may also be a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a storage medium that is mature in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register.
  • the storage medium is located in the memory 1001; the processor 1002 reads the information in the memory 1001 and, in combination with its hardware, completes the functions required by the units included in the device shown in FIG. 16 in the embodiments of this application, or executes the identity authentication method shown in FIG. 8 in the method embodiments of this application.
  • the communication interface 1003 uses a transceiver device such as but not limited to a transceiver to implement communication between the device 1000 and other devices or a communication network.
  • the bus 1004 may include a path for transferring information between various components of the device 1000 (for example, the memory 1001, the processor 1002, and the communication interface 1003).
  • it should be noted that although the above training device 900 and device 1000 only show a memory, a processor, and a communication interface, those skilled in the art should understand that, in a specific implementation, the training device 900 and the device 1000 may also include other devices necessary for normal operation. According to specific needs, the training device 900 and the device 1000 may also include hardware devices that implement other additional functions. In addition, the training device 900 and the device 1000 may include only the devices necessary for implementing the embodiments of the present application, and not necessarily all the devices shown in FIG. 17 or FIG. 18.
  • FIG. 19 is a schematic diagram of the hardware structure of an identity authentication device and an identity authentication model training device provided by an embodiment of the present application.
  • the apparatus 1100 shown in FIG. 19 (the apparatus 1100 may specifically be a computer device) includes a memory 1101, a processor 1102, and an output interface 1103.
  • the memory 1101 may be a read only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM).
  • the memory 1101 can store program instructions and data.
  • the processor 1102 is configured to execute the steps of the identity authentication method or identity authentication model training method of the embodiment of the present application.
  • the processor 1102 receives data from a touch screen sensor and a motion sensor, and can implement corresponding functions in the identity authentication process in the foregoing embodiment, including feature extraction and behavior matching as shown in FIG. 19.
  • the processor 1102 receives data from the touch screen sensor and the motion sensor, and can implement the corresponding functions in the training process of the identity authentication model in the foregoing embodiment, including feature extraction and behavior modeling as shown in FIG. 19.
  • the processor 1102 may also be used to implement other functions in FIG. 5.
  • the device shown in the embodiment of the present application may be a smart terminal, or may also be a chip configured in the smart terminal.
  • the processor 1102 may be a general-purpose central processing unit (CPU), a microprocessor, an application specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits, and is used to execute related programs to implement the identity authentication method or the identity authentication model training method in the method embodiments of the present application.
  • the processor 1102 may also be an integrated circuit chip with signal processing capabilities.
  • the steps of the identity authentication method or identity authentication model training method of the present application can be completed by the integrated logic circuit of hardware in the processor 1102 or instructions in the form of software.
  • the above-mentioned processor 1102 may also be a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
  • the storage medium is located in the memory 1101, and the processor 1102 reads the information in the memory 1101 and, in combination with its hardware, completes the functions required of the units included in the device shown in FIG. 15 or FIG. 16 of the embodiments of this application, or executes the method shown in FIG. 6, FIG. 7, or FIG. 8 of the method embodiments of this application.
  • the output interface 1103 uses a transceiving device such as but not limited to a transceiver to implement communication between the device 1100 and other devices or a communication network.
  • although the foregoing device 1100 shows only a memory, a processor, and an output interface, those skilled in the art should understand that, in a specific implementation, the device 1100 may also include other components necessary for normal operation. Likewise, depending on specific needs, the device 1100 may also include hardware components that implement additional functions. In addition, the device 1100 may include only the components necessary to implement the embodiments of the present application, and not necessarily all the components shown in FIG. 19.
  • the processor in the embodiments of the present application may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the memory in the embodiments of the present application may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory can be read-only memory (ROM), programmable read-only memory (programmable ROM, PROM), erasable programmable read-only memory (erasable PROM, EPROM), electrically erasable programmable read-only memory (electrically EPROM, EEPROM), or flash memory.
  • the volatile memory may be random access memory (RAM), which is used as an external cache.
  • many forms of RAM may be used, for example static random access memory (static RAM, SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (synchlink DRAM, SLDRAM), and direct rambus random access memory (direct rambus RAM, DR RAM).
  • the above-mentioned embodiments may be implemented in whole or in part by software, hardware, firmware or any other combination.
  • the above-mentioned embodiments may be implemented in the form of a computer program product in whole or in part.
  • the computer program product includes one or more computer instructions or computer programs.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless means (wireless means include, for example, infrared, radio, and microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or a data center that includes one or more sets of available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, and a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium.
  • the semiconductor medium may be a solid state drive.
  • "at least one" refers to one or more, and "multiple" refers to two or more.
  • "at least one of the following items" or a similar expression refers to any combination of these items, including a single item or any combination of a plurality of items.
  • for example, at least one of a, b, or c can mean: a, b, c, a and b, a and c, b and c, or a, b, and c, where each of a, b, and c can be single or multiple.
  • the sequence numbers of the above-mentioned processes do not imply an order of execution.
  • the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
  • the disclosed system, device, and method can be implemented in other ways.
  • the device embodiments described above are merely illustrative. For example, the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • if the functions are implemented in the form of a software functional unit and sold or used as an independent product, they can be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage media include any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory, random access memory, a magnetic disk, or an optical disc.


Abstract

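This application discloses an identity authentication method, a training method for an identity authentication model, an apparatus, and a computer-readable medium in the field of artificial intelligence (AI). The identity authentication method includes: obtaining first operation behavior data and second operation behavior data of a user to be authenticated; inputting the first operation behavior data into a first authentication model to obtain a first recognition result output by the first authentication model; inputting the second operation behavior data into a second authentication model to obtain a second recognition result output by the second authentication model, where the first and second authentication models are an anomaly detection model and a classification model, respectively; and inputting the first recognition result and the second recognition result into a decision fusion model to obtain an output identity authentication result. Because the method performs decision fusion on at least two recognition results, it can improve the accuracy of identity authentication.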

Description

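Identity authentication method, and training method and apparatus for an identity authentication model
This application claims priority to Chinese patent application No. 202010262293.6, filed with the China Patent Office on April 6, 2020 and entitled "Identity authentication method, and training method and apparatus for identity authentication model", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of artificial intelligence, and more specifically, to an identity authentication method and a training method and apparatus for an identity authentication model.
Background
Artificial intelligence (AI) is the theory, methods, technology, and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results. In other words, AI is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a manner similar to human intelligence. AI studies the design principles and implementation methods of various intelligent machines so that the machines have the functions of perception, reasoning, and decision-making. Research in the field of AI includes robotics, natural language processing, computer vision, decision-making and reasoning, human-computer interaction, recommendation and search, and basic AI theory.
With the popularization of smart devices, users usually set up identity authentication measures to protect personal property and privacy. However, traditional authentication methods such as password or pattern authentication are exposed to shoulder surfing, and biometric authentication methods such as fingerprint or face authentication are exposed to spoofing attacks. At present, behavioral identity authentication can be used to further improve system security. Behavioral identity authentication uses a machine learning algorithm to model a user's past behavior and then uses the model to recognize recent behavior, thereby authenticating the user. Existing behavioral authentication methods fall short of expectations in aspects such as attack resistance and the recognition rate of authenticated users. In practice, if the ability to recognize the authenticated user is poor, the authenticated user is often mistaken for a non-authenticated user, resulting in a poor user experience; if attack resistance is poor, a non-authenticated user may be mistaken for the authenticated user, rendering identity authentication useless.
Therefore, how to improve the accuracy of identity authentication has become a technical problem that urgently needs to be solved.
Summary
This application provides an identity authentication method and a training method and apparatus for an identity authentication model, which can improve the accuracy of identity authentication.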
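In a first aspect, an identity authentication method is provided. The method includes: obtaining first operation behavior data of a user to be authenticated; obtaining second operation behavior data of the user to be authenticated; inputting the first operation behavior data into a first authentication model to obtain a first recognition result output by the first authentication model; inputting the second operation behavior data into a second authentication model to obtain a second recognition result output by the second authentication model, where the first and second authentication models are an anomaly detection model and a classification model, respectively; and inputting the first recognition result and the second recognition result into a decision fusion model to obtain an output identity authentication result, where the decision fusion model is used to determine the identity authentication result according to weight parameters of the first recognition result and the second recognition result.
The identity authentication model includes the first authentication model, the second authentication model, and the decision fusion model. It should be understood that the identity authentication model may include multiple authentication models and a decision fusion model; the recognition results output by the multiple authentication models are input into the decision fusion model, which outputs the identity authentication result.
It should be understood that the first operation behavior data and the second operation behavior data may be the same or different.
Optionally, the first recognition result and/or the second recognition result indicates that the user to be authenticated is an authenticated user or a non-authenticated user.
Optionally, the first recognition result and/or the second recognition result includes the probability that the user to be authenticated is an authenticated user, or the probability that the user to be authenticated is a non-authenticated user.
For example, if either the first recognition result or the second recognition result is "authenticated user", the identity authentication result is "authenticated user". Alternatively, if either the first recognition result or the second recognition result is "non-authenticated user", the identity authentication result is "non-authenticated user".
The identity authentication result output by the decision fusion model may indicate that the user to be authenticated is an authenticated user or a non-authenticated user. Alternatively, it may include the probability that the user to be authenticated is an authenticated user, or the probability that the user to be authenticated is a non-authenticated user.
In the embodiments of this application, the recognition results of two authentication models are input into the decision fusion model for decision fusion to obtain the identity authentication result, which can improve the accuracy of identity authentication. In addition, because authentication is based on the user's operation behavior data, the user's usage habits do not need to change, which facilitates imperceptible (frictionless) authentication.
Furthermore, when identity authentication relies on an anomaly detection model alone, the attack resistance of the model cannot be guaranteed; with poor attack resistance, a non-authenticated user may be mistaken for the authenticated user, rendering identity authentication useless. When identity authentication relies on a classification model alone, the recognition rate for the authenticated user cannot be guaranteed; with poor recognition of the authenticated user, the authenticated user is often mistaken for a non-authenticated user, resulting in a poor user experience. The embodiments of this application use both an anomaly detection model and a classification model: the anomaly detection model improves the recognition rate for the authenticated user, the classification model improves attack resistance, and together they improve the overall recognition capability of the identity authentication model.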
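As a minimal sketch of the flow in the first aspect, the following Python fragment fuses two recognition results with weight parameters. The `Recognition` class, the `fuse` function, the weights, and the 0.5 decision threshold are hypothetical illustrations and are not taken from the application; a weighted sum is only one possible form of decision fusion.

```python
from dataclasses import dataclass


@dataclass
class Recognition:
    """Output of one authentication model for one behaviour sample."""
    score: float     # estimated probability that the user is the authenticated user
    is_owner: bool   # match result derived from the score


def fuse(first: Recognition, second: Recognition,
         w_first: float = 0.5, w_second: float = 0.5,
         decision_threshold: float = 0.5) -> bool:
    """Weighted decision fusion of two recognition results (illustrative weights)."""
    fused_score = w_first * first.score + w_second * second.score
    return fused_score >= decision_threshold


# Hypothetical outputs of the anomaly detection model and the classification model.
anomaly_result = Recognition(score=0.9, is_owner=True)
classifier_result = Recognition(score=0.3, is_owner=False)
authenticated = fuse(anomaly_result, classifier_result, w_first=0.4, w_second=0.6)
```

In this sketch the two models disagree, and the weights decide which model dominates the final result.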
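With reference to the first aspect, in some implementations of the first aspect, the first operation behavior data and/or the second operation behavior data is data collected by sensors.
For example, the sensors include a motion sensor and/or a touch screen sensor.
The operation behavior data of the user to be authenticated may be data obtained by processing the raw data collected by the sensors.
For example, the raw data collected by the touch screen sensor includes: timestamp, X/Y coordinates of the touch point, touch area, touch pressure, action, and screen orientation. The raw data collected by the motion sensor includes: timestamp, accelerometer X/Y/Z axis data, and gyroscope X/Y/Z axis data. It should be understood that the above is only an illustration; the data collected by the sensors may include any one or more of the above items, and may also include other data.
With reference to the first aspect, in some implementations of the first aspect, the model parameters of the first authentication model are obtained by training on first sample data. The first sample data includes first operation behavior sample data and a label corresponding to the first operation behavior sample data, and the label indicates whether the user corresponding to the first operation behavior sample data is an authenticated user or a non-authenticated user.
The model parameters of the second authentication model are obtained by training on fifth sample data. The fifth sample data includes fifth operation behavior sample data and a label corresponding to the fifth operation behavior sample data, and the label indicates whether the user corresponding to the fifth operation behavior sample data is an authenticated user or a non-authenticated user.
With reference to the first aspect, in some implementations of the first aspect, the anomaly detection model may be a one-class support vector machine (SVM), an isolation forest, or the like. The classification model may be an SVM, a neural network, or the like.
In the embodiments of this application, both the anomaly detection model and the classification model can use small-scale algorithms; for example, the anomaly detection model can be a one-class SVM and the classification model can be a two-class SVM. As a result, the algorithmic overhead of identity authentication is low, and no additional hardware support is required. The identity authentication model can be trained on the user's device, which allows data to be stored securely on the device and avoids the privacy and security problems caused by uploading data to the cloud.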
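With reference to the first aspect, in some implementations of the first aspect, the weight parameters of the decision fusion model are obtained as follows: second operation behavior sample data in second sample data is input into the first authentication model and the second authentication model to obtain the recognition results of the second operation behavior sample data output by the two models; the decision fusion model is then trained with these recognition results as its input and with the label corresponding to the second operation behavior sample data as its target output value. The second sample data includes the second operation behavior sample data and the corresponding label, and the label indicates whether the user corresponding to the second operation behavior sample data is an authenticated user or a non-authenticated user.
In the embodiments of this application, training the decision fusion model yields better model parameters, for example better weight ratios for the authentication models, which further improves the accuracy of identity authentication.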
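The training of the fusion weights described above can be sketched as follows. The application does not specify the form of the decision fusion model; logistic regression trained by stochastic gradient descent is assumed here purely for illustration, and the function names, learning rate, and sample scores are hypothetical.

```python
import math


def train_fusion(inputs, labels, epochs=500, lr=0.5):
    """Fit fusion weights: inputs are per-sample recognition results of the
    two authentication models, labels are 1 (authenticated) or 0 (not)."""
    weights, bias = [0.0] * len(inputs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(inputs, labels):
            z = sum(w * v for w, v in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))   # predicted probability
            err = p - y                       # gradient of log loss w.r.t. z
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(weights, bias, x):
    """Return True if the fused result is 'authenticated user'."""
    z = sum(w * v for w, v in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z)) >= 0.5


# Hypothetical match scores (anomaly model, classification model) and labels.
fusion_inputs = [(0.9, 0.9), (0.8, 0.7), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2)]
fusion_labels = [1, 1, 0, 0, 0]
fusion_weights, fusion_bias = train_fusion(fusion_inputs, fusion_labels)
```

The learned weights play the role of the weight parameters of the recognition results; a different learner could be substituted without changing the surrounding flow.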
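With reference to the first aspect, in some implementations of the first aspect, the first recognition result includes a matching score corresponding to the first operation behavior data and/or a matching result corresponding to the first operation behavior data, and the second recognition result includes a matching score corresponding to the second operation behavior data and/or a matching result corresponding to the second operation behavior data. The matching score corresponding to the first operation behavior data indicates the probability that the user to be authenticated is recognized as an authenticated user; the matching result corresponding to the first operation behavior data indicates whether the user to be authenticated is recognized as an authenticated user or a non-authenticated user, and includes at least two matching results determined from at least two thresholds and the matching score corresponding to the first operation behavior data. Likewise, the matching score corresponding to the second operation behavior data indicates the probability that the user to be authenticated is recognized as an authenticated user; the matching result corresponding to the second operation behavior data indicates whether the user to be authenticated is recognized as an authenticated user or a non-authenticated user, and includes at least two matching results determined from at least two thresholds and the matching score corresponding to the second operation behavior data.
The first recognition result and the second recognition result are illustrated below with examples.
The recognition results output by the two authentication models are the recognition result output by the anomaly detection model and the recognition result output by the classification model.
The recognition result output by the anomaly detection model includes matching score A corresponding to the first operation behavior data and matching result A corresponding to the first operation behavior data.
For example, the at least two thresholds include first threshold A and second threshold A, where first threshold A is greater than second threshold A. Matching result A output by the anomaly detection model includes a matching result determined from first threshold A and a matching result determined from second threshold A. When matching score A is greater than or equal to first threshold A, the matching result determined from first threshold A is "authenticated user"; when matching score A is less than first threshold A, it is "non-authenticated user". When matching score A is greater than or equal to second threshold A, the matching result determined from second threshold A is "authenticated user"; when matching score A is less than second threshold A, it is "non-authenticated user".
The recognition result output by the classification model includes matching score B corresponding to the second operation behavior data and matching result B corresponding to the second operation behavior data.
For example, the at least two thresholds include first threshold B and second threshold B, where first threshold B is greater than second threshold B. Matching result B output by the classification model includes a matching result determined from first threshold B and a matching result determined from second threshold B. When matching score B is greater than or equal to first threshold B, the matching result determined from first threshold B is "authenticated user"; when matching score B is less than first threshold B, it is "non-authenticated user". When matching score B is greater than or equal to second threshold B, the matching result determined from second threshold B is "authenticated user"; when matching score B is less than second threshold B, it is "non-authenticated user".
For example, the at least two thresholds may be determined according to the accuracy of the recognition results.
In the embodiments of this application, the recognition result output by an authentication model includes a matching score and at least two matching results. Different thresholds tune different aspects of the model's performance so that the performance meets expectations; for example, different thresholds adjust the model's attack resistance and its owner recognition rate so that the two are balanced. In addition, unlike existing authentication models, which provide only a single recognition result, this solution provides the decision fusion model with more features, which helps improve the accuracy of the decision fusion model and therefore the accuracy of identity authentication.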
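The two-threshold rule described above can be sketched in a few lines. The function name, dictionary keys, and the example thresholds are hypothetical; only the comparison rule (score greater than or equal to a threshold means "authenticated user") comes from the text.

```python
def match_results(score: float, first_threshold: float, second_threshold: float) -> dict:
    """Derive two match results from one matching score.

    The first threshold is the larger (stricter) one, so it favours attack
    resistance; the second is smaller (more lenient), so it favours owner
    recognition rate.
    """
    if first_threshold <= second_threshold:
        raise ValueError("first threshold must be greater than second threshold")
    return {
        "score": score,
        "strict": score >= first_threshold,    # True means 'authenticated user'
        "lenient": score >= second_threshold,  # True means 'authenticated user'
    }


# A score between the two thresholds yields two different match results.
example = match_results(0.6, first_threshold=0.8, second_threshold=0.5)
```

A score between the two thresholds produces one result of each kind, which is exactly the extra information that is passed on to the decision fusion model.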
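With reference to the first aspect, in some implementations of the first aspect, inputting the first recognition result and the second recognition result into the decision fusion model to obtain the output identity authentication result includes: obtaining a first score feature from the matching score corresponding to the first operation behavior data; obtaining a second score feature from the matching score corresponding to the second operation behavior data; and inputting the first score feature, the second score feature, the first recognition result, and the second recognition result into the decision fusion model to obtain the output identity authentication result.
For example, obtaining a score feature from the matching score corresponding to the first operation behavior data includes performing a mathematical operation on the matching score to obtain the score feature.
In the embodiments of this application, extracting features from the matching scores provides the decision fusion model with still more features, which helps improve the accuracy of the decision fusion model and therefore the accuracy of identity authentication.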
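The application states only that a "mathematical operation" is applied to the matching score. As a hedged sketch, the fragment below assumes two such operations (the square and the log-odds of the score) and assembles a fusion input vector; the choice of operations and the function names are illustrative assumptions.

```python
import math


def score_features(score: float) -> list:
    """Derive extra features from a matching score (illustrative choices)."""
    eps = 1e-6
    clipped = min(max(score, eps), 1 - eps)   # keep the log-odds finite
    return [score * score, math.log(clipped / (1 - clipped))]


def fusion_input(score_a, results_a, score_b, results_b) -> list:
    """Concatenate scores, threshold-based match results, and score features
    of both authentication models into one input vector."""
    return (
        [score_a] + [float(r) for r in results_a] + score_features(score_a)
        + [score_b] + [float(r) for r in results_b] + score_features(score_b)
    )


vector = fusion_input(0.5, [True, False], 0.5, [True, True])
```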
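With reference to the first aspect, in some implementations of the first aspect, the first operation behavior data and/or the second operation behavior data includes at least one of the following: X/Y coordinates of the touch point, touch area, touch pressure, touch speed, touch acceleration, slope of the touch trajectory, touch length, touch displacement, touch angle, touch direction, accelerometer X/Y/Z axis data, or gyroscope X/Y/Z axis data.
With reference to the first aspect, in some implementations of the first aspect, the model parameters of an authentication model are obtained by training on third operation behavior sample data in third sample data and the identity authentication result corresponding to the third operation behavior sample data. That identity authentication result is obtained by inputting the third operation behavior sample data into the first authentication model and the second authentication model to obtain recognition results of the third operation behavior sample data, and inputting those recognition results into the decision fusion model. The third sample data includes the third operation behavior sample data.
In the process of obtaining sample data, erroneous data may be introduced, for example through the user's conscious or unconscious behavior, so that the labels corresponding to the sample data are wrong. In the embodiments of this application, the output of the decision fusion model is used as feedback: the authentication models are trained again according to the identity authentication results of the decision fusion model, which further improves the accuracy of the authentication models and therefore of the identity authentication model.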
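In a second aspect, a training method for an identity authentication model is provided. The identity authentication model includes a first authentication model, a second authentication model, and a decision fusion model, where the first and second authentication models are an anomaly detection model and a classification model, respectively. The method includes: obtaining second sample data, where the second sample data includes second operation behavior sample data and a label corresponding to the second operation behavior sample data, and the label indicates whether the user corresponding to the second operation behavior sample data is an authenticated user or a non-authenticated user; inputting the second operation behavior sample data in the second sample data into the first authentication model and the second authentication model to obtain recognition results of the second operation behavior sample data, where the model parameters of the first authentication model are obtained by training on first sample data, the first sample data includes first operation behavior sample data and a corresponding label indicating whether the user corresponding to the first operation behavior sample data is an authenticated user or a non-authenticated user, the model parameters of the second authentication model are obtained by training on fifth sample data, and the fifth sample data includes fifth operation behavior sample data and a corresponding label indicating whether the user corresponding to the fifth operation behavior sample data is an authenticated user or a non-authenticated user; and training the decision fusion model with the recognition results of the second operation behavior sample data as its input and the label corresponding to the second operation behavior sample data as its target output value, to obtain a trained decision fusion model.
In the embodiments of this application, the operation behavior data of an authenticated user may be called a positive sample, and the operation behavior data of a non-authenticated user may be called a negative sample.
Sample data may be determined from the raw data collected by sensors. Specifically, feature data may be extracted from the raw data to obtain the sample data. The sensors may include a touch screen sensor and/or a motion sensor.
Alternatively, the sample data may also include preset data.
For example, positive samples may be determined from the raw data collected by the sensors, and negative samples may be preset data.
In the embodiments of this application, the recognition results of two authentication models are input into the decision fusion model for decision fusion to obtain the identity authentication result, and training the decision fusion model improves the accuracy of identity authentication. In addition, because authentication is based on the user's operation behavior data, the user's usage habits do not need to change, which facilitates imperceptible (frictionless) authentication.
Furthermore, by using both an anomaly detection model and a classification model, the anomaly detection model improves the recognition rate for the authenticated user, the classification model improves attack resistance, and together they improve the overall recognition capability of the identity authentication model.
Furthermore, both the anomaly detection model and the classification model can use small-scale algorithms; for example, the anomaly detection model can be a one-class SVM and the classification model can be a two-class SVM. As a result, the algorithmic overhead of identity authentication is low, and no additional hardware support is required. The identity authentication model can be trained on the user's device, which allows data to be stored securely on the device and avoids the privacy and security problems caused by uploading data to the cloud.
It should be noted that the first sample data and the second sample data may be the same or different.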
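With reference to the second aspect, in some implementations of the second aspect, the recognition results of the second operation behavior sample data include the recognition result output by the first authentication model and the recognition result output by the second authentication model. The recognition result output by the first authentication model includes a matching score corresponding to the second operation behavior sample data and/or a matching result corresponding to the second operation behavior sample data, both output by the first authentication model. The recognition result output by the second authentication model includes a matching score corresponding to the second operation behavior sample data and/or a matching result corresponding to the second operation behavior sample data, both output by the second authentication model. The matching score indicates the probability that the user corresponding to the second operation behavior sample data is recognized as an authenticated user; the matching result indicates whether that user is recognized as an authenticated user or a non-authenticated user, and includes at least two matching results determined from at least two thresholds and the matching score corresponding to the second operation behavior sample data.
It should be understood that the at least two thresholds of the two authentication models may differ.
For example, the two authentication models are an anomaly detection model and a classification model. The recognition results of the second operation behavior sample data include the recognition result output by the anomaly detection model and the recognition result output by the classification model.
The recognition result output by the anomaly detection model includes matching score A corresponding to the second operation behavior sample data and matching result A corresponding to the second operation behavior sample data.
For example, the at least two thresholds include first threshold A and second threshold A, where first threshold A is greater than second threshold A. Matching result A output by the anomaly detection model includes a matching result determined from first threshold A and a matching result determined from second threshold A. When matching score A is greater than or equal to first threshold A, the matching result determined from first threshold A is "authenticated user"; when matching score A is less than first threshold A, it is "non-authenticated user". When matching score A is greater than or equal to second threshold A, the matching result determined from second threshold A is "authenticated user"; when matching score A is less than second threshold A, it is "non-authenticated user".
The recognition result output by the classification model includes matching score B corresponding to the second operation behavior sample data and matching result B corresponding to the second operation behavior sample data.
For example, the at least two thresholds include first threshold B and second threshold B, where first threshold B is greater than second threshold B. Matching result B output by the classification model includes a matching result determined from first threshold B and a matching result determined from second threshold B. When matching score B is greater than or equal to first threshold B, the matching result determined from first threshold B is "authenticated user"; when matching score B is less than first threshold B, it is "non-authenticated user". When matching score B is greater than or equal to second threshold B, the matching result determined from second threshold B is "authenticated user"; when matching score B is less than second threshold B, it is "non-authenticated user".
First threshold A and first threshold B may be the same or different. Second threshold A and second threshold B may be the same or different.
For example, the at least two thresholds may be determined according to the accuracy of the recognition results that the two authentication models output for the second operation behavior sample data.
In the embodiments of this application, the recognition results output by the two authentication models include matching scores and at least two matching results each. Unlike existing authentication models, which provide only a single recognition result, this solution provides the decision fusion model with more features, which helps train a better decision fusion model, improves its accuracy, and therefore improves the accuracy of identity authentication.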
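With reference to the second aspect, in some implementations of the second aspect, the at least two thresholds include a first threshold, and the method further includes: obtaining fourth sample data, where the fourth sample data includes fourth operation behavior sample data and a corresponding label indicating whether the user corresponding to the fourth operation behavior sample data is an authenticated user or a non-authenticated user; inputting the fourth operation behavior sample data into the first authentication model to obtain the matching score corresponding to the fourth operation behavior sample data output by the first authentication model, where the matching score indicates the probability that the user corresponding to the fourth operation behavior sample data is recognized as an authenticated user; determining, based on multiple candidate thresholds, multiple candidate matching results for the matching score corresponding to the fourth operation behavior sample data, where the candidate matching results indicate whether the user corresponding to the fourth operation behavior sample data is recognized as an authenticated user or a non-authenticated user; and determining, as the first threshold, the candidate threshold whose candidate matching results have an accuracy that satisfies a preset condition.
For example, the preset condition may be that the accuracy of the candidate matching results reaches a set value.
It should be noted that the first threshold of the second authentication model can also be determined in the above manner, which is not repeated here.
In the embodiments of this application, a first threshold that satisfies the preset condition is selected from the candidate thresholds. Different first thresholds tune different aspects of the authentication model's performance so that the performance meets expectations; for example, different thresholds adjust the model's attack resistance and owner recognition rate. This helps train a better decision fusion model, improves its accuracy, and therefore improves the accuracy of identity authentication.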
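The threshold selection described above can be sketched as follows, assuming the preset condition is "accuracy reaches a set value" as in the example given in the text. The function names, the candidate list, and the rule of returning the first qualifying candidate are illustrative assumptions.

```python
def accuracy(scores, labels, threshold):
    """Fraction of samples whose match result (score >= threshold) equals the label."""
    correct = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return correct / len(scores)


def select_threshold(scores, labels, candidates, min_accuracy):
    """Return the first candidate threshold whose accuracy meets the preset
    condition, or None if no candidate qualifies."""
    for threshold in candidates:
        if accuracy(scores, labels, threshold) >= min_accuracy:
            return threshold
    return None


# Hypothetical matching scores of the fourth sample data and their labels.
sample_scores = [0.9, 0.8, 0.4, 0.2]
sample_labels = [1, 1, 0, 0]
chosen = select_threshold(sample_scores, sample_labels, [0.1, 0.5, 0.95], 1.0)
```

Running the same selection with a different accuracy measure (for example, accuracy on negative samples only) would yield a threshold tuned toward attack resistance rather than owner recognition.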
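With reference to the second aspect, in some implementations of the second aspect, training the decision fusion model with the recognition results of the second operation behavior sample data as its input and the label corresponding to the second operation behavior sample data as its target output value, to obtain the trained decision fusion model, includes: obtaining a first score feature of the second operation behavior sample data from the matching score output by the first authentication model for the second operation behavior sample data; obtaining a second score feature of the second operation behavior sample data from the matching score output by the second authentication model for the second operation behavior sample data; and training the decision fusion model with the first score feature, the second score feature, and the recognition results of the second operation behavior sample data as its input and the label corresponding to the second operation behavior sample data as its target output value.
In the embodiments of this application, extracting features from the matching scores provides the decision fusion model with still more features, which helps train a better decision fusion model, improves its accuracy, and therefore improves the accuracy of identity authentication.
With reference to the second aspect, in some implementations of the second aspect, the second operation behavior sample data includes at least one of the following: X/Y coordinates of the touch point, touch area, touch pressure, touch speed, touch acceleration, slope of the touch trajectory, touch length, touch displacement, touch angle, touch direction, accelerometer X/Y/Z axis data, or gyroscope X/Y/Z axis data.
The first, third, fourth, or fifth operation behavior sample data may also include at least one of the above items.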
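With reference to the second aspect, in some implementations of the second aspect, the second sample data is obtained by filtering according to the duration of the user's swipe on the touch screen and/or the number of touch points of the user on the touch screen.
Specifically, before feature data is extracted from the raw data collected by the sensors, the raw data may be preprocessed according to the duration of the user's swipe on the touch screen and/or the number of touch points of the user on the touch screen, so as to select valid operation behavior data. Feature data is then extracted from the valid operation behavior data to obtain the second sample data.
For example, a touch behavior is valid if the number of touch points is greater than a preset threshold; operation behavior data whose number of touch points is greater than the preset threshold is selected from the raw data as valid operation behavior data. Feature data is then extracted from the valid operation behavior data to obtain the second sample data.
Because a user may consciously or unconsciously produce abnormal behavior, using such data as training samples reduces the accuracy of the training samples and in turn affects the identity authentication result. In the embodiments of this application, filtering the raw data removes the user's own abnormal operation behavior data, which improves the accuracy of the training samples and therefore of model training.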
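A minimal sketch of this filtering step is shown below. The rule that the number of touch points must exceed a preset threshold comes from the text; the minimum swipe duration, the concrete threshold values, and the record layout (dictionaries with a timestamp key `t` in milliseconds) are assumptions made for illustration.

```python
def is_valid_stroke(points, min_points=5, min_duration_ms=50):
    """Decide whether one touch stroke counts as valid operation behaviour.

    points: list of dicts, each with a timestamp 't' in milliseconds.
    """
    if len(points) <= min_points:          # number of touch points must exceed the threshold
        return False
    duration = points[-1]["t"] - points[0]["t"]
    return duration >= min_duration_ms     # assumed lower bound on swipe duration


def filter_strokes(strokes, **kwargs):
    """Keep only valid strokes; features are extracted from these afterwards."""
    return [s for s in strokes if is_valid_stroke(s, **kwargs)]


long_stroke = [{"t": 10 * i} for i in range(8)]    # 8 points spanning 70 ms
short_stroke = [{"t": 0}, {"t": 5}]                # too few points
valid = filter_strokes([long_stroke, short_stroke])
```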
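With reference to the second aspect, in some implementations of the second aspect, the method further includes: obtaining third sample data, where the third sample data includes third operation behavior sample data; inputting the third operation behavior sample data into the first authentication model to obtain the recognition result of the third operation behavior sample data output by the first authentication model; inputting the third operation behavior sample data into the second authentication model to obtain the recognition result of the third operation behavior sample data output by the second authentication model; inputting the recognition results output by the first authentication model and the second authentication model into the trained decision fusion model to obtain the identity authentication result corresponding to the third operation behavior sample data; and training the first authentication model and/or the second authentication model according to the third operation behavior sample data and the corresponding identity authentication result.
In other words, after the at least two authentication models have been trained on the first sample data, the two authentication models can be trained again according to the third operation behavior sample data and the corresponding identity authentication result.
For example, the first operation behavior sample data and the third operation behavior sample data may include the same sample data. Training the two authentication models according to the third operation behavior sample data and the corresponding identity authentication result then includes: filtering the first sample data according to the label corresponding to the first operation behavior sample data and the identity authentication result corresponding to the third operation behavior sample data, and training the at least two authentication models again on the filtered first sample data.
Specifically, sample data whose label differs from the identity authentication result corresponding to the third operation behavior sample data may be removed from the first sample data.
In the process of obtaining sample data, erroneous data may be introduced, for example through the user's conscious or unconscious behavior, so that the labels corresponding to the sample data are wrong. In the embodiments of this application, the output of the decision fusion model is used as feedback: the authentication models are trained again according to the identity authentication results of the decision fusion model, which further improves the accuracy of the authentication models and therefore of the identity authentication model.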
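The sample-cleaning step described above (dropping samples whose label disagrees with the fused identity authentication result before retraining) can be sketched as follows. The function name and the toy data are hypothetical; the retraining itself is omitted.

```python
def filter_consistent(samples, labels, fusion_results):
    """Keep only samples whose label equals the identity authentication
    result produced by the decision fusion model for the same sample."""
    kept = [(x, y) for x, y, r in zip(samples, labels, fusion_results) if y == r]
    return [x for x, _ in kept], [y for _, y in kept]


# Toy example: the second sample is labelled 0 but the fusion model says 1,
# so it is treated as a suspected labelling error and removed.
clean_samples, clean_labels = filter_consistent(["a", "b", "c"], [1, 0, 1], [1, 1, 1])
```

The authentication models would then be trained again on `clean_samples` and `clean_labels`.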
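It should be understood that the extensions, limitations, explanations, and descriptions of related content in the first aspect also apply to the same content in the second aspect.
In a third aspect, an identity authentication apparatus is provided. The apparatus includes modules or units for executing the method in the first aspect or any implementation of the first aspect.
In a fourth aspect, a training apparatus for an identity authentication model is provided. The apparatus includes modules or units for executing the method in the second aspect or any implementation of the second aspect.
In a fifth aspect, an identity authentication apparatus is provided, including an input/output interface, a processor, and a memory. The processor is used to control the input/output interface to send and receive information, the memory is used to store a computer program, and the processor is used to call and run the computer program from the memory, so that the apparatus executes the method in the first aspect or any implementation of the first aspect.
Optionally, the apparatus may be a terminal device or server, or a chip in a terminal device or server.
Optionally, the memory may be located inside the processor, for example as a cache in the processor. The memory may also be located outside the processor and thus be independent of the processor, for example as the internal memory of the apparatus.
In a sixth aspect, a training apparatus for an identity authentication model is provided, including an input/output interface, a processor, and a memory. The processor is used to control the input/output interface to send and receive information, the memory is used to store a computer program, and the processor is used to call and run the computer program from the memory, so that the training apparatus executes the method in the second aspect or any implementation of the second aspect.
Optionally, the training apparatus may be a terminal device or server, or a chip in a terminal device or server.
Optionally, the memory may be located inside the processor, for example as a cache in the processor. The memory may also be located outside the processor and thus be independent of the processor, for example as the internal memory of the training apparatus.
In a seventh aspect, a computer program product is provided. The computer program product includes computer program code that, when run on a computer, causes the computer to execute the methods in the above aspects.
It should be noted that the computer program code may be stored in whole or in part on a first storage medium. The first storage medium may be packaged together with the processor or packaged separately from the processor, which is not specifically limited in the embodiments of this application.
In an eighth aspect, a computer-readable medium is provided. The computer-readable medium stores program code that, when run on a computer, causes the computer to execute the methods in the above aspects.
In a ninth aspect, a chip is provided. The chip includes a processor and a data interface, and the processor reads instructions stored in a memory through the data interface to execute the methods in the above aspects.
Optionally, as an implementation, the chip may further include a memory in which instructions are stored. The processor is used to execute the instructions stored in the memory, and when the instructions are executed, the processor executes the method in any implementation of the first aspect or the second aspect.
The chip may specifically be a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).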
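Brief Description of the Drawings
FIG. 1 is a schematic diagram of an artificial intelligence main framework provided by an embodiment of this application;
FIG. 2 is a schematic structural diagram of a system architecture provided by an embodiment of this application;
FIG. 3 is a schematic structural diagram of another system architecture provided by an embodiment of this application;
FIG. 4 is a schematic structural diagram of yet another system architecture provided by an embodiment of this application;
FIG. 5 is a schematic structural diagram of an identity authentication module provided by an embodiment of this application;
FIG. 6 is a schematic flowchart of a training method for an identity authentication model provided by an embodiment of this application;
FIG. 7 is a schematic flowchart of another training method for an identity authentication model provided by an embodiment of this application;
FIG. 8 is a schematic flowchart of an identity authentication method provided by an embodiment of this application;
FIG. 9 is a schematic flowchart of decision fusion provided by an embodiment of this application;
FIG. 10 is a schematic flowchart of another decision fusion provided by an embodiment of this application;
FIG. 11 is a schematic flowchart of an application method provided by an embodiment of this application;
FIG. 12 is a schematic diagram of an application scenario provided by an embodiment of this application;
FIG. 13 is a schematic diagram of another application scenario provided by an embodiment of this application;
FIG. 14 is a schematic diagram of yet another application scenario provided by an embodiment of this application;
FIG. 15 is a schematic block diagram of a training apparatus for an identity authentication model provided by an embodiment of this application;
FIG. 16 is a schematic block diagram of an identity authentication apparatus provided by an embodiment of this application;
FIG. 17 is a schematic block diagram of a training apparatus for an identity authentication model provided by an embodiment of this application;
FIG. 18 is a schematic block diagram of an identity authentication apparatus provided by an embodiment of this application;
FIG. 19 is a schematic block diagram of an identity authentication apparatus and a training apparatus for an identity authentication model provided by an embodiment of this application.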
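Detailed Description
The technical solutions in the embodiments of this application are described below with reference to the accompanying drawings.
FIG. 1 shows a schematic diagram of an artificial intelligence main framework. The main framework describes the overall workflow of an artificial intelligence system and applies to the needs of the general artificial intelligence field.
The artificial intelligence main framework is described in detail below along two dimensions: the "intelligent information chain" (horizontal axis) and the "information technology (IT) value chain" (vertical axis).
The "intelligent information chain" reflects a series of processes from data acquisition to data processing. For example, it can be the general process of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, and intelligent execution and output. In this process, data goes through a refinement of "data, information, knowledge, wisdom".
The "IT value chain" reflects the value that artificial intelligence brings to the information technology industry, from the underlying infrastructure of artificial intelligence and information (the provision and processing technology implementations) up to the industrial ecosystem of the system.
(1) Infrastructure:
The infrastructure provides computing power for the artificial intelligence system, enables communication with the outside world, and provides support through the basic platform.
The infrastructure can communicate with the outside through sensors, and its computing power can be provided by smart chips.
The smart chip here may be a hardware acceleration chip such as a central processing unit (CPU), a neural-network processing unit (NPU), a graphics processing unit (GPU), an application specific integrated circuit (ASIC), or a field programmable gate array (FPGA).
The basic platform of the infrastructure may include platform guarantees and support such as a distributed computing framework and networks, and may include cloud storage and computing and interconnection networks.
For example, the infrastructure can obtain data through sensors and external communication, and then provide the data to smart chips in the distributed computing system provided by the basic platform for computation.
(2) Data:
The data at the layer above the infrastructure represents the data sources of the artificial intelligence field. The data involves graphics, images, speech, and text, as well as Internet of Things data from traditional devices, including business data from existing systems and sensed data such as force, displacement, liquid level, temperature, and humidity.
(3) Data processing:
Data processing usually includes data training, machine learning, deep learning, search, reasoning, decision-making, and other processing methods.
Machine learning and deep learning can perform symbolic and formalized intelligent information modeling, extraction, preprocessing, and training on data.
Reasoning refers to the process of simulating human intelligent reasoning in a computer or intelligent system, using formalized information to carry out machine thinking and solve problems according to a reasoning control strategy; typical functions are search and matching.
Decision-making refers to the process of making decisions after intelligent information has been reasoned about, and usually provides functions such as classification, ranking, and prediction.
(4) General capabilities:
After the data has undergone the data processing mentioned above, some general capabilities can be formed based on the results, such as an algorithm or a general system, for example translation, text analysis, computer vision processing, speech recognition, and image recognition.
(5) Smart products and industry applications:
Smart products and industry applications refer to the products and applications of artificial intelligence systems in various fields. They encapsulate the overall artificial intelligence solution, productize intelligent information decision-making, and put it into practical use. Application fields mainly include smart manufacturing, smart transportation, smart home, smart healthcare, smart security, autonomous driving, safe city, and smart terminals.
The embodiments of this application can be applied in scenarios that require identity authentication. For example, the method provided in the embodiments of this application can be applied to smart terminal unlocking, application (APP) login, secure payment, and other scenarios that require identity authentication.
Two common application scenarios are briefly introduced below.
Terminal device unlocking:
When the screen of a terminal device (for example, a mobile phone) is locked, the user must unlock the screen before using the device. Unlocking the screen through identity authentication improves system security and protects the user's property and privacy.
Using the identity authentication method of the embodiments of this application to unlock the screen makes it possible to distinguish authenticated users from non-authenticated users more accurately, which improves system security.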
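Secure payment:
When a terminal device (for example, a mobile phone) is used for payment, the user must pass identity authentication before the payment can be executed, in order to protect the user's property and privacy.
Using the identity authentication method of the embodiments of this application makes it possible to distinguish authenticated users from non-authenticated users more accurately, which improves system security.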
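To facilitate understanding of the embodiments of this application, the related terms and concepts involved are introduced first.
(1) Support vector machine (SVM)
A support vector machine is a binary classification model whose goal is to find a hyperplane that separates the sample data. The learning strategy of an SVM is margin maximization, which can be formalized as solving a convex quadratic programming problem.
(2) Neural network
A neural network may be composed of neural units. A neural unit may refer to an operation unit that takes $x_s$ and an intercept of 1 as inputs, and the output of the operation unit may be:
$h_{W,b}(x) = f\left(\sum_{s=1}^{n} W_s x_s + b\right)$
where s = 1, 2, ..., n, n is a natural number greater than 1, $W_s$ is the weight of $x_s$, and b is the bias of the neural unit. f is the activation function of the neural unit, which introduces a nonlinear characteristic into the neural network and converts the input signal of the neural unit into an output signal. The output signal of the activation function can be used as the input of the next convolutional layer, and the activation function may be a sigmoid function. A neural network is a network formed by connecting many such single neural units together, that is, the output of one neural unit can be the input of another neural unit. The input of each neural unit can be connected to the local receptive field of the previous layer to extract the features of the local receptive field, and the local receptive field may be a region composed of several neural units.
(3) Deep neural network
A deep neural network (DNN), also called a multi-layer neural network, can be understood as a neural network with multiple hidden layers. Dividing a DNN by the position of its layers, the layers inside the DNN fall into three categories: input layer, hidden layers, and output layer. Generally, the first layer is the input layer, the last layer is the output layer, and all the layers in between are hidden layers. The layers are fully connected, that is, any neuron in the i-th layer is connected to every neuron in the (i+1)-th layer.
Although a DNN looks complicated, the work of each layer is not. In short, each layer computes the following linear relationship followed by an activation:
$\vec{y} = \alpha(W\vec{x} + \vec{b})$
where $\vec{x}$ is the input vector, $\vec{y}$ is the output vector, $\vec{b}$ is the offset vector, W is the weight matrix (also called the coefficients), and $\alpha()$ is the activation function. Each layer simply applies this operation to the input vector $\vec{x}$ to obtain the output vector $\vec{y}$. Because a DNN has many layers, there are also many coefficients W and offset vectors $\vec{b}$. These parameters are defined in the DNN as follows, taking the coefficient W as an example: in a three-layer DNN, the linear coefficient from the 4th neuron of the second layer to the 2nd neuron of the third layer is defined as $W^{3}_{24}$. The superscript 3 is the layer in which the coefficient W is located, and the subscripts correspond to the output index 2 in the third layer and the input index 4 in the second layer.
In summary, the coefficient from the k-th neuron of layer L-1 to the j-th neuron of layer L is defined as $W^{L}_{jk}$.
Note that the input layer has no W parameters. In a deep neural network, more hidden layers make the network better able to describe complex situations in the real world. In theory, a model with more parameters has higher complexity and greater "capacity", which means that it can complete more complex learning tasks. Training a deep neural network is the process of learning the weight matrices, and its ultimate goal is to obtain the weight matrices of all layers of the trained deep neural network (the weight matrices formed by the vectors W of many layers).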
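The per-layer computation $\vec{y} = \alpha(W\vec{x} + \vec{b})$ can be sketched in plain Python. The sigmoid activation is one of the options named in the text; the function names and the nested-list representation of W are illustrative, with `W[j][k]` playing the role of the coefficient from neuron k of the previous layer to neuron j of the current layer.

```python
import math


def sigmoid(z: float) -> float:
    """Sigmoid activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(W, x, b):
    """Compute y = sigmoid(W x + b) for one fully connected layer.

    W: list of rows, one row per output neuron j; W[j][k] weights input k.
    x: input vector, b: offset vector (one entry per output neuron).
    """
    return [
        sigmoid(sum(w_jk * x_k for w_jk, x_k in zip(row, x)) + b_j)
        for row, b_j in zip(W, b)
    ]


# Two inputs, three output neurons; all-zero parameters give 0.5 everywhere.
y = layer_forward([[0.0, 0.0]] * 3, [1.0, 2.0], [0.0, 0.0, 0.0])
```

Stacking several calls to `layer_forward`, each with its own W and b, gives the forward pass of a multi-layer network.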
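(4) Loss function
When training a deep neural network, the goal is for the output of the network to be as close as possible to the value that is actually desired. The predicted value of the current network is therefore compared with the desired target value, and the weight vectors of each layer are updated according to the difference between the two (before the first update there is usually an initialization process, in which parameters are preconfigured for each layer of the deep neural network). For example, if the predicted value of the network is too high, the weight vectors are adjusted to lower the prediction, and the adjustment continues until the deep neural network can predict the desired target value or a value very close to it. It is therefore necessary to define in advance how to compare the difference between the predicted value and the target value. This is the loss function or objective function, which is an important equation for measuring the difference between the predicted value and the target value. Taking the loss function as an example, a higher output value (loss) indicates a larger difference, so training a deep neural network becomes a process of reducing this loss as much as possible.
(5) Backpropagation algorithm
A neural network can use the error backpropagation (BP) algorithm to correct the values of the parameters in the neural network model during training, so that the reconstruction error loss of the model becomes smaller and smaller. Specifically, passing the input signal forward to the output produces an error loss, and the parameters of the neural network model are updated by propagating the error loss information backward, so that the error loss converges. The backpropagation algorithm is a backward propagation process driven by the error loss, and aims to obtain the optimal parameters of the neural network model, such as the weight matrices.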
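As a small worked example of a loss function and a gradient update obtained by the chain rule, the sketch below trains a single sigmoid neuron with a mean squared error loss. The choice of loss, the learning rate, and all names are assumptions made for illustration; a full network would propagate the same kind of gradient through every layer.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b, xs, ys):
    """Mean squared error between predictions and target values."""
    return sum((sigmoid(w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_step(w, b, xs, ys, lr=0.5):
    """One gradient descent update of (w, b) using the chain rule."""
    grad_w = grad_b = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        delta = 2.0 * (p - y) * p * (1.0 - p)   # dLoss/dz for this sample
        grad_w += delta * x                     # dz/dw = x
        grad_b += delta                         # dz/db = 1
    n = len(xs)
    return w - lr * grad_w / n, b - lr * grad_b / n


xs, ys = [-1.0, 1.0], [0.0, 1.0]
loss_before = loss(0.0, 0.0, xs, ys)
w1, b1 = gradient_step(0.0, 0.0, xs, ys)
loss_after = loss(w1, b1, xs, ys)
```

Repeating `gradient_step` keeps reducing the loss on this toy data, which is the sense in which training is a process of shrinking the loss.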
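As shown in FIG. 2, an embodiment of this application provides a system architecture 100. The system architecture 100 includes an execution device 110, a training device 120, a database 130, a client device 140, a data storage system 150, and a data collection device 160.
In addition, the execution device 110 includes a computing module 111, an I/O interface 112, a preprocessing module 113, and a preprocessing module 114. The computing module 111 may include a target model/rule 101, and the preprocessing module 113 and the preprocessing module 114 are optional.
In FIG. 2, the data collection device 160 is used to collect training data. For the identity authentication method of the embodiments of this application, the training data may include operation behavior data of users and the identity authentication results corresponding to the operation behavior data. For example, the users include authenticated users and non-authenticated users, and the identity authentication result corresponding to a user's operation behavior data is "authenticated user" or "non-authenticated user".
After collecting the training data, the data collection device 160 stores the training data in the database 130, and the training device 120 trains the target model/rule 101 based on the training data maintained in the database 130.
The following describes how the training device 120 obtains the target model/rule 101 from the training data. The training device 120 processes the input operation behavior data and compares the output identity authentication result with the true identity authentication result corresponding to the operation behavior data, until the difference between the output identity authentication result and the true identity authentication result is less than a certain threshold, thereby completing the training of the target model/rule 101.
The target model/rule 101 can be used for identity authentication. In the embodiments of this application, the target model/rule 101 may specifically include a neural network, an SVM, or the like.
It should be noted that, in practical applications, the training data maintained in the database 130 is not necessarily all collected by the data collection device 160, and may also be received from other devices. It should also be noted that the training device 120 does not necessarily train the target model/rule 101 entirely on the training data maintained in the database 130; it may also obtain training data from the cloud or elsewhere for model training. The above description should not be taken as a limitation on the embodiments of this application.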
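The target model/rule 101 trained by the training device 120 can be applied to different systems or devices, for example the execution device 110 shown in FIG. 2. The execution device 110 may be a terminal, such as a mobile phone, a tablet computer, a laptop computer, an augmented reality (AR)/virtual reality (VR) device, or a vehicle-mounted terminal, or may be a server or the cloud. In FIG. 2, the execution device 110 is configured with an input/output (I/O) interface 112 for data interaction with external devices. A user can input data to the I/O interface 112 through the client device 140. In the embodiments of this application, the input data may include the user's operation behavior data input by the client device. The client device 140 here may specifically be a terminal device.
The preprocessing module 113 and the preprocessing module 114 preprocess the input data received by the I/O interface 112. For example, the preprocessing module 113 can extract feature data from the operation behavior data of the user to be authenticated. In the embodiments of this application, the preprocessing module 113 and the preprocessing module 114 may both be absent (or only one of them may be present), in which case the computing module 111 processes the input data directly.
When the execution device 110 preprocesses the input data, or when the computing module 111 of the execution device 110 performs computation or other related processing, the execution device 110 can call data, code, and the like in the data storage system 150 for the corresponding processing, and can also store the data, instructions, and the like obtained from the processing in the data storage system 150.
Finally, the I/O interface 112 returns the processing result, such as the identity authentication result obtained above, to the client device 140, which provides it to the user.
In the embodiments of this application, the identity authentication result may be "authentication succeeded" or "authentication failed". Authentication success means that the user to be authenticated is an authenticated user, and authentication failure means that the user to be authenticated is a non-authenticated user. For example, the identity authentication result may be unlock success or unlock failure, or login success or login failure. It should be understood that the above is only an illustration; in different application scenarios, the identity authentication result may take different forms, which is not limited in the embodiments of this application.
It is worth noting that the training device 120 can generate corresponding target models/rules 101 from different training data for different goals or tasks, and the corresponding target models/rules 101 can be used to achieve those goals or complete those tasks, thereby providing the user with the desired results.
In the case shown in FIG. 2, the user can manually specify the input data through the interface provided by the I/O interface 112. In another case, the client device 140 can automatically send input data to the I/O interface 112; if the client device 140 needs the user's authorization to send input data automatically, the user can set the corresponding permission in the client device 140. The user can view the result output by the execution device 110 on the client device 140, and the result may be presented as a display, a sound, an action, or in another specific manner. The client device 140 can also act as a data collection terminal, collecting the input data of the I/O interface 112 and the output results of the I/O interface 112 shown in the figure as new sample data and storing them in the database 130. Alternatively, the collection may bypass the client device 140, and the I/O interface 112 may directly store its input data and output results shown in the figure in the database 130 as new sample data.
It is worth noting that FIG. 2 is only a schematic diagram of a system architecture provided by an embodiment of this application, and the positional relationships between the devices, components, and modules shown in the figure do not constitute any limitation. For example, in FIG. 2 the data storage system 150 is an external memory relative to the execution device 110; in other cases, the data storage system 150 may be placed inside the execution device 110.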
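As shown in FIG. 3, an embodiment of this application provides a system architecture 300. The system architecture includes a local device 301, a local device 302, an execution device 310, and a data storage system 350, where the local device 301 and the local device 302 are connected to the execution device 310 through a communication network.
The execution device 310 may be implemented by one or more servers. Optionally, the execution device 310 can be used together with other computing devices, such as data storage devices, routers, and load balancers. The execution device 310 may be located at one physical site or distributed across multiple physical sites. The execution device 310 can use the data in the data storage system 350, or call the program code in the data storage system 350, to implement the identity authentication method and the training method for the identity authentication model of the embodiments of this application.
For example, the data storage system 350 may be deployed in the local device 301 or the local device 302; for example, the data storage system 350 may be used to store training samples.
Specifically, in one implementation, the execution device 310 may perform the following process:
obtaining second sample data, where the second sample data includes second operation behavior sample data and a label corresponding to the second operation behavior sample data, and the label indicates whether the user corresponding to the second operation behavior sample data is an authenticated user or a non-authenticated user;
inputting the second operation behavior sample data in the second sample data into the first authentication model and the second authentication model to obtain recognition results of the second operation behavior sample data, where the model parameters of the first authentication model are obtained by training on first sample data, the first sample data includes first operation behavior sample data and a corresponding label indicating whether the user corresponding to the first operation behavior sample data is an authenticated user or a non-authenticated user, the model parameters of the second authentication model are obtained by training on fifth sample data, and the fifth sample data includes fifth operation behavior sample data and a corresponding label indicating whether the user corresponding to the fifth operation behavior sample data is an authenticated user or a non-authenticated user;
training the decision fusion model with the recognition results of the second operation behavior sample data as its input and the label corresponding to the second operation behavior sample data as its target output value, to obtain a trained decision fusion model.
Through the above process, the execution device 310 can obtain an identity authentication model that improves the accuracy of identity authentication.
Users can operate their respective user devices (for example, the local device 301 and the local device 302) to interact with the execution device 310. Each local device may be any computing device, such as a personal computer, a computer workstation, a smartphone, a tablet computer, a smart camera, a smart car or another type of cellular phone, a media consumption device, a wearable device, a set-top box, or a game console.
Each user's local device can interact with the execution device 310 through a communication network of any communication mechanism or standard. The communication network may be a wide area network, a local area network, a point-to-point connection, or any combination thereof.
In one implementation, the local device 301 and the local device 302 obtain the relevant parameters of the identity authentication model from the execution device 310, deploy the identity authentication model on the local device 301 and the local device 302, and use the identity authentication model for identity authentication.
In another implementation, the identity authentication model can be deployed directly on the execution device 310. The execution device 310 obtains the user behavior to be processed from the local device 301 and the local device 302, and uses the identity authentication model to authenticate the user behavior to be processed.
The execution device 310 may also be a cloud device, in which case it can be deployed in the cloud; alternatively, the execution device 310 may be a terminal device, in which case it can be deployed on the user terminal side. This is not limited in the embodiments of this application.
For example, the data storage system 350 may be deployed in the local device 301 or the local device 302; for example, the data storage system 350 may be used to store training samples.
For example, the data storage system 350 may be independent of the local device 301 and the local device 302 and deployed separately on a storage device. The storage device can interact with the local devices, obtain the user behavior logs in the local devices, and store them in the storage device.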
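FIG. 4 shows a schematic diagram of a system architecture provided by an embodiment of this application. As shown in FIG. 4, the system 400 may include an APP 410, a data collection module 420, an identity authentication module 430, and a storage module 440.
The application layer includes multiple APPs 410. An APP 410 can request the identity authentication service in specific scenarios, for example when an application lock is unlocked or a payment app is used. This enables imperceptible identity authentication. Unlocking an application lock can also be called application unlocking, that is, opening the application after identity authentication.
For example, the APP 410 may be an APP that integrates an identity authentication software development kit (SDK).
The framework layer provides the data collection module 420 and the identity authentication module 430 for use by upper-layer APPs.
The data collection module 420 is used to listen to sensor data. For example, the sensors include an ambient light sensor, health sensors, a sound sensor, a touch screen sensor, and a motion sensor. A motion sensor is a sensor that can monitor the movement state of the device in real time and can be embedded in the device, for example an accelerometer, a gyroscope, or a magnetometer. The sound sensor may include a microphone or a speaker.
Specifically, the data collection module can listen to sensor data through the sensor manager of the native layer.
The sensor manager is the overall manager of sensor events and is used to read events, dispatch events, and so on. For example, the sensor manager can create a listener to listen to the events of a certain sensor.
Specifically, the sensor manager interacts with the sensors through the sensor drivers in the kernel.
The identity authentication module 430 is used for identity authentication. For example, the identity authentication module 430 executes the identity authentication method in the embodiments of this application to perform identity authentication. Further, the identity authentication module 430 is also used for modeling and training of the identity authentication model. For example, the identity authentication module 430 executes the training method for the identity authentication model in the embodiments of this application to obtain a trained identity authentication model. It should be understood that the modeling and training of the identity authentication model by the identity authentication module 430 in FIG. 4 is only an illustration. Alternatively, the modeling and training of the identity authentication model can be completed by other devices. The trained identity authentication model can be stored in the storage module 440.
The storage module 440 is used to store the user's operation behavior data and the identity authentication model securely. As shown in FIG. 4, the user's operation behavior data can be stored in an operation behavior database.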
图5示出了本申请实施例中的一种身份认证模块500的示意性框图。图5可以作为图4中的身份认证模块430的一例。身份认证模块500能够实现特征提取510、行为建模520、行为匹配530、增量学习540、模型升级550以及防伪检测560等功能。
如图5所示,身份认证模块500可以用于实现特征提取510。例如,从原始数据中提取特征数据,该特征数据可以输入身份认证模型中,以实现身份认证。原始数据指的是传感器采集的原始数据。进一步地,特征提取还包括,其他对特征数据的预处理操作,例如,从提取的特征数据中剔除无效特征数据。
可选地,身份认证模块500可以基于特征数据建立身份认证模型,或者说,基于特征数据实现身份认证模型的建模训练,即图5中的行为建模520。示例性地,身份认证模块500可以执行图6中的方法700或图7中的方法730以训练身份认证模型。
身份认证模块500可以用于身份认证/行为匹配530。具体地,将特征数据输入身份认证模型中,实现身份认证。示例性地,身份认证模块500可以执行图8中的方法800,实现身份认证。应理解,该身份认证模型可以是身份认证模块500训练得到的,也就是说,该身份认证模块500可以用于建模,得到身份认证模型,然后利用该身份认证模型实现身份认证。可替换地,该身份认证模块500也可以利用其它设备训练得到的身份认证模型实现身份认证。
由于随着时间推移,用户的行为可能发生变化,因此用户的操作行为数据具有不稳定性。
可选地,身份认证模块500还可以用于增量学习540。
增量学习是指一个学习系统能不断地从新样本中学习新的知识,并能保存大部分以前已经学习到的知识。
具体地,在得到训练好的身份认证模型后,身份认证模块500可以持续采集用户的操作行为数据,也即新增的用户的操作行为数据,通过增量学习可以在原有的数据库的基础上,仅对由于新增的用户的操作行为数据所引起的变化进行身份认证模型的更新。
通过增量学习能够不断优化身份认证模型,适应变化的用户行为,增强识别能力。
可选地,身份认证模块500还可以用于实现模型升级550。具体地,用于身份认证模型的结构的升级、与云服务实现交互等。
示例性地,身份认证模块500可以获取升级后的身份认证模型的结构参数,以更新原身份认证模型的结构参数。也就是升级身份认证模型的结构。该身份认证模型的结构参数可以是人为设定的。
进一步地,身份认证模块500可以对升级后的身份认证模型进行训练。例如,从原始数据中提取特征数据,基于特征数据对升级后的身份认证模型进行训练。
进一步地,身份认证模块500与云服务实现交互,具体地,可以将升级后的身份认证模型的结构参数发送至云服务。
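Optionally, the identity authentication module 500 may also be used to implement anti-counterfeiting detection 560. Specifically, it is used to verify the legitimacy of the identity authentication module 500 and the like, for example, to ensure that the identity authentication model is not tampered with, so that secure storage is achieved.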
图6示出了本申请实施例的身份认证模型的训练方法700。方法700包括步骤S710至步骤S720。该方法700可以由图2中的训练设备120执行。该训练设备可以是云服务设备,也可以是移动终端,例如,电脑、服务器等能够用来训练身份认证模型的设备。
下面对步骤S710至步骤S720进行详细介绍。
S710,获取第一样本数据。其中,第一样本数据包括:第一操作行为样本数据和第一操作行为样本数据对应的标签,第一操作行为样本数据对应的标签用于表示第一操作行为样本数据对应的用户为认证用户或非认证用户。
可选地,步骤S710还包括:获取第五样本数据。其中,第五样本数据包括:第五操作行为样本数据和第五操作行为样本数据对应的标签,第五操作行为样本数据对应的标签用于表示第五操作行为样本数据对应的用户为认证用户或非认证用户。
认证用户也可以理解为机主,非认证用户也可以理解为攻击者。认证用户的数据可以称为正样本,非认证用户的数据可以称为负样本。
示例性地,认证用户的操作行为样本数据为正样本,非认证用户的操作行为样本数据为负样本。
其中,该第一样本数据可以是根据传感器采集的原始数据确定的。例如,该传感器包括触屏传感器和/或运动传感器。或者,第一样本数据也可以包括预先设置的数据。
例如,正样本可以是根据传感器采集的原始数据确定的,负样本可以是预先设置的数据。
示例性地,通过传感器采集数据可以包括:屏幕解锁时,注册触屏传感器和/或运动传感器监听;屏幕锁屏时,注销触屏传感器和/或运动传感器监听。
其中,触屏传感器采集的原始数据包括:时间戳、触摸点X/Y轴坐标、触摸面积、触摸压力、动作(action)和屏幕方向等。
运动传感器采集的原始数据包括:时间戳、加速度X/Y/Z轴数据、陀螺仪X/Y/Z轴数据等。
应理解,以上仅为示意,传感器采集的数据可以包括以上中的任一项或几项,也可以包括其他数据。
action指的是触屏事件。例如,当触屏行为为按下和抬起,则该触屏事件为用户的“点击”操作。当触屏行为为“按下、滑动和抬起”,则该触屏事件为用户的“滑动”操作。屏幕方向包括横屏或竖屏。
具体地,可以从原始数据中提取特征数据,得到第一样本数据。
可选地,第一操作行为样本数据包括以下数据中的至少一种:触摸点X/Y轴坐标、触摸面积、触摸压力、触屏速度、触屏加速度、触屏轨迹的斜率、触屏长度、触屏位移、触屏角度、触屏方向、加速度X/Y/Z轴数据或陀螺仪X/Y/Z轴数据。
示例性地,第一操作行为样本数据可以包括上述任意一个或几个特征的相关值。该相关值可以包括:开始值、结束值、平均值、标准差、20%分位数、50%分位数、80%分位数。例如,第一样本数据可以包括:触摸点X/Y轴坐标的开始值和结束值,触摸面积的开始值、结束值、平均值、标准差、20%分位数、50%分位数和80%分位数以及加速度X/Y/Z轴数据的平均值和标准差。
start值以及end值的定义方式可以是预先设定的。例如,传感器采集的数据中包括n个触摸点。触屏速度的start值指的是第一个触摸点和第二个触摸点之间的速度;触屏速度的end值指的是第n-1个触摸点和第n个触摸点之间的速度。
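The feature values described above can be sketched in code. The following is a minimal illustration, assuming each touch point is a `(timestamp_ms, x, y)` tuple and using linear-interpolation quantiles; the function names, the tuple layout, and the quantile definition are assumptions made for this example and are not specified by the application.

```python
import math
import statistics


def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def summarize(values):
    """Start, end, mean, standard deviation, and 20%/50%/80% quantiles."""
    return {
        "start": values[0],
        "end": values[-1],
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
        "q20": quantile(values, 0.2),
        "q50": quantile(values, 0.5),
        "q80": quantile(values, 0.8),
    }


def segment_speeds(points):
    """Speed between consecutive touch points given (timestamp_ms, x, y) tuples.

    With n touch points this yields n-1 speeds: the first is the speed between
    points 1 and 2 (the start value), the last between points n-1 and n (the end value).
    """
    speeds = []
    for (t0, x0, y0), (t1, x1, y1) in zip(points, points[1:]):
        dt = t1 - t0
        speeds.append(math.hypot(x1 - x0, y1 - y0) / dt if dt > 0 else 0.0)
    return speeds


# Hypothetical swipe: timestamps in milliseconds, coordinates in pixels.
points = [(0, 0, 0), (100, 3, 4), (200, 9, 12), (400, 9, 22)]
speed_features = summarize(segment_speeds(points))
```

The same `summarize` helper could be applied to touch area, touch pressure, or accelerometer axes to build the rest of the feature vector.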
可选地,在从原始数据中提取特征数据之前,可以对原始数据进行预处理,筛选出有效操作行为数据。然后对有效操作行为数据提取特征数据,得到第一操作行为样本数据。有效操作行为数据是在发生有效触屏行为的情况下采集的数据,其定义方式可以根据需要设定。例如,用户的触摸点数大于预设阈值即为有效触屏行为,从原始数据中筛选得到满足用户的触摸点数大于预设阈值的操作行为数据作为有效操作行为数据。
其中,一个触摸点代表一个触屏事件。触摸点数大于预设阈值指的是触屏事件超过预设阈值。
这样可以剔除用户本身的异常操作行为数据,提高了训练样本的准确性。
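The screening of valid operation behavior data can be illustrated with a short sketch. It assumes each touchscreen behavior is stored as a list of touch points and that "valid" means the number of touch points exceeds a preset threshold; the function name and data layout are hypothetical.

```python
def filter_valid_strokes(strokes, min_points):
    """Keep only strokes whose touch-point count exceeds min_points."""
    return [stroke for stroke in strokes if len(stroke) > min_points]


# Hypothetical raw data: each stroke is a list of (x, y) touch points.
raw_strokes = [
    [(0, 0), (1, 1)],
    [(0, 0), (2, 3), (4, 6), (6, 9), (8, 12)],
    [(5, 5)],
]
valid_strokes = filter_valid_strokes(raw_strokes, min_points=3)
```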
应理解,本申请实施例中对于第一样本数据的解释、说明、扩展等也适用于第五样本数据,本申请实施例中不再详细描述第五样本数据。
S720,建立身份认证模型。
可选地,在步骤S720之前,可以判断是否开始建立身份认证模型。
示例性地,当第一样本数据的采集的时段超过预设时长和/或第一样本数据的数据量超过预设数量,开始建立身份认证模型。如前所述,第一样本数据可以包括通过传感器采集的用户的操作行为数据,第一样本数据采集的时段可以包括传感器采集的时段。
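Alternatively, building of the identity authentication model starts when the collection period of the positive samples in the first sample data exceeds a preset duration and/or the amount of positive-sample data exceeds a preset quantity. For example, the preset duration is one week and the preset quantity is 2000. If the authenticated user has used the device for more than one week, that is, the collection period of the positive samples exceeds one week, and the number of valid touchscreen behaviors reaches 2000, that is, the valid operation behavior data reaches 2000 entries, building of the identity authentication model starts.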
在认证用户数据的采集满足一定时长、数据量达到一定数量的情况下,样本数据更能真实反映认证用户的习惯,通过设置模型建立的起始条件,能够得到更符合用户习惯的身份认证模型,提高了身份认证的准确率。
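The start condition for model building can be expressed as a small check. This is a sketch only: the defaults mirror the one-week and 2000-sample example in the text, while the function name and the `require_both` switch (covering the "and/or" wording) are assumptions.

```python
from datetime import timedelta


def ready_to_build(collected_for, positive_count,
                   min_duration=timedelta(weeks=1), min_count=2000,
                   require_both=True):
    """Return True when enough positive samples have been collected.

    collected_for: how long positive samples have been collected.
    positive_count: number of valid positive samples gathered so far.
    require_both: True for the "and" variant, False for the "or" variant.
    """
    duration_ok = collected_for > min_duration
    count_ok = positive_count > min_count
    if require_both:
        return duration_ok and count_ok
    return duration_ok or count_ok
```

For example, `ready_to_build(timedelta(days=8), 2500)` returns `True`, whereas eight days with only 1500 samples does not satisfy the "and" variant.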
该身份认证模型包括至少两个认证模型,可以基于第一样本数据对第一认证模型进行训练,基于第五样本数据对第二认证模型进行训练。
可选地,该第一认证模型和第二认证模型分别为异常检测模型和分类模型。
下面以第一认证模型为异常检测模型,第二认证模型为分类模型为例,对身份认证模型的训练方法进行说明。
具体地,步骤S720包括步骤S721和步骤S722。
S721,建立异常检测模型。
基于第一样本数据训练异常检测模型。
该异常检测(anomaly detection)模型用于检测异常数据,通过检测出非认证用户的数据来判断认证用户和非认证用户。
例如,异常检测模型可以采用单分类(one class)SVM或孤立森林(isolation forest)等。应理解,本申请实施例对异常检测模型所采用的算法不做限制。
可选地,步骤S721包括设定异常检测模型的参数,例如,数据异常率。
由于认证用户可能有意识或无意识产生异常行为,设定数据异常率能够保证认证用户检测的准确率。例如,数据异常率设定为0.1,理论上,能够保障认证用户的识别率达到90%以上。
特征区分度会影响异常检测模型的识别能力。在训练异常检测模型的过程中,特征区分度差会导致“相似行为”增多。也就是将不同的操作行为数据识别为相同的操作行为数据,降低了整体的行为识别能力。
可选地,步骤S721包括选择最优特征组合。
具体地,可以通过特征工程选择最佳特征组合。
示例性地,每次选定若干特征,例如,从步骤S710的特征中选择若干特征,得到特征组合,将该特征组合输入异常检测模型中,得到准确率最高的异常检测模型对应的特征组合,将该特征组合作为最佳特征组合。
应理解,以上仅为示意,本申请实施例对特征选择的具体方式不做限定。
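One possible form of the feature-combination search is an exhaustive search over fixed-size subsets. In the sketch below, `evaluate` stands in for "train the anomaly detection model on this subset and measure its accuracy"; the toy evaluator and its per-feature gains are invented purely so the example runs and do not come from the application.

```python
from itertools import combinations


def select_best_features(feature_names, subset_size, evaluate):
    """Return the feature subset with the highest score, and that score."""
    best_subset, best_score = None, float("-inf")
    for subset in combinations(feature_names, subset_size):
        score = evaluate(subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score


# Hypothetical stand-in for training and validating a model on each subset.
toy_gain = {"touch_area": 0.10, "touch_pressure": 0.25,
            "swipe_speed": 0.30, "accel_x": 0.05}


def toy_evaluate(subset):
    return 0.4 + sum(toy_gain[name] for name in subset)


best_subset, best_score = select_best_features(list(toy_gain), 2, toy_evaluate)
```

Exhaustive search grows combinatorially with the number of features, so a real implementation would likely restrict the subset size or use a greedy strategy.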
Optionally, before the anomaly detection model is trained, the first sample data may be preprocessed, for example, standardized.
S722,建立分类模型。
基于第五样本数据训练分类模型。
需要说明的是,第五样本数据中包括多个样本数据。用于训练异常检测模型的样本数据和用于训练分类模型的样本数据可以相同,也可以不同。
该分类模型可以为二分类模型,通过分类来判断认证用户和非认证用户。
例如,分类模型可以采用SVM或神经网络等。应理解,本申请实施例对分类模型所采用的算法不做限制。
可选地,步骤S722包括,确定分类模型的最优超参数。
示例性地,可以通过网格法确定分类模型的最优超参数。具体地,每次选定分类模型的超参数,并测试该分类模型的准确率,将准确率最高的分类模型对应的超参数作最优超参数。
应理解,以上仅为示意,本申请实施例对最优超参数的确定方式不做限定。
步骤S722包括,确定非认证用户的操作行为数据的类别N,也就是N种非认证用户。例如,N=10。
理论上,若二分类模型已知的行为包括认证用户的行为和N种非认证用户的行为,即N+1种行为,非认证用户的行为被误识别为认证用户的概率为1/(N+1)。增加非认证用户的类别能够有效降低误识率(false acceptance rate,FAR),但会提高拒识率(false rejection rate,FRR)。误识率降低能够提高抗攻击能力,拒识率提高会导致认证用户的识别率降低。当FAR的降幅小于FRR增幅时,整体的识别能力下降。
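To make the two error rates concrete, the sketch below computes FAR and FRR from matching scores at a given threshold. It assumes a score at or above the threshold means "recognized as the authenticated user"; the function name and sample values are illustrative.

```python
def far_frr(scores, labels, threshold):
    """Compute (FAR, FRR) for the given matching scores.

    labels: True for samples from the authenticated user, False otherwise.
    FAR: fraction of non-authenticated samples that are accepted.
    FRR: fraction of authenticated samples that are rejected.
    """
    impostor = [s for s, is_owner in zip(scores, labels) if not is_owner]
    genuine = [s for s, is_owner in zip(scores, labels) if is_owner]
    if not impostor or not genuine:
        raise ValueError("need at least one sample of each class")
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


# Hypothetical matching scores and labels.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
labels = [True, True, True, False, False, False]
far, frr = far_frr(scores, labels, threshold=0.5)

# Chance-level misidentification from the text, with N = 10 non-authenticated classes.
chance_far = 1 / (10 + 1)
```

Raising the threshold lowers FAR and raises FRR, which is the trade-off the paragraph above describes.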
Optionally, before the classification model is trained, the fifth sample data may be preprocessed, for example, standardized.
应理解,步骤S721和步骤S722的执行顺序不分先后。例如,步骤S721和步骤S722可以同时执行。或者,步骤S721先执行,S722后执行。或者,步骤S722先执行,步骤S721后执行。
由步骤S721和步骤S722可以得到训练好的异常检测模型和训练好的分类模型。
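When the identity authentication model is applied for identity authentication, the recognition result of the anomaly detection model and the recognition result of the classification model may be input into the decision fusion model, and the fused result is used as the identity authentication result. A recognition result in the embodiments of this application may indicate that the input operation behavior data is recognized as belonging to an authenticated user or a non-authenticated user. Specifically, the recognition result may include the probability that the input operation behavior data is recognized as belonging to an authenticated user, or the probability that it is recognized as belonging to a non-authenticated user.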
可选地,该决策融合模型用于根据该至少两个认证模型对应的权重对至少两个认证模型输出的识别结果进行加权计算,根据加权计算得到的结果确定身份认证结果。
该决策融合模型的模型参数,例如,该两个认证模型对应的权重,可以是预先设定的。或者,该权重可以是训练得到的。
可选地,方法700还包括建立决策融合模型。图7示出了本申请实施例的一种决策融合模型的训练方法730示意性流程图。本申请实施例中的身份认证模型包括至少两个认证模型和决策融合模型,对决策融合模型的训练也是身份认证模型的训练。该决策融合模型的训练方法也可以理解为身份认证模型的训练方法。方法730包括步骤S731至步骤S733。
S731,获取第二样本数据。
第二样本数据包括第二操作行为样本数据及第二操作行为样本数据对应的标签,第二操作行为样本数据对应的标签用于指示第二操作行为样本数据所对应的用户为认证用户或非认证用户。
S732,将第二样本数据中的第二操作行为样本数据输入到第一认证模型和第二认证模型中,得到第二操作行为样本数据的识别结果。
可选地,第二操作行为样本数据的识别结果包括第一认证模型输出的识别结果和第二认证模型输出的识别结果。
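The recognition result output by the first authentication model includes a matching score corresponding to the second operation behavior sample data output by the first authentication model and/or a matching result corresponding to the second operation behavior sample data output by the first authentication model. The recognition result output by the second authentication model includes a matching score corresponding to the second operation behavior sample data output by the second authentication model and/or a matching result corresponding to the second operation behavior sample data output by the second authentication model.
The matching score corresponding to the second operation behavior sample data indicates the probability that the user corresponding to the second operation behavior sample data is recognized as an authenticated user. The matching result corresponding to the second operation behavior sample data indicates whether that user is recognized as an authenticated user or a non-authenticated user, and includes at least two matching results determined based on at least two thresholds and the matching score corresponding to the second operation behavior sample data.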
具体地,将第二操作行为样本数据输入认证模型中,可以得到匹配分数,根据该匹配分数和至少两个阈值可以确定第二操作行为样本数据对应的用户被识别为认证用户或非认证用户。
该至少两个阈值可以根据需要设置。
示例性地,该至少两个阈值可以是根据所述两个认证模型输出的第二操作行为样本数据对应的识别结果的准确率确定的。
示例性地,该至少两个阈值可以包括第一阈值和第二阈值。其中,第一阈值大于第二阈值。基于第一阈值得到的匹配结果可以称为第一阈值对应的匹配结果。基于第二阈值得到的匹配结果可以称为第二阈值对应的匹配结果。第一阈值对应的匹配结果置信度更高。
当该匹配分数大于或等于第一阈值时,则该第二操作行为样本数据对应的用户被识别为认证用户;当该匹配分数小于第一阈值时,则该第二操作行为样本数据对应的用户被识别为非认证用户。
当该匹配分数大于或等于第二阈值时,则该第二操作行为样本数据对应的用户被识别为认证用户;当该匹配分数小于第二阈值时,则该第二操作行为样本数据对应的用户被识别为非认证用户。
例如,该第二阈值可以为默认阈值。例如,匹配分数的取值范围为[0,1],第二阈值为0.5,当该匹配分数大于或等于0.5时,该第二操作行为样本数据对应的用户被识别为认证用户;当该匹配分数小于0.5时,该第二操作行为样本数据对应的用户被识别为非认证用户。
下面以异常检测模型和分类模型为例说明第二操作行为样本数据的识别结果。
第二操作行为样本数据的识别结果,包括:异常检测模型输出的第二操作行为样本数据的识别结果和分类模型输出的第二操作行为样本数据的识别结果。
异常检测模型输出的第二操作行为样本数据的识别结果包括:由异常检测模型输出的第二操作行为样本数据对应的匹配分数A以及第二操作行为样本数据对应的匹配结果A。
例如,至少两个阈值包括第一阈值A和第二阈值A,第一阈值A大于第二阈值A。由异常检测模型输出的第二操作行为样本数据对应的匹配结果A包括:基于第一阈值A确定的匹配结果和基于第二阈值A确定的匹配结果。当匹配分数A大于或等于第一阈值A时,基于第一阈值A确定的匹配结果为认证用户;当匹配分数A小于第一阈值A时,基于第一阈值A确定的匹配结果为非认证用户。当匹配分数A大于或等于第二阈值A时,基于第二阈值A确定的匹配结果为认证用户;当匹配分数A小于第二阈值A时,基于第二阈值A确定的匹配结果为非认证用户。
分类模型输出的识别结果包括:由分类模型输出的第二操作行为样本数据对应的匹配分数B以及第二操作行为样本数据对应的匹配结果B。
例如,至少两个阈值包括第一阈值B和第二阈值B,第一阈值B大于第二阈值B。由分类模型输出的第二操作行为样本数据对应的匹配结果B包括:基于第一阈值B确定的匹配结果和基于第二阈值B确定的匹配结果。当匹配分数B大于或等于第一阈值B时, 基于第一阈值B确定的匹配结果为认证用户;当匹配分数B小于第一阈值B时,基于第一阈值B确定的匹配结果为非认证用户。当匹配分数B大于或等于第二阈值B时,基于第二阈值B确定的匹配结果为认证用户;当匹配分数B小于第二阈值B时,基于第二阈值B确定的匹配结果为非认证用户。
其中,第一阈值A与第一阈值B可以不同,也可以相同。第二阈值A和第二阈值B可以相同,也可以不同。
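The recognition result produced by one authentication model (a matching score plus one matching result per threshold) can be sketched as follows. The dictionary layout and function name are assumptions; the default second threshold of 0.5 follows the example given in the text.

```python
def recognition_result(score, first_threshold, second_threshold=0.5):
    """Build a recognition result from a matching score in [0, 1].

    Each matching result is True when the user is recognized as the
    authenticated user under the corresponding threshold.
    """
    if first_threshold <= second_threshold:
        raise ValueError("the first threshold must be greater than the second")
    return {
        "score": score,
        "match_first": score >= first_threshold,
        "match_second": score >= second_threshold,
    }


# Hypothetical score from the anomaly detection model with first threshold A = 0.8.
result_a = recognition_result(0.7, first_threshold=0.8)
```

Here the same score is rejected under the stricter first threshold but accepted under the default second threshold, which gives the decision fusion model two distinct signals.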
可选地,将测试样本集中的样本数据输入训练好的认证模型,得到该样本数据对应的匹配分数。基于多个候选阈值得到多个候选阈值对应的多个候选匹配结果。将多个候选匹配结果中满足预设条件的候选匹配结果对应的候选阈值确定为第一阈值。
具体地,可以通过如下步骤确定第一认证模型的第一阈值。
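(1) Obtain fourth sample data. The fourth sample data includes fourth operation behavior sample data and a label corresponding to the fourth operation behavior sample data. The label indicates whether the user corresponding to the fourth operation behavior sample data is an authenticated user or a non-authenticated user.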
(2)将第四操作行为样本数据输入第一认证模型,得到第一认证模型输出的第四操作行为样本数据的匹配分数,第四操作行为样本数据的匹配分数用于指示第四操作行为样本数据对应的用户被识别为认证用户的概率。
(3)基于多个候选阈值确定第四操作行为样本数据的匹配分数对应的多个候选匹配结果,多个候选匹配结果包括第四操作行为样本数据对应的用户被识别为认证用户或非认证用户。
(4)将多个候选匹配结果中满足预设条件的候选匹配结果对应的候选阈值确定为第一阈值。
需要说明的是,第二认证模型的第一阈值也可以通过上述方式确定,此处不再赘述。
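Steps (1) to (4) above can be sketched as a scan over candidate thresholds. The preset condition here is taken to be "accuracy at least `min_accuracy`", and when several candidates qualify the lowest one is returned; both choices, like the function names, are assumptions for illustration.

```python
def accuracy_at(scores, labels, threshold):
    """Fraction of samples whose matching result agrees with the label."""
    correct = sum((s >= threshold) == is_owner
                  for s, is_owner in zip(scores, labels))
    return correct / len(scores)


def choose_first_threshold(scores, labels, candidates, min_accuracy):
    """Return the lowest candidate threshold meeting the accuracy condition."""
    for threshold in sorted(candidates):
        if accuracy_at(scores, labels, threshold) >= min_accuracy:
            return threshold
    return None  # no candidate satisfies the preset condition


# Hypothetical matching scores of the fourth operation behavior sample data.
scores = [0.95, 0.85, 0.70, 0.65, 0.40, 0.20]
labels = [True, True, True, False, False, False]
first_threshold = choose_first_threshold(
    scores, labels, candidates=[0.5, 0.6, 0.7, 0.8, 0.9], min_accuracy=0.9)
```

A FAR-based condition, as used for the classification model, would only require swapping `accuracy_at` for a function that returns the false acceptance rate and reversing the comparison.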
下面以异常检测模型和分类模型为例说明确定第一阈值的方法。为了便于描述,将异常检测模型对应的第一阈值称为第一阈值A,用于确定第一阈值A的第四操作行为样本数据称为第四操作行为样本数据A。分类模型对应的第一阈值称为第一阈值B,用于确定第一阈值B的第四操作行为样本数据称为第四操作行为样本数据B。
异常检测模型
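As described above, the parameters of the anomaly detection model include the data anomaly rate. Because an authenticated user may, consciously or unconsciously, produce abnormal behavior, adjusting the data anomaly rate can improve the accuracy of authenticated-user detection.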
在步骤S720中可以设置数据异常率为0.5,然后对异常检测模型进行训练。
将第四操作行为样本数据A输入训练好的异常检测模型,得到第四操作行为样本数据A对应的匹配分数A。基于多个候选阈值得到多个候选阈值对应的多个候选匹配结果A。将多个候选匹配结果A中准确率高于90%的候选阈值作为第一阈值A。
分类模型
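The fourth operation behavior sample data B is input into the trained classification model to obtain a matching score B corresponding to the fourth operation behavior sample data B. Multiple candidate matching results B corresponding to multiple candidate thresholds are obtained based on those candidate thresholds. The candidate threshold whose candidate matching results B have a FAR that meets a preset condition is used as the first threshold B.
A first threshold that meets the preset condition is determined from the candidate thresholds. Different first thresholds can tune different aspects of an authentication model's performance so that the performance meets expectations. For example, different thresholds can adjust the model's anti-attack capability and its device owner recognition rate so that the two are balanced.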
在本申请实施例中,两个认证模型输出的识别结果包括匹配分数和至少两个匹配结果,区别于现有认证模型仅能提供一种识别结果,本方案能够为决策融合模型提供更多的特征,有利于训练出更好的决策融合模型,提高决策融合模型认证的准确性,从而提高身份认证的准确性。
S733,以第二操作行为样本数据的识别结果作为决策融合模型的输入,以第二操作行为样本数据对应的标签作为决策融合模型的目标输出值对决策融合模型进行训练,得到训练好的决策融合模型。
例如,该决策融合模型可以为SVM。
需要说明的是,上述对第一样本数据的说明、限定和扩展等也适用于第二样本数据。第一样本数据与第二样本数据可以相同,也可以不同。
下面以异常检测模型和分类模型为例进行说明。
将第二操作行为样本数据输入异常检测模型,输出第一识别结果;将第二操作行为样本数据输入分类模型,输出第二识别结果。将第一识别结果和第二识别结果输入决策融合模型,以第二操作行为样本数据对应的标签作为决策融合模型的目标输出值对决策融合模型进行训练,得到训练好的决策融合模型。
示例性地,异常检测模型和分类模型的识别结果可以为匹配分数,即该第二操作行为样本数据对应的用户被识别为认证用户的概率。匹配分数越高,表示第二操作行为样本数据对应的用户被识别为认证用户的概率越高。
例如,将第一识别结果和第二识别结果输入决策融合模型,得到身份认证结果,包括:根据第一识别结果和第二识别结果分别对应的权重进行加权计算,得到加权结果,根据该加权结果得到身份认证结果。例如,当该加权结果大于或等于设定阈值A时,认证成功,即该第二操作行为样本数据对应的用户被识别为认证用户;当该加权结果小于设定阈值A时,认证失败,即该第二操作行为样本数据对应的用户被识别为非认证用户。或者,当该加权结果大于设定阈值A时,认证成功,即该第二操作行为样本数据对应的用户被识别为认证用户;当该加权结果小于或等于设定阈值A时,认证失败,即该第二操作行为样本数据对应的用户被识别为非认证用户。或者,当该加权结果大于设定阈值A时,认证成功,即该第二操作行为样本数据对应的用户被识别为认证用户;当该加权结果小于设定阈值B时,认证失败,即第二操作行为样本数据对应的用户被识别为非认证用户;当该加权结果大于设定阈值B且小于设定阈值A时,无法识别。在上述方案中,对决策融合模型进行训练,对决策融合模型进行训练包括对至少两个认证模型对应的权重的训练。即对异常检测模型和分类模型的识别结果对应的权重进行训练,以得到最佳的权重比例。
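The third variant above, with two set thresholds and an "unrecognizable" band between them, can be sketched as a weighted sum followed by a three-way decision. The weights and thresholds below are arbitrary illustrative values, and treating a fused score exactly equal to either threshold as undetermined is an assumption, since the text does not cover the equality case.

```python
def fuse(score_a, score_b, weight_a, weight_b,
         accept_threshold, reject_threshold):
    """Weighted decision fusion of two matching scores.

    accept_threshold corresponds to set threshold A and reject_threshold to
    set threshold B, with accept_threshold > reject_threshold.
    """
    fused = weight_a * score_a + weight_b * score_b
    if fused > accept_threshold:
        return "authenticated"
    if fused < reject_threshold:
        return "non_authenticated"
    return "undetermined"


# Hypothetical weights; in the scheme described above they would be learned.
decision = fuse(0.9, 0.8, weight_a=0.5, weight_b=0.5,
                accept_threshold=0.7, reject_threshold=0.4)
```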
可选地,步骤S733还包括:对第二操作行为样本数据对应的匹配分数进行特征提取,得到分数特征;以分数特征和第二操作行为样本数据的识别结果作为决策融合模型的输入,以第二操作行为样本数据对应的标签作为决策融合模型的目标输出值进行训练。
示例性地,对第二样本数据的匹配分数进行特征提取,包括:对该分数进行数学运算,例如加减乘除运算,将运算结果作为分数特征。
下面以异常检测模型和分类模型为例说明步骤S733。
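(1) Input the second operation behavior sample data into the anomaly detection model and the classification model separately to obtain the recognition results output by the anomaly detection model and the classification model, respectively.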
异常检测模型输出的识别结果包括由异常检测模型输出的第二操作行为样本数据对应的匹配分数A以及第二操作行为样本数据对应的匹配结果A。该匹配结果A包括第一阈值A对应的匹配结果和第二阈值A对应的匹配结果。
分类模型输出的识别结果包括由分类模型输出的第二操作行为样本数据对应的匹配分数B以及第二操作行为样本数据对应的匹配结果B。该匹配结果B包括第一阈值B对应的匹配结果和第二阈值B对应的匹配结果。
(2)对异常检测模型对应的匹配分数和分类模型对应的匹配分数分别进行加减乘除运算,得到四个运算结果,作为四个分数特征。
将异常检测模型对应的匹配分数和分类模型对应的匹配分数作为两个分数特征。
示例性地,可以将异常检测模型和分类模型输出的识别结果以及四个分数特征中的部分或全部作为决策融合模型的输入,以第二操作行为样本数据对应的标签作为决策融合模型的目标输出值进行训练。在本申请实施例中,对匹配分数进行特征提取,进一步为决策融合模型提供更多的特征,有利于训练出更好的决策融合模型,提高决策融合模型认证的准确性,从而提高身份认证的准确性。
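The input vector for the decision fusion model described above can be assembled as in the following sketch: two matching scores, two threshold-based matching results per model, and four arithmetic score features. The ordering of the vector, the 0/1 encoding of matching results, and the small `eps` that guards the division are assumptions.

```python
def fusion_input(score_a, score_b, thresholds_a, thresholds_b, eps=1e-9):
    """Build the decision fusion feature vector from two matching scores.

    thresholds_a / thresholds_b: (first_threshold, second_threshold) pairs
    for the anomaly detection model and the classification model.
    """
    matches_a = [1.0 if score_a >= t else 0.0 for t in thresholds_a]
    matches_b = [1.0 if score_b >= t else 0.0 for t in thresholds_b]
    score_features = [
        score_a + score_b,          # addition
        score_a - score_b,          # subtraction
        score_a * score_b,          # multiplication
        score_a / (score_b + eps),  # division, guarded against zero
    ]
    return [score_a, *matches_a, score_b, *matches_b, *score_features]


# Hypothetical scores and thresholds.
vector = fusion_input(0.7, 0.9, thresholds_a=(0.8, 0.5), thresholds_b=(0.85, 0.5))
```

Vectors of this form, paired with the sample labels, would then be the training set for the decision fusion model (for example, an SVM).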
可选地,步骤S720还包括:
(1) Obtain third sample data. The third sample data includes third operation behavior sample data.
(2)将第三操作行为样本数据输入到第一认证模型中,得到第一认证模型输出的第三操作行为样本数据的识别结果;
将第三操作行为样本数据输入到第二认证模型中,得到第二认证模型输出的第三操作行为样本数据的识别结果。
(3)将第一认证模型输出的第三操作行为样本数据的识别结果、第二认证模型输出的第三操作行为样本数据的识别结果输入到训练好的决策融合模型中,得到第三操作行为样本数据对应的身份认证结果。
(4)根据第三操作行为样本数据和第三操作行为样本数据对应的身份认证结果对第一认证模型和/或第二认证模型进行训练。
需要说明的是,上述对第一样本数据的说明、限定和扩展等也适用于第三样本数据。第三样本数据与第一样本数据可以相同,也可以不同。第三样本数据与第二样本数据可以相同,也可以不同。
在获取样本数据的过程中,可能由于用户有意识或无意识的行为等引入错误的数据,造成样本数据对应的标签出现错误。例如,通过传感器采集用户的操作行为数据时,若是认证用户出现异常行为,则该异常操作行为数据被采集之后,对应的样本标签为认证用户,基于该样本数据以及对应的标签训练得到的认证模型存在准确率不高的问题。而通过训练好的决策融合模型得到的身份认证结果的准确度较高,以其输出实现反馈,再次对认证模型进行训练,能够进一步提高认证模型的准确度,从而进一步提高身份认证模型的准确度。
下面以身份认证模型包括异常检测模型和分类模型为例进行说明。
将第三操作行为样本数据输入异常检测模型,输出第一识别结果;将第三操作行为样本数据输入分类模型,输出第二识别结果。将第一识别结果和第二识别结果输入训练好的决策融合模型,得到第三操作行为样本数据对应的身份认证结果。
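The anomaly detection model and/or the classification model is trained based on the third operation behavior sample data and the identity authentication result corresponding to the third operation behavior sample data.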
示例性地,第一操作行为样本数据和第三操作行为样本数据可以包括相同的样本数据,根据第三操作行为样本数据和第三操作行为样本数据对应的身份认证结果对异常检测模型和分类模型进行训练,包括:
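The first sample data is filtered based on the label corresponding to the first operation behavior sample data and the identity authentication result corresponding to the third operation behavior sample data, and the anomaly detection model and the classification model are trained based on the filtered first sample data.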
具体地,可以从第一样本数据中剔除标签与第三操作行为样本数据对应的身份认证结果不同的样本数据。
例如,样本数据A为第一操作行为样本数据和第三操作行为样本数据中的一条样本数据。在第一操作行为样本数据中,样本数据A对应的标签为非认证用户。将该样本数据A输入身份认证模型中,得到的身份认证结果为认证用户。在该情况下,可以从第一样本数据中剔除样本数据A,基于筛选后的第一样本数据对异常检测模型和分类模型进行训练。
这样可以剔除掉错误的样本数据,提高了样本数据的准确率,并实现对以往数据的持续学习,提高了认证模型的准确率,从而提高身份认证模型的整体的准确率。
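The filtering step can be sketched as dropping every sample whose stored label disagrees with the result of the trained decision fusion model. The function name and the boolean label encoding (True for authenticated user) are assumptions.

```python
def drop_conflicting_samples(samples, labels, fused_results):
    """Keep (sample, label) pairs whose label matches the fused result."""
    return [(sample, label)
            for sample, label, result in zip(samples, labels, fused_results)
            if label == result]


# Hypothetical data: sample "A" is labeled non-authenticated, but the
# identity authentication model recognizes it as the authenticated user.
samples = ["A", "B", "C"]
labels = [False, True, False]
fused_results = [True, True, False]
filtered = drop_conflicting_samples(samples, labels, fused_results)
```

After filtering, only samples B and C remain for retraining, which matches the sample data A example above.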
示例性地,根据第三操作行为样本数据和第三操作行为样本数据对应的身份认证结果对异常检测模型和分类模型进行训练,包括:
根据第三操作行为样本数据对异常检测模型进行训练;
以第三操作行为样本数据作为分类模型的输入,以第三操作行为数据样本对应的身份认证结果作为分类模型的目标输出值对分类模型进行训练。
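In this way, the amount of training data can be expanded. For example, when the third operation behavior sample data has no corresponding label, the identity authentication result corresponding to the third operation behavior sample data may be used as its label to train the authentication models. This improves the accuracy of the authentication models and thereby the overall accuracy of the identity authentication model.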
可选地,方法700还包括,通过增量学习更新身份认证模型。
增量学习是指一个学习系统能不断地从新样本中学习新的知识,并能保存大部分以前已经学习到的知识。
具体地,在由方法730得到训练好的身份认证模型后,可以持续采集用户的操作行为数据,也即新增的用户的操作行为数据,通过增量学习可以在原有的数据库的基础上,仅对由于新增的用户的操作行为数据所引起的变化进行身份认证模型的更新。
通过增量学习能够不断优化身份认证模型,适应变化的用户行为,增强识别能力。
在本申请实施例中,将至少两个认证模型的识别结果输入决策融合模型进行决策融合,得到身份认证结果,通过对决策融合模型进行训练,能够提高身份认证的准确性。同时,通过用户的操作行为数据进行身份验证,不会改变用户的用机习惯,有利于实现无感验证。
此外,异常检测模型和分类模型相结合得到身份认证模型,一方面通过异常检测模型提高了认证用户识别率,另一方面通过分类模型提高了抗攻击能力,整体上提高了身份认证模型的识别能力。
此外,异常检测模型和分类模型均可以使用小规模算法,例如,异常检测模型可以使用单分类SVM,分类模型可以使用二分类SVM,只需要引入libsvm库即可实现。这样,在身份认证过程中,算法开销小,无需额外的硬件支持,该身份认证模型可以在用户端侧进行训练,能够实现用户端侧数据安全存储,避免上传云端造成的隐私安全问题。
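FIG. 8 shows an identity authentication method 800 according to an embodiment of this application. The method 800 includes steps S810 to S830. The method 800 may be performed by an apparatus or a device capable of identity authentication, such as a terminal device, a computer, or a server, for example, the execution device 110 in FIG. 2. Steps S810 to S830 are described below.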
图8中的身份认证的方法800中使用的身份认证模型可以是通过上述图6或图7中的方法构建的。为了避免不必要的重复,下面在介绍方法800时适当省略重复的描述。
S810,获取待认证用户的第一操作行为数据;获取待认证用户的第二操作行为数据。
其中,该待认证用户的操作行为数据可以包括通过传感器采集的数据。
传感器包括运动传感器和/或触屏传感器。
待认证用户的操作行为数据可以为对传感器采集的原始数据进行处理后得到的数据。
例如,触屏传感器采集的原始数据包括:时间戳、触摸点X/Y轴坐标、触摸面积、触摸压力、动作(action)和屏幕方向等。运动传感器采集的数据包括:时间戳、加速度X/Y/Z轴数据、陀螺仪X/Y/Z轴数据等。应理解,以上仅为示意,传感器采集的数据可以包括以上中的任一项或几项,也可以包括其他数据。
可选地,待认证用户的操作行为数据包括以下数据中的至少一种:触摸点X/Y轴坐标、触摸面积、触摸压力、触屏速度、触屏加速度、触屏轨迹的斜率、触屏长度、触屏位移、触屏角度、触屏方向、加速度X/Y/Z轴数据或陀螺仪X/Y/Z轴数据。
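S820: Input the first operation behavior data into the first authentication model to obtain a first recognition result output by the first authentication model; input the second operation behavior data into the second authentication model to obtain a second recognition result output by the second authentication model.
The first authentication model and the second authentication model are an anomaly detection model and a classification model, respectively.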
其中,两个认证模型的模型参数分别是基于第一样本数据和第五样本数据进行训练得到的。具体的训练过程可以参见前述方法700。
可选地,两个认证模型的模型参数是基于第一样本数据、第五样本数据以及第三样本数据进行训练得到的,第三样本数据包括:第三操作行为数据。具体的训练过程可以参见前述方法700。
异常检测模型和分类模型的训练过程可以参见前述方法700。
该异常检测模型用于检测异常数据,通过检测出非认证用户的数据来判断认证用户和非认证用户。该分类模型可以为二分类模型,通过分类来判断认证用户和非认证用户
S830,将第一识别结果、第二识别结果输入到决策融合模型中,得到输出的身份认证结果,其中,决策融合模型用于根据第一识别结果和第二识别结果的权重参数确定身份认证结果。
示例性地,如图9所示,将第一操作行为数据输入异常检测模型910,输出第一识别结果;将第二操作行为数据输入分类模型920,输出第二识别结果。将第一识别结果和第二识别结果输入决策融合模型930,得到身份认证结果。其中识别结果包括待认证用户为认证用户或待认证用户为非认证用户。
例如,若异常检测模型和分类模型的识别结果均为认证用户,则该身份认证结果为认证用户。若异常检测模型和分类模型的识别结果均为非认证用户,则该身份认证结果为非认证用户。若异常检测模型和分类模型的识别结果中一个为认证用户,另一个为非认证用户,则该身份认证结果为无法识别。
再如,若异常检测模型和分类模型的识别结果中存在一个识别结果为认证用户,则该身份认证结果为认证用户。若异常检测模型和分类模型的识别结果均为非认证用户,则该身份认证结果为非认证用户。
再如,若异常检测模型和分类模型的识别结果中存在一个识别结果为非认证用户,则该身份认证结果为非认证用户。若异常检测模型和分类模型的识别结果均为认证用户,则该身份认证结果为认证用户。
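The three example decision rules above can be written out as a single function. The rule names `"unanimous"`, `"any"`, and `"all"` are labels invented for this sketch.

```python
def combine(a_is_owner, b_is_owner, rule):
    """Combine two recognition results under one of the three example rules.

    "unanimous": both must agree, otherwise the result is undetermined.
    "any": one positive recognition result is enough to authenticate.
    "all": one negative recognition result is enough to reject.
    """
    if rule == "unanimous":
        if a_is_owner and b_is_owner:
            return "authenticated"
        if not a_is_owner and not b_is_owner:
            return "non_authenticated"
        return "undetermined"
    if rule == "any":
        return "authenticated" if (a_is_owner or b_is_owner) else "non_authenticated"
    if rule == "all":
        return "authenticated" if (a_is_owner and b_is_owner) else "non_authenticated"
    raise ValueError(f"unknown rule: {rule}")
```

The "any" rule favors the device owner recognition rate, while the "all" rule favors anti-attack capability.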
可选地,决策融合模型用于根据至少两个认证模型对应的权重对至少两个认证模型输出的识别结果进行加权计算,根据加权计算得到的结果确定身份认证结果。
通过为认证模型设置对应的权重,将识别结果进行加权计算,能够根据需要调整权重值,进一步提高了身份认证的准确性。
可选地,决策融合模型的权重参数是通过将第二样本数据中的第二操作行为样本数据输入到所述第一认证模型和所述第二认证模型中,得到所述第一认证模型和所述第二认证模型输出的第二操作行为样本数据的识别结果,以所述第二操作行为样本数据的识别结果作为所述决策融合模型的输入,以所述第二操作行为样本数据对应的标签作为所述决策融合模型的目标输出值进行训练得到的,所述第二样本数据包括所述第二操作行为样本数据及所述第二操作行为样本数据对应的标签,所述第二操作行为样本数据对应的标签用于指示所述第二操作行为样本数据所对应的用户为认证用户或非认证用户。具体的训练过程可以参见方法730。
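For example, as shown in FIG. 10, the operation behavior data of the to-be-authenticated user is input into an anomaly detection model 1010, which outputs a first matching score, and into a classification model 1020, which outputs a second matching score. The first matching score and the second matching score are input into a decision fusion model 1030, a weighted calculation is performed on the two matching scores according to the weights corresponding to the at least two authentication models, and the identity authentication result is determined from the result of the weighted calculation. The matching scores in FIG. 10 are the recognition results output by the authentication models, that is, the probability that the to-be-authenticated user is an authenticated user.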
例如,若该加权计算的结果大于或等于设定阈值A时,认证成功,即该待认证用户被识别为认证用户;当该加权计算的结果小于设定阈值A时,认证失败,即该待认证用户被识别为非认证用户。或者,当该加权计算的结果大于设定阈值A时,认证成功,即该待认证用户被识别为认证用户;当该加权计算的结果小于或等于设定阈值A时,认证失败,即该待认证用户被识别为非认证用户。或者,当该加权计算的结果大于设定阈值A时,认证成功,即该待认证用户被识别为认证用户;当该加权计算的结果小于设定阈值B时,认证失败,即该待认证用户被识别为非认证用户;当该加权计算的结果大于设定阈值B且小于设定阈值A时,无法识别。
可选地,第一识别结果包括:第一操作行为数据对应的匹配分数和/或第一操作行为数据对应的匹配结果;第二识别结果包括第二操作行为数据对应的匹配分数和/或第二操作行为数据对应的匹配结果。
其中,第一操作行为数据对应的匹配分数用于指示待认证用户被识别为认证用户的概 率,第一操作行为数据对应的匹配结果用于指示待认证用户被识别为认证用户或非认证用户,第一操作行为数据对应的匹配结果包括基于至少两个阈值和第一操作行为数据对应的匹配分数确定的至少两个匹配结果。第二操作行为数据对应的匹配分数用于指示待认证用户被识别为认证用户的概率,第二操作行为数据对应的匹配结果用于指示待认证用户被识别为认证用户或非认证用户,第二操作行为数据对应的匹配结果包括基于至少两个阈值和第二操作行为数据对应的匹配分数确定的至少两个匹配结果。
下面以异常检测模型输出的识别结果和分类模型输出的识别结果为例进行说明。
异常检测模型输出的识别结果包括:由异常检测模型输出的待认证用户的操作行为数据对应的匹配分数A以及第一操作行为数据对应的匹配结果A。
例如,至少两个阈值包括第一阈值A和第二阈值A,第一阈值A大于第二阈值A。由异常检测模型输出的第一操作行为数据对应的匹配结果A包括:基于第一阈值A确定的匹配结果和基于第二阈值A确定的匹配结果。当匹配分数A大于或等于第一阈值A时,基于第一阈值A确定的匹配结果为认证用户;当匹配分数A小于第一阈值A时,基于第一阈值A确定的匹配结果为非认证用户。当匹配分数A大于或等于第二阈值A时,基于第二阈值A确定的匹配结果为认证用户;当匹配分数A小于第二阈值A时,基于第二阈值A确定的匹配结果为非认证用户。
分类模型输出的识别结果包括:由分类模型输出的第二操作行为数据对应的匹配分数B以及第二操作行为数据对应的匹配结果B。
例如,至少两个阈值包括第一阈值B和第二阈值B,第一阈值B大于第二阈值B。由分类模型输出的第二操作行为数据对应的匹配结果B包括:基于第一阈值B确定的匹配结果和基于第二阈值B确定的匹配结果。当匹配分数B大于或等于第一阈值B时,基于第一阈值B确定的匹配结果为认证用户;当匹配分数B小于第一阈值B时,基于第一阈值B确定的匹配结果为非认证用户。当匹配分数B大于或等于第二阈值B时,基于第二阈值B确定的匹配结果为认证用户;当匹配分数B小于第二阈值B时,基于第二阈值B确定的匹配结果为非认证用户。
在本申请实施例中,两个认证模型输出的识别结果包括匹配分数和至少两个匹配结果,通过不同的阈值能够调节认证模型的不同性能,以使认证模型的性能达到预期,例如,通过不同的阈值调节模型的抗攻击能力和机主识别率,以使认证模型的性能达到平衡。此外,区别于现有认证模型仅能提供一种识别结果,本方案能够为决策融合模型提供更多的特征,有利于提高决策融合模型认证的准确性,从而提高身份认证的准确性。
可选地,将第一识别结果、第二识别结果输入到决策融合模型中,得到输出的身份认证结果,包括:根据第一操作行为数据对应的匹配分数得到第一分数特征;根据第二操作行为数据对应的匹配分数得到第二分数特征;以第一分数特征、第二分数特征、第一识别结果和第二识别结果输入到决策融合模型中,得到输出的身份认证结果。
示例性地,对待认证用户的操作行为数据对应的匹配分数进行特征提取,包括:对该分数进行数学运算,例如加减乘除运算,将运算结果作为分数特征。
下面以异常检测模型和分类模型为例说明步骤S830。
(1)将待认证用户的操作行为数据分别输入异常检测模型和分类模型,分别得到异常检测模型和分类模型输出的识别结果。
异常检测模型输出的识别结果包括由异常检测模型输出的第一操作行为数据对应的匹配分数A以及第一操作行为数据对应的匹配结果A。该匹配结果A包括第一阈值A对应的匹配结果和第二阈值A对应的匹配结果。
分类模型输出的识别结果包括由分类模型输出的第二操作行为数据对应的匹配分数B以及第二操作行为数据对应的匹配结果B。该匹配结果B包括第一阈值B对应的匹配结果和第二阈值B对应的匹配结果。
(2)对异常检测模型对应的匹配分数和分类模型对应的匹配分数分别进行加减乘除运算,得到四个运算结果,作为四个分数特征。
将异常检测模型对应的匹配分数和分类模型对应的匹配分数作为两个分数特征。
示例性地,可以将异常检测模型和分类模型输出的识别结果以及四个分数特征中的部分或全部输入决策融合模型,得到身份认证结果。
在本申请实施例中,对匹配分数进行特征提取,进一步为决策融合模型提供更多的特征,有利于提高决策融合模型认证的准确性,从而提高了身份认证的准确性。
在本申请实施例中,通过对决策融合模型进行训练,得到更优的模型参数,例如,得到更优的认证模型的权重比例,进一步提高了身份认证的准确性。
可选地,方法800还包括:将s条待认证的用户操作行为数据输入身份认证模型中,得到s个身份认证结果;根据该s个身份认证结果得到最终的身份认证结果。
例如,s个身份认证结果中,认证成功的次数超过认证失败的次数,则认证成功,即该待认证用户被识别为认证用户;认证成功的次数少于认证失败的次数,则认证失败,即该待认证用户被识别为非认证用户;认证成功的次数等于认证失败的次数,则无法识别。或者,该s个身份认证结果可以为s个加权结果,对该s个加权结果计算平均值,根据该平均值确定最终的身份认证结果。这样,进一步提高了认证结果的准确率。
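Both ways of combining s authentication results can be sketched briefly. The function names and the convention that a mean at or above the threshold means success are assumptions.

```python
import statistics


def majority_vote(results):
    """results: list of booleans, True meaning that one authentication succeeded."""
    successes = sum(results)
    failures = len(results) - successes
    if successes > failures:
        return "authenticated"
    if successes < failures:
        return "non_authenticated"
    return "undetermined"


def average_decision(weighted_results, threshold):
    """Decide from the mean of s weighted fusion results."""
    mean = statistics.fmean(weighted_results)
    return "authenticated" if mean >= threshold else "non_authenticated"
```

For instance, `majority_vote([True, True, False])` yields `"authenticated"`, and a tie such as `[True, False]` yields `"undetermined"`.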
表1示出了采用不同的算法进行身份认证的仿真结果。由表1可以看出,本申请实施例提供的身份认证方法能够提高身份认证的准确率。
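Table 1
Algorithm                                             Anti-attack rate    Device owner recognition rate
One-class SVM algorithm                               73.38%              84.87%
Binary-classification SVM algorithm                   92.13%              86.13%
Identity authentication method of this application    90.19%              91.77%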
在本申请实施例中,将至少两个认证模型的识别结果输入决策融合模型进行决策融合,得到身份认证结果,能够提高身份认证的准确性。同时,通过用户的操作行为数据进行身份验证,不会改变用户的用机习惯,有利于实现无感验证。
此外,采用异常检测模型和分类模型,一方面通过异常检测模型提高了认证用户识别率,另一方面通过分类模型提高了抗攻击能力,整体上提高了身份认证模型的识别能力。
此外,异常检测模型和分类模型均可以使用小规模算法,例如,异常检测模型可以使用单分类SVM,分类模型可以使用二分类SVM,只需要引入libsvm库即可实现。这样, 在身份认证过程中,算法开销小,无需额外的硬件支持,该身份认证模型可以在用户端侧进行训练,能够实现用户端侧数据安全存储,避免上传云端造成的隐私安全问题。
图11示出了本申请实施例提供的身份认证的方法和身份认证模型的训练方法的应用流程示意图。下面以图11所示的场景为例对本申请实施例的身份认证方法和身份认证模型的训练方法的应用流程进行说明。
如图11所示,身份认证的过程可以包括两个阶段:训练阶段和认证阶段。
其中,训练阶段是指基于用户的操作行为数据生成身份认证模型的过程;认证阶段是指验证待认证用户的操作行为数据与身份认证模型的匹配的过程,最终给出身份认证的结果。下面分别对这两个阶段的步骤进行说明。
训练阶段
S1110,数据采集模块获取用户的操作行为数据。操作行为数据也可以称为操作习惯行为数据。
具体地,数据采集模块注册传感器的监听器,通过传感器采集用户的操作行为数据。该数据采集模块可以为图4中的数据采集模块420。
示例性地,该传感器包括触屏传感器和/或运动传感器。
例如,触屏传感器采集的数据包括:时间戳、触摸点X/Y轴坐标、触摸面积、触摸压力、动作(action)和屏幕方向等。
运动传感器采集的数据包括:时间戳、加速度X/Y/Z轴数据、陀螺仪X/Y/Z轴数据等。
应理解,以上仅为示意,传感器采集的数据可以包括以上中的任一项或几项,也可以包括其他数据。
示例性地,可以屏幕解锁时,注册传感器的监听器;在屏幕锁屏时,注销传感器的监听器。
S1120,对传感器采集的操作行为数据进行数据预处理,得到第一训练样本。
例如,预处理包括:剔除异常数据、筛选有效操作行为数据或提取特征数据等,得到第一训练样本。该第一训练样本可以存储于图4所示的存储模块440中。
示例性地,可以将预处理后得到的数据作为第一训练样本中的正样本,将预置的非认证用户的操作行为数据作为第一训练样本中的负样本。
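It should be understood that performing step S1120 in FIG. 11 by the data collection module is merely an example. Optionally, step S1120 may be performed by the identity authentication module.
S1130: The identity authentication module trains the identity authentication model. The identity authentication module may be the identity authentication module 430 in FIG. 4. This step corresponds to S720 in the method 700.
Optionally, before training of the identity authentication model starts, the identity authentication module may further determine whether to start training the identity authentication model.
For example, building of the identity authentication model starts when the collection period of the first sample data exceeds a preset duration and/or the amount of the first sample data exceeds a preset quantity.
Alternatively, if step S1120 is performed by the data collection module, the data collection module may determine whether to start training the identity authentication model. For example, as shown in step S1121 in FIG. 11, when the collection period of the first sample data exceeds a preset duration and/or the amount of the first sample data exceeds a preset quantity, the data collection module sends a modeling instruction to the identity authentication module to notify it to start training the identity authentication model.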
具体地,身份认证模型的训练方法可以采用本申请实施例中的身份认证模型的训练方法。
用户在不同的应用场景下的行为可能不同,因此,可以针对不同的应用场景分别建立身份认证模型。
S1130还包括验证身份认证模型的有效性。
若身份认证模型的指标满足预设指标,则该身份认证模型有效,建模阶段结束,得到训练好的身份认证模型。例如,可以标记建模完成状态。
该身份认证模型的有效性验证方式可以根据应用场景设定。即在不同的应用场景中,对身份认证模型的有效性的验证方式可以不同。
例如,该身份认证模型应用于屏幕解锁时,该身份认证模型的指标满足预设指标可以为该身份认证模型的抗攻击能力满足要求。
训练好的身份认证模型可以存储于图4所述的存储模块440中。存储模块440中存储的身份认证模型可以为一个或多个。
认证阶段
S1140,启动身份认证功能。
示例性地,该身份认证模型可以应用于应用程序中的身份认证。例如,如图11所示,该APP可以是集成身份认证SDK的APP。该APP可以为图4中的APP410。
例如,在身份认证模型应用于应用锁解锁的情况下,待认证的用户需要使用APP时,可以通知身份认证模型启动身份认证。该APP可以称为待认证APP。
进一步地,该APP可以提示用户需要获取用户的操作行为数据。若用户同意,则启动身份认证。
进一步地,身份认证模块可以通知数据采集模块监听传感器的数据。具体地,如图11的步骤S1141所示,身份认证模块可以将上下文环境信息反馈至数据采集模块。上下文环境信息用于指示应用环境,例如,如图11所示,上下文环境信息包括待认证APP的类型,比如支付类,新闻类等。或者上下文信息还可以包括时间或地点等信息。
S1150,数据采集模块监听传感器采集的待认证用户的操作行为数据。
如前所述,不同的身份认证模型可以由不同的操作行为数据训练得到。在该情况下,数据采集模块监听传感器的数据指的是该应用场景对应的用户的操作行为数据,如图11所示,数据采集模块监听待认证APP对应的用户的操作行为数据。
数据采集模块可以根据上下文环境信息确定需要采集的用户的操作行为数据。
示例性地,对于不同的待认证APP,身份认证模型可以通过不同的用户操作行为数据进行身份认证。
或者,对于同一个待认证APP,身份认证模型应用于应用锁解锁和应用于支付时可以通过不同的用户操作行为数据进行身份认证。
例如,对于同一待认证APP,身份认证模型应用于应用锁解锁时,数据采集模块监听用户操作行为数据1,将用户操作行为数据1输入身份认证模型;身份认证模型应用于支付时,数据采集模块监听用户操作行为数据2,将用户操作行为数据2输入身份认证模型。用户操作行为数据1的参数可以少于用户操作行为数据2的参数。例如,用户操作行为数据1可以包括触屏传感器采集的数据。用户操作行为数据2可以包括触屏传感器和运动传 感器采集的数据。这样,针对不同的应用场景,监听不同的数据,能够有针对性的实现身份认证,进一步保证了系统的安全性。
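The scenario-dependent choice of data can be sketched as a lookup from application scenario to the set of sensors to listen to. The scenario keys and sensor names are hypothetical; the mapping simply mirrors the example in which app-lock unlocking uses touchscreen data and payment uses touchscreen plus motion data.

```python
# Hypothetical mapping from application scenario to the sensors to listen to.
SENSORS_BY_SCENARIO = {
    "app_unlock": {"touchscreen"},
    "payment": {"touchscreen", "motion"},
}


def sensors_to_listen(scenario):
    """Return the sensors whose data feeds the model in this scenario."""
    try:
        return SENSORS_BY_SCENARIO[scenario]
    except KeyError:
        raise ValueError(f"no sensor set configured for scenario: {scenario}")
```

In the flow of FIG. 11, the scenario would be derived from the context information that the identity authentication module passes to the data collection module.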
S1160,发起身份认证请求。
示例性地,该身份认证模型可以应用于应用程序中的身份认证。待认证APP可以发起身份认证请求,例如,向图4中的身份认证模块430发起身份认证请求。
S1170,匹配待认证用户的操作行为数据。
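The operation behavior data of the to-be-authenticated user is determined from the sensor data that the data collection module listens to in step S1150.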
具体地,将该待认证用户的操作行为数据输入身份认证模型中,识别是否为认证用户。进一步地,如图11的步骤S1171所示,将识别结果反馈至待认证应用程序。
可以采用本申请实施例中的身份认证方法匹配待认证用户的操作行为数据。例如,该身份认证方法可以为图8中的方法800。
在本申请实施例中,将至少两个认证模型的识别结果输入决策融合模型进行决策融合,得到身份认证结果,能够提高身份认证的准确性。同时,通过用户的操作行为数据进行身份验证,不会改变用户的用机习惯,有利于实现无感验证。
本申请实施例提供的身份认证的方法可以应用于持续的身份认证。在用户唤醒智能终端时可以对指示唤醒智能终端的待认证用户进行身份认证,在智能终端处于唤醒状态即非锁定状态时,可以在待认证用户下发指令或者预设时间间隔之外,对待认证用户再次进行身份认证,从而能够有效的避免用户在唤醒智能终端后离开,任何再接触智能终端的用户均可以使用智能终端,从而导致智能终端中用户隐私数据泄露等安全性问题。或者,在用户登录APP时,可以对登录APP的待认证用户进行身份认证,在处于登录状态时,可以在待认证用户下发指令或者预设时间间隔之外,对待认证用户再次进行身份认证,从而能够有效的避免用户在登录APP后离开,任何再接触智能终端的用户均可以使用该APP,从而导致智能终端中用户隐私数据泄露等安全性问题。
本申请实施例提供的身份认证方法可以和其他身份认证方法结合。
示例性地,将本申请实施例提供的身份认证方法作为辅助认证方法,与其他认证方法结合进行身份认证,提高了系统的安全性和可靠性。例如,如图12所示,通过人脸识别或指纹识别进行第一次身份认证,若认证成功,则采用本申请实施例提供的身份认证方法进行第二次身份认证,若认证成功,则认证通过。若出现一次认证失败,则提示用户进行密码认证。
示例性地,将本申请实施例提供的身份认证方法作为持续保护方法,与其他认证方法结合进行身份认证,提高了系统的安全性和可靠性。例如,如图13所示,在用户登录APP时,对登录APP的待认证用户进行身份认证。在处于登录状态时,可以在待认证用户下发指令或者预设时间间隔之外,可以采用本申请提供的身份认证方法对待认证用户再次进行身份认证,若认证失败,则弹出应用锁或者其他认证界面。这样能够有效的避免用户在登录APP后离开,任何再接触智能终端的用户均可以使用该APP,从而导致智能终端中用户隐私数据泄露等安全性问题。本申请实施例提供的身份认证方法采用行为认证方法,能够实现无感认证。
示例性地,将本申请实施例提供的身份认证方法用于支付风险控制,与其他认证方法 结合进行身份认证,提高了系统的安全性和可靠性。例如,如图14所示,在用户进行支付时,通过本申请实施例提供的身份认证方法进行身份认证,将身份认证的结果作为行为风险控制信息输入业务风险控制系统,进行风险控制。若风险控制合格,则进行支付操作;若风险控制不合格,则可以提示用户采用其他方式进行身份认证。
下面结合附图对本申请实施例的训练装置和身份认证的装置进行详细的描述,应理解,下面描述的身份认证模型的训练装置能够执行前述本申请实施例的身份认证模型的训练方法,身份认证的装置可以执行前述本申请实施例的身份认证的方法,为了避免不必要的重复,下面在介绍本申请实施例的身份认证的装置和身份认证模型的训练装置时适当省略重复的描述。
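FIG. 15 is a schematic block diagram of an apparatus for training an identity authentication model according to an embodiment of this application. The training apparatus 1500 shown in FIG. 15 includes an obtaining unit 1510 and a processing unit 1520. The identity authentication model includes a first authentication model, a second authentication model, and a decision fusion model, where the first authentication model and the second authentication model are an anomaly detection model and a classification model, respectively.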
获取单元1510和处理单元1520可以用于执行本申请实施例的身份认证模型的训练方法,具体地,获取单元1510可以执行上述步骤S731,处理单元1520可以执行上述步骤S732和S733。
获取单元1510用于获取第二样本数据,所述第二样本数据包括第二操作行为样本数据及所述第二操作行为样本数据对应的标签,所述第二操作行为样本数据对应的标签用于指示所述第二操作行为样本数据所对应的用户为认证用户或非认证用户。处理单元1520用于将所述第二样本数据中的第二操作行为样本数据输入到第一认证模型和第二认证模型中,得到所述第二操作行为数据的识别结果,所述第一认证模型的模型参数是基于第一样本数据进行训练得到的,所述第一样本数据包括第一操作行为样本数据和所述第一操作行为样本数据对应的标签,所述第一操作行为样本数据对应的标签用于表示所述第一操作行为样本数据对应的用户为认证用户或非认证用户,所述第二认证模型的模型参数是基于第五样本数据进行训练得到的,所述第五样本数据包括第五操作行为样本数据和所述第五操作行为样本数据对应的标签,所述第五操作行为样本数据对应的标签用于表示所述第五操作行为数据对应的用户为认证用户或非认证用户;以所述第二操作行为样本数据的识别结果作为所述决策融合模型的输入,以所述第二操作行为样本数据对应的标签作为所述决策融合模型的目标输出值进行训练,得到训练好的决策融合模型。
可选地,作为一个实施例,所述第二操作行为样本数据的识别结果包括所述第一认证模型输出的识别结果和所述第二认证模型输出的识别结果;所述第一认证模型输出的识别结果包括:所述第一认证模型输出的所述第二操作行为样本数据对应的匹配分数和/或所述第一认证模型输出的第二操作行为样本对应的匹配结果;所述第二认证模型输出的识别结果包括:所述第二认证模型输出的所述第二操作行为样本数据对应的匹配分数和/或所述第二认证模型输出的第二操作行为样本数据对应的匹配结果;其中,所述第二操作行为样本数据对应的匹配分数用于指示所述第二操作行为样本数据对应的用户被识别为认证用户的概率,所述第二操作行为样本数据对应的匹配结果用于指示所述第二操作行为样本数据对应的用户被识别为认证用户或非认证用户,所述第二操作行为样本数据对应的匹配结果包括基于至少两个阈值和所述第二操作行为样本数据对应的匹配分数确定的至少两个匹配结果。
可选地,作为一个实施例,至少两个阈值包括第一阈值,以及,获取单元1510还用于获取第四样本数据,所述第四样本数据包括第四操作行为样本数据和所述第四操作行为样本数据对应的标签,所述第四操作行为样本数据对应的标签用于表示所述第四操作行为样本数据对应的用户为认证用户或非认证用户。处理单元1520还用于:将所述第四操作行为样本数据输入所述第一认证模型,得到所述第一认证模型输出的所述第四操作行为样本数据对应的匹配分数,所述第四操作行为样本数据对应的匹配分数用于指示所述第四操作行为样本数据对应的用户被识别为认证用户的概率;基于多个候选阈值确定所述第四操作行为样本数据对应的匹配分数对应的多个候选匹配结果,所述多个候选匹配结果用于指示所述第四操作行为样本数据对应的用户被识别为认证用户或非认证用户;将所述多个候选匹配结果中准确率满足预设条件的候选匹配结果对应的候选阈值确定为所述第一阈值。
可选地,作为一个实施例,处理单元1520具体用于:根据所述第一认证模型输出的所述第二操作行为样本数据对应的匹配分数得到第二操作行为样本数据的第一分数特征;根据所述第二认证模型输出的所述第二操作行为样本数据对应的匹配分数得到第二操作行为样本数据的第二分数特征;以所述第二操作行为样本数据的第一分数特征、第二操作行为样本数据的第二分数特征和所述第二操作行为样本数据的识别结果作为决策融合模型的输入,以所述第二操作行为样本数据对应的标签作为所述决策融合模型的目标输出值进行训练。
可选地,作为一个实施例,第二操作行为样本数据包括以下数据中的至少一种:触摸点X/Y轴坐标、触摸面积、触摸压力、触屏速度、触屏加速度、触屏轨迹的斜率、触屏长度、触屏位移、触屏角度、触屏方向、加速度X/Y/Z轴数据或陀螺仪X/Y/Z轴数据
可选地,作为一个实施例,第二样本数据是根据用户在触摸屏上的滑动时长和/或所述用户在触摸屏上的进行筛选得到的。
可选地,作为一个实施例,获取单元1510还用于:获取第三样本数据,其中,所述第三样本数据包括:第三操作行为样本数据。处理单元1520还用于:将所述第三操作行为样本数据输入到所述第一认证模型中,得到所述第一认证模型输出的第三操作行为样本数据的识别结果;将所述第三操作行为样本数据输入到所述第二认证模型中,得到所述第二认证模型输出的第三操作行为样本数据的识别结果;将所述第一认证模型输出的第三操作行为样本数据的识别结果、所述第二认证模型输出的第三操作行为样本数据的识别结果输入到所述训练好的决策融合模型中,得到所述第三操作行为样本数据对应的身份认证结果;根据所述第三操作行为样本数据和所述第三操作行为样本数据对应的身份认证结果对所述第一认证模型和/或所述第二认证模型进行训练。
图16是本申请实施例提供的身份认证的装置1600的示意性框图。图16所示的身份认证的装置1600包括获取单元1610和处理单元1620。
获取单元1610和处理单元1620可以用于执行本申请实施例的身份认证的方法,具体地,获取单元1610可以执行上述步骤S810,处理单元1620可以执行上述步骤S820和S830。
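The obtaining unit 1610 is configured to obtain first operation behavior data of a to-be-authenticated user and to obtain second operation behavior data of the to-be-authenticated user. The processing unit 1620 is configured to: input the first operation behavior data into a first authentication model to obtain a first recognition result output by the first authentication model; input the second operation behavior data into a second authentication model to obtain a second recognition result output by the second authentication model, where the first authentication model and the second authentication model are an anomaly detection model and a classification model, respectively; and input the first recognition result and the second recognition result into a decision fusion model to obtain an output identity authentication result, where the decision fusion model is configured to determine the identity authentication result according to weight parameters of the first recognition result and the second recognition result.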
可选地,作为一个实施例,第一操作行为数据和/或第二操作行为数据是通过传感器采集的数据。
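Optionally, in an embodiment, the weight parameters of the decision fusion model are obtained through training as follows: second operation behavior sample data in second sample data is input into the first authentication model and the second authentication model to obtain recognition results of the second operation behavior sample data output by the first authentication model and the second authentication model; the recognition results of the second operation behavior sample data are used as the input of the decision fusion model, and the label corresponding to the second operation behavior sample data is used as the target output value of the decision fusion model. The second sample data includes the second operation behavior sample data and the label corresponding to the second operation behavior sample data, and the label indicates whether the user corresponding to the second operation behavior sample data is an authenticated user or a non-authenticated user.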
可选地,作为一个实施例,所述第一识别结果包括:所述第一操作行为数据对应的匹配分数和/或所述第一操作行为数据对应的匹配结果;第二识别结果包括所述第二操作行为数据对应的匹配分数和/或所述第二操作行为数据对应的匹配结果;其中,所述第一操作行为数据对应的匹配分数用于指示所述待认证用户被识别为认证用户的概率,所述第一操作行为数据对应的匹配结果用于指示所述待认证用户被识别为认证用户或非认证用户,所述第一操作行为数据对应的匹配结果包括基于至少两个阈值和所述第一操作行为数据对应的匹配分数确定的至少两个匹配结果;所述第二操作行为数据对应的匹配分数用于指示所述待认证用户被识别为认证用户的概率,所述第二操作行为数据对应的匹配结果用于指示所述待认证用户被识别为认证用户或非认证用户,所述第二操作行为数据对应的匹配结果包括基于至少两个阈值和所述第二操作行为数据对应的匹配分数确定的至少两个匹配结果。
可选地,作为一个实施例,处理单元1620具体用于:根据所述第一操作行为数据对应的匹配分数得到第一分数特征;根据所述第二操作行为数据对应的匹配分数得到第二分数特征;以所述第一分数特征、所述第二分数特征、所述第一识别结果和所述第二识别结果输入到决策融合模型中,得到输出的身份认证结果。
可选地,作为一个实施例,第一操作行为数据和/或第二操作行为数据包括以下数据中的至少一种:触摸点X/Y轴坐标、触摸面积、触摸压力、触屏速度、触屏加速度、触屏轨迹的斜率、触屏长度、触屏位移、触屏角度、触屏方向、加速度X/Y/Z轴数据或陀螺仪X/Y/Z轴数据。
需要说明的是,上述训练装置1500以及身份认证的装置1600以功能单元的形式体现。这里的术语“单元”可以通过软件和/或硬件形式实现,对此不作具体限定。
例如,“单元”可以是实现上述功能的软件程序、硬件电路或二者结合。所述硬件电路可能包括应用特有集成电路(application specific integrated circuit,ASIC)、电子电路、用于执行一个或多个软件或固件程序的处理器(例如共享处理器、专有处理器或组处理器等)和存储器、合并逻辑电路和/或其它支持所描述的功能的合适组件。
因此,在本申请的实施例中描述的各示例的单元,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的 特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
图17是本申请实施例提供的一种身份认证模型的训练装置的硬件结构示意图。图17所示的训练装置900(该装置900具体可以是一种计算机设备)包括存储器901、处理器902、通信接口903以及总线904。其中,存储器901、处理器902、通信接口903通过总线904实现彼此之间的通信连接。
存储器901可以是只读存储器(read only memory,ROM),静态存储设备,动态存储设备或者随机存取存储器(random access memory,RAM)。存储器901可以存储程序,当存储器901中存储的程序被处理器902执行时,处理器902用于执行本申请实施例的身份认证模型的训练方法的各个步骤,例如,执行图6或图7所示的各个步骤。
应理解,本申请实施例所示的训练装置可以是服务器,例如,可以是云端的服务器,或者,也可以是配置于云端的服务器中的芯片。
或者,本申请实施例所示的装置可以是智能终端,或者,也可以是配置于智能终端中的芯片。
处理器902可以采用通用的中央处理器(central processing unit,CPU),微处理器,应用专用集成电路(application specific integrated circuit,ASIC),图形处理器(graphics processing unit,GPU)或者一个或多个集成电路,用于执行相关程序,以实现本申请方法实施例的身份认证模型的训练方法。
处理器902还可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,本申请的身份认证模型的训练方法的各个步骤可以通过处理器902中的硬件的集成逻辑电路或者软件形式的指令完成。
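The processor 902 may alternatively be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the methods disclosed with reference to the embodiments of this application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 901. The processor 902 reads information in the memory 901 and, in combination with its hardware, completes the functions to be performed by the units included in the training apparatus shown in FIG. 15 in the embodiments of this application, or performs the training method for the identity authentication model shown in FIG. 6 or FIG. 7 in the method embodiments of this application.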
通信接口903使用例如但不限于收发器一类的收发装置,来实现训练装置900与其他设备或通信网络之间的通信。
总线904可包括在训练装置900各个部件(例如,存储器901、处理器902、通信接口903)之间传送信息的通路。
图18是本申请实施例提供的身份认证的装置的硬件结构示意图。图18所示的身份认证的装置1000(该装置1000具体可以是一种计算机设备)包括存储器1001、处理器1002、通信接口1003以及总线1004。其中,存储器1001、处理器1002、通信接口1003通过总 线1004实现彼此之间的通信连接。
存储器1001可以是只读存储器(read only memory,ROM),静态存储设备,动态存储设备或者随机存取存储器(random access memory,RAM)。存储器1001可以存储程序,当存储器1001中存储的程序被处理器1002执行时,处理器1002用于执行本申请实施例的身份认证的方法的各个步骤,例如,执行图8所示的各个步骤。
应理解,本申请实施例所示的装置可以是智能终端,或者,也可以是配置于智能终端中的芯片。
处理器1002可以采用通用的中央处理器(central processing unit,CPU),微处理器,应用专用集成电路(application specific integrated circuit,ASIC),图形处理器(graphics processing unit,GPU)或者一个或多个集成电路,用于执行相关程序,以实现本申请方法实施例的身份认证的方法。
处理器1002还可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,本申请的身份认证的方法的各个步骤可以通过处理器1002中的硬件的集成逻辑电路或者软件形式的指令完成。
上述处理器1002还可以是通用处理器、数字信号处理器(digital signal processing,DSP)、专用集成电路(ASIC)、现成可编程门阵列(field programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器1001,处理器1002读取存储器1001中的信息,结合其硬件完成本申请实施中图16所示的装置中包括的单元所需执行的功能,或者,执行本申请方法实施例的图8所示的身份认证的方法。
通信接口1003使用例如但不限于收发器一类的收发装置,来实现装置1000与其他设备或通信网络之间的通信。
总线1004可包括在装置1000各个部件(例如,存储器1001、处理器1002、通信接口1003)之间传送信息的通路。
应注意,尽管上述训练装置900和装置1000仅仅示出了存储器、处理器、通信接口,但是在具体实现过程中,本领域的技术人员应当理解,训练装置900和装置1000还可以包括实现正常运行所必须的其他器件。同时,根据具体需要本领域的技术人员应当理解,上述训练装置900和装置1000还可包括实现其他附加功能的硬件器件。此外,本领域的技术人员应当理解,上述训练装置900和装置1000也可仅仅包括实现本申请实施例所必需的器件,而不必包括图17或图18中所示的全部器件。
图19是本申请实施例提供的身份认证的装置和身份认证模型的训练装置的硬件结构示意图。图19所示的装置1100(该装置1100具体可以是一种计算机设备)包括存储器1101、处理器1102、输出接口1103。
存储器1101可以是只读存储器(read only memory,ROM),静态存储设备,动态存储设备或者随机存取存储器(random access memory,RAM)。存储器1101可以存储程序 指令和数据,当存储器1101中存储的程序指令被处理器1102执行时,处理器1102用于执行本申请实施例的身份认证的方法或身份认证模型的训练方法的各个步骤。
例如,处理器1102接收触屏传感器和运动传感器的数据,能够实现上述实施例中的身份认证过程中的相应功能,包括如图19中的特征提取和行为匹配等。再如,处理器1102接收触屏传感器和运动传感器的数据,能够实现上述实施例中的身份认证模型的训练过程中的相应功能,包括如图19中的特征提取和行为建模等。进一步地,处理器1102还可以用于实现图5中的其他功能。
应理解,本申请实施例所示的装置可以是智能终端,或者,也可以是配置于智能终端中的芯片。
处理器1102可以采用通用的中央处理器(central processing unit,CPU),微处理器,应用专用集成电路(application specific integrated circuit,ASIC),图形处理器(graphics processing unit,GPU)或者一个或多个集成电路,用于执行相关程序,以实现本申请方法实施例的身份认证的方法或身份认证模型的训练方法。
处理器1102还可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,本申请的身份认证的方法或身份认证模型的训练方法的各个步骤可以通过处理器1102中的硬件的集成逻辑电路或者软件形式的指令完成。
上述处理器1102还可以是通用处理器、数字信号处理器(digital signal processing,DSP)、专用集成电路(ASIC)、现成可编程门阵列(field programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器1101,处理器1102读取存储器1101中的信息,结合其硬件完成本申请实施中图15或图16所示的装置中包括的单元所需执行的功能,或者,执行本申请方法实施例的图6或图7或图8所示的方法。
输出接口1103使用例如但不限于收发器一类的收发装置,来实现装置1100与其他设备或通信网络之间的通信。
应注意,尽管上述装置1100仅仅示出了存储器、处理器、输出接口,但是在具体实现过程中,本领域的技术人员应当理解,装置1100还可以包括实现正常运行所必须的其他器件。同时,根据具体需要本领域的技术人员应当理解,上述装置1100还可包括实现其他附加功能的硬件器件。此外,本领域的技术人员应当理解,上述装置1100也可仅仅包括实现本申请实施例所必需的器件,而不必包括图19中所示的全部器件。
应理解,本申请实施例中的处理器可以为中央处理单元(central processing unit,CPU),该处理器还可以是其他通用处理器、数字信号处理器(digital signal processor,DSP)、专用集成电路(application specific integrated circuit,ASIC)、现成可编程门阵列(field programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
还应理解,本申请实施例中的存储器可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(read-only memory,ROM)、可编程只读存储器(programmable ROM,PROM)、可擦除可编程只读存储器(erasable PROM,EPROM)、电可擦除可编程只读存储器(electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(random access memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的随机存取存储器(random access memory,RAM)可用,例如静态随机存取存储器(static RAM,SRAM)、动态随机存取存储器(DRAM)、同步动态随机存取存储器(synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(double data rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(direct rambus RAM,DR RAM)。
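The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When software is used, the foregoing embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, the procedures or functions according to the embodiments of this application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device such as a server or a data center that contains one or more sets of usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium. The semiconductor medium may be a solid-state drive.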
应理解,本文中术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况,其中A,B可以是单数或者复数。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系,但也可能表示的是一种“和/或”的关系,具体可参考前后文进行理解。
本申请中,“至少一个”是指一个或者多个,“多个”是指两个或两个以上。“以下至少一项(个)”或其类似表达,是指的这些项中的任意组合,包括单项(个)或复数项(个)的任意组合。例如,a,b,或c中的至少一项(个),可以表示:a,b,c,a-b,a-c,b-c,或a-b-c,其中a,b,c可以是单个,也可以是多个。
应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本 申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器、随机存取存储器、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (30)

  1. 一种身份认证的方法,其特征在于,包括:
    获取待认证用户的第一操作行为数据;
    获取所述待认证用户的第二操作行为数据;
    以所述第一操作行为数据输入通过第一认证模型,获得所述第一认证模型输出的第一识别结果;
    以所述第二操作行为数据输入通过第二认证模型,获得所述第二认证模型输出的第二识别结果;
    其中,所述第一认证模型、第二认证模型分别为异常检测模型和分类模型;
    将所述第一识别结果、所述第二识别结果输入到决策融合模型中,得到输出的身份认证结果,其中,所述决策融合模型用于根据所述第一识别结果和所述第二识别结果的权重参数确定所述身份认证结果。
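  2. The method according to claim 1, characterized in that model parameters of the first authentication model are obtained through training based on first sample data, the first sample data comprising first operation behavior sample data and a label corresponding to the first operation behavior sample data, wherein the label indicates whether the user corresponding to the first operation behavior sample data is an authenticated user or a non-authenticated user.
  3. The method according to claim 1 or 2, characterized in that the first operation behavior data and/or the second operation behavior data is data collected by a sensor.
  4. The method according to any one of claims 1 to 3, characterized in that the anomaly detection model comprises a one-class support vector machine (SVM) model or an isolation forest, and the classification model comprises an SVM model or a neural network.
  5. The method according to any one of claims 1 to 4, characterized in that the weight parameters of the decision fusion model are obtained through training in which second operation behavior sample data in second sample data is input into the first authentication model and the second authentication model to obtain recognition results of the second operation behavior sample data output by the first authentication model and the second authentication model, the recognition results of the second operation behavior sample data are used as the input of the decision fusion model, and a label corresponding to the second operation behavior sample data is used as the target output value of the decision fusion model; the second sample data comprises the second operation behavior sample data and the label corresponding to the second operation behavior sample data, and the label indicates whether the user corresponding to the second operation behavior sample data is an authenticated user or a non-authenticated user.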
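  6. The method according to any one of claims 1 to 5, characterized in that
    the first recognition result comprises a match score corresponding to the first operation behavior data and/or a match result corresponding to the first operation behavior data;
    the second recognition result comprises a match score corresponding to the second operation behavior data and/or a match result corresponding to the second operation behavior data;
    wherein the match score corresponding to the first operation behavior data indicates the probability that the user to be authenticated is recognized as an authenticated user, the match result corresponding to the first operation behavior data indicates whether the user to be authenticated is recognized as an authenticated user or a non-authenticated user, and the match result corresponding to the first operation behavior data comprises at least two match results determined based on at least two thresholds and the match score corresponding to the first operation behavior data; and
    the match score corresponding to the second operation behavior data indicates the probability that the user to be authenticated is recognized as an authenticated user, the match result corresponding to the second operation behavior data indicates whether the user to be authenticated is recognized as an authenticated user or a non-authenticated user, and the match result corresponding to the second operation behavior data comprises at least two match results determined based on at least two thresholds and the match score corresponding to the second operation behavior data.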
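  7. The method according to claim 6, characterized in that inputting the first recognition result and the second recognition result into the decision fusion model to obtain the output identity authentication result comprises:
    obtaining a first score feature according to the match score corresponding to the first operation behavior data;
    obtaining a second score feature according to the match score corresponding to the second operation behavior data; and
    inputting the first score feature, the second score feature, the first recognition result, and the second recognition result into the decision fusion model to obtain the output identity authentication result.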
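Claims 6 and 7 describe turning each match score into several thresholded match results and into a score feature before fusion. The sketch below shows one way the fusion input could be assembled. The claims in this section do not define the score feature, so the signed margin of the score from each threshold is used purely as a placeholder; the threshold values and all function names are likewise assumptions.

```python
def match_results(score, thresholds):
    """One accept (1.0) or reject (0.0) decision per threshold, as in claim 6."""
    return [1.0 if score >= t else 0.0 for t in thresholds]


def score_features(score, thresholds):
    """Placeholder score feature: signed margin of the score from each threshold."""
    return [score - t for t in thresholds]


def fusion_input(first_score, second_score, first_thresholds, second_thresholds):
    """Concatenate score features and recognition results into one input vector."""
    return (
        score_features(first_score, first_thresholds)
        + score_features(second_score, second_thresholds)
        + [first_score] + match_results(first_score, first_thresholds)
        + [second_score] + match_results(second_score, second_thresholds)
    )


# Example with two assumed thresholds per model.
vector = fusion_input(0.75, 0.25, [0.5, 1.0], [0.5, 0.75])
print(vector)
```

Using at least two thresholds gives the fusion model both a lenient and a strict view of the same score, which is one plausible reading of why the claims require more than one match result.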
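  8. The method according to any one of claims 1 to 7, characterized in that the first operation behavior data or the second operation behavior data comprises at least one of the following: touch-point X/Y-axis coordinates, touch area, touch pressure, touch-screen velocity, touch-screen acceleration, slope of the touch-screen trajectory, touch-screen length, touch-screen displacement, touch-screen angle, touch-screen direction, acceleration X/Y/Z-axis data, or gyroscope X/Y/Z-axis data.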
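Several of the quantities listed in claim 8 can be derived from raw touch samples. The following sketch computes a few of them for a single swipe, assuming each sample is a `(t_seconds, x, y)` tuple; the sample format, the function name, and the exact feature definitions are assumptions for illustration, not definitions taken from the application.

```python
import math


def swipe_features(points):
    """Compute geometric features of one swipe from time-ordered (t, x, y) samples."""
    (t0, x0, y0), (t1, x1, y1) = points[0], points[-1]
    # Path length: sum of distances between consecutive touch points.
    length = sum(math.dist(a[1:], b[1:]) for a, b in zip(points, points[1:]))
    dx, dy = x1 - x0, y1 - y0
    duration = t1 - t0
    return {
        "length": length,
        "displacement": math.hypot(dx, dy),             # straight-line start-to-end distance
        "angle_deg": math.degrees(math.atan2(dy, dx)),  # direction of the swipe
        "slope": dy / dx if dx != 0 else math.inf,
        "mean_speed": length / duration if duration > 0 else 0.0,
    }


print(swipe_features([(0.0, 0, 0), (0.1, 3, 4), (0.2, 6, 8)]))
```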
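  9. A method for training an identity authentication model, characterized in that the identity authentication model comprises a first authentication model, a second authentication model, and a decision fusion model, wherein the first authentication model and the second authentication model are an anomaly detection model and a classification model, respectively, the method comprising:
    obtaining second sample data, the second sample data comprising second operation behavior sample data and a label corresponding to the second operation behavior sample data, wherein the label indicates whether the user corresponding to the second operation behavior sample data is an authenticated user or a non-authenticated user;
    inputting the second operation behavior sample data in the second sample data into the first authentication model and the second authentication model to obtain recognition results of the second operation behavior sample data, wherein model parameters of the first authentication model are obtained through training based on first sample data, the first sample data comprises first operation behavior sample data and a label corresponding to the first operation behavior sample data, the label corresponding to the first operation behavior sample data indicates whether the user corresponding to the first operation behavior sample data is an authenticated user or a non-authenticated user, model parameters of the second authentication model are obtained through training based on fifth sample data, the fifth sample data comprises fifth operation behavior sample data and a label corresponding to the fifth operation behavior sample data, and the label corresponding to the fifth operation behavior sample data indicates whether the user corresponding to the fifth operation behavior sample data is an authenticated user or a non-authenticated user; and
    training the decision fusion model with the recognition results of the second operation behavior sample data as its input and the label corresponding to the second operation behavior sample data as its target output value, to obtain a trained decision fusion model.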
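The training procedure of claim 9 resembles stacking: the already trained base models score held-out labeled samples, and the fusion model learns weights over those scores. The sketch below uses logistic regression fitted by stochastic gradient descent as a stand-in fusion model. That choice, the hyperparameters, and every name in the code are assumptions; the claim only requires that the recognition results be the input and the labels be the target output.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_decision_fusion(samples, labels, first_model, second_model, lr=0.5, epochs=2000):
    """Fit fusion weights on the base models' outputs (label 1 = authenticated user)."""
    # Step 1: recognition results of the second sample data from both base models.
    results = [[first_model(s), second_model(s)] for s in samples]
    weights, bias = [0.0, 0.0], 0.0
    # Step 2: logistic regression by SGD, with the labels as target output values.
    for _ in range(epochs):
        for x, y in zip(results, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def fuse(weights, bias, results):
    """Return True when the fused probability favors the authenticated user."""
    return sigmoid(bias + sum(w * r for w, r in zip(weights, results))) >= 0.5
```

Training the fusion model on data separate from the first and fifth sample data, as the claim does, keeps the learned weights from simply reflecting how well each base model memorized its own training set.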
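  10. The method according to claim 9, characterized in that the recognition results of the second operation behavior sample data comprise a recognition result output by the first authentication model and a recognition result output by the second authentication model; the recognition result output by the first authentication model comprises a match score corresponding to the second operation behavior sample data output by the first authentication model and/or a match result corresponding to the second operation behavior sample data output by the first authentication model; the recognition result output by the second authentication model comprises a match score corresponding to the second operation behavior sample data output by the second authentication model and/or a match result corresponding to the second operation behavior sample data output by the second authentication model;
    wherein the match score corresponding to the second operation behavior sample data indicates the probability that the user corresponding to the second operation behavior sample data is recognized as an authenticated user, the match result corresponding to the second operation behavior sample data indicates whether the user corresponding to the second operation behavior sample data is recognized as an authenticated user or a non-authenticated user, and the match result corresponding to the second operation behavior sample data comprises at least two match results determined based on at least two thresholds and the match score corresponding to the second operation behavior sample data.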
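  11. The method according to claim 10, characterized in that the at least two thresholds comprise a first threshold, and the method further comprises:
    obtaining fourth sample data, the fourth sample data comprising fourth operation behavior sample data and a label corresponding to the fourth operation behavior sample data, wherein the label indicates whether the user corresponding to the fourth operation behavior sample data is an authenticated user or a non-authenticated user;
    inputting the fourth operation behavior sample data into the first authentication model to obtain a match score corresponding to the fourth operation behavior sample data output by the first authentication model, wherein the match score indicates the probability that the user corresponding to the fourth operation behavior sample data is recognized as an authenticated user;
    determining, based on a plurality of candidate thresholds, a plurality of candidate match results for the match score corresponding to the fourth operation behavior sample data, wherein the plurality of candidate match results indicate whether the user corresponding to the fourth operation behavior sample data is recognized as an authenticated user or a non-authenticated user; and
    determining, as the first threshold, the candidate threshold corresponding to a candidate match result whose accuracy satisfies a preset condition among the plurality of candidate match results.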
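The threshold selection in claim 11 can be sketched as a simple search over candidate thresholds. The claim only says that the accuracy must satisfy a preset condition; this example assumes that the condition is "highest accuracy among the candidates", and the function name and data are illustrative.

```python
def select_threshold(scores, labels, candidates):
    """Return the candidate threshold whose match results are most accurate.

    scores: match scores from the first authentication model on labeled samples
    labels: 1 for an authenticated user, 0 for a non-authenticated user
    """
    def accuracy(threshold):
        hits = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
        return hits / len(scores)

    return max(candidates, key=accuracy)


print(select_threshold([0.9, 0.7, 0.4, 0.2], [1, 1, 0, 0], [0.3, 0.5, 0.8]))
# 0.5
```

A different preset condition, such as the first threshold to reach a target accuracy or to keep false accepts below a limit, would change only the selection line.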
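  12. The method according to claim 10 or 11, characterized in that training the decision fusion model with the recognition results of the second operation behavior sample data as its input and the label corresponding to the second operation behavior sample data as its target output value, to obtain the trained decision fusion model, comprises:
    obtaining a first score feature of the second operation behavior sample data according to the match score corresponding to the second operation behavior sample data output by the first authentication model;
    obtaining a second score feature of the second operation behavior sample data according to the match score corresponding to the second operation behavior sample data output by the second authentication model; and
    training the decision fusion model with the first score feature of the second operation behavior sample data, the second score feature of the second operation behavior sample data, and the recognition results of the second operation behavior sample data as its input, and the label corresponding to the second operation behavior sample data as its target output value.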
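  13. The method according to any one of claims 9 to 12, characterized in that the second operation behavior sample data comprises at least one of the following: touch-point X/Y-axis coordinates, touch area, touch pressure, touch-screen velocity, touch-screen acceleration, slope of the touch-screen trajectory, touch-screen length, touch-screen displacement, touch-screen angle, touch-screen direction, acceleration X/Y/Z-axis data, or gyroscope X/Y/Z-axis data.
  14. The method according to any one of claims 9 to 13, characterized in that the second sample data is obtained through screening according to the duration of a user's swipe on a touchscreen and/or the number of touch points of the user on the touchscreen.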
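The screening step of claim 14 could look like the sketch below, which drops swipes that are too short, too long, or made of too few touch points. The `(t_seconds, x, y)` sample format and all limit values are assumptions chosen for the example; the claim does not specify them.

```python
def filter_swipes(swipes, min_duration=0.05, max_duration=2.0, min_points=5):
    """Keep swipes whose duration and touch-point count fall within assumed limits.

    Each swipe is a time-ordered list of (t_seconds, x, y) samples.
    """
    kept = []
    for points in swipes:
        if len(points) < min_points:
            continue  # too few touch points to describe the gesture
        duration = points[-1][0] - points[0][0]
        if min_duration <= duration <= max_duration:
            kept.append(points)
    return kept
```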
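  15. The method according to any one of claims 9 to 14, characterized in that the method further comprises:
    obtaining third sample data, wherein the third sample data comprises third operation behavior sample data;
    inputting the third operation behavior sample data into the first authentication model to obtain a recognition result of the third operation behavior sample data output by the first authentication model;
    inputting the third operation behavior sample data into the second authentication model to obtain a recognition result of the third operation behavior sample data output by the second authentication model;
    inputting the recognition result of the third operation behavior sample data output by the first authentication model and the recognition result of the third operation behavior sample data output by the second authentication model into the trained decision fusion model to obtain an identity authentication result corresponding to the third operation behavior sample data; and
    training the first authentication model and/or the second authentication model according to the third operation behavior sample data and the identity authentication result corresponding to the third operation behavior sample data.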
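Claim 15 describes a loop in which unlabeled third sample data is labeled by the fused authentication result and then used to update the base models, similar to pseudo-labeling. The sketch below illustrates the idea with a deliberately tiny one-feature model. `MeanThresholdModel`, its `fit` rule, and the averaging fusion in the usage lines are all hypothetical stand-ins, not the models of the application.

```python
class MeanThresholdModel:
    """Toy one-feature model: accepts when the feature reaches a learned cut-off."""

    def __init__(self, cutoff=0.5):
        self.cutoff = cutoff

    def __call__(self, x):
        return 1.0 if x >= self.cutoff else 0.0

    def fit(self, labeled):
        """Move the cut-off midway between the accepted and rejected class means."""
        pos = [x for x, y in labeled if y == 1]
        neg = [x for x, y in labeled if y == 0]
        if pos and neg:
            self.cutoff = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def pseudo_label(samples, first_model, second_model, fusion):
    """Label unlabeled samples with the fused identity authentication result."""
    return [(s, int(fusion(first_model(s), second_model(s)))) for s in samples]


# Usage: label new samples with the fused decision, then retrain the first model.
first, second = MeanThresholdModel(0.7), MeanThresholdModel(0.7)
labeled = pseudo_label([1.0, 0.75, 0.0, 0.25], first, second, lambda a, b: (a + b) / 2 >= 0.5)
first.fit(labeled)
```

In practice such a loop is sensitive to wrong pseudo-labels, so a real system would likely retrain only on samples for which the fused decision is confident.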
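  16. An identity authentication apparatus, characterized by comprising:
    an obtaining unit, configured to:
    obtain first operation behavior data of a user to be authenticated;
    obtain second operation behavior data of the user to be authenticated; and
    a processing unit, configured to:
    input the first operation behavior data into a first authentication model to obtain a first recognition result output by the first authentication model;
    input the second operation behavior data into a second authentication model to obtain a second recognition result output by the second authentication model;
    wherein the first authentication model and the second authentication model are an anomaly detection model and a classification model, respectively; and
    input the first recognition result and the second recognition result into a decision fusion model to obtain an output identity authentication result, wherein the decision fusion model is used to determine the identity authentication result according to weight parameters of the first recognition result and the second recognition result.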
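  17. The apparatus according to claim 16, characterized in that the weight parameters of the decision fusion model are obtained through training in which second operation behavior sample data in second sample data is input into the first authentication model and the second authentication model to obtain recognition results of the second operation behavior sample data output by the first authentication model and the second authentication model, the recognition results of the second operation behavior sample data are used as the input of the decision fusion model, and a label corresponding to the second operation behavior sample data is used as the target output value of the decision fusion model; the second sample data comprises the second operation behavior sample data and the label corresponding to the second operation behavior sample data, and the label indicates whether the user corresponding to the second operation behavior sample data is an authenticated user or a non-authenticated user.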
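  18. The apparatus according to claim 16 or 17, characterized in that
    the first recognition result comprises a match score corresponding to the first operation behavior data and/or a match result corresponding to the first operation behavior data;
    the second recognition result comprises a match score corresponding to the second operation behavior data and/or a match result corresponding to the second operation behavior data;
    wherein the match score corresponding to the first operation behavior data indicates the probability that the user to be authenticated is recognized as an authenticated user, the match result corresponding to the first operation behavior data indicates whether the user to be authenticated is recognized as an authenticated user or a non-authenticated user, and the match result corresponding to the first operation behavior data comprises at least two match results determined based on at least two thresholds and the match score corresponding to the first operation behavior data; and
    the match score corresponding to the second operation behavior data indicates the probability that the user to be authenticated is recognized as an authenticated user, the match result corresponding to the second operation behavior data indicates whether the user to be authenticated is recognized as an authenticated user or a non-authenticated user, and the match result corresponding to the second operation behavior data comprises at least two match results determined based on at least two thresholds and the match score corresponding to the second operation behavior data.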
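  19. The apparatus according to claim 18, characterized in that the processing unit is specifically configured to:
    obtain a first score feature according to the match score corresponding to the first operation behavior data;
    obtain a second score feature according to the match score corresponding to the second operation behavior data; and
    input the first score feature, the second score feature, the first recognition result, and the second recognition result into the decision fusion model to obtain the output identity authentication result.
  20. The apparatus according to any one of claims 16 to 19, characterized in that the first operation behavior data and/or the second operation behavior data comprises at least one of the following: touch-point X/Y-axis coordinates, touch area, touch pressure, touch-screen velocity, touch-screen acceleration, slope of the touch-screen trajectory, touch-screen length, touch-screen displacement, touch-screen angle, touch-screen direction, acceleration X/Y/Z-axis data, or gyroscope X/Y/Z-axis data.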
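  21. An apparatus for training an identity authentication model, characterized in that the identity authentication model comprises a first authentication model, a second authentication model, and a decision fusion model, wherein the first authentication model and the second authentication model are an anomaly detection model and a classification model, respectively, the apparatus comprising:
    an obtaining unit, configured to obtain second sample data, the second sample data comprising second operation behavior sample data and a label corresponding to the second operation behavior sample data, wherein the label indicates whether the user corresponding to the second operation behavior sample data is an authenticated user or a non-authenticated user; and
    a processing unit, configured to:
    input the second operation behavior sample data in the second sample data into the first authentication model and the second authentication model to obtain recognition results of the second operation behavior sample data, wherein model parameters of the first authentication model are obtained through training based on first sample data, the first sample data comprises first operation behavior sample data and a label corresponding to the first operation behavior sample data, the label corresponding to the first operation behavior sample data indicates whether the user corresponding to the first operation behavior sample data is an authenticated user or a non-authenticated user, model parameters of the second authentication model are obtained through training based on fifth sample data, the fifth sample data comprises fifth operation behavior sample data and a label corresponding to the fifth operation behavior sample data, and the label corresponding to the fifth operation behavior sample data indicates whether the user corresponding to the fifth operation behavior sample data is an authenticated user or a non-authenticated user; and
    train the decision fusion model with the recognition results of the second operation behavior sample data as its input and the label corresponding to the second operation behavior sample data as its target output value, to obtain a trained decision fusion model.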
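  22. The apparatus according to claim 21, characterized in that the recognition results of the second operation behavior sample data comprise a recognition result output by the first authentication model and a recognition result output by the second authentication model; the recognition result output by the first authentication model comprises a match score corresponding to the second operation behavior sample data output by the first authentication model and/or a match result corresponding to the second operation behavior sample data output by the first authentication model; the recognition result output by the second authentication model comprises a match score corresponding to the second operation behavior sample data output by the second authentication model and/or a match result corresponding to the second operation behavior sample data output by the second authentication model;
    wherein the match score corresponding to the second operation behavior sample data indicates the probability that the user corresponding to the second operation behavior sample data is recognized as an authenticated user, the match result corresponding to the second operation behavior sample data indicates whether the user corresponding to the second operation behavior sample data is recognized as an authenticated user or a non-authenticated user, and the match result corresponding to the second operation behavior sample data comprises at least two match results determined based on at least two thresholds and the match score corresponding to the second operation behavior sample data.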
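  23. The apparatus according to claim 22, characterized in that the at least two thresholds comprise a first threshold, and
    the obtaining unit is further configured to obtain fourth sample data, the fourth sample data comprising fourth operation behavior sample data and a label corresponding to the fourth operation behavior sample data, wherein the label indicates whether the user corresponding to the fourth operation behavior sample data is an authenticated user or a non-authenticated user;
    the processing unit is further configured to:
    input the fourth operation behavior sample data into the first authentication model to obtain a match score corresponding to the fourth operation behavior sample data output by the first authentication model, wherein the match score indicates the probability that the user corresponding to the fourth operation behavior sample data is recognized as an authenticated user;
    determine, based on a plurality of candidate thresholds, a plurality of candidate match results for the match score corresponding to the fourth operation behavior sample data, wherein the plurality of candidate match results indicate whether the user corresponding to the fourth operation behavior sample data is recognized as an authenticated user or a non-authenticated user; and
    determine, as the first threshold, the candidate threshold corresponding to a candidate match result whose accuracy satisfies a preset condition among the plurality of candidate match results.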
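  24. The apparatus according to claim 22 or 23, characterized in that the processing unit is specifically configured to:
    obtain a first score feature of the second operation behavior sample data according to the match score corresponding to the second operation behavior sample data output by the first authentication model;
    obtain a second score feature of the second operation behavior sample data according to the match score corresponding to the second operation behavior sample data output by the second authentication model; and
    train the decision fusion model with the first score feature of the second operation behavior sample data, the second score feature of the second operation behavior sample data, and the recognition results of the second operation behavior sample data as its input, and the label corresponding to the second operation behavior sample data as its target output value.
  25. The apparatus according to any one of claims 21 to 24, characterized in that the second operation behavior sample data comprises at least one of the following: touch-point X/Y-axis coordinates, touch area, touch pressure, touch-screen velocity, touch-screen acceleration, slope of the touch-screen trajectory, touch-screen length, touch-screen displacement, touch-screen angle, touch-screen direction, acceleration X/Y/Z-axis data, or gyroscope X/Y/Z-axis data.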
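  26. The apparatus according to any one of claims 21 to 25, characterized in that
    the obtaining unit is further configured to obtain third sample data, wherein the third sample data comprises third operation behavior sample data;
    the processing unit is further configured to:
    input the third operation behavior sample data into the first authentication model to obtain a recognition result of the third operation behavior sample data output by the first authentication model;
    input the third operation behavior sample data into the second authentication model to obtain a recognition result of the third operation behavior sample data output by the second authentication model;
    input the recognition result of the third operation behavior sample data output by the first authentication model and the recognition result of the third operation behavior sample data output by the second authentication model into the trained decision fusion model to obtain an identity authentication result corresponding to the third operation behavior sample data; and
    train the first authentication model and/or the second authentication model according to the third operation behavior sample data and the identity authentication result corresponding to the third operation behavior sample data.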
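  27. An identity authentication apparatus, characterized by comprising a processor and a memory, wherein the processor is coupled to the memory, and the processor is configured to read and execute instructions in the memory to perform the method according to any one of claims 1 to 8.
  28. An apparatus for training an identity authentication model, characterized by comprising a processor and a memory, wherein the processor is coupled to the memory, and the processor is configured to read and execute instructions in the memory to perform the training method according to any one of claims 9 to 15.
  29. A computer-readable medium, characterized in that the computer-readable medium stores program code, and when the program code is run on a computer, the computer is caused to perform the method according to any one of claims 1 to 8.
  30. A computer-readable medium, characterized in that the computer-readable medium stores program code, and when the program code is run on a computer, the computer is caused to perform the training method according to any one of claims 9 to 15.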
PCT/CN2021/085319 2020-04-06 2021-04-02 Identity authentication method, and method and apparatus for training identity authentication model Ceased WO2021204086A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21784667.4A EP4120105A4 (en) 2020-04-06 2021-04-02 Identity authentication method, and method and device for training identity authentication model
US17/958,746 US20230027527A1 (en) 2020-04-06 2022-10-03 Identity authentication method, and method and apparatus for training identity authentication model

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010262293.6 2020-04-06
CN202010262293 2020-04-06

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/958,746 Continuation US20230027527A1 (en) 2020-04-06 2022-10-03 Identity authentication method, and method and apparatus for training identity authentication model

Publications (1)

Publication Number Publication Date
WO2021204086A1 true WO2021204086A1 (zh) 2021-10-14

Family

ID=78023995

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/085319 Ceased WO2021204086A1 (zh) 2020-04-06 2021-04-02 Identity authentication method, and method and apparatus for training identity authentication model

Country Status (3)

Country Link
US (1) US20230027527A1 (zh)
EP (1) EP4120105A4 (zh)
WO (1) WO2021204086A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
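CN114036487A (zh) * 2021-12-06 2022-02-11 北京神州新桥科技有限公司 Identity authentication method and electronic device
CN115103127A (zh) * 2022-08-22 2022-09-23 环球数科集团有限公司 High-performance embedded smart camera design system and method
CN115412373A (zh) * 2022-11-01 2022-11-29 中网信安科技有限公司 Method and system for secure access to a mechatronic industrial control network
CN116112923A (zh) * 2023-02-16 2023-05-12 惠州市源医科技有限公司 Intelligent multi-frequency 5G wireless router and security verification method thereof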

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12530444B2 (en) * 2022-11-17 2026-01-20 Dell Products, L.P. Firmware-based artificial intelligence (AI) model authorization in heterogeneous computing platforms
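CN116910367A (zh) * 2023-07-21 2023-10-20 北京百度网讯科技有限公司 Information recognition method, and neural network model training method and apparatus
CN117909729A (zh) * 2023-10-19 2024-04-19 北京火山引擎科技有限公司 Interface access processing method and apparatus, computer device, and storage medium
CN119783732B (zh) * 2025-03-10 2025-07-01 云南大学 Ore prospecting method, apparatus, device, and medium based on multi-source remote sensing technology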

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
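CN104239761A (zh) * 2014-09-15 2014-12-24 西安交通大学 Continuous identity authentication method based on touch-screen sliding behavior characteristics
CN104408341A (zh) * 2014-11-13 2015-03-11 西安交通大学 Smartphone user identity authentication method based on gyroscope behavior characteristics
CN106039711A (zh) * 2016-05-17 2016-10-26 网易(杭州)网络有限公司 User identity authentication method and apparatus
CN110674483A (zh) * 2019-08-14 2020-01-10 广东工业大学 Identity recognition method based on multi-modal information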

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2406717A4 (en) * 2009-03-13 2012-12-26 Univ Rutgers SYSTEMS AND METHODS FOR DETECTING DAMAGE PROGRAMS
EP3149643A1 (en) * 2014-05-30 2017-04-05 PCMS Holdings, Inc. Systems and methods for active authentication
US9686275B2 (en) * 2014-07-07 2017-06-20 International Business Machines Corporation Correlating cognitive biometrics for continuous identify verification
US9916431B2 (en) * 2015-01-15 2018-03-13 Qualcomm Incorporated Context-based access verification
KR102439938B1 (ko) * 2015-08-03 2022-09-05 삼성전자주식회사 Multi-modal fusion method for user authentication and user authentication method
US10289819B2 (en) * 2015-08-12 2019-05-14 Kryptowire LLC Active authentication of users
US11100201B2 (en) * 2015-10-21 2021-08-24 Neurametrix, Inc. Method and system for authenticating a user through typing cadence
US10749883B1 (en) * 2017-05-02 2020-08-18 Hrl Laboratories, Llc Automatic anomaly detector
US11159520B1 (en) * 2018-12-20 2021-10-26 Wells Fargo Bank, N.A. Systems and methods for passive continuous session authentication
EP3935529B1 (en) * 2019-03-07 2024-11-13 British Telecommunications public limited company Permissive access control

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4120105A4

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
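CN114036487A (zh) * 2021-12-06 2022-02-11 北京神州新桥科技有限公司 Identity authentication method and electronic device
CN115103127A (zh) * 2022-08-22 2022-09-23 环球数科集团有限公司 High-performance embedded smart camera design system and method
CN115412373A (zh) * 2022-11-01 2022-11-29 中网信安科技有限公司 Method and system for secure access to a mechatronic industrial control network
CN115412373B (zh) * 2022-11-01 2023-03-21 中网信安科技有限公司 Method and system for secure access to a mechatronic industrial control network
CN116112923A (zh) * 2023-02-16 2023-05-12 惠州市源医科技有限公司 Intelligent multi-frequency 5G wireless router and security verification method thereof
CN116112923B (zh) * 2023-02-16 2023-08-08 惠州市源医科技有限公司 Intelligent multi-frequency 5G wireless router and security verification method thereof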

Also Published As

Publication number Publication date
EP4120105A1 (en) 2023-01-18
EP4120105A4 (en) 2023-08-23
US20230027527A1 (en) 2023-01-26

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21784667

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021784667

Country of ref document: EP

Effective date: 20221011

NENP Non-entry into the national phase

Ref country code: DE