WO2020098606A1 - 节点分类方法、模型训练方法、装置、设备及存储介质 - Google Patents
节点分类方法、模型训练方法、装置、设备及存储介质 Download PDFInfo
- Publication number
- WO2020098606A1 WO2020098606A1 PCT/CN2019/117173 CN2019117173W WO2020098606A1 WO 2020098606 A1 WO2020098606 A1 WO 2020098606A1 CN 2019117173 W CN2019117173 W CN 2019117173W WO 2020098606 A1 WO2020098606 A1 WO 2020098606A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- node
- target
- subset
- neighbor
- nodes
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Links
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2323—Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24143—Distances to neighbourhood prototypes, e.g. restricted Coulomb energy networks [RCEN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24147—Distances to closest patterns, e.g. nearest neighbour classification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Definitions
- This application relates to the field of Internet technology, and in particular to a node classification method, model training method, device, equipment, and storage medium.
- Atlas is also a common data, such as social networks, knowledge atlas and drug molecular structure.
- FIG. 1 is a schematic diagram of a graph convolutional neural network (GCN) classification model for graph node classification in the related art.
- GCN graph convolutional neural network
- GCN For large-scale graphs, they often contain tens of millions of nodes or even more than 100 million nodes, as well as connected edges with billions or more.
- GCN needs to traverse each node in one feature information calculation, which results in excessive calculation cost and excessive consumption of computing resources.
- Embodiments of the present application provide a node classification method, model training method, device, equipment, and storage medium, which can solve the problem that GCN needs to traverse each node in a feature information calculation, resulting in excessive calculation cost and excessive consumption of computing resources. problem.
- a method for node classification is provided, which is applied to a computer device.
- the method includes:
- the neighbor node set includes at least one neighbor node, and the neighbor node has an association relationship with the node to be classified;
- the node classification model is a model trained by at least one sample node subset, and the sample node subset is a subset of the sample node collection, Nodes in the sample node set are marked with node categories.
- a method for model training is provided, which is applied to computer equipment.
- the method includes:
- the target node has an association relationship
- the target model parameters of the node classification model are trained according to the predicted category probability subset and the target node category of the target node.
- the apparatus includes:
- the obtaining module is used to obtain the nodes to be classified in the target node set
- the acquiring module is further configured to acquire a neighbor node set of the node to be classified from the target node set, wherein the neighbor node set includes at least one neighbor node, the neighbor node and the node to be classified There is an association relationship between;
- An extraction module for extracting target feature information of the node to be classified according to the neighbor node set through a node classification model
- a determining module configured to determine the classification result of the node to be classified according to the target feature information, wherein the node classification model is a model trained by at least one sample node subset, and the sample node subset is a sample node In a subset of the set, the nodes in the sample node set are marked with node categories.
- the apparatus includes:
- An obtaining module configured to obtain a target node subset and a neighbor node subset corresponding to the target node subset from the sample node set marked with the target node category, the neighbor node in the neighbor node subset and the target
- the target nodes in the node sub-collection have an association relationship
- An extraction module configured to extract a node feature subset of the target node subset through a node classification model in combination with the neighbor node subset, wherein the node feature subset includes the node feature vector of the target node;
- a prediction module configured to perform category prediction on the target node according to the node feature subset to obtain a predicted category probability subset
- the training module is configured to train the target model parameters of the node classification model according to the predicted category probability subset and the target node category of the target node.
- a computer device in another aspect, includes a processor and a memory, and the memory stores at least one instruction, at least one program, code set, or instruction set.
- the at least one instruction, the at least one A piece of program, the code set or the instruction set is loaded and executed by the processor to implement the node classification method or the model training method as described in any one of the above embodiments of the present application.
- a computer-readable storage medium in which at least one instruction, at least one program, code set or instruction set is stored, the at least one instruction, the at least one program, the code
- the set or instruction set is loaded and executed by the processor to implement the node classification method or model training method as described in any one of the above embodiments of the present application.
- a computer program product which, when the computer program product runs on a computer, causes the computer to execute the node classification method or the model training method described in any one of the embodiments of the present application described above.
- the node to be classified when updating the characteristics of a node during the node classification process, is selected from the target node set, and the node characteristics of the node to be classified are determined according to the neighbor node set of the node to be classified, thereby The node classification results are obtained according to the characteristics of the nodes.
- Figure 1 is a schematic diagram of a graph convolution neural network classification model for graph node classification in the related art
- FIG. 2 is a schematic diagram of a large-scale map in an embodiment of the present application.
- FIG. 3 is a schematic structural diagram of a node classification system in an embodiment of this application.
- FIG. 4 is a schematic diagram of an embodiment of a node classification method in an embodiment of this application.
- FIG. 5 is a schematic diagram of an embodiment of a model training method in an embodiment of this application.
- FIG. 6 is a schematic diagram of an embodiment of sampling nodes of a graph in an embodiment of the present application.
- FIG. 7 is a schematic flowchart of model training for graph nodes in an embodiment of the present application.
- FIG. 8 is a flowchart of training a target model parameter of a node classification model according to a predicted category probability subset and a target node category of a target node;
- FIG. 9 is a schematic diagram of a process of model training for graph nodes in an embodiment of the present application.
- FIG. 10 is a structural block diagram of an apparatus for classifying nodes according to an exemplary embodiment of the present application.
- FIG. 11 is a structural block diagram of an apparatus for model training provided by an exemplary embodiment of the present application.
- FIG. 12 is a schematic structural diagram of a server in an embodiment of the present application.
- the embodiments of the present application provide a method for classifying nodes, a method and a device for model training. For large-scale graphs, training can be performed based on only a part of nodes, and each iteration calculates some nodes in the graph without traversing the graph. Each node greatly reduces the computing overhead and saves computing resources.
- the graph refers to a data structure composed of many nodes connected by each other.
- FIG. 2 is a schematic diagram of a large-scale graph in an embodiment of the present application.
- each node represents a Members
- the connection between two nodes is called a connected edge.
- This connected edge is used to indicate that the two members corresponding to the connected two nodes know each other.
- the members indicated by A1 belong to the same category
- the indicated by A2 The members of A belong to the same category
- the members indicated by A3 belong to the same category
- the members indicated by A4 belong to the same category.
- a node usually refers to a person or an organization, and the connection (edge) between the nodes often represents a certain social relationship (such as kinship or transaction behavior, etc.).
- Each node can be assigned a feature vector, such as the v node's feature vector is h (v), which is used to characterize the node's attributes.
- Each edge may have a weight value a, which is used to describe the tightness of the connection. The larger the weight value a, the closer the association between nodes.
- each side can be directional, used to indicate the directionality of the connection.
- the graph edges in the embodiments of the present application do not define directivity.
- graph node classification that is, classifying node v into category c (v).
- users can be divided into fans of different sports, such as basketball fans, football fans, or volleyball fans, based on the user's portrait and the user's friend relationship.
- FIG. 3 is a schematic structural diagram of the node classification system in the embodiment of the present application.
- Both the node classification method and the model training method are deployed on the server 310
- the server 310 receives a large amount of information sent by the terminal devices 320, it is assumed that one terminal device 320 corresponds to one user, then each user is represented as a node, and the user information (such as social information) is represented as node information.
- the server collects node information corresponding to a large number of nodes, thereby forming a large-scale map.
- the server 310 shown in FIG. 3 may be a single server or a system integrating multiple servers.
- the terminal device 320 includes, but is not limited to, a smart phone, a tablet computer, a personal computer (PC), a notebook computer, and a palmtop computer, which is only an illustration here, and should not be construed as limiting the present application.
- FIG. 4 shows a flow chart of the method of node classification provided by an exemplary embodiment of this application, and the method is applied as shown in FIG. 3
- the server 310 is described as an example. The method includes:
- the node to be classified includes at least one node in the target node set, and the number of nodes to be classified is less than the number of nodes in the target node set.
- the server first needs to obtain the node to be classified, where a node corresponds to a user in the graph, therefore, the node to be classified can also be regarded as a target user that needs to be classified.
- the neighbor node set of the node to be classified from the target node set, where the neighbor node set includes at least one neighbor node, and the neighbor node has an association relationship with the node to be classified;
- At least one neighbor node associated with the node to be classified may be obtained according to the node to be classified, and these neighbor nodes form a set of neighbor nodes.
- the neighbor nodes related to the target user may generally be friends of the target user.
- the neighbor nodes may be nodes that have direct contact with the node to be classified (that is, the node to be classified and the neighbor node have edges), and nodes that have indirect links with the node to be classified, which is not limited here. .
- the neighbor nodes in the set of neighbor nodes of the node to be classified are nodes that have an association relationship with at least two nodes to be classified, or nodes that have an association relationship with n nodes to be classified, where n is a positive integer.
- the neighbor node set when determining the neighbor node set, first calculate the first candidate probability of each candidate neighbor node in the target node set according to the node to be classified, and determine the neighbor node set according to the first candidate probability of each candidate neighbor node.
- the first candidate probability of the uth candidate neighbor node is determined by the edge weight between the node to be classified and the uth candidate neighbor node, and the edge weight of the node in the target node set, u It is a positive integer.
- p (u) represents the first candidate probability of the uth candidate neighbor node
- v j represents the jth node in the target node set
- w represents the node to be classified
- b represents the number of nodes to be classified
- N represents the target
- the neighbor nodes related to the target user may generally be friends of the target user. It is understandable that the neighbor node may be a node that has direct contact with the node to be classified (that is, the node to be classified has neighboring edges with the neighbor node), and may also be a node that has indirect contact with the target node, which is not limited here. .
- the server may enhance the characteristics of the node to be classified through a set of neighbor nodes, so as to obtain target characteristic information corresponding to the node to be classified.
- v 1 represents the node to be classified
- h '(v 1 ) represents the target feature information (that is, feature vector) of the node to be classified
- d represents the number of nodes in the neighbor node set
- n represents an integer from 1 to d
- u n represents the nth neighbor node
- h (u n ) represents the multidimensional feature vector of the nth neighbor node
- W 1 represents the model parameter of the node classification model.
- the neighbor node corresponding to the node to be classified can be multiplied by a model parameter, and then the target feature information is obtained by summing, so that the updated target feature information integrates all its neighbors The characteristics of nodes, and enrich the amount of information.
- the node classification model is a model trained based on at least one sample node subset, the sample node subset is itself, and the nodes in the sample node collection
- the node category is marked.
- the classification result corresponding to the node to be classified is output according to the target feature information.
- the output classification result is that the node to be classified belongs to an abnormal node, or, the output The classification result is that the node to be classified belongs to the basketball enthusiast node.
- the node classification model is obtained by training based on at least one sample node subset, and the number of nodes in each sample node subset is less than the number of nodes in the sample node collection, that is, the node classification model is trained based on a part of the nodes, and Each iteration calculates some nodes until the number of iterations reaches the threshold.
- the classification model refers to outputting the corresponding category after integrating some information with the input data.
- the classification model contains a set of model parameters, which can be optimized and adjusted through training.
- a method for node classification is provided.
- a node to be classified is selected from a set of target nodes, and the node is classified according to the neighbor node set of the node to be classified
- the node characteristics of the classification nodes are determined to obtain the node classification results according to the node characteristics.
- FIG. 5 is a flowchart of a method of model training provided by an exemplary embodiment of the present application. The method is applied to the server 310 shown in FIG. 3
- the method includes:
- the neighbor nodes in the neighbor node subset are related to the target node in the target node subset relationship.
- the sample node set is first obtained from the sample node set corresponding to the large-scale atlas, and then only a part of the nodes (that is, the target node subset) is updated each time, and according to this part of the node (target node subset) )
- a certain number of common related nodes ie, a subset of neighbor nodes are used for information integration.
- FIG. 6 is a schematic diagram of an embodiment for sampling the nodes of the graph in the embodiment of the present application.
- a sample node sub-collection including 10 nodes is obtained from the sample node set , Where the sample node subset includes 2 target nodes (ie, node M and node N in the figure), and obtains 8 neighbor nodes (ie, wave dot pattern nodes in the figure) associated with the target node according to the target node.
- the white node belongs to the node in the sample node set, it is not selected as the object of this training.
- the neighbor nodes related to the target user can usually be friends of the target user.
- the neighbor node may be a node that has a direct connection with the target node (that is, the target node has an edge with the neighbor node), and a node that has an indirect connection with the target node, which is not limited herein.
- the method for obtaining the target node subset and the neighbor node subset includes: obtaining the target node subset from the sample node set, and calculating the second candidate probability of each candidate neighbor node in the sample node set according to the target node subset ; Determine a subset of neighbor nodes according to the second candidate probability of the neighbor node to be selected.
- the u-th neighbor node to be selected is determined.
- the second candidate probability, u is a positive integer.
- the selection method of the neighbor node to be selected please refer to Formula 1 shown in step 102 above, which will not be repeated in this embodiment.
- the probability of each neighbor node to be selected in the sample node set needs to be calculated according to the target node subset, and then the neighbor node to be selected with a higher probability is selected as the neighbor node Child collection.
- the selected target node is not repeated, but the selected neighbor node may be repeated.
- the selected neighbor node may be repeated.
- the sampled target node may have few shared neighbors.
- the non-shared neighbor node can be used as a neighbor node.
- the server uses the neighbor node subset to update the characteristics of each target node in the target node subset according to the target node subset and neighbor node subset, and each target node corresponds to a node feature vector, then each The node feature vector of the target node constitutes a node feature subset.
- the server after the server obtains the target node subset and the node feature subset, it can calculate the predicted category probability of each target node and the prediction of each target node according to each target node and its corresponding node feature vector Category probabilities constitute a subset of predicted category probabilities.
- the predicted category probability is used to indicate the probability that the target node is predicted to be a certain category. Therefore, when predicting, the category with the highest probability is usually selected as the category of the target node.
- the server adjusts the target model parameters of the node classification model according to the difference between the predicted category probability subset and the target node category.
- the target model parameter may correspond to an initialization value, that is, a parameter defined in advance, or the target model parameter may also be a model parameter calculated for the node sampled last time. This is equivalent to using the predicted category probability of each target node to optimize the first target model parameter, so as to obtain the target model parameter corresponding to the actual model.
- the server uses the second model parameters obtained by this training to train to obtain the node classification model.
- a model training method based on large-scale atlas is provided.
- the server obtains the sample node subset from the sample node set, and then determines the node feature vector subset according to the sample node subset.
- the node subset and the node characteristic vector subset determine the predicted category probability subset, where the predicted category probability subset includes at least one predicted category probability, the predicted category probability has a corresponding relationship with the target node, and finally according to the predicted category probability subset and the first
- the model parameters determine the second model parameters, and train the node classification model according to the second model parameters.
- training can be performed based on only a part of the nodes. Each iteration calculates some nodes in the graph without traversing each node in the graph, which greatly reduces the computational overhead and saves computing resources.
- a node feature subset of the target node subset is extracted by a node classification model in combination with a neighbor node subset , That is, the above step 202 may further include the following steps 2021 to 2022, please refer to FIG. 7:
- the target model parameters are model parameters to be trained.
- the target model parameters and the neighbor node subset when calculating the node feature subset, for the i-th target node, according to the edge weight between the i-th target node and the neighbor node, the neighbor node's feature vector
- the target model parameters determine the node characteristics of the i-th target node, i is a positive integer, schematic, the calculation process please refer to the following formula three:
- w i represents the i-th target node
- h '(w i ) represents the node characteristic of the i-th target node
- b represents the number of nodes of the target node subset
- i represents an integer from 1 to b
- u j represents The jth neighbor node
- h (u j ) represents the feature vector of the jth neighbor node
- the target of the node classification model is based on the predicted category probability subset and the target node category of the target node
- the model parameters are trained, that is, the above step 204 may further include the following steps 2041 to 2043, please refer to FIG. 8:
- the target loss value is a value calculated by a loss function or a cost function, where either the loss function or the cost function can measure the degree of prediction error.
- the loss function or the cost function can measure the degree of prediction error.
- L is the target loss value
- b is the number of nodes in the target node subset
- i is an integer from 1 to b
- w i is the i-th target node
- h 1 (w i ) is the i-th target node ’s n-dimensional feature vector
- exp () represents the exponential function
- c (w i ) represents the real category information of the i-th target node
- the probability that the k-th predicted category of the i-th target node is equal to the probability of real category information.
- the category with the highest probability is selected as the category of the target node, namely:
- c '(w i) represents the i-th category prediction target node.
- L is the target loss
- n represents the dimension of the i-th target node feature vector
- b represents the number of nodes of the target node subset
- i represents an integer from 1 to b
- j represents from 1 to b
- Integer Represents the k-dimensional feature vector of the i-th target node
- c (w i ) represents the real category information of the i-th target node
- ⁇ () represents the judgment function
- u j represents the j-th neighbor node
- h (u j ) represents The eigenvector of the j-th neighbor node
- exp () represents the exponential function.
- the derivation process of the model parameter gradient is as follows:
- the product of the gradient of the model parameter and the preset learning rate is used as the adjustment difference for the target model parameter, and the adjustment difference is used to adjust the target model parameter.
- W represents the adjusted target model parameters
- W ' represents the model parameters before adjustment
- ⁇ represents the preset learning rate
- the learning rate is mainly used to control the learning progress of the model.
- the appropriate learning rate can be selected according to the size of the data set.
- the square error sum is used as a cost function, as the amount of data increases, the learning rate should be set to a correspondingly smaller Value.
- the learning rate can be larger. When approaching, the learning rate is smaller.
- the learning rate can be increased. If the error rate is increased relative to the previous iteration, then The value of the previous iteration should be reset and the learning rate should be reduced to 50%. Therefore, this is a method of adaptively adjusting the learning rate.
- the learning rate ⁇ in this embodiment can be set to 0.01. In practical applications, the learning rate ⁇ can also be set to 0.00001, 0.0001, 0.001, 0.003, 0.03, 0.1, 0.3, 1, 3 or 10, etc. , Not limited here.
- an embodiment of the present application provides a method for training the target model parameters, that is, the server calculates the target loss according to the predicted category probability subset and the true category information subset, and then determines the model parameter gradient according to the target loss, Finally, the model parameters are adjusted in combination with the model parameter gradient.
- stochastic gradient descent is used to iteratively update the parameters through each sample, but the optimal solution can be found without traversing the entire sample, thereby greatly improving the convergence speed of the algorithm.
- the loss function of each sample is minimized
- the result of each iteration is not necessarily the global optimal solution, it always develops in this direction, so the final result is always close to the global optimal solution.
- FIG. 9 is a schematic diagram of a process of model training for graph nodes in an embodiment of the present application. As shown in the figure, specifically:
- step S1 the process of model training is entered.
- the iteration counter records how many iterations the program has performed.
- step S3 it is determined whether the number of iterations recorded by the iteration counter is greater than 10 6. If the number of iterations is greater than 10 6 , jump to step S12. On the contrary, if the number of iterations is less than or 10 6 , then proceed to step S4, which begins Round iterative calculation.
- step S4 128 target nodes, that is, w 1 , w 2 , ..., w b , are selected from the sample node set.
- step S5 128 neighbor nodes u 1 , u 2 , ..., u b are collected using the above formula one.
- step S6 the feature vectors h 1 (w 1 ), ..., h 1 (w b ) of 128 target nodes are collected using the above formula three.
- step S7 the predicted category of the target node is obtained using the above formula 5 as
- step S8 the loss values of the predicted category and the real category are compared according to the cross entropy.
- step S9 the gradient of the model parameters is calculated using the above formula 7.
- step S10 the model parameters are updated using the above formula 8.
- step S12 t is larger than satisfied 6:10, the model parameters can be output.
- step S13 the model training process is ended.
- corrects represents the number of correct nodes for category prediction in the test set, and acc represents the prediction ability. The greater the acc, the stronger the prediction ability.
- the running speed can also be evaluated by comparing each training time of the model.
- the prediction accuracy corresponding to the method adopted in the embodiments of the present application is very close to GCN, but the running time is greatly reduced. Therefore, the hypothesis algorithm based on sampling makes each node feature update only need to be performed on a small subset of nodes, which greatly improves the actual running speed and reduces the memory. On real social network data, the prediction accuracy obtained is close to GCN, but the speed is increased by about 50 times.
- FIG. 10 is a structural block diagram of an apparatus for classifying nodes according to an exemplary embodiment of the present application. Taking this apparatus as an example in a computer device, as shown in FIG. 10, the apparatus includes: an acquisition module 1010, an extraction module 1020, and a determination Module 1030;
- the obtaining module 1010 is used to obtain the nodes to be classified in the target node set;
- the acquiring module 1010 is further configured to acquire a neighbor node set of the node to be classified from the target node set, wherein the neighbor node set includes at least one neighbor node, the neighbor node and the to-be-classified There is an association relationship between the nodes;
- An extraction module 1020 configured to extract target feature information of the node to be classified according to the neighbor node set through a node classification model
- the determining module 1030 is configured to determine the classification result of the node to be classified according to the target feature information, wherein the node classification model is a model trained by at least one sample node subset, and the sample node subset is a sample A subset of the node set, the nodes in the sample node set are marked with node categories.
- the node classification model is a model trained by at least one sample node subset, and the sample node subset is a sample A subset of the node set, the nodes in the sample node set are marked with node categories.
- the obtaining module 1010 is further configured to calculate the first candidate probability of each candidate neighbor node in the target node set according to the node to be classified; according to each candidate neighbor node The first candidate probability determines the set of neighbor nodes.
- the obtaining module 1010 is further configured to pass the edge weight between the node to be classified and the uth candidate neighbor node, and the node to be classified and the target The edge weights of the nodes in the node set determine the first candidate probability of the u-th neighbor node to be selected, and u is a positive integer.
- the device for node classification selects the node to be classified from the set of target nodes and updates the node according to the set of neighbor nodes of the node
- the node characteristics of the classification nodes are determined to obtain the node classification results according to the node characteristics.
- FIG. 11 is a structural block diagram of an apparatus for model training provided by an exemplary embodiment of the present application. Taking this apparatus as an example in a computer device, as shown in FIG. 11, the apparatus includes: an acquisition module 1110, an extraction module 1120, and prediction Module 1130 and training module 1140;
- the obtaining module 1110 is configured to obtain a target node subset and a neighbor node subset corresponding to the target node subset from the sample node set marked with the target node category, the neighbor node in the neighbor node subset and the The target nodes in the target node subset have an association relationship;
- An extraction module 1120 configured to extract a node feature subset of the target node subset through a node classification model in combination with the neighbor node subset, wherein the node feature subset includes the node feature vector of the target node;
- the prediction module 1130 is configured to perform category prediction on the target node according to the node feature subset to obtain a predicted category probability subset
- the training module 1140 is configured to train the target model parameters of the node classification model according to the predicted category probability subset and the target node category of the target node.
- the obtaining module 1110 is further configured to obtain the target node subset from the sample node set; calculate the neighbors to be selected from the sample node set according to the target node subset The second candidate probability of the node; determining the subset of neighbor nodes according to the second candidate probability of the neighbor node to be selected.
- the obtaining module 1110 is further configured to pass the edge weight between the target node in the subset of target nodes and the u-th neighbor node to be selected, and all The edge weight of the target node and the nodes in the sample node set determines the second candidate probability of the u-th neighbor node to be selected, u is a positive integer.
- the extraction module 1120 is further configured to determine target model parameters of the node classification model, where the target model parameters are model parameters to be trained; according to the target model parameters and The neighbor node subset extracts the node feature subset of the target node subset.
- the extraction module 1120 is further configured to calculate the edge weight between the i-th target node and the neighbor node, the feature vector of the neighbor node, and the target model The parameter determines the node characteristics of the i-th target node, i is a positive integer.
- the training module 1140 is further configured to determine a target loss value according to the predicted category probability subset and the target node class of the target node; determine the target loss value according to the target loss value The model parameter gradient; training the target model parameter according to the model parameter gradient.
- the training module 1140 is further configured to use the product of the model parameter gradient and a preset learning rate as the adjustment difference value of the target model parameter; the adjustment difference value pair The target model parameters are adjusted.
- the device for model training selects the node to be classified from the set of target nodes and updates the node according to the set of neighbor nodes of the node
- the node characteristics of the classification nodes are determined to obtain the node classification results according to the node characteristics.
- FIG. 12 shows a schematic structural diagram of a server provided by an exemplary embodiment of the present application.
- the server may be the server 310 shown in FIG. 3. Specifically:
- the server 1200 includes a central processing unit (CPU, Central Processing Unit) 1201, a system memory 1204 including a random access memory (RAM, Random Access Memory) 1202 and a read only memory (ROM, Read Only Memory) 1203, and a system memory 1204 connected to the system And the system bus 1205 of the central processing unit 1201.
- the server 1200 also includes a basic input / output system (I / O system, Input) System 1206 that helps to transfer information between various devices in the computer, and a large storage for operating system 1213, application programs 1214, and other program modules 1215.
- the basic input / output system 1206 includes a display 1208 for displaying information and an input device 1209 for a user to input information, such as a mouse and a keyboard. Both the display 1208 and the input device 1209 are connected to the central processing unit 1201 via an input-output controller 1210 connected to the system bus 1205.
- the basic input / output system 1206 may also include an input-output controller 1210 for receiving and processing input from a number of other devices such as a keyboard, mouse, or electronic stylus. Similarly, the input output controller 1210 also provides output to a display screen, printer, or other type of output device.
- the mass storage device 1207 is connected to the central processing unit 1201 through a mass storage controller (not shown) connected to the system bus 1205.
- the mass storage device 1207 and its associated computer-readable medium provide non-volatile storage for the server 1200. That is, the mass storage device 1207 may include a computer-readable medium (not shown) such as a hard disk or a compact disc read-only memory (CD-ROM, Compact Disc Read Only Memory) drive.
- a computer-readable medium such as a hard disk or a compact disc read-only memory (CD-ROM, Compact Disc Read Only Memory) drive.
- Computer-readable media may include computer storage media and communication media.
- Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
- Computer storage media include RAM, ROM, erasable programmable read-only memory (EPROM, Erasable Programmable Read Only), live erasable programmable read-only memory (EEPROM, Electrically Erasable Programmable Read Only Only Memory), flash memory or other solid-state storage Its technology, CD-ROM, digital versatile disc (DVD, Digital, Versatile, Disc) or other optical storage, tape cassette, magnetic tape, magnetic disk storage or other magnetic storage devices.
- the above-mentioned system memory 1204 and mass storage device 1207 may be collectively referred to as a memory.
- the server 1200 may also be operated by a remote computer connected to the network through a network such as the Internet. That is, the server 1200 may be connected to the network 1212 through the network interface unit 1211 connected to the system bus 1205, or the network interface unit 1211 may also be used to connect to other types of networks or remote computer systems (not shown).
- the above memory also includes one or more programs.
- One or more programs are stored in the memory and configured to be executed by the CPU.
- An embodiment of the present application further provides a computer device.
- the computing mobile phone device includes a processor and a memory, and the memory stores at least one instruction, at least one program, code set, or instruction set, at least one instruction, at least one program,
- the code set or the instruction set is loaded and executed by the processor to implement the node classification method or model training method provided by the foregoing method embodiments.
- Embodiments of the present application also provide a computer-readable storage medium, which stores at least one instruction, at least one program, code set, or instruction set, at least one instruction, at least one program, code set, or The instruction set is loaded and executed by the processor to implement the node classification method or model training method provided by the foregoing method embodiments.
- the program may be stored in a computer-readable storage medium.
- the mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Algebra (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Discrete Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
一种节点分类方法、模型训练方法、装置、设备及存储介质,涉及互联网技术领域,该方法包括:获取目标节点集合中的待分类节点(101);从目标节点集合中获取所述待分类节点的邻居节点集合,其中,邻居节点集合包括至少一个邻居节点,邻居节点与待分类节点之间具有关联关系(102);通过节点分类模型根据所述邻居节点集合提取所述待分类节点的目标特征信息(103);根据目标特征信息确定所述待分类节点的分类结果,其中,节点分类模型为根据至少一个样本节点子集合训练得到的模型,该样本节点子集合为样本节点集合的子集,样本节点集合中的节点标注有节点类别(104)。对于大规模图谱而言,该方法可以仅基于一部分节点进行训练,每一次迭代计算图谱中的部分节点,无需遍历图谱中的每个节点,大幅地降低了计算开销,且节省计算资源。
Description
本申请要求于2018年11月15日提交的申请号为201811361409.0、发明名称为“一种节点分类的方法、模型训练的方法及装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
本申请涉及互联网技术领域,尤其涉及一种节点分类方法、模型训练方法、装置、设备及存储介质。
随着机器学习技术的发展,图片分类任务得到了瞩目的进展,相关技术已经应用到自动驾驶、安防以及游戏等诸多场景。与图片类似,图谱也是一种常见的数据,例如社交网络、知识图谱以及药物分子结构等。
相关技术中,针对大规模图谱,对图谱节点信息整合需要设置专门的迭代计算。请参阅图1,图1为相关技术中面向图谱节点分类的图卷积神经网络(graph convolutional networks,GCN)分类模型一个示意图,如图所示,v表示节点,a表示连接两个节点的边,对输入层的每个节点进行特征向量的计算,就可以更新整个图谱信息,得到第一层信息,采用softmax函数对第一层信息中的每个节点预测节点类别。
对于大规模图谱而言,往往含有千万级甚至亿级以上的节点数,以及含有数十亿级以上的连边。然而,为了更新每个节点的特征信息,GCN需要在一次特征信息计算中遍历每个节点,导致计算代价过高,计算资源消耗过大。
发明内容
本申请实施例提供了一种节点分类方法、模型训练方法、装置、设备及存储介质,可以解决GCN需要在一次特征信息计算中遍历每个节点,导致计算代价过高,计算资源消耗过大的问题。
一方面,提供了一种节点分类的方法,应用于计算机设备中,所述方法包括:
获取目标节点集合中的待分类节点,所述待分类节点中包括所述目标节点集合中的至少一个节点,且所述待分类节点的数量小于所述目标节点集合中节点的数量;
从所述目标节点集合中获取所述待分类节点的邻居节点集合,其中,所述邻居节点集合中包括至少一个邻居节点,所述邻居节点与所述待分类节点之间具有关联关系;
通过节点分类模型根据所述邻居节点集合提取所述待分类节点的目标特征信息;
根据所述目标特征信息确定所述待分类节点的分类结果,其中,所述节点分类模型为通过至少一个样本节点子集合训练得到的模型,所述样本节点子集合为样本节点集合的子集,所述样本节点集合中的节点标注有节点类别。
另一方面,提供了一种模型训练的方法,应用于计算机设备中,所述方法包括:
从标注有目标节点类别的样本节点集合中获取目标节点子集合和与所述目标节点子集合对应的邻居节点子集合,所述邻居节点子集合中的邻居节点与所述目标节点子集合中的目标节点具有关联关系;
结合所述邻居节点子集合通过节点分类模型提取所述目标节点子集合的节点特征子集合,其中,所述节点特征子集合中包括所述目标节点的节点特征向量;
根据所述节点特征子集合对所述目标节点进行类别预测,得到预测类别概率子集合;
根据所述预测类别概率子集合和所述目标节点的所述目标节点类别对所述节点分类模型的目标模型参数进行训练。
另一方面,提供了一种节点分类的装置,应用于计算机设备中,所述装置包括:
获取模块,用于获取目标节点集合中的待分类节点;
所述获取模块,还用于从所述目标节点集合中获取所述待分类节点的邻居节点集合,其中,所述邻居节点集合中包括至少一个邻居节点,所述邻居节点与所述待分类节点之间具有关联关系;
提取模块,用于通过节点分类模型根据所述邻居节点集合提取所述待分类节点的目标特征信息;
确定模块,用于根据所述目标特征信息确定所述待分类节点的分类结果,其中,所述节点分类模型为通过至少一个样本节点子集合训练得到的模型,所述样本节点子集合为样本节点集合的子集,所述样本节点集合中的节点标注有节点类别。
另一方面,提供了一种模型训练的装置,应用于计算机设备中,所述装置包括:
获取模块,用于从标注有目标节点类别的样本节点集合中获取目标节点子集合和与所述目标节点子集合对应的邻居节点子集合,所述邻居节点子集合中的邻居节点与所述目标节点子集合中的目标节点具有关联关系;
提取模块,用于结合所述邻居节点子集合通过节点分类模型提取所述目标节点子集合的节点特征子集合,其中,所述节点特征子集合中包括所述目标节点的节点特征向量;
预测模块,用于根据所述节点特征子集合对所述目标节点进行类别预测,得到预测类别概率子集合;
训练模块,用于根据所述预测类别概率子集合和所述目标节点的所述目标节点类别对所述节点分类模型的目标模型参数进行训练。
另一方面,提供了一种计算机设备,所述计算机设备包括处理器和存储器,所述存储器中存储有至少一条指令、至少一段程序、代码集或指令集,所述至少一条指令、所述至少一段程序、所述代码集或指令集由所述处理器加载并执行以实现如上述本申请实施例中任一所述的节点分类的方法或模型训练的方法。
另一方面,提供了一种计算机可读存储介质,所述存储介质中存储有至少一条指令、至少一段程序、代码集或指令集,所述至少一条指令、所述至少一段程序、所述代码集或指令集由所述处理器加载并执行以实现如上述本申请实施例中任一所述的节点分类的方法或模型训练的方法。
另一方面,提供了一种计算机程序产品,当所述计算机程序产品在计算机上运行时,使得计算机执行如上述本申请实施例中任一所述的节点分类的方法或模型训练的方法。
从以上技术方案可以看出,本申请实施例具有以下优点:
本申请实施例中,在节点分类过程中对节点进行特征更新时,从目标节点 集合中选择待分类节点,并根据该待分类节点的邻居节点集合对该待分类节点的节点特征进行确定,从而根据节点特征获取节点分类结果,通过上述方式,对于大规模图谱而言,每一次迭代计算图谱中的部分节点,无需遍历图谱中的每个节点,大幅地降低了计算开销,且节省计算资源。
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1为相关技术中面向图谱节点分类的图卷积神经网络分类模型一个示意图;
图2为本申请实施例中大规模图谱的一个示意图;
图3为本申请实施例中节点分类系统的一个架构示意图;
图4为本申请实施例中节点分类方法的实施例示意图;
图5为本申请实施例中模型训练方法的实施例示意图;
图6为本申请实施例中对图谱的节点进行采样的一个实施例示意图;
图7为本申请实施例中对图谱节点进行模型训练的一个流程示意图;
图8为本申请实施例中根据预测类别概率子集合和目标节点的目标节点类别对节点分类模型的目标模型参数进行训练的流程图;
图9为本申请实施例中对图谱节点进行模型训练的一个流程示意图;
图10是本申请一个示例性实施例提供的节点分类的装置的结构框图;
图11是本申请一个示例性实施例提供的模型训练的装置的结构框图;
图12为本申请实施例中服务器一个结构示意图。
本申请实施例提供了一种节点分类的方法、模型训练的方法及装置,对于大规模图谱而言,可以仅基于一部分节点进行训练,每一次迭代计算图谱中的部分节点,无需遍历图谱中的每个节点,大幅地降低了计算开销,且节省计算资源。
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”、“第 三”、“第四”等(如果存在)是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便这里描述的本申请的实施例例如能够以除了在这里图示或描述的那些以外的顺序实施。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元的过程、方法、系统、产品或设备不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。
应理解,本申请实施例主要应用于大规模图谱,可以面向大规模图谱节点进行分类和学习。图谱是指由许多节点通过相互之间的连接而组成的一种数据结构,请参阅图2,图2为本申请实施例中大规模图谱的一个示意图,如图所示,每个节点代表一个成员,两个节点之间的连线称为连边,该连边用于表示被连接的两个节点对应的两个成员相互认识,其中,A1所指示的成员属于同一个分类,A2所指示的成员属于同一个分类,A3所指示的成员属于同一个分类,A4所指示的成员属于同一个分类。
可选地,节点通常是指人或组织,节点之间的连接(边)往往表示某种社会关系(如亲属关系或者交易行为等)。每个节点可以被赋予一个特征向量,如v节点的特征向量为h(v),用于刻画该节点的属性。每个边可以带有权重值a,用于刻画连接的紧密程度,权重值a越大,表示节点之间的关联越紧密。其中,每个边可以是带有方向的,用于表示连接的方向性。可选地,本申请实施例中的图谱连边未限定方向性。
在实际应用中,为了实现用户推荐、好友分类以及网络安防系统监控等功能,还需要对图谱中的节点之间的分类。即根据节点的特征向量以及节点之间的连接关系,把相似的节点分到同一类别的任务称为图节点分类,即将节点v分到类别c(v)。例如,在社交类应用程序(一种大规模的图谱)中,可以根据用户画像以及用户的好友关系把用户划分为不同运动的爱好者,如篮球爱好者、足球爱好者或者排球爱好者等。
可以理解的是,本申请应用广泛,除了上述提及的几类场景以外,还可以应用于捕捉支付网络中黑产用户(如传销和赌博等),也可以应用于好友归类、关系挖掘以及商品推荐等场景,此处不做限定。
应理解,本申请实施例所介绍的节点分类系统如图3所示,图3为本申请 实施例中节点分类系统的一个架构示意图,将节点分类的方法和模型训练的方法均部署于服务器310中,而服务器310接收大量终端设备320发送的信息,假设一个终端设备320对应于一个用户,那么每个用户即表示为一个节点,该用户的信息(如社交信息)表示为节点信息。服务器采集大量的节点所对应的节点信息,从而形成一个大规模图谱。可以理解的是,图3所示的服务器310可以是一台服务器,也可以是集成了多台服务器的系统。而终端设备320包含但不仅限于智能手机、平板电脑、个人电脑(personal computer,PC)、笔记本电脑以及掌上电脑,此处仅为一个示意,不应理解为对本申请的限定。
结合上述说明,对本申请中节点分类的方法进行介绍,请参阅图4,其示出了本申请一个示例性实施例提供的节点分类的方法的流程图,以该方法应用于如图3所示的服务器310中为例进行说明,该方法包括:
101、获取目标节点集合中的待分类节点。
可选地,该待分类节点中包括目标节点集合中的至少一个节点,且待分类节点的数量小于目标节点集合中节点的数量。
可选地,从目标节点集合中选择部分节点作为待分类节点。
本实施例中,服务器首先需要获取待分类的节点,其中,一个节点对应图谱中的一个用户,因此,该待分类节点也可以认为是一个需要被划分类型的目标用户。
102、从目标节点集合中获取待分类节点的邻居节点集合,其中,邻居节点集合包括至少一个邻居节点,邻居节点与待分类节点之间具有关联关系;
本实施例中,在服务器获取到待分类节点之后,即可根据该待分类节点获取与之关联的至少一个邻居节点,这些邻居节点组成一个邻居节点集合。以待分类节点为目标用户为例,与该目标用户相关的邻居节点通常可以是该目标用户的好友。
可以理解的是,邻居节点除了可以是与待分类节点具有直接联系(即待分类节点与邻居节点具有边)的节点以外,还可以是与待分类节点具有间接联系的节点,此处不做限定。
可选地,该待分类节点的邻居节点集合中的邻居节点,为与至少两个待分类节点存在关联关系的节点,或,与n个待分类节点存在关联关系的节点,n为正整数。
可选地,确定邻居节点集合时,首先根据待分类节点计算目标节点集合中每个候选邻居节点的第一候选概率,并根据每个候选邻居节点的第一候选概率确定邻居节点集合。其中,通过待分类节点与第u个候选邻居节点之间的连边权重,以及待分类节点与目标节点集合中的节点的连边权重,确定第u个候选邻居节点的第一候选概率,u为正整数。
可选地,该候选邻居节点的第一候选概率计算方式请参考如下公式一:
其中,p(u)表示第u个候选邻居节点的第一候选概率,v
j表示目标节点集合中第j个节点,w表示待分类节点,b表示待分类节点的节点个数,N表示目标节点集合的节点个数,i表示从1至b的整数,j表示从1至N的整数,
表示第i个待分类节点与第u个候选邻居节点之间的连边权重,
表示第i个待分类节点与目标节点集合中第j个节点之间的连边权重。
本实施例中,首先从大规模图谱所对应的目标节点集合中获取待分类节点,然后每次只对该部分的待分类节点进行更新,同时根据这部分待分类节点采集出特定个数的共同关联节点(即邻居节点集合)进行信息整合。
以待分类节点为目标用户为例,与该目标用户相关的邻居节点通常可以是该目标用户的好友。可以理解的是,邻居节点除了可以是与待分类节点具有直接联系(即待分类节点与邻居节点具有连边)的节点以外,还可以是与目标节点具有间接联系的节点,此处不做限定。
103、通过节点分类模型根据邻居节点集合提取待分类节点的目标特征信息;
本实施例中,服务器可以通过邻居节点集合来强化待分类节点的特征,从而得到该待分类节点所对应的目标特征信息。
具体地,可以采用如下公式二得到待分类节点的目标特征信息:
其中,v
1表示待分类节点,h'(v
1)表示待分类节点的目标特征信息(即特征向量),d表示邻居节点集合中的节点个数,n表示从1至d的整数,u
n表示 第n个邻居节点,h(u
n)表示第n个邻居节点的多维特征向量,
表示待分类节点和第n个邻居节点的连边权重,W
1表示节点分类模型的模型参数。
由此可见,为了更新待分类节点的目标特征信息,可以对该待分类节点所对应的邻居节点乘一个模型参数,再求和得到目标特征信息,从而使得更新后的目标特征信息整合其所有邻居节点的特征,并且丰富信息量。
104、根据目标特征信息确定待分类节点的分类结果,其中,节点分类模型为根据至少一个样本节点子集合训练得到的模型,该样本节点子集合为样本节点集合的自己,样本节点集合中的节点标注有节点类别。
本实施例中,节点分类模型计算得到的目标特征信息后,根据该目标特征信息输出该待分类节点所对应的分类结果,如,输出的分类结果为待分类节点属于异常节点,或,输出的分类结果为待分类节点属于篮球爱好者节点。
其中,节点分类模型为根据至少一个样本节点子集合训练得到的,每个样本节点子集合的节点个数小于样本节点集合的节点个数,也就是说,节点分类模型基于一部分节点进行训练,且每一次迭代计算部分节点,直到迭代次数达到门限值。在机器学习中,分类模型是指对输入数据进行某种信息整合后输出相应类别。分类模型包含一组模型参数,这些模型参数可以通过训练进行优化调整。
需要说明的是,在实际应用中,机器学习方法往往需要通过多次迭代来更新每个节点的模型参数,而每次迭代仅针对从目标节点集合中采样出来的待分类节点。尽管每次迭代只用到小部分数据,但是经过多次迭代后(每次迭代采样得到的待分类节点不一样),就能遍历目标节点集合中所有的节点。
本申请实施例中,提供了一种节点分类的方法,在节点分类过程中对节点进行特征更新时,从目标节点集合中选择待分类节点,并根据该待分类节点的邻居节点集合对该待分类节点的节点特征进行确定,从而根据节点特征获取节点分类结果,通过上述方式,对于大规模图谱而言,每一次迭代计算图谱中的部分节点,无需遍历图谱中的每个节点,大幅地降低了计算开销,且节省计算资源。
针对上述节点分类模型,对该节点分类模型的训练过程进行说明,图5是本申请一个示例性实施例提供的模型训练的方法的流程图,以该方法应用于如图3所示的服务器310中为例进行说明,请参阅图5,该方法包括:
201、从标注有目标节点类别的样本节点集合中获取目标节点子集合和与目标节点子集合对应的邻居节点子集合,邻居节点子集合中的邻居节点与目标节点子集合中的目标节点具有关联关系。
本实施例中,首先从大规模图谱所对应的样本节点集合中获取样本节点集合,然后每次只对部分的节点(即目标节点子集合)进行更新,同时根据这部分节点(目标节点子集合)采用出特定个数的共同关联节点(即邻居节点子集合)进行信息整合。
为了便于理解,请参阅图6,图6为本申请实施例中对图谱的节点进行采样的一个实施例示意图,如图所示,假设从样本节点集合中获取包括10个节点的样本节点子集合,其中,样本节点子集合包括2个目标节点(即图中的节点M和节点N),并根据目标节点获取与之关联的8个邻居节点(即图中波点图案节点),图6中的白色节点虽然属于样本节点集合中的节点,但是并未选取作为本次训练的对象。
以目标节点为目标用户为例,与该目标用户相关的邻居节点通常可以是该目标用户的好友。可以理解的是,邻居节点除了可以是与目标节点具有直接联系(即目标节点与邻居节点具有边)的节点以外,还可以是与目标节点具有间接联系的节点,此处不做限定。
可选地,该获取目标节点子集合和邻居节点子集合的方式包括:从样本节点集合中获取目标节点子集合,根据目标节点子集合计算样本节点集合中每个候选邻居节点的第二候选概率;根据待选择邻居节点的第二候选概率确定邻居节点子集合。其中,通过目标节点子集合中的目标节点与第u个待选择邻居节点之间的连边权重,以及目标节点与样本节点集合中的节点的连边权重,确定第u个待选择邻居节点的第二候选概率,u为正整数。可选地,该待选择邻居节点的选择方式请参考如上步骤102中示出的公式一,本实施例中不再赘述。
可选地,与目标节点关联的节点可能较多,这个时候,需要根据目标节点子集合计算样本节点集合中每个待选择邻居节点的概率,然后选择概率较大的待选择邻居节点作为邻居节点子集合。
需要说明的是,每次训练的时候,所选的目标节点不重复,但选择的邻居节点可能会重复。在对目标节点进行采样的时候是随机选择的,在极端条件下可能会导致采样出来的目标节点的共有邻居很少,此时,可以将非共有邻居节点作为邻居节点。
202、结合邻居节点子集合通过节点分类模型提取目标节点子集合的节点特征子集合,其中,节点特征子集合中包括目标节点的节点特征向量。
本实施例中,服务器根据目标节点子集合以及邻居节点子集合,利用邻居节点子集合来更新目标节点子集合中每个目标节点的特征,且每个目标节点对应于一个节点特征向量,那么各个目标节点的节点特征向量构成节点特征子集合。
203、根据节点特征子集合对目标节点进行类别预测,得到预测类别概率子集合。
本实施例中,在服务器获取到目标节点子集合以及节点特征子集合之后,即可根据各个目标节点及其对应的节点特征向量,计算得到每个目标节点的预测类别概率,各个目标节点的预测类别概率构成一个预测类别概率子集合。
其中,预测类别概率用于表示目标节点预测为某一个类别的概率,于是,在预测的时候,通常选择概率最大的类别作为该目标节点的类别。
204、根据预测类别概率子集合和目标节点的目标节点类别对节点分类模型的目标模型参数进行训练。
本实施例中,服务器根据预测类别概率子集合与目标节点类别之间的差异对该节点分类模型的目标模型参数进行调整。
其中,该目标模型参数可以对应有初始化数值,也就是预先定义的一个参数,或,该目标模型参数也可以是对前一次采样的节点计算得到的模型参数。这里相当于利用各个目标节点的预测类别概率对第目标模型参数进行优化,从而得到更趋近于实际模型所对应的目标模型参数。
本实施例中,最后,服务器利用本次训练得到的第二模型参数来训练得到节点分类模型。
需要说明的是,在实际应用中,机器学习方法往往需要通过多次迭代来训练模型参数,而每次迭代只需要用到采样出来的较小节点集合。尽管每次迭代只用到小部分数据,但是经过多次迭代后(每次迭代用到的数据不一样,即每次迭代训练用采样出的目标节点是不一样的,邻居节点也不完全一样),就能遍历图谱中所有的节点。
本申请实施例中,提供了一种基于大规模图谱的模型训练方法,首先,服务器从样本节点集合中获取样本节点子集合,然后根据样本节点子集合确定节点特性向量子集合,服务器再根据目标节点子集合以及节点特性向量子集合确 定预测类别概率子集合,其中,预测类别概率子集合包括至少一个预测类别概率,预测类别概率与目标节点具有对应关系,最后根据预测类别概率子集合以及第一模型参数确定第二模型参数,并根据第二模型参数训练得到节点分类模型。通过上述方式,对于大规模图谱而言,可以仅基于一部分节点进行训练,每一次迭代计算图谱中的部分节点,无需遍历图谱中的每个节点,大幅地降低了计算开销,且节省计算资源。
可选地,在上述图5对应的实施例的基础上,本申请实施例提供模型训练方法的可选实施例中,结合邻居节点子集合通过节点分类模型提取目标节点子集合的节点特征子集合,也即上述步骤202还可以包括如下步骤2021至步骤2022,请参考图7:
2021、确定节点分类模型的目标模型参数,该目标模型参数为待训练的模型参数。
2022、根据目标模型参数和邻居节点子集合提取目标节点子集合的节点特征子集合。
可选地,根据目标模型参数以及邻居节点子集合,计算节点特征子集合时,针对第i个目标节点,根据该第i个目标节点和邻居节点之间的连边权重、邻居节点的特征向量以及目标模型参数,确定第i个目标节点的节点特征,i为正整数,示意性的,该计算过程请参考如下公式三:
其中,w
i表示第i个目标节点,h'(w
i)表示第i个目标节点的节点特征,b表示目标节点子集合的节点个数,i表示从1至b的整数,u
j表示第j个邻居节点,h(u
j)表示第j个邻居节点的特征向量,
表示第i个目标节点和第j个邻居节点的连边权重,W'表示目标模型参数。
可选地,在上述图5对应的实施例的基础上,本申请实施例提供模型训练方法的可选实施例中,根据预测类别概率子集合和目标节点的目标节点类别对节点分类模型的目标模型参数进行训练,也即上述步骤204还可以包括如下步骤2041至步骤2043,请参考图8:
2041、根据预测类别概率子集合和目标节点的目标节点类别确定目标损失值。
可选地,该目标损失值为通过损失函数或代价函数计算得到的数值,其中,损失函数或代价函数都可以来度量预测错误程度。也就是说,评价一个算法是否是比较好的算法,需要提前定义一个损失函数,来判断这个算法是否是最优的,而后面不断的优化求梯度下降,使得损失函数最小。
可选地,该目标损失值的计算方式请参考如下公式四:
其中,L表示目标损失值,b表示目标节点子集合的节点个数,i表示从1至b的整数,w
i表示第i个目标节点,h
1(w
i)表示第i个目标节点的n维特征向量,
表示第i个目标节点的第k维特征向量,exp()表示指数函数,
表示h
1(w
i)的第c(w
i)个分量,c(w
i)表示第i个目标节点的真实类别信息,
表示第i个目标节点的第k个预测类别概率等于真实类别信息的概率。
可选地,第i个目标节点的预测类别概率的计算方式请参考如下公式五:
可选地,在预测的时候,选择概率最大对应的类别作为该目标节点的类别,即:
其中,c'(w
i)表示第i个目标节点的预测类别。
2042、根据目标损失值确定模型参数梯度。
可选地,该模型参数梯度的计算方式请参考如下公式六:
其中,L表示目标损失,
表示第k维特征向量的梯度,n表示第i个目标节点的特征向量的维数,b表示目标节点子集合的节点个数,i表示从1至b的整数,j表示从1至b的整数,
表示第i个目标节点的第k维特征向量,c(w
i)表示第i个目标节点真实类别信息,δ()表示判断函数,u
j表示第j个邻居节点,h(u
j)表示第j个邻居节点的特征向量,exp()表示指数函数。
可选地,该模型参数梯度的推导过程如下:
2043、根据模型参数梯度对目标模型参数进行训练。
可选地,将模型参数梯度与预设学习率的乘积作为对目标模型参数的调整差值,并以该调整差值对目标模型参数进行调整。
可选地,该目标模型参数的调整方式请参考如下公式八:
可选地,在模型训练过程中,通常可以根据训练轮数设置动态变化的学习率,比如一开始进行模型训练时,学习率控制在0.01至0.001,迭代一定轮数时候学习率逐渐减缓,在训练接近结束的时候,学习速率的衰减通常在100倍以上。
学习率主要用于控制模型的学习进度,可以根据数据集的大小来选择合适的学习率,当使用平方误差和作为成本函数时,随着数据量的增多,学习率应该被设置为相应更小的值。
在不同的迭代中选择不同的学习率,在最初的迭代中,学习率可以大一些,快接近时,学习率小一些。在每次迭代后,使用估计的模型参数来查看误差函数的值,如果相对于上一次迭代,错误率减少了,就可以增大学习率如果相对于上一次迭代,错误率增大了,那么应该重新设置上一轮迭代的值,并且减少学习率到之前的50%。因此,这是一种学习率自适应调节的方法。
需要说明的是,本实施例中的学习率α可以设置为0.01,在实际应用中,学习率α还可以设置为0.00001、0.0001、0.001、0.003、0.03、0.1,0.3、1、3或者10等,此处不做限定。
可选地,本申请实施例中,提供了一种对目标模型参数进行训练的方法,即服务器根据预测类别概率子集合以及真实类别信息子集合计算目标损失,然后根据目标损失确定模型参数梯度,最后结合模型参数梯度对模型参数进行调整。通过上述方式,采用随机梯度下降通过每个样本来迭代更新一次参数,可未遍历整个样本就已经找到最优解,从而大幅提高了算法的收敛速度,此外,最小化每个样本的损失函数,虽然每次迭代结果不一定都是全局最优解,却总是沿着这个方向发展,故最终结果总是接近全局最优解。
为了便于介绍,请参阅图9,图9为本申请实施例中对图谱节点进行模型训练的一个流程示意图,如图所示,具体地:
步骤S1中,进入模型训练的流程。
步骤S2中,令迭代计数器t=1,设定取样的节点数量b=128,迭代计数器是记录了程序执行了多少次迭代,当t=1时,程序开始执行第1次迭代,即从步骤S2到步骤S10,然后迭代数加1,并判断迭代计数器所记录的迭代数是否够10
6,如果不够再进行第2次迭代。
步骤S3中,判断迭代计数器所记录的迭代数是否大于10
6,若迭代数大于10
6,则跳转至步骤S12,反之,如果迭代数小于或怎样10
6,则进入步骤S4,即开始一轮迭代计算。
步骤S4中,从样本节点集合中采用出128个目标节点,即w
1,w
2,…,w
b。
步骤S5中,利用如上公式一采集出128个邻居节点u
1,u
2,…,u
b。
步骤S6中,利用如上公式三采集出128个目标节点的特征向量h
1(w
1),…,h
1(w
b)。
步骤S8中,根据交叉熵对比预测类别和真实类别的损失值。
步骤S9中,利用如上公式七计算模型参数的梯度。
步骤S10中,利用如上公式八更新模型参数。
步骤S11中,迭代计数器加1,即t=t+1。
步骤S12中,在满足t大于10
6时,即可输出模型参数。
步骤S13中,结束模型训练的流程。
基于本申请所提供的模型训练方法,已针对真实社交网络进行试验,其中,该网络共有31965个节点以及11606919条边,每个节点的特征维数是602,该网络共有类别种类个数为41。如果两个节点之间具有连边,则权重设置为1,反之,如果两个节点之间无连边,则权重设置为0。为了实现模型训练,构建了节点个数为152410的训练集,训练集中的节点类别已知。
此外,还构造了一个节点个数为55334的测试集,用来测试分类模型的预测精度。目前,采用预测精度来评价分类模型的预测能力:
其中,corrects表示在测试集中类别预测正确的节点个数,acc表示预测能力,acc越大,则预测能力越强。此外,还可以通过比较模型的每一次训练时间来评价运行速度。
接下来,将对本申请实施例所采用的方法与现有方案中采用的GCN进行对比,请参阅表1,在同样的实验条件下两者之间的比对示意如下:
表1
| 采用的方法 | 预测精度 | 运行时间(秒) |
| GCN | 0.9568 | 100 |
| 本申请 | 0.9501 | 2 |
可以看出,本申请实施例所采用的方法所对应的预测精度与GCN非常接近,但是运行时间大幅降低。因此,基于采样的假设算法,使得每一次节点特征更新只需要在一个较小的节点子集上进行,大幅提高了实际运行速度,也降低了内存。在真实的社交网络数据上,得到的预测精度与GCN接近,但是速度提高了大约50倍。
图10是本申请一个示例性实施例提供的节点分类的装置的结构框图,以该装置应用于计算机设备中为例,如图10所示,该装置包括:获取模块1010、提取模块1020和确定模块1030;
获取模块1010,用于获取目标节点集合中的待分类节点;
所述获取模块1010,还用于从所述目标节点集合中获取所述待分类节点的邻居节点集合,其中,所述邻居节点集合中包括至少一个邻居节点,所述邻居节点与所述待分类节点之间具有关联关系;
提取模块1020,用于通过节点分类模型根据所述邻居节点集合提取所述待分类节点的目标特征信息;
确定模块1030,用于根据所述目标特征信息确定所述待分类节点的分类结果,其中,所述节点分类模型为通过至少一个样本节点子集合训练得到的模型,所述样本节点子集合为样本节点集合的子集,所述样本节点集合中的节点标注有节点类别。
在一个可选的实施例中,所述获取模块1010,还用于根据所述待分类节点计算所述目标节点集合中每个候选邻居节点的第一候选概率;根据每个所述候选邻居节点的所述第一候选概率确定所述邻居节点集合。
在一个可选的实施例中,所述获取模块1010,还用于通过所述待分类节点与第u个所述候选邻居节点之间的连边权重,以及所述待分类节点与所述目标节点集合中的节点的连边权重,确定第u个所述待选择邻居节点的所述第一候 选概率,u为正整数。
综上所述,本实施例提供的节点分类的装置,在节点分类过程中对节点进行特征更新时,从目标节点集合中选择待分类节点,并根据该待分类节点的邻居节点集合对该待分类节点的节点特征进行确定,从而根据节点特征获取节点分类结果,通过上述方式,对于大规模图谱而言,每一次迭代计算图谱中的部分节点,无需遍历图谱中的每个节点,大幅地降低了计算开销,且节省计算资源。
图11是本申请一个示例性实施例提供的模型训练的装置的结构框图,以该装置应用于计算机设备中为例,如图11所示,该装置包括:获取模块1110、提取模块1120、预测模块1130和训练模块1140;
获取模块1110,用于从标注有目标节点类别的样本节点集合中获取目标节点子集合和与所述目标节点子集合对应的邻居节点子集合,所述邻居节点子集合中的邻居节点与所述目标节点子集合中的目标节点具有关联关系;
提取模块1120,用于结合所述邻居节点子集合通过节点分类模型提取所述目标节点子集合的节点特征子集合,其中,所述节点特征子集合中包括所述目标节点的节点特征向量;
预测模块1130,用于根据所述节点特征子集合对所述目标节点进行类别预测,得到预测类别概率子集合;
训练模块1140,用于根据所述预测类别概率子集合和所述目标节点的所述目标节点类别对所述节点分类模型的目标模型参数进行训练。
在一个可选的实施例中,所述获取模块1110,还用于从所述样本节点集合中获取所述目标节点子集合;根据所述目标节点子集合计算所述样本节点集合中待选择邻居节点的第二候选概率;根据所述待选择邻居节点的所述第二候选概率确定所述邻居节点子集合。
在一个可选的实施例中,所述获取模块1110,还用于通过所述目标节点子集合中的所述目标节点与第u个所述待选择邻居节点之间的连边权重,以及所述目标节点与所述样本节点集合中的节点的连边权重,确定第u个所述待选择邻居节点的所述第二候选概率,u为正整数。
在一个可选的实施例中,所述提取模块1120,还用于确定所述节点分类模型的目标模型参数,其中,所述目标模型参数为待训练的模型参数;根据所述 目标模型参数和所述邻居节点子集合提取所述目标节点子集合的节点特征子集合。
在一个可选的实施例中,所述提取模块1120,还用于根据第i个所述目标节点和所述邻居节点之间的连边权重、所述邻居节点的特征向量以及所述目标模型参数,确定第i个所述目标节点的节点特征,i为正整数。
在一个可选的实施例中,所述训练模块1140,还用于根据所述预测类别概率子集合和所述目标节点的所述目标节点类别确定目标损失值;根据所述目标损失值确定所述模型参数梯度;根据所述模型参数梯度对所述目标模型参数进行训练。
在一个可选的实施例中,所述训练模块1140,还用于将所述模型参数梯度与预设学习率的乘积作为对所述目标模型参数的调整差值;以所述调整差值对所述目标模型参数进行调整。
综上所述,本实施例提供的模型训练的装置,在节点分类过程中对节点进行特征更新时,从目标节点集合中选择待分类节点,并根据该待分类节点的邻居节点集合对该待分类节点的节点特征进行确定,从而根据节点特征获取节点分类结果,通过上述方式,对于大规模图谱而言,每一次迭代计算图谱中的部分节点,无需遍历图谱中的每个节点,大幅地降低了计算开销,且节省计算资源。
图12示出了本申请一个示例性实施例提供的服务器的结构示意图。该服务器可以是图3示出的服务器310。具体来讲:
服务器1200包括中央处理单元(CPU,Central Processing Unit)1201、包括随机存取存储器(RAM,Random Access Memory)1202和只读存储器(ROM,Read Only Memory)1203的系统存储器1204,以及连接系统存储器1204和中央处理单元1201的系统总线1205。服务器1200还包括帮助计算机内的各个器件之间传输信息的基本输入/输出系统(I/O系统,Input Output System)1206,和用于存储操作系统1213、应用程序1214和其他程序模块1215的大容量存储设备1207。
基本输入/输出系统1206包括有用于显示信息的显示器1208和用于用户输入信息的诸如鼠标、键盘之类的输入设备1209。其中显示器1208和输入设备1209都通过连接到系统总线1205的输入输出控制器1210连接到中央处理单元 1201。基本输入/输出系统1206还可以包括输入输出控制器1210以用于接收和处理来自键盘、鼠标、或电子触控笔等多个其他设备的输入。类似地,输入输出控制器1210还提供输出到显示屏、打印机或其他类型的输出设备。
大容量存储设备1207通过连接到系统总线1205的大容量存储控制器(未示出)连接到中央处理单元1201。大容量存储设备1207及其相关联的计算机可读介质为服务器1200提供非易失性存储。也就是说,大容量存储设备1207可以包括诸如硬盘或者紧凑型光盘只读存储器(CD-ROM,Compact Disc Read Only Memory)驱动器之类的计算机可读介质(未示出)。
不失一般性,计算机可读介质可以包括计算机存储介质和通信介质。计算机存储介质包括以用于存储诸如计算机可读指令、数据结构、程序模块或其他数据等信息的任何方法或技术实现的易失性和非易失性、可移动和不可移动介质。计算机存储介质包括RAM、ROM、可擦除可编程只读存储器(EPROM,Erasable Programmable Read Only Memory)、带电可擦可编程只读存储器(EEPROM,Electrically Erasable Programmable Read Only Memory)、闪存或其他固态存储其技术,CD-ROM、数字通用光盘(DVD,Digital Versatile Disc)或其他光学存储、磁带盒、磁带、磁盘存储或其他磁性存储设备。当然,本领域技术人员可知计算机存储介质不局限于上述几种。上述的系统存储器1204和大容量存储设备1207可以统称为存储器。
根据本申请的各种实施例,服务器1200还可以通过诸如因特网等网络连接到网络上的远程计算机运行。也即服务器1200可以通过连接在系统总线1205上的网络接口单元1211连接到网络1212,或者说,也可以使用网络接口单元1211来连接到其他类型的网络或远程计算机系统(未示出)。
上述存储器还包括一个或者一个以上的程序,一个或者一个以上程序存储于存储器中,被配置由CPU执行。
本申请的实施例还提供了一种计算机设备,该计算手机设备包括处理器和存储器,该存储器中存储有至少一条指令、至少一段程序、代码集或指令集,至少一条指令、至少一段程序、代码集或指令集由处理器加载并执行以实现上述各方法实施例提供的节点分类的方法或模型训练的方法。
本申请的实施例还提供了一种计算机可读存储介质,该计算机可读存储介质上存储有至少一条指令、至少一段程序、代码集或指令集,至少一条指令、至少一段程序、代码集或指令集由处理器加载并执行,以实现上述各方法实施 例提供的节点分类的方法或模型训练的方法。
应当理解的是,在本文中提及的“多个”是指两个或两个以上。“和/或”,描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。字符“/”一般表示前后关联对象是一种“或”的关系。
本领域普通技术人员可以理解实现上述实施例的全部或部分步骤可以通过硬件来完成,也可以通过程序来指令相关的硬件完成,所述的程序可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器,磁盘或光盘等。
以上所述仅为本申请的可选实施例,并不用以限制本申请,凡在本申请的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。
Claims (22)
- 一种节点分类的方法,其特征在于,应用于计算机设备中,所述方法包括:获取目标节点集合中的待分类节点,所述待分类节点中包括所述目标节点集合中的至少一个节点,且所述待分类节点的数量小于所述目标节点集合中节点的数量;从所述目标节点集合中获取所述待分类节点的邻居节点集合,其中,所述邻居节点集合中包括至少一个邻居节点,所述邻居节点与所述待分类节点之间具有关联关系;通过节点分类模型根据所述邻居节点集合提取所述待分类节点的目标特征信息;根据所述目标特征信息确定所述待分类节点的分类结果,其中,所述节点分类模型为通过至少一个样本节点子集合训练得到的模型,所述样本节点子集合为样本节点集合的子集,所述样本节点集合中的节点标注有节点类别。
- 根据权利要求1所述的方法,其特征在于,所述从所述目标节点集合中获取所述待分类节点的邻居节点集合,包括:根据所述待分类节点计算所述目标节点集合中每个候选邻居节点的第一候选概率;根据每个所述候选邻居节点的所述第一候选概率确定所述邻居节点集合。
- 根据权利要求2所述的方法,其特征在于,所述根据所述待分类节点计算所述目标节点集合中每个候选邻居节点的第一候选概率,包括:通过所述待分类节点与第u个所述候选邻居节点之间的连边权重,以及所述待分类节点与所述目标节点集合中的节点的连边权重,确定第u个所述待选择邻居节点的所述第一候选概率,u为正整数。
- 一种模型训练的方法,其特征在于,应用于计算机设备中,所述方法包括:从标注有目标节点类别的样本节点集合中获取目标节点子集合和与所述目 标节点子集合对应的邻居节点子集合,所述邻居节点子集合中的邻居节点与所述目标节点子集合中的目标节点具有关联关系;结合所述邻居节点子集合通过节点分类模型提取所述目标节点子集合的节点特征子集合,其中,所述节点特征子集合中包括所述目标节点的节点特征向量;根据所述节点特征子集合对所述目标节点进行类别预测,得到预测类别概率子集合;根据所述预测类别概率子集合和所述目标节点的所述目标节点类别对所述节点分类模型的目标模型参数进行训练。
- 根据权利要求4所述的方法,其特征在于,所述从标注有目标节点类别的样本节点集合中获取目标节点子集合和与所述目标节点子集合对应的邻居节点子集合,包括:从所述样本节点集合中获取所述目标节点子集合;根据所述目标节点子集合计算所述样本节点集合中待选择邻居节点的第二候选概率;根据所述待选择邻居节点的所述第二候选概率确定所述邻居节点子集合。
- 根据权利要求5所述的方法,其特征在于,所述根据所述目标节点子集合计算所述样本节点集合中每个待选择邻居节点的第二候选概率,包括:通过所述目标节点子集合中的所述目标节点与第u个所述待选择邻居节点之间的连边权重,以及所述目标节点与所述样本节点集合中的节点的连边权重,确定第u个所述待选择邻居节点的所述第二候选概率,u为正整数。
- 根据权利要求4所述的方法,其特征在于,所述结合所述邻居节点子集合通过节点分类模型提取所述目标节点子集合的节点特征子集合,包括:确定所述节点分类模型的目标模型参数,其中,所述目标模型参数为待训练的模型参数;根据所述目标模型参数和所述邻居节点子集合提取所述目标节点子集合的节点特征子集合。
- 根据权利要求7所述的方法,其特征在于,所述根据所述目标模型参数和所述邻居节点子集合提取所述目标节点子集合的节点特征子集合,包括:根据第i个所述目标节点和所述邻居节点之间的连边权重、所述邻居节点的特征向量以及所述目标模型参数,确定第i个所述目标节点的节点特征,i为正整数。
- 根据权利要求4至8任一所述的方法,其特征在于,所述根据所述预测类别概率子集合和所述目标节点的所述目标节点类别对所述节点分类模型的目标模型参数进行训练,包括:根据所述预测类别概率子集合和所述目标节点的所述目标节点类别确定目标损失值;根据所述目标损失值确定所述模型参数梯度;根据所述模型参数梯度对所述目标模型参数进行训练。
- 根据权利要求9所述的方法,其特征在于,所述根据所述模型参数梯度对所述目标模型参数进行训练,包括:将所述模型参数梯度与预设学习率的乘积作为对所述目标模型参数的调整差值;以所述调整差值对所述目标模型参数进行调整。
- 一种节点分类的装置,其特征在于,应用于计算机设备中,所述装置包括:获取模块,用于获取目标节点集合中的待分类节点;所述获取模块,还用于从所述目标节点集合中获取所述待分类节点的邻居节点集合,其中,所述邻居节点集合中包括至少一个邻居节点,所述邻居节点与所述待分类节点之间具有关联关系;提取模块,用于通过节点分类模型根据所述邻居节点集合提取所述待分类节点的目标特征信息;确定模块,用于根据所述目标特征信息确定所述待分类节点的分类结果, 其中,所述节点分类模型为通过至少一个样本节点子集合训练得到的模型,所述样本节点子集合为样本节点集合的子集,所述样本节点集合中的节点标注有节点类别。
- 根据权利要求11所述的装置,其特征在于,所述获取模块,还用于根据所述待分类节点计算所述目标节点集合中每个候选邻居节点的第一候选概率;根据每个所述候选邻居节点的所述第一候选概率确定所述邻居节点集合。
- 根据权利要求12所述的装置,其特征在于,所述获取模块,还用于通过所述待分类节点与第u个所述候选邻居节点之间的连边权重,以及所述待分类节点与所述目标节点集合中的节点的连边权重,确定第u个所述待选择邻居节点的所述第一候选概率,u为正整数。
- 一种模型训练的装置,其特征在于,应用于计算机设备中,所述装置包括:获取模块,用于从标注有目标节点类别的样本节点集合中获取目标节点子集合和与所述目标节点子集合对应的邻居节点子集合,所述邻居节点子集合中的邻居节点与所述目标节点子集合中的目标节点具有关联关系;提取模块,用于结合所述邻居节点子集合通过节点分类模型提取所述目标节点子集合的节点特征子集合,其中,所述节点特征子集合中包括所述目标节点的节点特征向量;预测模块,用于根据所述节点特征子集合对所述目标节点进行类别预测,得到预测类别概率子集合;训练模块,用于根据所述预测类别概率子集合和所述目标节点的所述目标节点类别对所述节点分类模型的目标模型参数进行训练。
- 根据权利要求14所述的装置,其特征在于,所述获取模块,还用于从所述样本节点集合中获取所述目标节点子集合;根据所述目标节点子集合计算所述样本节点集合中待选择邻居节点的第二候选概率;根据所述待选择邻居节点的所述第二候选概率确定所述邻居节点子集合。
- 根据权利要求15所述的装置,其特征在于,所述获取模块,还用于通过所述目标节点子集合中的所述目标节点与第u个所述待选择邻居节点之间的连边权重,以及所述目标节点与所述样本节点集合中的节点的连边权重,确定第u个所述待选择邻居节点的所述第二候选概率,u为正整数。
- 根据权利要求14所述的装置,其特征在于,所述提取模块,还用于确定所述节点分类模型的目标模型参数,其中,所述目标模型参数为待训练的模型参数;根据所述目标模型参数和所述邻居节点子集合提取所述目标节点子集合的节点特征子集合。
- 根据权利要求17所述的装置,其特征在于,所述提取模块,还用于根据第i个所述目标节点和所述邻居节点之间的连边权重、所述邻居节点的特征向量以及所述目标模型参数,确定第i个所述目标节点的节点特征,i为正整数。
- 根据权利要求14至18任一所述的装置,其特征在于,所述训练模块,还用于根据所述预测类别概率子集合和所述目标节点的所述目标节点类别确定目标损失值;根据所述目标损失值确定所述模型参数梯度;根据所述模型参数梯度对所述目标模型参数进行训练。
- 根据权利要求19所述的装置,其特征在于,所述训练模块,还用于将所述模型参数梯度与预设学习率的乘积作为对所述目标模型参数的调整差值;以所述调整差值对所述目标模型参数进行调整。
- 一种计算机设备,其特征在于,所述计算机设备包括处理器和存储器,所述存储器中存储有至少一条指令、至少一段程序、代码集或指令集,所述至少一条指令、所述至少一段程序、所述代码集或指令集由所述处理器加载并执行以实现如权利要求1至3任一所述的节点分类的方法或权利要求4至10任一所述的模型训练的方法。
- 计算机可读存储介质,其特征在于,所述存储介质中存储有至少一条指令、至少一段程序、代码集或指令集,所述至少一条指令、所述至少一段程序、所述代码集或指令集由处理器加载并执行以实现如权利要求1至3任一所述的节点分类的方法或权利要求4至10任一所述的模型训练的方法。
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP19883716.3A EP3882820A4 (en) | 2018-11-15 | 2019-11-11 | Node classification method, model training method, device, apparatus, and storage medium |
| JP2021505770A JP7183385B2 (ja) | 2018-11-15 | 2019-11-11 | ノード分類方法、モデル訓練方法並びに、その装置、機器及びコンピュータプログラム |
| US17/153,014 US11853882B2 (en) | 2018-11-15 | 2021-01-20 | Methods, apparatus, and storage medium for classifying graph nodes |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201811361409.0A CN109460793B (zh) | 2018-11-15 | 2018-11-15 | 一种节点分类的方法、模型训练的方法及装置 |
| CN201811361409.0 | 2018-11-15 |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US17/153,014 Continuation US11853882B2 (en) | 2018-11-15 | 2021-01-20 | Methods, apparatus, and storage medium for classifying graph nodes |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2020098606A1 true WO2020098606A1 (zh) | 2020-05-22 |
Family
ID=65610518
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2019/117173 Ceased WO2020098606A1 (zh) | 2018-11-15 | 2019-11-11 | 节点分类方法、模型训练方法、装置、设备及存储介质 |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US11853882B2 (zh) |
| EP (1) | EP3882820A4 (zh) |
| JP (1) | JP7183385B2 (zh) |
| CN (1) | CN109460793B (zh) |
| WO (1) | WO2020098606A1 (zh) |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114265988A (zh) * | 2021-12-20 | 2022-04-01 | 人工智能与数字经济广东省实验室(广州) | 基于重要性得分的好友推荐方法 |
| CN114662706A (zh) * | 2022-03-24 | 2022-06-24 | 支付宝(杭州)信息技术有限公司 | 一种模型训练方法、装置及设备 |
| CN114862592A (zh) * | 2022-04-21 | 2022-08-05 | 北京百度网讯科技有限公司 | 图节点类别获取的方法、装置及电子设备 |
| JP2024507183A (ja) * | 2021-09-09 | 2024-02-16 | テンセント・テクノロジー・(シェンジェン)・カンパニー・リミテッド | オブジェクト管理方法、オブジェクト管理装置、コンピュータ機器、コンピュータプログラム、及びオブジェクト管理システム |
Families Citing this family (23)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109460793B (zh) | 2018-11-15 | 2023-07-18 | 腾讯科技(深圳)有限公司 | 一种节点分类的方法、模型训练的方法及装置 |
| CN110188422B (zh) * | 2019-05-16 | 2022-12-20 | 深圳前海微众银行股份有限公司 | 一种基于网络数据提取节点的特征向量的方法及装置 |
| CN112132305B (zh) * | 2019-06-25 | 2024-12-17 | 腾讯科技(深圳)有限公司 | 节点类别确定方法、相关装置、设备及存储介质 |
| CN110659723B (zh) * | 2019-09-03 | 2023-09-19 | 腾讯科技(深圳)有限公司 | 基于人工智能的数据处理方法、装置、介质及电子设备 |
| CN110705629A (zh) * | 2019-09-27 | 2020-01-17 | 北京市商汤科技开发有限公司 | 数据处理方法及相关装置 |
| CN110956643A (zh) * | 2019-12-04 | 2020-04-03 | 齐鲁工业大学 | 一种基于MDNet的改进车辆跟踪方法及系统 |
| US12249134B2 (en) * | 2020-04-17 | 2025-03-11 | Roxy Corp. | Visualization method, program for the same, visualization device, and discrimination device having the same |
| CN113743425B (zh) * | 2020-05-27 | 2024-10-22 | 北京沃东天骏信息技术有限公司 | 一种生成分类模型的方法和装置 |
| CN111881936A (zh) * | 2020-06-19 | 2020-11-03 | 北京三快在线科技有限公司 | 训练样本筛选方法、装置、电子设备及存储介质 |
| CN112183627B (zh) * | 2020-09-28 | 2024-07-19 | 中星技术股份有限公司 | 生成预测密度图网络的方法和车辆年检标数量检测方法 |
| CN114358291B (zh) * | 2020-09-30 | 2024-04-09 | 本源量子计算科技(合肥)股份有限公司 | 量子连通图的交叉连线处理方法、装置、终端及存储介质 |
| US20220284340A1 (en) * | 2021-03-02 | 2022-09-08 | Adobe Inc. | Determining digital personas utilizing data-driven analytics |
| CN112860806B (zh) * | 2021-04-06 | 2022-09-02 | 中国科学技术大学 | 数据分类方法及装置、存储介质及电子设备 |
| CN113807500A (zh) * | 2021-09-18 | 2021-12-17 | 中国电信股份有限公司 | 一种用于异常用户识别的方法、介质及装置 |
| US12488223B2 (en) * | 2021-12-21 | 2025-12-02 | International Business Machines Corporation | Federated learning for training machine learning models |
| CN114722937B (zh) * | 2022-04-06 | 2024-07-16 | 腾讯科技(深圳)有限公司 | 一种异常数据检测方法、装置、电子设备和存储介质 |
| CN115146725B (zh) * | 2022-06-30 | 2023-05-30 | 北京百度网讯科技有限公司 | 对象分类模式的确定方法、对象分类方法、装置和设备 |
| CN115329838A (zh) * | 2022-07-07 | 2022-11-11 | 武汉理工大学 | 一种考虑类别不平衡的属性图异常检测方法 |
| CN115081024B (zh) * | 2022-08-16 | 2023-01-24 | 杭州金智塔科技有限公司 | 基于隐私保护的去中心化业务模型训练方法及装置 |
| CN118171150A (zh) * | 2022-12-08 | 2024-06-11 | 马上消费金融股份有限公司 | 分类模型的训练方法、类别识别方法及计算机设备 |
| CN116016199B (zh) * | 2023-02-21 | 2023-06-09 | 山东海量信息技术研究院 | 一种信息控制方法、系统、电子设备及可读存储介质 |
| CN116186622A (zh) * | 2023-03-27 | 2023-05-30 | 吉林大学 | 一种网络数据分类方法、系统、电子设备及存储介质 |
| CN118964284B (zh) * | 2024-07-22 | 2025-07-18 | 寒序科技(北京)有限公司 | 概率计算加速卡、概率计算加速方法、装置和介质 |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20130325862A1 (en) * | 2012-06-04 | 2013-12-05 | Michael D. Black | Pipelined incremental clustering algorithm |
| US20160210348A1 (en) * | 2015-01-20 | 2016-07-21 | Telefonaktiebolaget L M Ericsson (Publ) | Systems and methods for multi-variate attribute correlation |
| CN106503319A (zh) * | 2016-10-13 | 2017-03-15 | 中国人民解放军国防科学技术大学 | 一种适用于网络节点分类方法评估的仿真网络生成方法 |
| CN106997474A (zh) * | 2016-12-29 | 2017-08-01 | 南京邮电大学 | 一种基于深度学习的图节点多标签分类方法 |
| CN109460793A (zh) * | 2018-11-15 | 2019-03-12 | 腾讯科技(深圳)有限公司 | 一种节点分类的方法、模型训练的方法及装置 |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106776729B (zh) | 2016-11-18 | 2020-08-14 | 同济大学 | 一种大规模知识图谱路径查询预测器构造方法 |
| US10573003B2 (en) * | 2017-02-13 | 2020-02-25 | Amit Sethi | Systems and methods for computational pathology using points-of-interest |
| CN108520275A (zh) * | 2017-06-28 | 2018-09-11 | 浙江大学 | 一种基于邻接矩阵的连接信息规整系统、图特征提取系统、图分类系统和方法 |
| CN108319953B (zh) * | 2017-07-27 | 2019-07-16 | 腾讯科技(深圳)有限公司 | 目标对象的遮挡检测方法及装置、电子设备及存储介质 |
| US11797838B2 (en) * | 2018-03-13 | 2023-10-24 | Pinterest, Inc. | Efficient convolutional network for recommender systems |
| US11620487B2 (en) * | 2019-12-31 | 2023-04-04 | X Development Llc | Neural architecture search based on synaptic connectivity graphs |
| US11625611B2 (en) * | 2019-12-31 | 2023-04-11 | X Development Llc | Training artificial neural networks based on synaptic connectivity graphs |
-
2018
- 2018-11-15 CN CN201811361409.0A patent/CN109460793B/zh active Active
-
2019
- 2019-11-11 WO PCT/CN2019/117173 patent/WO2020098606A1/zh not_active Ceased
- 2019-11-11 JP JP2021505770A patent/JP7183385B2/ja active Active
- 2019-11-11 EP EP19883716.3A patent/EP3882820A4/en active Pending
-
2021
- 2021-01-20 US US17/153,014 patent/US11853882B2/en active Active
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20130325862A1 (en) * | 2012-06-04 | 2013-12-05 | Michael D. Black | Pipelined incremental clustering algorithm |
| US20160210348A1 (en) * | 2015-01-20 | 2016-07-21 | Telefonaktiebolaget L M Ericsson (Publ) | Systems and methods for multi-variate attribute correlation |
| CN106503319A (zh) * | 2016-10-13 | 2017-03-15 | 中国人民解放军国防科学技术大学 | 一种适用于网络节点分类方法评估的仿真网络生成方法 |
| CN106997474A (zh) * | 2016-12-29 | 2017-08-01 | 南京邮电大学 | 一种基于深度学习的图节点多标签分类方法 |
| CN109460793A (zh) * | 2018-11-15 | 2019-03-12 | 腾讯科技(深圳)有限公司 | 一种节点分类的方法、模型训练的方法及装置 |
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP3882820A4 * |
Cited By (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2024507183A (ja) * | 2021-09-09 | 2024-02-16 | テンセント・テクノロジー・(シェンジェン)・カンパニー・リミテッド | オブジェクト管理方法、オブジェクト管理装置、コンピュータ機器、コンピュータプログラム、及びオブジェクト管理システム |
| US12204586B2 (en) | 2021-09-09 | 2025-01-21 | Tencent Technology (Shenzhen) Company Limited | Object management method and apparatus, device, storage medium, and system |
| JP7718628B2 (ja) | 2021-09-09 | 2025-08-05 | テンセント・テクノロジー・(シェンジェン)・カンパニー・リミテッド | オブジェクト管理方法、オブジェクト管理装置、コンピュータ機器、コンピュータプログラム、及びオブジェクト管理システム |
| CN114265988A (zh) * | 2021-12-20 | 2022-04-01 | 人工智能与数字经济广东省实验室(广州) | 基于重要性得分的好友推荐方法 |
| CN114662706A (zh) * | 2022-03-24 | 2022-06-24 | 支付宝(杭州)信息技术有限公司 | 一种模型训练方法、装置及设备 |
| CN114862592A (zh) * | 2022-04-21 | 2022-08-05 | 北京百度网讯科技有限公司 | 图节点类别获取的方法、装置及电子设备 |
| CN114862592B (zh) * | 2022-04-21 | 2026-02-27 | 北京百度网讯科技有限公司 | 图节点类别获取的方法、装置及电子设备 |
Also Published As
| Publication number | Publication date |
|---|---|
| JP2021533474A (ja) | 2021-12-02 |
| EP3882820A4 (en) | 2022-01-05 |
| CN109460793B (zh) | 2023-07-18 |
| JP7183385B2 (ja) | 2022-12-05 |
| EP3882820A1 (en) | 2021-09-22 |
| US20210142108A1 (en) | 2021-05-13 |
| US11853882B2 (en) | 2023-12-26 |
| CN109460793A (zh) | 2019-03-12 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2020098606A1 (zh) | 节点分类方法、模型训练方法、装置、设备及存储介质 | |
| US11017220B2 (en) | Classification model training method, server, and storage medium | |
| CN110852447B (zh) | 元学习方法和装置、初始化方法、计算设备和存储介质 | |
| US11410038B2 (en) | Frame selection based on a trained neural network | |
| CN110209859B (zh) | 地点识别及其模型训练的方法和装置以及电子设备 | |
| CN107529650B (zh) | 闭环检测方法、装置及计算机设备 | |
| CN114693993B (zh) | 一种图像处理和图像分类方法、装置、设备及存储介质 | |
| CN112132206B (zh) | 图像识别方法及相关模型的训练方法及相关装置、设备 | |
| WO2021089013A1 (zh) | 空间图卷积网络的训练方法、电子设备及存储介质 | |
| CN111738357A (zh) | 垃圾图片的识别方法、装置及设备 | |
| CN110969200B (zh) | 基于一致性负样本的图像目标检测模型训练方法及装置 | |
| CN113850179B (zh) | 图像检测方法及相关模型的训练方法、装置、设备、介质 | |
| CN113283368B (zh) | 一种模型训练方法、人脸属性分析方法、装置及介质 | |
| US20240135698A1 (en) | Image classification method, model training method, device, storage medium, and computer program | |
| CN114694215A (zh) | 年龄估计模型的训练及估计方法、装置、设备及存储介质 | |
| CN111079930A (zh) | 数据集质量参数的确定方法、装置及电子设备 | |
| CN114998679A (zh) | 深度学习模型的在线训练方法、装置、设备及存储介质 | |
| CN114219051B (zh) | 图像分类方法、分类模型的训练方法、装置及电子设备 | |
| CN114548569B (zh) | 异质社交网络中缺失链路预测方法、系统和存储介质 | |
| CN113763193B (zh) | 群体检测方法、装置、电子设备和计算机存储介质 | |
| JP2017021606A (ja) | 動画像検索方法、動画像検索装置及びそのプログラム | |
| CN108764206B (zh) | 目标图像识别方法和系统、计算机设备 | |
| KR102060110B1 (ko) | 컨텐츠에 포함되는 객체를 분류하는 방법, 장치 및 컴퓨터 프로그램 | |
| CN113793298B (zh) | 肺结节检测模型构建优化方法、设备、存储介质及产品 | |
| CN116958628A (zh) | 图片分类方法、装置、计算机可读介质及电子设备 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19883716 Country of ref document: EP Kind code of ref document: A1 |
|
| ENP | Entry into the national phase |
Ref document number: 2021505770 Country of ref document: JP Kind code of ref document: A |
|
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| ENP | Entry into the national phase |
Ref document number: 2019883716 Country of ref document: EP Effective date: 20210615 |







