CN106910013A - Unreal information detecting method and device based on Expression study - Google Patents

Info

Publication number
CN106910013A
Authority
CN
China
Prior art keywords
information
user
expression
behavior
event
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710085225.5A
Other languages
Chinese (zh)
Inventor
谭铁牛
王亮
吴书
刘强
余峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science filed Critical Institute of Automation of Chinese Academy of Science
Priority to CN201710085225.5A priority Critical patent/CN106910013A/en
Publication of CN106910013A publication Critical patent/CN106910013A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/40Business processes related to social networking or social networking services
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/12Applying verification of the received information
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/30Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information
    • H04L63/302Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information gathering intelligence information for situation awareness or reconnaissance
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/30Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information
    • H04L63/306Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information intercepting packet switched data communications, e.g. Web, Internet or IMS communications

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Computer Security & Cryptography (AREA)
  • Human Resources & Organizations (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computer Hardware Design (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • Technology Law (AREA)
  • Theoretical Computer Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an unreal information detection method based on dynamic expression learning, comprising the following steps: acquiring information to be detected; detecting the information to be detected with a pre-established detection model; and outputting the detection result. The detection model is established as follows: step S1, modeling a dynamic behavior expression for an event that jointly represents user information and the user's behavior information, where the user information includes user features and user credibility and the behavior information includes the behavior type; step S2, since an event is composed of different pieces of information, combining the dynamic behavior expressions of step S1 to obtain the event credibility detection expression; step S3, using a time feature matrix to capture users' dynamic behavior features during information propagation; step S4, generating the user feature expression; step S5, estimating the detection model parameters with a pairwise learning method.

Description

Unreal information detection method and device based on dynamic expression learning
Technical Field
The invention relates to the technical field of computer model detection, in particular to an unreal information detection method and device based on dynamic expression learning.
Background
The rapid development of social media has brought network users unprecedented convenience. Social media such as Facebook, Twitter, and Sina Weibo offer platforms where users share information and publish personal stories publicly. At the same time, the spread of unreal information on these platforms troubles users and harms social harmony and public safety. In recent years, information credibility detection has attracted great attention in both academia and industry.
Existing methods mainly consider the following categories of information: text information, source credibility information, dynamic information, and comment information. The factors used to evaluate the credibility of user behavior mainly comprise time, person, behavior, and manner. Fact-finding methods are unsupervised or semi-supervised methods that discover facts and assess information credibility in conflicting data. Existing methods built on the above information rely mainly on hand-crafted features, which are cumbersome to engineer and fail to capture the underlying characteristics of the data. They also cannot model the correlations among different information types and different credibility factors during information propagation. Fact-finding methods rely mainly on source credibility information and aggregate the credibility assessed from each source. However, they suit only idealized, topic-specific settings such as price prediction and flight prediction, and not an environment as complex as social media.
In recent years, many methods that automatically measure information credibility on social media have come into wide use. These methods rely primarily on text information and source credibility information at the message level or the event level, and some studies consider both levels together. For dynamic information, some studies define temporal features of the propagation process or train models with different temporal features. For comment information, studies use user feedback or microblog flagging to mark suspicious information. Although these methods are widely used, their feature engineering is complicated and they fail to capture the underlying characteristics of the data. They also cannot model the correlations among different information types and different credibility factors during information propagation.
The DBRM model aims to determine, from what users post and forward on social media, whether an event is unreal information. The model uses the following user behavior factors to judge the credibility of a message: user credibility, the time interval since the event began, users' posting and forwarding behaviors, and users' comments. The model introduces a representation learning method which, unlike conventional feature engineering, captures information from different aspects of the propagation process. It learns implicit representations of users, dynamic time intervals, user behaviors, and comment attitudes. From these implicit representations the model generates a dynamic behavior representation of the information, which offers a new approach to credibility detection.
Disclosure of Invention
In view of the technical defects of traditional methods based on hand-crafted features, the invention provides a detection method and a detection device based on dynamic behavior feature representation, in order to detect information credibility more reliably.
According to an aspect of the present invention, a method for detecting unreal information based on dynamic expression learning is provided, which includes the following steps:
acquiring information to be detected;
detecting the information to be detected by using a pre-established detection model;
outputting a detection result;
wherein, the detection model is established as follows:
step S1, modeling a dynamic behavior expression for an event that jointly represents user information and the user's behavior information; the user information comprises user features and user credibility, and the behavior information comprises the behavior type;
step S2, since an event is composed of different pieces of information, combining the dynamic behavior expressions of step S1 to obtain the event credibility detection expression;
step S3, using the time feature matrix to capture users' dynamic behavior features during information propagation;
step S4, generating the user feature expression;
step S5, estimating the detection model parameters with the pairwise learning method.
According to a second aspect of the present invention, there is provided an unreal information detecting apparatus based on dynamic expression learning, including:
the acquisition module is configured to acquire information to be detected;
the detection module is configured to detect the information to be detected by utilizing a pre-established detection model;
an output module configured to output a detection result;
wherein, the detection model is established as follows:
modeling a dynamic behavior expression for an event that jointly represents user information and the user's behavior information; the user information comprises user features and user credibility, and the behavior information comprises the behavior type;
since an event is composed of different pieces of information, combining the dynamic behavior expressions of step S1 to obtain the event credibility detection expression;
using the time feature matrix to capture users' dynamic behavior features during information propagation;
generating the user feature expression;
and estimating the detection model parameters with the pairwise learning method.
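The division into an acquisition module, a detection module, and an output module can be pictured with a small Python sketch. The class name `Detector`, the `score_fn` callback, the record layout, and the decision threshold of 0 are all illustrative assumptions and are not taken from the patent, which only states that a larger score means higher credibility.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Detector:
    """Hypothetical three-module apparatus: acquire, detect, output."""

    # Assumed callback: maps an event (a sequence of microblog records) to a score.
    score_fn: Callable[[Sequence[dict]], float]
    # Assumed decision threshold; the patent does not specify one.
    threshold: float = 0.0

    def acquire(self, raw_event: Sequence[dict]) -> list:
        """Acquisition module: collect the non-empty microblog records of an event."""
        return [m for m in raw_event if m]

    def detect(self, event: Sequence[dict]) -> float:
        """Detection module: apply the pre-established model to get a credibility score."""
        return self.score_fn(event)

    def output(self, score: float) -> str:
        """Output module: report the detection result."""
        return "credible" if score > self.threshold else "unreal"

    def run(self, raw_event: Sequence[dict]) -> str:
        return self.output(self.detect(self.acquire(raw_event)))


# Toy scoring function, used only to exercise the pipeline.
toy = Detector(score_fn=lambda ev: sum(m["trust"] for m in ev) / len(ev))
result = toy.run([{"trust": 0.8}, {}, {"trust": -0.2}])
```

In a real system `score_fn` would be the trained detection model; here it is a stand-in so the flow between the three modules can be followed end to end.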
The detection model adopted by the invention brings together the key kinds of features that characterize information, namely user information, behavior information, time information, and comment information, and models the high-order interactions among them. The vector representation learned for a microblog or an event is therefore more complete and faithful, and better suited to complex and changeable social network settings. The detection model also exploits the power-law distribution of the amount of information over time: following this distribution, it uses log2 to divide the continuous time period into different time intervals, so that each interval contains the same number of pieces of information and all events share a similar time scale overall. This makes event expressions easier to learn and lets the model fully mine the temporal regularities of information distribution. The invention addresses the detection of unreal information as it propagates, based on dynamic expression learning, and in particular suits real, complex social networks with a large amount of information, long time spans, complex semantic scenes, and changing user behavior. Learning the users' dynamic behavior expressions yields a more accurate prediction.
Drawings
FIG. 1 is a flow chart of a method for detecting unreal information based on dynamic expression learning according to the present invention;
FIG. 2 is a schematic diagram of the expression learning process of the dynamic behavior expression model DBRM in the present invention;
FIGS. 3(a) and 3(b) are precision-recall curves of the compared methods for rumors (a) and real information (b).
Detailed Description
The following describes in detail various problems involved in the technical solutions of the present invention with reference to the accompanying drawings. It should be noted that the described embodiments are only intended to facilitate understanding and do not have any limiting effect on the invention.
As shown in fig. 1, the present invention provides a method for detecting unreal information based on dynamic expression learning, which comprises the following steps:
acquiring information to be detected;
detecting the information to be detected by using a pre-established detection model;
outputting a detection result;
wherein, the detection model is established as follows:
step S1, modeling a dynamic behavior expression for an event that jointly represents user information and the user's behavior information; the user information comprises user features and user credibility, and the behavior information comprises the behavior type;
step S2, since an event is composed of different pieces of information, combining the dynamic behavior expressions of step S1 to obtain the event credibility detection expression;
step S3, using the time feature matrix to capture users' dynamic behavior features during information propagation;
step S4, generating the user feature expression;
step S5, estimating the detection model parameters with the pairwise learning method.
The dynamic behavior expression model (DBRM for short) provided by the invention is used for detecting unreal information in social media. The model learns dynamic behavior expressions and, through learned implicit expressions, covers user credibility, dynamic attributes, behavior features, and the opinions expressed in comments. Combining these aspects produces a user behavior representation, and combining the users' dynamic behavior representations produces a credibility representation of the information that an event propagates on social media. In the model, each user is represented by a vector, while time intervals, user behaviors, and user comments are each represented by a matrix. The model further introduces a pairwise learning method to maximize the credibility difference between real and unreal information. The DBRM model is built as follows:
1) each user is represented by a vector built from the user's own features (such as gender, number of followees, and number of followers), which indicates the user's credibility;
2) the model incorporates a matrix representation of the time interval between the start of the unreal information's propagation and the posting of a microblog, to capture the dynamic characteristics of user behavior; user behaviors (such as posting and forwarding) are represented by implicit operation matrices that distinguish the behavior types, and user comments are represented by matrices that indicate how strongly the comments question the information;
3) the expression of a piece of information in the propagation process is generated as the product of the above expressions;
4) combining all the dynamic behavior expressions from 3) gives the credibility expression of the event;
5) a pairwise learning method maximizes the difference between real and unreal information, so that the credibility of information on social media can be detected.
In experiments on the Sina Weibo data set, the model predicts more accurately than other existing models.
To better explain the role of the DBRM model in unreal information detection and to verify the effect of the invention, an experiment on a Sina Weibo data set is described as an example. The experimental data set was divided into a 60% training set, a 30% test set, and a 10% validation set.
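A 60/30/10 split of this kind can be sketched as follows. The patent does not say how events were assigned to the three sets, so the random shuffle, the fixed seed, and the function name `split_events` are assumptions made for illustration; the count of 936 events comes from Table 1.

```python
import random


def split_events(event_ids, seed=0):
    """Shuffle event ids and split them 60/30/10 into train, test, and validation."""
    ids = list(event_ids)
    random.Random(seed).shuffle(ids)  # fixed seed is an assumption, for reproducibility
    n_train = int(0.6 * len(ids))
    n_test = int(0.3 * len(ids))
    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    val = ids[n_train + n_test:]  # the remainder goes to validation
    return train, test, val


train, test, val = split_events(range(936))
```

Splitting by event rather than by microblog keeps all microblogs of one event in the same set, which seems the natural choice when the prediction target is the event.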
The experiment uses four evaluation metrics: accuracy, precision, recall, and F1. Precision and recall are computed separately for unreal information and for real information, to show the model's ability to detect each kind. The larger the four metrics, the better the model performs.
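As a reminder of how these four metrics are computed per class, here is a minimal sketch using the standard definitions. The function name `binary_metrics` and the toy labels are hypothetical and are not data from the experiment.

```python
def binary_metrics(y_true, y_pred, positive):
    """Return accuracy, precision, recall, and F1, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1


# Toy labels: 1 = real information, 0 = unreal information (rumor).
y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 0]
rumor_scores = binary_metrics(y_true, y_pred, positive=0)
real_scores = binary_metrics(y_true, y_pred, positive=1)
```

Calling the function once with each class as `positive` mirrors the per-class reporting described above.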
The specific experimental steps on the microblog data set are as follows:
In step S1, the traditional user information and behavior information are modeled first. The traditional user information comprises user features and user credibility. The user features include the user's gender, the number of users followed, and the number of followers. The larger the value of the user credibility, the more credible the user. The behavior information comprises the behavior type, such as whether a microblog is an original post or a forward; an original post matters more for credibility detection than a forward. Highly credible microblogs are usually posted by highly credible users, whereas unreal information is often posted by users with low credibility and then forwarded by users with high credibility.
For the j-th microblog related to the i-th event $e_i$, the user and the user's behavior are jointly represented by the expression $B_{b_j^{e_i}} U_{u_j^{e_i}}$, where $U_{u_j^{e_i}} \in \mathbb{R}^d$ is the vector representation of the user $u_j^{e_i}$ of the j-th microblog in the i-th event, and $\mathbb{R}^d$ denotes the d-dimensional real space. $B_{b_j^{e_i}}$ is the implicit matrix representation of the user behavior $b_j^{e_i}$; each of its elements is updated continuously during training, and d is the matrix dimension. This expression yields the features of the user under a particular behavior.
User comments also play an important role in detecting information credibility. Users can judge information by common sense and experience, and unreal information tends to attract more questioning comments. Incorporating all comments on the microblog gives $C_{b_j^{e_i}} B_{b_j^{e_i}} U_{u_j^{e_i}}$, where $C_{b_j^{e_i}}$ is the matrix expression of the comments, which incorporates the attitude of the comment authors.
Likewise, combining user behavior with the time interval between the start of a rumor's propagation and a particular microblog makes detection more reliable. Adding the time interval $t_j^{e_i}$ between the start of event $e_i$ and the corresponding microblog gives the dynamic behavior expression of the microblog: $R_j^{e_i} = T_{t_j^{e_i}} C_{b_j^{e_i}} B_{b_j^{e_i}} U_{u_j^{e_i}}$, where $T_{t_j^{e_i}}$ is the matrix expression of the time interval $t_j^{e_i}$. $R_j^{e_i}$ represents the combined effect of the four factors on the microblog.
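The dynamic behavior expression is, per claim 3, a product of the time matrix, the comment matrix, the behavior matrix, and the user vector. A minimal sketch of that product in plain Python follows. The dimension `d = 4`, the random values, and the helper names are assumptions chosen only for illustration; in the model these quantities are learned.

```python
import random


def matvec(matrix, vector):
    """Multiply a d x d matrix (list of rows) by a length-d vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def microblog_representation(T, C, B, U):
    """Dynamic behavior expression R = T C B U, evaluated right to left."""
    user_behavior = matvec(B, U)              # user under a specific behavior
    with_comments = matvec(C, user_behavior)  # add the comment attitude
    return matvec(T, with_comments)           # add the time-interval effect


def random_matrix(d, rng):
    return [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(d)]


d = 4  # assumed embedding dimension, for illustration only
rng = random.Random(0)
U = [rng.uniform(-1, 1) for _ in range(d)]
B, C, T = (random_matrix(d, rng) for _ in range(3))
R = microblog_representation(T, C, B, U)

# With identity matrices the user vector passes through unchanged.
identity = [[float(i == j) for j in range(d)] for i in range(d)]
```

Evaluating right to left keeps every intermediate result a vector, so the cost is three matrix-vector products rather than two matrix-matrix products.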
In step S2, an event is composed of different microblogs, and combining the microblog dynamic behavior expressions of step S1 gives the event credibility detection expression. Let event $e_i$ comprise $n$ microblogs, which together form the microblog set of the event. Averaging their expressions gives the expression of event $e_i$: $R^{e_i} = \frac{1}{n} \sum_{j=1}^{n} R_j^{e_i}$. Whether event $e_i$ is unreal information is predicted with the expression $y_{e_i} = W^T R^{e_i}$, where $W \in \mathbb{R}^d$ is the linear weight of the prediction function and $y_{e_i}$ is the credibility of event $e_i$. The larger the value of $y_{e_i}$, the higher the credibility of event $e_i$.
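The averaging and the linear prediction of step S2 can be sketched in a few lines. The function names and the toy numbers below are hypothetical and serve only to show the arithmetic; the prediction follows the expression in claim 4.

```python
def event_representation(microblog_reprs):
    """Average the microblog representations of one event, component by component."""
    n = len(microblog_reprs)
    return [sum(column) / n for column in zip(*microblog_reprs)]


def credibility(W, event_repr):
    """Linear prediction y = W^T R; a larger value means a more credible event."""
    return sum(w * r for w, r in zip(W, event_repr))


# Toy values, chosen by hand for illustration only.
reprs = [[1.0, 0.0, 2.0], [3.0, 2.0, 0.0]]
W = [0.5, -1.0, 1.0]
R_event = event_representation(reprs)
y = credibility(W, R_event)
```

Because the score is linear in the event expression, an event's credibility is simply the mean of the scores its individual microblogs would receive.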
In step S3, the model uses the time feature matrix to capture users' dynamic behavior features during information propagation. Learning a separate matrix for every moment of a continuous time period would make the data sparse, so the continuous period is divided into different time intervals. Because the dynamic behavior follows a power-law distribution, dividing time into equal intervals is not reasonable. The model divides the time intervals according to $\log_2$ (the base-2 logarithm) and learns matrices only for the upper and lower boundaries of each interval. For a moment inside an interval, the transition matrix is computed by nonlinear interpolation: the time feature matrix $T_t$ for a moment $t$ is obtained from the matrices learned at the upper and lower boundaries of the interval that contains $t$.
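The log2 partition and the boundary interpolation can be illustrated with a short sketch. The interpolation formula itself is not recoverable from the text, so the rule below, which weights the two boundary matrices linearly in log2(t), is only one possible choice and is an assumption, as are the function names and the time unit.

```python
import math


def interval_bounds(t):
    """Return the log2 interval [2**k, 2**(k + 1)) that contains elapsed time t >= 1."""
    k = math.floor(math.log2(t))
    return 2 ** k, 2 ** (k + 1)


def time_matrix(t, boundary_matrices):
    """Interpolate the time matrix at t from the matrices learned at the interval bounds.

    boundary_matrices maps a boundary time (a power of two) to its learned matrix.
    The weighting is linear in log2(t); this is an assumed interpolation rule.
    """
    lower, upper = interval_bounds(t)
    weight = math.log2(t) - math.log2(lower)  # 0 at the lower bound, near 1 at the upper
    T_low, T_up = boundary_matrices[lower], boundary_matrices[upper]
    return [
        [(1 - weight) * a + weight * b for a, b in zip(row_low, row_up)]
        for row_low, row_up in zip(T_low, T_up)
    ]


# Toy 2 x 2 boundary matrices at t = 4 and t = 8 (the time unit is an assumption).
learned = {4: [[0.0, 0.0], [0.0, 0.0]], 8: [[1.0, 2.0], [3.0, 4.0]]}
T_at_4 = time_matrix(4, learned)
T_mid = time_matrix(4 * math.sqrt(2), learned)  # halfway between 4 and 8 on the log2 axis
```

Only the matrices at powers of two need to be stored and trained, which is what keeps the number of time parameters small even for events that run for a long time.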
In step S4, the user representation is generated. The model obtains the features and credibility of a user by learning latent vector expressions. Since each user has only about two behaviors on average, a latent expression cannot be learned for each user individually; instead, the model learns expressions of user features.
The features of a user may include, for example, gender, the number of users followed, the number of followers, the number of microblogs, and whether the user is verified. For user u, the feature vector is $F_u \in \mathbb{R}^f$. Gender and verification status are each encoded with two bits of information. For gender, the pattern whose first bit is 1 indicates male, and the other pattern indicates female. For verification, one pattern indicates that the user has been verified and the other that the user has not. The number of users followed, the number of followers, and the number of microblogs cover too wide a range for each value to be represented individually, so these counts are divided into discrete intervals according to $\log_{10}$. If user u has $v_u$ followers, the corresponding feature is derived from the boundaries of the interval that contains $v_u$. The features for the number of users followed and the number of microblogs are constructed in the same way. Based on the feature vector $F_u$, the user expression is $U_u = S F_u$, where $S \in \mathbb{R}^{d \times f}$ is the feature-expression hidden matrix, which is learned continuously during training.
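A minimal sketch of the feature construction and of $U_u = S F_u$ follows. The exact encoding of the count features is not recoverable from the text, so the one-hot log10 bucket, the number of buckets, the bit pattern for verification, and all function names are assumptions; only the rule that the first gender bit marks a male user comes from the description.

```python
import math


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def log10_bucket(count, n_buckets):
    """Index of the log10 interval holding `count`; counts below 1 fall into bucket 0."""
    if count < 1:
        return 0
    return min(int(math.log10(count)), n_buckets - 1)


def user_features(is_male, is_verified, followers, n_buckets=8):
    """Build F_u: 2 gender bits, 2 verification bits, then a one-hot follower bucket."""
    gender = [1.0, 0.0] if is_male else [0.0, 1.0]
    verified = [1.0, 0.0] if is_verified else [0.0, 1.0]  # assumed bit order
    return gender + verified + one_hot(log10_bucket(followers, n_buckets), n_buckets)


def user_expression(S, F_u):
    """U_u = S F_u, with S a d x f matrix given as a list of rows."""
    return [sum(s * x for s, x in zip(row, F_u)) for row in S]


F_u = user_features(is_male=True, is_verified=False, followers=3500)
# Toy 2 x 12 matrix S; in the model S is learned during training.
S = [[0.1 * (j + 1) for j in range(len(F_u))], [1.0] * len(F_u)]
U_u = user_expression(S, F_u)
```

Because the expression depends on the user only through $F_u$, users who were never seen in training still receive a representation as long as their features are known.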
In step S5, a pairwise learning method is used to estimate the model parameters. Because unreal information is hard to collect in quantities sufficient for training, the pairwise learning method is used to enlarge the training set. Assuming that the credibility of real information is higher than that of unreal information, the difference between the two is maximized with the following expression:
$p(e_n > e_r) = g(y_{e_n} - y_{e_r})$
where $y_{e_n}$ and $y_{e_r}$ are the credibility of real information $e_n$ and unreal information $e_r$ respectively, and $g(x)$ is the nonlinear function $g(x) = 1/(1 + e^{-x})$. Taking the negative log-likelihood gives the objective function:
$J = \sum_{\{e_n, e_r\} \in E,\ l_{e_n} = 1,\ l_{e_r} = 0} \ln\left(1 + e^{-W^T (R^{e_n} - R^{e_r})}\right) + \frac{\lambda}{2} \|\Theta\|^2$
where E is the set of all events, $l_{e_n}$ and $l_{e_r}$ are the labels of real and unreal information respectively, $\Theta = \{U, B, C, T, W\}$ denotes all the parameters to be estimated, and $\lambda$ is the parameter that controls the strength of regularization. Differentiating J gives
$\frac{\partial J}{\partial W} = -\sum_{\{e_n, e_r\}} \left(1 - g(y_{e_n} - y_{e_r})\right) (R^{e_n} - R^{e_r}) + \lambda W$
and each pair $\{e_n, e_r\}$ contributes $-\left(1 - g(y_{e_n} - y_{e_r})\right) W$ to $\partial J / \partial R^{e_n}$ and $+\left(1 - g(y_{e_n} - y_{e_r})\right) W$ to $\partial J / \partial R^{e_r}$.
From the derivative with respect to an event expression, the derivative with respect to the expression of each microblog in the event follows by the chain rule through the average of step S2, and from it the gradients of the corresponding parameters.
After all gradients are computed, the model parameters are updated by stochastic gradient descent. The above process is repeated until the model converges.
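The pairwise objective of claim 6 and a stochastic gradient step can be sketched as follows. To stay short, the sketch holds the event expressions fixed and updates only the weight vector W, whereas the full model also updates the user, behavior, comment, and time parameters; that simplification, the toy data, the learning rate, and the function names are all assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pairwise_loss(W, pairs, lam):
    """J restricted to W: sum of ln(1 + exp(-W^T (R_n - R_r))) plus (lam / 2) * ||W||^2."""
    loss = 0.5 * lam * dot(W, W)
    for R_n, R_r in pairs:
        diff = [a - b for a, b in zip(R_n, R_r)]
        loss += math.log(1.0 + math.exp(-dot(W, diff)))
    return loss


def sgd_step(W, R_n, R_r, lam, lr):
    """One stochastic gradient step on W for a single (real, unreal) pair."""
    diff = [a - b for a, b in zip(R_n, R_r)]
    coeff = 1.0 - sigmoid(dot(W, diff))  # equals exp(-x) / (1 + exp(-x))
    grad = [-coeff * d_k + lam * w_k for d_k, w_k in zip(diff, W)]
    return [w_k - lr * g_k for w_k, g_k in zip(W, grad)]


# Toy event expressions as (real, unreal) pairs, fixed here for illustration.
pairs = [([1.0, 0.5], [0.2, 0.9]), ([0.8, 0.1], [0.3, 0.4])]
W = [0.0, 0.0]
loss_before = pairwise_loss(W, pairs, lam=0.01)
for _ in range(200):
    for R_n, R_r in pairs:
        W = sgd_step(W, R_n, R_r, lam=0.01, lr=0.1)
loss_after = pairwise_loss(W, pairs, lam=0.01)
```

Training on pairs means every combination of one real and one unreal event is a training example, which is how the pairwise formulation enlarges a small labeled set.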
The precision-recall (PR) curves of the compared methods for rumors and non-rumors are shown in FIGS. 3(a) and 3(b), respectively.
Table 1 below gives the statistics of the data set.
TABLE 1
Item                    Number
Events                  936
Rumors                  500
Real information        436
Microblogs              630363
Original microblogs     98429
Forwarded microblogs    532236
Users                   321246
Table 2 below compares this model experimentally with the current state-of-the-art models:
TABLE 2
The above-described embodiments further explain the objects, technical solutions, and effects of the present invention in detail. It should be understood that the above-mentioned embodiments are merely exemplary of the present invention, and not restrictive, and that any modifications, equivalents, improvements, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1.一种基于动态表达学习的不实信息检测方法,包括以下步骤:1. A false information detection method based on dynamic expression learning, comprising the following steps: 获取待检测信息;Obtain the information to be detected; 利用预先建立的检测模型对所述待检测信息进行检测;Detecting the information to be detected by using a pre-established detection model; 输出检测结果;output test results; 其中,检测模型如下建立:Among them, the detection model is established as follows: 步骤S1,首先建模联合表示用户信息和该用户行为信息的某一事件的动态行为表达式;用户信息包含用户的特征和用户可信度,行为信息包含行为类型;Step S1, first modeling a dynamic behavior expression that jointly represents user information and a certain event of the user behavior information; user information includes user characteristics and user credibility, and behavior information includes behavior types; 步骤S2,一个事件由不同信息组成,结合步骤S1中的所述动态行为表达式,最终得出事件可信度检测表达式;In step S2, an event is composed of different information, combined with the dynamic behavior expression in step S1, finally obtain the event credibility detection expression; 步骤S3,运用时间特征矩阵来取得在信息传播过程中用户动态行为特征连Step S3, using the time feature matrix to obtain the user dynamic behavior feature connection during the information dissemination process. 步骤S4,生成用户特征表达;Step S4, generating a user feature expression; 步骤S5,利用配对学习法估算检测模型参数。Step S5, estimating the parameters of the detection model by using the paired learning method. 2.根据权利要求1所述的方法,其特征在于,所述信息为微博信息;所述用户信息包括用户的特征和用户可信度,所述用户的特征包括用户性别、微博关注人数和被关注的人数;用户可信度的数值越大,表示用户越可信;行为信息包括原发微博或转发微博。2. The method according to claim 1, wherein the information is microblog information; the user information includes user characteristics and user credibility, and the user characteristics include user gender, number of microblog followers and the number of followers; the larger the value of user credibility, the more credible the user; behavior information includes the original microblog or forwarded microblog. 3.根据权利要求1所述的方法,其特征在于,所述动态行为表达式如下表示:3. 
The method according to claim 1, wherein the dynamic behavior expression is as follows: RR jj ee ii == TT tt jj ee ii CC bb jj ee ii BB bb jj ee ii Uu uu jj ee ii ,, 其中,代表第i个事件ei中的第j条微博中的用户的向量表示;Rd表示d维实数空间;是用户行为的隐含矩阵表示;是评论的矩阵表达,是时间间隔的矩阵表达。in, Represents the user in the j-th microblog in the i-th event e i The vector representation; R d represents the d-dimensional real number space; is user behavior The implicit matrix representation of ; is a comment The matrix expression of is the time interval matrix expression. 4.根据权利要求1所述的方法,其特征在于,事件可信度检测表达式如下所示:4. The method according to claim 1, wherein the event reliability detection expression is as follows: ythe y ee ii == WW TT RR ee ii 其中,表示事件ei的可信度,W∈Rd是预测函数的线性权重,表示事件ei的表达式。in, Represents the credibility of event e i , W∈R d is the linear weight of the prediction function, An expression representing an event e i . 5.根据权利要求1所述的方法,其特征在于,在步骤S3中,将连续时间段分割成不同的时间间隔,根据log2来划分时间间隔,并且只学习上边界和下边界相对应的时间间隔,而对于在一个时间间隔的某一个时刻,其时间特征矩阵(一种转移矩阵)通过非线性插值法计算得出。5. The method according to claim 1, characterized in that, in step S3, the continuous time period is divided into different time intervals, the time interval is divided according to log 2 , and only the upper boundary and the lower boundary corresponding to the learning time interval, and for a certain moment in a time interval, its time feature matrix (a transition matrix) is calculated by nonlinear interpolation. 6.根据权利要求1所述的方法,其特征在于,步骤S5中通过如下表达式区别真实信息和不实信息的差别:6. 
The method according to claim 1, characterized in that in step S5, the difference between real information and false information is distinguished by the following expression: pp (( ee nno >> ee rr )) == gg (( ythe y ee nno -- ythe y ee rr )) 其中分别表示真实信息en和不实信息er的可信度,g(x)是非线性方程,g(x)=1/(1+e-x);in Respectively represent the credibility of real information e n and false information e r , g(x) is a nonlinear equation, g(x)=1/(1+e -x ); 检测模型的目标函数如下表示:The objective function of the detection model is expressed as follows: JJ == ΣΣ {{ ee nno ,, ee rr }} ∈∈ EE. ,, ll ee nno == 11 ,, ll ee rr == 00 lnln (( 11 ++ ee -- WW TT (( RR ee nno -- RR ee rr )) )) ++ λλ 22 || || ΘΘ || || 22 ,, 其中,E表示所有事件的集合,len、ler分别表示真实信息和不实信息的标签,Θ代表所有被计算的检测模型参数,分别表示真实信息en和不实信息er的动态行为表达,W∈Rd是预测函数的线性权重;λ是控制正则化大小的参数。Among them, E represents the set of all events, l en and l er represent the labels of real information and false information respectively, Θ represents all the calculated detection model parameters, with Represent the dynamic behavior expressions of real information e n and false information e r respectively, W∈R d is the linear weight of the prediction function; λ is a parameter controlling the size of regularization. 7.根据权利要求1所述的方法,其特征在于,检测模型概括了表征信息关键特性的多种特征,即用户信息、行为信息、时间信息和评论信息,并且建模了这些特征间的高阶交互表达。7. The method according to claim 1, characterized in that the detection model summarizes a variety of features that characterize the key characteristics of information, namely user information, behavior information, time information and comment information, and models the high correlation between these features. interactive expression. 8.根据权利要求1所述的方法,其特征在于,检测模型揭示了信息数量随时间的幂律分布规律,并且依据此规律采用log2将连续时间段分割成不同的时间间隔,不仅保证每一个时间间隔内有相同数目的信息数目,而且能够从整体上保证所有事件共有一个相似的时间尺度。8. 
The method according to claim 1, wherein the detection model reveals the power-law distribution of the amount of information over time and, according to this law, uses $\log_2$ to divide the continuous time period into different time intervals, which not only ensures that each time interval contains the same number of messages but also ensures that all events share a similar time scale overall.

9. A false information detection device based on dynamic expression learning, comprising:

an acquisition module configured to acquire information to be detected;

a detection module configured to detect the information to be detected by using a pre-established detection model;

an output module configured to output a detection result;

wherein the detection model is established as follows:

first, modeling a dynamic behavior expression of an event that jointly represents user information and the behavior information of that user, wherein the user information comprises user features and user credibility, and the behavior information comprises the behavior type;

an event is composed of different pieces of information, and the event credibility detection expression is obtained by combining the dynamic behavior expression in step S1;

using the time feature matrix to capture the dynamic behavior characteristics of users during information dissemination;

generating a user feature expression;

estimating the detection model parameters by pairwise learning.
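The scoring chain and training objective described in claims 3 to 6 and 8 can be illustrated with a short NumPy sketch. This is only an illustration under stated assumptions, not the applicant's implementation: all function names are hypothetical, the matrices T, C and B are assumed to be d×d, elapsed time is assumed to be measured in seconds with a +1 offset before taking log2, and the event expression is assumed to be the mean of its message expressions (the claims do not state how messages are aggregated). The nonlinear interpolation of claim 5 is not shown.

```python
import math
import numpy as np


def log2_interval_index(elapsed_seconds: float) -> int:
    """Index of the log2-spaced time interval containing the elapsed time.

    Assumption: interval k covers [2**k - 1, 2**(k + 1) - 1) seconds,
    so that elapsed time 0 falls into interval 0.
    """
    return int(math.floor(math.log2(elapsed_seconds + 1.0)))


def message_representation(T, C, B, u):
    """Dynamic behavior expression of one message: R_j = T C B u (claim 3)."""
    return T @ C @ B @ u


def event_representation(message_reps):
    """Event expression; averaging the messages is an assumption of this sketch."""
    return np.mean(np.stack(message_reps), axis=0)


def event_score(W, R_event):
    """Event credibility y = W^T R (claim 4)."""
    return float(W @ R_event)


def sigmoid(x):
    """g(x) = 1 / (1 + e^{-x}) (claim 6)."""
    return 1.0 / (1.0 + np.exp(-x))


def pairwise_objective(W, R_real, R_false, params, lam):
    """J = sum over (real, false) pairs of ln(1 + e^{-W^T (R_n - R_r)})
    + (lam / 2) * ||Theta||^2 (claim 6).

    R_real has shape (n_real, d), R_false has shape (n_false, d);
    params is the list of arrays that make up Theta.
    """
    diffs = R_real[:, None, :] - R_false[None, :, :]  # (n_real, n_false, d)
    margins = diffs @ W                               # (n_real, n_false)
    data_term = np.logaddexp(0.0, -margins).sum()     # stable ln(1 + e^{-m})
    reg_term = 0.5 * lam * sum(float(np.sum(p ** 2)) for p in params)
    return float(data_term + reg_term)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 4
    W = rng.normal(size=d)
    # Two messages of one event, each with its own (hypothetical) T, C, B, u.
    reps = [
        message_representation(
            rng.normal(size=(d, d)), rng.normal(size=(d, d)),
            rng.normal(size=(d, d)), rng.normal(size=d),
        )
        for _ in range(2)
    ]
    R_event = event_representation(reps)
    print("event score:", event_score(W, R_event))
    R_real = rng.normal(size=(3, d))
    R_false = rng.normal(size=(2, d))
    print("objective:", pairwise_objective(W, R_real, R_false, [W], lam=0.01))
```

Minimizing the data term is equivalent to maximizing the log-probability ln p(e_n > e_r) over all real/false pairs, since -ln g(x) = ln(1 + e^{-x}); `np.logaddexp` is used here only to evaluate that term without overflow.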
CN201710085225.5A 2017-02-16 2017-02-16 Unreal information detecting method and device based on Expression study Pending CN106910013A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710085225.5A CN106910013A (en) 2017-02-16 2017-02-16 Unreal information detecting method and device based on Expression study

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710085225.5A CN106910013A (en) 2017-02-16 2017-02-16 Unreal information detecting method and device based on Expression study

Publications (1)

Publication Number Publication Date
CN106910013A true CN106910013A (en) 2017-06-30

Family

ID=59207531

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710085225.5A Pending CN106910013A (en) 2017-02-16 2017-02-16 Unreal information detecting method and device based on Expression study

Country Status (1)

Country Link
CN (1) CN106910013A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108536776A (en) * 2018-03-28 2018-09-14 广州厚云信息科技有限公司 Unification user malicious act detection method and system in a kind of social networks
WO2020011068A1 (en) * 2018-07-10 2020-01-16 第四范式(北京)技术有限公司 Method and system for executing machine learning process
CN111881764A (en) * 2020-07-01 2020-11-03 深圳力维智联技术有限公司 A target detection method, device, electronic device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102117325A (en) * 2011-02-24 2011-07-06 清华大学 Method for predicting dynamic social network user behaviors
US20130298233A1 (en) * 2011-01-05 2013-11-07 Toshiba Solutions Corporation Web page falsification detection apparatus and storage medium
CN104134159A (en) * 2014-08-04 2014-11-05 中国科学院软件研究所 Method for predicting maximum information spreading range on basis of random model

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130298233A1 (en) * 2011-01-05 2013-11-07 Toshiba Solutions Corporation Web page falsification detection apparatus and storage medium
CN102117325A (en) * 2011-02-24 2011-07-06 清华大学 Method for predicting dynamic social network user behaviors
CN104134159A (en) * 2014-08-04 2014-11-05 中国科学院软件研究所 Method for predicting maximum information spreading range on basis of random model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
QIANG LIU et al.: "ICE: Information Credibility Evaluation on Social Media via Representation Learning", 《HTTPS://ARXIV.ORG/PDF/1609.09226.PDF》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108536776A (en) * 2018-03-28 2018-09-14 广州厚云信息科技有限公司 Unification user malicious act detection method and system in a kind of social networks
WO2020011068A1 (en) * 2018-07-10 2020-01-16 第四范式(北京)技术有限公司 Method and system for executing machine learning process
CN111881764A (en) * 2020-07-01 2020-11-03 深圳力维智联技术有限公司 A target detection method, device, electronic device and storage medium
CN111881764B (en) * 2020-07-01 2023-11-03 深圳力维智联技术有限公司 A target detection method, device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
Bourigault et al. Representation learning for information diffusion through social networks: an embedded cascade model
Wang et al. Attribute‐level and pattern‐level classification consistency and accuracy indices for cognitive diagnostic assessment
CN103824115B (en) Towards the inter-entity relation estimating method and system of open network knowledge base
WO2022179384A1 (en) Social group division method and division system, and related apparatuses
CN114330510A (en) Model training method and device, electronic equipment and storage medium
WO2022188773A1 (en) Text classification method and apparatus, device, computer-readable storage medium, and computer program product
CN103077247B (en) The method for building up of friends transmission tree in a kind of social networks
CN110336700B (en) Microblog popularity prediction method based on time and user forwarding sequence
CN103150374A (en) Method and system for identifying abnormal microblog users
CN104915392A (en) Micro-blog transmitting behavior predicting method and device
US20240202839A1 (en) Method of analyzing social influence between internet forums and apparatus for the same
Yang [Retracted] Application of LSTM Neural Network Technology Embedded in English Intelligent Translation
CN106910013A (en) Unreal information detecting method and device based on Expression study
CN116522013B (en) Public opinion analysis method and system based on social network platform
Li et al. Parameter estimation on a stochastic SIR model with media coverage
Ali et al. Bayesian estimation of the mixture of generalized exponential distribution: a versatile lifetime model in industrial processes
CN106021289A (en) Method for establishing probability matrix decomposition model based on node user
CN105389297A (en) Text similarity processing method
CN106844765B (en) Significant information detection method and device based on convolutional neural network
CN119357830B (en) A robust rumor detection method on social media
CN110046657A (en) A kind of social safety figure painting image space method based on multiple view study
Cuomo et al. A biologically inspired model for describing the user behaviors in a Cultural Heritage environment.
CN112052995A (en) Social network user influence prediction method based on fusion emotional tendency theme
Yu et al. Prediction of users retweet times in social network
Neammanee et al. Considering similarity and the rating conversion of neighbors on neural collaborative filtering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20170630