CN112651237A - User portrait establishing method and device based on user emotion standpoint and user portrait visualization method - Google Patents

Publication number
CN112651237A
Authority
CN
China
Prior art keywords
emotion
words
probability
emotional
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910961379.5A
Other languages
Chinese (zh)
Other versions
CN112651237B (en
Inventor
刘垚
邹更
任钰欣
黄梓杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Yujianwan Technology Co ltd
Original Assignee
Wuhan Yujianwan Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Yujianwan Technology Co ltd
Priority to CN201910961379.5A
Publication of CN112651237A
Application granted
Publication of CN112651237B
Legal status: Active
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a user portrait establishing method and device based on a user emotion standpoint, and a user portrait visualization method. The user portrait establishing method comprises the following steps: acquiring independent short text corpora from user historical data; classifying the acquired short text corpora according to emotional tendency, and constructing an emotion word bank according to the distribution of words in the classification result; constructing a standpoint trigger word library according to the application scenario; calculating the emotion probability of each corpus block to be analyzed; calculating the emotion probability of each standpoint trigger word for a single user according to the emotion probabilities of the corpus blocks to be analyzed; calculating the average emotion probability of each standpoint trigger word in the community according to the emotion probabilities of the standpoint trigger words for the single users, and ranking the standpoint trigger words by average emotion probability; and constructing the user portrait according to the ranking of the standpoint trigger words in the community and the single user's emotion probability for each standpoint trigger word. The method can improve the accuracy and intuitiveness of user emotion analysis.

Description

User portrait establishing method and device based on user emotion standpoint and user portrait visualization method
Technical Field
The invention relates to the technical field of data analysis, in particular to a user portrait establishing method and device based on a user emotion standpoint and a user portrait visualization method.
Background
The behavior of a user on a network platform is often used to describe the user's characteristics. Such a description is called a user portrait, and the emphasis in constructing it differs by purpose. For example, an e-commerce platform focuses on the user's spending power and purchasing preferences, whereas a social platform can build the user portrait from the user's interests and social relations. Different user portraits help a platform classify its users and provide better customized services.
In the process of implementing the present invention, the inventor of the present application found that the prior-art methods have at least the following technical problem:
in the prior art, the emotion word banks used for emotion analysis of a user contain a large number of words that are used neither online nor in everyday speech, and they lack the expressions currently common on the Internet, which limits the accuracy and practicality of emotion analysis based on the existing emotion word banks.
Therefore, the prior-art methods have the technical problem that the analysis result is not accurate enough.
Disclosure of Invention
In view of the above, the present invention provides a user portrait establishing method and device based on a user emotion standpoint, and a user portrait visualization method, so as to solve or at least partially solve the technical problem that the results of the prior-art methods are not accurate enough.
In a first aspect, the invention provides a user portrait establishing method based on a user emotion standpoint, which comprises the following steps:
acquiring independent short text corpora from user historical data;
classifying the acquired short text corpora according to emotional tendency, constructing an emotion word bank according to the distribution of words in the classification result, and calculating the original emotion probability of each emotion word in the emotion word bank, wherein the emotion word bank comprises positive emotion words and negative emotion words;
constructing a standpoint trigger word library according to the application scenario, wherein the standpoint trigger word library comprises standpoint trigger words capable of provoking a standpoint or an emotional reaction from a user;
extracting the standpoint trigger words contained in the text information published by a user, forming corpus blocks to be analyzed around the extracted standpoint trigger words, and calculating the emotion probability of each corpus block to be analyzed according to the emotion probabilities of the emotion words in the block, the number of degree adverbs and the number of negation words;
calculating the emotion probability of each standpoint trigger word for a single user according to the emotion probabilities of the corpus blocks to be analyzed;
calculating the average emotion probability of each standpoint trigger word in the community formed by all the users according to the emotion probabilities of the standpoint trigger words for the single users, and ranking the standpoint trigger words by average emotion probability;
and constructing the user portrait according to the ranking of the standpoint trigger words in the community and the single user's emotion probability for each standpoint trigger word.
In one embodiment, classifying the acquired short text corpora according to emotional tendency, constructing an emotion word bank according to the distribution of words in the classification result, and calculating the original emotion probabilities of the emotion words in the emotion word bank comprises:
classifying the acquired short text corpora according to emotional tendency into positive corpora, neutral corpora and negative corpora;
performing word segmentation on the classified corpora and removing duplicates to obtain a corpus word bank;
counting the distribution of each word of the corpus word bank over the positive, neutral and negative corpora;
screening out, by means of a chi-square test on the word distributions, the words associated with the positive or negative tendency as candidate marker words of emotional tendency;
screening the candidate marker words, deleting the words that do not match the corresponding emotional tendency, and constructing the emotion word bank from the remaining words;
and for each positive emotion word, searching all original corpora containing the word and taking the average of their positive emotion probabilities as the original emotion probability of the word; for each negative emotion word, taking 1 minus the average of the positive emotion probabilities of its original corpora as the original emotion probability of the word.
In one embodiment, forming the corpus block to be analyzed around the extracted standpoint trigger word and then calculating the emotion probability of the corpus block according to the emotion probabilities of the emotion words, the number of degree adverbs and the number of negation words in the block comprises:
combining the sentence in which the extracted standpoint trigger word occurs with the n sentences before it and the n sentences after it into a corpus block to be analyzed, wherein n is a positive integer greater than or equal to 1;
searching for the positive emotion words and negative emotion words that appear in the corpus block to be analyzed, and acquiring the original emotion probability of each of them;
determining a negation coefficient and a degree weight for each positive or negative emotion word according to the number of negation words and degree adverbs within a preset range of the word;
calculating the corrected emotion probability of each emotion word according to its original emotion probability, negation coefficient and degree weight;
and calculating the emotion probability of the corpus block to be analyzed according to the corrected emotion probabilities of the emotion words, the number of positive emotion words and the number of negative emotion words.
In one embodiment, calculating the emotion probability of a standpoint trigger word for a single user according to the emotion probabilities of the corpus blocks to be analyzed comprises:
when the standpoint trigger word does not appear in the data published by the user, the emotion probability of the standpoint trigger word is null;
when the standpoint trigger word appears once in the data published by the user, taking the emotion probability of the corpus block in which it appears as the emotion probability of the standpoint trigger word;
and when the standpoint trigger word appears two or more times in the data published by the user, taking the average of the emotion probabilities of all corpus blocks in which it appears as the emotion probability of the standpoint trigger word.
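The three cases above can be sketched in Python as follows. This is only an illustration: the function name `user_trigger_probability` and the use of `None` to represent the null value are assumptions, not part of the original disclosure.

```python
from statistics import mean


def user_trigger_probability(block_probs):
    """Emotion probability of one standpoint trigger word for one user.

    block_probs: emotion probabilities of the corpus blocks in which the
    trigger word appears in this user's published data (hypothetical input).
    """
    if not block_probs:
        # The word never appears: the probability is null.
        return None
    # One occurrence: the mean of a single value is that value.
    # Two or more occurrences: the average over all blocks.
    return mean(block_probs)
```

A single occurrence needs no separate branch, because averaging one value returns it unchanged.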
In one embodiment, after calculating the emotion probability of each standpoint trigger word for a single user according to the emotion probabilities of the corpus blocks to be analyzed, the method further comprises:
normalizing the emotion probabilities of the standpoint trigger words for the single user to obtain a corrected emotion probability for each standpoint trigger word of that user.
In one embodiment, calculating the average emotion probability of the standpoint trigger words in the community formed by all the users according to the emotion probabilities of the standpoint trigger words for the single users comprises:
averaging, for each standpoint trigger word, the corrected emotion probabilities of that word over the users to obtain the average emotion probability of the word in the community;
and ranking the standpoint trigger words by their average emotion probability.
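The averaging and ranking can be sketched as below. The data layout (a dictionary per user mapping each standpoint trigger word to a probability or `None`), the function name `community_ranking`, the skipping of null values and the descending sort order are all assumptions made for illustration.

```python
def community_ranking(user_probs):
    """Average each trigger word's probability over users and rank the words.

    user_probs: {user_id: {trigger_word: probability or None}} (hypothetical layout).
    Returns a list of (trigger_word, average_probability), highest average first.
    """
    totals, counts = {}, {}
    for probs in user_probs.values():
        for word, p in probs.items():
            if p is None:
                # Assumption: a null probability (word never mentioned) is skipped.
                continue
            totals[word] = totals.get(word, 0.0) + p
            counts[word] = counts.get(word, 0) + 1
    averages = {word: totals[word] / counts[word] for word in totals}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


users = {
    "u1": {"movie A": 0.9, "song B": 0.2},
    "u2": {"movie A": 0.7, "song B": None},
}
ranking = community_ranking(users)
```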
Based on the same inventive concept, a second aspect of the present invention provides a user portrait establishing device based on a user emotion standpoint, comprising:
a corpus acquisition module, used for acquiring independent short text corpora from user historical data;
an emotion word bank construction module, used for classifying the acquired short text corpora according to emotional tendency, constructing an emotion word bank according to the distribution of words in the classification result, and calculating the original emotion probability of each emotion word in the emotion word bank, wherein the emotion word bank comprises positive emotion words and negative emotion words;
a standpoint trigger word library construction module, used for constructing a standpoint trigger word library according to the application scenario, wherein the standpoint trigger word library comprises standpoint trigger words capable of provoking a standpoint or an emotional reaction from a user;
a corpus block emotion probability calculation module, used for extracting the standpoint trigger words contained in the text information published by a user, forming corpus blocks to be analyzed around the extracted standpoint trigger words, and then calculating the emotion probability of each corpus block to be analyzed according to the emotion probabilities of the emotion words in the block, the number of degree adverbs and the number of negation words;
a single-user standpoint trigger word emotion probability calculation module, used for calculating the emotion probability of each standpoint trigger word for a single user according to the emotion probabilities of the corpus blocks to be analyzed;
an average emotion probability ranking module, used for calculating the average emotion probability of each standpoint trigger word in the community formed by all the users according to the emotion probabilities of the standpoint trigger words for the single users, and ranking the standpoint trigger words by average emotion probability;
and a user portrait construction module, used for constructing the user portrait according to the ranking of the standpoint trigger words in the community and the single user's emotion probability for each standpoint trigger word.
Based on the same inventive concept, a third aspect of the present invention provides a user portrait visualization method, comprising: visually displaying the user portrait constructed by the method of the first aspect.
In one embodiment, visually displaying the user portrait comprises:
mapping the standpoint trigger words to word blocks of a preset shape according to their average emotion probability;
constructing a correspondence between a single user's emotion probability for each standpoint trigger word and a color feature;
and visually displaying the user portrait according to the correspondence between emotion probability and color feature.
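A minimal sketch of one possible correspondence between emotion probability and color, assuming a simple linear interpolation between two RGB colors. The endpoint colors and the function name `probability_to_rgb` are hypothetical, and the interpolation is an illustrative stand-in rather than the color gradient formula actually used by the invention.

```python
def probability_to_rgb(p, low=(0, 0, 255), high=(255, 0, 0)):
    """Map an emotion probability in [0, 1] to an RGB triple.

    Assumption: p = 0 maps to `low` (blue here), p = 1 maps to `high`
    (red here), with linear interpolation per channel in between.
    """
    p = min(1.0, max(0.0, p))  # clamp out-of-range inputs
    return tuple(round(lo + (hi - lo) * p) for lo, hi in zip(low, high))
```

Each word block of a user's portrait could then be filled with the color returned for that user's emotion probability for the corresponding standpoint trigger word.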
Based on the same inventive concept, a fourth aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed, performs the method of the first aspect.
One or more technical solutions in the embodiments of the present application have at least one or more of the following technical effects:
the invention provides a user portrait establishing method based on a user emotion standpoint, which first obtains independent short text corpora from user historical data; then constructs an emotion word bank and calculates the original emotion probability of each emotion word in it; then constructs a standpoint trigger word library according to the application scenario; then calculates the emotion probability of each corpus block to be analyzed; then calculates the emotion probability of each standpoint trigger word for a single user according to the emotion probabilities of the corpus blocks to be analyzed; then calculates the average emotion probability of each standpoint trigger word in the community formed by all the users and ranks the standpoint trigger words by average emotion probability; and finally constructs the user portrait according to the ranking of the standpoint trigger words in the community and the single user's emotion probability for each standpoint trigger word.
The method constructs the emotion word bank according to the distribution of words in the classification result of the short text corpora and divides it into positive and negative emotion words according to emotional tendency, so the emotional tendency of new words can be better identified from the text a user writes and the emotional response in that text can be analyzed more accurately. The method further constructs the standpoint trigger word library, establishes the mapping between a single user and the standpoint trigger words, and calculates the emotion probability of each standpoint trigger word for the single user. It then calculates the average emotion probability of each standpoint trigger word in the community formed by all the users and ranks the words, so that the user portrait can be constructed from the ranking result. In this way the emotional standpoint of each user toward each standpoint trigger word can be determined accurately, and the differences in viewpoint and the characteristics of the emotional response of each user toward common things can be understood quickly and accurately, which solves the technical problem that the analysis results of the prior-art methods are not accurate enough.
Furthermore, based on the constructed user portrait, the invention also provides a user portrait visualization method, which displays the user portrait visually and makes it more intuitive.
Furthermore, a correspondence between a single user's emotion probability for each standpoint trigger word and a color feature is constructed according to a color gradient formula, and the user portrait is then displayed visually according to this correspondence, which improves the display effect.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic flow chart of a user portrait establishing method based on a user emotion standpoint according to the present invention;
FIG. 2 is a block diagram of a user portrait establishing device based on a user emotion standpoint according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the user portrait visualization effect for a single user according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the user portrait visualization effects for different users according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the user portrait visualization effect for a group of users according to an embodiment of the present invention;
FIG. 6 is a block diagram of a computer-readable storage medium according to an embodiment of the present invention.
Detailed Description
The invention aims to provide a user portrait establishing method and device based on a user emotion standpoint, and a user portrait visualization method, in order to address the technical problem that the results of the prior-art methods are not accurate enough.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Through extensive research and practice, the inventor of the present application found that much of the data in a user's behavior reflects the user's emotions and, correspondingly, that the user's views and standpoints on many things are based on the user's emotional response to common things, so that an emotion graph of the user can be constructed. In emotion analysis, however, the existing emotion word banks contain a large number of words that are used neither online nor in everyday speech, and they lack the expressions currently common on the Internet, which limits the accuracy and practicality of emotion analysis based on them.
The invention provides a user portrait establishing method and a user portrait visualization method based on a user emotion standpoint. They adopt a new method of constructing the emotion word bank and have the following advantages or beneficial technical effects:
1. with the emotion word bank constructed in the invention, the emotional tendency of new words can be better identified from the text a user writes, and the emotional response in that text can be analyzed more accurately;
2. a user can quickly see how his or her own views and emotional responses to common things differ from those of others;
3. the emotional standpoint characteristics of a given user group can be analyzed visually;
4. the method also helps to classify users along more dimensions.
Example one
This embodiment provides a user portrait establishing method based on a user emotion standpoint. Referring to FIG. 1, the method includes:
Step S1: obtaining independent short text corpora from user historical data.
Specifically, the user historical data includes historical behavior data of the user, such as the user's messages and published comments. An independent short text corpus is a piece of text that carries a certain meaning on its own; it can be obtained with existing tools.
Step S2: classifying the acquired short text corpora according to emotional tendency, constructing an emotion word bank according to the distribution of words in the classification result, and calculating the original emotion probability of each emotion word in the emotion word bank; the emotion word bank comprises positive emotion words and negative emotion words.
Specifically, the emotional tendency is the user's standpoint or attitude: for example, "like" expresses a positive emotion, "eat" is neutral, and "hate" expresses a negative emotion. The words are the words contained in the classified corpora and can be obtained by word segmentation. The positive emotion probability of an original corpus is the probability, calculated with an open-source emotion analysis API, that the emotion contained in the corpus is positive. For each emotion word, all original corpora containing the word are searched and the average of their positive emotion probabilities is calculated, from which the original emotion probability of the emotion word is obtained.
Step S3: constructing a standpoint trigger word library according to the application scenario, wherein the standpoint trigger word library contains standpoint trigger words capable of provoking a standpoint or an emotional reaction from the user.
Specifically, nouns that may provoke a standpoint or an emotional response from the user, such as songs, movies, celebrities and popular concepts, are collected according to the application scenario and the features of the platform. For a song, a movie, a person, or a topical concept such as "transgenic", a user tends to indicate fairly clearly whether his or her attitude is positive or negative. The set formed by the standpoint trigger words is the standpoint trigger word library.
Step S4: extracting the standpoint trigger words contained in the text information published by the user, forming corpus blocks to be analyzed around the extracted standpoint trigger words, and calculating the emotion probability of each corpus block to be analyzed according to the emotion probabilities of the emotion words in the block, the number of degree adverbs and the number of negation words.
Specifically, the text information published by the user may be content such as comments and articles. In a specific implementation, the sentence in which the standpoint trigger word occurs can be combined with its context to form a corpus block to be analyzed.
Step S5: calculating the emotion probability of each standpoint trigger word for a single user according to the emotion probabilities of the corpus blocks to be analyzed.
Specifically, the emotion probability of a standpoint trigger word for a single user can be determined according to the occurrence frequency, time and the like of the standpoint trigger word in the corpus blocks to be analyzed, so that a mapping between the single user and the standpoint trigger words is constructed.
Step S6: calculating the average emotion probability of each standpoint trigger word in the community formed by all the users according to the emotion probabilities of the standpoint trigger words for the single users, and ranking the standpoint trigger words by average emotion probability.
Specifically, step S5 yields, for a single user, the emotion probability of each standpoint trigger word in the standpoint trigger word library, whereas this step calculates the average emotion probability of each standpoint trigger word in the community formed by all the users. The emotion probabilities of all corpus blocks in which the standpoint trigger word occurs can be summed and divided by the total number of corpus blocks in which the word occurs.
Step S7: constructing the user portrait according to the ranking of the standpoint trigger words in the community and the single user's emotion probability for each standpoint trigger word.
Specifically, the standpoint trigger words may be ranked from largest to smallest average emotion probability, and the user portrait is then constructed from this ranking together with the single user's emotion probability for each standpoint trigger word.
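As a rough illustration of step S7, the sketch below combines a community ranking with one user's emotion probabilities. The record layout, the field names and the function name `build_portrait` are hypothetical, since the text does not prescribe a data structure for the user portrait.

```python
def build_portrait(ranking, user_probs):
    """Combine the community ranking with one user's emotion probabilities.

    ranking: list of (trigger_word, community_average), already sorted.
    user_probs: {trigger_word: probability} for one user; a word the user
    never mentioned is absent and is reported as None (null).
    """
    return [
        {"word": word, "community_avg": avg, "user_prob": user_probs.get(word)}
        for word, avg in ranking
    ]


portrait = build_portrait(
    [("movie A", 0.8), ("song B", 0.2)],  # made-up community ranking
    {"movie A": 0.9},                     # made-up single-user probabilities
)
```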
In one embodiment, classifying the acquired short text corpora according to emotional tendency, constructing an emotion word bank according to the distribution of words in the classification result, and calculating the original emotion probabilities of the emotion words in the emotion word bank comprises:
classifying the acquired short text corpora according to emotional tendency into positive corpora, neutral corpora and negative corpora;
performing word segmentation on the classified corpora and removing duplicates to obtain a corpus word bank;
counting the distribution of each word of the corpus word bank over the positive, neutral and negative corpora;
screening out, by means of a chi-square test on the word distributions, the words associated with the positive or negative tendency as candidate marker words of emotional tendency;
screening the candidate marker words, deleting the words that do not match the corresponding emotional tendency, and constructing the emotion word bank from the remaining words;
and for each positive emotion word, searching all original corpora containing the word and taking the average of their positive emotion probabilities as the original emotion probability of the word; for each negative emotion word, taking 1 minus the average of the positive emotion probabilities of its original corpora as the original emotion probability of the word.
Specifically, when the candidate marker words are screened, a chi-square test is used to judge how strongly each word marks the three classes of corpora, that is, whether the presence of the word in a sentence is related to the positive, neutral and negative emotional tendencies; the candidate marker words are then selected using the result of the chi-square test. The candidate marker words are the set of marker words obtained by this initial screening.
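The chi-square screening can be illustrated with a small pure-Python sketch. The 2 x 3 table layout (sentences with and without the word, against the three corpus classes), the sample counts and the 5% significance level are assumptions chosen for illustration; the text does not state which threshold is used.

```python
def chi_square(observed):
    """Pearson chi-square statistic for a contingency table (list of rows).

    Assumes every row total and column total is non-zero.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (count - expected) ** 2 / expected
    return stat


# Rows: sentences containing the word / sentences not containing it.
# Columns: positive, neutral, negative corpora (made-up counts).
table = [[80, 15, 5], [120, 185, 195]]

# Critical value of the chi-square distribution for 2 degrees of freedom
# at a significance level of 0.05 (the level itself is an assumption).
CRITICAL_95_DF2 = 5.991

stat = chi_square(table)
is_candidate = stat > CRITICAL_95_DF2
```

A word whose statistic exceeds the critical value is distributed unevenly across the three classes and is kept as a candidate; which tendency it marks still has to be read from the counts and confirmed in the later screening.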
The candidate marker words are then screened by part of speech and manually, the words that do not match the corresponding emotional tendency are removed, the remaining words are taken as emotion words, and these are divided into positive emotion words and negative emotion words according to the emotional tendency of each word.
Next, the original emotion probabilities of the positive emotion words and the negative emotion words are calculated separately. For the positive emotion word bank, all original corpora containing each word are searched, and the average of their positive emotion probabilities is taken as the original emotion probability of the word. For convenience of calculation, a normalization formula is used to bring the original emotion probabilities of all words in the positive emotion word bank into the range 0 to 1, and the results are used as the emotion scores of the words. (Note that the positive emotion probability of an original corpus is the probability, calculated with an open-source emotion analysis API, that the emotion contained in the corpus is positive.)
For the negative emotion word bank, all original corpora containing each word are searched, the average of their positive emotion probabilities is subtracted from 1, and the result is taken as the original emotion probability of the word. For convenience of calculation, a normalization formula is used to bring the emotion probabilities of all words in the negative emotion word bank into the range 0 to 1, and the results are used as the emotion scores of the words.
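The calculation of the original emotion probability can be sketched as follows. The corpus layout (pairs of text and positive emotion probability), the substring match and the function name are assumptions; in practice the match would be made on the segmented words, and the positive emotion probabilities would come from the emotion analysis API mentioned above.

```python
from statistics import mean


def original_emotion_probability(word, polarity, corpus):
    """Original emotion probability of an emotion word.

    corpus: list of (text, positive_probability) pairs (hypothetical layout).
    polarity: "positive" or "negative".
    Returns None when no original corpus contains the word.
    """
    probs = [p for text, p in corpus if word in text]
    if not probs:
        return None
    average = mean(probs)
    # Positive words keep the average; negative words use 1 minus the average.
    return average if polarity == "positive" else 1.0 - average


# Made-up corpora with made-up positive emotion probabilities.
sample_corpus = [
    ("I like it", 0.9),
    ("like this song", 0.7),
    ("hate it", 0.1),
    ("really hate", 0.2),
]
```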
The normalization formula is $X' = (X - X_{min})/(X_{max} - X_{min})$, where $X$ is the data to be normalized, $X'$ is the value of $X$ after normalization, $X_{max}$ is the maximum of all data to be normalized, and $X_{min}$ is the minimum of all data to be normalized.
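The calculation of the original emotion probability and its normalization can be sketched as follows. The function names and the dictionary layout are illustrative assumptions, and returning 0.0 when all values are equal is also an assumption, because the text does not cover that case.

```python
def raw_emotion_probability(positive_probs, polarity):
    """Original emotion probability of one emotion word.

    positive_probs: positive emotion probabilities of all original corpora
                    that contain the word (as returned by a sentiment API)
    polarity:       "positive" or "negative" (the lexicon the word belongs to)
    """
    mean = sum(positive_probs) / len(positive_probs)
    # Negative emotion words use 1 minus the average positive probability.
    return mean if polarity == "positive" else 1.0 - mean


def min_max_normalize(values):
    """Apply X' = (X - X_min) / (X_max - X_min) to a {word: value} mapping."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # Assumption: a degenerate range maps every word to 0.0.
        return {word: 0.0 for word in values}
    return {word: (v - lo) / (hi - lo) for word, v in values.items()}
```

Normalization is applied separately to the positive lexicon and to the negative lexicon, so each lexicon has its own minimum and maximum.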
In this way, an emotion lexicon with emotional tendencies can be constructed, and because the chi-square test screens out the candidate marker words more accurately, the resulting emotion lexicon is more accurate.
In one embodiment, forming the corpus block to be analyzed according to the extracted standpoint trigger words, and then calculating the emotion probability of the corpus block to be analyzed according to the number of emotion words, degree adverbs and negation words in the corpus block to be analyzed, comprises the following steps:
forming the corpus block to be analyzed from the sentence in which the extracted standpoint trigger word is located together with the n preceding sentences and the n following sentences, wherein n is a positive integer greater than or equal to 1;
searching for the positive emotion words and negative emotion words appearing in the corpus block to be analyzed, and acquiring the original emotion probability of each positive emotion word and each negative emotion word;
determining a negation coefficient and a degree weight for each positive emotion word and each negative emotion word according to the number of negation words and degree adverbs within a preset range of the word;
calculating the emotion probability correction value of each emotion word according to the original emotion probability, the negation coefficient and the degree weight of the emotion word;
and calculating the emotion probability of the corpus block to be analyzed according to the emotion probability correction values of the emotion words, the number of positive emotion words and the number of negative emotion words.
Specifically, the corpus block to be analyzed consists of the sentence in which the standpoint trigger word is located together with the n preceding and n following sentences, 2n + 1 sentences in total. The emotional characteristics of the corpus block to be analyzed are then analyzed.
In a specific implementation, the preset range may be selected as needed; for example, several words appearing before and after the emotion word may be selected. One possible implementation is as follows:
a. For each positive emotion word and each negative emotion word, look back over the two preceding words. If these two words contain k negation words (k takes values in [0, 2]), the negation coefficient of the emotion word is $N_i = (-1)^k$.
b. If the two words contain no degree adverb, the degree coefficient of the emotion word is $L_i = 1$.
c. If they contain one degree adverb whose degree weight is $L$, the degree coefficient of the emotion word is $L_i = L$; if they contain two degree adverbs whose degree weights are $L_1$ and $L_2$, the degree coefficient of the emotion word is $L_i = L_1 \times L_2$.
d. The emotion probability correction value of each emotion word is calculated as $P_{i\_index} = P_i \times N_i \times L_i$, where $P_i$ is the original emotion probability of the emotion word.
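Steps a to d can be sketched as follows. The function and parameter names are hypothetical, the input is assumed to be an already word-segmented token list, and the negation word set and degree adverb weights shown in the usage note are invented purely for illustration.

```python
def corrected_probability(tokens, idx, raw_prob, negations, degree_weights, window=2):
    """Emotion probability correction value P_i_index = P_i * N_i * L_i.

    tokens:         word-segmented corpus block (list of str)
    idx:            position of the emotion word in tokens
    raw_prob:       original emotion probability P_i of the emotion word
    negations:      set of negation words
    degree_weights: {degree adverb: degree weight}
    window:         number of preceding words inspected (2 in the example above)
    """
    preceding = tokens[max(0, idx - window):idx]

    # Step a: negation coefficient N_i = (-1)^k for k negation words.
    k = sum(1 for t in preceding if t in negations)
    n_i = (-1) ** k

    # Steps b and c: degree coefficient L_i is the product of the weights of
    # the degree adverbs found, or 1 if none are found.
    l_i = 1.0
    for t in preceding:
        if t in degree_weights:
            l_i *= degree_weights[t]

    # Step d: corrected value.
    return raw_prob * n_i * l_i
```

For instance, with `tokens = ["not", "very", "good"]`, a raw probability of 0.8 for "good", `negations = {"not"}` and `degree_weights = {"very": 1.5}`, the negation coefficient is -1 and the degree coefficient is 1.5, so the corrected value is negative and larger in magnitude than the raw probability.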
The emotion probability of the corpus block to be analyzed is then calculated from the emotion probability correction values of the emotion words, the number of positive emotion words and the number of negative emotion words.
In a specific implementation, the emotion words are divided by emotion type into two groups, positive emotion words and negative emotion words, according to their emotion probability correction values: emotion words whose correction value is greater than or equal to 1 are taken as positive emotion words, and emotion words whose correction value is less than 1 are taken as negative emotion words. Each group is sorted by value from largest to smallest. The number of positive emotion words is $N_p$ and the number of negative emotion words is $N_n$. The emotion score $S$ of the corpus block is then calculated as follows:
If $N_p \ge N_n$ and $N_n \ne 0$, first calculate the average value of all negative emotion words (formula image BDA0002229021580000101), then calculate the average value of the first $N_n$ positive emotion words (formula image BDA0002229021580000102) and the average value of the remaining positive emotion words (formula image BDA0002229021580000103). Next, calculate the average value (formula image BDA0002229021580000106) of the two quantities given in formula images BDA0002229021580000104 and BDA0002229021580000105. The emotion score of the corpus block is given by formula image BDA0002229021580000107.
If $N_p < N_n$ and $N_p \ne 0$, first calculate the average value of all positive emotion words (formula image BDA0002229021580000108), then calculate the average value of the first $N_p$ negative emotion words (formula image BDA0002229021580000109) and the average value of the remaining negative emotion words (formula image BDA00022290215800001010). Next, calculate the average value (formula image BDA00022290215800001013) of the two quantities given in formula images BDA00022290215800001011 and BDA00022290215800001012. The emotion score of the corpus block is given by formula image BDA00022290215800001014.
If $N_n = 0$, the corpus block contains no negative emotion words and all emotion words are positive emotion words. Calculate the average value of all positive emotion words (formula image BDA00022290215800001015); the emotion score of the corpus block is given by formula image BDA00022290215800001016, where $L_{max}$ is the maximum degree weight in the degree adverb lexicon.
If $N_p = 0$, the corpus block contains no positive emotion words and all emotion words are negative emotion words. Calculate the average value of all negative emotion words (formula image BDA00022290215800001017); the emotion score of the corpus block is given by formula image BDA00022290215800001018, where $L_{max}$ is the maximum degree weight in the degree adverb lexicon.
By combining the emotion probability correction value of the emotion words, the number of the positive emotion words and the number of the negative emotion words, the objectivity and the accuracy of calculation can be improved.
In one embodiment, calculating the emotion probability of the standpoint trigger word corresponding to a single user according to the emotion probability of the corpus block to be analyzed comprises:
when the standpoint trigger word does not appear in the data published by the user, the emotion probability of the standpoint trigger word is null;
when the standpoint trigger word appears once in the data published by the user, taking the emotion probability of the corpus block in which the standpoint trigger word appears as the emotion probability of the standpoint trigger word;
and when the standpoint trigger word appears twice or more in the data published by the user, taking the average of the emotion probabilities of all corpus blocks in which the standpoint trigger word appears as the emotion probability of the standpoint trigger word.
Specifically, for a standpoint trigger word that appears twice or more in the user's data, the emotion score is calculated as an average weighted by the chronological order of the corpus blocks containing the standpoint trigger word.
For example, suppose a standpoint trigger word appears in n corpus blocks of a user over a time span T. T is divided into 10 equal intervals T1-T10, whose weights are 1-10 respectively. The n corpus blocks fall into these 10 intervals; the emotion probability of each corpus block is multiplied by the weight of its interval, and the weighted average is taken as the emotion probability of the standpoint trigger word.
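The time-weighted averaging in this example can be sketched as follows. Several details are assumptions that the text leaves open: the time span T is taken from the earliest to the latest corpus block, timestamps are plain numbers (for example Unix seconds), and the weighted sum is divided by the sum of the weights. The function name is hypothetical.

```python
def time_weighted_probability(blocks, n_intervals=10):
    """Emotion probability of one standpoint trigger word for one user.

    blocks: list of (timestamp, emotion_probability), one per corpus block
            in which the trigger word appears.
    Returns None when the trigger word never appears (null probability).
    """
    if not blocks:
        return None
    if len(blocks) == 1:
        return blocks[0][1]

    t_start = min(t for t, _ in blocks)
    t_end = max(t for t, _ in blocks)
    span = t_end - t_start

    weighted_sum = 0.0
    weight_total = 0.0
    for t, prob in blocks:
        if span == 0:
            weight = 1  # all blocks share one timestamp: plain average
        else:
            # Interval index 1..n_intervals; later blocks get larger weights.
            weight = min(int((t - t_start) / span * n_intervals) + 1, n_intervals)
        weighted_sum += weight * prob
        weight_total += weight
    return weighted_sum / weight_total
```

With this weighting, a recent corpus block influences the result up to ten times as much as the oldest one, so the score tracks the user's current standpoint more closely than a plain average would.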
In one embodiment, after calculating the emotion probability of the standpoint trigger word corresponding to a single user according to the emotion probability of the corpus block to be analyzed, the method further comprises:
normalizing the emotion probabilities of the standpoint trigger words corresponding to the single user to obtain the emotion probability correction value of each standpoint trigger word corresponding to the single user.
Specifically, to facilitate subsequent calculation, this embodiment further normalizes the emotion probabilities of the standpoint trigger words corresponding to a single user.
In a specific implementation, for each user j of the user group to be analyzed and each corresponding standpoint trigger word k:
The positive emotion probabilities of all standpoint trigger words whose values are not null (the emotion probabilities calculated for the standpoint trigger words are divided into positive and negative according to emotional tendency) are normalized so that all positive emotion probability correction values are distributed between 0 and 1. For each emotion probability score $S_{jk}$, the correction value $S_{jk\_index}$ is calculated as $S_{jk\_index} = (S_{jk} - S_{min})/(S_{max} - S_{min})$, where $S_{min}$ is the minimum of all positive emotion probability scores and $S_{max}$ is the maximum of all positive emotion probability scores.
The negative emotion probabilities of all standpoint trigger words whose values are not null are likewise normalized so that the values are distributed between 0 and 1 and are then negated, so that the final negative emotion probability score correction values are distributed between -1 and 0. For each emotion probability score $S'_{jk}$, the correction value $S'_{jk\_index}$ is calculated as $S'_{jk\_index} = -(S'_{jk} - S'_{min})/(S'_{max} - S'_{min})$, where $S'_{min}$ is the minimum of all negative emotion probability scores and $S'_{max}$ is the maximum of all negative emotion probability scores.
In one embodiment, calculating the average emotion probability of the standpoint trigger word in the community consisting of all users according to the emotion probability of the standpoint trigger word corresponding to a single user comprises:
averaging the emotion probability correction values of the standpoint trigger word over the users to obtain the average emotion probability of the standpoint trigger word in the community;
and sorting the standpoint trigger words according to their average emotion probabilities.
Specifically, for each standpoint trigger word k in the standpoint trigger word library, the emotion probability score correction values $S_{jk\_index}$ of all users in the community whose texts use the standpoint trigger word are averaged to obtain the average emotion probability of the standpoint trigger word in the community. The standpoint trigger words are then arranged from high to low according to their community average emotion probabilities.
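The community averaging and ranking can be sketched as follows. The nested-dictionary input layout and the function name are illustrative assumptions; a score of `None` stands for a null emotion probability, that is, a user who never used the trigger word.

```python
def rank_trigger_words(user_scores):
    """Average the corrected scores per standpoint trigger word and rank them.

    user_scores: {user: {trigger_word: corrected score in [-1, 1] or None}}
    Returns a list of (trigger_word, community_average), highest first.
    """
    sums, counts = {}, {}
    for scores in user_scores.values():
        for word, score in scores.items():
            if score is None:
                continue  # user did not use this trigger word
            sums[word] = sums.get(word, 0.0) + score
            counts[word] = counts.get(word, 0) + 1

    averages = {word: sums[word] / counts[word] for word in sums}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)
```

Only users who actually used a trigger word contribute to its average, which follows the description above; trigger words that no user used do not appear in the ranking.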
In order to more clearly illustrate the specific implementation of the method of the present invention, the following is presented by way of specific examples:
1. Example of establishing the mapping between a user and the standpoint trigger word library
(1) Original text of the user (i.e. the text information published by the user):
each cartoon will use a carefully orchestrated scenario to convey some reason to the children. The storyline that the skyscraper runs empty is loved by children, and no Santa Claus exists in the Christmas season after the growth. The dream is kept and is not abandoned, and the theme of the traditional cartoon is natural, however, in the animal biscuit of the magical circus, as the father ratio of the man, the body which is probably not restored by the man is almost broken down and feared. Even moms who consistently take "last to the end" are flustered and lack of mind in the face of the established fact that the mommy is about to become soon. Rather, the worship is a naturally-rotten small daughter, and by comparing the worship-dependent eye spirit with the father, the 'I likes the father who can not be restored by the Daddy … …' to become the intimate play partner, the entertainment toy and the lovely favorite of the daughter, and certainly, the universal hero without substitution exists, so that the man can find back the significance of survival. Parents and children grow together, you do children for 10 years, I when mom is not one day more than you, we work together in the moment
(2) In the original text, the standpoint trigger word is "animal biscuit of magic circus"; the sentence in which it is located and the n preceding and n following sentences (taking n as an example) form the corpus block to be analyzed:
Each cartoon will use a carefully orchestrated scenario to convey some reason to the children. The storyline that the skyscraper runs empty is loved by children, and no Santa Claus exists in the Christmas season after the growth. The dream is kept and is not abandoned, and the theme of the traditional cartoon is natural, however, in the animal biscuit of the magical circus, as the father ratio of the man, the body which is probably not restored by the man is almost broken down and feared. Even moms who consistently take "last to the end" are flustered and lack of mind in the face of the established fact that the mommy is about to become soon. Rather, the worship is a naturally-rotten small daughter, and by comparing the worship-dependent eye spirit with the father, the 'I likes the father who can not be restored by the Daddy … …' to become the intimate play partner, the entertainment toy and the lovely favorite of the daughter, and certainly, the universal hero without substitution exists, so that the man can find back the significance of survival.
(3) Calculating the emotion probability of the corpus block using the emotion analysis method described above gives:
emotion probability $S_{index} = 0.979658$
2. Example data (partial) for the standpoint trigger word library and user emotions

| Standpoint trigger word | Emotion probability (score) of user 1 | Emotion probability (score) of user 2 |
|---|---|---|
| A Chinese Odyssey Part One: Pandora's Box | -0.9113407 | 0.912788 |
| The Shawshank Redemption | 0.984257 | 0.980452 |
| Forrest Gump | 0.98767 | 0.969589 |
| Titanic | 0.926384 | 0.97395 |
| Léon: The Professional | 0.986618 | 0.986399 |
| Inception | 0.873364 | 0.966804 |
| 3 Idiots | 0.928555 | 0.983658 |
Example two
Based on the same inventive concept, this embodiment provides a user portrait establishing apparatus based on the user's emotional standpoint. Referring to FIG. 2, the apparatus includes:
a corpus acquisition module 201, configured to acquire independent short text corpora from user history data;
an emotion lexicon construction module 202, configured to classify the acquired short text corpora according to emotional tendency, construct an emotion lexicon according to the distribution of words in the short text corpus classification results, and calculate the original emotion probability of the emotion words in the emotion lexicon, wherein the emotion lexicon comprises positive emotion words and negative emotion words;
a standpoint trigger word library construction module 203, configured to construct a standpoint trigger word library according to the application scenario, wherein the standpoint trigger word library contains standpoint trigger words capable of eliciting a standpoint or emotional response from the user;
a corpus block emotion probability calculation module 204, configured to extract the standpoint trigger words contained in the text information published by the user, form a corpus block to be analyzed according to the extracted standpoint trigger words, and calculate the emotion probability of the corpus block to be analyzed according to the emotion probabilities of the emotion words and the numbers of degree adverbs and negation words in the corpus block to be analyzed;
a single-user standpoint trigger word emotion probability calculation module 205, configured to calculate the emotion probability of the standpoint trigger word corresponding to a single user according to the emotion probability of the corpus block to be analyzed;
an average emotion probability sorting module 206, configured to calculate the average emotion probability of the standpoint trigger words in the community consisting of all users according to the emotion probability of the standpoint trigger words corresponding to a single user, and to sort the standpoint trigger words according to the average emotion probability;
and a user portrait construction module 207, configured to construct the user portrait according to the ranking of the standpoint trigger words in the community and the emotion probability of the single user for the standpoint trigger words.
In one embodiment, the emotion lexicon construction module 202 is specifically configured to:
classify the acquired short text corpora according to emotional tendency into positive corpora, neutral corpora and negative corpora;
perform word segmentation on the classified corpora and remove redundancy to obtain a corpus word library;
count the distribution of each word of the corpus word library in the positive corpora, the neutral corpora and the negative corpora;
screen out, by means of a chi-square test on the distribution of the words, the words related to the positive and negative tendencies as candidate marker words of emotional tendency;
screen the candidate marker words, delete the words that do not match the corresponding emotional tendency, and construct the emotion lexicon;
and, for each positive emotion word, find all corresponding original corpora and take the average of their positive emotion probabilities as the original emotion probability of the positive emotion word, and, for each negative emotion word in the emotion lexicon, take 1 minus the average of the positive emotion probabilities as the original emotion probability of the negative emotion word.
In one embodiment, the corpus block emotion probability calculation module 204 is specifically configured to:
form the corpus block to be analyzed from the sentence in which the extracted standpoint trigger word is located together with the n preceding sentences and the n following sentences, wherein n is a positive integer greater than or equal to 1;
search for the positive emotion words and negative emotion words appearing in the corpus block to be analyzed, and acquire the original emotion probability of each positive emotion word and each negative emotion word;
determine a negation coefficient and a degree weight for each positive emotion word and each negative emotion word according to the number of negation words and degree adverbs within a preset range of the word;
calculate the emotion probability correction value of each emotion word according to the original emotion probability, the negation coefficient and the degree weight of the emotion word;
and calculate the emotion probability of the corpus block to be analyzed according to the emotion probability correction values of the emotion words, the number of positive emotion words and the number of negative emotion words.
In one embodiment, the single-user standpoint trigger word emotion probability calculation module 205 is specifically configured such that:
when the standpoint trigger word does not appear in the data published by the user, the emotion probability of the standpoint trigger word is null;
when the standpoint trigger word appears once in the data published by the user, the emotion probability of the corpus block in which the standpoint trigger word appears is taken as the emotion probability of the standpoint trigger word;
and when the standpoint trigger word appears twice or more in the data published by the user, the average of the emotion probabilities of all corpus blocks in which the standpoint trigger word appears is taken as the emotion probability of the standpoint trigger word.
In one embodiment, the apparatus further includes a normalization processing module, configured to, after the emotion probability of the standpoint trigger word corresponding to a single user has been calculated according to the emotion probability of the corpus block to be analyzed:
normalize the emotion probabilities of the standpoint trigger words corresponding to the single user to obtain the emotion probability correction value of each standpoint trigger word corresponding to the single user.
In one embodiment, the average emotion probability sorting module 206 is specifically configured to:
average the emotion probability correction values of the standpoint trigger word over the users to obtain the average emotion probability of the standpoint trigger word in the community;
and sort the standpoint trigger words according to their average emotion probabilities.
Since the apparatus described in the second embodiment of the present invention is the apparatus used to implement the user portrait establishing method based on the user's emotional standpoint of the first embodiment, a person skilled in the art can understand its specific structure and variations from the method described in the first embodiment, so the details are not repeated here. All apparatuses used to implement the method of the first embodiment fall within the protection scope of the present invention.
Example three
Based on the same inventive concept, the present application also provides a user portrait visualization method, which specifically comprises visually displaying the user portrait constructed in the first embodiment.
In one embodiment, visually displaying the user portrait comprises:
mapping the standpoint trigger words to word blocks of a preset shape according to the average emotion probability;
constructing a correspondence between the emotion probability of a single user for the standpoint trigger words and color features;
and visually displaying the user portrait according to the correspondence between the emotion probability and the color features.
In a specific implementation, each standpoint trigger word may be mapped to a geometric shape, that is, a "standpoint word block", and the standpoint word blocks are arranged in the order of the standpoint trigger words in the community obtained in the first embodiment, so as to form an overall geometric pattern.
For example, with a square as the geometric shape of a standpoint word block, a library of 300 words gives 300 small squares arranged 30 squares long and 10 squares wide, which together form a regular figure (for example, a rectangle with an aspect ratio of 3:1, in which the 1st standpoint trigger word is located in row 1, column 1, the 30th standpoint trigger word in row 1, column 30, and the 300th standpoint trigger word in row 10, column 30).
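The layout in this example amounts to filling the grid row by row in ranking order. A minimal sketch, assuming 1-based ranks and a hypothetical function name:

```python
def grid_position(rank, columns=30):
    """Map the 1-based rank of a standpoint trigger word to its
    1-based (row, column) position in a grid filled row by row."""
    row, col = divmod(rank - 1, columns)
    return row + 1, col + 1
```

With the default of 30 columns, rank 1 maps to row 1, column 1, rank 30 to row 1, column 30, and rank 300 to row 10, column 30, matching the positions given above.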
A correspondence between the emotion probability of an individual user for the standpoint trigger words and color features is constructed, and the user portrait is visually displayed according to the constructed correspondence.
In a specific implementation, a color gradient formula can be used to color each small square according to the user's emotional response, using different color families and different color intensities.
The gradient is calculated as follows: the gradient between two colors A and B is obtained by applying the color gradient formula to each of the RGB channels separately, Gradient = A + (B - A) × p, where, for each color channel, A is the value of color A on that channel, B is the value of color B on that channel, and p is the position of the target color between A and B, expressed as a percentage.
For example, the red family is selected as the representative color for a supporting standpoint or positive emotion. The stronger the positive emotion, the closer the emotion probability score correction value $S_{jk\_index}$ is to 1 and the darker the color; the weaker the positive emotion, the closer $S_{jk\_index}$ is to 0. The small square of a positive emotion is therefore red (255,0,0) for the strongest positive emotion and light gray (245,245,245) for the weakest, and the correspondence (color calculation) is $R_{jk} = 245 + (255 - 245) \times S_{jk\_index}$, $G_{jk} = 245 + (0 - 245) \times S_{jk\_index}$, $B_{jk} = 245 + (0 - 245) \times S_{jk\_index}$.
Similarly, the blue family is selected as the representative color for an opposing standpoint or negative emotion. The stronger the negative emotion, the closer the absolute value of the emotion probability score correction value $S_{jk\_index}$ is to 1 and the darker the color; the weaker the negative emotion, the closer $S_{jk\_index}$ is to 0. The small square of a negative emotion is therefore blue (0,0,255) for the strongest negative emotion and light gray (245,245,245) for the weakest, and the correspondence (color calculation) is $R_{jk} = 245 + (0 - 245) \times |S_{jk\_index}|$, $G_{jk} = 245 + (0 - 245) \times |S_{jk\_index}|$, $B_{jk} = 245 + (255 - 245) \times |S_{jk\_index}|$.
For a word block whose emotion probability is null, the color is white (255,255,255).
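The color mapping can be sketched as follows. This is a minimal illustration assuming corrected scores in [-1, 1] with `None` for a null emotion probability; the function names are hypothetical, and rounding each channel to the nearest integer is an assumption.

```python
LIGHT_GRAY = (245, 245, 245)  # weakest emotion
RED = (255, 0, 0)             # strongest positive emotion
BLUE = (0, 0, 255)            # strongest negative emotion
WHITE = (255, 255, 255)       # no emotional response


def gradient(color_a, color_b, p):
    """Per-channel gradient A + (B - A) * p, with p in [0, 1]."""
    return tuple(round(a + (b - a) * p) for a, b in zip(color_a, color_b))


def block_color(score):
    """RGB color of one standpoint word block for one user."""
    if score is None:
        return WHITE
    if score >= 0:
        return gradient(LIGHT_GRAY, RED, score)
    # Negative scores lie in [-1, 0); their magnitude sets the intensity.
    return gradient(LIGHT_GRAY, BLUE, -score)
```

A score of 1 yields pure red, a score of -1 pure blue, and a score of 0 light gray, so null responses (white) remain visually distinct from weak responses.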
FIG. 3 is a schematic diagram of the user portrait visualization for a single user in an embodiment of the present invention. The example shows one user's emotional responses to the standpoint trigger word library: for example, the red family can be used as the representative color for a supporting standpoint or positive emotion, with the strongest RGB color (255,0,0) and the weakest (245,245,245); the blue family is the representative color for an opposing standpoint or negative emotion, with the strongest RGB color (0,0,255) and the weakest (245,245,245); a missing emotional response is shown in white (255,255,255).
In addition, comparison of different user profiles and group user profile analysis may be performed.
Comparison of the portraits of different users
Two users are selected, and their emotional responses to each standpoint word block are analyzed. If the emotional responses of the two users are consistent, one color is used; if they are inconsistent, another color is used. Transition colors of the same color family are selected according to the degree of consistency. Standpoint word blocks for which one or both users have no emotional response are shown in white. See FIG. 4 for details; the correspondence between the emotion probability of a single user for the standpoint trigger words and the color features may be implemented in a manner similar to that described above and is not repeated here. The example compares the emotional responses of two users to the standpoint trigger word library: for example, consistent emotional responses are shown in the green family, with the strongest RGB color (0,201,13) and the weakest (245,245,245); inconsistent emotional responses are shown in the purple family, with the strongest RGB color (115,9,170) and the weakest (245,245,245); a missing emotional response is shown in white (255,255,255).
Group user profile analysis
For a given user group, the average emotional response to each standpoint word block is obtained by counting the numbers of positive and negative responses rather than by averaging numerical values. For standpoint word blocks to which more than u% of the users (u is a preset value) have an emotional response, the percentage of positive or negative emotions in the total number of users is counted for each block; higher consistency is shown in one color and lower consistency in another, and the corresponding transition color reflects the degree of agreement within the user group. If fewer than u% of the users have an emotional response to a standpoint word block, the block is shown in white. See FIG. 5 for details; the correspondence between the emotion probability of a single user for the standpoint trigger words and the color features may be implemented in a manner similar to that described above and is not repeated here.
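The group statistic can be sketched as follows. The text does not define the consistency measure precisely, so this sketch assumes one: the share of the majority sign (positive or negative) among the users who responded. The function name, the handling of a response share exactly equal to u%, and the treatment of zero scores as neither positive nor negative are likewise assumptions.

```python
def group_consistency(scores, total_users, u_percent):
    """Consistency of a user group's responses to one standpoint word block.

    scores:      per-user corrected scores for this block (None = no response)
    total_users: number of users in the group
    u_percent:   preset threshold u, in percent
    Returns None when too few users responded (block is drawn white),
    otherwise a value in [0, 1] (assumed measure: majority-sign share).
    """
    responded = [s for s in scores if s is not None]
    if total_users == 0 or not responded:
        return None
    if 100.0 * len(responded) / total_users < u_percent:
        return None

    # Count positive and negative responses instead of averaging values.
    positive = sum(1 for s in responded if s > 0)
    negative = sum(1 for s in responded if s < 0)
    return max(positive, negative) / len(responded)
```

The returned value could then be used as the position p in a color gradient between the low-consistency and high-consistency colors.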
FIG. 5 shows the emotional responses of a user group to the standpoint trigger word library: for example, consistent emotional responses are shown in the green family, with the strongest RGB color (0,201,13) and the weakest (245,245,245); inconsistent emotional responses are shown in the purple family, with the strongest RGB color (115,9,170) and the weakest (245,245,245); a missing emotional response is shown in white (255,255,255).
The method provided by the invention constructs the emotion lexicon according to the distribution of words in the short text corpus classification results, and the constructed emotion lexicon is divided into positive emotion words and negative emotion words according to emotional tendency. The emotional tendency of new words can therefore be better identified from the user's text content, and the emotional responses in the text can be analyzed more accurately. A standpoint trigger word library is constructed, the mapping between a single user and the standpoint trigger word library is established, and the emotion probability of the standpoint trigger words corresponding to the single user is calculated. The average emotion probability of the standpoint trigger words in the community consisting of all users is then calculated and the standpoint trigger words are sorted, so that the user portrait can be constructed from the sorting result. In this way the emotional standpoint of each user toward the standpoint trigger words can be determined accurately, and the differences in viewpoint and the emotional response characteristics of each user with respect to common things can be understood quickly and accurately, which solves the technical problem that the analysis results of prior-art methods are not accurate enough.
Furthermore, based on the constructed user portrait, the invention also provides a user portrait visualization method, which is used for visually displaying the user portrait and improving the intuitiveness.
Furthermore, a correspondence between the emotion probability of an individual user for the standpoint trigger words and color features is constructed according to a color gradient formula, and the user portrait is then visually displayed according to this correspondence, which improves the display effect.
Example four
Referring to fig. 6, based on the same inventive concept, the present application further provides a computer-readable storage medium 300, on which a computer program 311 is stored, which when executed implements the method according to the first embodiment.
Since the computer-readable storage medium introduced in the fourth embodiment of the present invention is the computer-readable storage medium used to implement the user portrait establishing method based on the user's emotional standpoint of the first embodiment, a person skilled in the art can understand its specific structure and variations based on the method introduced in the first embodiment, so the details are not repeated here. Any computer-readable storage medium used to implement the method of the first embodiment falls within the protection scope of the present invention.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made in the embodiments of the present invention without departing from the spirit or scope of the embodiments of the invention. Thus, if such modifications and variations of the embodiments of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to encompass such modifications and variations.

Claims (10)

1. A user portrait establishing method based on a user emotion standpoint is characterized by comprising the following steps:
acquiring independent short text corpora from user historical data;
classifying the acquired short text corpus according to emotional tendency, constructing an emotional word bank according to the distribution condition of words in the short text corpus classification result, and calculating the original emotional probability of the emotional words in the emotional word bank; wherein, the emotion word bank comprises positive emotion words and negative emotion words;
establishing a position trigger word library according to an application scene, wherein the position trigger word library comprises position trigger words capable of causing a user position or emotional reaction;
extracting the position trigger words contained in the text information published by the user, forming a corpus block to be analyzed according to the extracted position trigger words, and calculating the emotion probability of the corpus block to be analyzed according to the emotion probability of the emotion words in the corpus block, the number of degree adverbs and the number of negation words;
calculating the emotion probability of each position trigger word for a single user according to the emotion probability of the corpus block to be analyzed;
calculating the average emotion probability of each position trigger word in the community formed by all the users according to the emotion probability of the position trigger word for each single user, and sorting the position trigger words according to the average emotion probability;
and constructing the user portrait according to the sorting result of the position trigger words in the community and the emotion probability of the single user for the position trigger words.
2. The method of claim 1, wherein classifying the acquired short text corpus according to emotional tendency, constructing an emotion word bank according to the distribution of words in the short text corpus classification result, and calculating the original emotion probability of the emotion words in the emotion word bank, comprises:
classifying the obtained short text corpora according to emotional tendency, and dividing the short text corpora into positive corpora, neutral corpora and negative corpora;
performing word segmentation on the classified linguistic data, and removing redundancy to obtain a linguistic data word bank;
counting the distribution condition of each word in the corpus thesaurus in a positive corpus, a neutral corpus and a negative corpus;
screening out, by means of a chi-square test on the distribution of the words, the words associated with the positive and negative directions as marker word candidates for emotional tendency;
screening the marker word candidates, deleting the words that do not match the corresponding emotional tendency, and constructing the emotion word bank;
and, for each positive emotion word, retrieving all original corpora containing the word and taking the average of their positive emotion probabilities as the original emotion probability of the positive emotion word; and, for each negative emotion word in the emotion word bank, taking 1 minus the average of the positive emotion probabilities as the original emotion probability of the negative emotion word.
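The screening and probability steps of claim 2 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `chi_square` and `original_probability`, the 2 x k contingency layout (word present or absent against corpus class), and the representation of each text as a token set paired with a classifier's positive emotion probability are all assumptions made for illustration. The claim does not state a significance threshold, so none is applied here.

```python
def chi_square(word_counts, class_totals):
    """Chi-square statistic of a 2 x k table: (word present / absent) x corpus class.

    word_counts[c]  -- number of class-c texts that contain the word (assumed input)
    class_totals[c] -- number of class-c texts in the corpus (assumed input)
    """
    n = sum(class_totals.values())
    present = sum(word_counts.get(c, 0) for c in class_totals)
    absent = n - present
    stat = 0.0
    for c, total in class_totals.items():
        obs_in = word_counts.get(c, 0)
        obs_out = total - obs_in
        exp_in = present * total / n    # expected count if word and class were independent
        exp_out = absent * total / n
        if exp_in > 0:
            stat += (obs_in - exp_in) ** 2 / exp_in
        if exp_out > 0:
            stat += (obs_out - exp_out) ** 2 / exp_out
    return stat


def original_probability(word, polarity, texts):
    """Original emotion probability of an emotion word.

    texts -- list of (token_set, positive_probability) pairs (assumed format).
    Positive word: mean positive probability of the texts containing it.
    Negative word: 1 minus that mean.
    Returns None when no text contains the word.
    """
    probs = [p for tokens, p in texts if word in tokens]
    if not probs:
        return None
    mean_pos = sum(probs) / len(probs)
    return mean_pos if polarity == "positive" else 1.0 - mean_pos
```

A word that occurs mostly in one class yields a large statistic and becomes a marker word candidate, while a word spread evenly over the positive, neutral and negative corpora yields a statistic of zero.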
3. The method as claimed in claim 1, wherein forming the corpus block to be analyzed according to the extracted position trigger words, and then calculating the emotion probability of the corpus block to be analyzed according to the emotion probability of the emotion words in the corpus block, the number of degree adverbs and the number of negation words, comprises:
forming the corpus block to be analyzed from the sentence in which an extracted position trigger word is located together with the n preceding sentences and the n following sentences, wherein n is a positive integer greater than or equal to 1;
searching for the positive emotion words and negative emotion words appearing in the corpus block to be analyzed, and acquiring the original emotion probability of each positive emotion word and each negative emotion word;
determining a negation coefficient and a degree weight for each positive emotion word and each negative emotion word according to the number of negation words and degree adverbs within a preset range of that word;
calculating the emotion probability correction value of each emotion word according to the original emotion probability, the negation coefficient and the degree weight of the emotion word;
and calculating the emotion probability of the corpus block to be analyzed according to the emotion probability correction values of the emotion words, the number of positive emotion words and the number of negative emotion words.
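The steps of claim 3 can be sketched as follows. The claim does not give the formulas for the negation coefficient, the degree weight or the final combination, so everything numeric here is an assumption: an odd number of negation words flips the probability, a degree weight stretches or shrinks its distance from 0.5, and the block probability is the mean of the corrected values. The word lists, the whitespace tokenization, the window size, and the convention that lexicon values are already expressed on a positive-emotion scale (0 = negative, 1 = positive) are likewise hypothetical.

```python
NEGATIONS = {"not", "never", "no"}                  # assumed negation word list
DEGREE_ADVERBS = {"very": 1.5, "slightly": 0.5}     # assumed degree weights


def build_blocks(sentences, trigger, n=1):
    """Return, for each sentence containing the trigger, that sentence
    plus up to n preceding and n following sentences."""
    blocks = []
    for i, sentence in enumerate(sentences):
        if trigger in sentence.split():             # whitespace tokens (assumption)
            blocks.append(sentences[max(0, i - n): i + n + 1])
    return blocks


def corrected_probability(p, negation_count, degree_weight):
    """Assumed correction: odd negation count flips p; the degree weight
    scales the distance of p from the neutral value 0.5; result is clipped."""
    if negation_count % 2 == 1:
        p = 1.0 - p
    p = 0.5 + (p - 0.5) * degree_weight
    return min(1.0, max(0.0, p))


def block_probability(tokens, lexicon, window=2):
    """Mean corrected probability of the emotion words in a block.

    lexicon -- word -> probability on a positive-emotion scale (assumed).
    window  -- number of preceding tokens scanned for negations and
               degree adverbs (the 'preset range', value assumed).
    Returns None when the block contains no emotion word.
    """
    values = []
    for i, token in enumerate(tokens):
        if token not in lexicon:
            continue
        context = tokens[max(0, i - window): i]
        negation_count = sum(1 for w in context if w in NEGATIONS)
        weight = 1.0
        for w in context:
            weight *= DEGREE_ADVERBS.get(w, 1.0)
        values.append(corrected_probability(lexicon[token], negation_count, weight))
    return sum(values) / len(values) if values else None
```

For Chinese text the whitespace split would have to be replaced by a word segmenter, and the correction rule should be replaced by the formulas given in the description.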
4. The method of claim 1, wherein calculating the emotion probability of each position trigger word for a single user according to the emotion probability of the corpus block to be analyzed comprises:
when the position trigger word does not appear in the data published by the user, the emotion probability of the position trigger word is null;
when the position trigger word appears once in the data published by the user, taking the emotion probability of the corpus block in which the position trigger word appears as the emotion probability of the position trigger word;
and when the position trigger word appears twice or more in the data published by the user, taking the average of the emotion probabilities of all the corpus blocks in which the position trigger word appears as the emotion probability of the position trigger word.
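The three cases of claim 4 reduce to one small function, sketched here in Python. The function name and the use of `None` to represent the null value are assumptions for illustration.

```python
def user_trigger_probability(block_probabilities):
    """Emotion probability of one position trigger word for one user.

    block_probabilities -- emotion probabilities of the corpus blocks in
    which the trigger word appears (one entry per occurrence).
    No occurrence -> None (null); one occurrence -> that block's value;
    two or more -> the arithmetic mean of the block values.
    """
    if not block_probabilities:
        return None
    return sum(block_probabilities) / len(block_probabilities)
```

The single-occurrence case needs no separate branch, because the mean of a one-element list is the element itself.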
5. The method as claimed in claim 1, wherein after calculating the emotion probability of each position trigger word for a single user according to the emotion probability of the corpus block to be analyzed, the method further comprises:
normalizing the emotion probabilities of the position trigger words for the single user to obtain an emotion probability correction value of each position trigger word for the single user.
6. The method as claimed in claim 5, wherein calculating the average emotion probability of each position trigger word in the community formed by all the users according to the emotion probability of the position trigger word for each single user comprises:
averaging, over the users, the emotion probability correction values of each position trigger word to obtain the average emotion probability of the position trigger word in the community;
and sorting the position trigger words according to their average emotion probabilities.
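The community averaging and sorting of claim 6 can be sketched as below. The sketch assumes the per-user values have already been normalized as in claim 5 (the normalization scheme is not stated in the claims, so it is not reproduced), that null values are simply skipped when averaging, and that sorting is in descending order; the function name and data layout are hypothetical.

```python
def community_ranking(users):
    """Average each position trigger word over the community and sort.

    users -- dict: user id -> dict: trigger word -> probability or None
             (values assumed to be already normalized per user).
    Returns a list of (trigger word, average probability) pairs sorted by
    average probability, highest first (sort direction is an assumption).
    """
    sums, counts = {}, {}
    for probabilities in users.values():
        for trigger, p in probabilities.items():
            if p is None:               # user never mentioned this trigger word
                continue
            sums[trigger] = sums.get(trigger, 0.0) + p
            counts[trigger] = counts.get(trigger, 0) + 1
    averages = {t: sums[t] / counts[t] for t in sums}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)
```

A usage example with made-up data: for `{"u1": {"tax": 0.2, "park": 0.9}, "u2": {"tax": 0.4, "park": None}}`, the word "park" is ranked ahead of "tax", since only u1 contributes to its average.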
7. A user profile creation apparatus based on a user's emotional standpoint, comprising:
the corpus acquiring module is used for acquiring independent short text corpora from user historical data;
the emotion word bank construction module is used for classifying the acquired short text corpus according to emotion tendencies, constructing an emotion word bank according to the distribution condition of words in the short text corpus classification result and calculating the original emotion probability of emotion words in the emotion word bank; wherein, the emotion word bank comprises positive emotion words and negative emotion words;
the position trigger word library construction module is used for constructing a position trigger word library according to an application scene, wherein the position trigger word library comprises position trigger words capable of causing a user position or emotional reaction;
the corpus block emotion probability calculation module is used for extracting the position trigger words contained in the text information published by the user, forming a corpus block to be analyzed according to the extracted position trigger words, and then calculating the emotion probability of the corpus block to be analyzed according to the emotion probability of the emotion words in the corpus block, the number of degree adverbs and the number of negation words;
the single-user position trigger word emotion probability calculation module is used for calculating the emotion probability of each position trigger word for a single user according to the emotion probability of the corpus block to be analyzed;
the average emotion probability sorting module is used for calculating the average emotion probability of each position trigger word in the community formed by all the users according to the emotion probability of the position trigger word for each single user, and sorting the position trigger words according to the average emotion probability;
and the user portrait construction module is used for constructing the user portrait according to the sorting result of the position trigger words in the community and the emotion probability of the single user for the position trigger words.
8. A user portrait visualization method, comprising: visually displaying a user portrait constructed by the method of any one of claims 1 to 6.
9. The method of claim 8, wherein visually displaying the user portrait comprises:
mapping the position trigger words to word blocks in a preset shape according to the average emotion probability;
constructing a corresponding relation between the emotional probability and the color characteristics of a single user to the position trigger words;
and carrying out visual display on the user portrait according to the corresponding relation between the emotional probability and the color characteristic.
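The correspondence between emotion probability and color in claim 9 can be illustrated with a short Python sketch. The color gradient formula itself is not reproduced in the claims, so a plain linear interpolation between two RGB endpoints is assumed here; the endpoint colors (blue for negative, red for positive), the grey fallback for null values and the function name are all hypothetical choices.

```python
def probability_to_color(p, low=(0, 0, 255), high=(255, 0, 0), missing="#cccccc"):
    """Map an emotion probability in [0, 1] to a hex color string.

    Assumed gradient: linear interpolation per RGB channel from `low`
    (p = 0) to `high` (p = 1). A null probability (None) is drawn in a
    neutral grey, since the user expressed no stance on that word.
    """
    if p is None:
        return missing
    p = min(1.0, max(0.0, p))           # clip out-of-range values
    r, g, b = (round(lo + (hi - lo) * p) for lo, hi in zip(low, high))
    return f"#{r:02x}{g:02x}{b:02x}"
```

Each word block placed according to the community ranking can then be filled with `probability_to_color(p)` for the user being displayed, so that one user's stance is read from color and the community's ordering from position.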
10. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when executed, implements the method of any one of claims 1 to 6.
CN201910961379.5A 2019-10-11 2019-10-11 A method and device for establishing a user portrait based on the user's emotional stance, and a visualization method of the user portrait Active CN112651237B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910961379.5A CN112651237B (en) 2019-10-11 2019-10-11 A method and device for establishing a user portrait based on the user's emotional stance, and a visualization method of the user portrait

Publications (2)

Publication Number Publication Date
CN112651237A (en) 2021-04-13
CN112651237B (en) 2024-03-19

Family

ID=75343004

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910961379.5A Active CN112651237B (en) 2019-10-11 2019-10-11 A method and device for establishing a user portrait based on the user's emotional stance, and a visualization method of the user portrait

Country Status (1)

Country Link
CN (1) CN112651237B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114140814A (en) * 2022-02-07 2022-03-04 北京无疆脑智科技有限公司 Emotion recognition capability training method and device and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013222231A (en) * 2012-04-13 2013-10-28 Nec Corp Emotion sharing communication facilitating system, emotion sharing communication facilitating method, and program
JP2017120634A (en) * 2015-12-28 2017-07-06 株式会社リコー Method and apparatus for analyzing affective word polarity
CN109767787A (en) * 2019-01-28 2019-05-17 腾讯科技(深圳)有限公司 Emotion identification method, equipment and readable storage medium storing program for executing
CN109994102A (en) * 2019-04-16 2019-07-09 上海航动科技有限公司 A kind of outer paging system of intelligence based on Emotion identification

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
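Suo Xiaoyang; Wang Wei: "Research on a method for constructing user group portraits based on social network data", 网络空间安全 (Cyberspace Security), no. 09, 25 September 2019 (2019-09-25) *
Jiang Shengyi; Huang Weijian; Cai Maoli; Wang Lianxi: "Research on microblog-oriented social emotion lexicon construction and emotion analysis methods", 中文信息学报 (Journal of Chinese Information Processing), no. 06, 15 November 2015 (2015-11-15) *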

Also Published As

Publication number Publication date
CN112651237B (en) 2024-03-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant