WO2014103645A1 - Conversation topic providing system, conversation control terminal device, and maintenance device - Google Patents
- Publication number
- WO2014103645A1 (PCT/JP2013/082623)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- topic
- information
- input
- scenario data
- unit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation
Definitions
- the present invention relates to a topic providing system that provides a topic to a user, an information search system that provides an information search function to a user, a sentence analysis device that extracts an important character string from text data, and an information update device that compares a plurality of dictionaries.
- the conventional conversation control system uses artificial intelligence: a server performs morphological analysis on character information input by the user, and response content created in advance and associated with the semantic content grasped from the morphemes is output to the user.
- there is also a conversation control system that divides the morphemes into a first morpheme and a second morpheme and analyzes them according to concept. Furthermore, there is also a conversation control system that creates an answer by identifying the degree of emotion that the user has.
- search words that have increased rapidly in net searches or the like during a predetermined period, words that have increased rapidly in news or the like during a predetermined period, and the like are stored as event words.
- words analyzed in advance based on their appearance frequency in news including an event word, and the like, are stored as event word related words.
- there is an information search system configured to display a list of the above-described event word related words when one of the event words is input as a search word for net search or the like (see Patent Document 4).
- a system for extracting a topical word from sentences input to a computer system or contents such as a homepage published on the Internet has been disclosed.
- the input information is collated with the morpheme group stored in the morpheme database, and a character string corresponding to a morpheme is extracted from the input information.
- a morpheme corresponds to a minimum unit such as “word” that constitutes a sentence included in input information or the like, and this minimum unit includes parts of speech such as nouns, adjectives, and verbs.
- morpheme groups including nouns, adjectives, verbs and the like used for general sentences are registered in advance.
- Patent Document 7 discloses an information processing system in which personally generated data generated by an individual is decomposed into a plurality of pieces of decomposed text data that can be distinguished from each other; a data set satisfying a predetermined related condition and a predetermined characteristic condition for the object indicated by the decomposed text data is generated as a censorship field; data satisfying a predetermined singularity condition for the object indicated by the decomposed text data is extracted from that data set; and the data thus extracted, in which a predetermined specificity has been detected, is visualized.
- the censorship field is a space composed of a collection of text data to be censored.
- the related condition is a condition set by an operator of the information processing system or the like regarding the target indicated by the decomposed text data (topic or content included in the text data).
- soy sauce-flavored ramen and salty ramen can be included as related objects that satisfy the relevant conditions.
- the feature condition is a condition related to the target attribute (feature such as characteristic or spot color) indicated by the disassembled text data.
- the attribute can also be regarded as a preference. For example, when the target indicated by the decomposed text data is food, a subjective or objective description such as delicious or bad can be considered.
- the singular condition is to show a predetermined specificity with respect to the object indicated by the text data. Specificity can be determined based on whether various amounts, degrees, and change rates related to a predetermined object are larger or smaller than a predetermined threshold. For example, in a certain partial space, when the number of occurrences of the word “ramen” (utterance) is larger than that in the remaining partial spaces, the decomposed text data is extracted and visualized as a predetermined unique condition.
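The "ramen" example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the partial spaces are modeled as named strings, and the function name `singular_subspaces` and the `ratio` threshold are hypothetical, not taken from Patent Document 7.

```python
from collections import Counter


def singular_subspaces(subspaces, word, ratio=2.0):
    """Return the partial spaces in which `word` is unusually frequent.

    A space is flagged when its count exceeds `ratio` times the mean count
    in the remaining spaces (an assumed threshold rule).
    """
    counts = {name: Counter(text.split())[word] for name, text in subspaces.items()}
    flagged = []
    for name, count in counts.items():
        others = [c for other, c in counts.items() if other != name]
        baseline = sum(others) / len(others) if others else 0.0
        if count > 0 and count > ratio * baseline:
            flagged.append(name)
    return flagged


subspaces = {
    "tokyo": "ramen ramen shop ramen noodle ramen",
    "osaka": "takoyaki ramen okonomiyaki",
    "kyoto": "temple tea garden",
}
print(singular_subspaces(subspaces, "ramen"))
```

Only the space where "ramen" dominates is flagged; the threshold could equally be defined on a degree or a change rate, as the text notes.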
- the first problem is as follows. When proceeding with a conversation, it is common to first provide some topic (theme or the like) in advance and proceed with the conversation along that topic.
- when a conversation with a user is advanced using a conventional topic control system (including a conversation control system), the following problems are assumed to occur in adapting to such a conversation form.
- because topic control rules are tied to advanced and specialized knowledge and technology throughout the topic control system, it becomes difficult to perform maintenance work that changes or corrects topic control rules separately.
- the present invention has been made in view of the above-mentioned points, and its object is to provide a topic providing system, as a topic control system, that realizes a natural response according to the topic and the flow of conversation and allows maintenance work such as changing, adding, or correcting topic control rules to be performed separately.
- the second problem is as follows.
- even a terminal device with a relatively small storage capacity or a low processing capacity, such as a portable terminal (a mobile phone or a smartphone), has been able to advance a conversation with the user.
- Such a conversation control system is expected to cause the following problems. First, processing such as responding to input information input by the user is executed on the server side, so communication via the network is required each time the user inputs something; when many user terminal devices are connected to the server, the processing time becomes long and the response tends to be delayed. Furthermore, because this processing is executed on the server side, making different responses according to the status of each of multiple users increases the processing time further, and the response is easily delayed.
- the present invention has been made in view of the above points, and the object of the present invention is to reduce the load on the server and reduce network traffic, and to respond accurately according to the status of a plurality of users. Another object of the present invention is to provide a conversation control terminal device capable of smoothly proceeding with a conversation in accordance with a conversation flow with a user.
- the third problem is as follows.
- the existing conversation control systems have built up a huge amount of data and programs for advancing conversations with users, and have realized the system by processing that uses these data and programs. Therefore, the response information is determined by a server having a large storage capacity and high processing capability, and a service is provided to a plurality of users by connecting a plurality of user-operated terminal devices to such a server.
- the present invention has been made in view of the above-mentioned points, and its object is to provide a maintenance device that can customize responses for each of a plurality of users who use the conversation control system and can verify in advance whether or not the customization is appropriate.
- the fourth problem is as follows.
- the following problems occur. When event word related words are obtained from an AND search on event words in the net search, these event word related words are information listed according to the event words that users input. They are therefore often already-known, biased information for users of the net search, and the association between an event word and an event word related word is merely a relationship conceived by the person who entered it. It is thus difficult to acquire a new, unknown topic by using event word related words obtained from an AND search with event words in the net search.
- when event word related words are analyzed in advance based on the appearance frequency or the like in news including the event word, they are obtained by analyzing past news, so it is difficult to get the latest topic from event word related words.
- An object of the present invention is to provide an information search system, an information search device, an information search method, and a program that can solve the fourth problem described above.
- the fifth problem is as follows.
- the analysis cannot be performed unless a morpheme database to be collated is created in advance, and the labor of creating it is extremely large. Unless nouns, adjectives, verbs, and the like used in everyday sentences are registered in the morpheme database so that they are broadly covered, effective analysis of input information or the like cannot be realized.
- topical information can be grasped only in units of morphemes such as “words”.
- in the information acquisition device, in order to find the most topical information from the input information or the like, it is necessary to perform a matching process with the morpheme database for each word included in the input information or the like. Since a lot of time is required for access, comparison processing, and the like, the response time becomes long. As a result, it is difficult to analyze input information and the like in real time.
- an object of the present invention is to provide a sentence analysis apparatus, a sentence analysis method, and a program that can solve the fifth problem described above.
- the sixth problem is as follows.
- a plurality of decomposed text data whose meaning can be identified is extracted from text data generated by an individual, and further, the decomposed text data satisfying a predetermined condition is narrowed down.
- the text data that is subject to censorship contains a huge amount of text data that is constantly being updated from around the world on the network; in order to handle the latest information, the extraction must be performed sequentially in accordance with the updates of this enormous text data, and the amount of work is enormous.
- newly appeared decomposed text data has great value as new information, but it is extremely difficult to separate newly appeared decomposed text data from a large amount of text data.
- extracting such newly appeared decomposed text data from a large amount of text data is important for grasping the topic, but because the text data is enormous, it is difficult to capture such decomposed text data accurately.
- an object of the present invention is to provide an information updating apparatus, an information updating method, and a program that can solve the sixth problem described above.
- the feature according to the first embodiment of the present invention is as follows.
- an input unit for a user to input input information;
- an input information analysis unit that analyzes the input information and generates input specific information;
- a scenario data storage unit that stores scenario data for defining response information on the topic;
- a response information determination unit that determines the response information based on the scenario data and the input specific information; and
- an output unit that outputs the response information determined by the response information determination unit.
- since response information is determined by the scenario data and the input specific information analyzed by the input information analysis unit, a natural response that matches the topic and the flow of conversation can be realized.
- since response information on topics can be defined by scenario data, maintenance work such as changing, adding, or modifying topic control rules can be performed separately, without relying on advanced and specialized knowledge and technology throughout the topic control system.
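As a rough illustration of how the units listed above fit together, the following Python sketch separates the scenario data (plain data that a maintainer could edit) from the analysis and determination logic. The `Scenario` class, the label names ("affirm", "deny", and so on), and the keyword rules in `analyze_input` are all assumptions made for this example; the publication does not specify them.

```python
import re
from dataclasses import dataclass


@dataclass
class Scenario:
    """Scenario data: maps input specific information to response information."""
    topic: str
    responses: dict
    default: str


def analyze_input(text):
    """Input information analysis unit (toy): reduce raw text to a coarse label."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    if "?" in text:
        return "question"
    if words & {"yes", "sure", "ok"}:
        return "affirm"
    if words & {"no", "nope"}:
        return "deny"
    return "other"


def determine_response(scenario, input_specific):
    """Response information determination unit: a lookup in the scenario data."""
    return scenario.responses.get(input_specific, scenario.default)


ramen = Scenario(
    topic="ramen",
    responses={
        "affirm": "Soy sauce or salt flavor?",
        "deny": "Maybe another time.",
        "question": "I only know a little about ramen.",
    },
    default="Do you like ramen?",
)
print(determine_response(ramen, analyze_input("Yes, I do")))
```

Because the responses live in the `Scenario` data rather than in the code, changing or adding a rule means editing data only, which is the kind of separate maintenance the text describes.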
- the feature according to the first embodiment of the present invention further includes a state control index storage unit for storing a state control index related to the input information and the response information;
- the response information determination unit determines the response information by taking the state control index into account in addition to the scenario data and the input specific information.
- since response information is determined using the state control index in addition to the scenario data and the input specific information, a natural response that more closely matches the topic and conversation flow can be realized.
- the state control index makes it possible to provide a topic while capturing the history of input and topic provision, so a response can be realized that lets the user experience "context learning" (storing the past flow of topics and conversations and adapting to the current flow).
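One way to picture a state control index is as a small record of what has already been said, consulted when the response is chosen. The sketch below is an assumption-laden illustration: the `StateControlIndex` class, its fields, and the rule "pick a later candidate when the same input repeats" are hypothetical and only stand in for whatever indices the actual system stores.

```python
from collections import Counter


class StateControlIndex:
    """Toy state control index: remembers past inputs and responses."""

    def __init__(self):
        self.seen = Counter()   # how often each input label has occurred
        self.history = []       # (input_specific, response) pairs, in order

    def record(self, input_specific, response):
        self.seen[input_specific] += 1
        self.history.append((input_specific, response))


def determine_response(responses, input_specific, state):
    """Choose a response from scenario data, adjusted by the conversation history."""
    candidates = responses.get(input_specific, ["Tell me more."])
    # Repeated inputs move on to later candidates; the last one is reused.
    index = min(state.seen[input_specific], len(candidates) - 1)
    response = candidates[index]
    state.record(input_specific, response)
    return response


responses = {
    "deny": ["Not a fan of ramen?", "All right, I will stop asking about ramen."],
}
state = StateControlIndex()
for _ in range(3):
    print(determine_response(responses, "deny", state))
```

The same input produces a different response the second time, which is a minimal form of adapting to the past flow of the conversation.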
- the feature according to the first embodiment of the present invention further includes the following:
- the scenario data includes information defining transitions to different topics;
- a topic switching unit is further provided that switches the scenario data from scenario data for defining response information on the current topic to scenario data for defining response information on a different topic, in accordance with the information defining the transition to the different topic.
- when used together with the state control index, the topic can be switched by capturing the user's personality and emotional state, so a response can be realized that lets the user experience "emotional control" (switching the topic by capturing the personality and emotional state of the user).
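The topic switching described above might look like the following sketch, in which each scenario carries its own transition table. The dictionary layout, the `fallback` key, and the `negative_streak` counter (a stand-in for an emotional-state index) are assumptions for illustration, not structures defined in the publication.

```python
SCENARIOS = {
    "ramen": {
        "transitions": {"ask_weather": "weather"},  # input label -> next topic
        "fallback": "weather",                      # used when the user seems bored
    },
    "weather": {
        "transitions": {"ask_food": "ramen"},
    },
}


def next_topic(current, input_specific, scenarios, negative_streak=0, patience=2):
    """Topic switching unit (toy): follow the transition info in the scenario data."""
    scenario = scenarios[current]
    transitions = scenario.get("transitions", {})
    if input_specific in transitions:
        return transitions[input_specific]
    # Assumed state-driven rule: leave the topic after repeated negative inputs.
    if negative_streak >= patience and "fallback" in scenario:
        return scenario["fallback"]
    return current


print(next_topic("ramen", "ask_weather", SCENARIOS))
print(next_topic("ramen", "deny", SCENARIOS, negative_streak=2))
print(next_topic("ramen", "deny", SCENARIOS, negative_streak=1))
```

Keeping the transitions inside the scenario data means a new topic path can be added without touching `next_topic` itself.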
- a topic list (for example, shown in FIG. 17) that stores topics associated with related terms (for example, a part of text data that can be associated with a topic, which is different from a normal keyword used in a normal search process or the like).
- topic extraction means for extracting, from the topic list (for example, topic list G), a topic related to text data (for example, a topic input from a keyboard by a person in charge of the maintenance device 3) based on a related term associated with that text data
- display control means (for example, a control unit configured by a CPU or the like of the maintenance device 3) for controlling the display of the extracted topic
- the display control means displays, for each of the extracted topics, a related term associated with the topic (for example, instructing “condition setting: priority related term” as in a screen 1815 in FIG. 17).
- the related topic providing system controls the display of related terms in the form of a prioritized list.
- the topic itself and a plurality of related terms related to the topic can be displayed. Therefore, starting from the topics obtained from the topic list, various topic variations can be displayed.
- the text data is configured to include at least one of (1) input topics and (2) topics extracted, based on an input topic, from external log data (for example, data that can be collected via a network, such as Twitter and blogs).
- a topic introduction list can be constructed by repeating.
- the characteristics according to the first embodiment of the present invention are as follows:
- when the display control unit receives input of a related term associated with the extracted topic, it displays the topics associated with the input related term (for example, as shown in the screen 1815 in FIG. 17).
- the displayed topics are narrowed down to only the topics associated with the related term "I am relieved", and, as shown in the screen 1811, by inputting "I am the most scared", the topics to be displayed are further narrowed down to only the topics associated with the related term "most scary".
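The extraction and narrowing steps can be illustrated with a small topic list keyed by related terms. In this sketch the topic names and the list layout are invented for the example; only the related terms "I am relieved" and "most scary" come from the text, and mapping the input "I am the most scared" to the related term "most scary" is assumed to happen elsewhere.

```python
TOPIC_LIST = [
    {"topic": "health check results", "related_terms": {"I am relieved", "most scary"}},
    {"topic": "exam results", "related_terms": {"I am relieved"}},
    {"topic": "horror movies", "related_terms": {"most scary"}},
]


def extract_topics(topic_list, text):
    """Topic extraction: keep topics with a related term that appears in the text."""
    lowered = text.lower()
    return [
        entry for entry in topic_list
        if any(term.lower() in lowered for term in entry["related_terms"])
    ]


def narrow(topics, related_term):
    """Narrow the displayed topics to those associated with one related term."""
    return [entry for entry in topics if related_term in entry["related_terms"]]


relieved = narrow(TOPIC_LIST, "I am relieved")
both = narrow(relieved, "most scary")
print([entry["topic"] for entry in relieved])
print([entry["topic"] for entry in both])
```

Each additional related term filters the previous result, so the displayed list shrinks step by step, as in the two screens described above.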
- the display control means controls the display of the topic based on the manner in which related terms have been associated with topics in the past (for example, by analyzing the association of related terms with reference to a preference dictionary E built from other users' past history).
- the related term can be displayed based on the user's preference.
- the feature according to the second embodiment of the present invention is as follows:
- an input unit for a user to input input information;
- a scenario data storage unit for storing scenario data for defining response information on the topic;
- a state control index storage unit that stores a state control index related to the input information and the response information;
- a response information determination unit that determines the response information based on the scenario data and the state control index; and
- an output unit that outputs the response information determined by the response information determination unit.
- the conversation control terminal device can itself judge and control whether or not conversation with the user is possible. The conversation can therefore proceed smoothly according to the conversation flow with the user without significantly increasing the processing load on the conversation control terminal device, without increasing the load on the server, and without increasing the network traffic.
- since the conversation control terminal device determines the response information based on the scenario data and the state control index, the response information can be determined according to the progress of the conversation with the user, and it is possible to respond accurately according to the state of the user without increasing the load on the server.
- the feature according to the second embodiment of the present invention further includes a transmission unit for transmitting the input information to the outside and a receiving unit that receives the input specific information generated by analyzing the transmitted input information, and the response information determination unit determines the response information by adding the input specific information to the scenario data and the state control index.
- since response information is determined by adding the input specific information, the response information can be determined including input specific information that is the result of analysis performed externally, such as by a server, and the conversation can proceed smoothly according to the conversation flow with the user.
- the feature according to the second embodiment of the present invention is further that the receiving unit receives scenario data extracted based on the input information, and
- the scenario data storage unit stores the received scenario data.
- since scenario data extracted based on the input information is received, the scenario data can be switched based on the input information input by the user, and the conversation can proceed smoothly according to the flow of the conversation with the user.
- the feature according to the second embodiment of the present invention is further as follows:
- the scenario data includes information defining transitions to different topics;
- a switching input information input unit is provided that generates topic switching input information according to the information defining the transition to the different topic;
- the transmission unit transmits the topic switching input information to the outside; and
- the receiving unit receives scenario data based on the topic switching input information.
- since the topic can be switched based on the state control index and the scenario data, the topic can be switched while observing the state of the conversation with the user, and the conversation can proceed smoothly according to the conversation flow with the user.
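The division of labor in the second embodiment can be sketched as a terminal that answers from locally stored scenario data and contacts the outside only when the topic changes. Everything here is a simplification under stated assumptions: `fake_server` is a hypothetical stand-in for the transmission and receiving units, and the scenario layout is invented for the example.

```python
SERVER_SCENARIOS = {
    "ramen": {
        "opening": "Do you like ramen?",
        "responses": {"affirm": "Soy sauce or salt flavor?"},
        "transitions": {"deny": "weather"},
        "default": "I see.",
    },
    "weather": {
        "opening": "How is the weather there?",
        "responses": {"affirm": "Glad to hear it."},
        "transitions": {},
        "default": "I see.",
    },
}
requests = []  # log of every request that reached the (fake) server


def fake_server(topic):
    """Hypothetical server: returns scenario data for the requested topic."""
    requests.append(topic)
    return SERVER_SCENARIOS[topic]


class ConversationTerminal:
    """Toy conversation control terminal with a local scenario data store."""

    def __init__(self, fetch_scenario, topic):
        self.fetch_scenario = fetch_scenario
        self.topic = topic
        self.scenario = fetch_scenario(topic)

    def respond(self, input_specific):
        target = self.scenario["transitions"].get(input_specific)
        if target is not None:
            # Topic switching input information is sent; new scenario data is received.
            self.scenario = self.fetch_scenario(target)
            self.topic = target
            return self.scenario["opening"]
        # Ordinary turns are answered locally, with no network round trip.
        return self.scenario["responses"].get(input_specific, self.scenario["default"])


terminal = ConversationTerminal(fake_server, "ramen")
print(terminal.respond("affirm"))
print(terminal.respond("deny"))
print(requests)
```

Only the initial load and the topic switch reach the server in this sketch, which is the traffic reduction the text attributes to terminal-side response determination.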
- a scenario data storage unit that stores scenario data for defining response information related to a topic based on input specifying information generated by analyzing input information input by a user;
- a receiving unit for receiving the input identification information;
- a scenario data editing unit for making the scenario data editable;
- a scenario data verification unit that enables verification of a response of scenario data edited based on the input specific information received by the reception unit;
- a scenario data transmission unit that transmits the edited scenario data to the outside.
- the data in the conversation control system includes both data for analyzing input information input by the user to generate input specific information and scenario data for determining response information based on the input specific information.
- This scenario data is data that can diversify response information that is an answer to the user.
- the scenario data editing unit can edit the scenario data, and
- the scenario data verification unit can verify the response of the edited scenario data. By doing so, the scenario data can be customized for each of a plurality of users who use the conversation control system without the need for highly specialized knowledge and technology throughout the conversation control system.
- the feature according to the third embodiment of the present invention further includes a terminal device virtual construction unit that virtually constructs a conversation control terminal device having: an input unit for a user to input input information; a state control index storage unit that stores a state control index related to the input information and the response information; a response information determination unit that determines the response information based on the scenario data and the state control index; and an output unit that outputs the response information determined by the response information determination unit.
- the terminal device virtual construction unit can virtually construct and execute the conversation control terminal device in the maintenance device. Therefore, an environment similar to that of a conversation control terminal device used by a general user can be realized in the maintenance device. As a result, the content and operation of the scenario data can be checked in advance in an environment similar to the one in which the user actually carries out the conversation, and the content of the scenario data can be verified before the conversation with the user. It is thus possible to verify in advance whether or not the customization made for each of a plurality of connected users is appropriate.
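Verification in a virtual terminal can be thought of as replaying known inputs against edited scenario data before it is sent out. The following is a minimal sketch under assumptions: `run_virtual_terminal` and `verify_scenario` are hypothetical names, and the virtual terminal is reduced to a plain lookup rather than a full emulation of the user's device.

```python
import copy


def run_virtual_terminal(scenario, input_specifics):
    """Virtually constructed terminal (toy): replay inputs against scenario data."""
    return [
        scenario["responses"].get(label, scenario["default"])
        for label in input_specifics
    ]


def verify_scenario(scenario, cases):
    """Return (input, expected, actual) for every case whose response differs."""
    failures = []
    for label, expected in cases:
        actual = run_virtual_terminal(scenario, [label])[0]
        if actual != expected:
            failures.append((label, expected, actual))
    return failures


original = {"responses": {"affirm": "Soy sauce or salt flavor?"}, "default": "I see."}

# Scenario data editing: customize a copy for one user, leaving the original intact.
edited = copy.deepcopy(original)
edited["responses"]["affirm"] = "Shall I find a ramen shop near you?"

cases = [("affirm", "Shall I find a ramen shop near you?"), ("other", "I see.")]
print(verify_scenario(edited, cases))    # no failures expected for the edited data
print(verify_scenario(original, cases))  # the unedited data fails the first case
```

An empty failure list is the signal that the edited scenario data behaves as intended and can be transmitted.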
- the feature according to the third embodiment of the present invention further includes a topic analysis unit for generating a topic list to which the closeness of topics and the way they are connected are given via related terms that relate topics, and
- the scenario data editing unit edits, as the scenario data, a topic introduction scenario for introducing a topic to a user and an input related scenario for responding to a user input, by using the topic list and the related terms.
- the scenario data editing unit can edit the scenario data in cooperation with the topic analysis unit, and the scenario data verification unit can verify the response of the edited scenario data. By doing so, the scenario data can be customized for each of a plurality of users who use the conversation control system without the need for highly specialized knowledge and technology throughout the conversation control system.
- the fourth embodiment of the present invention is an information search system (for example, the information search system 100 shown in FIG. 20) configured to include:
- sentence information acquisition means (for example, an input information analysis unit 41 shown in FIG. 20) for acquiring, from text data collected by keyword-based search (for example, external log 502 (text data) acquired from the collected WEB page shown in FIG. 20), sentence information related to the keyword (for example, a question sentence that is a part of the text data hit by the keyword search);
- character string selection means (for example, the sentence analysis unit 43 shown in FIG. 20) for selecting one or a plurality of character strings satisfying a predetermined condition (for example, related verbs that are meaning-identifiable character strings) from each of the sentence information and storing the character strings, for each corresponding sentence information, in character string storage means (for example, the related term dictionary 50 shown in FIG.); and
- information output means (for example, the input information analysis unit 41 shown in FIG. 20) that outputs the input specific information including the related term / co-occurrence word data 52 shown in FIG.
- since the selected character string is obtained as information originating from the information sender, a new topic that is not yet known can be acquired.
- since the selected character string (related term) is the latest information obtained from the search based on the keyword, the latest information can be obtained.
- the character string selection means is configured to select the character string without collating with character string data stored in advance (for example, a dictionary including morpheme data or the like).
- the feature according to the fourth embodiment of the present invention further includes the following: the character string selection means further includes:
- character string search means (for example, a character string search processing unit 43b shown in FIG. 21) for searching the text data for occurrences of the same character string;
- difference degree determination means (for example, a difference degree determination processing unit 43c shown in FIG.) for determining, for the same character string, the degree of difference of the preceding adjacent character (for example, an index indicating how much the characters appearing immediately before the searched "same character string" vary, based on the number of patterns of characters appearing as preceding adjacent characters) and the degree of difference of the subsequent adjacent character (for example, how much the characters appearing immediately after the searched "same character string" vary); and
- specific character string determining means (for example, a related term determination processing unit 43d shown in FIG.) for determining whether or not the same character string is a specific character string (for example, a related term) based on the degree of difference of the preceding adjacent character and the degree of difference of the subsequent adjacent character.
- the character string selection means is configured to select the character string from the determined specific character strings.
- since related terms are determined according to the degree of difference between adjacent characters, it is not necessary to perform comparison processing with a dictionary including morpheme data or the like, and the processing can be speeded up.
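The adjacent-character idea can be tried out without any dictionary: a string that is a complete term tends to be preceded and followed by many different characters, whereas a fragment of a term is always followed (or preceded) by the same character. The sketch below counts distinct neighbors; the function names and the `min_variety` threshold are assumptions, and the unspaced sample text merely imitates a language written without word separators.

```python
def adjacent_variety(text, candidate):
    """Count distinct characters seen immediately before and after `candidate`."""
    before, after = set(), set()
    start = text.find(candidate)
    while start != -1:
        if start > 0:
            before.add(text[start - 1])
        end = start + len(candidate)
        if end < len(text):
            after.add(text[end])
        start = text.find(candidate, start + 1)
    return len(before), len(after)


def is_specific_string(text, candidate, min_variety=2):
    """Treat `candidate` as a specific string if both neighbors vary enough."""
    n_before, n_after = adjacent_variety(text, candidate)
    return n_before >= min_variety and n_after >= min_variety


text = "Iloveramensoup.hotramenisgood,myramen!"
for candidate in ("ramen", "rame", "amen"):
    print(candidate, adjacent_variety(text, candidate), is_specific_string(text, candidate))
```

Here "rame" is always followed by "n" and "amen" is always preceded by "r", so both are rejected as fragments, while "ramen" passes; no morpheme dictionary lookup is involved.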
- the display processing of the lyrics can be performed in real time.
- the feature according to the fourth embodiment of the present invention further includes scenario data storage means (for example, scenario data 28 and scenario data 55 shown in FIG. 20) for storing scenario data (for example, data consisting of statements as shown in FIG. 14) for defining response information related to the topic;
- response information determination means (for example, a response information determination unit 25 shown in FIG. 20) for determining the response information; and
- response information output means (for example, the output control unit 26 shown in FIG. 20) for outputting the response information determined by the response information determination means.
- the feature according to the fourth embodiment of the present invention further includes dictionary comparison means (for example, dictionary comparison processing unit 46c shown in FIG. 22). When the character string selection means stores the character string in the character string storage means, it stores the character string in the corresponding dictionary (for example, the related term dictionary 50 shown in FIG. 22) according to the collection condition of the text data.
- the dictionary comparison means performs a comparison process for comparing a plurality of the dictionaries and stores the comparison result in comparison result storage means (for example, comparison result data 54 shown in FIG.);
- the response information determining means determines the response information including the comparison result based on the scenario data;
- the response information output means outputs the response information determined by the response information determination means; and
- the dictionary comparison means is further configured to perform the comparison process and automatically update the comparison result stored in the comparison result storage means when at least one of the plurality of dictionaries is updated.
- the selected character string (related term) can be automatically updated, and a change in the appearance status of the related term can be grasped and displayed based on the comparison result at each update timing.
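The automatic re-comparison on update can be sketched as a small class that keeps one dictionary per collection condition and recomputes the comparison whenever any of them changes. The class name, the use of plain sets of related terms, and the result keys are assumptions for illustration; the actual dictionaries may hold richer data.

```python
class DictionaryComparer:
    """Toy dictionary comparison: one related term set per collection condition."""

    def __init__(self):
        self.dictionaries = {}        # collection condition -> set of related terms
        self.comparison_result = {}   # (condition_a, condition_b) -> differences

    def update(self, condition, terms):
        """Store or replace a dictionary, then refresh the comparison result."""
        self.dictionaries[condition] = set(terms)
        self._compare()

    def _compare(self):
        self.comparison_result = {}
        names = sorted(self.dictionaries)
        for i, first in enumerate(names):
            for second in names[i + 1:]:
                a, b = self.dictionaries[first], self.dictionaries[second]
                self.comparison_result[(first, second)] = {
                    "common": a & b,
                    "only_first": a - b,
                    "only_second": b - a,
                }


comparer = DictionaryComparer()
comparer.update("last_week", {"ramen", "typhoon"})
comparer.update("this_week", {"ramen", "festival"})
print(comparer.comparison_result[("last_week", "this_week")])

# Updating one dictionary refreshes the stored comparison automatically.
comparer.update("this_week", {"ramen", "festival", "fireworks"})
print(comparer.comparison_result[("last_week", "this_week")]["only_second"])
```

Terms that appear only in the newer dictionary are the "newly appeared" candidates whose change in appearance status the text says can be displayed at each update.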
- the feature according to the fourth embodiment of the present invention further includes the following:
- when one of the character strings corresponding to one piece of the sentence information and one of the character strings corresponding to a different piece of sentence information are common, the information output means is configured to output information for displaying the set of character strings corresponding to the one piece of sentence information in association with the set of character strings corresponding to the other piece of sentence information.
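Associating sets of character strings that share a common string amounts to a pairwise intersection over the stored sets. The sketch below assumes the character strings are held as one set per piece of sentence information; the sample sentences and the function name `link_sentences` are invented for the example.

```python
from itertools import combinations


def link_sentences(strings_by_sentence):
    """Pair up pieces of sentence information whose string sets share a string."""
    links = []
    for (first, a), (second, b) in combinations(strings_by_sentence.items(), 2):
        shared = a & b
        if shared:
            links.append((first, second, shared))
    return links


strings_by_sentence = {
    "Where is good ramen?": {"ramen", "good"},
    "Is salt ramen healthy?": {"ramen", "salt", "healthy"},
    "Will it rain tomorrow?": {"rain", "tomorrow"},
}
for first, second, shared in link_sentences(strings_by_sentence):
    print(first, "<->", second, "via", sorted(shared))
```

Only the two ramen questions are linked, so their string sets could be shown side by side while the unrelated weather question stays separate.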
- the feature according to the fourth embodiment of the present invention further includes
- the information output means includes Outputting information for displaying all of the set of character strings corresponding to the predetermined one or more sentence information;
- the display order of the character strings is configured to be determined according to a usage mode of the user with respect to the character strings.
- the fourth embodiment of the present invention further includes: Keyword input means (for example, the input control unit 21 shown in FIG. 20) for inputting a keyword as a text data collection condition; When one or more character strings satisfying a predetermined condition are selected from each of the sentence information related to the keyword acquired from the text data collected based on the keyword, the selected character string is
- An information search device (for example, FIG. 20)
- information output means (for example, the output control unit 26 shown in FIG. 20)
- The fourth embodiment of the present invention further provides an information search method including: a sentence information acquisition step of acquiring sentence information related to a keyword from text data collected by a keyword-based search; a character string selection step of selecting, from each piece of sentence information, one or more character strings that satisfy a predetermined condition and storing them in a character string storage unit for each corresponding piece of sentence information; and an information output step of outputting information for displaying the selected character strings to the user for each corresponding piece of sentence information.
- The fourth embodiment of the present invention further provides a program that causes a computer to function as: sentence information acquisition means for acquiring sentence information related to a keyword from text data collected by a keyword-based search; character string selection means for selecting, from each piece of sentence information, one or more character strings that satisfy a predetermined condition and storing them in character string storage means for each corresponding piece of sentence information; and information output means for outputting information for displaying the selected character strings to the user for each corresponding piece of sentence information.
- Character string search means (for example, the character string search processing unit 43b shown in FIG. 21) for searching text data (for example, the external log 502, which is text data acquired from collected web pages) for the same character string; and, for the same character string, means for determining the difference degree of the preceding adjacent character (an index of how much the character appearing immediately before the searched “same character string” varies, based on, for example, the number of distinct characters that appear as the preceding adjacent character) and the difference degree of the subsequent adjacent character (an index of how much the character appearing immediately after the searched “same character string” varies), namely difference degree determination means (for example, a difference degree determination processing unit 43c shown in FIG.
- Specific character string determining means for determining whether or not the same character string is a specific character string (for example, a related term) based on the difference degree of the preceding adjacent character and the difference degree of the subsequent adjacent character.
- a sentence analysis device (for example, a sentence analysis device including the sentence analysis unit 43 shown in FIG. 21)
- a related term determination processing unit 43d shown in FIG.
- With this configuration, the character string search means searches the acquired text data for the same character string, the difference degree determination means determines the difference degrees of the adjacent characters before and after the same character string, and the specific character string determining means then determines, based on those difference degrees, whether the same character string is a specific character string (for example, a related term).
- A specific character string that is important for identifying a topic can therefore be extracted from text data without using a dictionary.
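The adjacent-character idea described above can be illustrated with a minimal Python sketch. The function names, the fixed candidate length, and the threshold rule are assumptions made for this illustration; the text does not specify how the difference degrees are combined into a decision.

```python
from collections import defaultdict

def adjacent_variety(text, length):
    """For each substring of `length` that occurs at least twice, return
    (number of distinct preceding characters, number of distinct following characters)."""
    before = defaultdict(set)
    after = defaultdict(set)
    count = defaultdict(int)
    for i in range(len(text) - length + 1):
        s = text[i:i + length]
        count[s] += 1
        if i > 0:
            before[s].add(text[i - 1])
        if i + length < len(text):
            after[s].add(text[i + length])
    return {s: (len(before[s]), len(after[s]))
            for s, n in count.items() if n >= 2}

def is_specific(degrees, threshold=2):
    # Hypothetical rule: the characters on both sides must vary
    # in at least `threshold` different ways.
    prev_degree, next_degree = degrees
    return prev_degree >= threshold and next_degree >= threshold

text = "xTOKYOa yTOKYOb zTOKYOc"
print(adjacent_variety(text, 5))
print(is_specific(adjacent_variety(text, 4)["OKYO"]))
```

In this sample the complete string "TOKYO" has varied neighbours on both sides, while the fragment "OKYO" is always preceded by "T", so only the former passes the assumed rule. No dictionary or morpheme database is consulted.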
- A further feature of the fifth embodiment of the present invention is as follows.
- When the text data includes a plurality of specific character strings, the specific character string determining means ranks the specific character strings in the text data based on at least one of the difference degree of the preceding adjacent character, the difference degree of the subsequent adjacent character, the character length, and the appearance frequency of each specific character string.
- With this configuration, the plurality of specific character strings can be differentiated according to importance, topicality, and the like, and analyzed accordingly.
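The ranking step can be sketched as follows. The text names four criteria but not how they are weighted, so the ordering used here (sum of the two difference degrees, then frequency, then length) is purely an assumption for illustration.

```python
def rank_specific_strings(candidates):
    """candidates maps each string to (prev_degree, next_degree, frequency).
    Returns the strings ordered from highest to lowest rank."""
    def score(item):
        string, (prev_degree, next_degree, frequency) = item
        # Hypothetical priority: neighbour variety, then frequency, then length.
        return (prev_degree + next_degree, frequency, len(string))
    ordered = sorted(candidates.items(), key=score, reverse=True)
    return [string for string, _ in ordered]

candidates = {
    "TOKYO": (3, 3, 3),
    "rain": (2, 2, 5),
    "typhoon": (3, 3, 3),
}
print(rank_specific_strings(candidates))
```

Any other combination of the four criteria would fit the description equally well; only the `score` function would change.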
- A further feature of the fifth embodiment of the present invention is that the text data is data collected according to a predetermined condition (for example, text data of web pages or blogs generated by individuals and published on the Internet, tweet information, and the like),
- the specific character string is determined for each of the different text data, and
- the specific character string determining means groups the determined specific character strings by corresponding text data (for example, a set of specific character strings is formed for each text data, and the plurality of text data as a whole is managed as a collection of such sets).
- With this configuration, the specific character strings are differentiated within each group, as in the ranking described above, and the degree of freedom in analysis can be improved.
- The fifth embodiment of the present invention further provides a sentence analysis method including: searching text data for the same character string; determining, for the same character string, the difference degree of the preceding adjacent character and the difference degree of the subsequent adjacent character; and determining whether or not the same character string is a specific character string based on the difference degree of the preceding adjacent character and the difference degree of the subsequent adjacent character. A method configured in this way provides the same effect as the sentence analysis apparatus described above.
- The fifth embodiment of the present invention further provides a program that causes a computer to function as: character string search means for searching text data for the same character string; difference degree determination means for determining, for the same character string, the difference degree of the preceding adjacent character and the difference degree of the subsequent adjacent character; and specific character string determination means for determining whether or not the same character string is a specific character string based on the difference degree of the preceding adjacent character and the difference degree of the subsequent adjacent character.
- a character string whose meaning can be identified (for example, a related term) is extracted from text data (for example, the external log 502, which is text data acquired from collected web pages), and the extracted character string is stored in a dictionary corresponding to the text data.
- a character string extraction means (for example, a character string extraction processing unit 46b shown in FIG.
- a comparison process for comparing a plurality of the dictionaries for example, the related terms stored in the related term dictionary (i-1) are compared with the related terms stored in the related term dictionary (i), and the related term dictionary (i)
- Dictionary comparison means for example, dictionary comparison processing unit 46c shown in FIG.
- comparison result storage means (for example, the comparison result data 54) for storing the comparison result, wherein the text data associated with different dictionaries (for example, text data 1, from which the related terms stored in the related term dictionary (i-1) were extracted, and text data 2, from which the related terms stored in the related term dictionary (i) were extracted) are text data collected under different collection conditions (for example, text data retrieved at different times with the same search conditions, or text data retrieved at the same time with different search conditions), and
- the dictionary comparison means performs the comparison process when at least one of the plurality of dictionaries is updated, and automatically updates the comparison result stored in the comparison result storage means (for example, by overwriting or by cumulative update). This constitutes an information update device (for example, an information update device including the information update unit 46 illustrated in FIG. 22).
- a character string whose meaning can be identified is extracted from text data, and the extracted character string is stored in a dictionary corresponding to the text data (text data associated with a different dictionary).
- the dictionary comparison means compares a plurality of the dictionaries and automatically stores the comparison results in the comparison result storage means. Character strings whose meaning can be identified can thus be extracted from the text data automatically and sequentially, and by comparing the dictionaries that store those character strings, the latest character strings can be grasped and changes in topics can be grasped effectively.
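The update-triggered comparison described above can be sketched as a small class. The class name, the integer dictionary index, and the shape of a stored result are assumptions for illustration; only the overwrite and cumulative update modes are taken from the text.

```python
class ComparisonStore:
    """Holds related-term dictionaries by index and re-runs the comparison
    with the previous dictionary each time one is updated."""

    def __init__(self, cumulative=False):
        self.dictionaries = {}   # index i -> set of related terms
        self.results = []        # stored comparison results
        self.cumulative = cumulative

    def update_dictionary(self, index, terms):
        self.dictionaries[index] = set(terms)
        previous = self.dictionaries.get(index - 1)
        if previous is None:
            return  # nothing to compare against yet
        current = self.dictionaries[index]
        result = {
            "index": index,
            "appeared": sorted(current - previous),
            "disappeared": sorted(previous - current),
        }
        if self.cumulative:
            self.results.append(result)   # cumulative update
        else:
            self.results = [result]       # overwrite update

store = ComparisonStore()
store.update_dictionary(1, ["typhoon", "festival"])
store.update_dictionary(2, ["typhoon", "election"])
print(store.results)
```

The caller never invokes the comparison directly; storing a dictionary is what triggers it, which mirrors the automatic update in the description.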
- A further feature of the sixth embodiment of the present invention is as follows.
- In the comparison process, the dictionary comparing means stores each character string as the comparison result according to its appearance status across the plurality of dictionaries (for example, when the related term dictionary (i-1) is compared with the related term dictionary (i), the status in which a related term newly appears in the related term dictionary (i), the status in which a related term disappears from the related term dictionary (i), and the like).
- With this configuration, related terms that newly appear in the later related term dictionary, related terms that disappear from the later related term dictionary, and the like can be grasped according to their appearance status.
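Classifying related terms by appearance status amounts to set differences between the earlier and later dictionaries. The sketch below assumes each dictionary can be reduced to a set of terms; the status labels are illustrative names, not terms from the text.

```python
def compare_dictionaries(older, newer):
    """Classify terms by appearance status between dictionary (i-1) and dictionary (i)."""
    older, newer = set(older), set(newer)
    return {
        "appeared": sorted(newer - older),      # new in dictionary (i)
        "disappeared": sorted(older - newer),   # gone from dictionary (i)
        "common": sorted(older & newer),        # present in both
    }

result = compare_dictionaries(
    ["typhoon", "festival"],
    ["typhoon", "election"],
)
print(result)
```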
- A further feature of the sixth embodiment of the present invention is as follows.
- When a plurality of character strings are extracted from the text data, the character string extraction means associates the plurality of character strings with one another and stores them in the corresponding dictionary (for example, a plurality of related terms extracted from one text data (co-occurring related terms) are ranked and stored as one record), and
- in the comparison process, when the plurality of dictionaries contain common character strings, the dictionary comparing means compares the character strings associated with each of the common character strings (for example, it determines the commonality between sets of related terms, that is, it compares the co-occurring related terms of each common related term).
- With this configuration, the associated character strings (neighboring related words) are also compared, so that the level of commonality of the common character strings can be grasped.
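One way to quantify the "level of commonality" between the co-occurring terms of a common related term is a set-overlap ratio. The text does not name a measure, so the Jaccard index used below is an assumption chosen for illustration, as are the function name and data layout.

```python
def cooccurrence_overlap(older, newer):
    """older/newer map each related term to its co-occurring related terms.
    Returns, for each term present in both, the Jaccard overlap of those sets."""
    overlap = {}
    for term in older.keys() & newer.keys():
        a, b = set(older[term]), set(newer[term])
        union = a | b
        # Two empty sets are treated as identical.
        overlap[term] = len(a & b) / len(union) if union else 1.0
    return overlap

older = {"typhoon": ["rain", "wind"], "festival": ["summer"]}
newer = {"typhoon": ["rain", "flood"], "election": ["vote"]}
print(cooccurrence_overlap(older, newer))
```

A value near 1 suggests the term is being discussed in the same context as before, while a value near 0 suggests the surrounding topic has shifted even though the term itself persists.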
- The sixth embodiment of the present invention further provides an information update method including: a character string extraction step of extracting a character string whose meaning can be identified from text data and storing the extracted character string in a dictionary corresponding to the text data; and a dictionary comparison step of performing a comparison process for comparing a plurality of the dictionaries and storing the comparison result in comparison result storage means, wherein the text data associated with different dictionaries are text data collected under different collection conditions, and the dictionary comparison step performs the comparison process and automatically updates the comparison result stored in the comparison result storage means when at least one of the plurality of dictionaries is updated. An information update method configured in this way provides the same effect as the information update apparatus described above.
- The sixth embodiment of the present invention further provides a program that causes a computer to function as: character string extraction means for extracting a character string whose meaning can be identified from text data and storing the extracted character string in a dictionary corresponding to the text data; and dictionary comparison means for performing a comparison process for comparing a plurality of the dictionaries and storing the comparison result in comparison result storage means, wherein the text data associated with different dictionaries are text data collected under different collection conditions, and the dictionary comparison means is configured to perform the comparison process and automatically update the comparison result stored in the comparison result storage means when at least one of the plurality of dictionaries is updated.
- the effect of the first embodiment of the present invention is as follows. It is possible to realize a natural response according to the flow of topics and conversations, and to perform maintenance work such as change, addition and correction of topic control rules separately. Further, since the topic itself and a plurality of related terms related to the topic can be displayed, it is possible to display variations of various topics by transitioning from the topic based on the topic obtained from the topic list.
- the effect of the second embodiment of the present invention is as follows. It is possible to reduce the load on the server and reduce network traffic, to respond accurately according to the status of a plurality of users, and to proceed smoothly with the conversation flow with the users.
- the effect of the third embodiment of the present invention is as follows. It is possible to apply a custom response to each of a plurality of users who use the conversation control system and to verify in advance whether or not the applied custom is appropriate.
- the effect of the fourth embodiment of the present invention is as follows. It is possible to provide a keyword (character string) that can acquire the latest topic that is not known to a user who uses the information search system.
- The effect of the fifth embodiment of the present invention is as follows.
- With the sentence analysis apparatus, an important character string can be extracted from text data without using a database to be matched against the text data, so it is no longer necessary to create, maintain, and manage a database such as a morpheme database in advance.
- Important character strings are not limited to units such as words; they can be flexibly composed of phrases, clauses, and other parts of sentences that contain several words, so they can be grasped in various ways.
- Because the sentence analysis apparatus searches text data for the same character string, determines the difference degrees of the adjacent characters before and after it, and determines the importance of the character string from those difference degrees, it does not access a database or compare against a group of morphemes stored in one, and can therefore extract important character strings from text data at high speed.
- Because the importance of a character string is determined from the difference degrees of the adjacent characters before and after it, as described above, languages other than Japanese can be handled easily without reconstructing the analysis logic.
- The effect of the sixth embodiment of the present invention is as follows.
- The information update device can automatically and sequentially extract decomposed text data from text data collected under different collection conditions, so the latest decomposed text data and the like can be grasped from a large amount of text data.
- The text data can include not only data frequently created and updated by individuals on the network, but also data created and updated every day at a predetermined organization or the like. Further, automatic extraction of the decomposed text data greatly reduces the labor of defining and creating it.
- Because the information update apparatus automatically and sequentially extracts decomposed text data from text data collected under different collection conditions and compares the extracted decomposed text data, newly appearing decomposed text data can be separated out, new and highly valuable information for grasping topics can be obtained immediately, and changes in topics can be grasped effectively.
- When decomposed text data is extracted from text data by the information updating apparatus according to the present invention, the plurality of decomposed text data extracted from the text data is regarded as a meaningful set, and by associating the appearance status of the decomposed text data with this set of decomposed text data (co-occurrence relationship), changes in topics can be grasped more effectively.
- FIG. 1 is a block diagram showing an outline of a system configuration of a topic providing system 1.
- FIG. It is a flowchart which shows the specific process of the statement of scenario data.
- FIG. 6 is a diagram illustrating an example of output to an output unit 220.
- FIG. It is a figure which shows the specific example of the statement of scenario data. It is a figure which shows the process which produces
- FIG. 1 is a diagram showing an overview of the topic providing system 1.
- FIG. 2 is a diagram showing an outline of the conversation control terminal device 2.
- FIG. 3 is a diagram showing an overview of the maintenance device 3.
- the feature of the topic providing system 1 is as follows.
- An input unit for a user to input input information;
- An input information analysis unit that analyzes the input information and generates input specific information;
- a scenario data storage unit for extracting scenario data for defining response information on the topic;
- a response information determination unit that determines the response information based on the scenario data and the input identification information;
- an output unit that outputs the response information determined by the response information determination unit.
- the topic providing system 1 mainly includes an input unit, an input information analysis unit, a scenario data storage unit, a response information determination unit, and an output unit, as shown in FIG. In FIG. 1, these structures are indicated by solid-line squares.
- the transmission unit, the reception unit, and the switching input information input unit indicated by dotted-line squares will be described later.
- a portion surrounded by a broken line in FIG. 1 is the configuration of the conversation control terminal device 2 shown in FIG.
- the input unit is a member or part for the user to input input information.
- the input unit may be anything that allows the user to enter the desired information as input information.
- the input unit includes a keyboard, a touch panel, a microphone, a camera, and the like.
- the user can input text data, voice data, image data, and the like from the input unit.
- the input information input to the input unit is supplied to the input information analysis unit described below.
- the input information is preferably supplied to the input information analysis unit via a transmission unit described later.
- the input information analysis unit analyzes the input information and generates input specific information.
- the input specifying information is information generated as a result of analyzing various types of information included in the input information. One example is a statistical analysis of the count and frequency of specific keywords (such as the related terms described later) included in the input information. Analyzing the input information also makes it possible to infer the user's intentions and preferences from a question the user has entered, and to obtain relative analysis results by comparison with other users. In addition, data such as an analysis dictionary can be generated in advance and used to analyze the input information. The input information analysis unit supplies the generated input specifying information to the response information determination unit described later.
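The keyword statistics mentioned above can be sketched as follows. The function name, the plain substring counting, and the definition of frequency as a share of all keyword hits are assumptions for illustration; the text does not define how frequency is computed.

```python
def analyze_input(input_information, related_terms):
    """Count how often each related term occurs in the input and
    express each count as a share of all related-term occurrences."""
    counts = {term: input_information.count(term) for term in related_terms}
    total = sum(counts.values())
    return {
        term: {"count": n, "frequency": n / total if total else 0.0}
        for term, n in counts.items()
    }

stats = analyze_input("rain and wind, then more rain", ["rain", "wind", "snow"])
print(stats["rain"]["count"], stats["wind"]["count"], stats["snow"]["count"])
```

The resulting dictionary plays the role of input specifying information in this sketch and would be handed to whatever component determines the response.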
- the scenario data storage unit is a member or part for extracting scenario data.
- the scenario data is data stored in advance in a scenario data storage unit (a plurality of scenario data) to be described later. Scenario data determined to be necessary based on the input specifying information generated by the input information analysis unit is extracted, and the extracted scenario data is stored in the scenario data storage unit.
- the scenario data extracted from the scenario data storage unit (a plurality of scenario data) is stored in the scenario data storage unit via a receiving unit and a response information determination unit described later.
- Scenario data is data for defining response information related to topics provided to users. That is, the scenario data includes topic information to be provided to the user.
- the scenario data includes a topic introduction list including information on a plurality of topics.
- the topic introduction list is provided to the user. By selecting a topic from the provided topic introduction list, the user can use the selection in place of input information. Since the user can advance the conversation by a selection operation, the user's input operation is simpler than typing characters, and the conversation can proceed smoothly.
- a plurality of topics can be provided to the user by the topic introduction list, and the user can know various topics and can expand the range of interest.
- the scenario data includes information for greeting the user.
- By including in the scenario data not only information for providing a topic to the user but also information for greeting, the system can greet the user and make the conversation more natural.
- Scenario data is data created in advance including information that the subscriber of the topic providing system 1 wants to provide to the user. Information to be provided to the user can be defined by the scenario data.
- scenario data consists of multiple statements.
- the statement includes output information, output commands, control commands, and the like.
- the output information is information output by the output unit.
- the output information includes topic information and greeting information.
- Topic information and greeting information are information for holding a conversation with the user.
- the output command is a command for controlling output specifications when topic information or greeting information is output to the output unit. For example, it is a command for erasing the screen, making a line break, controlling the output time, or displaying a predetermined image.
- the control command is a command for making a judgment for controlling a statement, switching a topic name (for example, a theme) or changing a state control index.
- the determination includes a determination for branching according to time and time, and a determination for branching depending on the contents of the state control index. It is possible to make a transition to a predetermined statement by branching by judgment.
- the statement is synonymous with scenario data.
- a predetermined statement can be composed of only output information, can be composed of only output commands, or can be composed of only control commands.
- the statement includes not only information provided to the user but also various commands.
- topic information and greeting information can be output with various specifications in the output unit, and conversation can be naturally and smoothly advanced.
- the scenario data includes output information, output commands, control commands, and the like.
- a topic control rule can be constructed by appropriately defining output information, output commands, and control commands.
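The three kinds of statement content (output information, output commands, and control commands) can be illustrated with a small interpreter. The `Statement` fields, the `clear` command, and the branch-on-index rule are hypothetical; the text does not give a concrete statement format.

```python
from dataclasses import dataclass

@dataclass
class Statement:
    kind: str        # "output", "output_command", or "control"
    value: str       # text, command name, or state control index name
    target: int = 0  # statement to jump to (control statements only)

def run_scenario(statements, state):
    """Execute statements in order and return the lines left on the screen."""
    screen, position, steps = [], 0, 0
    while position < len(statements) and steps < 100:  # cap guards against loops
        statement = statements[position]
        steps += 1
        if statement.kind == "output":
            screen.append(statement.value)
        elif statement.kind == "output_command" and statement.value == "clear":
            screen.clear()
        elif statement.kind == "control" and state.get(statement.value):
            position = statement.target  # branch when the index is set
            continue
        position += 1
    return screen

scenario = [
    Statement("output", "Hello"),
    Statement("control", "greeted", target=3),
    Statement("output", "Nice to meet you"),
    Statement("output", "Today's topic: weather"),
]
print(run_scenario(scenario, {"greeted": True}))
```

Because the rule lives entirely in the list of statements, changing the conversation means editing data rather than the interpreter, which is the maintenance property the description emphasizes.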
- the topic control rules desired by the contractor can be constructed by including in the scenario data various information (topics) that the contractor of the topic providing system 1 wants to provide to the user.
- Because the scenario data only needs to define the information that the contractor wants to provide, maintenance work such as changing, adding to, and correcting the topic control rules can be performed separately, without relying on advanced, specialized knowledge of the topic providing system 1 as a whole.
- the response information determination unit determines response information.
- the response information is determined based on the scenario data and the input specifying information supplied from the input information analysis unit described above. That is, the response information is determined using the input specifying information obtained by analyzing the input information entered by the user. The intention of the subscriber of the topic providing system 1 is therefore reflected in the response information through the scenario data, and the user's intention is reflected through the input specifying information, so response information reflecting both can be generated.
- Balancing the intentions of the contractor and the user of the topic providing system 1 in this way smooths the conversation and realizes a natural response.
- the response information includes a scenario data statement.
- various commands such as an output command can be included in the response information.
- topic information and greeting information can be output from the output unit with various specifications.
- the output unit outputs the response information determined by the response information determination unit.
- the user is provided with the topic by recognizing the response information output from the output unit.
- the topic providing system 1 can provide various topics to the user based on the response information output to the output unit. That is, the topic providing system 1 according to the present embodiment generates input specifying information from input information, determines response information from the scenario data and the input specifying information, and provides various topics to the user by outputting the response information.
- Because the topic providing system 1 determines response information based on scenario data (for example, a statement and a topic introduction list described later) and the input specifying information produced by the input information analysis unit, it can realize a natural response that matches the flow of topics and conversation.
- Because response information on topics can be defined through the scenario data, maintenance work such as changing, adding to, and correcting the topic control rules can be performed separately, without relying on advanced, specialized knowledge of the topic control system as a whole.
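The overall flow summarized above (input information, input specifying information, response information, output) can be sketched end to end. Representing scenario data as a mapping from keyword to response text, with a `default` entry, is an assumption made only to keep the example short.

```python
def analyze(input_information, keywords):
    """Input information analysis: return the keywords found in the input."""
    return [k for k in keywords if k in input_information]

def determine_response(scenario_data, input_specifying_information):
    """Response determination: first matching keyword wins, else the default."""
    for keyword in input_specifying_information:
        if keyword in scenario_data:
            return scenario_data[keyword]
    return scenario_data["default"]

def provide_topic(input_information, scenario_data):
    keywords = [k for k in scenario_data if k != "default"]
    specifying = analyze(input_information, keywords)
    return determine_response(scenario_data, specifying)

scenario_data = {
    "weather": "It looks sunny today.",
    "default": "What would you like to talk about?",
}
print(provide_topic("How is the weather?", scenario_data))
```

Adding a topic in this sketch means adding one entry to `scenario_data`; none of the three functions changes.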
- a state control index storage unit for storing a state control index related to the input information and the response information is further provided, and
- the response information determination unit determines the response information by taking the state control index into account in addition to the scenario data and the input specifying information.
- the topic providing system 1 includes a state control index storage unit.
- the state control index storage unit stores a state control index.
- the state control index is an index related to input information and response information.
- the state control index is an index mainly based on the conversation history, and further indicates the user's emotion and personality that can be determined based on the conversation history. For example, there are an index defined based on input information input by the user in the past and an index defined based on response information provided to the user in the past. In addition, there is an index or the like indicating the user's emotion or personality obtained from the user's input to response information provided to the user in the past.
- the response information determination unit determines the response information by taking the state control index into account in addition to the scenario data and the input specifying information. Because the response information is determined using the state control index, topics can be provided and the conversation advanced based on past conversations with the user and on the user's emotions and personality obtained from those conversations. This prevents the same topic from being provided to the user twice and prevents abrupt jumps in topic, and promotes a smooth conversation suited to the user's emotion and personality.
- the state control index storage unit is provided in the conversation control terminal device 2.
- the state control index is preferably determined or changed by the response information determination unit.
- the state control index is preferably determined by the response information determination unit based on the scenario data and the input specifying information.
- Because the topic providing system 1 determines the response information using the state control index in addition to the scenario data and the input specifying information, it can realize a natural response that better matches the topic and the flow of conversation.
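A minimal sketch of how a state control index based on conversation history could prevent the same topic from being offered twice. The `provided` history list and the fallback order are assumptions; the emotion and personality indices mentioned in the text are not modeled here.

```python
def determine_response(topics, matched, state):
    """topics maps topic name -> response text; matched lists topic names
    suggested by the input; state['provided'] is the history of topics
    already offered (a hypothetical state control index)."""
    candidates = list(matched) + [t for t in topics if t not in matched]
    for name in candidates:
        if name in topics and name not in state["provided"]:
            state["provided"].append(name)  # update the index for next time
            return topics[name]
    return None  # every topic has already been provided

topics = {"weather": "It looks sunny today.", "sports": "Did you see the match?"}
state = {"provided": []}
print(determine_response(topics, ["weather"], state))
print(determine_response(topics, ["weather"], state))
```

The second call receives the same input but returns a different topic because the history has changed, which is the behavior the state control index is meant to enable.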
- the scenario data includes information defining a transition to a different topic (for example, topic switching information described later), and
- a topic switching unit is further provided that, in accordance with the information defining the transition to the different topic, switches the scenario data from scenario data for defining response information on the current topic to scenario data for defining response information on the different topic.
- the topic switching unit includes a switching input information input unit and an input information analysis unit.
- the topic switching unit determines whether to switch scenario data based on the scenario data and the input specifying information. Specifically, it is preferable to switch the scenario data based on the state control index.
- The switching input information input unit passes the topic switching input information to a transmission unit (described later), and the information is then supplied to the input information analysis unit.
- the input information analysis unit extracts the scenario data stored in the scenario data storage unit (a plurality of scenario data) based on the topic switching input information in addition to the input information.
- Scenario data includes information that defines transition to different topics (topic names).
- the topic switching unit switches scenario data. This switching is to switch from scenario data for defining response information on the current topic to scenario data for defining response information on a different topic.
- When scenario data is stored in a server or the like, scenario data recombined from all the scenario data on the server is generated based on the information that defines transition to a different topic, and is stored in the scenario data storage unit. This recombination can be performed by determining a combination of a plurality of statements according to the topic name.
- Scenario data need only be prepared for each topic (topic name), so scenario data maintenance becomes easy. Specifically, when scenario data needs to be changed, only that scenario data need be corrected. If a new topic is required, only scenario data for it need be added. Furthermore, when a topic becomes old and is no longer needed, only its scenario data need be deleted.
- Scenario data can be specified for each topic (topic name) according to the information that the contractor wants to provide. Therefore, even if the number of topics increases, maintenance work such as changing, adding, and correcting topic control rules can be performed topic by topic without depending on advanced, specialized knowledge and technology covering the entire topic providing system 1.
- the personality index is an index indicating whether the user is active or reluctant to the topic.
- Because the topic providing system 1 can switch topics using scenario data, it can realize a natural response in accordance with the flow of topics and conversation.
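- Topic switching driven by the personality index might be sketched as follows. This is an illustration under stated assumptions: the function `select_topic`, the dictionary form of the topic switching information, the sign convention (negative means the user is reluctant), and the threshold are all hypothetical.

```python
def select_topic(current_topic, switching_info, personality_index, threshold=0):
    """Return the topic whose scenario data should be used next.

    switching_info: assumed mapping from a topic name to the topic to move to.
    personality_index: assumed to be negative when the user is reluctant
    about the current topic and positive when the user is active.
    """
    if personality_index < threshold and current_topic in switching_info:
        return switching_info[current_topic]
    return current_topic  # keep the current scenario data


# Scenario data prepared per topic name (contents are illustrative only).
scenario_store = {
    "sports": ["Did you watch the game?"],
    "food": ["Have you tried the new cafe?"],
}
switching_info = {"sports": "food"}

topic = select_topic("sports", switching_info, personality_index=-2)
active_scenario = scenario_store[topic]
```

- Keeping one scenario entry per topic name, as in `scenario_store`, mirrors the maintenance benefit described above: adding or removing a topic touches only its own entry.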
- the transmission unit, reception unit, and switching input information input unit indicated by dotted-line squares in FIG. 1 will be described.
- the topic providing system 1 according to the present embodiment can include these transmission unit, reception unit, and switching input information input unit.
- the transmission unit shown in FIG. 1 is a device or member for transmitting input information to the outside. Any device that transmits input information to the outside may be used.
- the outside can be, for example, a server, the conversation control terminal device 2, or the like.
- the receiving unit shown in FIG. 1 is a device or member for receiving input specifying information.
- the input specifying information is generated externally. That is, the receiving unit is a device or member that receives input specifying information generated externally. Outside, the input information transmitted from the transmitting unit is analyzed to generate input specifying information, and the generated input specifying information is transmitted to the receiving unit.
- the switching input information input unit shown in FIG. 1 generates topic switching input information (for example, a personality index, which will be described later) according to information defining transition to a different topic.
- Information defining the transition to a different topic is information included in the scenario data, for example, topic switching information described later.
- the topic switching unit described above includes an input information analysis unit and a switching input information input unit. Since topic switching input information is generated based on the input information, it is possible to transition to a topic reflecting the user's intention.
- the scenario data storage unit (a plurality of scenario data) shown in FIG. 1 stores a plurality of scenario data.
- the plurality of scenario data are all scenario data corresponding to the topic names necessary for talking with the user.
- scenario data determined to be necessary based on the input identification information is extracted.
- the feature of the conversation control terminal device 2 is as follows.
- An input unit for a user to input input information;
- a scenario data storage unit for storing scenario data for defining response information on the topic;
- a state control index storage unit that stores a state control index related to the input information and the response information;
- a response information determination unit that determines the response information based on the scenario data and the state control index;
- an output unit that outputs the response information determined by the response information determination unit.
- The conversation control terminal device 2 mainly includes an input unit, a scenario data storage unit, a state control index storage unit, a response information determination unit, and an output unit. In FIG. 2, these components are indicated by solid-line squares, and the portion surrounded by a broken line is the configuration of the conversation control terminal device 2. Note that the scenario data storage unit (a plurality of scenario data) and the input information analysis unit, indicated by dotted-line boxes, are included in the topic providing server 4.
- the input unit is a member or a part for the user to input input information, similar to the input unit of the topic providing system 1 according to the present embodiment.
- The input unit may be anything that allows the user to input the desired information as input information.
- the input unit includes a keyboard, a touch panel, a microphone, a camera, and the like.
- the user can input text data, voice data, image data, and the like from the input unit.
- the input information input to the input unit is supplied to the input information analysis unit described below.
- the input information is preferably supplied to the input information analysis unit via a transmission unit described later.
- the scenario data storage unit is a member or part for storing scenario data.
- the scenario data is data stored in advance in a scenario data storage unit (a plurality of scenario data) of the topic providing server 4 shown in FIG.
- Scenario data determined to be necessary based on the input specifying information generated by the input information analysis unit is extracted, and the extracted scenario data is stored in the scenario data storage unit.
- the scenario data extracted from the scenario data storage unit (a plurality of scenario data) is stored in the scenario data storage unit via a receiving unit and a response information determination unit described later.
- Scenario data is data for defining response information related to topics provided to users.
- the scenario data includes topic information to be provided to the user.
- the configuration and functions of scenario data used in the conversation control terminal device 2 are the same as the scenario data of the topic providing system 1 according to this embodiment.
- the conversation control terminal device 2 includes a state control index storage unit.
- the state control index storage unit stores a state control index.
- the state control index is an index mainly based on the conversation history, and further indicates the user's emotion and personality that can be derived based on the conversation history. For example, there are an index defined based on input information input by the user in the past and an index defined based on response information provided to the user in the past. In addition, there is an index indicating the user's emotion and personality that can be derived from the user's input for response information provided to the user in the past.
- the conversation control terminal device 2 includes a state control index storage unit. That is, it is not a configuration in which, for example, the topic providing server 4 etc. outside the conversation control terminal device 2 includes a state control index storage unit. Therefore, in the present embodiment, the conversation with the user is not controlled by an external device such as the topic providing server 4 but the conversation with the user is controlled by the conversation control terminal device 2.
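- The kinds of index described above (past inputs, past responses, and an emotion or personality index derived from the user's replies) could be held on the terminal roughly as in the following sketch. The class `StateControlIndexStore`, the cue-word sets, and the plus or minus one scoring are hypothetical assumptions, not the method of the specification.

```python
from dataclasses import dataclass, field

# Assumed cue words used only for this illustration.
POSITIVE = {"yes", "great", "love"}
NEGATIVE = {"no", "boring", "stop"}


@dataclass
class StateControlIndexStore:
    """Hypothetical terminal-side store for the state control index."""
    input_history: list = field(default_factory=list)
    response_history: list = field(default_factory=list)
    personality_index: int = 0

    def record_turn(self, response, user_reply):
        """Record one exchange and adjust the personality index from the reply."""
        self.response_history.append(response)
        self.input_history.append(user_reply)
        words = set(user_reply.lower().split())
        if words & POSITIVE:
            self.personality_index += 1
        if words & NEGATIVE:
            self.personality_index -= 1


store = StateControlIndexStore()
store.record_turn("Do you like soccer?", "yes I love it")
store.record_turn("Shall we talk about the league?", "no thanks")
```

- Because the store lives in the terminal object rather than on a server, updating it requires no network round trip, which is the point made in the preceding paragraph.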
- the response information determination unit determines response information based on the scenario data and the state control index supplied from the input information analysis unit of the topic providing server 4.
- The intention of the contractor of the topic providing system 1 can be reflected in the response information through the scenario data.
- the response information includes a scenario data statement.
- various commands such as an output command can be included in the response information.
- topic information and greeting information can be output from the output unit with various specifications.
- Because the response information is determined using a state control index, topics can be provided and the conversation advanced based on past conversations with the user and on the user's emotions and personality obtained from those conversations. This prevents the same topic from being provided to the user twice and prevents an abrupt jump to an unrelated topic, so a natural response can be realized by promoting a smoother conversation.
- the state control index is preferably determined or changed by the response information determination unit.
- the state control index is preferably determined by the response information determination unit based on the scenario data and the input specifying information.
- the output unit outputs the response information determined by the response information determination unit.
- the user is provided with the topic by recognizing the response information output from the output unit.
- Because the conversation control terminal device 2 according to the present embodiment includes both the scenario data storage unit and the state control index storage unit and determines the response information itself, the conversation with the user can be determined and controlled by the conversation control terminal device 2. The conversation can therefore be advanced smoothly according to the flow of conversation with the user, without significantly increasing the processing load on the conversation control terminal device 2, without increasing the load on the server, and without increasing network traffic.
- Because the conversation control terminal device 2 determines the response information based on the scenario data and the state control index, the response information can be determined according to the progress of the conversation with the user, and an accurate response suited to the state of the user is possible without increasing the load on the server.
- The conversation control terminal device 2 further has the following features: a transmission unit that transmits the input information to the outside, and a receiving unit that receives the input specifying information generated by analyzing the transmitted input information. The response information determination unit determines the response information using the input specifying information in addition to the scenario data and the state control index.
- the conversation control terminal device 2 further includes a transmission unit and a reception unit.
- The transmission unit transmits input information to the outside. It is sufficient that the input information is transmitted outside the conversation control terminal device 2.
- the outside can be, for example, a server or another conversation control terminal device 2.
- The receiving unit receives input specifying information.
- The input specifying information is generated outside the conversation control terminal device 2. That is, the receiving unit is a device or member that receives input specifying information generated outside the conversation control terminal device 2. Outside the conversation control terminal device 2, the input information transmitted from the transmission unit is analyzed to generate the input specifying information, and the generated input specifying information is transmitted to the receiving unit of the conversation control terminal device 2.
- The response information determination unit determines the response information using the input specifying information in addition to the scenario data and the state control index.
- The response information is determined taking into account the input specifying information obtained by analyzing the input information entered by the user. Therefore, the intention of the contractor of the topic providing system 1 can be reflected in the response information through the scenario data, and the user's intention can be reflected in the response information through the input specifying information, so response information reflecting both can be generated.
- A natural response can be realized because the conversation proceeds smoothly with a balance between the intentions of the contractor of the topic providing system 1 and those of the user.
- Because the conversation control terminal device 2 also uses the input specifying information when determining the response information, the response information can take into account the result of analysis performed externally, for example on a server, and the conversation can proceed smoothly according to the conversation flow.
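- The division of work between the terminal and the outside can be sketched as follows. Everything here is a hypothetical stand-in: `external_analyze` plays the role of analysis done outside the terminal, the `Terminal` class models the transmission and receiving units as a plain function call, and the fallback reply is an assumption added so the example always returns text.

```python
def external_analyze(input_information):
    """Stand-in for analysis performed outside the terminal (e.g. on a server)."""
    return {"keywords": sorted(set(input_information.lower().split()))}


class Terminal:
    """Hypothetical conversation control terminal."""

    def __init__(self, scenario, analyzer):
        self.scenario = scenario   # keyword -> response (scenario data)
        self.analyzer = analyzer   # models transmit-then-receive
        self.given = set()         # minimal state control index

    def respond(self, input_information):
        # Transmit the input, then receive the input specifying information.
        specifying = self.analyzer(input_information)
        # The response itself is decided locally on the terminal.
        for keyword in specifying["keywords"]:
            response = self.scenario.get(keyword)
            if response and response not in self.given:
                self.given.add(response)
                return response
        return "Tell me more."  # assumed fallback


terminal = Terminal({"music": "What kind of music do you like?"}, external_analyze)
reply_1 = terminal.respond("I like music")
reply_2 = terminal.respond("music again")
```

- Only the analysis result crosses the boundary in this sketch; the scenario data and the response history stay on the terminal.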
- the receiving unit receives scenario data extracted based on the input information
- the scenario data storage unit stores received scenario data.
- The receiving unit of the conversation control terminal device 2 receives scenario data extracted based on input information. That is, the scenario data is extracted outside the conversation control terminal device 2, based on the input information transmitted to the outside. The extracted scenario data is then transmitted to the receiving unit of the conversation control terminal device 2.
- the scenario data storage unit stores scenario data received by the reception unit of the conversation control terminal device 2.
- the configuration and functions of the scenario data used in the conversation control terminal device 2 are the same as the scenario data of the topic providing system 1 according to this embodiment.
- Because the conversation control terminal device 2 according to the present embodiment receives the scenario data extracted based on the input information, the scenario data can be switched based on the input information entered by the user, and the conversation can proceed smoothly according to the flow of conversation with the user.
- The scenario data includes information defining transition to a different topic (for example, topic switching information described later). The conversation control terminal device 2 further includes a switching input information input unit that generates topic switching input information (for example, a personality index described later) according to the information defining transition to the different topic.
- The transmission unit transmits the topic switching input information to the outside, and the receiving unit receives scenario data based on the topic switching input information.
- the scenario data used in the conversation control terminal device 2 includes information defining transition to a different topic.
- the conversation control terminal device 2 includes a switching input information input unit.
- the switching input information input unit generates topic switching input information according to information defining transition to a different topic.
- Information defining the transition to a different topic includes, for example, topic switching information that is an element of a statement to be described later.
- the topic switching input information includes, for example, a personality index described later.
- the switching input information input unit determines whether or not to switch scenario data based on the scenario data and the input specifying information. Specifically, it is preferable to switch the scenario data based on the state control index.
- topic switching input information is transmitted to the transmission unit and supplied to the input information analysis unit.
- the input information analysis unit extracts the scenario data stored in the scenario data storage unit (a plurality of scenario data) of the topic providing server 4 based on the topic switching input information in addition to the input information.
- the transmission unit of the conversation control terminal device 2 transmits the topic switching input information to the outside.
- the receiving unit receives scenario data based on the topic switching input information.
- This makes it possible to switch from scenario data for defining response information on the current topic to scenario data for defining response information on a different topic.
- When scenario data is stored in a server or the like, scenario data is generated by rearranging all the scenario data on the server based on the topic switching input information, and the rearranged scenario data is stored in the scenario data storage unit of the conversation control terminal device 2.
- Scenario data need only be prepared for each topic (topic name), so scenario data maintenance becomes easy. Specifically, when scenario data needs to be changed, only that scenario data need be corrected. If a new topic is required, only scenario data for it need be added. Furthermore, when a topic becomes old and is no longer needed, only its scenario data need be deleted.
- the personality index is an index indicating whether the user is active or reluctant to the topic.
- Because the topic can be switched based on the state control index and the scenario data, the topic can be switched while observing the state of the conversation with the user.
- the conversation can proceed smoothly according to the conversation flow.
- the input information analysis unit shown in FIG. 2 analyzes the input information and generates input specifying information.
- The input specifying information is information generated as a result of analyzing various types of information included in the input information. One example is the result of a statistical analysis of the number and frequency of specific keywords (such as related terms described later) contained in the input information.
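- A keyword count and frequency analysis of the kind mentioned above might look like the following sketch. The function name `analyze_input`, the simple lowercase tokenization, and the shape of the returned dictionary are assumptions made for illustration.

```python
import re
from collections import Counter


def analyze_input(input_information, related_terms):
    """Count related terms in the input and compute their relative frequency."""
    tokens = re.findall(r"[a-z']+", input_information.lower())
    counts = Counter(token for token in tokens if token in related_terms)
    total = len(tokens)
    return {
        "counts": dict(counts),
        # Frequency is the share of all tokens taken up by each related term.
        "frequency": {term: n / total for term, n in counts.items()} if total else {},
    }


result = analyze_input(
    "Soccer is fun. I watch soccer and tennis.",
    {"soccer", "tennis"},
)
```

- Here `result["counts"]` holds the number of occurrences per related term and `result["frequency"]` the same values divided by the total token count.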
- the scenario data storage unit (a plurality of scenario data) shown in FIG. 2 stores a plurality of scenario data.
- the plurality of scenario data are all scenario data corresponding to the topic names necessary for talking with the user.
- scenario data determined to be necessary based on the input identification information is extracted.
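- Extraction of the necessary scenario data from the full set could be sketched as below. The function `extract_scenarios`, the per-topic sets of related terms, and the rule that a single matching term is enough to select a topic are hypothetical choices for this example.

```python
def extract_scenarios(all_scenarios, related_terms_by_topic, keyword_counts):
    """Select scenario data for topics whose related terms occur in the input."""
    selected = {}
    for topic, terms in related_terms_by_topic.items():
        if any(keyword_counts.get(term, 0) > 0 for term in terms):
            selected[topic] = all_scenarios[topic]
    return selected


# Full set of scenario data held on the server side (illustrative contents).
all_scenarios = {
    "sports": ["Did you watch the game?"],
    "food": ["Have you tried the new cafe?"],
    "travel": ["Where would you like to go?"],
}
related_terms_by_topic = {
    "sports": {"soccer", "tennis"},
    "food": {"ramen", "cafe"},
    "travel": {"hotel"},
}

extracted = extract_scenarios(
    all_scenarios, related_terms_by_topic, {"soccer": 2, "cafe": 1}
)
```

- Only the selected subset would then be sent to the terminal's scenario data storage unit.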
- a scenario data storage unit that stores scenario data for defining response information related to a topic based on input specifying information generated by analyzing input information input by a user
- a receiving unit for receiving the input identification information
- a scenario data editing section for making the scenario data editable
- a scenario data verification unit that enables verification of a response of scenario data edited based on the input specific information received by the reception unit
- a scenario data transmission unit that transmits the edited scenario data to the outside.
- the maintenance device 3 mainly includes a scenario data storage unit, a reception unit, a scenario data editing unit, a scenario data verification unit, and a scenario data transmission unit.
- these configurations are indicated by solid-line squares.
- the scenario data storage unit (a plurality of scenario data) and the input information analysis unit indicated by dotted-line boxes are included in the topic providing server 4 (see FIG. 2).
- a scenario data verification unit plus a state control index storage unit constitutes a terminal device virtual construction unit to be described later. Information exchanged between these components is the same as that of the topic providing system 1 and the conversation control terminal device 2 described above.
- the topic providing system 1 and the conversation control terminal device 2 described above are mainly for a general user to have a conversation with the conversation control terminal device 2.
- The maintenance device 3, by contrast, is mainly used by a contractor of the topic providing system 1. It is a device with which the contractor performs maintenance of scenario data for providing topics to general users.
- Although the maintenance device 3 differs in this respect, in FIG. 3 the same names are given to components that have the same functions and use the same data as those of the topic providing system 1 and the conversation control terminal device 2.
- The scenario data storage unit and the reception unit in the maintenance device 3 are functionally substantially the same as the scenario data storage unit and the reception unit in the topic providing system 1 and the conversation control terminal device 2 described above, and can be the same as those of the topic providing system 1 or the conversation control terminal device 2.
- the scenario data storage unit and the reception unit of the maintenance device 3 may be mounted on the conversation control terminal device 2 assumed to be used by the user.
- the conversation control terminal device 2 may be virtually constructed in the maintenance device 3 and used as a scenario data storage unit and a reception unit of the virtual conversation control terminal device 2.
- the scenario data storage unit is a member or part for storing scenario data.
- the scenario data is data stored in advance in a scenario data storage unit (a plurality of scenario data) of the topic providing server 4 (see FIG. 2). Scenario data determined to be necessary based on the input specifying information generated by the input information analysis unit is extracted, and the extracted scenario data is stored in the scenario data storage unit.
- the scenario data extracted from the scenario data storage unit (a plurality of scenario data) is stored in the scenario data storage unit via a receiving unit and a response information determination unit described later.
- Scenario data is data for defining response information related to a topic based on input specific information.
- the input specifying information is information generated by analyzing the input information.
- the input information is, for example, information input by the user in the conversation control terminal device 2.
- the maintenance device 3 is mainly used by a contractor of the topic providing system 1.
- the input information can be information virtually input by the user.
- the maintenance device 3 is for verifying scenario data before making the scenario data available to the user. Therefore, the user here may be a virtual user, and information assumed to be input by an actual user may be input information. Therefore, it is possible to generate input specifying information using various assumed input information, and to verify the response of the scenario data by the scenario data verification unit described later.
- Scenario data includes topic information to provide to users.
- the configuration and functions of scenario data used in the maintenance device 3 are the same as the scenario data of the topic providing system 1 according to the present embodiment and the scenario data of the conversation control terminal device 2 described above.
- The configuration and functions of the scenario data are the same as those of the scenario data in the topic providing system 1 and the conversation control terminal device 2. As described above, however, the maintenance device 3 is for verifying scenario data before making it available to the user. Therefore, the scenario data handled by the maintenance device 3 is scenario data for verification, that is, data that has not yet been made available to the user.
- the receiving unit receives input specific information.
- the input specifying information is generated outside the maintenance device 3.
- the receiving unit is a device or member that receives input specifying information generated outside the maintenance device 3.
- the input information is analyzed outside the maintenance device 3 to generate the input specification information, and the input specification information generated outside the maintenance device 3 is transmitted to the receiving unit of the maintenance device 3.
- the input information can be virtually input by the user. Therefore, the input specifying information here can be information generated outside the maintenance apparatus 3 by using information that is assumed to be input by an actual user as input information. Thus, scenario data can be verified based on various types of input information by using information that is assumed to be input by an actual user as input information.
- the scenario data editing unit is a device or member for making scenario data editable.
- the scenario data can be edited by the person in charge of the contractor of the topic providing system 1 operating the keyboard or the like. Editing includes adding, deleting, and changing scenario data. Specifically, editing is a process of adding, deleting, or changing a statement that constitutes scenario data. By editing the scenario data, the scenario data can be customized for each of a plurality of users.
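- Editing as the addition, deletion, or change of statements can be illustrated with a small sketch. Representing scenario data as a dictionary from a statement identifier to its text, and the function name `edit_scenario`, are assumptions for this example only.

```python
def edit_scenario(statements, add=None, delete=(), change=None):
    """Return an edited copy of the scenario statements.

    statements: assumed mapping from a statement id to its text.
    """
    edited = dict(statements)  # work on a copy so the original is preserved
    for statement_id in delete:
        del edited[statement_id]
    for statement_id, text in (change or {}).items():
        if statement_id not in edited:
            raise KeyError(f"cannot change unknown statement {statement_id!r}")
        edited[statement_id] = text
    for statement_id, text in (add or {}).items():
        if statement_id in edited:
            raise KeyError(f"statement {statement_id!r} already exists")
        edited[statement_id] = text
    return edited


original = {"s1": "Hello.", "s2": "Did you watch the game?"}
customized = edit_scenario(
    original,
    add={"s3": "Have you tried the new cafe?"},
    delete=["s2"],
    change={"s1": "Good morning."},
)
```

- Returning a copy keeps the unedited scenario data available, so a customized version per user can be derived from the same base.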
- the scenario data verification unit is a device or member that enables verification of the response of scenario data.
- the scenario data here is edited scenario data. It is a device or member for verifying whether the response of scenario data edited by the scenario data editing unit is appropriate.
- the scenario data verification unit can generate input specifying information by using various assumed input information, and can verify the response of the scenario data edited by the scenario data editing unit. For this reason, since it is possible to verify whether or not the response of the scenario data is appropriate for every user, it is possible to verify the custom scenario data applied to each of the plurality of users by the scenario data editing unit.
- the scenario data transmission unit transmits the edited scenario data to the outside.
- the outside can be, for example, a server or another conversation control terminal device 2.
- the verified scenario data can be made available to the user by transmitting the edited scenario data to the outside.
- the scenario data editing unit edits the scenario data or the scenario data verification unit verifies the scenario data using information input by the user as input information.
- a topic list can be generated by the topic analysis unit, and scenario data based on the topic list can be edited or verified.
- the topic analysis unit will be described later.
- the scenario data verification unit verifies whether the response of the scenario data edited by the scenario data editing unit is appropriate. By doing so, it is possible to confirm the contents and consistency of the scenario data before transmitting the scenario data to the outside such as a server. At least a part of the verified scenario data transmitted to the outside such as the server is finally transmitted to the conversation control terminal device 2 and used for conversation with the user.
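- Verification with assumed inputs could be sketched as a table of test cases run against the edited scenario data. The helpers `make_responder` and `verify_scenario`, the keyword-to-response form of the scenario data, and the example cases are hypothetical.

```python
def make_responder(statements):
    """Build a trivial responder from scenario statements (keyword -> response)."""
    def respond(assumed_input):
        for word in assumed_input.lower().split():
            if word in statements:
                return statements[word]
        return None
    return respond


def verify_scenario(statements, cases):
    """Run each assumed input and collect (input, expected, actual) mismatches."""
    respond = make_responder(statements)
    failures = []
    for assumed_input, expected in cases:
        actual = respond(assumed_input)
        if actual != expected:
            failures.append((assumed_input, expected, actual))
    return failures


edited = {"coffee": "Would you like a coffee coupon?"}
cases = [
    ("i want coffee", "Would you like a coffee coupon?"),
    ("i want tea", "Would you like a tea coupon?"),
]
failures = verify_scenario(edited, cases)
```

- An empty `failures` list would indicate that every assumed input produced the expected response; here the tea case is reported because the edited scenario data has no statement for it.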
- the data in the conversation control system includes both data for analyzing input information input by the user to generate input specific information and scenario data for determining response information based on the input specific information.
- This scenario data is data that can diversify response information that is an answer to the user.
- The scenario data editing unit can edit the scenario data, and the scenario data verification unit can verify the response of the edited scenario data. As a result, the scenario data can be customized for each of a plurality of users of the conversation control system without requiring highly specialized knowledge and technology covering the entire conversation control system.
- The maintenance device 3 further has the following feature: a terminal device virtual construction unit that virtually constructs a conversation control terminal device and that includes an input unit for a user to input input information; a state control index storage unit that stores a state control index related to the input information and the response information; a response information determination unit that determines the response information based on the scenario data and the state control index; and an output unit that outputs the response information determined by the response information determination unit.
- the maintenance device 3 has a terminal device virtual construction unit as shown in FIG.
- the conversation control terminal device 2 ′ is virtually constructed by the terminal device virtual construction unit.
- the maintenance device 3 has a scenario data verification unit as shown in FIG.
- the scenario data verification unit plus the state control index storage unit constitutes a terminal device virtual construction unit.
- the maintenance device 3 includes the conversation control terminal device 2 'as a function.
- the maintenance device 3 can have the function of the conversation control terminal device 2 ′ as a simulation package.
- the actual conversation control terminal device 2 can be obtained by omitting the maintenance function from the virtually constructed conversation control terminal device 2 ′.
- The virtually constructed conversation control terminal device 2 ′ may be realized as hardware or software. When realized as hardware, a separate device different from the actual conversation control terminal device 2 may be used as the virtually constructed conversation control terminal device 2 ′. When realized as software, the maintenance device 3 may realize the conversation control terminal device 2 ′ by emulation or the like.
- the maintenance device 3 is mainly used by a contractor of the topic providing system 1 and may be any device for maintaining scenario data for providing a topic to a general user.
- the terminal device virtual construction unit and the scenario data verification unit mainly include an input unit, a response information determination unit, and an output unit. Furthermore, the terminal device virtual construction unit includes a state control index storage unit.
- the input unit is a member or part for the user to input input information.
- The input unit may be anything that allows the user to input the desired information as input information.
- the input unit includes a keyboard, a touch panel, a microphone, a camera, and the like.
- the user can input text data, voice data, image data, and the like from the input unit.
- the input information may be information virtually input by the user.
- the user may also be a virtual user, and information that is assumed to be input by an actual user may be input information.
- the state control index storage unit stores the state control index.
- the state control index is an index related to input information and response information.
- the state control index is mainly an index related to history. For example, there are indicators relating to input information input by the user in the past and indicators relating to response information provided in the past to the user.
- the response information determination unit determines the response information by adding a state control index in addition to the scenario data and the input specific information.
- the output unit outputs the response information determined by the response information determination unit.
- The edited scenario data, or the verified scenario data, can then be put into use. Because the contents and control of the scenario data can be checked beforehand, an appropriate topic can be provided to the user.
- The terminal device virtual construction unit can virtually construct and execute the conversation control terminal device 2′ in the maintenance device 3. Therefore, the maintenance device 3 can realize the same environment as the conversation control terminal device 2 used by general users. As a result, the content and operation of the scenario data can be checked in advance in an environment similar to the one in which the user actually carries out the conversation, and the content of the scenario data can be verified before the conversation with the user. It is also possible to verify in advance whether the customization made for each of the plurality of connected users is appropriate.
- A topic analysis unit generates a topic list in which the closeness of topics and the way they are connected are given via related words that relate the topics to one another.
- The scenario data editing unit edits, as the scenario data, a topic introduction scenario for introducing a topic to a user and an input-related scenario for responding to a user input, by using the topic list and the related words.
- the maintenance device 3 includes a topic analysis unit (not shown).
- the topic analysis unit is a device or member for generating a topic list.
- the topic list is data to which the closeness and connection method of the topics are given through the related words that relate the topics.
- the topic analysis unit accumulates related terms associated with the topic in the topic list.
- the topic list is data provided to the contractor of the topic providing system 1 and is used when generating scenario data.
- a topic list is output to the output unit of the maintenance device 3 as shown in FIG.
- The contractor of the topic providing system 1 can construct the topic introduction scenario data and input-related scenario data to be provided to the user with reference to the topic list output to the output unit of the maintenance device 3. In this way, the contractor of the topic providing system 1 can construct topic introduction scenario data and input-related scenario data simply and easily.
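The topic list described above can be illustrated with a minimal sketch. The text does not say how closeness is computed, so the Jaccard overlap of related words used here is purely an assumption, as are the sample topics, the related words, and the name `build_topic_list`.

```python
from itertools import combinations

# Hypothetical sample data: topic name -> related words associated with that topic.
TOPIC_WORDS = {
    "earthquake": {"shaking", "evacuation", "insurance"},
    "insurance": {"insurance", "premium", "contract"},
    "cars": {"premium", "engine"},
}


def build_topic_list(topic_words):
    """Return rows (topic_a, topic_b, shared_words, closeness), closest pair first.

    The shared related words stand in for the "connection method" and the
    Jaccard overlap stands in for "closeness"; both are illustrative assumptions.
    """
    rows = []
    for a, b in combinations(sorted(topic_words), 2):
        shared = topic_words[a] & topic_words[b]
        if shared:  # topics with no common related word are not connected
            union = topic_words[a] | topic_words[b]
            rows.append((a, b, sorted(shared), len(shared) / len(union)))
    return sorted(rows, key=lambda row: row[3], reverse=True)


topic_list = build_topic_list(TOPIC_WORDS)
for row in topic_list:
    print(row)
```

A person editing scenario data could read such rows to see which topics can be reached from which, and through which related word.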
- the scenario data editing section enables topic introduction scenarios and input related scenarios to be edited as scenario data.
- The topic introduction scenario and the input-related scenario are made editable by means of the topic list and the related words. Since editing can be performed using the topic list and the related words, an input-related scenario can be constructed simply and easily.
- the topic introduction scenario is a scenario for introducing a topic to the user.
- the user is provided with a topic by a topic introduction scenario.
- the input-related scenario is a scenario for responding to a user input.
- the user inputs predetermined information, for example, information such as a greeting
- the corresponding information such as a greeting is answered to the user by an input-related scenario.
- This scenario data editing unit allows the contractor of the topic providing system 1 to edit the topic introduction scenario and the input-related scenario into the desired form.
- The topic list can be updated based on various logs, for example, obtainable data such as Twitter posts and blogs. That is, the latest information can be accumulated as related words in the topic list. Therefore, when editing or verifying the topic introduction scenario and the input-related scenario based on the topic list, the contractor of the topic providing system 1 can learn the latest information from the topic list and edit the topic introduction scenario and the input-related scenario accordingly. A topic including the latest information can then be provided to the user, and a new user group can be cultivated by means of the latest information.
- the data in the conversation control system includes both data for analyzing input information input by the user to generate input specific information and scenario data for determining response information based on the input specific information.
- This scenario data is data that can diversify response information that is an answer to the user.
- The scenario data editing unit can edit the scenario data in cooperation with the topic analysis unit, and the scenario data verification unit can verify the response of the edited scenario data. In this way, the scenario data can be customized for each of a plurality of users who use the conversation control system, even without highly specialized knowledge and skills covering the conversation control system as a whole.
- the transmission unit shown in FIG. 3 is a device or member for transmitting input information to the outside. Any device that transmits input information to the outside may be used.
- the outside can be, for example, the topic providing server 4 (see FIG. 2), the conversation control terminal device 2, or the like.
- the input information analysis unit shown in FIG. 3 analyzes the input information and generates input specifying information.
- the input specifying information is information generated as a result of analyzing various types of information included in the input information. For example, there is a statistical analysis of the number and frequency of specific keywords (such as related terms described later) included in the input information.
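As a rough illustration of the statistical analysis mentioned above, the following sketch counts how often related terms occur in a piece of input text. The set of related terms, the tokenization rule, and the name `analyze_input` are assumptions made for the example; the text does not specify how the analysis is actually carried out.

```python
import re
from collections import Counter

# Hypothetical related terms; in the described system these would come from the topic list.
RELATED_TERMS = {"earthquake", "insurance", "dangerous"}


def analyze_input(text, related_terms=RELATED_TERMS):
    """Return counts and relative frequencies of related terms in the input text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word in related_terms)
    total = len(words)
    frequency = {term: n / total for term, n in counts.items()} if total else {}
    return {"counts": dict(counts), "frequency": frequency}


info = analyze_input("Is the earthquake dangerous? I worry about the earthquake.")
print(info["counts"])
```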
- the scenario data storage unit (a plurality of scenario data) shown in FIG. 3 stores a plurality of scenario data.
- the plurality of scenario data are all scenario data corresponding to the topic names necessary for talking with the user.
- scenario data determined to be necessary based on the input identification information is extracted.
- FIG. 4 is a block diagram showing an outline of the system configuration of the topic providing system 1.
- The topic providing system 1 includes Topiclet 20, iWA 30, and iWA Manager 40.
- Topiclet 20 corresponds to hardware such as a terminal device used by the user, for example.
- the Topiclet 20 corresponds to the conversation control terminal device 2 shown in FIG.
- the topic is provided to the user by the Topiclet 20.
- Topiclet may be used synonymously with software executed by a terminal device to provide a topic to a user, or a topic providing environment that can be realized by the terminal device or these software.
- the Topiclet 20 includes a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), a display, a keyboard (all not shown), and the like.
- the Topiclet 20 can be a personal computer or a mobile terminal device.
- Topiclet 20 includes an input unit 210, an output unit 220, a transmission unit 230, a reception unit 240, a response information determination unit 250, a switching input information input unit 260, a scenario data storage unit 270, and a state control index storage unit 280.
- the input unit 210 is a device or member for a user to input input information.
- the input unit 210 includes a keyboard, a touch panel, a microphone, and the like.
- the input unit 210 may be any device or member that allows the user to input information such as questions.
- the output unit 220 outputs response information determined by a response information determination unit described later.
- the output unit 220 includes a display and a speaker. The output unit 220 only needs to output the response information so that the user can recognize it.
- the user can advance the conversation by inputting the input information to the input unit 210 and recognizing the response information output to the output unit 220.
- the transmission unit 230 is a device or member for transmitting the input information input to the input unit 210 to the iWA 30.
- the transmission unit 230 includes a communication interface.
- the input information transmitted to the iWA 30 includes a user ID for identifying the user in addition to the input information from the input unit 210 by the user.
- The user ID may be any information that can identify the user. Information assigned to each user may be used. The user ID can also be information that uniquely identifies the Topiclet 20, such as the serial number of the Topiclet 20. In short, the user ID may be any information that can identify each user who uses the topic providing system 1 or the conversation control terminal device 2.
- the receiving unit 240 is a device or member for receiving the input specifying information and scenario data transmitted from the iWA 30.
- The receiving unit 240 includes a communication interface.
- Topiclet 20 is communicably connected to iWA 30 by transmission unit 230 and reception unit 240.
- the response information determination unit 250 determines response information based on the input specifying information and scenario data.
- the response information determination unit 250 includes the CPU, ROM, RAM, etc. of the Topiclet 20. As described above, the response information is determined using the input specifying information and the scenario data transmitted from the iWA 30.
- the response information dynamically changes based on the scenario data regardless of the input.
- the response information determination unit 250 determines response information.
- the response information is determined based on the scenario data and the input specifying information. That is, the response information is determined using the input specifying information obtained by analyzing the input information input by the user. Therefore, the response information reflecting the user's intention can be generated, and the conversation with the user can be smoothly advanced by providing the topic desired by the user.
- the response information includes a statement of scenario data based on the input specific information.
- various commands such as an output command can be included in the response information.
- topic information and greeting information can be output with various specifications in the output unit.
- The response information determination unit 250 determines the response information by taking the state control index into account in addition to the scenario data and the input specifying information. By determining the response information using the state control index in this way, it is possible to provide a topic or advance the conversation based on the past conversation with the user. Therefore, it is possible to prevent the same topic from being provided to the user twice, or an abrupt, unconnected topic from being provided to the user, and a smoother conversation can be promoted.
- the response information determination unit 250 may determine the response information based on the scenario data and the state control index.
- the switching input information input unit 260 includes the CPU, ROM, RAM, and the like of the Topiclet 20.
- the switching input information input unit 260 generates topic switching input information according to information defining transition to a different topic.
- Information defining transition to a different topic includes, for example, topic switching information described later.
- the topic switching input information includes, for example, a personality index described later.
- the scenario data storage unit 270 extracts scenario data for defining response information related to the topic.
- the scenario data storage unit 270 is composed of the ROM, RAM, etc. of the Topiclet 20.
- scenario data is composed of a plurality of statements.
- In the Topiclet 20, it is possible to have a conversation with the user while providing a topic to the user by making a transition from one statement to another.
- the scenario data storage unit 270 stores a plurality of statements for proceeding with the conversation with the user. Specific examples of advancing a conversation with the user by transitioning statements will be described in detail with reference to FIGS.
- the iWA 30 also includes a scenario data storage unit 320.
- the scenario data storage unit 320 of the iWA 30 stores all scenario data.
- The scenario data storage unit 270 of the Topiclet 20 may store only a part of the scenario data. It is sufficient to transmit to the Topiclet 20 the scenario data required for the conversation of the user who uses the Topiclet 20.
- scenario data stored in the scenario data storage unit 320 of the iWA 30 may be transmitted to the Topiclet 20.
- the topic can be provided to the user more smoothly.
- the state control index storage unit 280 stores a state control index.
- the state control index is an index related to input information and response information.
- the state control index storage unit 280 includes the ROM, RAM, and the like of the Topiclet 20.
- the Topiclet 20 has a state control index storage unit 280, and the state control index is stored in the state control index storage unit 280.
- the state control index is information that is not transmitted to the server (iWA 30 described later) but is stored in the Topiclet 20.
- the Topiclet 20 determines the response information with reference to the state control index stored in the state control index storage unit 280.
- Because scenario data can be used by referring to the state control index within the Topiclet 20, processing can be performed quickly and a smooth conversation with the user is possible.
- The topic provided to the user is controlled using three types of state control index: an input index, a progress index, and a personality index. Other indexes may also be used.
- the input index is information indicating what input the user has performed so far, that is, a history of user input.
- the progress index is information indicating what topic has been provided to the user so far, that is, a history of the topic provided to the user.
- a topic desired to be provided to the user can be maintained (stored) by the progress index. Thereby, a topic can be provided to a user, without making a user feel stress.
- With the progress index, even when a series of explanations is provided as a topic and the user asks a question partway through, the rest of the series of explanations can be resumed.
- The personality index is information indicating what kind of attitude the user has shown in inputs so far, that is, a history of the user's attitude. For example, it is information indicating whether the user has responded actively or passively to a certain topic. When the user is active, it can be determined that topics on that theme can continue to be provided. When the user is passive, on the other hand, it can be determined that a topic should be provided by switching to a different theme.
- For example, topics about cars may continue to be provided to a user who is interested in cars.
- the topic name (theme) of the topic provided to the user can be switched by the personality index.
- the user can touch a topic belonging to the switched topic name without being aware of the topic name of the topic.
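The three state control indexes and the theme switch driven by the personality index can be sketched as follows. The field names, the +1 / -1 encoding of active and passive inputs, the three-input window, and the function `next_theme` are all assumptions made for illustration; the text only states that the indexes are histories and that a passive user triggers a switch to a different theme.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StateControlIndex:
    """Histories kept on the terminal side (hypothetical field names)."""
    input_history: List[str] = field(default_factory=list)    # input index
    provided_topics: List[str] = field(default_factory=list)  # progress index
    attitude_history: List[int] = field(default_factory=list) # personality index: +1 active, -1 passive


def next_theme(state, current_theme, other_themes, window=3):
    """Keep the current theme while the user is active; otherwise switch themes.

    A negative sum over the last `window` attitudes is treated as "passive".
    Themes already provided are skipped so that no topic is repeated.
    """
    recent = state.attitude_history[-window:]
    if recent and sum(recent) < 0:
        for theme in other_themes:
            if theme not in state.provided_topics:
                return theme
    return current_theme


passive_user = StateControlIndex(provided_topics=["cars"], attitude_history=[1, -1, -1])
active_user = StateControlIndex(provided_topics=["cars"], attitude_history=[1, 1, -1])
print(next_theme(passive_user, "cars", ["cars", "travel"]))
print(next_theme(active_user, "cars", ["cars", "travel"]))
```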
- the iWA 30 corresponds to hardware such as a server, for example.
- the iWA 30 corresponds to the topic providing server 4 shown in FIG.
- the iWA 30 is communicably connected to the Topiclet 20. This is hardware for executing processing related to a topic provided to a user in Topiclet 20.
- the iWA 30 includes a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), a HDD (Hard Disk Drive), a display, a keyboard (all not shown) and the like.
- the iWA 30 includes an input information analysis unit 310 and a scenario data storage unit 320.
- the input information analysis unit 310 analyzes the input information and generates input specifying information.
- the input information is information input from the input unit 210 by the user.
- the input specifying information includes a result obtained by statistically analyzing the input information and information necessary for providing a topic based on the result. For example, there is information such as the number and frequency of appearance of related terms in the input information. Moreover, scenario data (statement) determined to be necessary for providing topics according to the result is included.
- the input identification information includes various information such as related terms, scenario data, and the number of related terms included in the scenario data.
- the scenario data includes topic information provided to the user, greeting information necessary for conversation with the user, and the like.
- <Related words> The various data used in the topic providing system 1, the conversation control terminal device 2, and the maintenance device 3 according to the present embodiment are built on data called related words. Unlike ordinary keywords used in ordinary search processing and the like, related words can be associated with various information such as history information and preferences. The input information can be analyzed based on the related information held by the related words.
- the scenario data storage unit 320 stores a plurality of scenario data.
- the plurality of scenario data are all scenario data corresponding to topic names necessary for talking with the user on the Topiclet 20.
- the scenario data determined to be necessary based on the input identification information is transmitted to the Topiclet 20. Therefore, when talking with the user in the Topiclet 20, the scenario data is not transmitted to the Topiclet 20 every time a conversation is made.
- the scenario data is composed of a plurality of statements. Therefore, a plurality of statements constituting scenario data determined to be necessary based on the input identification information are transmitted to the Topiclet 20.
- When a statement is determined to be necessary, it is transmitted to the Topiclet 20 as scenario data. When it is determined not to be necessary, the statements already transmitted to the Topiclet 20 are sufficient, and in this case no statement is transmitted to the Topiclet 20.
- In the above description, scenario data determined to be necessary based on the input identification information is transmitted to the Topiclet 20, but all scenario data stored in the scenario data storage unit 320 may instead be transmitted to the Topiclet 20. In that case, since all the scenario data has already been transmitted to the Topiclet 20, the time required for transmitting and receiving scenario data can be shortened, and the conversation with the user can proceed smoothly.
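The selective transmission described above, in which only statements not yet held by the terminal are sent, can be sketched as follows. The function name `statements_to_send` and the use of integer statement IDs are assumptions for the example; the text does not describe how the server tracks what has already been transmitted.

```python
def statements_to_send(required_ids, already_sent):
    """Return the required statement IDs that have not been transmitted yet.

    `already_sent` is a set kept per terminal (an assumed bookkeeping structure);
    it is updated in place so the next call skips what was just sent.
    """
    pending = [sid for sid in required_ids if sid not in already_sent]
    already_sent.update(pending)
    return pending


sent_to_terminal = set()
first_batch = statements_to_send([1, 2, 3], sent_to_terminal)
second_batch = statements_to_send([2, 3, 4], sent_to_terminal)
print(first_batch, second_batch)
```

Sending every statement up front, the alternative also mentioned in the text, corresponds to calling the function once with all statement IDs.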
- the scenario data stored in the scenario data storage unit 320 is rearranged, and the scenario data corresponding to the topic name is transmitted to the Topiclet 20. That is, a statement corresponding to the scenario data corresponding to the topic name is transmitted to Topiclet 20.
- the rearranged scenario data is stored in the scenario data storage unit 270 of the topiclet 20. This rearrangement of scenario data can be executed according to the topic name.
- scenario data corresponding to one topic name cannot be sufficiently handled.
- the iWA Manager 40 corresponds to hardware such as a server, for example.
- the iWA Manager 40 corresponds to the maintenance device 3 shown in FIG.
- iWA Manager 40 is communicably connected to iWA 30.
- the iWA Manager 40 is mainly hardware for executing processing related to scenario data used in the iWA 30.
- The iWA Manager 40 includes a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), an HDD (Hard Disk Drive), a display, a keyboard (none of which are shown), and the like.
- the iWA Manager 40 includes a scenario data editing unit 410, a scenario data verification unit 420, and a scenario data transmission unit 430.
- the scenario data editing unit 410 is a device or member that enables editing of scenario data.
- the scenario data can be edited by the person in charge of the contractor of the topic providing system 1 operating the keyboard or the like. Editing includes adding, deleting, and changing scenario data. Specifically, editing is a process of adding, deleting, or changing a statement that constitutes scenario data.
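The three editing operations named above (adding, deleting, and changing a statement) can be sketched on a simple mapping from statement ID to statement. Representing scenario data as a dictionary and the name `edit_scenario` are assumptions for illustration only.

```python
def edit_scenario(scenario, operation, statement_id, statement=None):
    """Return an edited copy of the scenario data; the original is left untouched.

    `scenario` maps statement IDs to statements (an assumed representation).
    """
    edited = dict(scenario)
    if operation == "add":
        if statement_id in edited:
            raise KeyError(f"statement {statement_id} already exists")
        edited[statement_id] = statement
    elif operation == "change":
        if statement_id not in edited:
            raise KeyError(f"statement {statement_id} does not exist")
        edited[statement_id] = statement
    elif operation == "delete":
        del edited[statement_id]
    else:
        raise ValueError(f"unknown operation: {operation}")
    return edited


base = {1: "greeting"}
added = edit_scenario(base, "add", 2, "topic introduction")
changed = edit_scenario(added, "change", 1, "new greeting")
deleted = edit_scenario(changed, "delete", 2)
print(deleted)
```

Returning a copy rather than editing in place keeps the unedited data available until the edited version has been verified.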
- Scenario data needs to be updated to accommodate the latest topics as new products are sold, new services are provided, various incidents occur, and the number of new users increases. Therefore, the person in charge can acquire various types of information via the network and bring the scenario data up to date based on this information. By updating the scenario data, a topic corresponding to the latest information can be provided to the user.
- By correcting inappropriate information, such as typographical errors, omissions, and incorrect information, with the scenario data editing unit 410, it is possible to provide a topic corresponding to appropriate information to the user.
- the scenario data verification unit 420 is a device or member that enables verification of the response of scenario data edited based on the input specifying information generated by the input information analysis unit. That is, the scenario data verification unit 420 is a device or member for verifying whether the response of the scenario data edited by the scenario data editing unit 410 is appropriate.
- Scenario data includes output information to be output from the output unit 220, output commands for controlling the output specifications of the output unit 220, and control commands for making judgments that control statements, switching topic names, and changing the state control index. For this reason, it is necessary to verify not only whether the data output to the output unit 220 is appropriate, but also whether the control of the output to the output unit 220 and the control of scenario data transitions and the like are appropriate.
- The scenario data verification unit 420 can generate input specifying information using various assumed input information and can verify the response of the scenario data edited by the scenario data editing unit 410. Because it is thus possible to verify whether the response of the scenario data is appropriate for each user, the scenario data can be customized for each user.
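Verification with assumed input information can be pictured as running a list of assumed inputs through the edited scenario and collecting every response that differs from what the editor expected. The sketch below assumes the edited scenario can be called as a function from input text to response text; `verify_responses`, `edited_scenario`, and the sample cases are hypothetical.

```python
def verify_responses(respond, assumed_cases):
    """Return (input, expected, actual) for every assumed input whose response differs."""
    failures = []
    for assumed_input, expected in assumed_cases:
        actual = respond(assumed_input)
        if actual != expected:
            failures.append((assumed_input, expected, actual))
    return failures


def edited_scenario(text):
    """Stand-in for edited scenario data: maps an input to a response."""
    if "hello" in text.lower():
        return "Hello! How are you today?"
    return "Shall we move on to the next topic?"


cases = [
    ("Hello", "Hello! How are you today?"),
    ("I am worried", "Let us talk about what worries you."),
]
failures = verify_responses(edited_scenario, cases)
print(failures)
```

An empty result would indicate that the edited scenario answers every assumed input as intended; here the second case is reported, which tells the editor that a statement for that input is still missing.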
- The scenario data verification unit 420 can virtually construct an environment similar to the Topiclet 20 by means of the terminal device virtual construction unit. By verifying scenario data in a virtual environment, the output and operation of the scenario data can be verified in an environment close to the one actually used by the user, so that whether the scenario data is appropriate can be judged easily and accurately.
- the scenario data transmission unit 430 transmits the edited scenario data to the outside, for example, the iWA 30.
- the scenario data edited by the scenario data editing unit 410 and further verified by the scenario data verification unit 420 are transmitted to the iWA 30. Therefore, the scenario data transmission unit 430 transmits the verified scenario data to the iWA 30.
- scenario data transmitted from iWA 30 to Topiclet 20 can always be kept in an appropriate state. Accordingly, a topic using appropriate scenario data can be provided to the user via the iWA 30.
- scenario data, input specifying information, and a topic introduction list are output from the iWA 30.
- scenario data, input specifying information, and a topic introduction list will be described.
- FIG. 14 is an example of scenario data used in the topic providing system 1, the conversation control terminal device 2, and the maintenance device 3 of the present embodiment.
- the scenario data shown in FIG. 14 will be described according to a specific processing procedure.
- The scenario data shown in FIG. 14 includes a plurality of statements, namely the first to thirteenth statements.
- FIGS. 5 to 12 are flowcharts showing the processing procedures of these first to thirteenth statements.
- FIG. 13 is a diagram illustrating an example of output to the output unit 220 by processing the first to thirteenth scenario data.
- scenario data (statements) used in the topic providing system 1, the conversation control terminal device 2, and the maintenance device 3 includes output information, output commands, and control commands.
- the scenario data (statement) is composed of various elements such as output information, output commands, and control commands.
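One way to picture a statement built from these three kinds of element is the record below. The class, its field names, and the tuple encoding of commands are assumptions made for illustration; the actual layout of the scenario data is the one shown in FIG. 14, which is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Statement:
    """A statement with optional output information plus two command lists."""
    statement_id: int
    output_information: Optional[str] = None
    output_commands: List[Tuple] = field(default_factory=list)
    control_commands: List[Tuple] = field(default_factory=list)

    def has_output(self) -> bool:
        """True when the statement carries something to show on the output unit."""
        return self.output_information is not None


# A statement that outputs text, with hypothetical command names.
first = Statement(
    statement_id=1,
    output_information="shift to a topic to resolve anxiety",
    output_commands=[("erase",), ("show_image", "M1"), ("wait", 3)],
    control_commands=[("set_progress_index", -1), ("goto", 2)],
)
# A statement made only of control commands, with no output information.
second = Statement(statement_id=2, control_commands=[("monitor_absolute_time", "12:00")])
print(first.has_output(), second.has_output())
```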
- FIG. 5 is a flowchart showing the processing of the first, second and third statements.
- In the first statement, the progress index is first set to -1 (step S511). This value indicates the state of progress.
- Next, the output unit 220 is temporarily erased (step S513), and “shift to a topic to resolve anxiety” is output to the output unit 220 (step S515).
- For example, when “I am concerned about the earthquake” is input by the user's operation, “shift to a topic to resolve anxiety” is displayed in the text data display area of the output unit 220 (1311 in FIG. 13).
- Next, a predetermined image is output (step S517), and the process waits for 3 seconds (step S519).
- For example, a face image M1 of a predetermined color is displayed in the image data display area of the output unit 220, and the process waits for 3 seconds (1311 in FIG. 13).
- Next, the topic name “topic topic” is switched (step S521).
- Next, a transition is made to the second statement (1313 in FIG. 13).
- steps S523 to S525 correspond to the processing of the second statement.
- When a transition is made to the second statement, monitoring of whether an absolute time, for example 12:00, has been reached is first started (step S523). Next, it is determined whether or not the absolute time has been reached (step S525) (1315 in FIG. 13). If the absolute time has not been reached (NO), the process proceeds to the third statement. When the absolute time has been reached (YES), the monitoring of the absolute time is canceled and a transition is made to the thirteenth statement (reference EE).
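The branch taken by the second statement can be sketched as a small function that returns the number of the next statement. The function name `second_statement_next` and the use of Python's `datetime` are assumptions for the example; only the 12:00 deadline and the two destinations (third and thirteenth statements) come from the description.

```python
from datetime import datetime, time


def second_statement_next(now, absolute_time=time(12, 0)):
    """Return 13 once the absolute time has been reached, otherwise 3."""
    if now.time() >= absolute_time:
        return 13  # absolute time reached (YES): end the explanation
    return 3       # not yet reached (NO): continue with the third statement


print(second_statement_next(datetime(2013, 12, 4, 11, 59)))
print(second_statement_next(datetime(2013, 12, 4, 12, 0)))
```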
- steps S527 to S533 correspond to the processing of the third statement.
- First, it is determined whether or not the progress index is -1 (step S527) (1317 in FIG. 13).
- When it is determined that the progress index is -1 (YES), the process proceeds to the fourth statement (reference E1).
- When it is determined that the progress index is not -2 (NO), it is determined whether or not the progress index is -3 (step S531). When it is determined that the progress index is -3 (YES), the process proceeds to the sixth statement (reference E3).
- Next, it is determined whether or not the progress index is -4 (step S533).
- When it is determined that the progress index is not -4 (NO), the process proceeds to another statement (sta: 200).
- the second and third statements described above consist only of control commands. As described above, the statement may not have output information for output to the output unit 220.
- FIG. 6 shows processing corresponding to the fourth statement and the seventh statement.
- the fourth statement corresponds to steps S611 to S621.
- the seventh statement corresponds to steps S623 to S629.
- <4th statement> When a transition is made to the fourth statement, measurement of a relative time, for example 120 seconds, is started (step S611). Next, the input index is set to 1 (step S613), and the number of times the input index has been set to 1 is counted (step S615).
- Next, it is determined whether or not the number of times the input index has been set to 1 has reached 5 (step S617) (1319 in FIG. 13).
- When the number of times the input index has been set to 1 has reached 5 (YES), a transition is made to the tenth statement (reference E11).
- When the number of times the input index has been set to 1 has not reached 5 (NO), it is determined whether or not the relative time, for example 120 seconds, has elapsed (step S619) (1319 in FIG. 13). If the relative time has elapsed (YES), the measurement of the relative time is terminated (step S621), and a transition is made to the tenth statement (reference E11).
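The exit condition of the fourth statement, leaving for the tenth statement after five counted inputs or after 120 seconds, whichever comes first, can be sketched as follows. The class name `CountOrTimeout` and the injectable clock are assumptions made so the example can be run without waiting; the limits 5 and 120 seconds are the example values given in the text.

```python
import time


class CountOrTimeout:
    """Tracks a counter and a relative timer started when the statement is entered."""

    def __init__(self, limit=5, timeout_s=120.0, clock=time.monotonic):
        self._clock = clock
        self._started_at = clock()  # start measuring relative time (cf. step S611)
        self._limit = limit
        self._timeout_s = timeout_s
        self.count = 0

    def record_input(self):
        """Count one more time that the input index was set (cf. step S615)."""
        self.count += 1

    def should_transition(self):
        """True when the count limit is reached or the relative time has elapsed."""
        elapsed = self._clock() - self._started_at
        return self.count >= self._limit or elapsed >= self._timeout_s


# Usage with a fake clock so that elapsed time can be simulated.
fake_now = [0.0]
guard = CountOrTimeout(clock=lambda: fake_now[0])
guard.record_input()
print(guard.should_transition())  # one input, no time elapsed
fake_now[0] = 120.0
print(guard.should_transition())  # relative time elapsed
```

The fifth and sixth statements follow the same pattern with a different input index value and a different destination statement.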
- FIG. 7 shows processing corresponding to the fifth statement and the eighth statement.
- the fifth statement corresponds to steps S711 to S721.
- the eighth statement corresponds to steps S723 to S729.
- <5th statement> When a transition is made to the fifth statement, measurement of a relative time, for example 120 seconds, is started (step S711). Next, the input index is set to 2 (step S713), and the number of times the input index has been set to 2 is counted (step S715).
- Next, it is determined whether or not the number of times the input index has been set to 2 has reached 5 (step S717) (1319 in FIG. 13).
- When the number of times the input index has been set to 2 has reached 5 (YES), a transition is made to the eleventh statement (reference E12).
- When the number of times the input index has been set to 2 has not reached 5 (NO), it is determined whether or not the relative time, for example 120 seconds, has elapsed (step S719) (1319 in FIG. 13). If the relative time has elapsed (YES), the measurement of the relative time is terminated (step S721), and a transition is made to the eleventh statement (reference E12).
- FIG. 8 shows processing corresponding to the sixth statement and the ninth statement.
- the sixth statement corresponds to steps S811 to S821.
- the ninth statement corresponds to steps S823 to S829.
- <6th statement> When a transition is made to the sixth statement, measurement of a relative time, for example 120 seconds, is started (step S811). Next, the input index is set to 3 (step S813), and the number of times the input index has been set to 3 is counted (step S815).
- Next, it is determined whether or not the number of times the input index has been set to 3 has reached 5 (step S817) (1319 in FIG. 13).
- When the number of times the input index has been set to 3 has reached 5 (YES), a transition is made to the twelfth statement (reference E13).
- When the number of times the input index has been set to 3 has not reached 5 (NO), it is determined whether or not the relative time, for example 120 seconds, has elapsed (step S819) (1319 in FIG. 13). If the relative time has elapsed (YES), the measurement of the relative time is terminated (step S821), and a transition is made to the twelfth statement (reference E13).
- <9th statement> When a transition is made to the ninth statement, the output unit 220 is erased (step S823), and a question about “it is dangerous” is output to the output unit 220 (step S825). Next, a predetermined image (for example, a face image M1 of a predetermined color) is output (step S827), and the process waits for 10 seconds (step S829). Next, a transition is made to the second statement (reference symbol ES).
- FIG. 9 shows a process corresponding to the tenth statement. Transition to the tenth statement is made by the above-described processing of FIG. 6 (reference numeral E11).
- the progress index is set to -2 (step S911).
- the output unit 220 is erased (step S913), and “shift to the next topic” is output to the output unit 220 (step S915).
- Next, a predetermined image (for example, a face image M1 of a predetermined color) is output (step S917), and the process waits for 3 seconds (step S919).
- Next, using the word “OK” as input information, the topic name is switched to the one corresponding to this word (step S921).
- a transition is made to the second statement (reference symbol ES).
- FIG. 10 shows processing corresponding to the eleventh statement. Transition to the eleventh statement is made by the above-described processing of FIG. 7 (reference numeral E12).
- the progress index is set to -3 (step S1011).
- the output unit 220 is erased (step S1013), and “shift to the next topic” is output to the output unit 220 (step S1015).
- Next, a predetermined image (for example, a face image M1 of a predetermined color) is output, and the process waits for 3 seconds (step S1019).
- Next, the word “dangerous” is used as input information, and the topic name is switched to the one corresponding to this word (step S1021).
- a transition is made to the second statement (reference symbol ES).
- FIG. 11 shows processing corresponding to the twelfth statement. Transition to the twelfth statement is made by the above-described processing of FIG. 8 (reference numeral E13).
- the progress index is set to -4 (step S1111).
- the output unit 220 is erased (step S1113), and “time has come” is output to the output unit 220 (step S1115).
- Next, a predetermined image (for example, a face image M1 of a predetermined color) is output, and the process waits for 3 seconds (step S1119).
- a transition is made to the thirteenth statement (reference EE).
- FIG. 12 shows processing corresponding to the thirteenth statement. The process transitions to the thirteenth statement by the process of FIG. 11 (reference EE) described above.
- the display of the output unit 220 is erased (step S1211), and the end of the description is output to the output unit 220 (step S1213) (1321 in FIG. 8).
- Next, a predetermined image (for example, a face image M1 of a predetermined color) is output (step S1215), and the process waits for 3 seconds (step S1217).
- a transition is made to another statement (sta: 200).
- Scenario data is composed of a plurality of statements. After a transition to one statement, processing based on that statement is executed; a transition is then made to another statement, and processing based on that statement is executed. By repeating statement transitions and the processing within each statement, a topic can be provided to the user.
- scenario data (a plurality of statements) is used in the topic providing system 1, the conversation control terminal device 2, and the maintenance device 3 in the present embodiment.
- the statement of the present embodiment is composed of various elements such as transition information and judgment.
- a statement can include elements necessary for providing a topic to the user and controlling a conversation with the user.
- Identification information is information attached to identify a statement. This identification information is referred to when transitioning statements.
- One statement includes transition destination information.
- one statement includes both identification information and transition destination information.
- the identification information is identification information for identifying a statement and is information indicating the statement itself.
- the transition destination information is information for designating a statement to be transitioned next.
- the identification information and the transition destination information are used when transitioning from one statement to another. That is, the statement whose identification information matches the transition destination information defined in the statement being left is searched for, and a transition is made to that statement. By using both the identification information and the transition destination information in this way, statements can be transitioned one after another.
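- The matching of transition destination information against identification information can be sketched as a lookup. This is a minimal sketch under assumptions: the `Statement` class, its field names, the use of `None` to end the walk, and the `max_steps` guard are all hypothetical and only illustrate the search-and-transition idea.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Statement:
    identification: str            # identifies this statement
    transition_to: Optional[str]   # identification of the next statement (None ends the walk; assumption)


def walk(statements: List[Statement], start: str, max_steps: int = 100) -> List[str]:
    """Follow transition destinations, looking each one up by identification."""
    by_id: Dict[str, Statement] = {s.identification: s for s in statements}
    visited: List[str] = []
    current: Optional[str] = start
    for _ in range(max_steps):     # guard against scenarios that loop forever
        if current is None:
            break
        stmt = by_id[current]      # search for the statement whose identification matches
        visited.append(stmt.identification)
        current = stmt.transition_to
    return visited


scenario = [
    Statement("E11", "ES"),
    Statement("ES", "EE"),
    Statement("EE", None),
]
print(walk(scenario, "E11"))
```

Because the lookup is keyed only on identification information, statements can be stored in any order.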
- Judgment includes judgment based on indicators and judgment based on time.
- the determination based on the index is a determination for determining whether the index satisfies a predetermined condition. If the index satisfies a predetermined condition, it can be determined to be true, and if the index does not satisfy the predetermined condition, it can be determined to be false and branched.
- the determination based on time is a determination of whether a time of day or an elapsed time satisfies a predetermined condition. If the condition is satisfied, the result is true; if it is not satisfied, the result is false, and processing branches accordingly.
- the output information is text data to be output to the output unit 220.
- an image can also be output to the output unit 220 by the output information, using identification information of the image (for example, a file name).
- the setting element is an element for setting a state index.
- a statement can be transitioned or branched based on a set state index.
- line breaks are for outputting the text with a line break in the output unit 220.
- Erasing is for erasing the text or image output to the output unit 220.
- the output control element is an element for controlling the output of the output information described above. For example, it is possible to define a time for outputting the output information, or to define an image to be output together with the output of the output information.
- the transition destination information is information for designating the next transition destination statement.
- the identification information matching the transition destination information is searched for, and a transition is made to the statement of the identification information.
- Topic switching information is an element for switching topic names.
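- The elements listed above (setting, erasing, output information, output control, topic switching, transition destination) can be sketched as a small interpreter, using the tenth statement as sample data. This is an illustrative sketch only: the element kind names, the `TerminalState` class, and the tuple encoding are assumptions, and waiting is recorded rather than actually performed.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple

Element = Tuple[str, Any]  # (element kind, value); the kind names are hypothetical


@dataclass
class TerminalState:
    screen: List[str] = field(default_factory=list)       # stands in for the output unit 220
    indices: Dict[str, int] = field(default_factory=dict)  # state indices
    topic_input: Optional[str] = None                      # word used to switch the topic name
    waited_s: float = 0.0                                  # total wait time requested


def run_statement(elements: List[Element], state: TerminalState) -> Optional[str]:
    """Execute the elements in order and return the transition destination."""
    next_id: Optional[str] = None
    for kind, value in elements:
        if kind == "set":
            name, number = value
            state.indices[name] = number
        elif kind == "erase":
            state.screen.clear()
        elif kind == "output":
            state.screen.append(value)
        elif kind == "wait":
            state.waited_s += value      # recorded, not slept, to keep the sketch fast
        elif kind == "switch_topic":
            state.topic_input = value
        elif kind == "transition":
            next_id = value
        else:
            raise ValueError(f"unknown element kind: {kind}")
    return next_id


# The tenth statement (steps S911 to S921), encoded with the hypothetical kinds above.
TENTH_STATEMENT: List[Element] = [
    ("set", ("progress", -2)),
    ("erase", None),
    ("output", "shift to the next topic"),
    ("output", "face image M1"),
    ("wait", 3),
    ("switch_topic", "OK"),
    ("transition", "ES"),
]

state = TerminalState(screen=["previous text"])
next_statement = run_statement(TENTH_STATEMENT, state)
print(next_statement, state.screen, state.indices, state.topic_input)
```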
- the scenario data (a plurality of statements) of the present embodiment is defined for each of a plurality of topic names. That is, one set of scenario data corresponds to each topic name, and each set of scenario data is composed of a plurality of statements.
- in some cases, scenario data corresponding to a single topic name is not sufficient on its own.
- scenario data corresponding to other topic names is also composed of a plurality of statements.
- the scenario data stored in the iWA 30 is rearranged to generate scenario data corresponding to the topic name.
- the rearranged scenario data is stored in the scenario data storage unit 270 of the topiclet 20. This recombination of scenario data can be executed by determining a combination of a plurality of statements according to the topic name.
- the personality index is information indicating whether the user is active or passive about a certain topic. If the user is active, it can be determined that the topic can continue to be provided without switching the topic name. If the user is passive, it can be determined that the topic name must be switched in order to provide a topic.
- Example of input specifying information: identification information corresponding to the input information, or information for identifying that identification information, is added to the input specifying information, and scenario data can be activated by this information.
- the input specifying information in FIGS. 1 to 4 includes a topic introduction list described below.
- FIG. 16 is a diagram illustrating an example of the configuration of a topic introduction list.
- the topic introduction list is a list of pairs of related terms and topics.
- the related term set is a set of related terms included in the topic.
- the neighborhood of a related term is, when attention is given to a certain related term A, a set of related terms that includes the related term A.
- a neighborhood system of related terms is a set of neighborhoods of related terms.
- the number of neighborhoods of the related term is the number of elements in the neighborhood system of the related term.
- the topology of the related term can be understood by looking at the neighborhood system of the related term.
- the neighborhood system can be displayed for all sets of related terms.
- By performing the preference analysis it is possible to display the neighborhood system of related terms in the order of preference.
- a scenario related to topic introduction is a scenario that can be configured based on the proximity or connection of topics based on the neighborhood system of related terms.
- the topic is text that is assigned an action and an index.
- An action is a change that is activated by clicking a text or the like.
- a set of related terms is added to the topic as an index. Clicking on the index displays the related terms that belong to the index.
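- The definitions above can be made concrete with a short sketch. It assumes that the topic introduction list is held as pairs of a related term set and a topic, and that the neighborhood system of a related term A is the collection of distinct related term sets containing A; the type alias, function names, and sample data are hypothetical.

```python
from typing import FrozenSet, List, Tuple

TopicEntry = Tuple[FrozenSet[str], str]  # (related term set, topic text)


def neighborhood_system(entries: List[TopicEntry], term: str) -> List[FrozenSet[str]]:
    """Collect every distinct related term set that contains `term`."""
    system: List[FrozenSet[str]] = []
    for terms, _topic in entries:
        if term in terms and terms not in system:
            system.append(terms)
    return system


def topics_connected_by(entries: List[TopicEntry], term: str) -> List[str]:
    """Topics that are connected to each other through `term`."""
    return [topic for terms, topic in entries if term in terms]


ENTRIES: List[TopicEntry] = [
    (frozenset({"panic", "most scary"}), "topic 1"),
    (frozenset({"panic", "relieved"}), "topic 2"),
    (frozenset({"relieved"}), "topic 3"),
]

system = neighborhood_system(ENTRIES, "panic")
print(len(system))                               # number of neighborhoods of "panic"
print(topics_connected_by(ENTRIES, "relieved"))  # topics linked via "relieved"
```

Counting the elements of the returned list gives the number of neighborhoods, and the second function shows how a shared related term connects topics.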
- FIG. 17 shows that topics in the topic introduction list are connected by related terms into which a related term structure, such as a related term dictionary or a preference dictionary, is introduced. The user can find a topic by paying attention to how the topics in the topic introduction list are connected.
- the user can find topics from various viewpoints by switching topics based on the related terms while referring to the related term structure in the topic introduction list.
- Topic analysis provides, via the topic introduction list, the "related terms and related term structure" necessary for switching the topic itself in the scenario data.
- variable related phrases for example, the most frequent related phrases, the most popular related phrases, etc.
- the topic providing system 1 can reproduce, as a topic providing service using the scenario data, the manual operation by which the user finds a topic using the topic introduction list.
- the maintenance device 3 constitutes a scenario data verification unit for creating, in advance, scenario data that includes the information that the subscriber of the topic providing system 1 wants to provide to the user. Although it is not necessary for creating the information to be provided (the topic introduction list), the maintenance device 3 of this embodiment adds the state control index storage unit described above to the scenario data verification unit.
- the terminal device virtual construction unit is configured to function as the conversation control terminal device 2 virtually. That is, the terminal device virtual construction unit is configured by adding the state control index storage unit to the scenario data verification unit, and this terminal device virtual construction unit corresponds to the topiclet 20 and the iWA Manager 40 described above.
- with the topic analysis unit, it is possible to analyze topics and to produce output for visualizing the topic introduction list.
- the topic analysis unit is configured to generate a topic list in which the closeness of topics and the way topics connect are expressed via the related terms that relate the topics.
- the maintenance device 3 is also characterized by having a scenario data editing section that uses the topic list and the related terms to make it possible to edit, as the scenario data, a topic introduction list (equivalent to the topic list in FIG. 17) for introducing a topic to the user and an input related scenario for responding to an input.
- since the conversation control terminal device 2 is constructed virtually, the user input here corresponds to input by the simulator (person in charge).
- FIG. 17 is a diagram illustrating a process of topic extraction, generation of a related term dictionary, and generation of a preference dictionary.
- Topic analysis is executed by the topic analysis unit, and a preference dictionary and related term dictionary of iWA30 as a server are constructed.
- the control unit configured by the CPU of the maintenance device 3 uses these dictionaries to automatically assign, to each of a plurality of topics (topic material) input by the person in charge of the contractor, a plurality of related terms included in that topic.
- FIG. 15 is a diagram illustrating a process of generating response information based on topic analysis and outputting the response information to an output unit.
- iWA30 generates scenario data from input related scenarios and topic introduction scenarios.
- the iWA 30 generates input specific information from elements such as identification information.
- the topic list is generated from the set of related terms and topics by iWA30.
- response information is generated based on the generated scenario data, input specific information, and topic introduction list, and the response information is output from the output unit 220.
- response information can be generated from topic analysis, and the response information can be output.
- through topic analysis, the response information used in the conversation with the user can be made appropriate for each user, and the conversation with the user can be made smoother.
- FIG. 17 is a diagram illustrating creation of a topic introduction list and profiling of a user using a preference dictionary.
- the topic introduction list is visualized in a display form as shown on the screen (reference numerals 1813 and 1811) of the display device provided in the maintenance device 3 in FIG. 17, and obtained from the related term dictionary F or an external news source.
- the topic introduction list is built automatically under the control of the maintenance device 3, based on the topic data input by the person in charge and on the topic list G, which consists of a group of topic data taken in from outside and is extracted from the iWA 30.
- the input source of the content that is input and displayed as the topic (topic item setting field on the screens 1813 and 1811) in the topic introduction list is as follows.
- In the first form, the person in charge of the contractor directly inputs and sets the topic material using an input device of the maintenance device 3, such as an input keyboard.
- In the second form, the iWA 30 automatically extracts the topic material from log data that it has collected from outside (for example, data that can be collected via a network, such as Twitter or blogs), based on the topic that the person in charge input with the input device of the maintenance device 3 (1811).
- From the extracted topics, topic browsing list data that is a candidate for the topic introduction list is generated.
- to display the topic manually input by the person in charge through the maintenance device 3 on the screen 1815, the iWA 30 refers to the related term dictionary F, and a plurality of related terms are assigned to the input topic under the control of the iWA 30.
- using the related terms associated with the manually entered topic as keys, the iWA 30 refers to the topic list G, which consists of general news collected from outside, and automatically extracts external topics related to the topic input by the person in charge. The maintenance device 3 receives the extracted data and displays it as shown on a screen 1811.
- the person in charge repeats topic input using a direct input device such as an input keyboard, while topics are automatically narrowed down or added from the topic list database in which the iWA 30 has stored information from outside in advance.
- the input topic is looked up in the related term dictionary F while the preference dictionary E obtained from the iWA 30 is also referred to.
- the related terms displayed in the linked screen 1815 are compared with the related terms in the user type list data, which associates related terms with the user IDs and user types generated from the response histories of other users.
- For example, as shown in a screen 1817, “most scary”, extracted as a related term for a certain topic, is looked up in the preference dictionary E constructed from the histories of other past users.
- By analyzing and displaying which user types have input information associated with similar related terms, the result can be visualized.
- This analysis result is stored together with a user ID for identifying each user who has input the same related term, a user type (for example, yesterday's customer), and the related terms common to all users, and the stored data can be used to analyze user preferences. As one use, for users considered to share the same preference, a delivery destination such as the user's email address can be identified from the user ID, and a specific service considered to match that preference, or a topic other than the service that suits the preference, can be delivered to that destination.
- FIG. 18 shows the flow of processing from when a topic is input to the topic material setting screen by manual input, according to the first form of the maintenance device 3 described above, until a topic introduction list is generated from the related term dictionary and the topic list by the iWA 30 as a server and is output.
- With reference to FIG. 17, the process of profiling the user from the topic input using the preference dictionary has been described. However, this profiling is a process separate from the generation of the topic introduction list, so a description of it is omitted here.
- The control unit of the maintenance device 3 displays a topic material setting screen on the display screen of the maintenance device 3 and waits for the person in charge to input a topic (S2000).
- the set topic material is transmitted to the iWA 30 serving as a server, and one or more related terms are extracted from the related term dictionary F under the control of the CPU of the iWA 30 in accordance with the content of the transmitted topic material data.
- For example, “panic discussion material” is transmitted from the maintenance device 3 to the iWA 30 as the topic material, and under the control of the CPU of the iWA 30 the related term “panic” is extracted from the content of this “panic discussion material”.
- the topic list is displayed as shown in a screen 1813, together with “number of selected topics: 1424 (total number of topics 1424)” and “Through-put: 17.25”. This screen 1813 shows that 1424 topics have been extracted under the control of the CPU of the iWA 30 using the related term “panic” as a key, and that any of them may eventually be built into the topic introduction list.
- the topics and the plurality of related terms associated with each topic are received by the maintenance device 3 and displayed on the screen 1813. In short, to display the screen 1813 in FIG. 17, list data in which a plurality of representative related terms are associated with each of a plurality of topics, using the related term “panic” as a key, is received from the iWA 30 (S2002).
- With this input, 1424 topics are generated and displayed as topic introduction list candidates using the first related term “panic” as a key. If this number is large, it is visualized.
- the person in charge can judge the content of the topics and narrow them down to the topics to be adopted in the target topic introduction list. That is, referring to the flow of FIG. 18, if the list data of the topic introduction list is not confirmed in step S2004, a time-out occurs after a certain time has elapsed, and the input of additional topic material in step S2001 can be executed again. For example, on the screen 1815, a second topic material, “I am relieved”, is input, and the received topics and their representative related terms are displayed.
- the display form can be changed to a display form centered on related terms.
- when “condition setting: priority related terminology” is input by operating the input device, the related terms can be displayed in a priority list format.
- on a screen 1815, for the 136 topics (1, 2, 3, ..., 136) obtained for the second topic material under the control of the CPU of the iWA 30, it is possible to switch to a display form in which the plurality of related terms given to each topic are displayed in a plurality of columns in each row.
- this display mode switching is performed by the control unit (not shown) of the maintenance device 3, which, in response to a screen switching input from the input device, changes the display to a form that lists the related terms of the topic in each row in the column direction.
- From this display form, the person in charge can pick, from the plurality of related terms displayed on the screen 1815, a related term suitable for introducing the topic he or she wants to create, and input that related term with the input device (S2001: YES).
- the screen 1811 shows this input state. In this case, the related term “most scary” is input as new topic material.
- the related term “most scary” is looked up in the topic list database under the control of the CPU of the iWA 30, and list data for a new topic introduction list candidate, comprising a plurality of related terms and topics based on the input topic material, is received (S2002) and displayed (S2003).
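- The repeated narrowing of candidates by successive keys (“panic”, then “I am relieved”, then “most scary”) can be sketched as follows. This sketch assumes that each new key is applied as a further filter on the current candidates and that a topic matches when the key is among its related terms; the embodiment does not state the matching rule, and the function names and the tiny sample topic list are hypothetical.

```python
from typing import Dict, List, Set


def narrow(candidates: Dict[str, Set[str]], key: str) -> Dict[str, Set[str]]:
    """Keep only topics whose related terms include `key`."""
    return {topic: terms for topic, terms in candidates.items() if key in terms}


def narrow_in_turn(candidates: Dict[str, Set[str]], keys: List[str]) -> List[int]:
    """Apply each key in order and report how many candidates remain after each."""
    sizes: List[int] = []
    for key in keys:
        candidates = narrow(candidates, key)
        sizes.append(len(candidates))
    return sizes


# Hypothetical stand-in for the topic list database held by the server.
TOPIC_LIST: Dict[str, Set[str]] = {
    "topic 1": {"panic", "I am relieved", "most scary"},
    "topic 2": {"panic", "I am relieved"},
    "topic 3": {"panic"},
    "topic 4": {"weather"},
}

sizes = narrow_in_turn(TOPIC_LIST, ["panic", "I am relieved", "most scary"])
print(sizes)
```

The shrinking counts play the role of the "number of selected topics" shown on the screens.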
- the topic list G is a list composed of information that the iWA 30, as a server, collects from outside by external information collecting means such as the Internet.
- a plurality of related terms are associated with each topic in the topic list G in advance and stored in the iWA 30 database.
- variations in setting a topic may become poor due to a lack of knowledge or ability on the part of the person in charge. By displaying, on the screen of the maintenance device 3, other related topics obtained from the topic list of the iWA 30 as a server, together with the plurality of related terms related to each topic, the possibility of obtaining abundant variations of topics to transition to is increased.
- until completion of input is entered (S2004: YES), the modification of the topic introduction list candidates described above times out after a predetermined time, the input screen for inputting the topic material is displayed again to wait for the next topic, and the above steps S2000 to S2003 are repeated in sequence.
- the information search system of the present invention provides a user with a keyword (character string) that can acquire the latest unknown topic using the mechanism of the topic providing system 1 as shown in FIG.
- important character strings that can identify a topic are extracted from an external log 502, which includes narrative information generated by individuals, through sentence analysis processing 511, preference analysis processing 512, and topic analysis processing 513.
- these related terms (character strings) and their distribution are displayed, and the user can grasp the topic by viewing this display.
- The processing from input of the external log 502 to display of the specific character strings is performed instantaneously.
- a related term is used as an example of the specific character string.
- a topic dictionary that is a summary of topic information can be obtained by compressing and summarizing the topic information.
- the topic dictionary is compressed before the sentence analysis process 511 described above. For example, only text data that can serve as topic information, excluding tags and scripts, is extracted from the result of searching WEB pages and the like, and the extracted text data becomes the processing target of the sentence analysis process 511.
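- As an illustration of extracting only text data while excluding tags and scripts, the following sketch uses the `HTMLParser` class from the Python standard library. The class name `TextOnly`, the helper `extract_text`, and the decision to also skip `style` elements are assumptions made for this example; the embodiment does not specify how the extraction is implemented.

```python
from html.parser import HTMLParser
from typing import List


class TextOnly(HTMLParser):
    """Collect visible text, skipping the contents of script and style elements."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Tags themselves never reach handle_data; only text nodes do.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html: str) -> str:
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


page = (
    "<html><head><script>var a = 1;</script></head>"
    "<body><p>Hello</p><p>panic topic</p></body></html>"
)
print(extract_text(page))
```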
- the company information is information related to a company made up of text data generated by an individual, and this is a knowledge space related to the company.
- This knowledge space is compressed and summarized by the above-described processing, without using a language dictionary, and converted into a dictionary (partial knowledge space) related to the company.
- this partial knowledge space includes information representing the connection between related terms.
- a condition for collecting the external log 502 is given (for example, by the user of the conversation control terminal device 2 ′′), and related terms are provided to the conversation control terminal device 2 ′′ as a result of the above-described processing (sentence analysis processing 511, preference analysis processing 512, and topic analysis processing 513).
- the conversation control terminal device 2 ′′ is, for example, a device such as a PC (personal computer), a smartphone, or a robot. If the conversation control terminal device 2 ′′ is a PC, the related terms of the processing result are displayed on the display of the conversation control terminal device 2 ′′ and provided to its user as information for instantly grasping the topic.
- the conversation control terminal device 2 ′′ is configured as a modification of the above-described conversation control terminal device 2 or conversation control terminal device 2 ′.
- the above-described sentence analysis processing 511, preference analysis processing 512, and topic analysis processing 513 are performed by the topic providing server 4 '.
- the topic providing server 4 ′ is configured as a modification of the topic providing server 4 described above.
- the sentence analysis process 511 analyzes sentence information included in the external log 502 based on the appearance characteristic of the character string, and selects a related term 503.
- the sentence analysis process 511 selects (extracts) related terms that can identify a topic from the external log 502 without using dictionary data stored and prepared in advance, such as morpheme data. That is, character strings that appear repeatedly in the external log 502 are searched for, and related terms are extracted according to how varied the characters immediately preceding those character strings are and how varied the characters immediately following them are.
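- One way to read the dictionary-free extraction described above is to count, for each repeated character string, how many distinct characters appear immediately before it and immediately after it. The sketch below follows that reading; the counting rule, the thresholds, the sentinel characters, and the function name are assumptions for illustration and are not stated in the embodiment.

```python
from collections import defaultdict
from typing import Dict, List, Set


def extract_candidates(text: str, max_len: int = 6,
                       min_count: int = 2, min_variety: int = 2) -> List[str]:
    """Return repeated substrings whose left and right neighbors both vary."""
    left: Dict[str, Set[str]] = defaultdict(set)
    right: Dict[str, Set[str]] = defaultdict(set)
    count: Dict[str, int] = defaultdict(int)
    n = len(text)
    for i in range(n):
        for length in range(2, max_len + 1):
            j = i + length
            if j > n:
                break
            s = text[i:j]
            count[s] += 1
            # "^" and "$" are sentinels for the start and end of the text.
            left[s].add(text[i - 1] if i > 0 else "^")
            right[s].add(text[j] if j < n else "$")
    return sorted(
        s for s, c in count.items()
        if c >= min_count
        and len(left[s]) >= min_variety
        and len(right[s]) >= min_variety
    )


sample = "xpanicy zpanicw"
print(extract_candidates(sample))
```

In the sample, fragments such as "pani" always continue with the same character, so they are rejected, while "panic" has varied neighbors on both sides and is kept.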
- the external log 502 includes narrative information created by individuals as described above (for example, data stored in a predetermined log format, text data of WEB pages (homepages) or blogs published on the Internet, and tweet information on TWITTER (registered trademark)), data generated and edited in advance by an arbitrary organization, and text information in a database. Various other data, such as text data acquired from audio or video files through speech recognition processing, may also be used.
- the external log 502 is data collected according to the collection conditions. For example, it may be text data described in a WEB page (homepage) 501 shown as a search result of a keyword search, sentences described in the blog of a user having a certain attribute, tweet information in TWITTER, or the like.
- a search condition or the like in the keyword search can be specified by the user from the conversation control terminal device 2 ′′.
- One external log 502 may consist of a plurality of text files (for example, a plurality of WEB pages (HTML files) included in one WEB site), or may be one part of a single text file that has been divided (for example, one of the parts obtained by dividing the text in one file every 10,000 lines).
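- Dividing the text of one file every 10,000 lines, as in the example above, can be sketched as follows. The function name and its default parameter are hypothetical; the demonstration uses a small part size so the result is easy to inspect.

```python
from typing import List


def split_log(text: str, lines_per_part: int = 10_000) -> List[str]:
    """Split text into parts of at most `lines_per_part` lines each."""
    lines = text.splitlines()
    return [
        "\n".join(lines[i:i + lines_per_part])
        for i in range(0, len(lines), lines_per_part)
    ]


# Small demonstration: 25 lines split every 10 lines gives three parts.
demo_text = "\n".join(str(i) for i in range(25))
parts = split_log(demo_text, lines_per_part=10)
print(len(parts), parts[2].splitlines())
```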
- the preference analysis processing 512 captures how the related terms extracted by the sentence analysis processing 511 are used based on the internal log 506 and determines their importance.
- the internal log 506 is data indicating the preferences of the user (including the organization or organization to which the user belongs), and is data stored in a predetermined log format, for example.
- the internal log 506 includes, for example, data indicating what related terms tend to be used by the user.
- in this specification, related terms associated with an importance according to the user's preference are referred to as topic keys (clusters) 504.
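- The association of an importance with each related term can be sketched as below. The embodiment does not define how importance is computed, so this sketch assumes, purely for illustration, that importance is the relative frequency with which each related term appears in the internal log entries; the function name and the sample log are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def topic_keys(related_terms: List[str], internal_log: List[str]) -> Dict[str, float]:
    """Map each related term to an importance in [0, 1] based on log frequency."""
    counts: Counter = Counter()
    for entry in internal_log:
        for term in related_terms:
            counts[term] += entry.count(term)   # simple substring count (assumption)
    total = sum(counts.values())
    return {term: (counts[term] / total if total else 0.0) for term in related_terms}


log_entries = ["panic again, panic", "relieved now"]
keys = topic_keys(["panic", "relieved"], log_entries)
print(keys)
```

A term that the user's log uses more often receives a higher importance, which is one simple way to reflect preference.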
- the topic analysis processing 513 captures the distribution of the topic key 504 generated by the preference analysis processing 512 based on the topic material 507 and provides the user with the distribution of related terms associated with each other.
- as described above, the topic material 507 is either input and set directly by the person in charge of the contractor with the input device of the maintenance device 3, or automatically extracted by the topic providing server 4 ′ from the external log 502 collected from outside, based on a keyword input with the input device of the maintenance device 3.
- with the topic analysis processing 513, it is possible to show how related terms are distributed in the topic, and it is possible to recommend related terms according to the user of the conversation control terminal device 2 ''.
- An information search system 100 shown in FIG. 20 includes a conversation control terminal device 2 ′′ and a topic providing server 4 ′. The conversation control terminal device 2 ′′ and the topic providing server 4 ′ are connected via a predetermined network (a LAN (local area network), the Internet, a WAN (wide area network), wireless communication, etc.).
- the conversation control terminal device 2 includes an input control unit 21, a search control unit 22, a transmission control unit 23, a reception control unit 24, a response information determination unit 25, an output control unit 26, and a network interface (I/F) unit 27.
- the scenario data 28 is stored in a main storage device such as a RAM or an external storage device such as a hard disk or a semiconductor memory.
- the input control unit 21 receives an input by the user of the conversation control terminal device 2 ′′ using a keyboard, a mouse, or the like, and passes input data or the like to a corresponding function unit according to the content of the input. For example, the user inputs a search keyword using a keyboard, or clicks a display area of an interesting related term using a mouse.
- the search control unit 22 includes a general WEB browser that operates on the conversation control terminal device 2 ′′.
- the conversation control terminal device 2 ′′ is connected to the Internet, for example, and when the user operates the WEB browser to search for WEB pages (a generally available Internet search), the search control unit 22 passes the obtained search result to the transmission control unit 23.
- the search result includes the address of the WEB page related to the search keyword (for example, Internet address identification information such as URL).
- the transmission control unit 23 Upon receiving the search result from the search control unit 22, the transmission control unit 23 transmits this as input information to the input information analysis unit 41 of the topic providing server 4 'by API transmission, for example.
- The reception control unit 24 receives the input specifying information transmitted from the input information analysis unit 41 of the topic providing server 4 ′ and supplies it to the response information determination unit 25.
- The response information determination unit 25 determines response information based on the scenario data 28 and the input specifying information.
- The response information is determined based on the input specifying information received from the input information analysis unit 41 (for example, data for displaying the distribution of related terms) and on the scenario data determined to be necessary for that display.
- The output control unit 26 performs control so that the response information determined by the response information determination unit 25 is displayed on the conversation control terminal device 2 ′′.
- The network interface unit 27 controls access to, and data transmission/reception with, the topic providing server 4 ′ connected via the network, as well as access to, and data transmission/reception with, other computers (for example, a server including an Internet search engine connected via the Internet).
- Scenario data 28 is data for defining response information related to topics provided to the user as shown in FIG.
- The scenario data 28 is data stored in advance in the scenario data 55 of the topic providing server 4 ′ described later.
- Scenario data determined to be necessary based on the input specifying information generated by the input information analysis unit 41 of the topic providing server 4 ′ is extracted from the scenario data 55, and the extracted scenario data is stored in the scenario data 28 of the conversation control terminal device 2 ′′.
- The scenario data extracted from the scenario data 55 is stored in the scenario data 28 by the processes of the reception control unit 24 and the response information determination unit 25.
- All scenario data is stored in the scenario data 55 of the topic providing server 4 ′. Scenario data can be generated by rearranging the scenario data of the scenario data 55 based on the information that defines transitions to different topics, and only the rearranged scenario data can be stored in the scenario data 28 of the conversation control terminal device 2 ′′.
- The topic providing server 4 ′ includes an input information analysis unit 41 and a network interface (I/F) unit 47.
- A main storage device such as a RAM, or an external storage device such as a hard disk or a semiconductor memory, stores the search result data 48, the related term candidate data 49, the related term dictionary 50, the preference data 51, the related term / co-occurrence word data 52, the topic data 53, the comparison result data 54, and the scenario data 55. These data can take various data formats and data storage formats.
- the input information analysis unit 41 analyzes the input information received from the conversation control terminal device 2 ′′ and generates input specifying information.
- the input specifying information is information generated as a result of analyzing various types of information included in the input information, and includes, for example, the distribution of related terms described later.
- the input information analysis unit 41 further includes an external log acquisition control unit 42, a sentence analysis unit 43, a preference analysis unit 44, a topic analysis unit 45, and an information update unit 46.
- When the input information received from the conversation control terminal device 2 ′′ is identification information for identifying the external log 502 (for example, a search result including the address of the WEB page 501 related to the search keyword), the location indicated by the identification information is accessed via the Internet, and the corresponding HTML data is acquired.
- When the input information received from the conversation control terminal device 2 ′′ includes the text data itself from which related terms are to be extracted, that data is provided to the sentence analysis unit 43 as an external log 502.
- Based on information identifying the external log 502 (for example, a search result including the address of the WEB page 501 related to the search keyword), the external log 502 can also be accessed by a crawler (see FIG. 39); the corresponding data is acquired, and the acquired data is provided to the information update unit 46 for comparison with the related term dictionary.
- The sentence analysis unit 43 acquires the text data from the external log 502 acquired by the external log acquisition control unit 42, extracts important related terms included in the text data according to the appearance characteristics of the character strings, and stores the related terms in the related term dictionary 50.
- The preference analysis unit 44 determines, based on the preference data 51, the importance of the related terms stored in the related term dictionary 50 by the sentence analysis unit 43, and stores the determination result in the related term / co-occurrence word data 52.
- The preference data 51 is data including an internal log 506 that stores the user's usage of the related terms.
- The topic analysis unit 45 captures, based on the topic data 53, the distribution of the related terms stored in the related term / co-occurrence word data 52 generated by the preference analysis unit 44, associates the related terms with each other, and updates the related term / co-occurrence word data 52.
- The topic data 53 is data including a topic material 507 that is input and set by the person in charge at the contractor or is automatically extracted.
- The information updating unit 46 selects related terms to generate related term dictionaries, compares the generated related term dictionaries, and stores the comparison result in the comparison result data 54.
- The network interface unit 47 controls access to, and data transmission/reception with, the conversation control terminal device 2 ′′ connected via the network, as well as access to, and data transmission/reception with, other computers (for example, a server having an Internet search engine connected via the Internet).
- The information search system 100 has been described as a system including the conversation control terminal device 2 ′′ and the topic providing server 4 ′.
- However, the conversation control terminal device 2 ′′ and the topic providing server 4 ′ can also be configured as one integrated computer. Conversely, the same functions can also be realized by distributing them across three or more computers connected to the network.
- The sentence analysis unit 43 can be configured as one independent sentence analysis device, and similarly, the information update unit 46 can be configured as one independent information update device.
- In that case, the sentence analysis device and the information update device are connected to the other devices via a network to configure the information search system 100 described above.
- The sentence analysis unit 43 retrieves identical character strings from the text data and, for each retrieved character string, determines the degree of difference among its preceding adjacent characters and the degree of difference among its subsequent adjacent characters. Based on the determined degrees of difference, it determines whether the retrieved “same character string” is a related term, that is, a term that is highly important to the topic and can semantically identify the text data.
- The degree of difference among the preceding adjacent characters is an index indicating how varied the characters appearing immediately before the retrieved “same character string” are.
- The degree of difference among the subsequent adjacent characters is an index indicating how varied the characters appearing immediately after the retrieved “same character string” are.
- A character string for which both the degree of difference among the preceding adjacent characters and the degree of difference among the subsequent adjacent characters are large is determined to be a related term.
- One or more character strings determined in this way are stored in a predetermined storage unit as necessary.
- This kind of character string extraction rests on the following idea: when attention is paid to multiple identical character strings included in text data, and many different characters appear immediately before each occurrence, the character string can be regarded as an independent and frequently used term.
- The sentence analysis unit 43 of the present invention extracts character strings based on this idea, which rests on the kinematics of character strings.
- For example, the degree of difference among the preceding adjacent characters is determined from the number of variations, such as 30 distinct characters including “A” and “I”, and the degree of difference among the subsequent adjacent characters is determined from the number of variations, such as 20 distinct characters including “n”.
- In that case, the character string “Iroha” is judged to be an independent term that is highly likely to be an important word, is determined to be a related term, and is stored in the storage unit as necessary. Whether the degree of difference among the preceding adjacent characters and the degree of difference among the subsequent adjacent characters are large is judged against a criterion that is either common to both or set individually for each.
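- The degree-of-difference idea described above can be illustrated with a minimal Python sketch. It assumes that the "degree of difference" is simply the number of distinct characters seen immediately before and immediately after each occurrence of a character string. The function names, the sample text, and the thresholds `min_before` and `min_after` are hypothetical and are not taken from the specification.

```python
def adjacent_character_variety(text, term):
    """Count distinct characters immediately before and after each occurrence of term."""
    before, after = set(), set()
    start = text.find(term)
    while start != -1:
        end = start + len(term)
        if start > 0:
            before.add(text[start - 1])      # preceding adjacent character
        if end < len(text):
            after.add(text[end])             # subsequent adjacent character
        start = text.find(term, start + 1)   # continue, allowing overlapping hits
    return len(before), len(after)


def looks_like_related_term(text, term, min_before=2, min_after=2):
    """Hypothetical criterion: both neighbor varieties must reach a threshold."""
    n_before, n_after = adjacent_character_variety(text, term)
    return n_before >= min_before and n_after >= min_after


sample = "xabcy zabcw qabcy"
print(adjacent_character_variety(sample, "abc"))  # (3, 2)
print(looks_like_related_term(sample, "abc"))     # True
print(looks_like_related_term(sample, "ab"))      # False: always followed by "c"
```

- In this sketch, "ab" is rejected because it is always followed by the same character, which suggests it is only a fragment of the longer string "abc".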
- the sentence analysis unit 43 includes a text data acquisition processing unit 43a, a character string search processing unit 43b, a different degree determination processing unit 43c, and a related term determination processing unit 43d. Further, the related term determination processing unit 43d includes a related term determination unit 43d-1 and a ranking management unit 43d-2.
- the text data acquisition processing unit 43a acquires the external log 502 (text data to be processed) and provides it to the character string search processing unit 43b (text data acquisition processing 520 shown in FIG. 25 described later).
- the character string search processing unit 43b performs a character string search process 530 shown in FIG.
- the difference degree determination processing unit 43c performs a difference degree determination process 540 shown in FIG.
- the related term determination processing unit 43d determines a related term and stores the determined related term in the related term dictionary 50 as necessary (related term determination processing 550 shown in FIG. 25).
- The related term determining unit 43d-1 determines whether the same character string is a related term from the degree of difference among the adjacent characters of the same character string included in the external log 502. When a plurality of related terms are determined in one external log 502, the ranking management unit 43d-2 ranks the related terms as necessary.
- the information update unit 46 includes a text data acquisition processing unit 46a, a character string extraction processing unit 46b, a dictionary comparison processing unit 46c, and a comparison result output unit 46d.
- the text data acquisition processing unit 46a acquires the external log 502 (text data to be processed) and provides it to the character string extraction processing unit 46b (text data acquisition processing 700 shown in FIG. 39, which will be described later).
- the character string extraction processing unit 46b extracts a related term from the external log 502 and stores it in the corresponding related term dictionary 50 (character string extraction processing 710 shown in FIG. 39).
- The processing performed by the character string extraction processing unit 46b is, for example, the same as the processing by the sentence analysis unit 43 described above.
- The dictionary comparison processing unit 46c compares the plurality of related term dictionaries 50 and stores the comparison result in the comparison result data 54 (dictionary comparison processing 720 shown in FIG. 39).
- The comparison result output unit 46d acquires a comparison result to be displayed from the comparison result data 54, and transmits input specifying information including the comparison result to the conversation control terminal device 2 ′′.
- FIG. 23 shows the screen transition of the FAQ search system.
- the user gives a predetermined instruction on the conversation control terminal device 2 ′′, displays the FAQ search screen 600 on the display, and inputs a desired search keyword (using a keyboard or the like) there.
- the FAQ search screen 600 is an input instruction screen as shown in FIG. 35A, for example, and a search keyword input unit 601 and a “FAQ search” button 602 are displayed on the FAQ search screen 600.
- The FAQ candidate display screen 610 is displayed.
- the FAQ candidate display screen 610 is, for example, a display screen as shown in FIG. 35B, and displays a related term index display portion 611, a candidate question sentence display portion 612, and a “return to FAQ search screen” button 613.
- the questions shown in the candidate question sentence display unit 612 are all related to “network”, and the search results based on the search keyword input by the user are displayed.
- the set of related terms shown in the related term index display unit 611 is a set of related terms included in the corresponding question.
- The FAQ display screen 630 is, for example, a display screen as shown in FIG. 36, on which a question display portion 631, a related term index display portion 632, an answer display portion 633, and a “return to FAQ candidate display screen” button 634 are displayed. When the user clicks the “return to FAQ candidate display screen” button 634 here, the display on the conversation control terminal device 2 ′′ returns to the FAQ candidate display screen 610.
- The related term / co-occurrence word list display screen 650 is displayed.
- The related term / co-occurrence word list display screen 650 is, for example, a display screen as shown in FIG. 37, on which a NO display unit 651, a related term display unit 652, neighborhood related term display units (653 to 656), and a “return to FAQ candidate display screen” button 657 are displayed.
- When the user clicks the “return to FAQ candidate display screen” button 657, the display on the conversation control terminal device 2 ′′ returns to the FAQ candidate display screen 610.
- The FAQ search screen 660 is displayed.
- The FAQ search screen 660 is, for example, a display screen as shown in FIG. 38B. It is substantially the same as the FAQ search screen 600 described above, which shows that the display has returned to the FAQ search screen.
- In the search keyword input unit 661 of the FAQ search screen 660, the related term selected on the related term / co-occurrence word list display screen 650 (for example, “SNS” in the example of FIG. 37) is set automatically as the search keyword.
- FIG. 24 is a flowchart showing the processing for displaying the FAQ candidate display screen, and shows what processing is performed in each of the conversation control terminal device 2 ′′ and the topic providing server 4 ′.
- Each process is performed by the above-described Topiclet 20, and the screens on the display of the conversation control terminal device 2 ′′ are presented by the Topiclet 20 or by a WEB browser that operates under the control of the Topiclet 20.
- In step S11, it is determined whether the user has clicked the “FAQ search” button 602 on the FAQ search screen 600. While the “FAQ search” button 602 is not clicked (NO), this determination is repeated.
- The search result is the result of a keyword search on a general Internet search site; the Topiclet 20 controls the search on this Internet search site and sends the search result to the topic providing server 4 ′ by API transmission.
- The search result is, for example, the addresses of the WEB pages hit by the keyword search.
- In step S13, the input information is analyzed, the address of the WEB page included in the input information is accessed, and the external log 502, which is the target text data, is acquired from the HTML data or the like corresponding to that WEB page.
- In step S14, the topic providing server 4 ′ executes sentence analysis processing on the acquired external log 502 and extracts related terms from the external log 502. The sentence analysis process will be described later in detail.
- In step S15, the topic providing server 4 ′ generates a related term dictionary 50 from the related terms extracted from the external log 502 in step S14.
- The related term dictionary 50 includes a related term index 50a for each piece of sentence information in the external log 502.
- In step S16, the topic providing server 4 ′ acquires from the related term dictionary 50 the related term index 50a and other information to be displayed on the FAQ candidate display screen 610, and transmits this information as input specifying information to the conversation control terminal device 2 ′′.
- When the conversation control terminal device 2 ′′ receives the input specifying information from the topic providing server 4 ′ (step S17), it determines the response information in step S18 based on the received input specifying information and the scenario data 28.
- The topic providing server 4 ′ transmits the scenario data 55 to the conversation control terminal device 2 ′′ as necessary, and the conversation control terminal device 2 ′′ stores it in the scenario data 28.
- In step S19, the response information determined in step S18 is displayed on the display of the conversation control terminal device 2 ′′.
- A FAQ candidate display screen 610 as shown in FIG. 35 (B) is displayed.
- a part of the collected question sentences (Q1, Q8, Q13, Q24, Q25) is displayed as a list of candidate question sentences on the candidate question sentence display unit 612.
- the related term index display unit 611 shows related term indexes corresponding to the question sentences displayed as candidate question sentences.
- the sentence analysis unit 43 acquires the external log 502 that is text data (text data acquisition processing 520).
- the external log 502 can be received from various data sources as described above.
- each WEB page is accessed based on the address of the WEB page received from the conversation control terminal device 2 ′′, and the text data is obtained from the corresponding HTML data or the like.
- the sentence analysis unit 43 searches for the same (common) character string from the external log 502 acquired by the text data acquisition process 520 (character string search process 530).
- This processing is, for example, processing for searching for and retrieving the same character string “Iroha” in the acquired external log 502. If 100 character strings “Iroha” exist in one text data, all of them are extracted. In text data, there may be a plurality of the same character strings other than “Iroha”. In this case, these character strings are similarly searched and extracted. For example, in addition to the character string “Iroha”, if a plurality of character strings “Nihoheto” are included, the character string is similarly extracted.
- The character string search processing 530 further stores each character string found as an identical character string in the search result data 48, together with the adjacent characters before and after it.
- For the character string “Iroha” in the above example, the data stored in the search result data 48 consists, for each of the 100 occurrences of “Iroha”, of the character string “Iroha”, the adjacent character before it, and the adjacent character after it.
- The character string “Ihohani” and the character string “Irohani” are likewise stored in the search result data 48, each together with its preceding adjacent character and its subsequent adjacent character.
- The reason the character string search processing 530 stores each identical character string together with the adjacent characters before and after it is ultimately to determine related terms that carry important meaning. For this reason, even if the same character string occurs several times in the external log 502, when its appearance frequency does not reach a predetermined frequency it can be judged at this point to have no possibility of being determined a related term, and the character string need not be stored in the search result data 48. This is because a word (character string) that appears only a few times in an external log 502 composed of many characters can be judged to be of low importance in the first place.
- In order to search for and store identical character strings from the external log 502, the character string search processing 530 of this embodiment uses a search data structure called a suffix array and searches it by binary search, so that identical character strings are found at high speed.
- The character string search processing 530 is performed by the method described above, but the same search can be performed by various other methods. The processing of the character string search processing 530 using the suffix array and binary search will be described in detail later.
- The sentence analysis unit 43 determines the degree of difference among the preceding adjacent characters and the degree of difference among the subsequent adjacent characters, using the character strings stored in the search result data 48 by the character string search processing 530 and the adjacent characters before and after them (difference degree determination processing 540).
- m (i, j ⁇ 1) (s (i), s (i + 1), s (i + 2),...
- m (i, j) ⁇ T (i) ⁇ S (i)
- T (i) ⁇ S (i)
- B (j) ⁇ S (j)
- m (i, j-1) ⁇ B (j + 1) ⁇ S (j + 1)
- m (i, j) ⁇ means a set of characters appearing immediately before the character string m (i, j) in common.
- s (i) ⁇ T (i) and s (j) ⁇ B (j) hold.
- The character string m (i, j) is judged to be a candidate for a related term.
- The degree of difference regarding the adjacent characters of an identical character string is determined from the appearance pattern of the characters adjacent before (or after) that character string, that is, from how many variations of adjacent characters appear.
- Once the degrees of difference regarding the adjacent characters before and after have been determined, they are stored in the related term candidate data 49 together with the corresponding character string. The difference degree determination process 540 will be described in detail later.
- the sentence analysis unit 43 determines whether or not the same character string is a related term based on the degree of difference regarding the adjacent characters before and after the same character string determined by the difference degree determination processing 540. If it is determined to be a related term, the character string is stored in the related term dictionary 50 (related term determination process 550).
- When the degree of difference is small, the adjacent character and the “same character string” can be considered to combine into another character string that is often used.
- When the degree of difference is large, the adjacent character and the “same character string” can be considered to be separate.
- In that case, the “same character string” is an independent term and is likely to be a highly important word. Further, whether the “same character string” is a related term can be determined in consideration of other factors in addition to the degree of difference among the preceding adjacent characters and the degree of difference among the subsequent adjacent characters. The related term determination process 550 will be described in detail later.
- Ranking can be performed among the character strings determined to be related terms.
- Such ranking is, for example, a ranking by the importance of a character string, and can be determined in consideration of other factors in addition to the degree of difference among the adjacent characters of the character string. For example, ranking can be performed based on the character length, the appearance frequency, and so on of the character string. Besides an ordinal ranking, a numerical rating can also be assigned so that relative differences can be expressed.
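- As a rough illustration of such ranking, the following Python sketch orders candidates by a composite key. The key itself (the smaller of the two neighbor varieties, then character length, then appearance frequency) is a hypothetical choice, and the candidate values are invented sample numbers; the specification only states that these factors can be taken into account.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    term: str
    n_before: int    # distinct preceding adjacent characters
    n_after: int     # distinct subsequent adjacent characters
    frequency: int   # number of occurrences in the text


def score(c):
    # Hypothetical ordering: weaker side of neighbor variety, then length, then frequency.
    return (min(c.n_before, c.n_after), len(c.term), c.frequency)


def rank(candidates):
    """Return candidates sorted from most to least important under score()."""
    return sorted(candidates, key=score, reverse=True)


# Invented sample values for illustration only.
candidates = [
    Candidate("net", 3, 2, 55),
    Candidate("setting", 9, 9, 25),
    Candidate("network", 12, 9, 40),
]
ranked = [c.term for c in rank(candidates)]
print(ranked)  # ['network', 'setting', 'net']
```

- Returning the tuple from `score` instead of only the position would give the numerical form of rating mentioned above.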
- FIG. 26 shows an external log 502 a that is an example of the external log 502.
- the external log 502a is a collection of only sentence information including “network” in the description part of the question sentence as a search result based on the search keyword.
- This is text data for solving problems written by various users on various servers on the Internet. Typical examples of these text data include WEB page (homepage) and blog text data published on the Internet, and TWITTER tweet information. Further, data generated and edited in advance by an arbitrary organization or text information in a database may be included.
- When the difference degree determination processing 540 and the related term determination processing 550 are performed, a plurality of related terms are extracted for each question sentence of the extracted sentence information, as shown in FIG.
- For the question sentence Q1, “network”, “trouble”, “response”, and “setting” are selected.
- “network”, which corresponds to the search keyword, is underlined, and the other related terms are enclosed in rectangles.
- {network, setting, trouble, correspondence} is shown as the set of extracted related terms corresponding to the description of the question Q1. This set of related terms is listed in the ranking order described above for the related term determination process 550.
- FIG. 26B An example of the related term dictionary 50 generated by the related term determining process 550 is shown in FIG.
- In the related term index 50a, the set of related terms for each question shown in FIG. 26B is stored as it is. In some cases it is sufficient to store only the related term index 50a as the related term dictionary 50. In this embodiment, however, the question sentence corresponding to the related term index is stored in the question sentence 50b, and the answer corresponding to the question sentence 50b is stored in the answer sentence 50c.
- Because the related term index 50a stored in the related term dictionary 50 is stored in association with the corresponding sentence information, related terms belonging to one set and related terms belonging to another set are associated with each other through the related terms common to both sets.
- For example, in FIG. 27B, if {network, setting, trouble, correspondence} is grasped as the set of related terms for the question Q1 and {network, setting, event} is grasped as the set of related terms for the question Q8, the two sets share the related term “setting” (leaving aside “network”, which is the search keyword).
- Through the related term “setting”, the related terms “trouble” and “correspondence” are understood to be related to the related term “event”, in that they all co-occur with the related term “setting” (such terms are also called co-occurrence related terms).
- Here, the relationship between related terms is found within the sentence information of the external log 502a collected with the search keyword “network”, but in some cases such a relationship can also be found among related terms in sentence information collected with a completely different search keyword. In this case, a potential topical relationship can also be found.
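- The association through common related terms can be sketched in Python as follows. The two sets are the ones given above for Q1 and Q8; the dictionary layout and the function name `co_occurrence_links` are assumptions made for illustration and are not the data format of the related term dictionary 50.

```python
from collections import defaultdict

# Related term sets for two questions, as given in the text above.
related_term_index = {
    "Q1": {"network", "setting", "trouble", "correspondence"},
    "Q8": {"network", "setting", "event"},
}


def co_occurrence_links(index, search_keyword):
    """Map each term shared by two or more sets to the terms it co-occurs with."""
    questions_by_term = defaultdict(set)
    for question_id, terms in index.items():
        for term in terms:
            if term != search_keyword:        # the search keyword is in every set
                questions_by_term[term].add(question_id)

    links = {}
    for shared_term, question_ids in questions_by_term.items():
        if len(question_ids) < 2:
            continue                          # not common to multiple sets
        linked = set()
        for question_id in question_ids:
            linked |= index[question_id]
        linked -= {shared_term, search_keyword}
        links[shared_term] = sorted(linked)
    return links


links = co_occurrence_links(related_term_index, "network")
print(links)  # {'setting': ['correspondence', 'event', 'trouble']}
```

- The result shows that “trouble”, “correspondence”, and “event” are connected to one another only through “setting”, which matches the relationship described above.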
- FIG. 28 is a flowchart illustrating the processing procedure of the character string search processing 530.
- FIGS. 29 and 30 are diagrams showing a mechanism for character string search using a suffix array and binary search.
- Here, “this code is in the code list”, which is a part of the character string of the external log 502, is set as the search target text data 502-1. Normally, the entire text portion of the external log 502 is the search target, but only a part of the text portion is used here for illustration.
- A suffix array is created in order to search for identical character strings in the character string “this code is in the code list”.
- First, suffixes are expanded for every position from the first character to the last character of the text data.
- From the search target text data 502-1 (“this code is in the code list”) shown in FIG. 29A, suffixes with indexes from 1 to 15 are expanded as shown in FIG. 29B.
- Each suffix is a character string from the index position (start character position) to the end in the search target text data 502-1.
- For the index “1”, the character string “This code is in the code list” from the first character to the end (15th character) of the search target text data 502-1 is shown.
- For the index “10”, the character string “in the list” from the 10th character to the end (15th character) of the search target text data 502-1 is shown.
- For the last index “15”, the last character “RU” is shown.
- In step S22 of FIG. 28, the expanded suffixes are sorted in a predetermined order to create a suffix array.
- the suffixes shown in FIG. 29 (B) are sorted, and the suffix array after sorting is shown in FIG. 29 (C).
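- The expansion and sorting of suffixes can be sketched in Python as follows. This is a minimal illustration that sorts by Python's default string order (Unicode code points), whereas the "predetermined order" in the specification may differ; the ASCII sample text is a stand-in, not the text of FIG. 29, and the function name is hypothetical.

```python
def build_suffix_array(text):
    """Return the 1-based start indexes of all suffixes, sorted by suffix content."""
    indexes = range(1, len(text) + 1)
    # The suffix for index i is the string from the i-th character to the end.
    return sorted(indexes, key=lambda i: text[i - 1:])


text = "abcab"  # stand-in sample text
suffix_array = build_suffix_array(text)
for index in suffix_array:
    print(index, text[index - 1:])
# 4 ab
# 1 abcab
# 5 b
# 2 bcab
# 3 cab
```

- Sorting whole suffixes this way is simple but costs more time and memory than dedicated suffix array construction algorithms, so it is suitable only for short texts like the illustration.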
- Search character strings are determined from the text data one by one in sequence. Here, to search for whether identical character strings exist in the text data, all partial character strings in the text data are used as search character strings and collated with the text data. For example, for the search target text data 502-1 in FIG. 29, the search character strings range from the one-character strings “ko” to “ru”, through the two-character strings (“this”, “noko”, ..., “Yes”) and the three-character strings (“Kon”, ..., “In”), and so on, up to the 15-character string “This code is in the code list”. However, the one-character and 15-character search character strings can be omitted here, because collating them with the search target text data 502-1 is meaningless. In addition, the character length of the search character string can be limited to a predetermined range.
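- The enumeration of search character strings can be sketched in Python as follows. The sketch assumes that single characters and the full text are skipped by default, as the text above permits, and that duplicates are generated only once; the function name and the `min_len` and `max_len` parameters are hypothetical.

```python
def candidate_search_strings(text, min_len=2, max_len=None):
    """List every distinct substring whose length lies within [min_len, max_len]."""
    if max_len is None:
        max_len = len(text) - 1          # skip the full text itself
    seen = set()
    result = []
    for length in range(min_len, max_len + 1):
        for start in range(len(text) - length + 1):
            substring = text[start:start + length]
            if substring not in seen:    # search each distinct string only once
                seen.add(substring)
                result.append(substring)
    return result


print(candidate_search_strings("abab"))  # ['ab', 'ba', 'aba', 'bab']
```

- Because the number of substrings grows quadratically with the text length, limiting the length range as described above keeps the number of searches manageable.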
- In step S24 of FIG. 28, it is determined whether the search has been completed for all the search character strings to be searched.
- If all have been completed, the character string search process ends. If not, the suffix array is searched using the search character string as a search key in step S25.
- FIG. 30 shows the search process in the case where, as shown in FIGS. 30A and 30B, the three-character string “code” is determined as the search character string from the search target text data 502-1. At this time, a binary search is performed on the suffix array of FIG.
- The suffix array in FIG. 30C is the same as the suffix array shown in FIG. 29C, and FIG. 30C shows the search process by binary search.
- Since the JIS code of “ru” is 246B and the JIS code of “ko” is 2533, “ko” is larger.
- the JIS code for “G” is 2548
- the JIS code for “G” is 2533
- step S26 of FIG. 28 it is determined whether or not a predetermined number of search character strings have been hit. If it is determined that a predetermined number of search character strings do not hit, the process proceeds to step S23 in order to perform a search by the next search character string without using the character string as a candidate for a related term.
- the predetermined number can be determined based on various factors such as the number of characters in the text data 502-1 to be searched and the number of characters in the search character string.
- Failing to reach the predetermined number of hits means that the search character string appears infrequently in the text data 502-1 to be searched and is not an important word. Note that, at this stage, the character string can instead be stored as a related term candidate without its appearance frequency being evaluated, with the final determination left to a subsequent related term determination process or the like.
- If it is determined in step S26 that the search character string has been hit the predetermined number of times, the process proceeds to step S27, where the character string that matches the search key (search character string) is set as a related term candidate and stored in the related term candidate data 49 as one record.
- In record (3) and record (4), a character string identical to the search character string “code” is found at the head.
- the character string “code” and adjacent characters before and after are stored as one record. For example, for record (3), “GA”, “CODE”, and “RE” are stored as one record. The preceding adjacent character is “GA”, and the subsequent adjacent character is “RE”.
- For record (4), “no”, “code”, and “ga” are stored as one record. The preceding adjacent character is “no” and the subsequent adjacent character is “ga”.
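The record layout described above (a matched character string stored together with its preceding and subsequent adjacent characters) can be sketched as follows. This is an illustration only: it locates occurrences with `str.find` instead of the suffix array, and the name `adjacent_records` and the parameter `min_hits` (standing in for the "predetermined number" of hits) are assumptions.

```python
def adjacent_records(text, key, min_hits=2):
    """Return (previous char, key, next char) for each occurrence of `key`.

    An empty list is returned when `key` occurs fewer than `min_hits`
    times, i.e. when it would not become a related term candidate.
    An empty string marks a missing neighbour at the start or end of text.
    """
    records = []
    pos = text.find(key)
    while pos != -1:
        end = pos + len(key)
        prev_char = text[pos - 1] if pos > 0 else ""
        next_char = text[end] if end < len(text) else ""
        records.append((prev_char, key, next_char))
        pos = text.find(key, pos + 1)
    return records if len(records) >= min_hits else []


records = adjacent_records("xcodey codez", "code")
```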
- The sentence analysis unit 43 in this embodiment is configured to find the same character string in the text data at high speed using the suffix array and the binary search as described above, but it is not limited to this processing method. The same character string can also be found in the text data by other methods.
- FIG. 31 is a flowchart showing a processing procedure of the difference degree determination processing 540.
- FIG. 32 is a diagram showing the mechanism for determining the degree of difference of the preceding and subsequent adjacent characters. It assumes that the character string search process has been performed on the target character string with the search character string “code”, that 26 occurrences of “code” have been obtained as search results, and that the corresponding 26 records are processed.
- From the character string records (including the preceding and subsequent adjacent characters) stored in the search result data 48 by the character string search process 530, the records relating to one character string are extracted.
- a state is shown in which records (26 records in total) for the character string “code” stored in the search result data 48 are extracted and expanded in the memory.
- In step S32 of FIG. 31, if all the character string records stored in the search result data 48 have been acquired and it is determined that no data remains to be subjected to the difference degree determination process, the difference degree determination process ends. If processing is not yet complete and all the records for one of the character strings stored in the search result data 48 have been acquired, the process proceeds to step S33.
- In step S33 of FIG. 31, all records acquired for one of the character strings stored in the search result data 48 are sorted by the previous adjacent character, and the number of appearance character patterns of the previous adjacent character is determined.
- The result of sorting the records (26 in total) acquired for the character string “code” stored in the search result data 48 by the previous adjacent character 561 is shown.
- This sorting can be performed using a character code (for example, JIS code) as in the case of creating a suffix array in the character string search process described above.
- The number of appearance character patterns of the previous adjacent character 561 can be determined by counting the number of times the value of the previous adjacent character 561 changes (breaks) between records.
- The previous adjacent characters 561 form 13 patterns: “”, “(”, “,”, ““”, “ga”, “ta”, “de”, “do”, “no”, “ha”, “be”, “mo”, and “ri”.
- The sentence analysis unit 43 of this embodiment sorts the records by the previous adjacent character 561 and obtains the number of appearance character patterns of the previous adjacent character from whether or not the value changes between records.
- However, the present invention is not limited to this processing method.
- the number of appearance character patterns can be obtained by various other methods.
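The sort-and-count-breaks approach described above can be sketched as follows. The function name `count_patterns` is hypothetical, and the sort uses Python's default string ordering rather than JIS codes. Because the first record always starts a pattern, the pattern count is the number of breaks plus one.

```python
def count_patterns(adjacent_chars):
    """Count distinct adjacent characters by sorting and counting value changes."""
    ordered = sorted(adjacent_chars)
    if not ordered:
        return 0
    # A "break" is a position where the value differs from the previous record.
    breaks = sum(1 for a, b in zip(ordered, ordered[1:]) if a != b)
    return breaks + 1


n_patterns = count_patterns(["ga", "no", "ga", "", "no", "("])
```

As the text notes, other methods give the same count; for example, `len(set(adjacent_chars))` is equivalent in this sketch.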
- uppercase and lowercase letters can be treated as the same character, or they can be treated as different characters.
- In sorting, 1-byte code characters such as single-byte alphanumeric characters are sorted by the corresponding 1 byte, and 2-byte code characters such as Kanji are sorted by the corresponding 2 bytes. In this embodiment, the previous adjacent character 561 is sorted as one adjacent character, but it is also possible to sort by two or more characters and determine the degree of difference for them.
- The degree of difference is an index of how much the previous adjacent character 561 varies (among the 26 records in FIG. 32A), and it is therefore determined based on the number of appearance character patterns of the previous adjacent character described above.
- The degree of difference may be the number of patterns itself, but it can also be determined in consideration of, for example, the appearance frequency of the character string (in the case of FIG. 32A, the appearance frequency of “code” is 26).
- the degree of difference can be evaluated in a plurality of stages (for example, three stages) using a predetermined threshold.
- arbitrary weighting can be performed with respect to the count of the character and determination of the degree of difference.
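One possible way to turn the pattern count into a staged degree of difference is sketched below. The text does not fix a formula, so the ratio of pattern count to appearance frequency, the three stages, and the threshold values are all assumptions made for illustration.

```python
def difference_degree(pattern_count, frequency, thresholds=(0.2, 0.5)):
    """Return (ratio, stage); stage is 0 (low), 1 (medium) or 2 (high).

    ratio = pattern_count / frequency is a hypothetical normalisation;
    `thresholds` are hypothetical stage boundaries.
    """
    if frequency <= 0:
        raise ValueError("frequency must be positive")
    ratio = pattern_count / frequency
    stage = sum(ratio >= t for t in thresholds)
    return ratio, stage


# 13 patterns among 26 records, as in the example of FIG. 32A.
ratio, stage = difference_degree(13, 26)
```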
- In step S35 of FIG. 31, all records acquired for one of the character strings stored in the search result data 48 are sorted by the subsequent adjacent character, and the number of appearance character patterns of the subsequent adjacent character is determined.
- FIG. 32(B) shows the result of sorting the records (26 in total) acquired for the character string “code” stored in the search result data 48 by the subsequent adjacent character 563.
- the records 565 and 566 shown in FIG. 32A are arranged at the positions indicated by dotted arrows.
- This sort can be performed using a character code (for example, JIS code) as in the case of creating a suffix array in the character string search process described above.
- The number of appearance character patterns of the subsequent adjacent character 563 can be determined by counting the number of times the value of the subsequent adjacent character 563 changes (breaks) between records.
- The subsequent adjacent characters 563 are “,”, ““”, “””, “ga”, “de”, “to”, “ni”, “no”, “ha”, “o”, “Li”, and “Branch”.
- The sentence analysis unit 43 of this embodiment sorts the records by the subsequent adjacent character 563 and obtains the number of appearance character patterns of the subsequent adjacent character from whether or not the value changes between records.
- the number of appearance character patterns can be obtained by various other methods.
- the predetermined character can be excluded from the count of the number of patterns.
- sorting for 1 byte code characters such as single-byte alphanumeric characters, the corresponding 1 byte is sorted, and for 2 byte code characters such as Kanji, the corresponding 2 bytes are sorted.
- One adjacent character is sorted. However, two or more characters can also be sorted and the degree of difference determined for them.
- the degree of difference regarding the subsequent adjacent character 563 is determined.
- The degree of difference is an index of how much the subsequent adjacent character 563 varies (among the 26 records in FIG. 32B), and it is determined based on the number of appearance character patterns of the subsequent adjacent character.
- The degree of difference may be the number of patterns itself, but it can also be determined in consideration of, for example, the appearance frequency of the character string (in the case of FIG. 32B, the appearance frequency of “code” is 26).
- the degree of difference can be evaluated in a plurality of stages (for example, three stages) using a predetermined threshold.
- arbitrary weighting can be performed with respect to the count of the character and determination of the degree of difference.
- In step S37 of FIG. 31, the character string under determination is stored in the related term candidate data 49 together with the degree of difference of the previous adjacent character and the degree of difference of the subsequent adjacent character determined for it.
- After step S37 in FIG. 31, the process returns to step S31, and the next “same character string” is processed.
- the related term determination process 550 determines whether or not the corresponding character string is a related term, for example, according to the degree of difference between the adjacent character before and after.
- the degree of difference between the adjacent characters before and after can be determined according to the same or different criteria.
- When the degree of difference of the preceding and subsequent adjacent characters is determined to be of a predetermined size, the corresponding character string is regarded as an independent term and is determined to be a related term, that is, an important word for identifying a topic. That is, a score may be calculated based on the degree of difference of the preceding and subsequent adjacent characters, and whether or not the corresponding character string is a related term may be determined based on that score.
- In the related term determination process 550, the score may be calculated in consideration of, in addition to the degree of difference of the preceding and subsequent adjacent characters, the character length of the corresponding character string, its appearance frequency, the probability or frequency with which a specific character appears as the previous adjacent character, the probability or frequency with which a specific character appears as the subsequent adjacent character, and the probability or frequency with which a specific character combination appears as the preceding and subsequent adjacent characters. Whether or not the corresponding character string is a related term may then be determined based on the calculated score.
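A score-based determination of the kind described above might look like the following sketch. The weights, the threshold, and the choice to combine the two degrees of difference with `min` (so that a string must vary on both sides to score well) are assumptions made here; the text leaves the scoring formula open.

```python
def related_term_score(prev_degree, next_degree, length, frequency,
                       w_degree=1.0, w_length=0.1, w_freq=0.01):
    """Combine the degrees of difference with length and frequency (hypothetical weights)."""
    return (w_degree * min(prev_degree, next_degree)
            + w_length * length
            + w_freq * frequency)


def is_related_term(score, threshold=0.5):
    """Treat a character string as a related term when its score reaches the threshold."""
    return score >= threshold


score = related_term_score(prev_degree=0.5, next_degree=0.4, length=3, frequency=26)
decision = is_related_term(score)
```

The same score can also serve as the numerical value for ranking the determined related terms.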
- Ranking can be performed among the character strings determined as related terms. Such ranking is, for example, ranking by the importance of a character string with respect to a topic.
- The rank is determined based on a score calculated from the degree of difference of the preceding and subsequent adjacent characters,
- or on the basis of a score calculated taking into account various other factors.
- Ranking is used not only to order the character strings determined as related terms by importance, but also to indicate the relative importance between related terms; for example, ranking can be expressed in specific numerical values by using the above-mentioned score or the like.
- Such ranking is performed when a plurality of related terms are determined. It can be performed for a plurality of related terms determined for one text data, or for a plurality of related terms determined for a plurality of text data grouped according to a predetermined condition.
- As such grouped text data, the text data of a group of WEB pages hit by an input search keyword, the contents of TWITTER posts by users having a predetermined attribute, and the like can be considered.
- FIG. 33 is a flowchart showing the display processing of the FAQ display screen, and shows what processing is performed in the conversation control terminal device 2 ′′ and the topic providing server 4 ′.
- each process is performed by the above-described Topiclet 20.
- In step S41, it is determined whether or not the user has selected one of the candidate question sentences displayed on the candidate question sentence display unit 612 on the FAQ candidate display screen 610 shown in FIG. 35B by clicking the mouse or the like. While no candidate question sentence is selected (NO), this determination is repeated.
- In step S42, the selected candidate question sentence is transmitted as input information to the topic providing server 4 '.
- the input information can include the selected question text itself, but it is sufficient if the input information includes an identifier that can identify the question text. Note that the user can select a plurality of question sentences of interest simultaneously.
- In step S43, the input information is analyzed, and an answer sentence corresponding to the question sentence included in the input information is acquired from the related term dictionary 50.
- a question sentence 50b and an answer sentence 50c corresponding to the question sentence 50b are stored in the related term dictionary 50.
- The answer sentence 50c can be stored in another file while being associated with the related term index 50a of the related term dictionary 50.
- In step S44, the topic providing server 4 ′ stores information including the answer sentence 50c corresponding to the question sentence 50b acquired from the related term dictionary 50, which is to be displayed on the FAQ display screen 630, in the related term / co-occurrence word data 52, and transmits the information to the conversation control terminal device 2 ′′ as input specifying information.
- When the conversation control terminal device 2 ′′ receives the input specifying information from the topic providing server 4 ′ (step S45), it determines response information based on the received input specifying information and the scenario data 28 in step S46.
- the topic providing server 4 ′ transmits the scenario data 55 to the conversation control terminal device 2 ′′ as necessary, and the conversation control terminal device 2 ′′ stores this in the scenario data 28.
- In step S47, the response information determined in step S46 is displayed on the display of the conversation control terminal device 2 ′′.
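The exchange in steps S42 to S47 can be sketched as two plain functions, one per side. Everything concrete here is hypothetical: the dictionary layout, the field names, the `faq_display` scenario key, and the sample strings are placeholders, and the real terminal and server communicate over a network rather than by direct calls.

```python
def server_handle_question(input_info, related_term_dictionary):
    """Steps S43-S44: look up the selected question and build input specifying information."""
    entry = related_term_dictionary[input_info["question_id"]]
    return {
        "question": entry["question"],
        "index": ", ".join(entry["index"]),
        "answer": entry["answer"],
    }


def terminal_determine_response(input_specifying_info, scenario_data):
    """Step S46: choose a display template from the scenario data and fill it in."""
    template = scenario_data["faq_display"]
    return template.format(**input_specifying_info)


# Hypothetical data standing in for the related term dictionary 50 and scenario data 28.
dictionary = {
    "Q24": {"index": ["network", "router"],
            "question": "sample question",
            "answer": "sample answer"},
}
scenario = {"faq_display": "Q: {question}\nTerms: {index}\nA: {answer}"}

info = server_handle_question({"question_id": "Q24"}, dictionary)  # S42-S44
response = terminal_determine_response(info, scenario)             # S45-S46
```

Sending only an identifier such as `question_id` reflects the remark that the input information need not contain the question text itself.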
- For example, suppose that one of the question sentences listed in the candidate question sentence display unit 612 (for example, the question Q24 indicated by the arrow (1)) is selected on the FAQ candidate display screen 610 shown in FIG. 35B.
- Then an FAQ display screen 630 as shown in FIG. 36 is displayed, where the question display unit 631 displays the selected question sentence Q24, the related term index display unit 632 displays the related term index corresponding to the question Q24, and the answer display unit 633 displays the answer corresponding to the question Q24 (the answer A24).
- In this way, the user can search the FAQ with a search keyword of the user's choosing and display a plurality of candidate question sentences as search results. Furthermore, by looking at the related term index, the user can easily grasp which important keywords appear in each candidate question sentence (that is, what matters it relates to).
- The FAQ candidate display screen 610 does not display the answer sentence corresponding to each question sentence, but the corresponding answer sentences can also be displayed at the stage where the candidate question sentences are listed.
- FIG. 34 is a flowchart showing the display processing of the related term / co-occurrence word list screen, and shows what processing is performed in the conversation control terminal device 2 ′′ and the topic providing server 4 ′.
- each process is performed by the above-described Topiclet 20.
- In step S51, it is determined whether or not the user has selected one of the related term indexes displayed on the related term index display unit 611 on the FAQ candidate display screen 610 shown in FIG. 35B by clicking the mouse or the like. While no related term index is selected (NO), this determination is repeated.
- In step S52, the selected related term index is transmitted as input information to the topic providing server 4 '.
- the input information may include the selected related term index itself, but it is sufficient if the input information includes an identifier that can identify this related term index. Note that the user can select a plurality of related terminology indexes of interest simultaneously.
- In step S53, the input information is analyzed, and all related term indexes that contain the related term index included in the input information are acquired from the related term dictionary 50.
- In step S54, the topic providing server 4 ′ performs preference analysis on all related terms included in the related term indexes acquired in step S53, based on the internal log 506 such as the preference data 51.
- the preference data 51 is a log file that stores data indicating usage modes, such as what related terms have been used so far by each user, for example, as shown in FIG.
- In the preference data 51, the usage date and time and the detailed usage contents of the related terms can also be stored, and the preference analysis can take this information into account.
- The topic providing server 4 ′ performs preference analysis based on the preference data 51 for all related terms included in the related term index, and determines the importance of each related term. For example, with reference to the preference data 51, the importance is set so that, for the same user, the more frequently a related term has been used, the higher its importance.
- the related terms associated with such importance correspond to the topic key (cluster) 504 described above.
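The frequency-based preference analysis described above can be sketched as follows. The log format (a list of `(user, related term)` pairs) and the function names are assumptions; the actual preference data 51 may also carry dates and usage details, which this sketch ignores.

```python
from collections import Counter


def importance_by_usage(preference_log, user, related_terms):
    """Set each related term's importance to how often `user` has used it."""
    counts = Counter(term for u, term in preference_log if u == user)
    return {term: counts[term] for term in related_terms}


def order_by_importance(importance):
    """Most important first; ties are broken alphabetically for a stable order."""
    return sorted(importance, key=lambda term: (-importance[term], term))


# Hypothetical log entries standing in for the preference data 51.
log = [("u1", "network"), ("u1", "router"), ("u1", "network"), ("u2", "modem")]
importance = importance_by_usage(log, "u1", ["network", "router", "modem"])
ordered = order_by_importance(importance)
```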
- In step S55, the topic providing server 4 ′ further performs topic analysis on the related terms whose importance was set in step S54, based on the topic material 507 such as the topic data 53.
- the topic data 53 may be a topic input by the person in charge of the contractor or a topic automatically extracted from the external log 502 based on the topic input by the person in charge. Based on such topic data 53, the distribution is captured, and the distribution of related terms associated with each other is provided to the user. For example, it is possible to associate a related term that is a topic in the FAQ with its co-occurrence word and express how the related term is distributed in the topic.
- the topic can be adjusted so as to limit the target question sentence to a predetermined range.
- In step S56, the topic providing server 4 ′ transmits information including the related terms finally associated in step S55 to the conversation control terminal device 2 ′′ as input specifying information, for display on the related term / co-occurrence word list display screen 650.
- When the conversation control terminal device 2 ′′ receives the input specifying information from the topic providing server 4 ′ (step S57), it determines response information based on the received input specifying information and the scenario data 28 in step S58.
- the topic providing server 4 ′ transmits the scenario data 55 to the conversation control terminal device 2 ′′ as necessary, and the conversation control terminal device 2 ′′ stores this in the scenario data 28.
- In step S59, the response information determined in step S58 is displayed on the display of the conversation control terminal device 2 ′′.
- a related term / co-occurrence word list display screen 650 as shown in FIG. 37 is displayed on the display of the conversation control terminal device 2 ′′.
- On this screen, a NO display unit 651, a related term display unit 652, neighborhood related term display units (653-656), and a “return to FAQ candidate display screen” button 657 are displayed.
- Related terms are displayed in the form of a two-dimensional matrix.
- The related term display unit 652 lists, in order and without duplication, all related terms that were extracted as related terms for the FAQ search of this embodiment and that appear in the related term index of the related term dictionary 50.
- The display order is set according to the importance of each related term determined by the preference analysis. In this embodiment, the smaller the number shown in the NO display unit 651 (that is, the higher the row appears in FIG. 37), the higher the importance of the corresponding related term. In FIG. 37, the numbers shown in the NO display unit 651 are 1 to 17, but more related terms can be browsed by operating the slider on the related term / co-occurrence word list display screen 650 to move downward.
- Neighborhood related term display units 653 to 656 are shown to the right of the related term displayed on the related term display unit 652, and they display the neighborhood related terms of the related term displayed in the related term display unit 652.
- Here, when attention is given to a related term A, a related term set that includes the related term A is its neighborhood, and a related term in that set is called a neighborhood related term.
- the related term set is a set of related terms included in a certain topic, and here, the related term index corresponds to this.
- Only four neighborhood related terms, neighborhood related terms 1 to 4, are shown; however, further neighborhood related terms can be browsed by operating the slider bar on the related term / co-occurrence word list display screen 650 to the right.
- A neighborhood related term can be said to be a related term that co-occurs with the related term displayed in the related term display unit 652 (a co-occurrence related term, that is, a related term appearing together with it in the same topic).
- The display order of the neighborhood related terms (in the horizontal direction) is adjusted so that the stronger the co-occurrence relationship, that is, the more frequently a neighborhood related term appears together with the related term displayed in the related term display unit 652, the closer to that related term it is displayed. Further, the display order in the horizontal direction can be determined in consideration of ranking of related terms, setting by a user or an information search system, and the like.
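The horizontal ordering of neighborhood related terms by co-occurrence strength can be sketched as follows. The input is assumed to be a list of related term indexes (each a list of terms belonging to one topic); the function name and the alphabetical tie-break are choices made for this illustration.

```python
from collections import Counter
from itertools import permutations


def neighborhood_terms(related_term_indexes):
    """Map each related term to its neighbours, strongest co-occurrence first."""
    cooccurrence = {}
    for index in related_term_indexes:
        # Every ordered pair of distinct terms in one index co-occurs once.
        for a, b in permutations(set(index), 2):
            cooccurrence.setdefault(a, Counter())[b] += 1
    return {
        term: [n for n, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]
        for term, counts in cooccurrence.items()
    }


# Hypothetical related term indexes.
indexes = [["network", "router", "modem"], ["network", "router"], ["network", "SNS"]]
neighbours = neighborhood_terms(indexes)
```

Each row of the two-dimensional matrix then corresponds to one key of the result, with its neighbours laid out left to right.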
- On the related term index display unit 611 of the FAQ candidate display screen 610 shown in FIG. 35B, the related terms included in the related term index (in this embodiment, for example, “network”, “connection”, “router”, and “modem”) are highlighted so that the user can easily recognize them.
- Through this display, the user is presented with other related terms associated with a related term (search keyword), in light of how the user has used related terms so far and how close they are to the topic that the user or the information retrieval system is focusing on or recommending, and can thereby gain “awareness” of new related terms.
- For example, when the related term matrix is displayed for the search keyword “network”, the rows numbered 1 to 13 and 15 on the NO display unit 651 are generally topics related to communication networks.
- By contrast, the rows numbered 14, 16, and 17 on the NO display unit 651 are topics related to social networks, and it can be seen that a different topic has appeared.
- When a related term is selected, the screen automatically transitions to a FAQ search screen 660 as shown in FIG. 38 (B), where the selected related term (in this example, the neighborhood related term “SNS”) is automatically set in the search keyword input part 661 of the FAQ search screen 660.
- the FAQ candidate display screen 610 is displayed again, and the question sentence regarding “SNS” is displayed in the candidate question sentence display unit 612 this time.
- The information updating unit 46 extracts character strings whose meaning can be identified from each external log 502 (text data) collected under different collection conditions, and stores the extracted character strings in the related term dictionary corresponding to that text data.
- the comparison result obtained by performing the comparison process on these related terminology dictionaries is stored in the comparison result data 54 and updated. The comparison process is automatically performed when the related term dictionary is updated.
- Text data associated with different related terminology dictionaries is text data collected under different collection conditions. These text data are, for example, a plurality of text data collected at different timings for the same target or data source. Or a plurality of text data collected according to different subjects and search conditions at the same timing.
- The above comparison process compares a plurality of related term dictionaries and determines the appearance status of each related term: newly appearing, disappeared, commonly appearing, or (when three or more time-series text data are handled) reappearing.
- If a related term falls under one of these appearance statuses, the related term is stored as a comparison result.
- these related terms can be associated as one set (as a co-occurrence related term) and stored in a corresponding related term dictionary.
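The appearance statuses described above map naturally onto set operations. The sketch below assumes each related term dictionary is reduced to a set of terms and that the dictionaries are supplied oldest first; the function name and the status labels are hypothetical.

```python
def compare_dictionaries(history):
    """Compare the two most recent related term sets in `history` (oldest first).

    Terms from older sets are used only to tell "new" from "reappeared".
    """
    if len(history) < 2:
        raise ValueError("at least two dictionaries are required")
    previous, current = history[-2], history[-1]
    earlier = set().union(*history[:-2]) if len(history) > 2 else set()
    appeared = current - previous
    return {
        "new": appeared - earlier,
        "disappeared": previous - current,
        "common": current & previous,
        "reappeared": appeared & earlier,
    }


# Three hypothetical time-series snapshots.
result = compare_dictionaries([{"a", "b"}, {"b", "c"}, {"a", "c", "d"}])
```

With only two snapshots, `earlier` is empty, so nothing can be classified as reappeared, which matches the remark that reappearance requires three or more time-series text data.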
- the information update unit 46 acquires the external log 502 that is text data (text data acquisition processing 700).
- the external log 502 is collected by the crawler 730, for example.
- the crawler 730 returns the network address (URL or the like) of the WEB page
- the external log 502 can be acquired by accessing the network address.
- it is possible to perform a filtering process so as to acquire only specific text data, or to perform grouping according to a specific classification.
- The crawler 730 is automatically activated at a predetermined time, for example, and performs topic analysis on determined topic names (that is, it performs a search and periodically collects topics).
- The topic name is stored in, for example, a service (an area assigned to a service ID corresponding to each topic handled by the user) that holds the related term dictionary 50. If the user wants to handle 10 topics, the 10 topic names are handled using individual services. In addition, a corresponding topic chip can be set for each of these services so that each topic chip constantly collects information on its topic, and related topic chips can be linked and integrated according to user input to realize more diverse topic providing services.
- the search by the crawler 730 accesses an existing Internet search site on the Internet, and designates a search keyword there, thereby receiving a search result from the search server of the Internet search site.
- The search result includes, for example, the addresses of WEB pages whose content matches or is similar to the search keyword (the address of WEB page 1, the address of WEB page 2, the address of WEB page 3, ..., the address of WEB page X).
- The crawler 730 acquires a search result by executing a search on an existing Internet search site,
- but it can also acquire the addresses of WEB pages that satisfy a predetermined condition by various other methods.
- The search target is not limited to WEB pages on the Internet; it may also be TWITTER tweet information, data (in a database or local) generated and edited in advance by any organization, or text information in a database.
- In response to a search request (or through periodic collection activities performed in advance), a search engine provided in the search server used by the Internet search site collects the addresses of WEB pages that match or are similar to the search keyword from data sources on the Internet.
- the crawler 730 transmits the search result to the topic providing server 4 ′ (for example, by API transmission) from the computer on which the crawler 730 operates.
- the crawler 730 can be configured to exclude a search result that satisfies a predetermined condition using a filter.
- The crawler 730 is automatically activated at a predetermined time, but its operation may instead be controlled by the topic providing server 4 ′ so that the search result is acquired under that control. Further, the crawler 730 may acquire search results at predetermined intervals and hold them in the computer on which it operates, and the topic providing server 4 ′ may access that computer at a necessary timing to obtain the search results. The crawler 730 may also be configured to be executed by the topic providing server 4 '.
- The computer on which the crawler 730 operates transmits the addresses of the WEB pages related to the search keyword to the topic providing server 4 ′ as a search result.
- Alternatively, that computer can access these WEB pages and transmit the text data obtained as a result to the topic providing server 4 ′ as the external log 502.
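The collection flow described above (search, filter, then fetch page text as the external log) can be sketched without any real network access by passing the search and fetch operations in as callables. The names `collect_external_log`, `search`, `fetch`, and `exclude` are all hypothetical; no particular search site API is implied.

```python
def collect_external_log(keyword, search, fetch, exclude=lambda url: False):
    """Return a list of {"address", "text"} records for pages matching `keyword`.

    search(keyword) -> iterable of page addresses (hypothetical callable)
    fetch(address)  -> page text                  (hypothetical callable)
    exclude(address) -> True to drop a result     (the optional filter)
    """
    addresses = [url for url in search(keyword) if not exclude(url)]
    return [{"address": url, "text": fetch(url)} for url in addresses]


# Stand-in callables so the sketch runs offline.
def fake_search(keyword):
    return ["http://example.com/1", "http://example.com/ads", "http://example.com/2"]


def fake_fetch(url):
    return "text of " + url


log = collect_external_log("network", fake_search, fake_fetch,
                           exclude=lambda url: url.endswith("/ads"))
```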
- The information update unit 46 extracts character strings whose meaning can be identified from the external log 502 acquired by the text data acquisition process 700, and stores the extracted character strings in the related term dictionary 50 (character string extraction process 710).
- the character strings extracted in this way correspond to the above-mentioned related terms, and these related terms are respectively stored in the related term dictionary 1 to 3 corresponding to the external log 502 from which the related terms are extracted.
- Various methods can be considered as a method for extracting a related term from the external log 502.
- The related terms can be extracted by the method of the sentence analysis unit 43 described above.
- In the character string extraction process 710, related terms are extracted from the plurality of external logs 502 and stored in the corresponding related term dictionaries 50, respectively.
- The plurality of external logs 502 may be text data collected at different timings for the same target or data source, or may be a plurality of text data collected by different subjects or search conditions at the same timing. Detailed processing of the character string extraction process 710 will be described later.
- The information updating unit 46 compares the plurality of related term dictionaries 50 in which related terms were stored by the character string extraction process 710, and stores the comparison result in the comparison result data 54 according to the appearance status of each related term (dictionary comparison process 720).
- The topic providing server 4 ′ transmits input specifying information including the comparison result data 54 to the conversation control terminal device 2 ′′.
- When the conversation control terminal device 2 ′′ receives the input specifying information, it determines response information based on the input specifying information and the scenario data 28. Then, control is performed so that the response information is displayed on the display of the conversation control terminal device 2 ′′.
- a topic name and a change in a related term in this topic are displayed.
- the display of the change of the related term includes, for example, the appearance status of the related term and the corresponding related term.
- FIG. 40 is a flowchart showing the processing procedure of the character string extraction processing 710.
- In step S61, text data is read from an external log 502.
- The text data may be any data as long as it can be acquired as text.
- In step S62, related terms, which are character strings whose meaning can be identified, are extracted from the text data read in step S61.
- There are various methods for extracting related terms from text data, including a method based on the degree of difference between adjacent characters used by the sentence analysis unit 43 and a method using morphological analysis.
- In step S63, when a plurality of related terms are extracted for one text data in step S62, the related terms are ranked according to a predetermined criterion. For example, they can be ranked according to their importance in the text data, or according to their character length and appearance frequency.
- Ranking can also be performed according to the degree of difference between the preceding and following adjacent characters.
- the ranking can be performed by various factors in addition to such criteria and combinations thereof. Such “rank” indicates relevance to a topic. Further, even when a plurality of related terms are extracted, it is possible not to perform such ranking.
- In step S64, the related terms ranked in step S63 are stored in the related term dictionary corresponding to the text data.
- the related terms extracted from one text data are collectively stored in one record, and each related term is stored in a storage position (array entry) corresponding to the ranking.
- a plurality of related terms are defined as a set associated with one text data (text data from which the related terms are extracted). Ranking is to rank related terms in the set.
- When there are a plurality of text data to be processed, the processing from step S61 to step S64 described above is repeated for each text data.
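- The flow of steps S61 to S64 can be sketched as follows. This is a minimal illustration, not the method of the sentence analysis unit 43: whitespace-style word splitting stands in for morphological analysis, ranking uses appearance frequency and then character length, and the function names `extract_related_terms` and `build_related_term_dictionary` are hypothetical.

```python
import re
from collections import Counter


def extract_related_terms(text, max_terms=4):
    """Steps S62-S63 (sketch): extract candidate terms and rank them.

    Assumption: simple word tokens stand in for morphemes; ranking is by
    appearance frequency, then by character length, then alphabetically.
    """
    tokens = re.findall(r"[a-z][a-z\-]+", text.lower())
    counts = Counter(tokens)
    ranked = sorted(counts, key=lambda t: (-counts[t], -len(t), t))
    return ranked[:max_terms]


def build_related_term_dictionary(text_data, max_terms=4):
    """Steps S61 and S64 (sketch): one record per text data.

    Each record is a list whose position corresponds to the rank.
    """
    return {
        name: extract_related_terms(text, max_terms)
        for name, text in text_data.items()
    }
```

- Storing each record as an ordered list mirrors the description that each related term occupies the array entry corresponding to its rank.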
- FIG. 41 is a flowchart showing the processing procedure of dictionary comparison processing 720.
- It is assumed here that related terms have been extracted from two text data (text data 1 and text data 2) collected as time series data and stored in the corresponding related term dictionary (i-1) and related term dictionary (i), and that the comparison process is performed in this situation.
- In step S71, the related terms stored in the related term dictionary (i-1) and the related term dictionary (i) are read.
- In step S72, the related terms in the related term dictionary (i-1) are compared with the related terms in the related term dictionary (i), and related terms that exist only in the related term dictionary (i) are stored in the comparison result data 54 as newly appearing related terms (new arrival related terms).
- Each related term dictionary is associated with, for example, a topic name, and the dictionary comparison process 720 can perform comparison using the topic name.
- A new arrival related term is stored in the comparison result data 54 together with a topic name that identifies the corresponding related term dictionary and an appearance status (in this case, the characters “new arrival” representing a new appearance, a code corresponding thereto, or the like).
- In step S73, the related terms in the related term dictionary (i-1) are compared with the related terms in the related term dictionary (i), and related terms that exist only in the related term dictionary (i-1) are stored in the comparison result data 54 as related terms that have disappeared (extinct related terms).
- Each related term dictionary is associated with, for example, a topic name, and an extinct related term is stored in the comparison result data 54 together with the corresponding topic name and an appearance status (in this case, the characters “disappearance” indicating that the term has disappeared, a code corresponding thereto, or the like).
- In step S74, the contents of the related term dictionary (i) are copied to the related term dictionary (i-1).
- The character string extraction process 710 then prepares a new related term dictionary (i) for storing related terms, and thereafter this new related term dictionary (i) and the related term dictionary (i-1), to which the previous contents of the related term dictionary (i) were copied, are compared by the dictionary comparison processing 720.
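- The comparison in steps S72 and S73 amounts to two set differences between the older and the newer dictionary. The sketch below assumes each dictionary is a ranked list of terms and each comparison result is a small record; the field names `topic`, `term`, and `status` and the function names are hypothetical.

```python
def compare_dictionaries(previous, current, topic_name):
    """Sketch of steps S72-S73 for two ranked term lists.

    previous: terms of the older dictionary (i-1)
    current:  terms of the newer dictionary (i)
    Returns comparison-result records in the rank order of their source list.
    """
    prev_set, cur_set = set(previous), set(current)
    records = []
    # Step S72: terms found only in the newer dictionary are new arrivals.
    for term in current:
        if term not in prev_set:
            records.append({"topic": topic_name, "term": term, "status": "new arrival"})
    # Step S73: terms found only in the older dictionary have disappeared.
    for term in previous:
        if term not in cur_set:
            records.append({"topic": topic_name, "term": term, "status": "disappearance"})
    return records


def roll_over(current):
    """Sketch of step S74: the newer dictionary becomes the older one."""
    return list(current)
```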
- The character string extraction processing 710 and the dictionary comparison processing 720 are repeatedly executed at predetermined timings; details will be described later. Each time the dictionary comparison process 720 is performed, the comparison result is stored in the comparison result data 54. Whether the comparison results stored before that point are deleted or kept cumulatively is determined according to the specification of the information search system 100 according to the present invention. Further, the comparison result data 54 may be prepared separately for each dictionary comparison process 720.
- the related term dictionary (i-1) and the related term dictionary (i) can be compared, and a common related term (common related term) can be stored in the comparison result data 54.
- For a common related term, the other related terms stored together with it (co-occurrence related terms) can also be taken into account.
- The information related to the commonality may be determined in consideration of the rank associated with the co-occurrence related terms. For example, if co-occurrence related terms with a high rank (high importance for the topic indicated by those related terms) are common to the related term dictionary (i-1) and the related term dictionary (i), the commonality can be evaluated more highly.
- A related term is extracted by the character string extraction processing 710a from the text data 1 collected from a predetermined WEB page at this time, and the extracted related term is stored in the related term dictionary (i-1).
- This character string extraction process 710a corresponds to the character string extraction process 710 described with reference to FIG.
- A related term is extracted by the character string extraction processing 710b from the text data 2 collected from the same WEB page, and is stored in the related term dictionary (i).
- The dictionary comparison process 720a compares the related term dictionary (i-1) with the related term dictionary (i) and, depending on the appearance status, stores related terms, for example new arrival related terms that have newly appeared, in the comparison result data 54.
- When the comparison process is completed, the contents of the related term dictionary (i) are copied to the related term dictionary (i-1).
- A related term is extracted by the character string extraction processing 710c from the text data 3 collected from the same WEB page, and the extracted related term is stored in the related term dictionary (i).
- The dictionary comparison process 720b compares the related term dictionary (i-1) with the related term dictionary (i) and, depending on the appearance status, stores related terms, for example new arrival related terms that have newly appeared, in the comparison result data 54.
- When the comparison process is completed, the contents of the related term dictionary (i) are copied (saved) to the related term dictionary (i-1).
- Related terms are extracted at different timings by the character string extraction processing 710 from five text data (text data 1 to 5) collected in time series from the same WEB page.
- the difference from FIG. 41 is that three related terminology dictionaries are used cyclically.
- A related term is extracted by the character string extraction processing 710f from the text data 1 collected from a predetermined WEB page at this time, and the extracted related term is stored in the related term dictionary (i-1).
- the extracted related terminology also changes accordingly.
- If the appearance frequency of each related term is also stored in the related term dictionary, it becomes possible to grasp, across the three related term dictionaries, related terms whose appearance frequency has increased rapidly in the short term, related terms whose appearance frequency stays within a certain range, related terms whose appearance frequency has risen again, and related terms showing other changes in appearance frequency.
- When the dictionary comparison process 720f ends, the contents of the related term dictionary (i) are copied to the related term dictionary (i-1), and the contents of the related term dictionary (i+1) are copied to the related term dictionary (i).
- In the examples described so far, the comparison result data 54 is stored and updated using two related term dictionaries or, in the example of FIG. 43, three related term dictionaries used cyclically, but the dictionary comparison process may be performed using a larger number of related term dictionaries. As a result, the appearance status of a related term at more timings can be grasped, and when the appearance status satisfies a predetermined condition, the related term can be stored in the comparison result data 54.
- FIG. 44 shows an example in which related terms are extracted by the character string extraction processing 710 from three text data (text data A to C) collected at the same timing from different WEB pages (WEB pages related to different subjects), the extracted related terms are stored in the corresponding related term dictionary A, related term dictionary B, or related term dictionary C, and a dictionary comparison process 720 is then performed on these three related term dictionaries.
- The dictionary comparison processing 720k compares the three related term dictionaries (related term dictionary A, related term dictionary B, and related term dictionary C) and stores related terms and the like in the comparison result data 54 according to the appearance status of the related terms.
- related terms existing in common in the three related term dictionaries are stored in the comparison result data 54.
- Because the text data A to C are collected from WEB pages related to different subjects, focusing on the related terms common to the three related term dictionaries (common related terms) rather than on the differing related terms makes it possible to discover a common topic, which is often meaningful.
- the information related to the commonality may be determined in consideration of the rank associated with the co-occurrence related terms. For example, if co-occurrence related terms with high rank (high importance for the topic indicated by those related terms) are common in the three related term dictionaries, the commonness of the common related terms can be evaluated more highly.
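- One way to take rank into account when evaluating commonality is sketched below. The scoring rule (mean reciprocal rank across the dictionaries) is an assumption chosen for illustration and is not stated in this description; `commonality_score` is a hypothetical name.

```python
def commonality_score(term, dictionaries):
    """Sketch: rank-weighted commonality of a term across ranked term lists.

    Assumption: a term absent from any dictionary is not common (score 0.0);
    otherwise the score is the mean of 1/rank, so higher-ranked common terms
    are evaluated more highly.
    """
    if not dictionaries:
        return 0.0
    total = 0.0
    for terms in dictionaries:
        if term not in terms:
            return 0.0
        total += 1.0 / (terms.index(term) + 1)
    return total / len(dictionaries)
```

- With this rule, a term ranked first in every dictionary scores 1.0, and the score falls as the term's rank drops in any dictionary.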
- the comparison result data 54 can be updated in time series.
- Related terms are extracted by the character string extraction processing 710k′, the character string extraction processing 710m′, and the character string extraction processing 710n′ from three text data (text data A′ to C′) respectively collected from predetermined different WEB pages at this time, and the extracted related terms are stored in the related term dictionary A′, the related term dictionary B′, and the related term dictionary C′, respectively.
- The text data A′ is assumed to be collected from the same WEB page as the text data A or from a WEB page on the same subject.
- Likewise, the text data B′ is collected from the same WEB page as the text data B or from a WEB page on the same subject.
- The text data C′ is collected from the same WEB page as the text data C or from a WEB page on the same subject.
- The dictionary comparison process 720k′ compares the three related term dictionaries (the related term dictionary A′, the related term dictionary B′, and the related term dictionary C′) and stores related terms and the like in the comparison result data 54 according to the appearance status of the related terms. In this embodiment, for example, related terms existing in common in the three related term dictionaries are stored in the comparison result data 54.
- The text data B′′ is collected from the same WEB page as the text data B and the text data B′, or from a WEB page on the same subject.
- The text data C′′ is collected from the same WEB page as the text data C and the text data C′, or from a WEB page on the same subject.
- In this example, related terms are extracted based on three text data collected at the same timing from different WEB pages (WEB pages related to different subjects), but the number of text data is not limited to three; for example, related terms may be extracted from four or more text data.
- The character string extraction processing 710 and the dictionary comparison processing 720 shown in FIG. 43 will now be described in more detail with reference to FIG. 45. In FIG. 45, character string extraction processing (710f, 710g, 710h) is performed on three text data (text data 1 to 3), respectively, and a dictionary comparison process 720f is performed on the corresponding related term dictionary (i-1), related term dictionary (i), and related term dictionary (i+1).
- the three text data are collected from the same WEB page related to the common subject “stock trading”.
- a search keyword “stock transaction” is input by WEB search, and three WEB pages obtained as a result are handled as one text data.
- the URLs of the first WEB page are all the same
- the URLs of the second WEB page are all the same
- the URLs of the third WEB page are all the same.
- text data 1-1 is a Q1 question
- text data 1-2 is a Q8 question
- text data 1-3 is a Q13 question.
- the ranking of related terms can be determined based on, for example, the appearance frequency.
- the four extracted related terms (related terms 1 to 4) are “ ⁇ company”, “ ⁇ bank”, “application”, and “account” in order of rank.
- Here the character string is divided into the smallest meaningful units (morphemes), but in other methods, units larger than a morpheme (for example, sentences or parts of sentences) can be extracted as related terms.
- A character string composed of a noun and a particle, such as the above-mentioned “Application”, is also extracted as a related term.
- four related terms are extracted, and each is arranged in order of rank and stored as one record in the related term dictionary (i).
- the ranking of related terms can be determined based on, for example, the appearance frequency.
- the extracted four related terms (related terms 1 to 4) are “tax rate”, “ ⁇ company”, “ ⁇ bank”, and “application” in order of rank.
- In this embodiment, four related terms are extracted, arranged in order of rank, and stored as one record in the related term dictionary (i+1).
- the ranking of related terms can be determined based on, for example, the appearance frequency.
- the extracted four related terms (related terms 1 to 4) are “ ⁇ company”, “account”, “ ⁇ bank”, and “application” in order of rank.
- dictionary comparison processing 720f is performed on the related term dictionary (i-1), the related term dictionary (i), and the related term dictionary (i + 1).
- It is assumed that the dictionary comparison process 720f detects newly appearing related terms (new arrival related terms), related terms that have disappeared (disappearing related terms), and related terms that have appeared again (resurrection related terms), and stores them in the comparison result data 54.
- For example, comparing the related term dictionary (i-1) with the related term dictionary (i), the related term “tax rate” has newly appeared in the related term dictionary (i), and the related term “account” has disappeared. Therefore, as shown in the record 54a in FIG. 46, the related terms “tax rate” and “account” are stored in the comparison result data 54.
- In the comparison result data 54, data indicating the appearance status (in this embodiment, “new arrival” for newly appearing related terms and “disappearance” for related terms that have disappeared) is stored in the same record together with these related terms.
- “topic name” data for identifying a related term dictionary is stored in order to indicate the timing at which the appearance situation is reached.
- Each related term dictionary is associated with a topic name and date; for example, the related term dictionary (i) is associated with a topic name such as “topic of ‘stock trading’ on October 10, 2013, t2”.
- the comparison result data 54 stores the related terms “account” and “tax rate”.
- In the comparison result data 54, data indicating the appearance status (in this embodiment, “resurrection” for related terms that have reappeared and “disappearance” for related terms that have disappeared) is stored in the same record together with these related terms.
- “topic name” data for identifying a related term dictionary is stored in order to indicate the timing at which the appearance situation is reached.
- Each related term dictionary is associated with a topic name and date, and the related term dictionary (i+1) is associated with a topic name such as “topic of ‘stock trading’ on October 10, 2013 at t3”.
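- The three-dictionary comparison in this example can be traced with the sketch below. It is an illustration under assumptions: `classify_changes` is a hypothetical helper, and because the placeholder symbols in the company and bank names did not survive in this text, the stand-ins "XX company" and "YY bank" are used instead.

```python
def classify_changes(older, previous, current):
    """Sketch: appearance status of terms between `previous` and `current`.

    A term that is in `current` but not in `previous` counts as a
    resurrection if it was in `older`, otherwise as a new arrival.
    A term that is in `previous` but not in `current` has disappeared.
    """
    older, previous, current = set(older), set(previous), set(current)
    status = {}
    for term in current - previous:
        status[term] = "resurrection" if term in older else "new arrival"
    for term in previous - current:
        status[term] = "disappearance"
    return status


# Ranked terms as listed for the dictionaries (i-1), (i) and (i+1);
# "XX company" and "YY bank" are stand-in names.
dict_prev = ["XX company", "YY bank", "application", "account"]
dict_cur = ["tax rate", "XX company", "YY bank", "application"]
dict_next = ["XX company", "account", "YY bank", "application"]

at_t2 = classify_changes([], dict_prev, dict_cur)
at_t3 = classify_changes(dict_prev, dict_cur, dict_next)
```

- Here `at_t2` marks “tax rate” as a new arrival and “account” as a disappearance, and `at_t3` marks “account” as a resurrection and “tax rate” as a disappearance, which matches the records described for the comparison result data 54.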
- the search keyword “Technology of company A” is input by WEB search, and the three WEB pages obtained as a result are handled as one text data.
- the search keyword “Technology of company B” is input by WEB search, and the resulting three WEB pages are treated as one text data.
- The search keyword “AI (artificial intelligence) related technology” is input by WEB search, and the three WEB pages obtained as a result are handled as one text data.
- The text data A includes, for the subject “Technology of company A”, text data A-1 obtained from the first WEB page, text data A-2 obtained from the second WEB page, and text data A-3 obtained from the third WEB page.
- Similarly, the text data B includes text data B-1 obtained from the first WEB page, text data B-2 obtained from the second WEB page, and text data B-3 obtained from the third WEB page, and the text data C includes, for the subject “AI (artificial intelligence) related technology”, text data C-1 obtained from the first WEB page, text data C-2 obtained from the second WEB page, and text data C-3 obtained from the third WEB page.
- the character string extraction process 710k extracts a related term from the text data A by a predetermined method and stores it in the related term dictionary A.
- four related terms are extracted, and each is arranged in order of rank and stored as one record in the related term dictionary A.
- the ranking of related terms can be determined based on, for example, the appearance frequency.
- the extracted four related terms are “Company A”, “speech”, “speech recognition”, and “sales” in order of rank.
- the character string extraction process 710m extracts a related term from the text data B by a predetermined method and stores it in the related term dictionary B.
- four related terms are extracted, and each is arranged in order of rank and stored in the related term dictionary B as one record.
- the ranking of related terms can be determined based on, for example, the appearance frequency.
- The extracted four related terms are “speech”, “research and development”, “business achievements of company B”, and “speech recognition” in order of rank.
- the character string extraction processing 710n extracts a related term from the text data C by a predetermined method and stores it in the related term dictionary C.
- In this embodiment, four related terms are extracted, arranged in order of rank, and stored as one record in the related term dictionary C.
- the ranking of related terms can be determined based on, for example, the appearance frequency.
- The extracted four related terms (related terms 1 to 4) are “AI”, “robot”, “speech recognition”, and “agent” in order of rank.
- dictionary comparison processing 720k is performed on the related term dictionary A, the related term dictionary B, and the related term dictionary C.
- the dictionary comparison process 720k detects related terms common to the three dictionaries (common related terms) and stores them in the comparison result data 54.
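- Detecting common related terms across the three dictionaries is a set intersection. The sketch below uses the terms listed for the related term dictionaries A to C; it assumes that “voice” and “speech” in this text are renderings of the same word and normalizes them to “speech”. `common_related_terms` is a hypothetical name.

```python
def common_related_terms(*dictionaries):
    """Sketch: terms present in every dictionary, in the rank order of the first."""
    if not dictionaries:
        return []
    common = set(dictionaries[0]).intersection(*dictionaries[1:])
    return [term for term in dictionaries[0] if term in common]


dictionary_a = ["Company A", "speech", "speech recognition", "sales"]
dictionary_b = ["speech", "research and development",
                "business achievements of company B", "speech recognition"]
dictionary_c = ["AI", "robot", "speech recognition", "agent"]

common = common_related_terms(dictionary_a, dictionary_b, dictionary_c)
```

- Under that normalization, the only term shared by all three dictionaries is “speech recognition”.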
- FIG. 48 shows a modification of the character string extraction process 710 and the dictionary comparison process 720 shown in FIG.
- In this example, character string extraction processing 710 is performed for each of two text data (text data 1 and text data 2), and the dictionary comparison process 720 is performed on the corresponding related term dictionary (i-1) and related term dictionary (i).
- the display related to the text data 3 shown in FIG. 45 is omitted.
- the two text data are collected from the same WEB page related to the common subject "stock trading".
- a search keyword “stock transaction” is input by WEB search, and three WEB pages obtained as a result are set as one text data, but related terms are extracted for each WEB page unit.
- related terms are managed for each WEB page, but three text data based on three WEB pages may be prepared, and related terms may be extracted for each text data.
- it is important that the related terms are extracted from a plurality of text data.
- the URLs of the first WEB page are all the same
- the URLs of the second WEB page are all the same
- the URLs of the third WEB page are all the same.
- the extracted four related terms are “ ⁇ company”, “account”, “application procedure”, and “ ⁇ bank” in the rank order.
- The extracted four related terms (related terms 1 to 4) are “account”, “ ⁇ bank”, “application”, and “ ⁇ company” in order of rank.
- the extracted four related terms (related terms 1 to 4) are “buy stock”, “ ⁇ bank”, “limit”, and “ ⁇ company” in order of rank.
- the character string extraction processing 710 obtains neighborhood related terms for each of the related terms extracted in this way, and stores them in the related term dictionary (i-1).
- A neighborhood related term is a related term that appears together (co-occurs) with a particular related term when attention is focused on that related term.
- the set of related terms included in the topic corresponding to the text data is a set of related terms.
- a set of related terms including related terms is referred to as a neighborhood of related terms, and a set of neighborhoods of related terms is referred to as a neighborhood system of related terms.
- a neighborhood system of related terms is stored for each related term.
- this related term is extracted for the text data 1-1, and the neighborhood of the related term is ⁇ Company, account, application procedure, ⁇ bank ⁇ .
- this related term is extracted for the text data 1-3, and the neighborhood of the related term is ⁇ buy stock, ⁇ bank, limit, ⁇ company ⁇ .
- the neighborhood of the related term is ⁇ Company, account, stock purchase, ⁇ bank, application procedure, limit ⁇ (the neighborhood of the related term for text data 1-1)
- The neighborhood system thus obtained is stored in the related term dictionary (i-1) for each of the related terms “ ⁇ Company”, “Account”, “ ⁇ Bank”, “Application procedure”, “Stock purchase”, “Application is”, “Limit price”, and “ ⁇ Company”.
- For each related term, a neighborhood system of the related term (neighborhood related terms 1 to 7) is stored. Their order is determined in consideration of the ranking performed by the character string extraction processing 710, the degree of co-occurrence, and the like.
- the four extracted related terms are “ ⁇ company”, “account”, “new system”, and “application procedure” in order of rank.
- the extracted four related terms (related terms 1 to 4) are “account”, “ ⁇ bank”, “ ⁇ company”, and “buy stock” in order of rank.
- the extracted four related terms (related terms 1 to 4) are “stock purchase”, “ ⁇ bank”, “ ⁇ company”, and “new system” in rank order.
- the character string extraction processing 710 obtains neighborhood related terms for each of the related terms extracted in this way, and stores them in the related term dictionary (i). For example, focusing on the related term “ ⁇ Company”, this related term is extracted for the text data 2-1, and the neighborhood of the related term is ⁇ Company, account, new system, application procedure ⁇ . Similarly, this related term is extracted for the text data 2-3, and the neighborhood of the related term is ⁇ buy stock, ⁇ bank, ⁇ company, new system ⁇ .
- As a result, the neighborhood system of the related term is ⁇ Company, account, stock purchase, new system, application procedure, ⁇ bank ⁇ (the related term “new system”, which appears in both the neighborhood for the text data 2-1 and the neighborhood for the text data 2-3, is included only once).
- The neighborhood system obtained in this way is stored in the related term dictionary (i) for each of the related terms “ ⁇ Company”, “Account”, “ ⁇ Bank”, “Application Procedure”, “Stock Purchase”, “New System”, and “ ⁇ Company”.
- For each related term, the neighborhood system of the related term (neighborhood related terms 1 to 6) is stored. Their order is determined in consideration of the ranking performed by the character string extraction processing 710, the degree of co-occurrence, and the like.
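- Building a neighborhood system can be sketched as merging every neighborhood that contains the related term of interest, keeping each term once. The code treats the neighborhood system as a flattened, de-duplicated list, as in the example above; `neighborhood_system` is a hypothetical name, and "XX company", "YY bank", and "ZZ company" are stand-ins because the placeholder symbols did not survive in this text.

```python
def neighborhood_system(term, neighborhoods):
    """Sketch: merge all neighborhoods containing `term`, without duplicates.

    neighborhoods: one ranked term list per text data (for example per WEB page).
    The first-seen order is kept; a real system would reorder by rank and
    co-occurrence, which is not modelled here.
    """
    system = []
    for neighborhood in neighborhoods:
        if term not in neighborhood:
            continue
        for candidate in neighborhood:
            if candidate not in system:
                system.append(candidate)
    return system


text_2_1 = ["XX company", "account", "new system", "application procedure"]
text_2_2 = ["account", "YY bank", "ZZ company", "stock purchase"]
text_2_3 = ["stock purchase", "YY bank", "XX company", "new system"]

system = neighborhood_system("XX company", [text_2_1, text_2_2, text_2_3])
```

- The result contains six terms, with “new system” and the focused term itself appearing once even though they occur in both contributing neighborhoods.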
- the dictionary comparison process 720 compares the related term dictionary (i-1) with the related term dictionary (i).
- FIG. 49 shows a modification of the character string extraction process 710 and the dictionary comparison process 720 shown in FIG.
- In this example, character string extraction processing 710 is performed for each of the three text data (text data A to C), and the dictionary comparison process 720 is performed on the corresponding related term dictionary A, related term dictionary B, and related term dictionary C.
- For the text data A, a search keyword “Technology of Company A” is input by WEB search, two text data (text data A-1 and text data A-2) are obtained from the two WEB pages obtained as a result, and these are handled individually in the character string extraction processing 710.
- For the text data B, a search keyword “Technology of company B” is input by WEB search, two text data (text data B-1 and text data B-2) are obtained from the two WEB pages obtained as a result, and these are handled individually in the character string extraction processing 710.
- For the text data C, a search keyword “AI (artificial intelligence) related technology” is input by WEB search, two text data (text data C-1 and text data C-2) are obtained from the two WEB pages obtained as a result, and these are handled individually in the character string extraction processing 710.
- In the earlier example, the text data A, the text data B, and the text data C each include three text data, whereas in this embodiment each of them is assumed to include two text data.
- the character string extraction processing 710 obtains neighborhood related terms for each of the related terms extracted in this way, and stores them in the related term dictionary A.
- A neighborhood related term is a related term that appears together (co-occurs) with a particular related term when attention is focused on that related term.
- the set of related terms included in the topic corresponding to the text data is a related term set.
- a set of related terms is called a neighborhood of related terms, and a set of neighborhoods of related terms is called a neighborhood system of related terms.
- a neighborhood system of related terms is stored for each related term.
- This related term is extracted for the text data A-1, and the neighborhood of the related term is {Company A, speech, speech recognition, robot}.
- Similarly, this related term is extracted for the text data A-2, and the neighborhood of the related term is {compression technology, speech recognition, sales, speech}.
- As a result, the neighborhood system of the related term is {speech recognition, Company A, compression technology, speech, robot, business} (the related term “speech”, which appears in both the neighborhood for the text data A-1 and the neighborhood for the text data A-2, is included only once).
- The neighborhood system thus obtained is stored in the related term dictionary A for each of the related terms “Company A”, “speech recognition”, “speech”, “compression technology”, “sales”, and “robot”.
- For each related term, the neighborhood system of the related term (neighborhood related terms 1 to 5) is stored. Their order is determined in consideration of the ranking performed by the character string extraction processing 710, the degree of co-occurrence, and the like.
- The character string extraction processing 710 obtains neighborhood related terms for each of the related terms extracted in this way, and stores them in the related term dictionary B. For example, focusing on the related term “speech”, this related term is extracted for the text data B-1, and the neighborhood of the related term is {speech, achievement of company B, speech recognition, research and development}. Similarly, this related term is extracted for the text data B-2, and the neighborhood of the related term is {research and development, speech, speech recognition, authentication technology}.
- As a result, the neighborhood system of the related term is {speech, research and development, achievements of company B, speech recognition, authentication technology} (each of the related terms “speech recognition” and “research and development”, which appear in both the neighborhood for the text data B-1 and the neighborhood for the text data B-2, is included only once).
- The neighborhood system thus obtained is stored in the related term dictionary B for each of the related terms “speech”, “research and development”, “business achievements of company B”, “speech recognition”, and “authentication technology”.
- For each related term, a neighborhood system of the related term (neighborhood related terms 1 to 4) is stored. Their order is determined in consideration of the ranking performed by the character string extraction process 710, the degree of co-occurrence, and the like.
- The character string extraction processing 710 obtains neighborhood related terms for each of the related terms extracted in this way, and stores them in the related term dictionary C. For example, focusing on the related term “AI”, this related term is extracted for the text data C-1, and the neighborhood of the related term is {AI, agent, robot, speech recognition}. Similarly, this related term is extracted for the text data C-2, and the neighborhood of the related term is {robot, speech recognition, AI, learning function}.
- As a result, the neighborhood system of the related term is {AI, robot, agent, speech recognition, learning function} (each of the related terms “robot” and “speech recognition”, which appear in both the neighborhood for the text data C-1 and the neighborhood for the text data C-2, is included only once).
- The neighborhood system thus obtained is stored in the related term dictionary C for each of the related terms “AI”, “robot”, “speech recognition”, “agent”, and “learning function”.
- For each related term, a neighborhood system of the related term (neighborhood related terms 1 to 4) is stored. Their order is determined in consideration of the ranking performed by the character string extraction process 710, the degree of co-occurrence, and the like.
- The dictionary comparison process 720 compares the related term dictionaries A to C.
- the dictionary comparison process 720 compares the common related terms with the neighboring related terms of each related term. Then, it is possible to grasp the commonality of neighboring related terms, the commonality of the order of neighboring related terms, and the like, thereby determining the level of commonality between the common related terms.
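- The description does not specify how the commonality of neighborhood related terms is measured. As one possible illustration, the sketch below uses the Jaccard index (shared terms divided by all distinct terms), which is a swapped-in measure and not the patent's own; `neighborhood_commonality` is a hypothetical name, and the term order is ignored.

```python
def neighborhood_commonality(neighbors_a, neighbors_b):
    """Sketch: Jaccard index of two sets of neighborhood related terms.

    Returns a value between 0.0 (nothing shared) and 1.0 (identical sets).
    """
    set_a, set_b = set(neighbors_a), set(neighbors_b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


# Neighborhood systems as listed for the related term dictionaries A and C.
system_a = ["speech recognition", "Company A", "compression technology",
            "speech", "robot", "business"]
system_c = ["AI", "robot", "agent", "speech recognition", "learning function"]

score = neighborhood_commonality(system_a, system_c)
```

- For these two lists the shared terms are “speech recognition” and “robot” out of nine distinct terms, so the score is 2/9. A rank-aware variant could additionally compare the positions of the shared terms.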
- The input index, the progress index, and the like are learned in the dialogue between the conversation control terminal device 2′′ and the user who uses this device. This function is a context learning function.
- the input index is information indicating what input the user has made so far, that is, a history of user input.
- the progress index is information indicating what topic has been provided to the user so far, that is, a history of the topic provided to the user.
- because the information update unit 46 of the topic providing server 4 ′ can capture the history of the appearance of related terms, it can clarify the topic name to which a related term belongs, distinguish usual related words from newly arrived (newly appearing) related words, and judge similarities and differences between topics through comparison processing of the related words. A related-word learning function can thus be realized.
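The distinction between usual and newly arrived related words can be sketched as a lookup against the appearance history. This is a minimal sketch under the assumption that the history is a list of earlier term snapshots; the function name `classify_related_terms` and the sample data are hypothetical.

```python
def classify_related_terms(history, current_terms):
    """Split current terms into usual (seen before) and newly arrived ones."""
    seen = set().union(*history)  # every term found in an earlier snapshot
    usual = [t for t in current_terms if t in seen]
    new = [t for t in current_terms if t not in seen]
    return {"usual": usual, "new": new}

# Hypothetical appearance history and current extraction result.
history = [["AI", "robot"], ["AI", "voice recognition"]]
result = classify_related_terms(history, ["AI", "robot", "learning function"])
```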
- the user of the conversation control terminal device 2 ′′ can handle input types related to many topic names, and the user's input identification means can be diversified.
- Scenario data is controlled so as to switch the service ID based on the input type determined from the input status of the user's conversation control terminal device 2 ′′.
- the service ID corresponding to the function that provides the related word dictionary 50, the related term / co-occurrence word list display screen 650, and the like, which are provided by the sentence analysis unit 43, the preference analysis unit 44, and the topic analysis unit 45 of the topic providing server 4 ′ in the information search system 100, and the service ID corresponding to the function that provides the display of the related term dictionary 50 and the comparison result data 54, which are provided by the information updating unit 46 of the topic providing server 4 ′, are switched automatically by the scenario data.
- the statement of the corresponding scenario data is set so as to transition to the service with the corresponding service ID as an action when a predetermined input type is input, for example.
- scenario data as shown in FIG. 14, the following statement is obtained.
- “sto” is a description to shift the state (shift to)
- “$ IDN $” is the identification number of the destination service
- “<sta: $ num $>” is the destination, and “<$ Input $>” is a user input sentence.
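The switching of service IDs by input type can be pictured as a lookup table. The sketch below does not parse the scenario statement syntax described above, which is not reproduced in full here; the table `TRANSITIONS`, the service IDs, and the input type names are all hypothetical values chosen for illustration.

```python
from typing import Dict, Tuple

# Hypothetical rules: (current service ID, input type) -> destination service ID.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("S001", "show_related_terms"): "S002",
    ("S001", "show_comparison"): "S003",
    ("S002", "back"): "S001",
    ("S003", "back"): "S001",
}

def next_service(current_id: str, input_type: str) -> str:
    """Return the destination service ID, or stay put if no rule matches."""
    return TRANSITIONS.get((current_id, input_type), current_id)

service = next_service("S001", "show_comparison")
```

Keeping the rules in data rather than in code mirrors the idea that the scenario data, not the terminal program, decides which service comes next.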
- the topic providing server 4 ′ includes a CPU (Central Processing Unit) 801, a RAM (Random Access Memory) 802, a ROM (Read Only Memory) 803, a network interface 804, an audio control unit 805, a microphone 806, a speaker 807, a display controller 808, a display 809, an input device interface 810, a keyboard 811, a mouse 812, an external storage device 813, an external recording medium interface 814, and a bus 815 for connecting these components to each other.
- the CPU 801 controls the operation of each component of the topic providing server 4 ′ and, under the control of the OS, controls the execution of the processes in the sentence analysis unit 43, the preference analysis unit 44, the topic analysis unit 45, the information update unit 46 according to the present invention, and the like.
- the RAM 802 temporarily stores programs for executing each process executed by the CPU 801 and data used during the execution of these programs. Further, as described above, the related term dictionary 50, the comparison result data 54, and the like can also be stored.
- the ROM 803 stores a program executed when the topic providing server 4 'is started.
- the network interface 804 is an interface for connecting to the network 900.
- the network 900 is, for example, a network between the conversation control terminal device 2 ′′ shown in FIG. 20 and a computer on which the crawler 730 operates, or a network such as the Internet.
- the audio control unit 805 controls the microphone 806 and the speaker 807 to control audio input / output.
- the display controller 808 is a dedicated controller for actually processing a drawing command issued by the CPU 801.
- the display 809 is a display device composed of, for example, an LCD (Liquid Crystal Display) or a CRT (Cathode Ray Tube).
- the input device interface 810 receives a signal input from the keyboard 811 or the mouse 812, and transmits a predetermined command to the CPU 801 according to the signal pattern.
- the external storage device 813 is, for example, a storage device such as a hard disk or a semiconductor memory, and the above-described program and data are recorded in this device and loaded into the RAM 802 from there if necessary at the time of execution.
- the related term dictionary 50, the comparison result data 54, and the like can also be stored.
- the external recording medium interface 814 accesses the external recording medium 910 and reads data recorded therein.
- the external recording medium 910 is, for example, a portable flash memory, a CD (Compact Disc), a DVD (Digital Versatile Disc), or the like.
- a program executed by the CPU 801 and realizing each function of the present invention can be provided from the external recording medium 910 via the external recording medium interface 814. As another distribution form of the program, a route in which the program is stored in the external storage device 813 or the RAM 802 from a predetermined server on the network, via the network 900 and the network interface 804, is also conceivable.
- the hardware configuration of the conversation control terminal device 2 ′′ and the crawler 730 included in the information search system 100 of the present invention is basically the same as that shown in FIG. However, for the computers on which the topic providing server 4 ′ and the crawler 730 operate, the audio control unit 805, the microphone 806, the speaker 807, the display controller 808, the display 809, the input device interface 810, the keyboard 811, and the mouse 812 are not essential components.
- the information search system 100 described so far responds to the control by the scenario data 28 (or the scenario data 55) between the Topiclet 20 operating on the conversation control terminal device 2 ′′ and the topic providing server 4 ′.
- information such as related terms is displayed on the display of the conversation control terminal device 2 ′′.
- the Topiclet 20 is downloaded to the conversation control terminal device 2 ′′ at a predetermined timing and activated, for example, and the Topiclet 20 communicates with the topic providing server 4 ′ via a network such as the Internet.
- each function of the topic providing server 4 ′ described above is configured by a WEB server, an ASP (Active Server Pages) server, etc., and a general WEB browser operating on the conversation control terminal device 2 ′′ controls the scenario data.
- a WEB server, an ASP server, or the like that functions as the topic providing server 4 ′ edits and generates the display data (for example, HTML data) needed to show the screen on the display of the conversation control terminal device 2 ′′.
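Generating display data on the server side can be sketched as follows. This is an assumption-laden illustration in Python rather than ASP: the function name `render_related_terms` and the page layout are hypothetical, and the source does not describe the actual HTML that is produced.

```python
from html import escape

def render_related_terms(term, neighbors):
    """Build a small HTML fragment listing a related term and its neighbors."""
    # escape() keeps dictionary contents from being interpreted as markup.
    items = "\n".join(f"  <li>{escape(n)}</li>" for n in neighbors)
    return f"<h2>{escape(term)}</h2>\n<ul>\n{items}\n</ul>"

page = render_related_terms("AI", ["robot", "voice recognition", "<script>"])
```

A browser on the terminal device would only need to display the returned fragment, which matches the thin-client arrangement described above.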
- Topic 51 includes a conversation control terminal device 1002 (Topiclet 1020), a topic providing server 1004 (iWA 1030), a maintenance device 1003 (iWA Manager 1040), and a topic analysis device 1005.
- the topic analysis processing itself can also be realized by having it executed by a topic analysis device 1005, which is a device separate from the maintenance device 1003 (iWA Manager 1040).
- the topic providing system 1 and the topic providing system 1 ′ are the same in other configurations, and detailed description thereof is omitted.
- the conversation control terminal device 1002 includes a reception unit 1240 and a transmission unit 1230.
- the reception unit 1240 corresponds to the reception unit 240 of the conversation control terminal device 2
- the transmission unit 1230 corresponds to the transmission unit 230 of the conversation control terminal device 2.
- the conversation control terminal device 1002 is basically the same as the conversation control terminal device 2 of FIG. 4, and the display of other components is omitted.
- the topic providing server 1004 includes an input information analysis unit 1310 and a scenario data storage unit 1320
- the maintenance device 1003 includes a scenario data transmission unit 1430, a scenario data editing unit 1410, and a terminal device virtual construction unit 1420.
- the topic analysis device 1005 includes a topic analysis unit 1510.
- the topic analysis unit 1510 of the topic analysis device 1005 is connected to the topic providing server 1004 via a network, and the scenario data and other data obtained by the topic analysis processing are provided to the input information analysis unit 1310 of the topic providing server 1004. Further, the topic analysis unit 1510 of the topic analysis device 1005 is connected to the maintenance device 1003 via a network (or indirectly connected via the topic providing server 1004), and the scenario data and other data obtained by the topic analysis processing are provided to the scenario data editing unit 1410 of the maintenance device 1003.
- the topic analysis unit 1510 of the topic analysis device 1005 generates a topic list and edits or verifies scenario data based on the topic list.
- the topic list is data to which the closeness and connection method of the topics are given through the related words that relate the topics.
- the topic analysis unit 1510 accumulates related terms associated with the topic in the topic list.
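A topic list of this kind can be pictured as a mapping from topic pairs to the related terms that connect them. The sketch below assumes that closeness is simply the number of shared related terms; the function name `build_topic_links`, the closeness rule, and the sample topics are hypothetical and are not taken from the source.

```python
from itertools import combinations

def build_topic_links(topic_terms):
    """Map each pair of topics to the related terms they share."""
    links = {}
    for a, b in combinations(sorted(topic_terms), 2):
        shared = sorted(set(topic_terms[a]) & set(topic_terms[b]))
        if shared:  # only topics connected by at least one related term
            links[(a, b)] = shared
    return links

# Hypothetical related terms accumulated for each topic.
topic_terms = {
    "AI": ["robot", "voice recognition", "learning function"],
    "home appliances": ["robot", "voice recognition"],
    "cooking": ["recipe"],
}
links = build_topic_links(topic_terms)
closeness = {pair: len(shared) for pair, shared in links.items()}
```

The shared terms give the connection method between two topics, and their count gives a rough closeness score under the stated assumption.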
- the topic list provided to the maintenance device 1003 is data provided to the contractor of the topic providing system 1 ′, and these data are used, for example, when generating scenario data.
- the topic analysis unit 1510 of the topic analysis device 1005 may be configured to realize the processing of the sentence analysis unit 43 of the topic providing server 4 ′ illustrated in FIG. 20, and may also be configured to realize each process of the input information analysis unit 41.
- a plurality of topic analysis units 1510 of the topic analysis device 1005 can be arranged so as to be associated with each contractor of the topic providing system 1 ′.
- information acquired by each topic analysis device 1005 can be used (or used after being organized and integrated), and data can be provided to the corresponding maintenance device 1003 and the topic providing server 1004.
- the timing at which data is provided by the topic analysis unit 1510 of the topic analysis device 1005 and the content of the data may differ depending on the provision destination (that is, between the maintenance device 1003 and the topic provision server 1004).
- the functions of each of the maintenance devices 1003 can also be realized using PaaS (Platform as a Service) or SaaS (Software as a Service) that can be used remotely as a service on the Internet.
- This system can be used when there is a purpose to provide a user with important keywords extracted from topics or topics without having them.
- DESCRIPTION OF SYMBOLS
- 1, 1 ′ Topic providing system
- 2, 1002 Conversation control terminal device (Topiclet 20, Topiclet 1020)
- 2 ′, 2 ′′ Conversation control terminal device
- 3, 1003 Maintenance device (iWA Manager 40, iWA Manager 1040)
- 4, 4 ′, 1004 Topic providing server (iWA30, iWA1030)
- 10 Topic storage device
- 21 Input control part
- 22 Search control part
- 23 Transmission control part
- 24 Reception control part
- 25 Response information determination part
- 26 Output control part
- 41 Input information analysis part
- 42 External log acquisition control part
- 43 Sentence analysis part
- 44 Preference analysis part
- 45 Topic analysis unit
- 46 Information update unit
- 100 Information retrieval system
- 1005 Topic analysis device
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Human Computer Interaction (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention relates to a topic providing system that makes it possible to obtain a natural response suited to a topic and to the flow of a conversation, and that allows maintenance work, such as modifying, adding, or correcting a topic control rule, to be carried out separately. The topic providing system comprises: an input unit that allows a user to enter input information; an input information analysis unit that analyzes the input information and generates input specification information; a scenario data storage unit that extracts scenario data defining response information relating to a topic; a response information determination unit that determines the response information based on the scenario data and the input specification information; and an output unit that outputs the response information.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2014554277A JP6529761B2 (ja) | 2012-12-28 | 2013-12-04 | Topic providing system and conversation control terminal device |
Applications Claiming Priority (6)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2012-288858 | 2012-12-28 | ||
| JP2012288858 | 2012-12-28 | ||
| JP2012-288857 | 2012-12-28 | ||
| JP2012288857 | 2012-12-28 | ||
| JP2012-288856 | 2012-12-28 | ||
| JP2012288856 | 2012-12-28 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2014103645A1 (fr) | 2014-07-03 |
Family
ID=51020739
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2013/082623 Ceased WO2014103645A1 (fr) | 2012-12-28 | 2013-12-04 | Topic providing system, conversation control terminal device, and maintenance device |
Country Status (2)
| Country | Link |
|---|---|
| JP (3) | JP6529761B2 (fr) |
| WO (1) | WO2014103645A1 (fr) |
Families Citing this family (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP7256935B2 (ja) * | 2019-09-02 | 2023-04-13 | 富士通株式会社 | Dictionary creation device and dictionary creation method |
| JP2022176415A (ja) * | 2019-11-08 | 2022-11-29 | 株式会社 資生堂 | Information processing device and program |
| WO2021168650A1 (fr) * | 2020-02-25 | 2021-09-02 | 京东方科技集团股份有限公司 | Question query apparatus and method, device, and storage medium |
| JP7576290B1 (ja) | 2023-06-09 | 2024-10-31 | 株式会社サイバーエージェント | Topic module set creation device, dialogue device, topic module set creation method, dialogue method, and computer program |
| JP2025054280A (ja) * | 2023-09-25 | 2025-04-07 | ソフトバンクグループ株式会社 | System |
| JP7815596B2 (ja) * | 2023-12-01 | 2026-02-18 | 裕 勝倉 | Dialogue device |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2007079397A (ja) * | 2005-09-16 | 2007-03-29 | Nippon Telegr & Teleph Corp <Ntt> | Dialogue method, dialogue device, dialogue program, and recording medium |
| JP2007193380A (ja) * | 2006-01-16 | 2007-08-02 | So-Net Entertainment Corp | Information processing device, information processing method, and computer program |
| JP2007264198A (ja) * | 2006-03-28 | 2007-10-11 | Toshiba Corp | Dialogue device, dialogue method, dialogue system, computer program, and dialogue scenario generation device |
Family Cites Families (9)
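| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP4348357B2 (ja) * | 1997-09-08 | 2009-10-21 | 富士通株式会社 | Related document display device |
| JPH11272684A (ja) * | 1998-03-19 | 1999-10-08 | Mitsubishi Electric Corp | Information providing method and device |
| JP4017354B2 (ja) * | 2000-04-17 | 2007-12-05 | 富士通株式会社 | Information classification device and information classification program |
| JP2003242173A (ja) * | 2001-12-13 | 2003-08-29 | Sony Corp | Information processing device and method, recording medium, and program |
| JP4150208B2 (ja) * | 2002-05-02 | 2008-09-17 | 日本放送協会 | Related term presentation device and related term presentation program |
| CN1910654B (zh) * | 2004-01-20 | 2012-01-25 | 皇家飞利浦电子股份有限公司 | Method and system for determining the topic of a conversation and obtaining and presenting related content |
| JP5181533B2 (ja) * | 2007-05-21 | 2013-04-10 | トヨタ自動車株式会社 | Spoken dialogue device |
| JP4637969B1 (ja) * | 2009-12-31 | 2011-02-23 | 株式会社Taggy | Method for appropriately grasping the main idea of a web page and the user's preferences and recommending the best information in real time |
| JP5551985B2 (ja) * | 2010-07-05 | 2014-07-16 | パイオニア株式会社 | Information retrieval device and information retrieval method |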
-
2013
- 2013-12-04 WO PCT/JP2013/082623 patent/WO2014103645A1/fr not_active Ceased
- 2013-12-04 JP JP2014554277A patent/JP6529761B2/ja active Active
-
2018
- 2018-11-30 JP JP2018225638A patent/JP6759308B2/ja active Active
- 2018-11-30 JP JP2018225637A patent/JP2019067433A/ja active Pending
Cited By (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2017010207A (ja) * | 2015-06-19 | 2017-01-12 | 日本電信電話株式会社 | Topic continuation desire determination device, method, and program |
| JP2019522839A (ja) * | 2016-05-17 | 2019-08-15 | グーグル エルエルシー | Generating output for presentation in response to user interface input, where the input and/or the output include chatspeak |
| CN106354815A (zh) * | 2016-08-30 | 2017-01-25 | 北京光年无限科技有限公司 | Topic processing method in a dialogue system |
| CN106354815B (zh) * | 2016-08-30 | 2019-12-24 | 北京光年无限科技有限公司 | Topic processing method in a dialogue system |
| JP2018132754A (ja) * | 2017-02-13 | 2018-08-23 | 株式会社東芝 | Dialogue system, dialogue method, and method of adapting a dialogue system |
| JP2018110026A (ja) * | 2018-03-06 | 2018-07-12 | ヤフー株式会社 | Response generation device, response generation method, and response generation program |
| WO2019188981A1 (fr) * | 2018-03-29 | 2019-10-03 | 株式会社アドバンスト・メディア | Information processing system, device, and method, server, and program |
| WO2019188982A1 (fr) * | 2018-03-29 | 2019-10-03 | 株式会社アドバンスト・メディア | Information processing system, information processing device, server, information processing method, and program |
| JP2019175290A (ja) * | 2018-03-29 | 2019-10-10 | 株式会社アドバンスト・メディア | Information processing system, information processing device, server, information processing method, and program |
| JP2019174732A (ja) * | 2018-03-29 | 2019-10-10 | 株式会社アドバンスト・メディア | Information processing system, information processing device, server, information processing method, and program |
| US11675979B2 (en) | 2018-11-30 | 2023-06-13 | Fujitsu Limited | Interaction control system and interaction control method using machine learning model |
| JP2022132691A (ja) * | 2020-12-08 | 2022-09-09 | Nota株式会社 | Information processing device, information processing method, and program |
| US12608409B2 (en) | 2020-12-08 | 2026-04-21 | Helpfeel Inc. | Information processing device, information processing method, and program |
Also Published As
| Publication number | Publication date |
|---|---|
| JPWO2014103645A1 (ja) | 2017-01-12 |
| JP6529761B2 (ja) | 2019-06-12 |
| JP2019067433A (ja) | 2019-04-25 |
| JP6759308B2 (ja) | 2020-09-23 |
| JP2019053767A (ja) | 2019-04-04 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 13868184; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2014554277; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 13868184; Country of ref document: EP; Kind code of ref document: A1 |