JPS6286429A

JPS6286429A - Natural language processing system

Info

Publication number: JPS6286429A
Application number: JP60226075A
Authority: JP
Inventors: Toshiaki Yoshino; 利明吉野
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1985-10-11
Filing date: 1985-10-11
Publication date: 1987-04-20
Anticipated expiration: 2010-05-15
Also published as: JPH0743716B2

Abstract

PURPOSE: To determine the dependency (modification) relations of a verb even when the case information in the word dictionary is incomplete, by providing a means that treats the attribute relations of the world model as cases the verb can take, in addition to the cases described in the word dictionary. CONSTITUTION: The system checks whether the dictionary contains case information for the verb. If it does, that information is compared with the input sentence, and if the two match, dependency processing is applied (a parse tree is formed). If the dictionary has no case information for the verb, or the information does not match the particle information of the input sentence, the world model of the verb is referenced and checked for a link with the other word. If a link exists, the parse tree is formed; if not, the dependency is rejected. The parse tree is generated by syntactic/semantic analysis, and context processing takes the preceding input sentence into account. For input that requires strict interpretation, such as database updates, verb case processing using the world model is not applied, and only phrases that match the case information of the word dictionary are accepted.

Description

Detailed Description of the Invention

[Field of industrial application]

The present invention relates to a natural language processing system, and in particular to the verb case processing needed when analyzing an input sentence in a system that accepts natural language for database retrieval and generates the retrieval commands internally.

[Conventional technology]

A database is normally searched by entering commands that specify the database name, the names of the major and minor classifications used in that database, and so on. With this approach the searcher must be thoroughly familiar with the structure of the database and able to compose the required commands, which limits who can use it. If the input is written in natural language and the system analyzes it, generates the commands, and searches the database, searching becomes much easier.

An input sentence for a natural language database search is, for example, "What is the product that Fujiya sold 200 units of for 200 yen on June 17?" Given such an input sentence, the system performs morphological analysis using a word dictionary, performs syntactic/semantic analysis and context processing using a world model, and generates a command.

The world model describes the attribute relations between a specific word used in this system and various other words, and it is useful for syntactic/semantic analysis and context processing. The word dictionary records the various words, the corresponding class of each word and, if the word is a verb, the words that attach to that verb together with their particles (case information).

[Problem that the invention seeks to solve]

Because the word dictionary also describes case information for verbs, verb case processing (the processing that combines a verb and the noun phrases modifying it into a single verb phrase) can simply refer to the word dictionary. This, however, requires that all case information for every verb be described in the dictionary, and any omission may make analysis impossible.

The world model, meanwhile, expresses the attribute relations of words and is therefore a strong source of case information. The present invention accordingly also uses the world model in verb case processing, taking the attribute information of the world model as the default value for a verb's case information.

[Means for solving problems]

The present invention is a system that processes input natural language into a parse tree by using a word dictionary, which describes various words, the corresponding class name of each word and, if the word is a verb, the words that attach to the verb and their particles, and a world model, which groups a specific word with the various words linked to it. It is characterized by having means which, in the case processing that combines a verb and the noun phrases modifying it into a single verb phrase, examines the attribute relations of the world model in addition to the cases described in the word dictionary and regards those attribute relations as cases the verb can take.

[Effect]

If the attribute information of the world model is used in verb case processing, the dependency relations of a verb can be determined even when the case information in the word dictionary is insufficient or not described accurately, which is highly effective for natural language processing.

[Example]

Referring to the drawings, FIG. 1 shows an overview of the entire system. Reference numeral 1 denotes a terminal device comprising a CRT display 1a, a keyboard 1b, and a floppy disk 1c.

11 is the word dictionary, 13 is the world model, and 15 is the database. 2 is the interactor, 3 is morphological analysis, 4 is syntactic/semantic analysis, 5 is context processing, 6 is paraphrase generation, 7 is command generation, 8 is data retrieval, and 9 is response generation; these parts are software. 10 is an editor for the word dictionary 11, the world model 13, and the modeling system 12, and 14 is the database management system. The modeling system 12 holds system-defined classes, integers, and the like, and the user-defined database sits beneath it.

As shown in FIG. 2, the word dictionary 11 records words such as "sell" and "product" together with their model information. For the word "sell", the corresponding class is "sale", and the words that attach to it are quantity, store, and product: quantity takes no particle (nil); store takes the particle "ga" or "wa" (as in "the store ga" or "the store wa"); and product takes the particle "wo" (as in "the product wo"). For the word "product", the corresponding class is "product". Case information such as (store: ga/wa) can be written in various ways, for example directly or categorized into symbols. The world model of "sale" is as shown in FIG. 3. It shows that "sale" is linked with "unit sales price", as in "sold for 200 yen"; with "store", as in "Fujiya sold"; with "product", as in "sold mechanical pencils"; and with "sale date", as in "sold on June 17".

These items (store, unit sales price, and so on) are called instances.
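The dictionary entries and the world model described here could be held in memory roughly as follows. This is a minimal sketch, not the storage format of the patent: the names `DictEntry`, `WORD_DICTIONARY`, `WORLD_MODEL`, and `attributes_missing_from_dictionary`, the English keys, and the romanized particles are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DictEntry:
    """One word-dictionary entry: class name plus optional case information."""
    word_class: str
    # For verbs: attached word class -> particles it may carry.
    # An empty tuple stands for "no particle" (nil in the text).
    cases: dict = field(default_factory=dict)


# Hypothetical encoding of the two entries named in the text (FIG. 2).
WORD_DICTIONARY = {
    "sell": DictEntry("sale", {
        "quantity": (),
        "store": ("ga", "wa"),
        "product": ("wo",),
    }),
    "product": DictEntry("product"),
}

# Hypothetical encoding of the world model of "sale" (FIG. 3):
# class name -> attributes (instances) linked to it.
WORLD_MODEL = {
    "sale": {"unit_price", "store", "product", "sale_date"},
}


def attributes_missing_from_dictionary(verb):
    """Attributes the world model links to the verb but the dictionary omits."""
    entry = WORD_DICTIONARY[verb]
    return WORLD_MODEL.get(entry.word_class, set()) - set(entry.cases)
```

With this encoding, the attributes that only the world model knows about for "sell" are the unit price and the sale date, which is the gap the invention sets out to cover.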

Suppose now that the sentence "What is the product that Fujiya sold 200 units of for 200 yen on June 17?" is entered from the terminal keyboard. Morphological analysis breaks the input sentence into words while referring to the word dictionary, and attaches model information to each word as follows.

Word (with particle)    Model information
Fujiya ga               (store)
200 yen de              (price)
June 17 ni              (date)
200 units               (quantity)
sold                    (sale)
product wa ?            (product)

In syntactic/semantic analysis, phrase integration is performed: the segmented input sentence is grouped into phrases (bunsetsu).
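The segmentation and labeling step can be illustrated with a toy longest-match analyzer run on the original Japanese input sentence. This is only a sketch under stated assumptions: the patent does not describe its morphological analyzer, and `LEXICON`, `PARTICLES`, `NUMERIC_PATTERNS`, `next_token`, and `analyze` are hypothetical names with hand-picked contents.

```python
import re

# Hypothetical mini-lexicon: surface form -> model information (class).
LEXICON = {"富士屋": "store", "販売した": "sale", "商品": "product"}
PARTICLES = ("が", "は", "を", "で", "に")
# Hypothetical patterns for numeric expressions, which a finite lexicon cannot list.
NUMERIC_PATTERNS = [
    (re.compile(r"\d+月\d+日"), "date"),
    (re.compile(r"\d+円"), "price"),
    (re.compile(r"\d+本"), "quantity"),
]


def next_token(rest):
    """Return (surface, label) for the token at the start of `rest`."""
    for word in sorted(LEXICON, key=len, reverse=True):  # longest match first
        if rest.startswith(word):
            return word, LEXICON[word]
    for pattern, label in NUMERIC_PATTERNS:
        found = pattern.match(rest)
        if found:
            return found.group(), label
    for particle in PARTICLES:
        if rest.startswith(particle):
            return particle, "particle"
    return rest[0], "unknown"  # fall back to a single unknown character


def analyze(sentence):
    """Split the sentence into (surface, label) pairs from left to right."""
    tokens, position = [], 0
    while position < len(sentence):
        surface, label = next_token(sentence[position:])
        tokens.append((surface, label))
        position += len(surface)
    return tokens
```

Calling `analyze("富士屋が200円で6月17日に200本販売した商品は？")` yields one labeled token per word or particle, in the same order as the listing in the text.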

Next, dependency analysis (parse tree generation) is performed.

FIG. 4 shows the procedure. First, the system checks whether the dictionary contains case information for the verb. If it does, that information is compared with the input sentence, and if the two match, dependency processing is performed on that basis (a parse tree is built). If the dictionary has no case information for the verb, or has some that does not match the particle information of the input sentence, the world model of the verb is referenced and checked for a link with the other word. If a matching link exists, the parse tree is built from it; if not, the dependency is rejected. Building the parse tree is done in syntactic/semantic analysis, and building it with the preceding input sentence taken into account is done in context processing.
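The decision procedure of FIG. 4 can be sketched as a single function. This is an illustrative reading of the text, not the patent's implementation: the data tables, the name `resolve_case`, the English attribute names, and the romanized particles are assumptions, and `None` is used here to stand for "no particle".

```python
# Hypothetical data, modelled on the "sell" example in the text.
DICTIONARY_CASES = {
    "sell": {"quantity": (None,), "store": ("ga", "wa"), "product": ("wo",)},
}
VERB_CLASS = {"sell": "sale"}
WORLD_MODEL = {
    "sale": {"unit_price", "store", "product", "sale_date", "quantity"},
}


def resolve_case(verb, attribute, particle):
    """Return 'dictionary', 'world_model', or None (dependency rejected)."""
    cases = DICTIONARY_CASES.get(verb)
    # Step 1: the dictionary has case information and it matches the input.
    if cases is not None and particle in cases.get(attribute, ()):
        return "dictionary"
    # Step 2: otherwise look for an attribute link in the verb's world model.
    if attribute in WORLD_MODEL.get(VERB_CLASS.get(verb), set()):
        return "world_model"
    # Step 3: no link anywhere, so the dependency is not possible.
    return None
```

For example, `resolve_case("sell", "store", "ga")` is settled by the dictionary, while `resolve_case("sell", "sale_date", "ni")` falls through to the world model because the dictionary has no date case for "sell".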

FIG. 5 shows the state after verb clause processing, with the parse tree built. When the input sentence "What is the product that Fujiya sold 200 units of for 200 yen on June 17?" is entered, morphological analysis with reference to the word dictionary segments it as shown by the dotted lines. Using the model information obtained from the dictionary and the model information obtained by referring to the world model, syntactic/semantic analysis and context processing build the tree shown in the figure and make the relationships among its parts explicit. The mark "モ" in the figure indicates that the phrases "for 200 yen" and "200 units" had no matching model information in the dictionary, so their relation was found through the world model of "sale". A sub-model is a world model with specific words inserted: inserting "sold" into sale of FIG. 2, "Fujiya ga" into store, "for 200 yen" into unit sales price, "on June 17" into sale date, and "200 units" into sales quantity yields the first sub-model in FIG. 5. If processing reaches a dead end while the parse tree is being built, the system paraphrases, shows the paraphrased words on the display, and asks the searcher whether they are acceptable. When OK is entered, processing continues on that basis. This is the paraphrase generation routine.
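The idea of a sub-model as a world model with specific words inserted can be sketched as slot filling. The slot names, the tuple `SALE_SLOTS`, and the function `build_submodel` are hypothetical; the patent shows the sub-model only as a figure.

```python
# Hypothetical slot names for the world model of "sale".
SALE_SLOTS = ("sale", "store", "unit_price", "sale_date", "quantity")


def build_submodel(slots, phrases):
    """Copy the world-model slots and insert the given phrases.

    `phrases` is an iterable of (slot, text) pairs. Slots that receive no
    phrase stay None, and a phrase for an unknown slot is an error.
    """
    submodel = dict.fromkeys(slots)
    for slot, text in phrases:
        if slot not in submodel:
            raise KeyError(f"world model has no slot {slot!r}")
        submodel[slot] = text
    return submodel


# Phrases from the example sentence, romanized for readability.
FIRST_SUBMODEL = build_submodel(SALE_SLOTS, [
    ("sale", "sold"),
    ("store", "Fujiya ga"),
    ("unit_price", "200 yen de"),
    ("sale_date", "June 17 ni"),
    ("quantity", "200 units"),
])
```

A slot left at `None` marks information the sentence did not supply, which is the kind of gap that context processing or a paraphrase prompt would have to fill.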

Once the parse tree has been built in this way, command generation begins. Suppose the database contains a product table and a sales table (shown as figures in the original). The generated command is then as follows.

    get    product name
    from   product table
    where  product table.price = "200", product name = sales table.product name
           sales table.store = "Fujiya"  date = "June 17"  sales volume = 200

The parse tree corresponds to the structure of the database, so once a parse tree like that of FIG. 5 has been obtained, the above command can be created from it. Once the command is obtained, the tables are searched with it, and when the desired product name is found it is shown on the display.
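Rendering a command of this get/from/where shape from a list of conditions can be sketched as below. The function `generate_command`, the underscore-style identifiers, and the exact output syntax are assumptions; the patent does not define the command grammar beyond the single example, and this sketch handles only equality against literal values, not the join condition between the two tables.

```python
def generate_command(target, tables, conditions):
    """Render a get/from/where command string.

    `conditions` is a list of (column, value) pairs. String values are
    quoted and numbers are written as they are.
    """
    clauses = []
    for column, value in conditions:
        rendered = f'"{value}"' if isinstance(value, str) else str(value)
        clauses.append(f"{column} = {rendered}")
    return (
        f"get {target} "
        f"from {', '.join(tables)} "
        f"where {', '.join(clauses)}"
    )


EXAMPLE_COMMAND = generate_command(
    "product_name",
    ["product_table", "sales_table"],
    [
        ("product_table.price", "200"),
        ("sales_table.store", "Fujiya"),
        ("sales_table.date", "June 17"),
        ("sales_table.volume", 200),
    ],
)
```

The conditions would in practice be read off the filled slots of the parse tree, which is why the text stresses that the tree mirrors the database structure.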

[Effect of the invention]

As explained above, according to the present invention verb case processing proceeds using the attribute relations of the world model, so omissions in the word dictionary can be covered and case information need not be described rigorously in the word dictionary. For example, in the word dictionary of FIG. 2 the word "sell" has only three kinds of case information: quantity (nil), store (ga/wa), and product (wo). No case information is therefore available for a phrase such as "on month X day X" attached to "sold", but the attribute information can be obtained by examining the world model of sale. Of course, the world model does not supply the particle itself, which remains unknown, but once the attribute relation is known the parse tree can be built. An input sentence may also contain a particle that is not in the word dictionary, such as "no" in "that the store sells"; here too the attribute relation can be found from the world model and used for processing.

By changing the criterion for when a dependency is allowed in the algorithm of FIG. 4, the present invention can be applied to a variety of applications. Input sentences that update the database require strict semantic interpretation, so particle errors are not tolerated for them. That is, verb case processing using the world model is not performed, and only phrases that match the case information of the word dictionary are allowed to attach.
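Changing the acceptance criterion for update sentences can be sketched as a flag that switches off the world-model fallback. The flag name `strict`, the function `dependency_allowed`, and the data tables are assumptions for illustration; for brevity the world model is keyed directly by the verb here.

```python
# Hypothetical data for the verb "sell", with romanized particles.
DICTIONARY_CASES = {"sell": {"store": ("ga", "wa"), "product": ("wo",)}}
WORLD_MODEL = {"sell": {"store", "product", "sale_date", "unit_price"}}


def dependency_allowed(verb, attribute, particle, strict=False):
    """Decide whether a phrase may attach to the verb.

    strict=False models retrieval: the world model may supply a missing case.
    strict=True models database updates: only the dictionary is trusted.
    """
    if particle in DICTIONARY_CASES.get(verb, {}).get(attribute, ()):
        return True
    if strict:
        return False  # a particle mismatch is not tolerated
    return attribute in WORLD_MODEL.get(verb, set())
```

A phrase such as "store no" is accepted for retrieval through the world model but rejected under `strict=True`, since "no" is not among the particles the dictionary lists for the store case.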

[Brief explanation of drawings]

FIG. 1 is a block diagram showing an embodiment of the present invention, FIG. 2 is an explanatory diagram of the word dictionary, FIG. 3 is an explanatory diagram of the world model, FIG. 4 is an explanatory diagram of dependency processing, and FIG. 5 is an explanatory diagram of the parse tree.

Claims

[Claims] A natural language processing system that processes input natural language into a parse tree by using a word dictionary, which describes various words, the corresponding class name of each word and, if the word is a verb, the words that attach to the verb and their particles, and a world model, which groups a specific word with the various words linked to it, the system being characterized by having means which, in the case processing that combines a verb and the noun phrases modifying it into a single verb phrase, examines the attribute relations of the world model in addition to the cases described in the word dictionary and regards those attribute relations as cases the verb can take.