JPH02297193A - Dictionary consulting device

Info

Publication number: JPH02297193A
Application number: JP1042380A
Authority: JP
Inventors: Yasushi Tamakoshi; 玉越　靖司
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1989-02-22
Filing date: 1989-02-22
Publication date: 1990-12-07

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

Detailed Description of the Invention

Field of Industrial Application

The present invention relates to a dictionary lookup device used when text or sentences written in a natural language are analyzed or generated by machine.

Prior Art

In a natural language processing device that processes natural-language text and sentences by machine, dictionary lookup is performed: vocabulary information stored in advance is consulted as needed to extract the required vocabulary information.

Conventional dictionary lookup devices apply a file management technique based on inverted files, such as B-trees. They search separately for each of one or more keywords and then present as the search result either the merged results or the results narrowed down successively by each keyword.

Problems to Be Solved by the Invention

When a conventional dictionary lookup device based on inverted files such as B-trees searches the dictionary by one or more keywords, as a natural language processing device requires, it searches for each keyword separately and takes as the search result either the merged results or the results narrowed down successively by each keyword. Consequently, the storage area needed for the inverted files is large, updating the inverted files whenever the dictionary contents change is laborious, and it is difficult to size the storage area for the inverted files flexibly to suit the computer environment on which the device is implemented.

The present invention solves the above problems, and its object is to provide a dictionary lookup device that requires little storage area and whose dictionary contents are easy to modify.

Means for Solving the Problems

To solve the above problems, the technical means of the present invention comprise: a dictionary information storage device that stores natural-language vocabulary information; a superimposed code storage device that stores superimposed codes of the vocabulary information held in the dictionary information storage device; a search device that searches the superimposed codes in the superimposed code storage device using the superimposed code of the question information for a dictionary lookup; and a collation device that collates the search results against the vocabulary information in the dictionary information storage device.

Operation

The above configuration operates as follows. The one or more keywords belonging to each item of vocabulary information stored in the dictionary information storage device are encoded using a hash function, and their superimposed code is created and stored in the superimposed code storage device. The question information for a dictionary lookup is superimposed-coded in the same way. The search device uses the superimposed code of the question information to check the data stored in the superimposed code storage device and obtains search results called drops, which are candidates for the correct result for the question information. The drops produced by the search device always include the set of vocabulary information that correctly answers the question, but they may also include wrongly retrieved vocabulary information, called false drops. To remove the false drops from the drops, the collation device collates the question against the vocabulary information in the dictionary information storage device that the drops point to. The resulting correct search results are then displayed on the output device.
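The drop condition described above can be read as a bit-subset test: an item is a drop when every bit set in the question code is also set in the item's code. The following is a minimal sketch of that reading, not the patent's own implementation; the function name `is_drop` is hypothetical, and the bit patterns are the ones quoted for FIG. 3 and FIG. 4 in the embodiment.

```python
def is_drop(question_code: int, dictionary_code: int) -> bool:
    """Return True when every 1-bit of the question code is also set
    in the dictionary code (hypothetical helper, for illustration only)."""
    return (question_code & dictionary_code) == question_code


# Bit patterns quoted in the embodiment (FIG. 3 and FIG. 4).
question = 0b01011110   # "proper noun" OR "Japan"
japan = 0b01011110
america = 0b11011010
britain = 0b11011110

print(is_drop(question, japan))    # True  (correct result)
print(is_drop(question, america))  # False (not retrieved)
print(is_drop(question, britain))  # True  (a false drop, removed later by collation)
```

Because OR-ing keyword codewords can only add bits, an item that really contains all the question keywords always passes this test, which is why drops never miss a correct result but can contain false drops.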

Embodiment

The present invention is described in more detail below with reference to the drawings.

FIG. 1 is a conceptual diagram of a dictionary lookup device according to one embodiment of the present invention. Natural-language vocabulary information stored in advance in the dictionary information storage device 1 is converted by the superimposed code creation device 2 into dictionary superimposed codes, which are stored in the superimposed code storage device 3. A question given to the input device 4 is decomposed by the question processing device 5, and the superimposed code creation device 2 creates a question superimposed code from it. The search device 6 matches this question superimposed code against the dictionary superimposed codes, and the collation device 7 collates the question against the vocabulary information that the search results point to. The vocabulary information that is the correct lookup result for the question is then displayed by the output device 8.

FIG. 2 shows the configuration of the superimposed code creation device 2. Natural-language vocabulary information from the dictionary information storage device 1 is supplied to the bcw creation device 21 of the superimposed code creation device 2. An item of vocabulary information R has one or more keywords, Ri in number. For example, the vocabulary item "Japan" has two keywords, "Japan" and "proper noun". For each of the input keywords, the bcw creation device 21 uses a hash function that sets some of the b bits to "1" and leaves the rest "0", producing a bit string of length b called a bcw (binary code word). In the illustrated example, the 8-bit string 01010100 is produced for the keyword "Japan" and 00011010 for the keyword "proper noun". The bcw superposition device n takes the bitwise logical OR of these Ri bcws and produces a bit string S(Ri) of length b as the dictionary superimposed code of the vocabulary information. In the illustrated example, the dictionary superimposed code 01011110 is created for the vocabulary item "Japan". This operation is performed for every item of vocabulary information in the dictionary information storage device 1, so that a dictionary superimposed code is created for each item. The F dictionary superimposed codes thus created are stored in the superimposed code storage device 3 as b bit strings of length F, as explained with FIG. 3. Here F is the number of items of vocabulary information in the dictionary information storage device 1. The size of the superimposed codes can be designed freely through the choice of b.
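The encoding step for FIG. 2 can be sketched as follows. The patent does not say which hash function is used, so `make_bcw`, its parameter `k`, and the use of SHA-256 digest bytes to pick bit positions are illustrative assumptions only. The last lines do not use `make_bcw`; they OR the two codewords quoted from the figure.

```python
import hashlib


def make_bcw(keyword: str, b: int = 8, k: int = 3) -> int:
    """Return a b-bit codeword with at most k bits set.

    Hypothetical scheme: the first k digest bytes each choose one bit
    position. The source only requires that a hash function set some bits.
    """
    digest = hashlib.sha256(keyword.encode("utf-8")).digest()
    bcw = 0
    for byte in digest[:k]:
        bcw |= 1 << (byte % b)
    return bcw


def superimpose(bcws) -> int:
    """Bitwise OR of all codewords belonging to one vocabulary item."""
    code = 0
    for w in bcws:
        code |= w
    return code


# Codewords quoted in the illustrated example (from the figure, not from make_bcw).
japan = 0b01010100
proper_noun = 0b00011010
dictionary_code = superimpose([japan, proper_noun])
print(format(dictionary_code, "08b"))  # 01011110
```

With F vocabulary items, the stored codes occupy b × F bits in total, so a smaller b saves storage at the cost of more false drops.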

FIG. 3 shows how the F dictionary superimposed codes created by the superimposed code creation device 2 are stored in the superimposed code storage device 3. As described above, the superimposed code creation device 2 creates a superimposed code for each of the F vocabulary items stored in the dictionary information storage device 1. The example in FIG. 3 shows three superimposed codes, 01011110, 11011010 and 11011110, created for the three vocabulary items (1) "Japan", (2) "America" and (3) "Britain" respectively. When these F superimposed codes are stored in the superimposed code storage device 3, they are stored as bit strings with rows and columns interchanged. That is, as shown in FIG. 3, the superimposed codes output by the bcw superposition device n are stored in the superimposed code storage device 3 in a form in which their rows and columns have been interchanged. Thus the F superimposed codes of b bits each created by the superimposed code creation device 2 are stored in the superimposed code storage device 3 as b bit strings of F bits each, that is, as superimposed codes arranged in b rows and F columns. In this layout, vocabulary items (1), (2) and (3) are stored as vertical columns.
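The row and column interchange described for FIG. 3 can be sketched as below, using the three codes quoted in the text. The function name `transpose_codes` and the list-of-lists representation are assumptions made for illustration; a real device would presumably pack the rows into machine words.

```python
def transpose_codes(codes, b):
    """Turn F codes of b bits each into b rows of F bits each.

    Row i (0 = leftmost bit) holds bit i of every vocabulary item,
    one column per item.
    """
    rows = []
    for i in range(b):
        shift = b - 1 - i
        rows.append([(code >> shift) & 1 for code in codes])
    return rows


codes = [0b01011110, 0b11011010, 0b11011110]  # (1) Japan, (2) America, (3) Britain
rows = transpose_codes(codes, b=8)
for row in rows:
    print("".join(str(bit) for bit in row))
# Prints, one row per line: 011 111 000 111 111 101 111 000
```

Storing the data this way means that a search only has to read the rows selected by the question code, rather than every item's full code.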

Meanwhile, a question given from the input device 4 is superimposed-coded in the same way, as explained with FIG. 4. Suppose the command "Look up the proper noun Japan in the dictionary." is input to the input device 4.

The question obtained from the input device 4 is decomposed by the question processing device 5 into one or more keywords. The above command is decomposed into the two keywords "proper noun" and "Japan". These keywords are processed by the superimposed code creation device 2, which uses the same hash function as for creating and storing the dictionary superimposed codes: the bcw creation device 21 creates the bcws "proper noun: 00011010" and "Japan: 01010100", and the bcw superposition device n then creates the question superimposed code "Q: 01011110". The question is thus superimposed-coded.

This question superimposed code Q is supplied to the search device 6. The search device 6 matches the question superimposed code against the data in the superimposed code storage device and obtains the search results called drops. Specifically, for every bit position i at which the question superimposed code has a "1" (for Q = 01011110, the second, fourth, fifth, sixth and seventh bits from the left), the i-th row of the data stored in the superimposed code storage device 3 is selected, and the drops are obtained simply by taking the bitwise logical AND of these rows.
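The row-wise AND described above can be sketched as follows. The rows are the b × F layout that results from transposing the three example codes 01011110, 11011010 and 11011110; the function name `search_drops` and the string form of the question code are assumptions for illustration.

```python
def search_drops(rows, question_bits):
    """AND together the rows selected by the 1-bits of the question code.

    rows: b lists of F bits (row i holds bit i of every vocabulary item).
    question_bits: string of b '0'/'1' characters, leftmost bit first.
    Returns F bits; a 1 in column j marks vocabulary item j as a drop.
    """
    result = [1] * len(rows[0])
    for i, bit in enumerate(question_bits):
        if bit == "1":
            result = [r & s for r, s in zip(result, rows[i])]
    return result


# Transposed storage for (1) Japan, (2) America, (3) Britain.
rows = [
    [0, 1, 1],
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
]
drops = search_drops(rows, "01011110")
print(drops)  # [1, 0, 1]
```

Only five of the eight rows are read for this question, and each AND covers all F items at once.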

In the bit string resulting from this logical AND, the vocabulary information corresponding to each bit j that is set to "1" is a search result of the search device 6. In the illustrated example, vocabulary items (1) and (3) are obtained as search results. The drops produced by the search device 6 always include the vocabulary information that correctly answers the question, but they may also include wrongly retrieved vocabulary information, called false drops, so the collation device 7 removes the false drops from the drops. This operation is explained with FIG. 5. The collation device 7 takes the vocabulary information in the dictionary information storage device 1 indicated by the bits set to "1" in the drops received from the search device 6 (in the illustrated example, the two items (1) "Japan" and (3) "Britain"), collates it against the question Q, and outputs the matching items (in the illustrated example, (1) "Japan") to the output device 8 for display as the correct search result.
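The collation step can be sketched as a direct keyword comparison on the drops only. The record layout, the function name `collate`, and the keywords listed for "America" and "Britain" are assumptions for illustration; the source states the keywords only for "Japan".

```python
def collate(drops, vocabulary, question_keywords):
    """Keep only the drops whose stored keywords contain every question keyword."""
    wanted = set(question_keywords)
    return [
        entry["word"]
        for flag, entry in zip(drops, vocabulary)
        if flag == 1 and wanted <= set(entry["keywords"])
    ]


# Hypothetical dictionary contents; only the "Japan" keywords come from the source.
vocabulary = [
    {"word": "Japan", "keywords": ["Japan", "proper noun"]},
    {"word": "America", "keywords": ["America", "proper noun"]},
    {"word": "Britain", "keywords": ["Britain", "proper noun"]},
]
result = collate([1, 0, 1], vocabulary, ["proper noun", "Japan"])
print(result)  # ['Japan']
```

Since the exact comparison runs only on the drops, its cost depends on the number of candidates rather than on the size of the whole dictionary.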

Effects of the Invention

As described above, the present invention superimposed-codes the vocabulary information in the dictionary information storage device and stores the result in the superimposed code storage device; performs a partial-match search between the dictionary superimposed codes, which represent the contents of the dictionary information storage device, and the question superimposed code obtained by converting the keywords extracted by the question processing device; and collates the dictionary information indicated by the search results against the keywords. The data in the superimposed code storage device is far smaller than the inverted files of a conventional dictionary lookup device and is easy to maintain. Furthermore, because the size of the data in the superimposed code storage device can be changed freely, a dictionary lookup function can be provided to natural language processing devices on a wide variety of computers.

In addition, because processing is based on superimposed codes, drops can be retrieved at high speed almost independently of the number of vocabulary items in the dictionary information storage device.

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing the concept of a dictionary lookup device according to one embodiment of the present invention; FIG. 2 is a block diagram showing the concept of the superimposed code creation section in the configuration of FIG. 1; FIG. 3 is a block diagram showing the concept of the superimposed code storage section in the configuration of FIG. 1; FIG. 4 is a block diagram showing the concept of the search section in the configuration of FIG. 1; and FIG. 5 is a block diagram showing the concept of the collation section in the configuration of FIG. 1.

1: dictionary information storage device; 2: superimposed code creation device; 3: superimposed code storage device; 4: input device; 5: question processing device; 6: search device; 7: collation device; 8: output device; 21: bcw creation device; n: bcw superposition device.

Agent: Patent attorney Shigetaka Awano and one other

Claims

(1) A dictionary lookup device characterized by comprising: a dictionary information storage device that stores natural-language vocabulary information; an input device for inputting a question for dictionary lookup; a superimposed code creation device that, for each item of vocabulary information and for the question, encodes each keyword and then forms a superimposed code; a search device that performs partial matching between the superimposed codes of the vocabulary information and the superimposed code of the question; and a collation device that collates the keywords of the question against the vocabulary information corresponding to the search results of the search device.

(2) The dictionary lookup device according to claim 1, further comprising a superimposed code storage device that stores the superimposed codes of the vocabulary information with rows and columns interchanged, wherein the superimposed codes stored in the superimposed code storage device are partially matched against the superimposed code of the question.