JPH0512331A - Valence structure analysis method

Info

Publication number: JPH0512331A
Application number: JP3186797A
Authority: JP
Inventors: Junko Komatsu; 順子小松
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1991-07-01
Filing date: 1991-07-01
Publication date: 1993-01-22

Abstract

(57) [Abstract]

[Purpose] To create a practical valence dictionary and classification code dictionary easily and accurately.

[Constitution] A first valence frame creation unit 2 groups dependency cases that share a head word according to their surface similarity and creates a primary valence dictionary 3. A first classification code assigning unit 4 refers to the primary valence dictionary 3, assigns the same classification code to words belonging to the same slot of the same valence frame, and creates a primary word classification code dictionary 5. A second valence frame creation unit 6 then groups the dependency cases that share a head word, considering not only surface similarity but also the similarity between the primary word classification codes attached to the dependent words, and creates a secondary valence dictionary 7. A second classification code assigning unit 8 refers to the secondary valence dictionary 7 and creates a secondary word classification code dictionary 9 by the same procedure as the first classification code assigning unit 4.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a valence structure analysis method used in various kinds of natural language analysis, such as kana-kanji conversion in Japanese word processors and linguistic analysis in text-to-speech synthesis systems.

[0002]

[Prior Art] In natural language analysis, the analysis result is often not uniquely determined. For example, kana-kanji conversion in a Japanese word processor must deal with homophones (words with the same reading but different written forms), and linguistic analysis in a text-to-speech synthesis system must deal with homographs (words with the same written form but different readings), so the result of the analysis may not be uniquely determined.

[0003] To resolve such ambiguity, an analysis method based on valence patterns is known, as disclosed, for example, in "Natural Language Processing 62-6, pages 39 to 44", published July 24, 1987. In this method, the relationship between a predicate such as a verb and the case elements it governs is described as a valence pattern, and analysis is performed on the basis of these valence patterns.

[0004] In this kind of analysis method, the valence structure of each predicate is formalized in a suitable framework and recorded in a valence dictionary as a valence pattern, and a semantic classification is created for the nouns that each case particle can take.

[0005]

[Problem to Be Solved by the Invention] In the conventional analysis method described above, the valence dictionary (that is, the valence patterns) and the semantic classification are usually created by hand. However, natural language analysis involves a large vocabulary, and the same word takes on different meanings depending on context and field of use, so it is very difficult to create by hand a practical valence dictionary and a semantic classification consistent with it. Attempts have also been made to extract valence frames automatically from dependency cases, but because the semantic classification of nouns used in those attempts is assigned manually in a top-down manner, appropriate valence frames are not always obtained. The conventional approach therefore has the drawback that a practical valence dictionary and a semantic classification consistent with it cannot be created easily and accurately.

[0006] An object of the present invention is to provide a valence structure analysis method capable of easily and accurately creating a practical valence dictionary and a semantic classification consistent with it.

[0007]

[Means for Solving the Problem] To achieve the above object, the present invention comprises valence frame creating means for grouping dependency cases that share a head word and creating valence frames, and classification code assigning means for attaching the same classification code to words belonging to the same slot of the same valence frame and creating a word classification code dictionary. The valence frame creating means groups the dependency cases that share a head word by taking into account not only their surface similarity but also the similarity between the classification codes attached to the dependent words, and thereby creates the final valence frames.

[0008] The valence frame creating means is further characterized in that it successively updates the valence frames by referring to the word classification code dictionary that the classification code assigning means created from the valence frames produced by the valence frame creating means itself.

[0009] The classification code assigning means is further characterized in that, while attaching the same classification code to words belonging to the same slot of the same valence frame, it also adds to each word a semantic classification code prepared manually in advance, and thereby creates the word classification code dictionary.

[0010]

[Operation] In the present invention, the final valence frames are created from dependency cases that share a head word by taking into account not only their surface similarity but also the similarity between the classification codes attached to the dependent words.

[0011]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram of an embodiment of the present invention. This embodiment comprises the following components. A dependency case database 1 stores dependency cases. A first valence frame creation unit 2 takes the dependency cases stored in the dependency case database 1, groups those that share a head word according to their surface similarity, creates the groups as primary valence frames, and records them in a primary valence dictionary 3. A first classification code assigning unit 4 refers to the primary valence dictionary 3 created by the first valence frame creation unit 2, assigns the same classification code to words belonging to the same slot of the same valence frame, and creates a primary word classification code dictionary 5. A second valence frame creation unit 6 refers to the primary word classification code dictionary 5 in addition to the dependency case database 1, groups the dependency cases that share a head word by considering not only surface similarity but also the similarity between the classification codes attached to the dependent words, creates the groups as secondary valence frames, and records them in a secondary valence dictionary 7. A second classification code assigning unit 8 refers to the secondary valence dictionary 7 created by the second valence frame creation unit 6, assigns the same classification code to words belonging to the same slot of the same valence frame, and creates a secondary word classification code dictionary 9.

[0012] The valence structure analysis processing in this configuration is described next. To simplify the explanation, only dependency cases whose head word is a predicate are treated below.

[0013] First, the first valence frame creation unit 2 computes a similarity d1 between dependency cases centered on a given predicate. The similarity d1 is defined to reflect the surface similarity between dependency cases (here, the similarity of the case particles that the predicate takes) and is given, for example, by the following equation.

[0014]

[Equation 1]

[0015] Here, i denotes the type of case particle (one of "ni" (に), "wo" (を), "ga" (が), "kara" (から), "e" (へ), "de" (で), "yori" (より), "to" (と)), n is the number of case particle types (8), and K_i is a dependency case. δ(i) is a function that takes the value "1" when the case slot, that is, case particle i, is present and "0" when it is absent, and w(i) is the weight for case particle i. A case slot here means a unit that contains not only the case particle but also the word (noun) preceding it.
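Equation 1 itself is not reproduced in this text, so the following Python sketch only illustrates how the defined quantities could fit together. The normalized weighted-overlap form, the uniform default weights, and the dictionary layout of a dependency case (particle to noun) are all assumptions made for illustration, not the formula of the patent.

```python
# Hypothetical sketch: Equation 1 is not reproduced in the text, so the
# normalized weighted-overlap form below is an assumption.
PARTICLES = ["ni", "wo", "ga", "kara", "e", "de", "yori", "to"]  # n = 8


def delta(case, particle):
    """delta(i): 1 if the case has a slot for particle i, else 0."""
    return 1 if particle in case else 0


def d1(case_a, case_b, weights=None):
    """Surface similarity between two dependency cases of one predicate.

    A case is modeled as a dict mapping a case particle to its noun.
    """
    weights = weights or {p: 1.0 for p in PARTICLES}  # uniform w(i): assumption
    shared = sum(weights[p] * delta(case_a, p) * delta(case_b, p)
                 for p in PARTICLES)
    either = sum(weights[p] * max(delta(case_a, p), delta(case_b, p))
                 for p in PARTICLES)
    return shared / either if either else 0.0
```

Under this assumed form, two cases that use exactly the same particles score 1.0 and two cases with no particle in common score 0.0.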

[0016] After the similarity d1 between the dependency cases centered on a given predicate has been obtained in this way, the first valence frame creation unit 2 collects the cases with d1 ≥ TH1 into several groups and creates each group as a primary valence frame. TH1 is a predetermined threshold. This processing is performed for every predicate in the dependency case database 1 to create the primary valence dictionary 3.
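The text does not say how cases with d1 ≥ TH1 are collected into groups, so the sketch below assumes a simple greedy rule: a case joins the first group whose members are all at least as similar as the threshold, and otherwise starts a new group. The function names and the toy similarity are hypothetical.

```python
def group_cases(cases, similarity, threshold):
    """Greedy grouping (an assumed strategy, not specified in the source).

    A case joins the first group in which every member is at least
    `threshold` similar to it; otherwise it starts a new group.
    """
    groups = []
    for case in cases:
        for group in groups:
            if all(similarity(case, member) >= threshold for member in group):
                group.append(case)
                break
        else:
            groups.append([case])
    return groups


def same_particles(case_a, case_b):
    """Toy similarity: 1.0 when both cases use exactly the same particles."""
    return 1.0 if set(case_a) == set(case_b) else 0.0
```

Any pairwise similarity can be passed in place of `same_particles`; each resulting group would correspond to one primary valence frame for the predicate.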

[0017] Next, the first classification code assigning unit 4 refers to the primary valence dictionary 3 and assigns the same classification code to the words (for example, nouns) belonging to the same case slot of the same valence frame. The number of different classification codes assigned to one word (noun) therefore equals the number of times that word (noun) appears in the dependency case database 1.
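As a rough illustration of this step, the sketch below walks over every frame and slot and gives all nouns in a slot one shared code. The `frames` layout and the code string format are hypothetical. Because codes are stored in a set, a noun collects one code per distinct frame slot in which it occurs, which is a simplification of the per-occurrence count described above.

```python
from collections import defaultdict


def assign_codes(frames):
    """Build a word classification code dictionary from valence frames.

    frames: {predicate: [frame, ...]} where each frame maps a case
    particle to the set of nouns seen in that slot (assumed layout).
    Returns {noun: set of codes}, one code per (predicate, frame, slot).
    """
    codes = defaultdict(set)
    for predicate, frame_list in frames.items():
        for index, frame in enumerate(frame_list):
            for particle, nouns in frame.items():
                code = f"{predicate}/{index}/{particle}"  # hypothetical format
                for noun in nouns:
                    codes[noun].add(code)
    return dict(codes)
```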

[0018] After the first classification code assigning unit 4 has created the primary word classification code dictionary 5 as described above, the second valence frame creation unit 6 computes a similarity d2 between the same dependency cases in the dependency case database 1. Here, d2 reflects not only the similarity of the case particles taken by the predicate but also the similarity between the classification codes of the words (nouns) that depend on the predicate through those case particles, that is, the dependent words in the case slots. Specifically, whether the words (nouns) in a case slot are similar is first determined based on, for example, the similarity dw given by the following equation.

[0019]

[Equation 2]

[0020] Here, N(W₁) and N(W₂) are the numbers of semantic classification codes attached to the words (nouns) W₁ and W₂, and N(W₁₂) is the number of semantic classification codes common to W₁ and W₂. Based on Equation 2, the second valence frame creation unit 6 judges the words (nouns) in a case slot to be similar when the minimum value dw_min of the similarity dw over all pairs of words (nouns) belonging to the slot satisfies dw_min ≥ TH2 for a predetermined threshold TH2.
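Equation 2 is likewise not reproduced in this text. The sketch below assumes a Jaccard-style ratio built from the three counts that are defined, N(W₁), N(W₂), and N(W₁₂); the actual formula may differ. The minimum-over-pairs test against TH2 follows the description, while treating a single-noun slot as similar is an added assumption.

```python
from itertools import combinations


def dw(codes_1, codes_2):
    """Assumed Jaccard-style form: N(W12) / (N(W1) + N(W2) - N(W12))."""
    n1, n2, n12 = len(codes_1), len(codes_2), len(codes_1 & codes_2)
    union = n1 + n2 - n12
    return n12 / union if union else 0.0


def slot_is_similar(nouns, code_dict, th2):
    """True when the minimum pairwise dw over the slot's nouns is >= TH2."""
    pairs = list(combinations(nouns, 2))
    if not pairs:
        return True  # single-noun slot treated as similar (assumption)
    return min(dw(code_dict[a], code_dict[b]) for a, b in pairs) >= th2
```

Using the minimum means that one outlying noun is enough to mark the whole slot as not similar.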

[0021] After the similarity of the words (nouns) in each case slot has been determined in this way, the second valence frame creation unit 6 redefines δ(i) in Equation 1 so that it takes the value "1" when the nouns in the case slot are similar, "0.5" when the case slot merely exists, and "0" when the case slot does not exist, and then computes the similarity d2 using Equation 1.
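The three-level δ(i) can be sketched as follows. Since Equation 1 is not reproduced in this text, the normalized weighted sum used here is an assumed form, and `nouns_similar` is a hypothetical callback standing in for the dw-based test of the preceding paragraphs.

```python
# Hypothetical sketch: the combining formula is an assumed stand-in for Equation 1.
PARTICLES = ["ni", "wo", "ga", "kara", "e", "de", "yori", "to"]


def slot_score(case_a, case_b, particle, nouns_similar):
    """Redefined delta: 1 if the slot's nouns are similar, 0.5 if the slot
    only exists in both cases, 0 if it is missing from either case."""
    if particle not in case_a or particle not in case_b:
        return 0.0
    return 1.0 if nouns_similar(case_a[particle], case_b[particle]) else 0.5


def d2(case_a, case_b, nouns_similar, weights=None):
    """Similarity combining case particles and noun classification codes."""
    weights = weights or {p: 1.0 for p in PARTICLES}  # uniform w(i): assumption
    score = sum(weights[p] * slot_score(case_a, case_b, p, nouns_similar)
                for p in PARTICLES)
    total = sum(weights[p] for p in PARTICLES if p in case_a or p in case_b)
    return score / total if total else 0.0
```

With a threshold between 0.5 and 1.0, two cases that share a slot but hold dissimilar nouns fall into different groups, which is the behavior the redefinition is meant to produce.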

[0022] Thereafter, the cases with d2 ≥ TH1 are collected into several groups, and each group is created as a secondary valence frame.

[0023] As a result, dependency cases that share a given predicate can be grouped by considering not only the surface similarity based on shared case slots but also the similarity between the classification codes of the words (nouns) within the same case slot. The groups are created as secondary valence frames, and the secondary valence dictionary 7 is thereby created.

[0024] Thereafter, the second classification code assigning unit 8 attaches the same classification code to the words (nouns) contained in the same case slot of the same case frame in the secondary valence dictionary 7 and creates the secondary word classification code dictionary 9.

[0025] A specific example of the above processing is described next with reference to FIGS. 2(a), 2(b), and 2(c). Suppose, for example, that the dependency case database 1 contains the dependency cases shown in FIG. 2(a). The first valence frame creation unit 2 then creates the primary valence frame shown in FIG. 2(b). The dependency cases in FIG. 2(a) share the verb "kawasu" (かわす) as their common predicate. From these cases, the first valence frame creation unit 2 first extracts "wo" (を) and "ga" (が) as the case particles taken by the predicate. It then extracts "opinion" (意見), "body" (体), and "criticism" (批判) as the nouns belonging to the case slot of the particle "wo", and "kai" (会) as the noun belonging to the case slot of the particle "ga", and thereby creates the primary valence frame shown in FIG. 2(b).

[0026] In FIG. 2(b), the alphanumeric string at the left end is the classification code assigned by the first classification code assigning unit 4 after the primary valence frame was created. Nouns belonging to the same case slot, for example "opinion", "body", and "criticism", are assigned the same classification code "00610W".

[0027] As can be seen from the dependency cases in FIG. 2(a), among the nouns "opinion", "body", and "criticism" that belong to the same case slot in the primary valence frame of FIG. 2(b), "opinion" and "criticism" can be regarded as similar in meaning, whereas "body" is not similar in meaning to those two nouns.

[0028] The second valence frame creation unit 6 additionally determines whether the nouns within the same case slot are similar and creates the secondary valence frames with this similarity taken into account. The primary valence frame shown in FIG. 2(b) is thereby updated to the secondary valence frames shown in FIG. 2(c). As FIG. 2(c) shows, the nouns "opinion", "body", and "criticism", which belonged to the same case slot and carried the same classification code in FIG. 2(b), are now separated according to noun similarity: "opinion" and "criticism" are judged to form the same valence frame and receive the same classification code, whereas "body" is judged to form a different valence frame and receives a different classification code.

[0029] Thus, according to this embodiment, valence frames and classification codes are created automatically by considering not only surface similarity but also the similarity between the classification codes attached to the dependent words, so that a valence dictionary and a semantic classification consistent with it can be created easily and without manual work.

[0030] If the dependency case database 1 is built from cases suited to the user's field of application, a highly accurate and practical valence dictionary and semantic classification suited to that field can be created from it.

[0031] In the above embodiment, the second valence frame creation unit 6 can also be configured so that it can refer to the secondary word classification code dictionary 9. In that case, once the second classification code assigning unit 8 has created the secondary word classification code dictionary 9, the second valence frame creation unit 6 refers to this dictionary together with the dependency case database 1, creates the secondary valence frames again, and successively updates the secondary valence dictionary 7. The secondary word classification code dictionary 9 can then be successively updated as well. Repeating this series of steps several times further improves the accuracy of the secondary valence dictionary 7 and the secondary word classification code dictionary 9.
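The repeated update can be pictured as a loop that alternates frame creation and code assignment. The sketch below is an assumption about control flow only: `build_frames` and `assign_codes` are hypothetical callbacks, and stopping when the frames no longer change (or after `max_rounds`) is an added convenience, since the text only says the steps are repeated several times.

```python
def refine(cases, build_frames, assign_codes, max_rounds=5):
    """Alternate frame creation and code assignment.

    build_frames(cases, codes) -> frames and assign_codes(frames) -> codes
    are supplied by the caller (hypothetical interfaces).
    """
    codes = {}
    frames = build_frames(cases, codes)  # first pass: surface similarity only
    for _ in range(max_rounds):
        codes = assign_codes(frames)
        new_frames = build_frames(cases, codes)
        if new_frames == frames:  # stable grouping: stop early (assumption)
            break
        frames = new_frames
    return frames, codes
```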

[0032] To actually resolve ambiguity, for example in dependency analysis, the finally created valence dictionary 7 and secondary word classification code dictionary 9 are used. Letting K₁ and K₂ be the candidate analyses to be compared, the similarity between each of K₁ and K₂ and the valence frames in the valence dictionary is computed in the same way as in the valence frame creation unit 6, and the candidate with the higher similarity is judged to be the most plausible solution.
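A minimal sketch of this selection step follows. The similarity function is passed in by the caller, and scoring a candidate by its best-matching frame is an assumption, since the text does not say how similarities to several frames are combined.

```python
def best_score(candidate, frames, similarity):
    """Highest similarity between a candidate analysis and any frame."""
    return max((similarity(candidate, frame) for frame in frames), default=0.0)


def pick_analysis(candidates, frames, similarity):
    """Return the candidate (e.g. K1 or K2) that best matches the dictionary."""
    return max(candidates, key=lambda c: best_score(c, frames, similarity))
```

For kana-kanji conversion, each candidate would correspond to one possible conversion of the input, and the chosen candidate is the one whose dependency structure best matches a stored valence frame.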

[0033] In FIG. 1, for convenience of explanation, the first valence frame creation unit 2 and the second valence frame creation unit 6, and likewise the first classification code assigning unit 4 and the second classification code assigning unit 8, are shown as separate units, but each pair can be combined into a single valence frame creation unit and a single classification code assigning unit. In that case, the primary word classification code dictionary 5 and the secondary word classification code dictionary 9 can also be combined into a single word classification code dictionary, and the secondary word classification code dictionary 9 can then be regarded as an updated version of the primary word classification code dictionary 5.

[0034] In the above embodiment, the classification code assigning unit attaches the same classification code to words belonging to the same slot of the same valence frame. If, at the same time, a semantic classification code prepared manually in advance is also added to each word when the word classification code dictionary is created, a highly accurate valence dictionary and word classification code dictionary can be created even when the number of dependency cases is relatively small.
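Adding manually prepared codes amounts to merging two dictionaries. The sketch below assumes both are stored as word-to-set-of-codes mappings; the code strings in the usage are made up for illustration.

```python
def merge_codes(auto_codes, manual_codes):
    """Combine automatically assigned codes with manually prepared ones.

    Both arguments map a word to a set of classification codes (assumed
    layout). The inputs are left unmodified.
    """
    merged = {word: set(codes) for word, codes in auto_codes.items()}
    for word, codes in manual_codes.items():
        merged.setdefault(word, set()).update(codes)
    return merged
```

Words that share a manual code then have at least one code in common even if they never appeared in the same frame slot, which is how a small case database can still yield a usable code similarity.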

[0035]

[Effect of the Invention] As described above, according to the present invention, the final valence frames are created from dependency cases that share a head word by taking into account not only their surface similarity but also the similarity between the classification codes attached to the dependent words, so that a practical valence dictionary and a semantic classification consistent with it can be created easily and accurately.

[0036] If the valence frame creating means is configured to successively update the valence frames by referring to the word classification code dictionary that the classification code assigning means created from the valence frames produced by the valence frame creating means itself, the accuracy of the valence dictionary (the valence frames) and of the word classification code dictionary can be improved step by step.

[0037] Furthermore, if the classification code assigning means is configured to attach the same classification code to words belonging to the same slot of the same valence frame and, at the same time, to add to each word a semantic classification code prepared manually in advance when creating the word classification code dictionary, a highly accurate valence dictionary and word classification code dictionary can be created even when the number of dependency cases is relatively small.

[Brief description of drawings]

FIG. 1 is a block diagram of an embodiment of the present invention.

FIGS. 2(a), 2(b), and 2(c) are diagrams showing a specific example of the valence structure analysis processing.

[Explanation of symbols]

1 Dependency case database
2 First valence frame creation unit
3 Primary valence dictionary
4 First classification code assigning unit
5 Primary word classification code dictionary
6 Second valence frame creation unit
7 Secondary valence dictionary
8 Second classification code assigning unit
9 Secondary word classification code dictionary

Claims

[Claims]

1. A valence structure analysis method comprising: valence frame creating means for grouping dependency cases that share a head word and creating valence frames; and classification code assigning means for attaching the same classification code to words belonging to the same slot of the same valence frame and creating a word classification code dictionary, wherein the valence frame creating means groups the dependency cases that share a head word by taking into account, together with their surface similarity, the similarity between the classification codes attached to the dependent words, and creates final valence frames.

2. The valence structure analysis method according to claim 1, wherein the valence frame creating means is configured to successively update the valence frames by referring to the word classification code dictionary created by the classification code assigning means based on the valence frames created by the valence frame creating means itself.

3. The valence structure analysis method according to claim 1 or 2, wherein the classification code assigning means attaches the same classification code to words belonging to the same slot of the same valence frame and, at the same time, adds to each word a semantic classification code prepared in advance, thereby creating the word classification code dictionary.