JPH0336662A - Natural language processing method - Google Patents

Natural language processing method

Info

Publication number
JPH0336662A
JPH0336662A JP1171472A JP17147289A JPH0336662A JP H0336662 A JPH0336662 A JP H0336662A JP 1171472 A JP1171472 A JP 1171472A JP 17147289 A JP17147289 A JP 17147289A JP H0336662 A JPH0336662 A JP H0336662A
Authority
JP
Japan
Prior art keywords
clause
likelihood
natural language
processing method
language processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP1171472A
Other languages
Japanese (ja)
Inventor
Yoshitoshi Yamauchi
佐敏 山内
Kazuhiro Inoue
和博 井上
Masaru Nakajima
勝 中島
Nobuyuki Oro
大呂 延幸
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ricoh Co Ltd
Priority to JP1171472A
Publication of JPH0336662A

Landscapes

  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention]

Technical Field

The present invention relates to natural language processing devices, such as kana-kanji conversion devices used for Japanese input in word processors and DPS, and natural language analysis devices used in speech recognition, machine translation, proofreading support, character recognition, and the like. It is applied, for example, to a kana-kanji conversion device in speech synthesis.

Prior Art

A well-known conventional kana-kanji conversion method is the two-clause longest-match method, which analyzes the first clause (bunsetsu) and the clause that follows it and determines the first clause from the reading length of the two clauses. Its drawback is that homophones cannot be resolved correctly from morphological relationships alone, such as the reading length of two clauses. Several methods have been published that address this problem by recording semantic and co-occurrence information in the dictionary, but all of them either give output priority to candidates that match the dictionary or assign a fixed score to candidates that match it. When analysis draws on many kinds of semantic and co-occurrence information of different dimensions (for example, candidate 1 scores highest when judged by case relations, but candidate 2 scores highest when judged by co-occurrence information), how much weight to give each kind of information is a major problem, and it has been difficult to add information of a different dimension.

An earlier application, Japanese Patent Application No. 63-288338, determines the first clause by assigning a likelihood to 1.5 clauses, but that method has the drawback that it cannot handle information spanning multiple clauses.

Object

The present invention was made to solve the drawbacks described above. Its object is to provide a natural language processing method that performs high-performance natural language analysis by accurately representing the likelihood of an independent part, accurately representing the likelihood of 1.5 clauses, using information from the preceding and following clauses to eliminate contradictions across multiple clauses, making it easy to weight multiple kinds of information and to add new information, and realizing a method for combining multiple kinds of information.

Configuration

To achieve the above object, the present invention is characterized by the following:

(1) in kana-kanji conversion processing that uses a word dictionary holding word information to convert a kana character string into a mixed kanji-kana character string, and in processing that determines clauses in a character string, assigning a likelihood to each candidate containing an independent part, based on the semantic and co-occurrence relationships among the prefix, independent word, and suffix within the independent part of a clause;
(2) assigning a likelihood to each such candidate based on the case relationship between the independent part of a clause and the independent part and attached part of the preceding clause;
(3) assigning a likelihood to each such candidate based on the semantic and co-occurrence relationship between the independent part of a clause and the independent part of the preceding clause;
(4) using the likelihood of the case relationships among the independent parts and attached parts of three consecutive candidates in the analysis;
(5) using the likelihood of the semantic and co-occurrence relationships among the independent parts of three consecutive candidates in the analysis;
(6) assigning a likelihood to each of the above kinds of information, including word frequency, and using it in the analysis; and
(7) combining the wide variety of information of different dimensions across multiple clauses to obtain a likelihood and a certainty, and determining the first clause on that basis.

The invention is described below on the basis of an embodiment.

FIG. 1 is a block diagram illustrating an embodiment of the natural language processing method according to the present invention. In the figure, 1 is an input unit, 2 is an analysis processing unit, 3 is a candidate extraction unit, 4 is a dictionary search unit, 5 is a dictionary, 6 is a candidate evaluation unit, 7 is a combination operation unit, and 8 is an output unit.

The immediately preceding clause, already fixed before the analysis begins, is called the fixed clause. A clause together with the independent part that follows it is called 1.5 clauses.

The candidate extraction unit 3 extracts groups of candidates in units of 1.5 clauses. The candidate evaluation unit 6 evaluates each extracted candidate and assigns it a likelihood as 1.5 clauses. The combination operation unit 7 performs a combination operation on the likelihood of the fixed clause, the likelihood of the first 1.5 clauses, and the likelihood of the 1.5 clauses that follow the first clause, and determines the first clause of the most likely candidate.

FIG. 2 shows a specific example of the candidate extraction unit.

Starting from the analysis start position, all candidates are accumulated in units of 1.5 clauses (candidate 1, candidate 2, ..., candidate n).

Next, the candidates that follow the first clause of each candidate are accumulated in the same way, in units of 1.5 clauses (candidates 1-1, 1-2, ..., 1-m for candidate 1; candidates 2-1, 2-2, ... for candidate 2).
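
The two-level accumulation of candidates can be sketched as follows. This is only an illustration under stated assumptions: the toy lexicon (`CLAUSES`, `INDEP_PARTS`), the `Candidate` record, and both function names are hypothetical and do not come from the patent, which does not specify how the dictionary search is implemented.

```python
from dataclasses import dataclass, field

# Hypothetical toy lexicon (reading -> surface forms); not taken from the patent.
CLAUSES = {"ともだちが": ["友達が"], "いった": ["言った", "行った"]}
INDEP_PARTS = {"いっ": ["言っ", "行っ"]}


@dataclass
class Candidate:
    clause: str              # surface form of the first clause
    next_indep: str          # independent part that follows ("" at end of input)
    clause_reading_len: int  # reading length consumed by the first clause
    followers: list = field(default_factory=list)


def extract(reading, start):
    """Enumerate every 1.5-clause candidate that begins at position `start`."""
    found = []
    rest = reading[start:]
    for i in range(1, len(rest) + 1):
        for clause in CLAUSES.get(rest[:i], []):
            tail = rest[i:]
            if not tail:
                # Nothing follows the clause, so there is no half clause.
                found.append(Candidate(clause, "", i))
                continue
            for j in range(1, len(tail) + 1):
                for indep in INDEP_PARTS.get(tail[:j], []):
                    found.append(Candidate(clause, indep, i))
    return found


def extract_two_levels(reading):
    """Accumulate first-level candidates, then the candidates following each first clause."""
    first = extract(reading, 0)
    for cand in first:
        cand.followers = extract(reading, cand.clause_reading_len)
    return first
```

Calling `extract_two_levels("ともだちがいった")` yields the first-level candidates and, attached to each, the candidates that follow its first clause, mirroring the candidate 1 / candidate 1-1, 1-2 numbering in the description.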

FIG. 3 shows likelihoods for prefix classes and suffix classes, here the likelihoods for "運動" (undō, exercise), and illustrates the configuration of claim 1. Likelihood information for prefix classes and suffix classes is used to assign a likelihood to each independent part, and this is factored into the likelihood of the 1.5 clauses containing that independent part. For example, when 1.5 clauses are analyzed for the character string "うんどうかいでは……", a likelihood of 0.7 is given to 1.5-clause candidates containing "運動会" (undōkai, athletic meet), 0.5 to those containing "運動化", and 0.6 to those containing "運動". These values are combined with the likelihoods obtained from the reading length, frequency, connection probability, and so on of each 1.5-clause candidate to judge its likelihood.
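
As a rough sketch of the lookup in the FIG. 3 example, the likelihoods could be held in a table keyed by the independent word and the suffix attached to it. The three values are the ones given in the example; the table layout and the function name are assumptions made for illustration.

```python
# Likelihoods for the independent word "運動" from the FIG. 3 example.
# The layout (keyed by attached suffix, "" meaning no suffix) is an assumption.
SUFFIX_LIKELIHOOD = {
    "運動": {"会": 0.7, "化": 0.5, "": 0.6},
}


def independent_part_likelihood(word, suffix=""):
    """Return the likelihood of word + suffix as an independent part, or None if unknown."""
    return SUFFIX_LIKELIHOOD.get(word, {}).get(suffix)
```

Returning `None` for an unknown combination keeps "no evidence" distinct from "low likelihood", which matters when pieces of evidence are later combined.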

FIGS. 4(a) and 4(b) show likelihoods for the independent-part class and attached-part class of the preceding clause: (a) for "言った" (言う, to say) and (b) for "行った" (行く, to go). They illustrate the configuration of claim 2. Likelihood information for the independent-part class and attached-part class of the preceding clause is used to assign a likelihood to the 1.5 clauses. For example, when 1.5 clauses are analyzed for the character string "……いった。", candidates such as

"言った。"

"行った。"

are obtained. If the preceding clause is "友達（人間）が" (friend, class: human, followed by the particle が), a likelihood of 0.8 is given to "言った。" and 0.7 to "行った。". These values are combined with the likelihoods obtained from the reading length, frequency, connection probability, and so on of each 1.5-clause candidate to judge its likelihood.
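
The case-relation lookup in the FIG. 4 example can be sketched as a table keyed by the semantic class and particle of the preceding clause. The two values (0.8 and 0.7) are from the example; the class label `"human"`, the table layout, and the function names are illustrative assumptions.

```python
# Case-relation likelihoods from the FIG. 4 example.
# Key: (semantic class of the preceding independent part, attached particle).
# The class label "human" and this layout are assumptions.
CASE_LIKELIHOOD = {
    "言う": {("human", "が"): 0.8},
    "行く": {("human", "が"): 0.7},
}


def case_likelihood(verb, prev_class, prev_particle):
    """Return the case-relation likelihood for `verb`, or None when no evidence exists."""
    return CASE_LIKELIHOOD.get(verb, {}).get((prev_class, prev_particle))


def rank_by_case(verbs, prev_class, prev_particle):
    """Order homophone candidates by case-relation likelihood, skipping those without evidence."""
    scored = [(v, case_likelihood(v, prev_class, prev_particle)) for v in verbs]
    scored = [(v, s) for v, s in scored if s is not None]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In the full method this ranking would not be final, since the case-relation likelihood is only one piece of evidence to be combined with the others.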

FIG. 5 shows likelihoods based on semantic and co-occurrence information relative to the independent-part class of the preceding clause, and illustrates the configuration of claim 3. Semantic and co-occurrence information relative to the independent-part class of the preceding clause is factored into the likelihood of the 1.5 clauses containing each independent part. For example, when 1.5 clauses are analyzed for the character string "……あつい。", candidates such as

"厚い。"

"暑い。"

are obtained. If the preceding clause is "本が" (book, followed by the particle が), a likelihood of 0.7 is given to "厚い。" (thick) and 0.4 to "暑い。" (hot). These values are combined with the likelihoods obtained from the reading length, frequency, connection probability, and so on of each 1.5-clause candidate to judge its likelihood.

FIG. 6 shows likelihoods for combinations of clause candidates that agree with the information of the preceding and following clauses, and illustrates the configurations of claims 4 and 5.

A likelihood is given to a combination of clause candidates that agrees with the information of the preceding and following clauses. Analyzing three consecutive clauses makes it possible to represent a case in which the semantic and co-occurrence links between "東京" (Tokyo) and "海上" (marine), and between "海上" and "保険" (insurance), are weak, yet the link among the three clauses "東京", "海上", and "保険" is very strong.

FIG. 7 shows a flowchart of the candidate evaluation unit and the combination operation unit in the above embodiment. In the figure, the likelihoods of each candidate obtained under claims 1 to 5 are shown as evidence 1 to 5.

FIG. 8 is a flowchart showing an embodiment for the first 1.5 clauses and the 1.5 clauses that follow that clause. It is described below step by step.

Step 1: The likelihood of each candidate is analyzed in units of 1.5 clauses from each piece of evidence.

Step 2: On the assumption that the correct candidate is among the top 10 accumulated candidates, only the top 10 candidates are kept. However, if the 1.5-clause reading length of the most likely candidate equals the segment length of the reading string (the length up to the first punctuation mark or symbol), the following processing is not performed, and the most likely candidate at this point is determined as the first clause.

Step 3: For each accumulated candidate, the next clause is analyzed in units of 1.5 clauses in the same way as in step 2, and the top 10 candidates are accumulated.

Step 4: For the first 1.5 clauses accumulated by the above processing and for the 1.5 clauses that follow, the likelihood of each candidate is treated as independent evidence, and evidence 1 to 5 are combined by the D-S operation.

Step 5: From the result of the above likelihood operation, the first clause of the candidate with the highest likelihood is determined as the first clause.
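
The control flow of these steps (top-10 pruning, the early exit at a punctuation boundary, expansion of the following 1.5 clauses, and the final selection) can be sketched as below. The tuple layouts, the `expand` and `combine` callbacks, and the function name are assumptions for illustration; in particular, `combine` stands in for the D-S operation, which is not implemented here.

```python
import heapq

BEAM = 10  # the description keeps the top 10 candidates at each level


def decide_first_clause(first_candidates, segment_len, expand, combine):
    """Pick the first clause.

    first_candidates: list of (clause, reading_len_of_1_5_clauses, likelihood)
    segment_len: reading length up to the first punctuation mark or symbol
    expand(clause): list of (follower, likelihood) for the next 1.5 clauses
    combine(a, b): combined likelihood of two pieces of evidence (assumed callback)
    """
    if not first_candidates:
        return None
    # Steps 1-2: keep only the most likely first-level candidates.
    top = heapq.nlargest(BEAM, first_candidates, key=lambda c: c[2])
    best = top[0]
    # Step 2 early exit: the best candidate already spans the whole segment.
    if best[1] == segment_len:
        return best[0]
    # Steps 3-4: expand each kept candidate and combine the two likelihoods.
    scored = []
    for clause, _, likelihood in top:
        followers = heapq.nlargest(BEAM, expand(clause), key=lambda f: f[1])
        for _, follower_likelihood in followers:
            scored.append((combine(likelihood, follower_likelihood), clause))
    if not scored:
        return best[0]
    # Step 5: the first clause of the highest-scoring combination wins.
    return max(scored, key=lambda s: s[0])[1]
```

Passing the combination rule in as a callback keeps the pruning logic independent of how the evidence is actually merged.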

These pieces of evidence are not evenly available for every candidate: for one candidate evidence 1 exists but the others do not, for another only evidence 2 exists, and so on. It has therefore been very difficult to judge the likelihood of a candidate correctly by linear combination. The inventors have already proposed a method of combining multiple pieces of evidence that applies Dempster-Shafer theory (the aforementioned Japanese Patent Application No. 63-288338). When that method is used, judging the importance of each piece of evidence and weighting it before combination further improves the result.
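
A minimal sketch of combining two pieces of evidence with Dempster's rule, with each piece weighted beforehand, is given below. The patent does not say how the weighting is done; Shafer's discounting is used here as one common choice and is an assumption, as are the function names and the representation of a mass function as a dict from `frozenset` to mass.

```python
from itertools import product


def discount(mass, weight, frame):
    """Shafer discounting: scale each focal mass by `weight`, move the remainder to the whole frame."""
    out = {s: weight * v for s, v in mass.items() if s != frame}
    out[frame] = 1.0 - weight + weight * mass.get(frame, 0.0)
    return out


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions over the same frame."""
    raw = {}
    conflict = 0.0
    for (a, va), (b, vb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            raw[inter] = raw.get(inter, 0.0) + va * vb
        else:
            conflict += va * vb  # mass that falls on the empty set
    if conflict >= 1.0:
        raise ValueError("the two pieces of evidence are totally conflicting")
    # Renormalize so the remaining masses sum to 1.
    return {s: v / (1.0 - conflict) for s, v in raw.items()}
```

A candidate for which a given kind of evidence is missing can be represented by placing all of that evidence's mass on the whole frame, so it neither helps nor hurts any candidate; this is what makes the approach tolerant of unevenly available evidence.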

Effects

As is clear from the above description, the present invention has the following effects.

(1) With claim 1, likelihood is assigned based on the morphological, semantic, and co-occurrence relationships among the prefix, independent word, and attached word that make up the independent part, so the likelihood of the independent part can be represented accurately.

(2) With claims 2 and 3, likelihood is assigned based on the case relationship and the semantic and co-occurrence relationships with the preceding clause, so the likelihood of the 1.5 clauses can be represented accurately.

(3) With claims 4 and 5, information from the preceding and following clauses is used, so contradictions across multiple clauses can be eliminated.

(4) With claim 6, a likelihood is assigned to each of the multiple kinds of information, so the importance of each kind can be adjusted easily and new information can be added easily.

(5) With claim 7, multiple kinds of information can be combined easily, so each kind can be supplied independently and high-performance natural language analysis can be performed.

[Brief Description of the Drawings]

FIG. 1 is a block diagram illustrating an embodiment of the natural language processing method according to the present invention.
FIG. 2 shows a specific example of the candidate extraction unit.
FIG. 3 shows likelihoods for prefix classes and suffix classes.
FIGS. 4(a) and 4(b) show likelihoods for the independent-part class and attached-part class of the preceding clause.
FIG. 5 shows likelihoods based on semantic and co-occurrence information relative to the independent-part class of the preceding clause.
FIG. 6 shows likelihoods for combinations of clause candidates that agree with the information of the preceding and following clauses.
FIG. 7 shows a flowchart of the candidate evaluation unit and the combination operation unit.
FIG. 8 shows a flowchart for the case where processing is performed on 1.5 clauses.

1: input unit, 2: analysis processing unit, 3: candidate extraction unit, 4: dictionary search unit, 5: dictionary, 6: candidate evaluation unit, 7: combination operation unit, 8: output unit.

Claims (1)

[Claims]

1. A natural language processing method characterized in that, in kana-kanji conversion processing that uses a word dictionary holding word information to convert a kana character string into a mixed kanji-kana character string, and in processing that determines clauses in a character string, a likelihood is given to each candidate containing an independent part, based on the semantic and co-occurrence relationships among the prefix, independent word, and suffix within the independent part of a clause.

2. The natural language processing method according to claim 1, characterized in that a likelihood is given to each candidate containing an independent part, based on the case relationship between the independent part of a clause and the independent part and attached part of the preceding clause.

3. The natural language processing method according to claim 1, characterized in that a likelihood is given to each candidate containing an independent part, based on the semantic and co-occurrence relationship between the independent part of a clause and the independent part of the preceding clause.

4. The natural language processing method according to claim 1, characterized in that the likelihood of the case relationships among the independent parts and attached parts of three consecutive candidates is used in the analysis.

5. The natural language processing method according to claim 1, characterized in that the likelihood of the semantic and co-occurrence relationships among the independent parts of three consecutive candidates is used in the analysis.

6. The natural language processing method according to at least one of claims 1 to 5, characterized in that a likelihood is given to each of said kinds of information, including word frequency, and is used in the analysis.

7. The natural language processing method according to at least one of claims 1 to 6, characterized in that a wide variety of information of different dimensions across multiple clauses is combined to obtain a likelihood and a certainty, and the first clause is determined on that basis.
JP1171472A 1989-07-03 1989-07-03 Natural language processing method Pending JPH0336662A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP1171472A JPH0336662A (en) 1989-07-03 1989-07-03 Natural language processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP1171472A JPH0336662A (en) 1989-07-03 1989-07-03 Natural language processing method

Publications (1)

Publication Number Publication Date
JPH0336662A true JPH0336662A (en) 1991-02-18

Family

ID=15923739

Family Applications (1)

Application Number Title Priority Date Filing Date
JP1171472A Pending JPH0336662A (en) 1989-07-03 1989-07-03 Natural language processing method

Country Status (1)

Country Link
JP (1) JPH0336662A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05197712A (en) * 1992-01-17 1993-08-06 Matsushita Electric Ind Co Ltd Method for constructing and updating cooccurrence dictionary and method for analyzing cooccurrence meaning
US5741009A (en) * 1994-09-14 1998-04-21 Konica Corporation Sheet sorting apparatus

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS55102072A (en) * 1979-01-29 1980-08-04 Nippon Telegr & Teleph Corp <Ntt> Kana-kanji conversion method for homophone
JPS6231467A (en) * 1985-08-01 1987-02-10 Toshiba Corp Sentence preparation device
JPS62121570A (en) * 1985-11-22 1987-06-02 Fujitsu Ltd Continued clause conversion processing system based on connection probability
JPH01134563A (en) * 1987-11-19 1989-05-26 Brother Ind Ltd Kana/kanji converter

Similar Documents

Publication Publication Date Title
JP5130892B2 (en) Character encoding processing method and system
US5270927A (en) Method for conversion of phonetic Chinese to character Chinese
Den et al. A Proper Approach to Japanese Morphological Analysis: Dictionary, Model, and Evaluation.
JPH0336662A (en) Natural language processing method
Van Driem The creoloid origins of Chinese
JP2004206659A (en) Reading information determination method and apparatus and program
JPH0246976B2 (en)
JPS62251986A (en) Misread character correction processor
JP2798747B2 (en) Natural language processing method
JPH049320B2 (en)
JPH0363767A (en) Text voice synthesizer
JP2995783B2 (en) Katakana translation word estimator
JP2939945B2 (en) Roman character address recognition device
Al-Marghilani et al. Text mining based on the self-organizing map method for arabic-english documents
Kim et al. Hybrid grapheme to phoneme conversion for unlimited vocabulary
Ohyama et al. A sentence analysis method for a Japanese book reading machine for the blind
JPH02122366A (en) natural language processing device
JPH04115383A (en) Character recognizing system for on-line handwritten character recognizing device
JP2001265762A (en) Document structure extracting device and document structure information extracting method
Park et al. Eliminating Implausible Korean Morphological Interpretations by Using History of Previous Analysis and Lexical Association
JPS6132167A (en) Kana-kanji conversion processing device
JPH0760433B2 (en) Kanji converter
JPH0916575A (en) Pronunciation dictionary device
JPH0895976A (en) Natural language analyzer
JPH0778155A (en) Document recognition device