JPH06208659A

JPH06208659A - Handwritten sentence recognizing device

Info

Publication number: JPH06208659A
Application number: JP5002254A
Authority: JP
Inventors: Tatsuya Hayama; 達也羽山
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1993-01-11
Filing date: 1993-01-11
Publication date: 1994-07-26

Abstract

(57) [Summary]
[Object] To execute a series of processes, up to and including Japanese language processing, in a short time without burdening the writer.
[Structure] The Japanese language processing unit comprises: a two-character kanji compound dictionary 13 in which two-character kanji compounds are registered; a two-character kanji compound detection unit 12 that detects two-character kanji words, by reference to the compounds registered in the two-character kanji compound dictionary 13, from character strings formed by combinations of the recognition candidate characters obtained for each character by the character recognition unit; a prefix/suffix detection unit 14 that detects, from the recognition candidate characters, prefixes and suffixes for the detected two-character kanji words; a noun detection unit 16 that detects, from the remaining recognition candidate characters, characters corresponding to character strings forming nouns; and an other-part-of-speech detection unit 18 that detects, from the remaining recognition candidate characters, characters that are consistent with the already-detected recognition candidate characters under Japanese grammar.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a handwritten sentence recognition device that performs character recognition on a stroke sequence handwritten via a display-integrated coordinate input device.

[0002]

[Prior Art] In recent years, online handwritten sentence recognition devices, which perform character recognition on stroke sequences handwritten via a coordinate input device such as a tablet and make the result available for document creation and similar tasks, have attracted attention. In a conventional online handwritten sentence recognition device, the following procedure is required in order to recognize the text entered by the writer as a correct Japanese sentence.
(1) The stroke sequence (coordinate data sequence) of the text written by the writer on the coordinate detection surface of the coordinate input device is input.
(2) Character segmentation splits the input stroke sequence into individual characters.
(3) Character recognition is performed on the stroke sequence of each segmented character.
(4) Japanese language processing selects the correct character from the multiple candidate characters obtained for each character by character recognition.

[0003] In this procedure, step (4) would be unnecessary if the character recognition of step (3) always output the correct character. In practice, however, character recognition sometimes outputs a misrecognized character (that is, the first-ranked candidate character is wrong).

[0004] In addition, the Japanese language processing of step (4) requires matching against a large independent-word dictionary (for example, about 100,000 words) and therefore takes a long time. If everything up to and including Japanese language processing were executed as one continuous sequence, the response would become slow, which makes that arrangement unsuitable for online processing. The usual approach is therefore for the writer to start Japanese language processing by operating a button once handwriting input is finished. When Japanese language processing is not performed, the writer has to check whether each character recognition result is correct and fix any wrong characters.

[0005] This is because, in the independent-word dictionary of a conventional device, the great majority of kanji words are either two-character words or combinations of two-character words, and these are registered redundantly (for example, both 「文字」 ("character") and 「文字認識」 ("character recognition") are registered in the independent-word dictionary), so the number of registered words becomes very large. As the number of registered words grows, the processing load of searching for independent words during Japanese language processing increases and system performance degrades.
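The redundancy described above can be made concrete with a small sketch. The lexicon contents and the function name `redundant_entries` are illustrative assumptions, not taken from the patent; the code only reports which longer entries are pure concatenations of two-character entries that are already registered.

```python
def redundant_entries(lexicon):
    """Return entries that are concatenations of registered two-character entries."""
    two_char = {word for word in lexicon if len(word) == 2}
    redundant = set()
    for word in lexicon:
        if len(word) > 2 and len(word) % 2 == 0:
            # Split the entry into consecutive two-character pieces.
            parts = [word[i:i + 2] for i in range(0, len(word), 2)]
            if all(part in two_char for part in parts):
                redundant.add(word)
    return redundant


# Hypothetical conventional lexicon (not the patent's actual dictionary).
lexicon = {"文字", "認識", "装置", "文字認識", "認識装置", "文字認識装置"}
print(sorted(redundant_entries(lexicon)))
```

In this toy lexicon half of the entries are redundant in that sense, which is the kind of duplication a dictionary restricted to two-character compounds avoids.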

[0006]

[Problems to Be Solved by the Invention] As described above, with a conventional online handwritten sentence recognition device that performs only character recognition without Japanese language processing, the writer has to watch the recognition results constantly and correct them whenever they are wrong. This interrupts the writer's train of thought while composing text and places a heavy burden on the writer. Even when Japanese language processing is performed, the large number of words registered in the independent-word dictionary lengthens the processing time, so treating everything up to Japanese language processing as one continuous sequence makes online processing impossible.

[0007] The present invention has been made in view of the above, and its object is to provide an online handwritten sentence recognition device that can execute a series of processes, up to and including Japanese language processing, in a short time without burdening the writer.

[0008]

[Means for Solving the Problems] To solve the above problems, the present invention is characterized by comprising: character segmentation means for cutting out a stroke sequence for each character from a stroke sequence representing a character string handwritten via a coordinate input device; character recognition means for performing character recognition on the stroke sequences output from the character segmentation means and obtaining recognition candidate characters for each character; and Japanese language processing means for detecting two-character kanji words formed by combinations of the recognition candidate characters obtained for each character by the character recognition means and, on the basis of the detected two-character kanji words, determining the correct characters from the recognition candidate characters corresponding to the other parts of the character string.

[0009] The present invention is further characterized in that the Japanese language processing means comprises: a two-character kanji compound dictionary in which two-character kanji compounds are registered; two-character kanji compound detection means for detecting two-character kanji words, by reference to the compounds registered in the two-character kanji compound dictionary, from character strings formed by combinations of the recognition candidate characters obtained for each character by the character recognition means; prefix/suffix detection means for detecting, from the recognition candidate characters, prefixes and suffixes for the two-character kanji words detected by the two-character kanji compound detection means; noun detection means for detecting characters corresponding to character strings forming nouns from the recognition candidate characters other than those of the character string parts for which recognition candidate characters have been detected by the two-character kanji compound detection means and the prefix/suffix detection means; and other-part-of-speech detection means for detecting, from the recognition candidate characters other than those of the character string parts detected by the two-character kanji compound detection means, the prefix/suffix detection means, and the noun detection means, characters that are consistent with the already-detected recognition candidate characters under Japanese grammar.

[0010]

[Operation] With this configuration, matching is performed against a two-character kanji compound dictionary consisting solely of two-character kanji compounds, which account for the great majority of kanji words. Because only two-character kanji compounds are registered, nothing is registered redundantly, so the dictionary is smaller and the matching time is correspondingly shorter. Furthermore, the correct characters for the other parts of the character string are detected from their recognition candidate characters on the basis of the two-character kanji compounds found by matching against the two-character kanji compound dictionary, which avoids the increase in misrecognition that the smaller dictionary might otherwise cause.

[0011]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of an online handwritten sentence recognition device according to this embodiment. As shown in FIG. 1, the online handwritten sentence recognition device comprises a display-integrated coordinate input device 1, a character input unit 2, a character segmentation unit 3, a character recognition unit 4, a reference pattern storage unit 5, a Japanese language processing unit 6, and a display control unit 7.

[0012] The display-integrated coordinate input device 1 has both an input function 1a, for handwriting a character string as a series of coordinate values, and an output function 1b, for displaying the handwriting and the recognition results. In the display-integrated coordinate input device 1, the coordinate input surface and the display surface are overlaid and integrated, so that display and coordinate input are possible on the same surface. The character input unit 2 controls handwriting input and inputs a character string as a series of coordinate values (a stroke sequence). The character segmentation unit 3 cuts out the stroke sequence of each character from the stroke sequence representing the character string input by the character input unit 2.

[0013] The character recognition unit 4 performs character recognition on the one-character stroke sequence cut out by the character segmentation unit 3 by matching it against the reference patterns stored in the reference pattern storage unit 5, and obtains multiple candidate characters. The reference pattern storage unit 5 stores reference patterns for the characters to be recognized. The Japanese language processing unit 6 selects, from the multiple candidate characters obtained for each character by the character recognition unit 4, the characters that are appropriate as a Japanese sentence. The display control unit 7 displays the characters selected by the Japanese language processing unit 6 using the output function 1b of the display-integrated coordinate input device 1.

[0014] FIG. 2 is a block diagram showing the detailed configuration of the Japanese language processing unit 6. As shown in FIG. 2, the Japanese language processing unit 6 comprises a punctuation detection unit 11, a two-character kanji compound detection unit 12, a two-character kanji compound dictionary 13, a prefix/suffix detection unit 14, an affix dictionary 15, a noun detection unit 16, a noun dictionary 17, and an other-part-of-speech detection unit 18. The punctuation detection unit 11 detects a punctuation mark that delimits the character string serving as the unit of processing, and starts Japanese language processing.
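One possible reading of the trigger behaviour of the punctuation detection unit 11 is sketched below. It assumes, purely for illustration, that a processing unit is closed whenever the first-ranked candidate of a character is 「。」 or 「、」; the patent does not spell out how the mark is identified, and the names `PUNCTUATION` and `split_into_units` are hypothetical.

```python
PUNCTUATION = {"。", "、"}  # assumed set of delimiting marks


def split_into_units(columns):
    """Group per-character candidate lists into processing units.

    columns[i] is the ranked candidate list for the i-th written character.
    Returns (completed_units, pending), where pending holds the characters
    written after the last punctuation mark.
    """
    units, pending = [], []
    for column in columns:
        pending.append(column)
        # Assumption: only the first-ranked candidate is inspected.
        if column and column[0] in PUNCTUATION:
            units.append(pending)
            pending = []
    return units, pending


# Hypothetical recognizer output for "書く。" followed by one more character.
columns = [["書", "晝"], ["く", "へ"], ["。", "o"], ["文", "丈"]]
units, pending = split_into_units(columns)
print(len(units), len(pending))
```

Each completed unit would then be handed to the later stages, while `pending` keeps accumulating as the writer continues.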

[0015] The two-character kanji compound detection unit 12 detects two-character kanji compounds, by reference to the two-character kanji compounds stored in the two-character kanji compound dictionary 13, from the character strings formed by combinations of the candidate characters of each character up to the punctuation mark detected by the punctuation detection unit 11. About 10,000 words, such as 「文字」 ("character"), 「認識」 ("recognition"), and 「装置」 ("device"), are registered in the two-character kanji compound dictionary 13.
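The candidate-combination lookup performed by unit 12 might be sketched as follows. The candidate lists and dictionary entries are made-up stand-ins (FIG. 5 is not reproduced here), and `detect_two_char_compounds` is a hypothetical name. The sketch does not separately check that both characters are kanji; it assumes this follows from the dictionary containing only kanji compounds.

```python
from itertools import product


def detect_two_char_compounds(columns, dictionary):
    """Find dictionary words formed by candidates at adjacent positions.

    columns[i] is the ranked candidate list for position i.
    Returns a list of (start_position, word, (rank_first, rank_second)),
    with ranks counted from 1.
    """
    hits = []
    for i in range(len(columns) - 1):
        left = enumerate(columns[i], 1)
        right = enumerate(columns[i + 1], 1)
        for (rank1, ch1), (rank2, ch2) in product(left, right):
            if ch1 + ch2 in dictionary:
                hits.append((i, ch1 + ch2, (rank1, rank2)))
    return hits


# Hypothetical candidates for the written characters 文, 章, 認, 識.
columns = [["文", "丈"], ["草", "章", "責"], ["認", "誌"], ["識", "織"]]
dictionary = {"文章", "文責", "認識", "文字", "装置"}
for hit in detect_two_char_compounds(columns, dictionary):
    print(hit)
```

With 15 candidates per position, each adjacent pair yields at most 225 lookups, so a set-based dictionary keeps the cost per sentence small.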

[0016] The prefix/suffix detection unit 14 detects prefixes and suffixes, by reference to the prefixes and suffixes stored in the affix dictionary 15, from the multiple candidate characters corresponding to the characters immediately before and after each two-character kanji compound detected by the two-character kanji compound detection unit 12.

[0017] The noun detection unit 16 detects character strings forming nouns of lengths other than two characters, by reference to the nouns stored in the noun dictionary 17, from the candidate characters corresponding to the characters outside the character string parts detected by the two-character kanji compound detection unit 12 and the prefix/suffix detection unit 14.

[0018] The other-part-of-speech detection unit 18 detects characters of parts of speech other than nouns that are correct under Japanese grammar, from the candidate characters corresponding to the characters outside the character string parts detected by the two-character kanji compound detection unit 12, the prefix/suffix detection unit 14, and the noun detection unit 16. Next, the operation of this embodiment is described using a concrete example.

[0019] First, the output function 1b of the display-integrated coordinate input device 1 displays underlines 31 indicating the writing positions so that the writer can write characters easily, as shown for example in FIG. 3. The character input unit 2 is assumed to operate in parallel with the other units so that handwriting is possible at all times.

[0020] In this standby state, when characters are written with a pen on the input surface of the display-integrated tablet 10, the stroke sequences (coordinate data sequences) making up the characters are input one after another. Here it is assumed that the writer has handwritten the character string 「オンライン文章認識装置に文字列を書く。」 ("Write a character string on the online sentence recognition device."), as shown for example in FIG. 4. The handwritten character string is sent to the character segmentation unit 3 and the character recognition unit 4, and multiple recognition candidates are obtained for each character.

[0021] As the technique by which the character segmentation unit 3 separates a character string written on one line into individual characters, the technique described in "Online Character Recognition of Unframed Handwritten Character Strings by the Candidate Character Lattice Method" (Transactions of the Institute of Electronics and Communication Engineers of Japan, Vol. J68-D, No. 4, pp. 765-772, 1985) can be used, for example.

[0022] As the technique for the character recognition processing in the character recognition unit 4, the technique described in Japanese Patent Application No. Hei 3-21157, "Online Character Recognition Device," can be used, for example.

[0023] The character recognition unit 4 outputs the recognition candidate characters obtained for each character by the recognition processing to the Japanese language processing unit 6. FIG. 5 shows an example of the candidate characters output from the character recognition unit 4 for the handwriting input of FIG. 4.

[0024] In FIG. 5, the vertically written character string on the left is the character string formed by the first-ranked recognition candidate of each character. The characters running horizontally from each character of the vertical string are the recognition candidate characters for that character, ranked from 1st to 15th.

[0025] The punctuation detection unit 11 of the Japanese language processing unit 6 responds when the final 「。」 of the character string has been written, and starts Japanese language processing. First, the two-character kanji compound detection unit 12 matches the two-character kanji compounds stored in the two-character kanji compound dictionary 13 against the character strings formed by combining two consecutive kanji among the recognition candidate characters shown in FIG. 5, and detects the semantically valid character strings that coincide with two-character kanji compounds.

[0026] Assume that eight character strings (two-character kanji compounds) are detected here, as shown in FIG. 6. For each word, a word recognition rate is calculated from the character recognition rates obtained in recognizing each of its characters. For example, the character recognition rate of each character is rescaled so that the first-ranked candidate character becomes 100%, and the rescaled character recognition rates of the characters making up the two-character kanji compound are averaged. Accordingly, the word 「文章」 ("sentence"), which consists of the first-ranked candidate 「文」 and the second-ranked candidate 「章」, has a word recognition rate of 99%.
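The word recognition rate calculation can be illustrated with a short sketch. The raw scores below are invented so that the result matches the 99% quoted for 「文章」, and the rescaling is assumed to be a simple division by the first-ranked candidate's score; the patent only states that the first-ranked candidate becomes 100%. The function names are hypothetical.

```python
def normalized_rate(column, rank):
    """Rescale a candidate's score so that the first-ranked candidate is 100%.

    column is a ranked list of (character, raw_score); rank counts from 1.
    """
    top_score = column[0][1]
    return 100.0 * column[rank - 1][1] / top_score


def word_recognition_rate(col_a, rank_a, col_b, rank_b):
    """Average the rescaled rates of the two characters forming a compound."""
    return (normalized_rate(col_a, rank_a) + normalized_rate(col_b, rank_b)) / 2


# Hypothetical raw recognizer scores (not taken from FIG. 5 or FIG. 6).
first = [("文", 800), ("丈", 400)]
second = [("草", 500), ("章", 490)]

# 「文」 is rank 1 (100%), 「章」 is rank 2 (490/500 = 98%).
print(word_recognition_rate(first, 1, second, 2))  # prints 99.0
```

Rescaling per character keeps a confidently recognized position from dominating a position where all raw scores happen to be low.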

[0027] From the two-character kanji compounds thus obtained, the two-character kanji compound detection unit 12 deletes those whose word recognition rate falls below a preset threshold α (here, for example, α = 90%), judging them to have a low recognition rate relative to the first-ranked word and to be words formed by combinations of misrecognized characters. As a result, only the five words 「文章」 ("sentence"), 「文責」 ("responsibility for the wording"), 「認識」 ("recognition"), 「装置」 ("device"), and 「文字」 ("character") remain. Furthermore, since 「文章」 and 「文責」 are words formed from recognition candidates of the same characters, only the one with the higher word recognition rate, 「文章」, is selected. In the same way, of 「文字」 and 「文学」 ("literature"), only 「文字」 is selected.
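The thresholding and the choice between competing words at the same position could look roughly like this. Apart from the 99% for 「文章」 and α = 90%, every rate below is invented for illustration, and `select_compounds` is a hypothetical name. The sketch only resolves words that start at the same position; compounds that overlap by one character are not covered by the patent text and are not handled here.

```python
def select_compounds(hits, alpha=90.0):
    """Drop low-rate compounds, then keep the best one per start position.

    hits is a list of (start_position, word, word_recognition_rate).
    Returns {start_position: word}.
    """
    best = {}
    for position, word, rate in hits:
        if rate < alpha:
            continue  # judged to be a combination of misrecognized characters
        if position not in best or rate > best[position][1]:
            best[position] = (word, rate)
    return {position: word for position, (word, _rate) in best.items()}


# Positions index the sentence 「オンライン文章認識装置に文字列を書く。」 from 0.
hits = [
    (5, "文章", 99.0), (5, "文責", 95.0),
    (7, "認識", 100.0), (7, "認織", 70.0),
    (9, "装置", 100.0),
    (12, "文字", 98.0), (12, "文学", 93.0),
]
print(select_compounds(hits))
```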

[0028] Next, the prefix/suffix detection unit 14 is started. Referring to the affix dictionary 15, the prefix/suffix detection unit 14 detects prefixes and suffixes for the words obtained by the two-character kanji compound detection unit 12 from the recognition candidates of the characters immediately before and after each word. This matching detects the suffix 「列」 ("string") for the word 「文字」.
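A minimal sketch of the affix lookup follows, assuming the affix dictionary 15 can be modelled as two sets of single characters. The dictionary contents, the candidate lists, and the name `detect_affixes` are all hypothetical. Positions already claimed by a detected compound are skipped, which is an assumption about how neighbouring compounds are treated.

```python
def detect_affixes(columns, compounds, prefixes, suffixes):
    """Look for a prefix before and a suffix after each detected compound.

    columns[i] is the ranked candidate list for position i.
    compounds maps start_position -> two-character word.
    Returns {position: affix_character}.
    """
    covered = {p for start in compounds for p in (start, start + 1)}
    affixes = {}
    for start in sorted(compounds):
        for pos, table in ((start - 1, prefixes), (start + 2, suffixes)):
            if 0 <= pos < len(columns) and pos not in covered and pos not in affixes:
                # Take the highest-ranked candidate that is a registered affix.
                match = next((ch for ch in columns[pos] if ch in table), None)
                if match is not None:
                    affixes[pos] = match
    return affixes


# Hypothetical candidates for the written characters 文, 字, 列, を.
columns = [["文", "丈"], ["字", "学"], ["引", "列", "別"], ["を", "き"]]
compounds = {0: "文字"}
prefixes = {"非", "再"}
suffixes = {"列", "的", "化"}
print(detect_affixes(columns, compounds, prefixes, suffixes))
```

Here the second-ranked candidate 「列」 is picked over the first-ranked 「引」 because only 「列」 is registered as a suffix, which is how the detected compound helps correct its neighbour.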

[0029] The noun detection unit 16 is then started. For the character strings that remain after excluding the parts detected by the two-character kanji compound detection unit 12 and the prefix/suffix detection unit 14, the noun detection unit 16 detects words of lengths other than two characters by matching against the noun dictionary 17. This matching detects the word 「オンライン」 ("online").

[0030] In this way the noun parts are detected in the recognition candidate character strings. Next, the other-part-of-speech detection unit 18 is started. The other-part-of-speech detection unit 18 detects the correct characters of the parts of the recognition candidate character strings other than nouns. For the other-part-of-speech detection unit 18, the technique described in "Verification of Vocabulary and Syntax in Handwritten Manuscript Recognition" (Transactions of the Information Processing Society of Japan, Vol. 26, No. 5, pp. 862-869, 1985) can be used, for example.

[0031] The correct character string thus obtained (「オンライン文章認識装置に文字列を書く。」) is sent to the display control unit 7 and displayed by the output function 1b of the display-integrated coordinate input device 1.

[0032] In this way, the Japanese language processing unit 6 detects two-character kanji compounds using the two-character kanji compound dictionary 13, in which only two-character kanji compounds, which account for the great majority of the kanji words among independent words, are registered, and then matches the parts of the character string other than the two-character kanji words on the basis of the detected two-character kanji words. That is, because only two-character kanji compounds are registered in the two-character kanji compound dictionary 13, the number of registered words is small and the processing time is correspondingly short. Text can therefore be entered online with everything up to Japanese language processing handled as one continuous sequence, so operations to start Japanese language processing and tasks such as checking the recognition results and correcting misrecognized characters become unnecessary, and text can be composed efficiently without interrupting the writer's train of thought.

[0033] The present invention is not limited to the embodiment described above. For example, other methods may be used for the processing in the character segmentation unit 3, the character recognition unit 4, the other-part-of-speech detection unit 18, and so on.

[0034]

[Effects of the Invention] As described above, according to the present invention, shortening the Japanese language processing time allows that processing to be executed online on the sentence recognition device. This makes it possible to eliminate operations that have nothing to do with writing itself, such as having to keep watching the character recognition results while writing, or having to perform an extra button operation in order to start Japanese language processing.

[0035] As a result, sentences that are correct as Japanese can be entered efficiently, and the burden on the writer during text input is reduced, which is of great practical benefit.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of an online handwritten sentence recognition device according to an embodiment of the present invention.

FIG. 2 is a block diagram showing the detailed configuration of the Japanese language processing unit 6.

FIG. 3 is a diagram showing an example of the display screen for text input on the display-integrated coordinate input device 1.

FIG. 4 is a diagram showing an example of handwriting entered on the display screen for text input.

FIG. 5 is a diagram showing an example of the candidate characters output from the character recognition unit 4.

FIG. 6 is a diagram showing an example of the matching results produced by the two-character kanji compound detection unit 12.

[Explanation of symbols]

1 ... display-integrated coordinate input device, 2 ... character input unit, 3 ... character segmentation unit, 4 ... character recognition unit, 5 ... reference patterns for character recognition, 6 ... Japanese language processing unit, 7 ... display control unit, 11 ... punctuation detection unit, 12 ... two-character kanji compound detection unit, 13 ... two-character kanji compound dictionary, 14 ... prefix/suffix detection unit, 15 ... affix dictionary, 16 ... noun detection unit, 17 ... noun dictionary, 18 ... other-part-of-speech detection unit, 30 ... display input surface, 31 ... underline for character writing, 51 ... recognition candidate rank, 52 ... recognition candidate character.

Claims

[Claims]

1. A handwritten sentence recognition device comprising: character segmentation means for cutting out a stroke sequence for each character from a stroke sequence representing a character string handwritten via a coordinate input device; character recognition means for performing character recognition on the stroke sequences output from the character segmentation means and obtaining recognition candidate characters for each character; and Japanese language processing means for detecting two-character kanji words formed by combinations of the recognition candidate characters obtained for each character by the character recognition means and, on the basis of the detected two-character kanji words, determining correct characters from the recognition candidate characters corresponding to the other parts of the character string.

2. The handwritten sentence recognition device according to claim 1, wherein the Japanese language processing means comprises: a two-character kanji compound dictionary in which two-character kanji compounds are registered; two-character kanji compound detection means for detecting two-character kanji words, by reference to the compounds registered in the two-character kanji compound dictionary, from character strings formed by combinations of the recognition candidate characters obtained for each character by the character recognition means; prefix/suffix detection means for detecting, from the recognition candidate characters, prefixes and suffixes for the two-character kanji words detected by the two-character kanji compound detection means; noun detection means for detecting characters corresponding to character strings forming nouns from the recognition candidate characters other than those of the character string parts for which recognition candidate characters have been detected by the two-character kanji compound detection means and the prefix/suffix detection means; and other-part-of-speech detection means for detecting, from the recognition candidate characters other than those of the character string parts detected by the two-character kanji compound detection means, the prefix/suffix detection means, and the noun detection means, characters that are consistent with the already-detected recognition candidate characters under Japanese grammar.