JPH02230488A - Character recognizing device - Google Patents
Info
- Publication number
- JPH02230488A
- Authority
- JP
- Japan
- Prior art keywords
- character
- word
- storage unit
- candidate
- character kind
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Character Discrimination (AREA)
Abstract
Description
[Detailed Description of the Invention]
[Industrial Field of Application]
The present invention relates to a character recognition device, and in particular to a character recognition device with a word matching function. A group of words to be recognized for a given purpose, such as prefecture names, is stored in advance, and the device checks whether a word formed by combining character recognition results exists in that word group.
Conventionally, in this type of character recognition device, the character recognition unit and the word matching unit are logically separate. The character recognition unit selects plausible candidates as recognition results from among all the character types to be read and sends them to the word matching unit.
Some devices also have a structure in which the recognition unit holds its own table for limiting character types and selects candidates from a narrowed set of target character types. This table, however, was created through manual operations.
Among the conventional character recognition devices described above, a device that treats all character types as recognition targets has two drawbacks: the recognition unit takes a long time because of the large number of character types, and extra word matching time is spent because inappropriate candidate characters are sent to the word matching unit.
In a device whose recognition unit holds a table for limiting character types, the time for recognition and word matching can be shortened, but many manual operations are needed to create the table. For example, when n words are registered as words for matching, a person had to pick out the character types used in all n words and register them in the table. As the number of words n grows, so does the effort of checking the character types by hand, which in turn led to checking mistakes and a lower recognition rate.
The present invention is a character recognition device comprising: a word storage unit that stores a group of target words in advance; a character type storage unit that stores the character types to be recognized in advance; a character recognition unit that recognizes input character images and outputs candidate characters limited to the character types stored in the character type storage unit; and a word matching unit that checks whether a word formed by combining the candidate characters output by the character recognition unit for a plurality of character images exists in the word group stored in the word storage unit. The device further includes a character type extraction unit that picks up all the character types used in the word group registered in the word storage unit and stores them in the character type storage unit.
Next, the present invention is described with reference to the drawings.
FIG. 1 is a block diagram of the reading device of the present invention. Character image 1 is an image of a character input from a medium such as paper. Character recognition unit 2 recognizes which character character image 1 represents and outputs the result to word matching unit 4 as a plurality of candidate characters 3. In doing so, character recognition unit 2 refers to character type storage unit 5, which stores the target character types for recognition, and does not take up character types outside the target set as candidates. Word matching unit 4 temporarily stores the candidate characters 3 output by character recognition unit 2 for a plurality of characters, compares the words that can be formed by combining the candidate characters 3 against the word group stored in word storage unit 6, and outputs matching or similar words as candidate words 7.
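The word matching step described for FIG. 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `match_words`, the coverage ratio used as a similarity measure, and the `min_ratio` threshold are all assumptions introduced here, since the source only states that matching or similar words are output.

```python
from typing import List, Sequence, Tuple


def match_words(
    candidates: Sequence[Sequence[str]],
    word_group: Sequence[str],
    min_ratio: float = 1.0,
) -> List[Tuple[str, float]]:
    """Compare per-position candidate characters against a stored word group.

    candidates[i] holds the candidate characters for the i-th character image.
    A word scores the fraction of its positions covered by a candidate
    (a hypothetical similarity measure). Words scoring at least min_ratio
    are returned, best first.
    """
    results = []
    for word in word_group:
        if not word or len(word) != len(candidates):
            continue  # only words of the same length can be formed
        hits = sum(1 for ch, cands in zip(word, candidates) if ch in cands)
        ratio = hits / len(word)
        if ratio >= min_ratio:
            results.append((word, ratio))
    results.sort(key=lambda item: item[1], reverse=True)
    return results


candidates = [["東", "京"], ["京", "東"], ["都"]]
words = ["東京都", "京都府", "神奈川県"]
print(match_words(candidates, words))
```

Lowering `min_ratio` below 1.0 would let words through that differ from the candidates at some positions, which is one possible reading of "similar" words.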
Meanwhile, character type extraction unit 8, which is the core of the present invention, is placed between word storage unit 6 and character type storage unit 5. It picks up the character types that make up each word in the word group stored in word storage unit 6 and forms, in character type storage unit 5, a restricted character type table corresponding to that word group.
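The extraction step amounts to collecting the distinct characters of the registered words. The sketch below assumes the table can be modeled as a Python set; the function name `extract_character_types` and the two-word group are illustrative choices, not taken from the source.

```python
from typing import Iterable, Set


def extract_character_types(word_group: Iterable[str]) -> Set[str]:
    """Collect every distinct character used in the registered words."""
    character_types: Set[str] = set()
    for word in word_group:
        character_types.update(word)  # adds each character of the word
    return character_types


prefectures = ["東京都", "神奈川県"]  # abbreviated word group for illustration
table = extract_character_types(prefectures)
print(sorted(table))
```

Because the table is derived directly from the word group, registering a new word and rerunning the function keeps the two in step without a manual check.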
To make the effect of the present invention clear, the operation of each unit is described below, taking the recognition of prefecture names as an example.
Prior to recognition, all prefecture names ("東京都" (Tokyo), "神奈川県" (Kanagawa Prefecture), and so on) are registered as a word group in word storage unit 6. Next, character type extraction unit 8 is started. It picks up the character types used in all the prefecture names and forms a restricted table for prefecture names in character type storage unit 5.
When the recognition operation starts, character images 1 are input one after another. As an example, the case where an image of "東京都" written on paper is input is described with reference to FIG. 2.
Character recognition unit 2 recognizes "東", "京", and "都" in that order. As shown in FIG. 2, similar characters such as "束", "京", and "車" are conceivable for "東"; "東", "哀", and "享" for "京"; and "部", "卸", and "郡" for "都" (FIG. 2(a)). By referring to the restricted character type table in character type storage unit 5, "車", "哀", "享", "部", "卸", and "郡" are excluded from the recognition targets. As candidate characters 3, therefore, "東" and "京" for "東", "京" and "東" for "京", and only "都" for "都" are passed to word matching unit 4 as plausible candidates (FIG. 2(b)). The possible word combinations are thus narrowed to four, "東東都", "東京都", "京東都", and "京京都", and word matching unit 4 can easily determine that "東京都" is the valid candidate word. This concludes the explanation of the effect of the character type restriction function.
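The "東京都" example can be traced in a few lines. The sketch below is an illustration under stated assumptions: the word group is abbreviated to four names rather than all prefectures, the unrestricted candidate lists are taken from the similar characters named for FIG. 2(a), and the variable names are hypothetical.

```python
from itertools import product

# Abbreviated word group; a real table would hold all prefecture names.
word_group = {"東京都", "神奈川県", "京都府", "大阪府"}
restricted = {ch for word in word_group for ch in word}

# Candidates per character image without restriction, as in FIG. 2(a).
raw_candidates = [
    ["東", "束", "京", "車"],
    ["京", "東", "哀", "享"],
    ["都", "部", "卸", "郡"],
]

# Keep only candidates whose character type is in the restricted table.
filtered = [[ch for ch in cands if ch in restricted] for cands in raw_candidates]

# Enumerate every word that the remaining candidates can form.
combinations = ["".join(chars) for chars in product(*filtered)]
matches = [word for word in combinations if word in word_group]

print(filtered)
print(combinations)
print(matches)
```

With these illustrative lists, the restriction cuts the combinations to be checked from 64 (4 × 4 × 4) to 4, and only "東京都" is found in the word group.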
Next, a specific example of character type extraction unit 8 is described. FIG. 3 is a schematic diagram of an example of character type storage unit 5, word storage unit 6, and character type extraction unit 8. Word storage unit 6 holds word groups registered as desired, such as prefecture names, city names, personal names, and others. Character type storage unit 5 reserves an area for storing a character type restriction table made up of the character type set corresponding to each word group. From the prefecture name word group, character type extraction unit 8 picks up character types such as "東", "京", "都", "神", "奈", "川", and "県" used in words such as "東京都" and "神奈川県", and registers them in the character type restriction table of the prefecture name character type set in character type storage unit 5. Character types are likewise picked up from city names, personal names, and the other groups and registered as the corresponding character type sets.
In character recognition, when prefecture names are recognized, recognition is limited to the character types registered in the character type restriction table of the prefecture name character type set in character type storage unit 5. Likewise, for city names, personal names, and so on, recognition is limited to the character types registered in the corresponding character type set.
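One way to picture the arrangement of FIG. 3 is a mapping from each word group to its own character type set, with the set chosen according to what is being recognized. The sketch below assumes dictionaries for both storage units; the group keys, the city names, and the function names are hypothetical and added only for illustration.

```python
from typing import Dict, List, Set

# Hypothetical word groups; the city names are examples added for illustration.
word_storage: Dict[str, List[str]] = {
    "prefecture": ["東京都", "神奈川県"],
    "city": ["横浜市", "川崎市"],
}


def build_character_type_storage(groups: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Build one restricted character type set per word group."""
    return {
        name: {ch for word in words for ch in word}
        for name, words in groups.items()
    }


def limit_candidates(
    candidates: List[str], group: str, storage: Dict[str, Set[str]]
) -> List[str]:
    """Drop candidates outside the set registered for the selected group."""
    allowed = storage[group]
    return [ch for ch in candidates if ch in allowed]


character_type_storage = build_character_type_storage(word_storage)
print(limit_candidates(["川", "州", "市"], "prefecture", character_type_storage))
print(limit_candidates(["川", "州", "市"], "city", character_type_storage))
```

The same raw candidates yield different results depending on the selected group, because each group carries only the character types its own words use.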
As described above, by providing a character type extraction unit that extracts all character types from a word group and stores them in the character type storage unit, the present invention creates the restricted character type table for recognition automatically, which eliminates the need for manual registration work in advance.
FIG. 1 is a block diagram of the configuration of a character reading device according to an embodiment of the present invention. FIG. 2 illustrates the restriction of character types in character recognition unit 2 shown in FIG. 1: FIG. 2(a) shows the candidate characters when all character types are recognized, and FIG. 2(b) shows the candidate characters when recognition is limited to the restricted character types. FIG. 3 illustrates the operation of character type extraction unit 8 shown in FIG. 1.
1: character image; 2: character recognition unit; 3: candidate characters; 4: word matching unit; 5: character type storage unit; 6: word storage unit; 7: candidate words; 8: character type extraction unit.
Claims (1)
A character recognition device comprising: a word storage unit that stores a group of target words in advance; a character type storage unit that stores the character types to be recognized in advance; a character recognition unit that recognizes an input character image and outputs candidate characters limited to the character types stored in the character type storage unit; and a word matching unit that checks whether a word formed by combining the candidate characters output by the character recognition unit for a plurality of the character images exists in the word group stored in the word storage unit, the character recognition device being characterized by including a character type extraction unit that picks up all the character types used in the word group registered in the word storage unit and stores them in the character type storage unit.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP1052369A JPH02230488A (en) | 1989-03-03 | 1989-03-03 | Character recognizing device |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP1052369A JPH02230488A (en) | 1989-03-03 | 1989-03-03 | Character recognizing device |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| JPH02230488A (en) | 1990-09-12 |
Family
ID=12912895
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP1052369A Pending JPH02230488A (en) | 1989-03-03 | 1989-03-03 | Character recognizing device |
Country Status (1)
| Country | Link |
|---|---|
| JP (1) | JPH02230488A (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPS60207983A (en) * | 1984-03-31 | 1985-10-19 | Toshiba Corp | Production system of dictionary for recognizing character |
| JPS60233782A (en) * | 1984-05-07 | 1985-11-20 | Nec Corp | Address reader |
| JPS6286475A (en) * | 1985-10-14 | 1987-04-20 | Hitachi Ltd | pattern recognition device |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| KR870011552A (en) | | Document registration method |
| JPH02230488A (en) | | Character recognizing device |
| JPH0388062A (en) | | Device for preparing document |
| JPH0441388B2 (en) | | |
| JPH0944521A (en) | | Index generating device and document retrieval device |
| JPH10198688A (en) | | Standard document reader |
| JP2588261B2 (en) | | Address database search device by OCR |
| JP2874199B2 (en) | | Word dictionary matching device |
| JPS58123126A (en) | | Dictionary retrieving device |
| JPH0962700A (en) | | Dictionary construction method and apparatus |
| JP3380850B2 (en) | | Character recognition device |
| JPS6329882A (en) | | Information registration search device |
| JP2865443B2 (en) | | Kanji conversion device for Kana name or Kana corporation name |
| JP2530659B2 (en) | | Optical character reading system |
| JPS6068425A (en) | | Kana-kanji conversion device with learning function |
| JPS63138479A (en) | | Character recognizing device |
| JPS63131288A (en) | | Word collator |
| JPH0193876A (en) | | Character reader |
| JPS63303481A (en) | | Address reader |
| JPS585840A (en) | | electronic search device |
| JPS61193257A (en) | | Method and apparatus for inputting character symbol of chinese language into terminal |
| JPH023865A (en) | | Retrieving system for kanji character |
| JPS63298583A (en) | | Character recognition post-processing system |
| JPH04250589A (en) | | Word collating method |
| JP2006011653A (en) | | Similar character string search method and similar character string search device |