JPH0338631B2

Info

Publication number: JPH0338631B2
Application number: JP59194128A
Authority: JP
Inventors: Hiroshi Matsumura; Tatsunosuke Iwahara
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 1984-09-17
Filing date: 1984-09-17
Publication date: 1991-06-11
Also published as: JPS6172376A

Description

[Detailed Description of the Invention]

(a) Field of Industrial Application

The present invention relates to a character recognition system that recognizes handwritten kanji, and more particularly to a method for determining the recognition ranking of candidate character type categories.

(b) Prior Art

In general, a character recognition system calculates the similarity between a feature pattern extracted from an input character pattern and the standard feature pattern of each character type category registered in advance in a dictionary section, and selects the n candidate character type categories with the highest similarity. The candidate with the highest similarity is output as the recognition result and, so that misrecognitions can be corrected, the n selected candidates are assigned recognition ranks from 1st to nth in descending order of similarity.

However, determining the recognition ranking from similarity alone, as described above, results in many misrecognitions. Methods have therefore been considered that apply some form of post-processing to determine the recognition ranking after multiple candidate character type categories have been selected by similarity.

Post-processing methods proposed so far include one that performs grammatical processing, as disclosed in Japanese Patent Application Laid-Open No. 59-32082, and one that determines whether the characters before and after the character being recognized are kanji, katakana, or hiragana, as in Japanese Patent Application Laid-Open No. 59-27381.

(c) Problems to Be Solved by the Invention

In the prior art, performing grammatical processing as post-processing requires an enormous knowledge section, such as a grammatical dictionary, and makes the processing itself very complex. With the method that determines whether the neighboring characters are kanji, katakana, and so on, no improvement in the recognition rate can be expected when the selected candidate character type categories consist only of kanji and hiragana.

To improve the recognition rate in a short processing time without requiring an enormous knowledge section, the present applicant therefore assigned a priority to each character type category according to, for example, the stage of school education at which it is learned or its frequency of use, and attempted to determine the recognition ranking of the candidate character type categories using both this priority and the similarity. Because the recognition rate varies with the balance between similarity and priority, however, the question of how best to set that balance became a problem.

(d) Means for Solving the Problems

The present invention calculates the difference in similarity and the difference in priority between candidate character type categories. When the difference in similarity is larger than a predetermined value, the recognition ranks are swapped on the basis of priority only if the difference in priority is larger than a predetermined value; if the difference in priority is smaller than that value, no priority-based swap is performed. The recognition ranking of the candidate character type categories is determined in this way.
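The swap rule stated above can be sketched as a single predicate. This is a minimal illustration and not the patented implementation: the function name, the parameter names, and the convention that a smaller priority number means a higher priority (taken from the embodiment) are assumptions, as is the behavior when the similarities are close, which is borrowed from the embodiment's later description.

```python
def swap_by_priority(sim_upper: float, sim_lower: float,
                     prio_upper: int, prio_lower: int,
                     sim_gap: float, prio_gap: int) -> bool:
    """Decide whether two candidates should exchange ranks because of priority.

    'upper' is the candidate currently ranked higher (larger similarity).
    Priorities are numbered so that 0 is the highest (assumption from the embodiment).
    """
    sim_diff = sim_upper - sim_lower      # how far apart the similarities are
    prio_diff = prio_upper - prio_lower   # > 0 when the lower-ranked candidate has better priority
    if sim_diff > sim_gap:
        # Similarities clearly differ: swap only for a large priority difference.
        return prio_diff > prio_gap
    # Similarities are close: any priority advantage swaps (assumed from the embodiment).
    return prio_diff > 0
```

For example, with `sim_gap=0.2` and `prio_gap=1`, a pair whose similarities are 0.9 and 0.5 is swapped when the priorities are 2 and 0 but left alone when they are 1 and 0.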

(e) Operation

In the present invention, for candidate character type categories whose similarities differ by a certain amount, priority is ignored and the recognition ranking is determined by similarity alone when their priorities are close; the ranking is determined on the basis of priority only when their priorities differ greatly. The recognition ranking is thus determined by making effective use of both similarity and priority, and the recognition rate improves.

(f) Embodiment

FIG. 1 is a block diagram of a character recognition system to which the present invention is applied. Reference numeral 1 denotes a character observation section that reads characters written on an input document and outputs the result as a binary character pattern; 2 denotes a feature extraction section that extracts a feature pattern from the input character pattern; 3 denotes a dictionary section that stores the standard feature pattern of each character type category; and 4 denotes a pattern matching section that matches the extracted feature pattern against the standard feature patterns and calculates the similarity between the two patterns.

The character type categories in the dictionary section 3 are grouped according to frequency or the stage of school education at which they are learned, and a priority is assigned to each category set. For example, as shown in FIG. 2, all character type categories are divided into three category sets: category set 1 (3a) contains the character type categories learned in the first to third years of elementary school, category set 2 (3b) those learned in the fourth to sixth years of elementary school, and category set 3 (3c) those learned in junior high school and beyond. Priorities 0 to 2 are assigned to category sets 1 to 3 in that order.
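The grouping in FIG. 2 amounts to a small lookup from learning stage to priority. The sketch below assumes a hypothetical numbering in which school years 1 to 6 are the elementary grades and 7 or more stands for junior high school and beyond; the function name is also illustrative.

```python
def priority_for_grade(grade: int) -> int:
    """Map the school year in which a character is learned to its priority.

    Grades 1-3 -> category set 1 (priority 0), grades 4-6 -> category set 2
    (priority 1), grade 7 and above -> category set 3 (priority 2).
    The grade numbering beyond elementary school is an assumption.
    """
    if grade < 1:
        raise ValueError("grade must be 1 or greater")
    if grade <= 3:
        return 0
    if grade <= 6:
        return 1
    return 2
```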

The pattern matching section 4 has three arithmetic units 4a to 4c, corresponding to category sets 1 to 3 respectively. Each arithmetic unit selects, from its category set, n candidate character type categories in descending order of similarity and stores their character type codes and calculated similarities in the candidate memory 5. In doing so, the arithmetic unit appends the priority of its category set to the character type code and similarity, so that these three items are stored in the candidate memory 5 as the information for each candidate character type category. The candidate memory 5 thus holds n candidates from each category set, for a total of 3n candidate character type categories.
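The per-set selection can be sketched as follows. The feature vectors, the dictionary layout (one mapping from character type code to standard feature vector per category set), and all names are hypothetical; city block distance is used as the similarity measure, with a smaller distance meaning a greater similarity, as this embodiment specifies.

```python
from typing import Dict, List, NamedTuple, Sequence


class Candidate(NamedTuple):
    code: str       # character type code
    distance: int   # city block distance (smaller = more similar)
    priority: int   # priority of the category set (0 is highest)


def city_block(x: Sequence[int], y: Sequence[int]) -> int:
    """City block (L1) distance between two equal-length feature vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))


def select_candidates(features: Sequence[int],
                      category_sets: List[Dict[str, Sequence[int]]],
                      n: int) -> List[Candidate]:
    """Pick the n closest categories from each set and tag them with its priority."""
    memory: List[Candidate] = []
    for priority, cat_set in enumerate(category_sets):
        scored = sorted((city_block(features, std), code)
                        for code, std in cat_set.items())
        memory.extend(Candidate(code, dist, priority)
                      for dist, code in scored[:n])
    return memory
```

With three category sets the returned list plays the role of the candidate memory 5 and holds up to 3n entries.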

Also in FIG. 1, reference numeral 6 denotes a knowledge section that stores the relationship between similarity and priority, and 7 denotes a clustering control processing section that, referring to the contents of the knowledge section 6, determines the recognition ranking of the top n of the 3n candidate character type categories stored in the candidate memory 5 and stores their character type codes in the result memory 8 in rank order. The answer output control section 9 outputs the character type code ranked first as the recognition result to a character display device such as a word processor or personal computer, which displays that character. If the operator indicates a misrecognition, the answer output control section 9 outputs the character type codes ranked second and below one after another, controlling the output so that the correct recognition result is displayed.
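The behavior of the answer output control section 9 can be pictured as walking down the ranked list until the operator stops rejecting. In this sketch the operator's judgment is modeled by a hypothetical callback `is_rejected`; the text does not say how the indication is actually delivered.

```python
from typing import Callable, Iterable, Optional


def output_answer(ranked_codes: Iterable[str],
                  is_rejected: Callable[[str], bool]) -> Optional[str]:
    """Present codes in rank order and return the first one the operator accepts.

    Returns None when every ranked candidate has been rejected.
    """
    for code in ranked_codes:
        if not is_rejected(code):
            return code
    return None
```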

Next, the processing of the clustering control processing section 7 and the contents of the knowledge section 6 are described in more detail.

In this embodiment, the city block distance d is used as the measure of similarity, with a smaller distance meaning a greater similarity. The knowledge section 6 stores three thresholds against which the city block distance d of each candidate character type category is compared; their subscripts are illegible in the source, so they are written here as D_1, D_2, D_3 (D_1 < D_2 < D_3). As shown in the flowchart of FIG. 3, the clustering control processing section 7 first selects, from the 3n candidate character type categories stored in the candidate memory 5, the n candidates G_1, G_2, ..., G_n with the smallest distances, and stores them in ascending order of distance in the working memory 7a. It then compares the distances d_1, d_2, ..., d_n of these n candidates with the thresholds D_1, D_2, D_3, thereby dividing the candidates G_1 to G_n into four classes A, B, C, and D, as shown in FIG. 4.

For this purpose, the clustering control processing section 7 has class pointers 10a to 10d, one per class, each indicating up to which address of the working memory 7a the candidates belonging to that class extend. As shown in the flowchart of FIG. 3, the contents TA, TB, TC, and TD of the pointers 10a, 10b, 10c, and 10d are incremented as the distances are compared. For example, if the working memory 7a holds class A candidates at addresses 0 to 1, class B candidates at addresses 2 to 4, class C candidates at addresses 5 to 6, and a class D candidate at address 7, the classification leaves the pointers at TA = 2, TB = 5, TC = 7, TD = 8.
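The classification step can be sketched as below, following the pointer example just given, in which each pointer holds the address one past the last member of its class. The function name is hypothetical, and treating a distance equal to a threshold as belonging to the next class is an assumption, since the text does not say how ties are handled. A later example in the text gives TD = 0 when class D is empty, which this sketch does not reproduce.

```python
from bisect import bisect_left
from typing import Sequence, Tuple


def class_pointers(distances: Sequence[float],
                   thresholds: Tuple[float, float, float]) -> Tuple[int, int, int, int]:
    """Return (TA, TB, TC, TD) for distances already sorted in ascending order.

    Class A: d < D_1, class B: D_1 <= d < D_2, class C: D_2 <= d < D_3,
    class D: everything else. Each pointer is the end address (exclusive)
    of its class in the working memory.
    """
    d1, d2, d3 = thresholds
    ta = bisect_left(distances, d1)   # number of entries below D_1
    tb = bisect_left(distances, d2)   # number of entries below D_2
    tc = bisect_left(distances, d3)   # number of entries below D_3
    td = len(distances)               # class D runs to the end of the list
    return ta, tb, tc, td
```

Slicing the working memory with consecutive pointers, for example `memory[ta:tb]` for class B, then yields the members of each class.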

The knowledge section 6 stores in advance, as shown in FIG. 4, whether priority-based reordering is permitted in each class and under what conditions. The clustering control processing section 7 refers to these contents to determine the recognition ranking of the candidate character type categories stored in the candidate memory 5.

Let a candidate character type category G_I with character type code M_I, city block distance d_I, and priority P_I (I = A, B, ..., H, K, L, ..., R, S, T, ..., Z; P_I = 0, 1, 2) be written as (M_I, d_I, P_I). Suppose that, as shown in FIG. 5(a), the eight candidates with the highest similarity from each of category sets 1 to 3 have been selected into the candidate memory 5, and that their city block distances satisfy d_S < d_K < d_L < d_T < d_A < d_U < d_M < d_B < .... Then, as shown in FIG. 5(b), the eight candidate character type categories G_S to G_B are selected and stored in the working memory 7a of the clustering control processing section 7 in ascending order of distance d_I.

Suppose further that the city block distances d_I and the three thresholds, written here as D_1 < D_2 < D_3, are related by, for example, d_K < D_1 < d_L, d_A < D_2 < d_U, and d_B < D_3. The clustering control processing section 7 then performs the classification, and the contents of the class pointers 10a to 10d become TA = 2, TB = 5, TC = 8, TD = 0. That is, candidates G_S and G_K fall into class A, candidates G_L, G_T, and G_A into class B, and candidates G_U, G_M, and G_B into class C. Next, the clustering control processing section 7 refers to the knowledge section 6 and reorders the candidates within each class. In what follows, the character type codes of the candidates G_1 to G_8 at addresses 0 to 7 of the working memory 7a are denoted M(1) to M(8), their city block distances d(1) to d(8), and their priorities P(1) to P(8), and the character type code of the category whose final recognition rank is i is denoted m(i).

First, since no priority-based reordering is performed for class A, the character type codes M_S and M_K are fixed as rank 1, m(1), and rank 2, m(2), and written to the result memory 8 in that order. The three class B character type codes M_L, M_T, and M_A are then processed as shown in the flowchart of FIG. 6. For class B, the initial values are m(i) = M(TA + 1), i = TA + 1, l = TB, and τ_0 = a.

In the example above, TA = 2 and TB = 5, so the character type code for m(3) is determined first. Assume here that d_T − d_L > a and d_A − d_T > a, and therefore d_A − d_L > a.

First, the distance difference τ = d_T − d_L between G_L and G_T is calculated and compared with the predetermined value a. Since τ > a, the priority difference t = P_L − P_T is calculated next and compared with the predetermined value 1. Because P_L is a higher priority than P_T, t = −1, and m(3), that is, M_L, is not swapped with M_T. Next, the distance difference τ = d_A − d_L between G_L and G_A is calculated and compared with a; since τ > a, the priority difference t = P_L − P_A is calculated and checked against the predetermined value 1. P_A is a higher priority than P_L, but t = 1, so M_L and M_A are not swapped on the basis of priority. At this point the third recognition rank m(3) is fixed as M_L, and M(3) to M(5) remain M_L, M_T, M_A.

The character type code for m(4) is determined next. Again, the distance difference τ = d_A − d_T between G_T and G_A is calculated and compared with a. Since τ > a, the priority difference t = P_T − P_A is calculated and checked against 1. Here t = 2, so M_T and M_A are swapped on the basis of priority, and the character type code m(4) of recognition rank 4 is fixed as M_A. When processing for m(4) ends, i becomes 5, which equals l = TB = 5, so the priority-based reordering for class B is complete, and ranks 3 to 5 are fixed as M_L, M_A, M_T. Had the order within class B been determined simply by priority, it would have been M_A, M_L, M_T.
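The class B walk-through can be reproduced with a short sketch of the reordering loop. This is one reading of the FIG. 6 flowchart, which is not reproduced in the text, so several details are assumptions: the names, the use of the absolute distance difference once earlier swaps have disturbed the distance order, and the rule that candidates whose distances differ by no more than τ_0 are swapped on any priority advantage. The numeric distances are invented for illustration; only the priorities (P_L = 1, P_T = 2, P_A = 0) follow the example.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    code: str
    distance: float
    priority: int   # 0 is the highest priority


def reorder_class(cands: List[Candidate], tau0: float, t0: int = 1) -> List[Candidate]:
    """Reorder one class, fixing ranks from the top, by pairwise comparison."""
    out = list(cands)
    for i in range(len(out) - 1):            # rank position being decided
        for j in range(i + 1, len(out)):     # compare with every later candidate
            tau = abs(out[j].distance - out[i].distance)   # distance difference
            t = out[i].priority - out[j].priority          # priority difference
            # Far apart: swap only if t exceeds t0. Close: swap on any advantage.
            swap = t > t0 if tau > tau0 else t > 0
            if swap:
                out[i], out[j] = out[j], out[i]
    return out


# Class B of the example, with distances spaced more widely than a = 5.
class_b = [Candidate("L", 10, 1), Candidate("T", 20, 2), Candidate("A", 30, 0)]
ranked = [c.code for c in reorder_class(class_b, tau0=5)]
```

Under these assumptions `ranked` is `["L", "A", "T"]`, matching ranks 3 to 5 in the text. If the same three candidates are given distances within 5 of one another, the loop yields `["A", "L", "T"]`, the pure priority order.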

If, on the other hand, the distances between the candidates were closer than in the example above, for instance d_T − d_L ≤ a, d_A − d_T ≤ a, and d_A − d_L ≤ a, priority-based reordering would be performed regardless of the size of the priority difference, and the recognition ranking in this case would be M_A, M_L, M_T.

The character type codes M_L, M_A, M_T determined in this way are sent to the result memory 8 and stored in this order after M_S and M_K.

Next, the clustering control processing section 7 moves on to the processing of class C.

For class C, the initial values are set to m(i) = M(TB + 1), i = TB + 1, l = TC, and τ_0 = b, and the processing shown in the flowchart of FIG. 6 is performed as for class B. Suppose, for example, that d_M − d_U > b and d_B − d_M > b, and therefore d_B − d_U > b. Then τ = d_M − d_U > b, and although M_M has a higher priority than M_U, the priority difference t = P_U − P_M = 1 ≤ 1, so M_U and M_M are not swapped on the basis of priority. Next, τ = d_B − d_U is calculated; τ > b and P_U − P_B = 2 > 1, so M_U and M_B are swapped on the basis of priority, giving the order M_B, M_M, M_U, and the character type code m(6) of rank 6 is fixed as M_B. M_M has a higher priority than M_U, so their order is left unchanged. Ranks 6 to 8 are therefore fixed as M_B, M_M, M_U and stored in the result memory 8 in this order.

Once the ranking has been determined in this way, the result memory 8 holds the eight character type codes in the order shown in FIG. 5(c).

Thus, in the present invention, when the distance difference between two candidate character type categories is larger than a predetermined value, no priority-based swap is performed if their priorities are close; a priority-based swap is performed only when their priorities are far apart.

In this embodiment the priority has three levels, 0, 1, and 2, but the present invention is equally applicable with more levels. Also, although the dictionary section 3 is divided into category sets by priority in this embodiment, the dictionary section 3 need not be divided if priority information is attached to each character type category. Furthermore, in the concrete processing, the order may be the reverse of the flowchart of FIG. 6, checking the priority difference first and the distance difference afterward.

The candidate character type categories selected and stored in the working memory 7a do not necessarily include members of every class; all of them may belong to class B alone or class C alone, in which case the entire recognition ranking is determined within that single class.

In experiments, when all candidate character type categories of the same class were ranked by priority, the first-rank recognition rate was about 90.0%, whereas applying the present invention raised it to about 92.5%.

(g) Effects of the Invention

According to the present invention, the recognition ranking of candidate character type categories can be determined by making effective use of both similarity and priority, which improves the recognition rate. In addition, the knowledge section does not require an enormous capacity, and the ranking can be determined in a short processing time.

[Brief explanation of drawings]

FIG. 1 is a block diagram of a character recognition system to which the present invention is applied; FIG. 2 is an explanatory diagram showing the contents of the category sets; FIG. 3 is a flowchart showing the classification processing of the clustering control processing section; FIG. 4 is an explanatory diagram showing the contents of the knowledge section; FIG. 5 is an explanatory diagram showing a concrete example of recognition ranking determination; and FIG. 6 is a flowchart showing the reordering processing of the clustering control processing section.

Main reference numerals: 3, dictionary section; 4, pattern matching section; 5, candidate memory; 6, knowledge section; 7, clustering control processing section; 8, result memory; 9, answer output control section.

Claims

[Claims]

1. A recognition ranking determination method in which the similarity between a feature pattern extracted from an input character pattern and the standard feature pattern of each character type category registered in advance in a dictionary section is calculated, a plurality of candidate character type categories are selected, a priority is assigned to each character type category, and the recognition ranking of the candidate character type categories is determined on the basis of the similarity and the priority, characterized in that the difference in similarity and the difference in priority between candidate character type categories are calculated and, when the difference in similarity is larger than a predetermined value, the recognition ranks are swapped on the basis of priority only if the difference in priority is larger than a predetermined value, and are not swapped on the basis of priority if the difference in priority is smaller than the predetermined value.