JPH1027216A

JPH1027216A - Character pattern recognition processing method and apparatus

Info

Publication number: JPH1027216A
Application number: JP8179404A
Authority: JP
Inventors: Minoru Mori; 稔森; Toru Wakahara; 徹若原; Kazumi Odaka; 和己小高
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: NTT Inc
Priority date: 1996-07-09
Filing date: 1996-07-09
Publication date: 1998-01-27

Abstract

(57) [Abstract]

[Problem] To make recognition less sensitive to variations in the slant of character strokes and in how strokes connect, so that characters subject to heavy handwriting deformation can be recognized.

[Solution] A binarized character pattern is divided into a plurality of coarse mesh regions (301). For each white pixel of the background that forms the contour together with the black pixels of the character portion in each mesh region, tentacles are extended in a plurality of predetermined directions, and the connection length of white pixels is obtained for each direction (302). From these white-pixel connection lengths, directional contribution values, which represent how the background around the white pixel is distributed over the directional components, are computed for each mesh region and used as the feature of the character pattern (303). This feature represents the two-dimensional structure of the character and is not easily affected by changes in stroke slant or stroke connection, so characters with heavy handwriting deformation can be recognized.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a method and an apparatus for recognizing character patterns. In particular, it relates to a character pattern recognition method and apparatus that operate on a binarized version of a character pattern obtained by photoelectric conversion and are suited to recognizing characters, such as handwritten kanji, that comprise many character categories and exhibit diverse handwriting deformation.

[0002]

[Prior art] Various methods have been proposed for character pattern recognition, one of which is the mesh division method. The following are known as representative methods of this kind.

[0003] In the first method, a character pattern that has been binarized and normalized in position and size is divided into a plurality of coarse mesh regions. The character portion in each mesh region is observed from coordinate axes in a plurality of directions, and at each position on a coordinate axis the number of character strokes crossed in the direction orthogonal to that axis is counted. A feature vector pattern is built from this information and matched against a previously stored feature dictionary table of the individual characters to recognize the character pattern.

[0004] In the second method, a character pattern that has likewise been binarized and normalized in position and size is divided into a plurality of coarse mesh regions. The character portion in each mesh region is observed from coordinate axes in a plurality of directions, and the directional contribution of the character stroke is obtained for each black pixel of the character portion that is encountered when scanning from the coordinate axis. The character is recognized from these values.

[0005]

[Problem to be solved by the invention] In the prior art described above, the first method can distinguish rough differences in stroke-structure complexity between character categories from the number of strokes crossed, but it carries no information about finer differences in stroke structure. It therefore cannot adequately recognize character sets that contain many similar characters and much handwriting deformation. The second method, in turn, cannot adequately recognize characters with heavy handwriting deformation such as changes in stroke slant and stroke connection.

[0006] An object of the present invention is to provide a character pattern recognition method and apparatus that, for a character pattern that has been binarized and normalized in position and size, use a feature that captures information on the two-dimensional structure of the character and is not easily affected by changes in stroke slant, stroke connection, and the like, thereby making it possible to recognize characters with heavy handwriting deformation of those kinds.

[0007]

[Means for solving the problem] To achieve the above object, the present invention is characterized in that information on the mutual arrangement of character strokes is obtained by calculating, for each of a plurality of coarse mesh regions into which the character pattern is divided, the directional contribution of the white points of the background that form the contour together with the black points of the character portion. This yields the two-dimensional structure of the character while being less affected by changes in stroke slant or stroke connection, so characters with heavy handwriting deformation can be recognized.

[0008]

[Embodiments of the invention] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is an overall configuration diagram illustrating an embodiment of the present invention. In the figure, 10 is a binarized input character pattern, 20 is a preprocessing unit, 30 is a feature extraction unit, 40 is an identification processing unit, 50 is a feature dictionary table, and 60 is the identification result.

[0009] The preprocessing unit 20 first normalizes position, for example with a conventionally known position normalization method: the averages of the x and y coordinates of all black points making up the input character pattern 10 are computed and defined as the centroid of the character, and the entire input character pattern is then translated so that the centroid lies at the center of the character frame. Likewise, using a conventionally known size normalization method, the character pattern is enlarged or reduced uniformly about the centroid so that the average distance from the centroid to each stroke point equals a predetermined normalization radius. In addition, if the character pattern extends beyond the character frame after position and size normalization, the preprocessing unit 20 removes the part of the character outside the frame (frame clipping). It also fills or removes one-mesh bumps and dents of black points along the stroke contours of the normalized character pattern (smoothing).
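The position and size normalization described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `normalize` and its parameters `frame_size` and `radius` are hypothetical, black points are given as integer `(x, y)` pairs, and scaling is done by simple forward mapping with rounding, which can leave gaps when enlarging and would need interpolation or smoothing in practice.

```python
import math

def normalize(points, frame_size, radius):
    """Centre black points on their centroid and scale them to a target radius.

    points     -- iterable of (x, y) black-point coordinates (assumed format)
    frame_size -- side length of the square character frame (hypothetical name)
    radius     -- predetermined normalization radius (hypothetical name)
    """
    points = list(points)
    if not points:
        raise ValueError("character pattern has no black points")
    n = len(points)
    # Centroid: mean of the x and y coordinates of all black points.
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Mean distance from the centroid to each black point.
    mean_dist = sum(math.hypot(x - cx, y - cy) for x, y in points) / n
    scale = radius / mean_dist if mean_dist > 0 else 1.0
    centre = (frame_size - 1) / 2
    out = set()
    for x, y in points:
        # Translate the centroid to the frame centre and scale about it.
        nx = round(centre + (x - cx) * scale)
        ny = round(centre + (y - cy) * scale)
        # Frame clipping: drop points that fall outside the character frame.
        if 0 <= nx < frame_size and 0 <= ny < frame_size:
            out.add((nx, ny))
    return sorted(out)
```

For example, four points at the corners of a 3×3 square, scaled by a factor of two inside a 9×9 frame, end up symmetric about the frame centre, while a larger radius pushes them out of the frame and they are clipped.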

[0010] FIG. 2 shows, for the character 木 ("tree"), an example of how the input character pattern 10 is normalized by the preprocessing unit 20. FIG. 2(a) is an example of the input character pattern 10. FIG. 2(b) is the character pattern after the preprocessing unit 20 has normalized the position and size of the input character pattern 10. FIG. 2(c) shows a case in which the character pattern extends beyond the character frame after position and size normalization, FIG. 2(d) shows the result of applying frame clipping to it, and FIG. 2(e) shows the result of smoothing.

[0011] The feature extraction unit 30 is the main part of the present invention. It receives the character pattern normalized by the preprocessing unit 20 and divides it into a plurality of coarse mesh regions. For each white point of the background that forms the contour together with the black points of the character portion in each mesh region, it extends tentacles in a plurality of predetermined directions, counts the number of connected white points in each direction, and obtains the directional contribution of that white pixel.

[0012] FIG. 3 shows the processing flowchart of the feature extraction unit 30. Step 301 divides the character pattern into a plurality of coarse mesh regions. Step 302, in each mesh region, extends tentacles in a plurality of directions (for example, with eight directions: 0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315°) from each white pixel (white point) of the background that forms the contour together with the black pixels (black points) of the character portion, and obtains the connection length of white pixels for each direction. Step 303 computes, for each mesh region, the directional contribution values that represent the distribution of the background around the white pixel by directional component, based on the white-pixel connection lengths, and uses them as the feature of the character pattern. The concrete algorithm of the feature extraction unit 30 is described later.
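Step 302 can be sketched in Python as follows. This is an illustration under assumptions that the text does not fix: the image is a square list of rows with 1 for black and 0 for white, a contour white point is taken to be a white pixel with at least one black pixel among its eight neighbours, the connection length counts the white pixels beyond the starting point until a black pixel or the frame edge is reached, and 90° is taken to point upward in image coordinates. The names `DIRS`, `is_contour_white`, and `connection_lengths` are hypothetical.

```python
# (dx, dy) steps for 0, 45, 90, ..., 315 degrees; y grows downward (assumption).
DIRS = [(1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1)]

def is_contour_white(img, x, y):
    """True if (x, y) is white and touches a black pixel (8-neighbourhood assumed)."""
    n = len(img)
    if img[y][x] != 0:
        return False
    return any(
        0 <= x + dx < n and 0 <= y + dy < n and img[y + dy][x + dx] == 1
        for dx, dy in DIRS
    )

def connection_lengths(img, x, y):
    """Extend a tentacle from (x, y) in each direction and count connected white pixels."""
    n = len(img)
    lengths = []
    for dx, dy in DIRS:
        length = 0
        cx, cy = x + dx, y + dy
        # Walk until a black pixel or the edge of the frame stops the tentacle.
        while 0 <= cx < n and 0 <= cy < n and img[cy][cx] == 0:
            length += 1
            cx += dx
            cy += dy
        lengths.append(length)
    return lengths
```

With this convention a tentacle that points straight at an adjacent stroke has length 0, so the lengths reflect on which side of the white point the stroke lies.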

[0013] The identification unit 40 creates a feature table for identifying the character pattern from the directional contribution values obtained by the feature extraction unit 30. On the basis of this feature table, it performs matching against the previously stored feature dictionary table 50 of the individual characters using a conventionally known matching method, and thereby identifies the character pattern.

[0014] Next, as the processing algorithm of the feature extraction unit 30, the following case is described: the character pattern is divided into K coarse mesh regions, tentacles are extended in eight directions (0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315°) from each white point of the background that forms the contour together with the black points of the character portion in each mesh region, the directional contribution of the background is obtained, and the character pattern is identified.

[0015] FIG. 4 shows how the N×N-mesh character pattern obtained by the preprocessing unit 20 is divided equally into K coarse mesh regions, for example square mesh regions. In FIG. 4, 1, 2, ..., k, ..., K denote the individual mesh regions. FIG. 5 shows the eight directions (0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315°) in which tentacles are extended to obtain the white-point connection lengths, and FIG. 6 shows how the white-point connection lengths are obtained for the background that forms the contour together with the black points of the character portion. In the following, the eight directions 0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315° are numbered 1, 2, 3, 4, 5, 6, 7, and 8, respectively.
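The equal division into K square mesh regions and the direction numbering can be sketched in Python as follows. The sketch assumes that K = m × m, that N is divisible by m, and that the regions are numbered row by row from the top left; the numbering order shown in FIG. 4 is not recoverable from the text, so this order is an assumption. The names `DIRECTION_ANGLES` and `mesh_index` are hypothetical.

```python
# Direction numbers 1..8 mapped to angles in degrees, as numbered in the text.
DIRECTION_ANGLES = {i + 1: 45 * i for i in range(8)}

def mesh_index(x, y, n, m):
    """Return the 1-based index k of the coarse mesh region containing pixel (x, y).

    n -- side length N of the N x N character pattern
    m -- number of coarse regions per side, so K = m * m (assumption)
    Regions are numbered row by row from the top left (assumption).
    """
    if n % m != 0:
        raise ValueError("N must be divisible by the number of regions per side")
    cell = n // m  # side length of one coarse mesh region in pixels
    return (y // cell) * m + (x // cell) + 1
```

For a 64×64 pattern split into 4×4 regions, each region covers 16×16 pixels and k runs from 1 to 16.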

[0016] In the k-th mesh region (1, 2, ..., k, ..., K), the directional contribution f of each white point of the background that forms the contour together with the black points of the character portion is expressed as the eight-dimensional vector f = (α_1, α_2, α_3, α_4, α_5, α_6, α_7, α_8). Here α_1, α_2, ..., α_8 are the directional contribution components for the eight directions. Using the white-point connection lengths l_i (i = 1, 2, ..., 8) obtained for each direction by extending tentacles in the eight directions from the white point in question, they are expressed, as an example, by

[0017]

[Equation 1]

[0018] A distance other than the Euclidean distance shown here may also be used for α_i.
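Equation 1 is not reproduced in this text. The Python sketch below is therefore only one plausible reading, based on the mention of a Euclidean distance: it assumes that each α_i is the connection length l_i divided by the Euclidean norm of all eight lengths, so that f has unit length. Both this normalization and the function name `directional_contribution` are assumptions, not the patent's stated formula.

```python
import math

def directional_contribution(lengths):
    """Normalize connection lengths l_1..l_8 by their Euclidean norm (assumed form)."""
    norm = math.sqrt(sum(l * l for l in lengths))
    if norm == 0:
        # Isolated white point with no white run in any direction.
        return [0.0] * len(lengths)
    return [l / norm for l in lengths]
```

Under this reading the components depend only on the relative lengths in the eight directions, not on their absolute scale; replacing the Euclidean norm by another norm gives the alternative distances that the text allows.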

[0019] The vector f obtained in this way is computed for every white point of the background that forms the contour together with the black points of the character portion in each mesh region, and the results are accumulated, or the accumulated values are averaged over the number of white points. The feature pattern f_k obtained for the k-th mesh region is then expressed as f_k = (α_k1, α_k2, ..., α_k8). Here α_k1, α_k2, ..., α_k8 are the elements of the vector obtained by accumulating, per directional component, the directional contribution vectors of all such white points in the k-th mesh region, or those elements averaged over the number of white points. The feature vector F of the character pattern is therefore expressed as F = (f_1, f_2, ..., f_k, ..., f_K).
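The accumulation per mesh region can be sketched in Python as follows. The function name `build_feature_vector` and its input format, a list of `(k, f)` pairs giving the 1-based region index and the directional contribution vector of one contour white point, are assumptions made for illustration.

```python
def build_feature_vector(samples, num_regions, dim=8, average=False):
    """Accumulate directional contribution vectors per mesh region and concatenate.

    samples     -- iterable of (k, f): 1-based region index and a vector of length dim
    num_regions -- number of coarse mesh regions K
    average     -- if True, divide each region's sums by its number of white points
    """
    sums = [[0.0] * dim for _ in range(num_regions)]
    counts = [0] * num_regions
    for k, f in samples:
        counts[k - 1] += 1
        for i, a in enumerate(f):
            sums[k - 1][i] += a  # accumulate per directional component
    if average:
        for k in range(num_regions):
            if counts[k]:
                sums[k] = [s / counts[k] for s in sums[k]]
    # F = (f_1, f_2, ..., f_K) flattened into a single list of K * dim values.
    return [v for region in sums for v in region]
```

Regions that contain no contour white point contribute a zero vector, so F always has K × dim elements.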

[0020] A feature table is created using the values of the elements of the feature vector F expressed in this way as the features of the character pattern. The identification unit 40 then evaluates a discriminant function D(F), for example a conventionally known discriminant function such as the Euclidean distance, and identifies the character pattern.

[0021] Here, the discriminant function computes a distance value between the feature vector of the input character pattern and the feature vector of each character category stored in advance in the feature dictionary table 50, and the character giving the smallest distance value (or, depending on the function, the largest value) is output as the candidate character.

[0022] Let F = (f_1, f_2, ..., f_k, ..., f_K) be the feature vector of the input character pattern obtained by the feature extraction unit 30, and let S_i = (s_i1, s_i2, ..., s_iK) be the feature vector of each character i (1 ≤ i ≤ M) in the feature dictionary table 50. In the case of the Euclidean distance, for example, the identification unit 40 computes, over the character categories i = 1 to M,

[0023]

[Equation 2]

[0024] and outputs the character category i that gives the smallest value as the correct character pattern.
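The minimum-distance matching described above can be sketched in Python as follows. The dictionary layout, a mapping from a character label to its stored feature vector, and the names `euclidean` and `classify` are assumptions made for illustration. Equation 2 itself is not reproduced in this text, so the standard Euclidean distance between F and S_i is used here.

```python
import math

def euclidean(a, b):
    """Standard Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(feature, dictionary):
    """Return the label whose stored feature vector is closest to `feature`.

    dictionary -- mapping of character label to flattened feature vector (assumed layout)
    """
    return min(dictionary, key=lambda label: euclidean(feature, dictionary[label]))
```

For a similarity-type discriminant function, where the largest value wins, `min` would be replaced by `max`.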

[0025] Next, an example applying a different algorithm in the feature extraction unit 30 is described. The N×N-mesh character pattern obtained by the preprocessing unit 20 is divided equally into K coarse mesh regions, for example square mesh regions. In this algorithm, in the k-th mesh region (1, 2, ..., k, ..., K), the directional contribution g of each white point of the background that forms the contour together with the black points of the character portion is expressed as the four-dimensional vector g = (β_1, β_2, β_3, β_4). Here β_1, β_2, β_3, β_4 are the directional contribution components for four directions. Using the white-point connection lengths l_i (i = 1, 2, ..., 8) obtained for each direction by extending tentacles in the eight directions from the white point in question, they are expressed, as an example, by

[0026]

[Equation 3]

[0027] A distance other than the Euclidean distance shown here may also be used for β_i.
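Equation 3 is not reproduced in this text, so how the eight connection lengths are reduced to four components is not recoverable here. The Python sketch below shows one plausible reading only: it assumes that the lengths of opposite directions (0° with 180°, 45° with 225°, and so on) are added, and that the four sums are then divided by their Euclidean norm. Both steps and the name `four_direction_contribution` are assumptions.

```python
import math

def four_direction_contribution(lengths):
    """Reduce eight connection lengths to four orientation components (assumed form).

    lengths -- [l_1, ..., l_8] for 0, 45, ..., 315 degrees
    """
    if len(lengths) != 8:
        raise ValueError("expected eight connection lengths")
    # Merge each direction with its opposite: l_i + l_(i+4) for i = 1..4.
    merged = [lengths[i] + lengths[i + 4] for i in range(4)]
    norm = math.sqrt(sum(v * v for v in merged))
    if norm == 0:
        return [0.0] * 4
    return [v / norm for v in merged]
```

Under this reading g describes the orientation of the background (horizontal, vertical, and the two diagonals) without distinguishing on which side of the white point the free space lies.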

[0028] The vector g obtained in this way is computed for every white point of the background that forms the contour together with the black points of the character portion in each mesh region, and the results are accumulated, or the accumulated values are averaged over the number of white points in each mesh region. The feature pattern g_k obtained for the k-th mesh region is then expressed as g_k = (β_k1, β_k2, β_k3, β_k4). Here β_k1, β_k2, β_k3, β_k4 are the elements of the vector obtained by accumulating, per directional component, the directional contribution vectors of all such white points in the k-th mesh region, or those elements averaged over the number of white points. The feature vector G of the character pattern is therefore expressed as G = (g_1, g_2, ..., g_k, ..., g_K).

[0029] A feature table is created using the values of the elements of the feature vector G expressed in this way as the features of the character pattern. The identification unit 40 then evaluates a discriminant function D(G), for example a conventionally known discriminant function such as the Euclidean distance, and identifies the character pattern.

[0030]

[Effects of the invention] As described above, according to the present invention, obtaining the directional contribution of the background makes it possible to extract the relative positions of the character strokes and the shape in the vicinity of the contour. The feature is therefore less affected by variations in stroke slant and stroke connection, and characters with heavy handwriting deformation can be recognized.

[Brief description of the drawings]

FIG. 1 is an overall configuration diagram showing an embodiment of the present invention.

FIG. 2 is a diagram showing the preprocessing performed in the preprocessing unit.

FIG. 3 is a processing flowchart of the feature extraction unit.

FIG. 4 is a diagram showing how the feature extraction unit divides a character pattern into coarse mesh regions.

FIG. 5 is a diagram showing the case in which eight directions are used as the directions in which tentacles are extended to obtain the white-point connection lengths in the feature extraction unit.

FIG. 6 is a diagram showing how the feature extraction unit obtains the white-point connection lengths of the background that forms the contour together with the black points of the character portion.

[Explanation of symbols]

10: input character pattern; 20: preprocessing unit; 30: feature extraction unit; 40: identification unit; 50: feature dictionary table; 60: identification result

Claims

[Claims]

1. A character pattern recognition processing method, characterized by: dividing a binarized character pattern into a plurality of mesh regions; in each mesh region, for the white pixels of the background that form the contour together with the black pixels of the character portion, extending tentacles in a plurality of predetermined directions and obtaining the connection length of white pixels for each direction; computing, for each mesh region, directional contribution values that are obtained from the white-pixel connection lengths and represent the distribution of the background around the white pixels by directional component; and recognizing the character pattern using the directional contribution values.

2. A character pattern recognition apparatus, comprising: a preprocessing unit that normalizes the position and size of the character in a binarized character pattern; a feature extraction unit that divides the normalized character pattern into a plurality of mesh regions, extends tentacles in a plurality of predetermined directions, in each mesh region, for the white pixels of the background that form the contour together with the black pixels of the character portion, obtains the connection length of white pixels for each direction, computes from the white-pixel connection lengths, for each mesh region, directional contribution values that represent the distribution of the background around the white pixels by directional component, and uses them as the feature of the character pattern; and an identification unit that performs identification processing using the feature.