JPH0327488A

JPH0327488A - Character recognizing device

Info

Publication number: JPH0327488A
Application number: JP1160937A
Authority: JP
Inventors: Toru Ishikawa; 石河　融; Hiroshi Yoshida; 浩史吉田; Koichi Higuchi; 浩一樋口; Yoshiyuki Yamashita; 山下　義征
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1989-06-26
Filing date: 1989-06-26
Publication date: 1991-02-05
Anticipated expiration: 2013-11-25
Also published as: JP2827288B2

Abstract

PURPOSE: To read, at high speed and with high precision, even a business form in which characters of different character styles are mixed, by extracting feature point coordinates to detect italic characters, deciding the character style, and selecting a dictionary mask for collation based on the decision. CONSTITUTION: An X coordinate and a Y coordinate are assigned to each picture element of the picture data of an input character pattern segmented character by character, so that the pattern is represented in an X-Y coordinate system. The maximum and minimum of the calculated value αX+βY are then detected using the X and Y coordinates and specified values of α and β, and the X and Y coordinates of the picture elements of the input character pattern that give these maximum and minimum values are detected. A geometrical feature quantity is calculated from the detected X and Y coordinates, and the character style is decided from the calculated feature quantity. A recognition dictionary mask corresponding to the decided character style is selected, the input character pattern is collated using the selected dictionary mask, and the input character is identified. Thus, a business form containing plural character styles can be read at high speed.

Description

DETAILED DESCRIPTION OF THE INVENTION (Field of Industrial Application) The present invention relates to a character recognition device that can read, at high speed and with high accuracy, even forms containing characters in a plurality of fonts.

(Prior Art) A conventional character recognition device is disclosed in, for example, Japanese Examined Patent Publication No. Sho 60-38756, and comprises the following components (a) to (f).

(a) A character figure is photoelectrically converted and quantized to create an original pattern, a digital signal represented by black bits and white bits.

(b) Next, the line width of the original pattern is calculated.

(c) Next, the original pattern is scanned in a plurality of directions to detect the number of consecutive black bits in each scanning line, and a plurality of sub-patterns, one for each scanning direction, are extracted based on the consecutive black-bit counts and the line width.

(d) Next, for each sub-pattern, the area inside the character frame of the original pattern is divided into (N × M) regions (N and M are constants), and a feature amount is calculated from the result of counting black points cell by cell within each divided region and the line width of each sub-pattern.

(e) Next, the feature amount is normalized by the character size to create a feature matrix.

(f) Finally, the feature matrix is collated with standard character masks of character figure patterns prepared in advance, and the character figure is recognized.

When such a character recognition device recognizes a form containing multiple fonts, for example English text in which only specific words are printed in italics to emphasize personal names or place names, as shown in FIG. 9, a widely used method is to prepare in advance, as dictionary masks, standard character masks for all fonts to be recognized, and to perform recognition by collating the input character figure against all of these dictionary masks.

(Problems to be Solved by the Invention) However, in the conventional method described above, the number of dictionary matrices grows in proportion to the number of fonts, so the number of collations increases. This significantly reduces recognition speed and, furthermore, lowers recognition accuracy.

The present invention has been made to solve these problems, and its object is to provide a character recognition device that can read forms containing a plurality of fonts at high speed and, moreover, with high accuracy.

(Means for Solving the Problems) To solve the above problems, the present invention is a character recognition device that photoelectrically converts a feature extraction target on a medium to obtain quantized image data, cuts out character patterns one character at a time from the image data, determines the font of each cut-out input character pattern, selects a recognition dictionary mask based on the determination result, and collates the input character pattern using the selected dictionary mask to recognize the character. The device is characterized by comprising: a font determining section having X coordinate generating means for assigning X coordinates to the pixels of the image data of the input character pattern cut out one character at a time, Y coordinate generating means for assigning Y coordinates to those pixels, coordinate detection means for detecting, using the X and Y coordinates and at least two sets of specific α and β values, the maximum and minimum calculated values αX+βY over the pixels of the input character pattern having a predetermined pixel value, and for detecting the X and Y coordinates of the pixels that give these maximum and minimum calculated values, and feature amount calculating means for calculating a geometric feature amount from the detected X and Y coordinates, the font determining section determining the font based on the calculated feature amount; a dictionary section that selects the recognition dictionary mask corresponding to the determined font; and an identification section that collates the input character pattern using the selected dictionary mask and identifies the character.

(Operation) According to the present invention configured as above, X and Y coordinates are assigned to the pixels of the image data of the input character pattern cut out one character at a time, so that the input character pattern is expressed in an X-Y coordinate system. Then, using the X and Y coordinates and at least two sets of specific α and β values, the maximum and minimum calculated values αX+βY over the pixels of the input character pattern having a predetermined pixel value are detected, and the X and Y coordinates of the pixels that give these maximum and minimum calculated values are detected. A geometric feature amount is calculated from the detected X and Y coordinates, and the font is determined from the calculated feature amount. The recognition dictionary mask corresponding to the determined font is selected.

The input character pattern is collated using the selected dictionary mask, and the input character is identified.

Therefore, the present invention solves the above problems and provides a character recognition device that can read forms containing a plurality of fonts at high speed and, moreover, with high accuracy.

(Example) Hereinafter, one embodiment of the present invention will be described with reference to the drawings.

FIG. 1 is a block diagram showing one embodiment of the present invention. In the figure, 10 is a character recognition device, 11 is a photoelectric conversion section, 12 is a line buffer, 13 is a character cutting section, 14 is a pattern register, 15 is a font determining section, 16 is a dictionary section, 17 is an identification section, and 18 is an output terminal.

The output terminal 18 is connected to, for example, a data input terminal of an external device such as a computer, and outputs the character name (for example, a JIS character code) of each character whose recognition has been completed.

FIG. 2(a) is a diagram showing binary image data in a character line area, FIG. 2(b) is a diagram showing the peripheral distribution of the binary image data, FIG. 2(c) is a diagram showing the feature value F of each input character pattern, and FIG. 2(d) is a diagram showing the recognition results for the input character patterns.

FIG. 3 is a block diagram showing the font determining section 15 of FIG. 1. In the figure, 30 is X coordinate generating means, 32 is Y coordinate generating means, and 34 and 36 are coordinate detection means. The coordinate detection means 34 includes X+Y calculation means 341, maximum value detection means 342, maximum value coordinate storage means 343, minimum value detection means 344, and minimum value coordinate storage means 345. The coordinate detection means 36 includes X−Y calculation means 361, maximum value detection means 362, maximum value coordinate storage means 363, minimum value detection means 364, and minimum value coordinate storage means 365. 38 is feature amount calculating means.

FIG. 4 is a flowchart showing the operation of the maximum value coordinate detection in this embodiment.

FIG. 5 is a flowchart showing the operation of the minimum value coordinate detection in this embodiment.

FIG. 6 is a block diagram showing the configuration of the dictionary section 16 of FIG. 1. In the figure, 60 is a dictionary selection section, 61 is a first dictionary matrix, and 62 is a second dictionary matrix.

FIG. 7 is a diagram explaining the principle of feature point coordinate detection in this embodiment. FIG. 7(a) shows the standard type of the Roman typeface, and FIG. 7(b) shows the italic type of the Roman typeface.

FIG. 8 is an explanatory diagram of standard character patterns of the standard type and the italic type of the Roman typeface, and of the dictionary matrices of those characters.

FIG. 9 is a diagram showing a form on which the characters used in this embodiment are written.

FIG. 10(a) is a diagram showing an example of an input format table, and FIG. 10(b) is a diagram explaining the input format table.

Hereinafter, this embodiment will be explained in detail with reference to FIGS. 1 to 10.

First, how the input character pattern of a character to be recognized is obtained will be explained. The photoelectric conversion section 11 in FIG. 1 detects a character line area from an optical signal (indicated by S in FIG. 1) from a medium such as a form on which characters, figures, and the like (hereinafter referred to as characters) are written. It photoelectrically converts the character line area to obtain line image data in which each pixel is expressed as a binary digital signal, with character strokes as black bits of pixel value "1" and the background as white bits of pixel value "0", and stores the data in the line buffer 12. Here, the character line area is the area of one line on the form in which characters are written.

The line buffer 12 stores the signal of each pixel of the line image data in a format that reproduces the two-dimensional coordinates of this area, and has a size of 128 × 4096 pixels.

Next, the character cutting section 13 reads the line image data from the line buffer 12, scans it in the vertical direction, and creates a distribution of black points. The span from a point where the distribution changes from 0 to 1 or more up to the point where it changes from 1 or more back to 0 is taken as one candidate character area and stored in the pattern register 14 as an input character pattern.
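The cutting step described above can be sketched as a column projection. This is a minimal illustration in Python, not the device's implementation: the function name `segment_columns`, the list-of-lists image layout, and the toy data are assumptions introduced here.

```python
from typing import List, Tuple


def segment_columns(rows: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, of candidate characters.

    `rows` is binary line image data: 1 = black bit, 0 = white bit.
    """
    width = len(rows[0]) if rows else 0
    # Black-point count per column: the "distribution of black points".
    profile = [sum(row[x] for row in rows) for x in range(width)]

    spans = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                  # distribution changes from 0 to 1 or more
        elif count == 0 and start is not None:
            spans.append((start, x))   # distribution changes from 1 or more to 0
            start = None
    if start is not None:              # a character touching the right edge
        spans.append((start, width))
    return spans


# Hypothetical 3-row line image containing two separated strokes.
line = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0],
]
print(segment_columns(line))
```

Each returned range would correspond to one candidate character area to be copied into the pattern register.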

The pattern register 14 stores the signal of each pixel in the candidate character area of the input character pattern in a format that reproduces the two-dimensional coordinates of this area, and has a size of 128 × 128 pixels.

The input character pattern stored in the pattern register 14 is output to the identification section 17 and the font determining section 15.

Next, how the font determining section 15 of FIG. 1 outputs the font determination signal will be explained.

The font determining section 15 determines the font of the input character pattern read from the pattern register 14 and outputs a font determination signal to the dictionary section 16. In FIG. 3, M denotes the quantized image data, read from the pattern register 14, that contains the input character pattern to be recognized. The font determining section 15 includes X coordinate generating means 30 for assigning X coordinates to the pixels of the image data M; Y coordinate generating means 32 for assigning Y coordinates to the pixels of the image data M; coordinate detection means 34 and 36 for detecting, using the X and Y coordinates and at least two sets of specific α and β values, the maximum and minimum calculated values αX+βY over the pixels of the input character pattern and outputting the X and Y coordinates of the pixels that give these values as feature point coordinates; and feature amount calculating means 38 for calculating, from the feature point coordinates, a geometric feature amount used to determine the font of the input character pattern.

The coordinate detection means 34 includes calculation means 341 for calculating the value αX+βY, maximum value detection means 342 for detecting the maximum calculated value, maximum value coordinate storage means 343 for storing the X and Y coordinates of the pixel that gives the maximum calculated value, minimum value detection means 344 for detecting the minimum calculated value, and minimum value coordinate storage means 345 for storing the X and Y coordinates of the pixel that gives the minimum calculated value. The coordinate detection means 36 likewise includes calculation means 361, maximum value detection means 362, maximum value coordinate storage means 363, minimum value detection means 364, and minimum value coordinate storage means 365.

In this embodiment, two sets of α and β values are used, (α = 1, β = 1) and (α = 1, β = −1). The slant is detected from the coordinates of the pixels that give the maximum and minimum calculated values of X+Y and X−Y, and a signal for dictionary selection is output to the dictionary section 16; the configuration therefore includes two coordinate detection means, 34 and 36. The coordinate detection means 34 detects the coordinates of the pixels that give the maximum and minimum of the calculated value X+Y, and the coordinate detection means 36 does the same for the calculated value X−Y.

The font determining section 15 of FIG. 1 will now be explained in detail with reference to FIGS. 4 and 5.

[Explanation focusing on the coordinate detection means 34] (Method of detecting the maximum and minimum values of X+Y) The input character pattern read from the pattern register 14 of FIG. 1 (step 401) is input pixel by pixel to the maximum value detection means 342 and the minimum value detection means 344. At the same time, the X coordinate generating means 30 and the Y coordinate generating means 32 generate, in synchronization with the output of the image data M, the X and Y coordinates associated with each pixel of the data M. As a result, X and Y coordinates are assigned to the image data M by the generating means 30 and 32. The output X and Y coordinates are input to the calculation means 341, the maximum value coordinate storage means 343, and the minimum value coordinate storage means 345 (step 402). On receiving the X and Y coordinates, the calculation means 341 computes the value X+Y and outputs it to the maximum value detection means 342 and the minimum value detection means 344 (step 403).

Through steps 402 and 403, for each pixel the maximum value detection means 342 receives the image data M and the calculated value, the minimum value detection means 344 receives the image data M and the calculated value, the maximum value coordinate storage means 343 receives the X and Y coordinates, and the minimum value coordinate storage means 345 receives the X and Y coordinates. The coordinate detection means then repeats the determinations of steps 404, 405, 501, and 407 described below for each pixel, and operates according to the results.

In step 404, the maximum value detection means 342 and the minimum value detection means 344 determine whether the input pixel of the image data M is a pixel of the input character pattern M2. This is done by checking whether the pixel value of the input pixel is the predetermined pixel value that denotes the input character pattern M2 (pixel value "1" in this embodiment).

When the pixel has the predetermined pixel value, the maximum value detection means 342 proceeds from step 404 to step 405 and compares the comparison value with the calculated value, and the minimum value detection means 344 proceeds from step 404 to step 501 and compares the comparison value with the calculated value.

In step 405:

(1) When the calculated value is larger than the comparison value, the maximum value detection means 342 stores the calculated value as the new comparison value in place of the one previously stored (rewriting the comparison value) and outputs a set pulse to the maximum value coordinate storage means 343. On receiving the set pulse, the maximum value coordinate storage means 343 replaces the stored X and Y coordinates with those of the pixel that gave the larger calculated value (rewriting the X and Y coordinates) (step 406).

(2) When the calculated value is smaller than or equal to the comparison value, the maximum value detection means 342 keeps the stored comparison value unchanged, and the maximum value coordinate storage means 343 does not rewrite the stored X and Y coordinates.

In both cases (1) and (2), the maximum value detection means 342 proceeds from step 405 to the determination of step 407.

The initial value of the comparison value stored in the maximum value detection means 342 may be, for example, a value smaller than any value the calculated value αX+βY can take. For example, when α = β = 1 and the image data M is divided into pixels of ℓ rows and m columns (so that 0 ≤ X ≤ m−1 and 0 ≤ Y ≤ ℓ−1), −1 can be used as the initial comparison value. Alternatively, the first calculated value αX+βY that is input may be used.

The maximum value detection means 342 may instead rewrite the comparison value and the X and Y coordinates both when the calculated value αX+βY is larger than the comparison value and when it is equal to it, and leave them unchanged only when the calculated value is smaller than the comparison value.

Any suitable numerical values may be used as the X and Y coordinates of the maximum value coordinate storage means 343.
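The running-maximum update of steps 404 to 406 can be sketched in software as follows. This is an illustrative Python model of the described hardware, not the patented circuit itself: the function name `detect_max`, the raster-scan order, and the list-of-lists image layout are assumptions made for the sketch.

```python
def detect_max(image, alpha=1, beta=1, initial=-1):
    """Scan `image` in raster order and track the black pixel maximizing alpha*X + beta*Y.

    `initial` plays the role of the initial comparison value. The default -1 is
    only safe for alpha = beta = 1 with non-negative coordinates (assumption).
    Returns (comparison value, (x, y)); (x, y) is None if no black pixel exists.
    """
    comparison = initial
    coords = None
    for y, row in enumerate(image):          # Y coordinate generation
        for x, value in enumerate(row):      # X coordinate generation
            if value != 1:                   # step 404: not a pattern pixel
                continue
            calc = alpha * x + beta * y      # calculation means
            if calc > comparison:            # step 405, case (1)
                comparison = calc            # rewrite the comparison value
                coords = (x, y)              # "set pulse": rewrite stored coordinates
            # case (2): calc <= comparison, nothing is rewritten
    return comparison, coords


# Hypothetical 3 x 3 pattern.
image = [
    [0, 1, 0],
    [0, 1, 1],
    [0, 0, 0],
]
print(detect_max(image))
```

Because the comparison is strict, the first pixel reaching a given value is kept when later pixels tie with it; the variant described above that also rewrites on equality would keep the last such pixel instead.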

In step 501:

(1) When the calculated value is smaller than the comparison value, the minimum value detection means 344 stores the calculated value as the new comparison value in place of the one previously stored (rewriting the comparison value) and outputs a set pulse to the minimum value coordinate storage means 345. On receiving the set pulse, the minimum value coordinate storage means 345 replaces the stored X and Y coordinates with those of the pixel that gave the smaller calculated value (rewriting the X and Y coordinates) (step 406).

(2) When the calculated value is larger than or equal to the comparison value, the minimum value detection means 344 keeps the stored comparison value unchanged, and the minimum value coordinate storage means 345 does not rewrite the stored X and Y coordinates.

In both cases (1) and (2), the minimum value detection means 344 proceeds from step 501 to the determination of step 407.

The initial value of the comparison value stored in the minimum value detection means 344 may be, for example, a value larger than any value the calculated value αX+βY can take. For example, when α = β = 1 and the image data M is divided into pixels of ℓ rows and m columns (so that 0 ≤ X ≤ m−1 and 0 ≤ Y ≤ ℓ−1), m+ℓ−1 can be used as the initial comparison value. Alternatively, the first calculated value αX+βY input to the minimum value detection means 344 may be used as the initial comparison value.

The minimum value detection means 344 may instead rewrite the comparison value and the X and Y coordinates both when the calculated value is smaller than the comparison value and when it is equal to it, and leave them unchanged only when the calculated value is larger than the comparison value.

Any suitable numerical values may be used as the X and Y coordinates of the minimum value coordinate storage means 345.

In step 407:

(1) When scanning of the image data M has not finished, and therefore not all pixels of the data M have been processed, the coordinate detection means 34 performs the determinations of steps 404, 405, 501, and 407 for the remaining pixels of the image data M and operates according to the results.

(2) When scanning of the image data M has finished and all pixels of the data M have been processed, the maximum value detection means 342 and the minimum value detection means 344 output an X, Y coordinate output signal to the maximum value coordinate storage means 343 and the minimum value coordinate storage means 345. On receiving this signal, the coordinate storage means 343 and 345 output the stored X and Y coordinates as feature point coordinates. At the same time, the detection means 342 and 344 initialize their comparison values (step 408).

When all pixels have been processed, the X and Y coordinates stored in the maximum value coordinate storage means 343 and the minimum value coordinate storage means 345 are those of the pixels that give the maximum and minimum calculated values, that is, the feature point coordinates.

Because α = β = 1 in the coordinate detection means 34, at the end of all processing the coordinates of the feature point BR of the input character pattern M2 shown in FIG. 7, for example, are stored in the maximum value coordinate storage means 343, and the coordinates of the feature point TL are stored in the minimum value coordinate storage means 345.

[Explanation focusing on the coordinate detection means 36] (Method of detecting the maximum and minimum values of X−Y) The coordinate detection means 36 operates in the same way as the coordinate detection means 34 and in parallel with it, so a description of its operation is omitted.

Because α = 1 and β = −1 in the coordinate detection means 36, when all pixels have been processed the coordinates of the feature point TR of the input character pattern M2 shown in FIG. 7, for example, are stored in the maximum value coordinate storage means 363, and the coordinates of the feature point BL are stored in the minimum value coordinate storage means 365.
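To illustrate how the two detectors together yield the four feature points, the sketch below computes TL, BL, TR, and BR for two toy patterns. It is a compact Python model rather than the pixel-serial hardware: the function name `feature_points` and the patterns are hypothetical, and it assumes X increases to the right and Y increases downward, which is consistent with BR giving the maximum of X+Y.

```python
def feature_points(image):
    """Return the four extreme black pixels of a binary pattern as (x, y) tuples."""
    black = [(x, y)
             for y, row in enumerate(image)
             for x, value in enumerate(row)
             if value == 1]
    if not black:
        raise ValueError("pattern contains no black pixels")
    return {
        "BR": max(black, key=lambda p: p[0] + p[1]),  # maximum of X+Y
        "TL": min(black, key=lambda p: p[0] + p[1]),  # minimum of X+Y
        "TR": max(black, key=lambda p: p[0] - p[1]),  # maximum of X-Y
        "BL": min(black, key=lambda p: p[0] - p[1]),  # minimum of X-Y
    }


# Hypothetical upright stroke: two pixels wide, six pixels tall.
upright = [[0, 1, 1, 0] for _ in range(6)]

# Hypothetical right-slanted stroke: the top is shifted right of the bottom.
slanted = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]

print(feature_points(upright))
print(feature_points(slanted))
```

For the upright stroke the top and bottom points share the same X coordinates, while for the slanted stroke the top points lie two pixels to the right of the bottom points; this difference in X is what the font decision relies on.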

Using the coordinates of the feature points TL, BL, TR, and BR of the input character pattern, the feature value F is calculated by the following formula (1), in which the X coordinates of the feature points TL, BL, TR, and BR are written TLX, BLX, TRX, and BRX.

F = k{ℓ(TLX − BLX) + m(TRX − BRX)} ... Formula (1)

In formula (1), k, ℓ, and m are arbitrary constants.
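As a worked instance of formula (1), the following Python sketch evaluates F for two sets of feature point X coordinates. The function name `feature_value`, the choice k = ℓ = m = 1, and the coordinate values are assumptions made for illustration only; the text leaves the constants arbitrary.

```python
def feature_value(tlx, blx, trx, brx, k=1.0, l=1.0, m=1.0):
    """Formula (1): F = k * (l * (TLX - BLX) + m * (TRX - BRX))."""
    return k * (l * (tlx - blx) + m * (trx - brx))


# Hypothetical upright pattern: top and bottom points share X coordinates.
f_upright = feature_value(tlx=1, blx=1, trx=2, brx=2)

# Hypothetical right-slanted pattern: top points lie 2 pixels right of the bottom points.
f_slanted = feature_value(tlx=2, blx=0, trx=3, brx=1)

print(f_upright, f_slanted)
```

With Y increasing downward, a rightward slant makes both differences positive, so F grows with the slant while staying near zero for an upright pattern.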

式（１）により算出されるＦの下記の式（２）による条
件により、字体を判定し、辞書部ｌ６に字体判定信号Ｎ
を出力する。The font is determined according to the condition of the following formula (2) of F calculated by the formula (1), and the font determination signal N is sent to the dictionary section l6.
Output.

式（２）中、ＣＩｌ　Ｃ２は、固定閾値であり、任意に
変えることができる。In equation (2), CIl C2 is a fixed threshold value and can be changed arbitrarily.

In this embodiment the font determination signal N takes one of three values, "1", "2", or "3", but it may just as well take a different number of values.

Next, identification of the input character pattern and output of the result are described.

The font determination signal N output by the font determination section 15 in FIG. 1 is sent to the dictionary section 16, which selects the dictionary corresponding to the signal.

The dictionary section 16, shown in FIG. 6, comprises a dictionary selection section 60, a first dictionary matrix 61, and a second dictionary matrix 62.

This embodiment uses two dictionary matrices, but three or more may be used.

In response to the font determination signal N = 1, N = 2, or N = 3 output from the font determination section 15, the dictionary selection section 60 selects the first dictionary matrix 61, the second dictionary matrix 62, or both the first and second dictionary matrices 61 and 62, respectively, and outputs the selected dictionary matrix to the identification section 17.
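The selection rule above can be sketched in a few lines of Python. This is only an illustration: the function name `select_dictionaries` and the representation of the dictionary matrices as opaque objects are assumptions, not part of the described device.

```python
def select_dictionaries(n, first_matrix, second_matrix):
    """Return the dictionary matrices selected for font determination signal N.

    N = 1 selects the first matrix, N = 2 the second, N = 3 both,
    as stated in the description. Anything else is rejected here
    (the error handling is an assumption of this sketch).
    """
    if n == 1:
        return [first_matrix]
    if n == 2:
        return [second_matrix]
    if n == 3:
        return [first_matrix, second_matrix]
    raise ValueError(f"unsupported font determination signal: {n}")
```

Returning a list in every case lets a caller collate against one or two matrices with the same loop.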

Next, the identification section 17 is described.

The identification section 17 performs feature extraction on the input character pattern and recognizes it. Any of various conventionally known feature extraction methods can be used; in this embodiment, the method described below is used.

First, the rectangular frame circumscribing the input character pattern is detected and taken as the character frame. The line width W of the input character pattern is then calculated, for example with the well-known approximation shown in equation (3) below.

W = 1/{1 − (Q/A)} ... Equation (3)

In equation (3), Q is the number of windows in which every point is a black bit when the points making up the input character pattern are viewed through windows covering (2×2) points each, and A is the number of black bits within the character frame.
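As a rough illustration of equation (3), the following Python sketch counts A and Q on a binary pattern and returns W. It assumes the pattern is a list of rows of 0/1 values (1 = black bit) and that the 2×2 windows slide one pixel at a time, so that they overlap; the text does not spell out the window stepping, and the function name `estimate_line_width` is hypothetical.

```python
def estimate_line_width(pattern):
    """Approximate stroke width W = 1 / (1 - Q/A) for a binary pattern.

    pattern: list of rows, each a list of 0/1 ints (1 = black bit).
    A is the number of black bits; Q is the number of 2x2 windows
    (assumed here to slide by one pixel) whose four bits are all black.
    """
    rows = len(pattern)
    cols = len(pattern[0]) if rows else 0
    a = sum(sum(row) for row in pattern)
    if a == 0:
        raise ValueError("pattern contains no black bits")
    q = 0
    for y in range(rows - 1):
        for x in range(cols - 1):
            if (pattern[y][x] and pattern[y][x + 1]
                    and pattern[y + 1][x] and pattern[y + 1][x + 1]):
                q += 1
    return 1.0 / (1.0 - q / a)
```

For a long straight stroke of width w, Q/A approaches (w − 1)/w, so the result approaches w; short strokes give a somewhat smaller value.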

The input character pattern is then scanned in a plurality of directions to detect the number of consecutive black bits in each scan line, and a subpattern corresponding to each of these directions is extracted based on the number of consecutive black bits and the line width described above. The region inside the character frame of the input character pattern is then divided, for each subpattern, into (N×M) regions (M and N are constants); a feature quantity representing the character line length is calculated for each of the divided regions, and this feature quantity is normalized by the size of the character frame to obtain a feature matrix.

In this embodiment, the feature quantity is normalized by dividing it by (ΔX + ΔY)/2, where ΔX is the horizontal length of the character frame and ΔY is its vertical length.

The identification section 17 collates the feature matrix extracted in this way against the dictionary matrix output by the dictionary section 16 after dictionary selection, and outputs the character name (JIS code or the like) corresponding to the dictionary matrix with the highest similarity to an external device via the output terminal 18.

In this embodiment, the similarity is calculated by equation (4) below.

In equation (4), R is the similarity, f_i is the input character pattern, g_i is a dictionary matrix stored in the dictionary, and i = 1, 2, 3, ..., N×M.

Next, the operation of this embodiment is described concretely with reference to FIG. 1.

First, a form bearing characters, such as the one shown in FIG. 9, is input to the photoelectric conversion section 11 as image data S. The photoelectric conversion section 11 detects each character region of the form, photoelectrically converts it row by row into binary digital image data, and stores the result in the line buffer 12. In this embodiment, the row regions are detected sequentially by referring to the input format table shown in FIG. 10(a), which is set in the photoelectric conversion section 11 in advance.

As shown in FIG. 10(b), the input format table records the distances of the first row region from the top and left edges of the form, the size of the row region, the row pitch, and the number of rows.
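One way to picture the input format table is as a small record from which the position of every row region follows. The Python sketch below is an illustration only: the field names, the pixel units, and the assumption that the row pitch is measured from the top of one row region to the top of the next are not taken from the source.

```python
from dataclasses import dataclass


@dataclass
class InputFormat:
    top: int     # distance of the first row region from the top edge of the form
    left: int    # distance of the first row region from the left edge of the form
    width: int   # width of a row region
    height: int  # height of a row region
    pitch: int   # row pitch (assumed: top of one row to top of the next)
    rows: int    # number of rows


def row_regions(fmt):
    """Return (left, top, width, height) for each row region, in order."""
    return [(fmt.left, fmt.top + i * fmt.pitch, fmt.width, fmt.height)
            for i in range(fmt.rows)]
```

With such a table, row regions can be located one after another without searching the whole form image.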

As shown in FIGS. 2(a) and 2(b), the row image data, a binary digital signal read from the line buffer 12, is processed by the character segmentation section 13, which detects in the row image data each span from a point where the black-point distribution changes from 0 to 1 or more to the point where it changes from 1 or more back to 0, treats that span as a prospective character region, and extracts an input character pattern of 128×128 pixels.
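The transition rule for prospective character regions can be sketched as follows. This Python fragment is a minimal illustration that assumes the row image is a list of pixel rows of 0/1 values and that the black-point distribution is the per-column count of black pixels; the function name `find_character_spans` is hypothetical, and the step that produces the 128×128 pattern is not shown.

```python
def find_character_spans(row_image):
    """Return [start, end) column spans where the black-point count is >= 1.

    row_image: list of pixel rows, each a list of 0/1 ints (1 = black).
    A span starts where the column count changes from 0 to 1 or more
    and ends where it changes from 1 or more back to 0.
    """
    if not row_image:
        return []
    width = len(row_image[0])
    profile = [sum(row[x] for row in row_image) for x in range(width)]
    spans = []
    start = None
    for x, count in enumerate(profile):
        if count >= 1 and start is None:
            start = x                 # 0 -> 1 or more: a character begins
        elif count == 0 and start is not None:
            spans.append((start, x))  # 1 or more -> 0: the character ends
            start = None
    if start is not None:             # character touching the right edge
        spans.append((start, width))
    return spans
```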

The input character patterns are stored one character at a time in the pattern register 14.

The font determination section 15 reads the input character pattern from the pattern register 14, assigns an X coordinate and a Y coordinate to each of its 128×128 pixels, detects the maximum and minimum of the calculated value αX + βY over the pixels of the input character pattern, and extracts the coordinates of the feature points that give these maximum and minimum values.
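The search for the pixels that maximize and minimize αX + βY can be sketched in Python as below. The sketch assumes a pattern given as rows of 0/1 values, X increasing to the right and Y increasing downward (which matches the coordinates quoted for FIG. 7), and it keeps the first pixel encountered when several pixels tie; the tie rule and the function names are assumptions of this sketch.

```python
def detect_extreme_points(pattern, alpha, beta):
    """Return the (x, y) of the max and of the min of alpha*x + beta*y
    over the black pixels of the pattern (None, None if there are none)."""
    max_pt = min_pt = None
    max_val = min_val = None
    for y, row in enumerate(pattern):
        for x, pixel in enumerate(row):
            if not pixel:
                continue
            value = alpha * x + beta * y
            if max_val is None or value > max_val:
                max_val, max_pt = value, (x, y)
            if min_val is None or value < min_val:
                min_val, min_pt = value, (x, y)
    return max_pt, min_pt


def detect_feature_points(pattern):
    """Combine the two (alpha, beta) pairs into the four feature points."""
    br, tl = detect_extreme_points(pattern, 1, 1)    # X+Y: max -> BR, min -> TL
    tr, bl = detect_extreme_points(pattern, 1, -1)   # X-Y: max -> TR, min -> BL
    return {"TL": tl, "BL": bl, "TR": tr, "BR": br}
```

For an upright filled rectangle the four points are simply its corners; for a slanted glyph the top points shift sideways relative to the bottom points, which is what the feature value F measures.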

In this embodiment, two pairs of α and β are used to detect the feature points: (α = 1, β = 1) and (α = 1, β = −1). The feature point coordinates obtained from the calculated values are TL = (0, 0), BL = (0, 47), TR = (42, 0), BR = (42, 47) in FIG. 7(a), and TL = (8, 0), BL = (0, 47), TR = (42, 0), BR = (35, 47) in FIG. 7(b). The feature value F is calculated from these coordinates using equation (1).

Here, the constants k, l, and m in this embodiment are k = 1/2, l = 1, and m = 1, and the fixed thresholds are C1 = 5.0 and C2 = 3.0. With these settings, the feature value is F = 0 for FIG. 7(a) and F = 7.5 for FIG. 7(b).
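The two results can be checked by evaluating equation (1) directly. The Python sketch below uses the feature point coordinates quoted for FIG. 7(a) and FIG. 7(b) with l = 1 and m = 1; k = 1/2 is the value consistent with the stated results F = 0 and F = 7.5. The function name and the dictionary layout are assumptions made for illustration.

```python
def feature_value(points, k=0.5, l=1.0, m=1.0):
    """Equation (1): F = k * (l * (TLX - BLX) + m * (TRX - BRX)).

    points maps "TL", "BL", "TR", "BR" to (x, y) tuples; only the
    X coordinates enter the formula.
    """
    tlx, blx = points["TL"][0], points["BL"][0]
    trx, brx = points["TR"][0], points["BR"][0]
    return k * (l * (tlx - blx) + m * (trx - brx))


fig7a = {"TL": (0, 0), "BL": (0, 47), "TR": (42, 0), "BR": (42, 47)}
fig7b = {"TL": (8, 0), "BL": (0, 47), "TR": (42, 0), "BR": (35, 47)}

print(feature_value(fig7a))  # 0.0: upright glyph, top and bottom edges aligned
print(feature_value(fig7b))  # 7.5: top shifted right of bottom by 8 and 7 pixels
```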

Accordingly, when the form in FIG. 9 is input, the values of F shown in FIG. 2(c) are calculated, and by equation (2) the font determination signal N = 1 is output to the dictionary selection section 60 of the dictionary section 16 for input character patterns with F > C1, and N = 2 for input character patterns with F < C2. In this embodiment, N = 2 is output for M, Y, N, A, M, E, I, and S, and the first dictionary matrix 61 is selected.
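The threshold test can be sketched as a small Python function. Equation (2) itself is not reproduced in this text, so only the two cases stated above (F > C1 gives N = 1, F < C2 gives N = 2) come from the description; returning N = 3 for values between the thresholds is an assumption of this sketch, as is the function name `font_signal`.

```python
def font_signal(f, c1=5.0, c2=3.0):
    """Map the feature value F to a font determination signal N.

    F > C1 -> 1 and F < C2 -> 2 follow the description.
    The remaining case (C2 <= F <= C1) is ASSUMED to give 3,
    i.e. the undecided case in which both dictionaries would be used.
    """
    if f > c1:
        return 1
    if f < c2:
        return 2
    return 3
```

With the default thresholds C1 = 5.0 and C2 = 3.0, the FIG. 7(b) value F = 7.5 gives N = 1 and the FIG. 7(a) value F = 0 gives N = 2.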

The identification section 17 extracts subpatterns, based on the line width, for four directions (horizontal, vertical, right diagonal, and left diagonal) of the input character pattern read from the pattern register 14, and divides the character-frame region of each subpattern into N×M regions, 5×5 in this embodiment. In each region, a feature quantity representing the character line length is calculated, yielding the feature matrix.

This feature matrix is collated against the dictionary matrix selected according to the font determination, such as those shown in FIGS. 8(a) and 8(b), and the character name (JIS code or the like) corresponding to the dictionary matrix with the highest similarity is output via the output terminal 18 to an external device (not shown).

(Effects of the Invention) As described above, according to the present invention, the maximum and minimum calculated values of αX + βY and αX − βY are computed over the pixels on the character strokes of each character on a form, and feature point coordinates are extracted from them. This detects slanted characters such as italics, determines the font of the input character pattern, and selects the dictionary mask used for collation based on the determination result. Because the input character pattern is collated only against the selected dictionary mask, collation takes less time, and a character recognition device can be realized that reads forms mixing characters of different fonts at high speed and with high accuracy.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing an embodiment of the present invention.
FIG. 2(a) is a diagram showing binary image data in a character row region.
FIG. 2(b) is a diagram showing the peripheral distribution of the binary image data.
FIG. 2(c) is a diagram showing the feature value F of the input character patterns.
FIG. 2(d) is a diagram showing the recognition results for the input character patterns.
FIG. 3 is a block diagram showing the font determination section 15 of FIG. 1.
FIG. 4 is a flowchart showing the operation of the maximum value coordinate detection means in this embodiment.
FIG. 5 is a flowchart showing the operation of the minimum value coordinate detection means in this embodiment.
FIG. 6 is a block diagram showing the configuration of the dictionary section 16 of FIG. 1.
FIG. 7(a) is a diagram showing an input character pattern of the standard type of the Roman typeface and its feature point coordinates.
FIG. 7(b) is a diagram showing an input character pattern of the italic type of the Roman typeface and its feature point coordinates.
FIG. 8 is an explanatory diagram of standard character patterns of the standard type and the italic type of the Roman typeface and the dictionary matrices of those characters.
FIG. 9 is a diagram showing a form bearing the characters used in this embodiment.
FIG. 10(a) is a diagram showing an example of the input format table.
FIG. 10(b) is a diagram explaining the input format table.

10: character recognition device; 11: photoelectric conversion section; 12: line buffer; 13: character segmentation section; 14: pattern register; 15: font determination section; 16: dictionary section; 17: identification section; 18: output terminal.

Claims

[Claims]

(1) A character recognition device that obtains quantized image data by photoelectrically converting a feature extraction target on a medium, cuts out character patterns one character at a time from the image data, determines the font of each cut-out input character pattern, selects a recognition dictionary mask based on the determination result, and collates the input character pattern using the selected dictionary mask to recognize the character, the device comprising: a font determination section that includes X coordinate generation means for assigning X coordinates to the pixels of the image data of the input character pattern cut out one character at a time, Y coordinate generation means for assigning Y coordinates to the pixels of the image data of the input character pattern cut out one character at a time, coordinate detection means for detecting, using the X and Y coordinates and at least two sets of specific α and β values, the maximum and minimum calculated values αX+βY for the pixels of the input character pattern having a predetermined pixel value, and for detecting the X and Y coordinates of the pixels of the input character pattern that give these maximum and minimum calculated values, and feature quantity calculation means for calculating a geometric feature quantity based on the detected X and Y coordinates, the font determination section determining the font based on the feature quantity calculated by the feature quantity calculation means; a dictionary section that selects the recognition dictionary mask corresponding to the determined font; and an identification section that collates the input character pattern using the selected dictionary mask and identifies the character.

(2) The character recognition device according to claim 1, wherein the coordinate detection means uses the two sets of α and β values α = β = 1 and α = 1, β = −1 to detect the maximum and minimum calculated values of X+Y and X−Y for the pixels of the input character pattern having a predetermined pixel value, and detects the X and Y coordinates of the pixels of the input character pattern that give these maximum and minimum calculated values.

(3) The character recognition device according to claim 2, wherein the feature quantity calculation means takes, among the detected coordinates, the X coordinate of the maximum value of X+Y as TRX, the X coordinate of its minimum value as BLX, the X coordinate of the maximum value of X−Y as BRX, and the X coordinate of its minimum value as TLX, and calculates the feature quantity F as F = k{l(TLX−BLX)+m(TRX−BRX)} (where k, l, and m are arbitrary constants).