JPS603077A

JPS603077A - Inclination extraction system

Info

Publication number: JPS603077A
Application number: JP58110209A
Authority: JP
Inventors: Yoshiyuki Yamashita; 山下　義征; Koichi Higuchi; 浩一樋口
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1983-06-21
Filing date: 1983-06-21
Publication date: 1985-01-09
Also published as: JPH0420227B2

Abstract

PURPOSE: To extract an inclination stably and at high speed by calculating the weighted mean of the inclinations of stroke components extracted from a character pattern, using the length of each stroke component as the weight.

CONSTITUTION: A stroke extraction unit 6 performs a horizontal or vertical scan over the entire surface of each sub-pattern register and detects the change points from a white dot to a black dot and from a black dot to a white dot. A stroke is then extracted from the relation between the number and coordinates of the change points found on the previous column (or row) and the number and coordinates of the change points on the current column (or row). The coordinates of both end points of each extracted stroke in each sub-pattern register, expressed in the two-dimensional coordinate system defined by a pattern register 3 (with the lower-left point of the pattern register as the origin), are sent to an inclination extraction unit 7. The inclination extraction unit 7 refers to the end-point coordinates of the strokes extracted by the stroke extraction unit 6 in each sub-pattern register and calculates the mean inclination of each sub-pattern.

Description

DETAILED DESCRIPTION OF THE INVENTION (Technical Field) The present invention relates to a fast and stable inclination extraction system.

(Background Art) Conventionally, character recognition devices have had to absorb the variation in extracted features that is caused by the inclination of character lines, whether that inclination comes from differences between the writers of handwritten characters, as in the example of Fig. 1, or from the inclination of printed characters. This variation has been absorbed by providing multiple dictionary masks. With this approach, however, the time needed to match the extracted features against the dictionary during identification grows in proportion to the number of dictionary masks, which lowers the processing speed of the device. To remove this drawback, there is a method that improves processing speed by extracting the character-line inclination in each direction and using it to select the dictionary mask. The conventional inclination extraction method, however, computes for each direction the simple average of the inclinations of the strokes in that direction. Short strokes, whose inclination tends to be unstable, therefore make the extracted inclination unstable, which in turn degrades recognition performance.

(Object of the Invention) To remove these drawbacks, the present invention calculates a stable character-line inclination by computing the weighted average of the inclinations of the stroke components extracted from a character pattern, using the length of each stroke component as the weight. Its object is to provide a fast and stable inclination extraction system.
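The difference between the conventional simple average and the length-weighted average described above can be illustrated with a small sketch. The function names and the sample stroke values are hypothetical and chosen only for illustration; they do not come from the patent.

```python
def simple_mean(slopes):
    """Conventional approach: every stroke counts equally."""
    return sum(slopes) / len(slopes)


def length_weighted_mean(slopes, lengths):
    """Approach described here: each stroke is weighted by its length."""
    total = sum(lengths)
    return sum(s * w for s, w in zip(slopes, lengths)) / total


# Two long strokes with similar inclination and one short, noisy stroke
# (illustrative values only).
slopes = [0.10, 0.12, 1.00]
lengths = [40.0, 35.0, 2.0]

print(simple_mean(slopes))                    # pulled strongly toward the short stroke
print(length_weighted_mean(slopes, lengths))  # stays close to the long strokes
```

With these sample values the simple mean is about 0.41, while the weighted mean is about 0.13, close to the inclination of the two long strokes.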

(Structure and Operation of the Invention) Fig. 2 is a block diagram of an embodiment of the invention in a character recognition device. In the figure, the optical signal of a character arriving from the optical signal input 1 is converted by the photoelectric conversion unit 2 into a binary quantized digital electrical signal and stored in the pattern register 3. At the same time, the line width calculation unit 4 calculates the line width (W) of the input pattern. The sub-pattern extraction unit 5 performs a vertical scan over the entire pattern register 3 and extracts the vertical sub-pattern (VSP) from the relation between the length of each continuous run of black dots (the character line portion being treated as black dots) and the line width calculated by the line width calculation unit 4.
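The vertical sub-pattern extraction can be sketched as follows. The description only says that the run length is compared with the line width W, so the rule used here (keep runs whose length is at least `factor` times W) and the names `vertical_subpattern` and `factor` are assumptions made for illustration.

```python
def vertical_subpattern(pattern, line_width, factor=2.0):
    """Keep only vertical black runs that are clearly longer than the line width.

    pattern: list of rows, each a list of 0 (white) or 1 (black).
    factor:  hypothetical multiplier; the source does not state the exact rule.
    """
    rows, cols = len(pattern), len(pattern[0])
    out = [[0] * cols for _ in range(rows)]
    for x in range(cols):                  # vertical scan, one column at a time
        y = 0
        while y < rows:
            if pattern[y][x] == 1:
                start = y
                while y < rows and pattern[y][x] == 1:
                    y += 1                 # advance to the end of the black run
                if (y - start) >= factor * line_width:
                    for yy in range(start, y):
                        out[yy][x] = 1     # the run belongs to a vertical stroke
            else:
                y += 1
    return out


# A cross shape with line width 1: only the vertical bar survives.
cross = [
    [0, 1, 0],
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
]
vsp = vertical_subpattern(cross, line_width=1)
```

The horizontal and diagonal sub-patterns would follow the same idea with the scan direction changed.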

Similarly, the horizontal sub-pattern (HSP) is extracted by a horizontal scan, the right-diagonal sub-pattern (RSP) by a right-diagonal 45° scan, and the left-diagonal sub-pattern (LSP) by a left-diagonal 45° scan. Fig. 3 shows an example of an original pattern and its sub-patterns: (a) is the original pattern, (b) the vertical sub-pattern (VSP), (c) the horizontal sub-pattern (HSP), (d) the right-diagonal sub-pattern (RSP), and (e) the left-diagonal sub-pattern (LSP).

The stroke extraction unit 6 performs a horizontal or vertical scan over the entire surface of each sub-pattern register and detects the change points from a white dot to a black dot and from a black dot to a white dot. It extracts strokes from the relation between the number and coordinates of the change points found in the scan of the previous column (or row) and the number and coordinates of the change points in the current column (or row). It then sends to the inclination extraction unit 7 the coordinates of both end points of each extracted stroke in each sub-pattern register, expressed in the two-dimensional coordinate system defined by the pattern register 3 (with the lower-left corner of the pattern register as the origin). The inclination extraction unit 7 refers to the end-point coordinates of the strokes extracted by the stroke extraction unit 6 in each sub-pattern register and calculates the average inclination of each sub-pattern. That is, with the end-point coordinates of the strokes extracted from the horizontal sub-pattern written as (HXSp, HYSp) and (HXEp, HYEp), where p = 1, ..., P and P is the number of strokes, the inclination θH is calculated by equation (1) (where HXEp > HXSp).

[Equation (1) is not reproduced in this text.]
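Since equation (1) itself does not survive in this text, the following is only a plausible reconstruction, inferred from the statement that the inclination is the weighted average of the stroke inclinations with the stroke length HLGp as the weight, and from the condition HXEp > HXSp, which keeps the denominator of each stroke inclination positive. It should be read as an assumption, not as the patent's literal formula.

```latex
\theta_H \;=\;
\frac{\displaystyle\sum_{p=1}^{P} HLG_p \,\frac{HYE_p - HYS_p}{HXE_p - HXS_p}}
     {\displaystyle\sum_{p=1}^{P} HLG_p},
\qquad HXE_p > HXS_p
```

Here the fraction inside the sum is the inclination of stroke p computed from its two end points, and HLGp is its length.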

HLGp in equation (1) denotes the length of the stroke and is obtained from the approximation in equation (2).

$$HLG_p = \max(|HXE_p - HXS_p|,\ |HYE_p - HYS_p|) + \tfrac{1}{2}\min(|HXE_p - HXS_p|,\ |HYE_p - HYS_p|) \quad (2)$$

Equation (2) approximates the distance between two points as the sum of the larger of the horizontal and vertical coordinate differences and one half of the smaller. Similarly, θV, θR, and θL are calculated by equations (3) to (5), where VYEq > VYSq, RXEt > RXSt, and LXEk > LXSk.
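The length approximation of equation (2) is easy to check numerically. The sketch below follows the verbal definition (the larger coordinate difference plus half of the smaller one); the function name `approx_length` is hypothetical.

```python
def approx_length(xs, ys, xe, ye):
    """Approximate stroke length: larger coordinate difference plus half the smaller."""
    dx = abs(xe - xs)
    dy = abs(ye - ys)
    return max(dx, dy) + 0.5 * min(dx, dy)


# A 3-4-5 triangle: the exact length is 5, the approximation gives 5.5.
print(approx_length(0, 0, 3, 4))
# An axis-aligned stroke is measured exactly.
print(approx_length(0, 0, 10, 0))
```

The approximation avoids a square root, which fits the stated aim of fast processing, at the cost of overestimating the length of oblique strokes somewhat.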

[Equations (3) to (5) are not reproduced in this text; only the denominator ΣVLGq of equation (3) survives.]

In the above equations, Q, T, and K are the numbers of strokes extracted from the vertical sub-pattern, the right-diagonal sub-pattern, and the left-diagonal sub-pattern, respectively. When the number of strokes is 0, the inclination is also taken to be 0. The stroke lengths VLGq, RLGt, and LLGk are calculated in the same way as in equation (2).
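Equations (3) to (5) can be sketched by analogy with the horizontal case. The forms below are assumptions: the vertical inclination is written as Δx/Δy because the stated condition is VYEq > VYSq and the vertical thresholds used later are centered on 0, while the diagonal inclinations are written as Δy/Δx because their conditions are RXEt > RXSt and LXEk > LXSk.

```latex
\theta_V = \frac{\sum_{q=1}^{Q} VLG_q \,\dfrac{VXE_q - VXS_q}{VYE_q - VYS_q}}{\sum_{q=1}^{Q} VLG_q},
\qquad
\theta_R = \frac{\sum_{t=1}^{T} RLG_t \,\dfrac{RYE_t - RYS_t}{RXE_t - RXS_t}}{\sum_{t=1}^{T} RLG_t},
\qquad
\theta_L = \frac{\sum_{k=1}^{K} LLG_k \,\dfrac{LYE_k - LYS_k}{LXE_k - LXS_k}}{\sum_{k=1}^{K} LLG_k}
```

When a sub-pattern contains no strokes, the corresponding sum is empty and the inclination is set to 0, as stated in the text.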

The inclination extraction unit 7 sends the inclination of each sub-pattern, calculated by equations (1) to (5) above, to the dictionary selection unit 8. The character frame detection unit 9 detects the character frame circumscribing the character pattern in the pattern register 3 and sends the result to the character frame division determination unit 10.

The character frame division determination unit 10 determines the coordinates of the division points on the X axis and the Y axis that divide the interior of the detected character frame into M×N regions (M and N are integers; the values used in this embodiment are illegible in this text). Here the X axis denotes the horizontal direction of the character frame and the Y axis the vertical direction.

The feature matrix extraction unit 11 uses the division point coordinates determined by the character frame division determination unit 10 to divide the character frame area on each of the VSP, HSP, RSP, and LSP sub-pattern registers into M×N regions, counts the number of black dots (Bij) in each region, and, using the line width W calculated by the line width calculation unit 4, computes a feature representing the character line length by equation (6), creating an M×N×4-dimensional feature matrix.

$$L_{ij} = B_{ij} / W \quad (6)$$

After that, the VSP feature matrix is normalized by the length ΔY of the character frame in the Y-axis direction, the HSP feature matrix by the length ΔX in the X-axis direction, and the RSP and LSP feature matrices by (ΔX + ΔY)/2, finally producing the M×N×4-dimensional feature matrix.
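The feature computation of equation (6) and the subsequent normalization can be sketched for a single sub-pattern as follows. The text does not say how the division points are chosen, so uniform division of the character frame is an assumption here, and the name `feature_matrix` and its parameters are hypothetical.

```python
def feature_matrix(sub, frame, m, n, line_width, norm):
    """Compute normalized line-length features L_ij = B_ij / W for one sub-pattern.

    sub:   sub-pattern as a list of rows of 0/1, indexed sub[y][x]
    frame: (x0, y0, x1, y1) character frame, end coordinates exclusive
    norm:  normalization length (dY for VSP, dX for HSP, (dX + dY) / 2 for RSP/LSP)
    """
    x0, y0, x1, y1 = frame
    # Assumption: the frame is split into m columns and n rows of near-equal size.
    xcuts = [x0 + (x1 - x0) * i // m for i in range(m + 1)]
    ycuts = [y0 + (y1 - y0) * j // n for j in range(n + 1)]
    feats = []
    for j in range(n):
        row = []
        for i in range(m):
            black = sum(
                sub[y][x]
                for y in range(ycuts[j], ycuts[j + 1])
                for x in range(xcuts[i], xcuts[i + 1])
            )                                       # B_ij: black dots in region (i, j)
            row.append(black / line_width / norm)   # equation (6), then normalization
        feats.append(row)
    return feats


# A fully black 4x4 sub-pattern, split 2x2, with W = 2 and normalization length 4.
solid = [[1] * 4 for _ in range(4)]
feats = feature_matrix(solid, (0, 0, 4, 4), m=2, n=2, line_width=2, norm=4)
```

Running the same function on the four sub-patterns with their respective normalization lengths would give the four M×N layers of the final feature matrix.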

The dictionary selection unit 8 refers to the inclinations θH, θV, θR, and θL output from the inclination extraction unit 7 and sends to the dictionary memory unit 13 a selection signal for selecting the dictionaries suited to the input character pattern. The dictionary memory unit 13 holds dictionary masks that were classified and created in advance for each feature, in several variants according to inclination. In this embodiment, the dictionary memory unit holds dictionary masks corresponding to three classes of inclination for each of the VSP, HSP, RSP, and LSP feature matrices. The three classes for each feature matrix are:

θH: θH < -0.25, -0.25 ≤ θH < 0.25, 0.25 ≤ θH
θV: θV < -0.25, -0.25 ≤ θV < 0.25, 0.25 ≤ θV
θR: θR < 0.7, 0.7 ≤ θR < 1.4, 1.4 ≤ θR
θL: θL < -1.4, -1.4 ≤ θL < -0.7, -0.7 ≤ θL

The dictionary selection unit 8 outputs a signal that selects, for each feature matrix, the dictionary corresponding to the inclinations θH, θV, θR, and θL obtained from the inclination extraction unit. The dictionary memory unit 13 lets the identification unit 12 refer to the dictionary for each feature specified by this selection signal. The identification unit 12 applies the distance D defined by equation (7) between the dictionary mask (fj) specified for each feature and the extracted feature matrix (fi), and outputs to the character name output 14 the category name of the dictionary mask for which D is smallest.
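The three-way dictionary selection can be sketched as a threshold lookup. The threshold values are those listed in the text; the placement of the equality on the lower bound of each class, the direction keys, and the function name `select_dictionary` are assumptions for illustration.

```python
import bisect

# Class boundaries per direction, taken from the values given in the text.
THRESHOLDS = {
    "H": (-0.25, 0.25),
    "V": (-0.25, 0.25),
    "R": (0.7, 1.4),
    "L": (-1.4, -0.7),
}


def select_dictionary(direction, theta):
    """Return 0, 1, or 2: the index of the inclination class that theta falls into.

    bisect_right puts a value equal to a boundary into the upper class,
    matching ranges of the form lower <= theta < upper.
    """
    return bisect.bisect_right(THRESHOLDS[direction], theta)


# One dictionary index per feature matrix for a sample set of inclinations.
inclinations = {"H": 0.05, "V": -0.30, "R": 1.0, "L": -0.5}
selection = {d: select_dictionary(d, t) for d, t in inclinations.items()}
print(selection)
```

Because only one dictionary per feature is matched instead of all of them, the matching time no longer grows with the number of dictionary masks prepared for different inclinations.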

[Equation (7), which defines the distance D, is illegible in this text.]

As described above, in this embodiment the inclination of the character lines in each direction of the input character is extracted as the weighted average of the inclinations of the strokes in each extracted sub-pattern, with the stroke length as the weight. This has the advantage that the inclination of the character lines in each direction within the character pattern can be extracted stably, free from the influence of the unstable inclination of short strokes.

In other words, as a method of extracting the inclination tendency of character lines caused by differences between writers, the processing is simple. Moreover, if the sub-patterns used in the coarse classification method are used, there is no problem of consistency with the coarse classification method.

(Effect of the Invention) The present invention extracts the inclination of the character lines in each direction within a character pattern as the weighted average of the inclinations of the strokes in each extracted sub-pattern, with the stroke length as the weight. It therefore has the advantage that inclination extraction is stable and unaffected by the unstable inclination of short strokes, and it can be used in character recognition devices that are fast and have high recognition accuracy.

[Brief explanation of drawings]

Fig. 1 shows examples of handwritten characters, Fig. 2 is a block diagram showing an embodiment of the invention in a character recognition device, and Fig. 3 shows an example of an original pattern and its sub-patterns.

1: optical signal input, 2: photoelectric conversion unit, 3: pattern register, 4: line width calculation unit, 5: sub-pattern extraction unit, 6: stroke extraction unit, 7: inclination extraction unit, 8: dictionary selection unit, 9: character frame detection unit, 10: character frame division determination unit, 11: feature matrix extraction unit, 12: identification unit, 13: dictionary memory, 14: character name output.

Claims

[Claims]

An inclination extraction system for selecting a dictionary mask of a character recognition device, characterized in that stroke components are extracted from each of the sub-patterns created by extracting, from among the cross-sections of character lines detected by scanning a character/figure pattern in a plurality of predetermined directions, all cross-sections whose length is sufficiently greater than the line width in the character/figure pattern, and in that the weighted average of the inclinations of the stroke components of each extracted sub-pattern, weighted by the length of each stroke component, is extracted as the inclination of that sub-pattern.