JPS59128678A - Separating device of character - Google Patents

Separating device of character

Info

Publication number
JPS59128678A
Authority
JP
Japan
Prior art keywords
memory
projection
kanji
value
characters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP58003103A
Other languages
Japanese (ja)
Inventor
Toru Usubuchi
臼淵 徹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Nippon Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp, Nippon Electric Co Ltd
Priority to JP58003103A
Publication of JPS59128678A
Legal status: Pending

Landscapes

  • Character Input (AREA)

Abstract

PURPOSE: To segment kanji (Chinese characters) accurately, joining the left-hand radical of a kanji to its right-hand component while separating one kanji from the next, by computing the average of the projection over a number of picture elements corresponding to the length of the space between characters. CONSTITUTION: A picture signal for one sheet of paper, input from an input terminal 1, is first stored in a picture memory 2 and then read out in accordance with a signal from a memory controlling circuit 3. A counter 4 counts the black picture elements and writes the count into a Y projection memory 6. For each write, a microcomputer 10 sets the write address in a memory controlling circuit 7. The projection values from the memory 6 are input to the computer 10. Where a value exceeds a fixed value, the scanning line is judged to be a character part; where it is smaller, the line is judged to be an interval between lines. The picture signal on the scanning lines judged to be a character string is read out in the subscanning direction and processed by a counter 5, an X projection memory 8, an average value calculating circuit 11, and the computer 10. Thus, the kanji are separated precisely.

Description

DETAILED DESCRIPTION OF THE INVENTION The present invention relates to a character segmentation device used in OCR and similar applications. Conventional devices of this kind cut out characters using a projection in the x or y direction, that is, the count of pixels at which the image signal is at the black level. With this method, however, kanji composed of a left-hand radical (hen) and a right-hand component (tsukuri), such as 化 and 像, are often split into two characters, so accurate character segmentation cannot be performed.

The object of the present invention is to remedy the above drawback, so that kanji composed of a left-hand radical and a right-hand component, such as 化 and 像, are also cut out accurately.

The device of the present invention comprises means for computing a moving average of the projection over a number of pixels corresponding to the length of the space between characters, and means for cutting out characters using that moving average of the projection.

According to the device of the present invention, averaging the projection over a number of pixels corresponding to the length of the space between characters joins the radical and component parts of a kanji while keeping one kanji separate from the next, so that characters can be cut out accurately.

Next, the present invention will be described in detail with reference to the drawings.

FIG. 1 is an example in which the projection according to the present invention is used as the feature for character segmentation. When the text is written horizontally, as shown in FIG. 1(A), character strings (text lines) can be cut out by computing the y projection and cutting it at a certain threshold. Then, as shown in FIG. 1(B), once the character strings have been cut out, individual characters can be cut out by using the projection in the x direction for each character string. Similarly, when the text is written vertically, the x projection is computed first to cut out the character strings, and then the projection in the x direction is computed for each character string to cut out the characters.
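The two-stage procedure of FIG. 1 for horizontal text can be sketched roughly as follows. This is a minimal illustration, not the circuit of the patent: the function names, the toy image, and the thresholds of 0 are all assumptions chosen for the example.

```python
def y_projection(image):
    """Black-pixel count of each scanning line (row); 1 means black."""
    return [sum(row) for row in image]


def x_projection(image, top, bottom):
    """Black-pixel count of each column, restricted to rows top..bottom-1."""
    width = len(image[0])
    return [sum(image[y][x] for y in range(top, bottom)) for x in range(width)]


def runs_above(values, threshold):
    """Return (start, end) pairs, end exclusive, where values exceed threshold."""
    runs, start = [], None
    for i, v in enumerate(values):
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(values)))
    return runs


def segment(image, line_threshold=0, char_threshold=0):
    """Cut text lines from the y projection, then characters from each line's x projection."""
    boxes = []
    for top, bottom in runs_above(y_projection(image), line_threshold):
        columns = x_projection(image, top, bottom)
        for left, right in runs_above(columns, char_threshold):
            boxes.append((top, bottom, left, right))
    return boxes


# Hypothetical toy page: two text lines, '#' is a black pixel.
PAGE = [
    "##..##",
    "##..##",
    "......",
    ".##...",
    ".##...",
]
image = [[1 if c == "#" else 0 for c in row] for row in PAGE]
boxes = segment(image)  # list of (top, bottom, left, right) boxes
```

On this toy page the first text line yields two boxes and the second yields one. The same raw-projection approach is what splits a kanji whose radical and component are separated by a blank column.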

FIG. 2 illustrates the effect of averaging the x projection as used in the present invention. If the characters in the figure are cut out using the x projection itself with a projection threshold of 0, kanji such as 像 and 化 may be split into two parts, the left-hand radical and the right-hand component. If instead a moving average of the x projection is used, with the window size chosen to correspond to the length of the space between characters, the radical and component of each character are joined while adjacent characters remain separated. Specifically, with a moving-average window size of 10 and a moving-average threshold of 5 in the figure, the radical and component of kanji such as 像 and 化 are joined properly and the characters are cut out correctly. Using the average of the x projection in this way makes it possible to cut out kanji made of separated parts, which was difficult with the simple x projection used conventionally.
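The effect described for FIG. 2 can be reproduced on a small hand-made projection. The sketch below is an illustration under stated assumptions: the window placement (roughly centred), the treatment of samples beyond the edges as 0, and the toy values (window 6, threshold 2) are not taken from the patent, which uses a window of 10 and a threshold of 5 for its own figure.

```python
def moving_average(values, window):
    """Mean over `window` neighbouring samples around each position.

    Assumption: samples outside the projection count as 0, and the window
    covers indices i - window//2 .. i - window//2 + window - 1.
    """
    half = window // 2
    n = len(values)
    averaged = []
    for i in range(n):
        lo, hi = i - half, i - half + window
        total = sum(values[j] for j in range(max(lo, 0), min(hi, n)))
        averaged.append(total / window)
    return averaged


def runs_above(values, threshold):
    """Return (start, end) pairs, end exclusive, where values exceed threshold."""
    runs, start = [], None
    for i, v in enumerate(values):
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(values)))
    return runs


# Hypothetical x projection of two kanji. Each has a 3-column radical, a
# 2-column internal gap and a 4-column component; 6 blank columns lie between.
ONE_KANJI = [6, 6, 6, 0, 0, 6, 6, 6, 6]
PROJECTION = ONE_KANJI + [0] * 6 + ONE_KANJI

raw_parts = runs_above(PROJECTION, 0)                         # four fragments
characters = runs_above(moving_average(PROJECTION, 6), 2)     # two characters
```

With the raw projection and a threshold of 0, each kanji falls apart at its internal gap, giving four fragments. With the averaged projection the narrow internal gap stays above the threshold and only the wider inter-character space drops below it, giving two segments. The segment boundaries shift by about a column because of the smoothing.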

FIG. 3 shows an example of a circuit configuration for implementing the present invention. The image signal for one sheet of paper, input from input terminal 1, is first stored in image memory 2. The stored image signal is read out one scanning line at a time in accordance with the signal from memory control circuit 3, the number of black pixels is counted by counter 4, and the count is written into y projection memory 6. These operations are performed for all scanning lines, with memory control circuit 7 setting the write address in accordance with the control signal from microcomputer 10. The y projection values are then read into microcomputer 10. Where the y projection value is larger than a certain fixed value, the line is judged to be a character part; where it is smaller, the line is judged to be a space between text lines. In this way the image signal from scanning line n1 to scanning line n2 is cut out as the first character string. Next, the image signal from line n1 to line n2, which has been judged to be a character string, is read out in the sub-scanning direction in accordance with the signal from memory control circuit 3, the number of black pixels is counted by counter 5, and the count is written into x projection memory 8.

These operations are performed for the pixels of all the scanning lines, with memory control circuit 9 setting the write address in accordance with the control signal from microcomputer 10. The x projection values are then read out, the average of the x projection is computed by average value calculation circuit 11, and the result is input to microcomputer 10. In microcomputer 10, where the average of the x projection is larger than a certain fixed value the position is judged to be a character, and where it is smaller the position is judged to be a space between characters. In this way characters are extracted one at a time. The extracted characters are output from output terminal 12 in accordance with the control signal from memory control circuit 3.
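The data flow of FIG. 3 can be mirrored in software roughly as below. This is only a sketch of the sequence of steps: the hardware blocks are mapped to plain functions and lists, and the function names, the window placement, the edge handling, and the toy parameter values are assumptions rather than details given in the patent.

```python
def count_black(pixels):
    """Role of counters 4 and 5: count black pixels (value 1) in a run."""
    return sum(pixels)


def moving_average(values, window):
    """Role of circuit 11, assuming a roughly centred window padded with 0."""
    half, n = window // 2, len(values)
    return [
        sum(values[j] for j in range(max(i - half, 0), min(i - half + window, n))) / window
        for i in range(n)
    ]


def runs_above(values, threshold):
    """Return (start, end) pairs, end exclusive, where values exceed threshold."""
    runs, start = [], None
    for i, v in enumerate(values + [threshold]):  # sentinel closes a final run
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            runs.append((start, i))
            start = None
    return runs


def extract_characters(image, line_threshold, window, char_threshold):
    """Return each extracted character as a list of pixel rows."""
    height, width = len(image), len(image[0])
    # Contents of y projection memory 6: one count per scanning line.
    y_proj = [count_black(image[y]) for y in range(height)]
    characters = []
    for n1, n2 in runs_above(y_proj, line_threshold):
        # Contents of x projection memory 8 for scanning lines n1..n2-1.
        x_proj = [count_black(image[y][x] for y in range(n1, n2)) for x in range(width)]
        averaged = moving_average(x_proj, window)
        for left, right in runs_above(averaged, char_threshold):
            characters.append([row[left:right] for row in image[n1:n2]])
    return characters


# Hypothetical page: one text line holding two kanji, each with an internal gap.
LINE = "###..####......###..####"
image = [[1 if c == "#" else 0 for c in row] for row in ["." * 24, LINE, LINE, LINE, "." * 24]]
chars = extract_characters(image, line_threshold=0, window=6, char_threshold=1.0)
```

Setting `window=1` and `char_threshold=0` reduces this to the conventional raw-projection method, which makes it easy to compare the two behaviours on the same input.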

The above procedure is performed on the y projection of all scanning lines, that is, on all character strings. All of these procedures are stored as a program in the ROM of microcomputer 10.

Although the present invention has been described above with reference to an embodiment, the embodiment is merely illustrative and is not to be taken in a limiting sense. Here, data from 10 samples were used to compute the average of the x projection, but this value depends on the character size and the sampling density, and other values may of course be used.
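As a rough illustration of how the window size follows from character spacing and sampling density, the conversion might look like the sketch below. The function name, the millimetre units, and the numeric values are hypothetical; the patent states only that the window corresponds to the inter-character space and varies with character size and sampling density.

```python
def window_in_samples(gap_length_mm, samples_per_mm):
    """Convert an inter-character gap length into a window size in samples.

    Both arguments are hypothetical inputs chosen for illustration; the
    result is clamped to at least one sample.
    """
    return max(1, round(gap_length_mm * samples_per_mm))


# A 1.25 mm gap sampled at 8 samples per mm spans 10 samples.
window = window_in_samples(1.25, 8)
# Doubling the sampling density doubles the window for the same gap.
window_fine = window_in_samples(1.25, 16)
```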

[Brief Description of the Drawings]

FIG. 1 is a diagram explaining the relationship between the projection according to the present invention and the input image, FIG. 2 is a diagram showing the average of the x projection used in the present invention, and FIG. 3 is a block diagram showing an example of the character segmentation device of the present invention.
In the figures, reference numeral 1 denotes an input terminal, 2 an image memory, 3 a memory control circuit, 4 and 5 counters, 6 a y projection memory, 8 an x projection memory, 7 and 9 memory control circuits, 10 a microcomputer, 11 an average value calculation circuit, and 12 an output terminal.
Agent: Patent Attorney Uchihara (内原 晋)

Claims (1)

[Claims] A character segmentation device that performs segmentation of characters, characterized by comprising: means for computing a moving average of a projection using a number of pixels corresponding to the length of the space between characters; and means for cutting out characters using the moving average of the projection.
JP58003103A 1983-01-12 1983-01-12 Separating device of character Pending JPS59128678A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP58003103A JPS59128678A (en) 1983-01-12 1983-01-12 Separating device of character

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP58003103A JPS59128678A (en) 1983-01-12 1983-01-12 Separating device of character

Publications (1)

Publication Number Publication Date
JPS59128678A true JPS59128678A (en) 1984-07-24

Family

ID=11548009

Family Applications (1)

Application Number Title Priority Date Filing Date
JP58003103A Pending JPS59128678A (en) 1983-01-12 1983-01-12 Separating device of character

Country Status (1)

Country Link
JP (1) JPS59128678A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH06111064A (en) * 1992-09-29 1994-04-22 N T T Data Tsushin Kk Character cutout method

Similar Documents

Publication Publication Date Title
JPS6184960A (en) Method and device for correcting oblique posture of document originals
US4901365A (en) Method of searching binary images to find search regions in which straight lines may be found
US5267325A (en) Locating characters for character recognition
JPH0146910B2 (en)
JPS59128678A (en) Separating device of character
JPS5866174A (en) Row extraction method
JPS6015781A (en) Character segment device
DE19858968A1 (en) Identification of document angle on scanner bed
JPH0373916B2 (en)
JPS6343788B2 (en)
JPS62165284A (en) String extraction method
JPH02253383A (en) Image processing device
JP2931041B2 (en) Character recognition method in table
JP2000222577A (en) Ruled line processing method, apparatus and recording medium
JPH0623983B2 (en) Distortion correction device
JPS60140488A (en) Character feature extraction method
JPH1049602A (en) Form recognition method
JPH0443312B2 (en)
JPH03133262A (en) Character area separation system
KR930008774B1 (en) Method of data compression of text image
JPS6362024B2 (en)
JPH05290166A (en) Segment recognition system
JPS63259784A (en) character recognition device
JPS6149554A (en) Image extraction circuit
JPH07123253A (en) Picture area discrimination device