JPS6286465A - Document image area division method

Info

Publication number: JPS6286465A
Application number: JP60226721A
Authority: JP
Inventors: Masatoshi Hino (樋野 匡利); Kuniaki Tabata (田畑 邦晃); Tetsuo Machida (町田 哲夫)
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1985-10-14
Filing date: 1985-10-14
Publication date: 1987-04-20

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed description of the invention]

[Field of application of the invention]

The present invention relates to an apparatus that inputs and stores documents as images, and in particular to a document image area division method that divides an input document image into character areas, graphic areas, photographic areas, and the like.

[Background of the invention]

Conventional document image area division methods include, for example, the following:

a method that projects the document image in the vertical and horizontal directions to build histograms and extracts character strings by finding the places where the histograms change sharply (Akita and Masuda, "A method for extracting page components without relying on format specification information," Transactions of the Institute of Electronics and Communication Engineers of Japan, '83/1, Vol. J66-D, No. 1, pp. 111-118);

a method that uses the two-dimensional Fourier transform of the document image (Hoshino, "Method for area segmentation of printed document images," JP-A-59-55581); and

a method that divides areas by focusing on the features of the white runs and black runs that appear in the document image (Murao and Sakai, "Extraction of structural information from document images," Proceedings of the 21st National Convention of the Information Processing Society of Japan, 7H-1, pp. 857-858).

Methods that use vertical and horizontal projections of the document image or its two-dimensional Fourier transform rely on global information about the image. They can classify and divide the image efficiently when characters cover a large part of the document, but this becomes difficult when the character portion is small. In addition, because global information is processed, a memory that can hold the entire image is required and the amount of processing is large.

The method that focuses on the features of white runs and black runs in the document image also has the problem that the amount of data to be processed is large, because individual runs are the unit of processing.

[Purpose of the invention]

An object of the present invention is to provide a document image area division method that does not require a large image memory and that processes images efficiently.

[Summary of the invention]

To achieve the above object, the document image area division method according to the present invention is characterized by comprising: a step of focusing on the connected components in a document image that contains character areas and graphic or photographic areas, and extracting the rectangle circumscribing each connected component; a step of determining, from the adjacency relationships among the circumscribed rectangles, the average size of the character circumscribed rectangles that make up vertically or horizontally written character strings; and a step of distinguishing character areas from non-character graphic or photographic areas using the average size as a reference.

The method of the present invention exploits the following properties of the circumscribed rectangles of connected components in a document image. Characters used in a document generally form character strings written vertically or horizontally, so that:

(1) In a horizontally written character string, rectangles of similar height are lined up close together in the horizontal direction.

(2) In a vertically written character string, rectangles of similar width are lined up close together in the vertical direction.

(3) The circumscribed rectangles of connected components in graphic or photographic areas are considerably larger in height and width than those of the connected components that make up characters, and show no regularity or similarity in size or position.

(4) The connected components that make up characters are more numerous than the connected components that make up graphic or photographic areas.

Therefore, by using these properties, the character portions of a document image can be separated from the other areas by, for example, the procedure shown in FIG. 1.

That is, the circumscribed rectangles of connected components are first extracted from the input document image (process 100). Next, the extracted circumscribed rectangle information is used to determine whether the document is written vertically or horizontally (process 200). This determination relies on properties (1) and (2) above and is made as follows. For each of the two assumptions, vertical writing and horizontal writing, the distances between mutually adjacent rectangles that appear to form character strings are obtained over the entire image and averaged. The assumption that gives the smaller average is judged to be correct.

Together with the vertical/horizontal determination, the average height and width of the characters are estimated (process 300). This is done by averaging, over the entire image, the heights and widths of the rectangles judged to form character strings. Using the estimated character height and width as a reference, rectangles whose height or width is clearly larger are assumed to belong to graphic or photographic areas; neighboring ones are merged, and the result is extracted as "graphic/photographic areas" (process 400).

For each extracted "graphic/photographic area", the original image is examined to separate "graphic areas" from "photographic areas" (process 500). This step relies on the general property that graphics consist mostly of white background while photographs consist mostly of black, for example by computing the white-to-black pixel ratio within each area to decide which kind of area it is. When the white-to-black pixel ratio is the criterion, computing the ratio inside each rectangle at the time the circumscribed rectangles of connected components are extracted makes it possible to separate "graphic areas" from "photographic areas" without examining the original image.
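Process 400 as summarized above can be sketched as follows. This is only an illustrative sketch, not the patent's implementation: the function names, the rectangle format `(x_min, y_min, x_max, y_max)`, the size factors `w_factor` and `h_factor`, and the use of bounding-box overlap as the test for "neighboring" rectangles are all assumptions made for the example.

```python
def overlaps(a, b):
    """True if two rectangles (x_min, y_min, x_max, y_max) overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def union(a, b):
    """Smallest rectangle that circumscribes both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def extract_non_text_areas(rects, avg_w, avg_h, w_factor=3.0, h_factor=3.0):
    """Keep rectangles clearly larger than the average character size,
    then merge overlapping ones until no further merge is possible.
    w_factor and h_factor are hypothetical thresholds."""
    large = [r for r in rects
             if (r[2] - r[0]) > w_factor * avg_w
             or (r[3] - r[1]) > h_factor * avg_h]
    merged = True
    while merged:
        merged = False
        for i in range(len(large)):
            for j in range(i + 1, len(large)):
                if overlaps(large[i], large[j]):
                    large[i] = union(large[i], large[j])
                    del large[j]
                    merged = True
                    break
            if merged:
                break
    return large


rects = [
    (0, 0, 10, 12), (12, 0, 22, 12),   # character-sized rectangles
    (0, 50, 200, 150),                 # large rectangle
    (150, 100, 300, 220),              # large rectangle overlapping the one above
]
print(extract_non_text_areas(rects, avg_w=10, avg_h=12))
```

The rectangles that are not returned remain candidates for character components and would go on to the character string extraction of process 600.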

Finally, the rectangles judged to make up characters are merged, in the vertical direction for vertical writing and in the horizontal direction for horizontal writing, to extract character strings, and the character strings are then merged to extract character areas (process 600).

[Embodiments of the invention]

Embodiments of the present invention are described in detail below.

FIG. 2 is a schematic diagram of the area division method according to the present invention. The document to be processed is input through an image input device such as a scanner as input document image data 10. The connected component circumscribed rectangle extraction unit 11 extracts the circumscribed rectangles of connected components from the input document image data 10 and writes the result to the circumscribed rectangle information table 12. A known technique can be used for this processing, for example the one described in "Digital Image Processing" (Rosenfeld and Kak, Japanese translation supervised by Makoto Nagao, Kindai Kagaku Sha, pp. 353-361). Rectangle information can be represented in various formats; for example, the maximum and minimum coordinate values in the vertical and horizontal directions are used. Reference numeral 13 denotes the area division processing unit.

[Process 601]: If the determination result of process 200 is vertical writing, process 602 is executed; if it is horizontal writing, process 604 is executed.
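The circumscribed rectangle extraction performed by unit 11 could be sketched as below. The source only says that a known textbook technique is used, so this is a stand-in under stated assumptions: a breadth-first flood fill over 8-connected black pixels, with the image given as a list of rows of 0/1 values. The function name and the tuple layout are hypothetical. Unlike the line-memory approach the patent aims for, this simple version keeps the whole image in memory. It also records the black pixel count of each component, which the later separation of graphic and photographic areas can use.

```python
from collections import deque


def extract_bounding_boxes(image):
    """Return one (x_min, y_min, x_max, y_max, black_count) tuple per
    8-connected component of black (1) pixels in a binary image."""
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if image[y0][x0] != 1 or seen[y0][x0]:
                continue
            # Breadth-first flood fill over one connected component.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            x_min = x_max = x0
            y_min = y_max = y0
            black = 0
            while queue:
                x, y = queue.popleft()
                black += 1
                x_min, x_max = min(x_min, x), max(x_max, x)
                y_min, y_max = min(y_min, y), max(y_max, y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x_min, y_min, x_max, y_max, black))
    return boxes


page = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 1],
]
print(extract_bounding_boxes(page))
```

Each returned tuple corresponds to one entry of the circumscribed rectangle information table 12 in the minimum/maximum coordinate format mentioned above.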

[Process 602]: The rectangles classified in FIG. 8 as "character", "horizontally written character", or "vertically written character" are taken as character components, and groups of these rectangles that mutually satisfy the vertical connection condition are found. The rectangle circumscribing all the rectangles in each group is obtained and treated as a "character string".

[Process 603]: For the rectangles representing the "character strings" obtained in process 602, two horizontally adjacent rectangles can be merged into a character area when they satisfy the following conditions. Let the two adjacent rectangles be i and j (j being the right-hand neighbor of i), and let the minimum and maximum values of their x and y coordinates be $x_{\min,i}$, $x_{\max,i}$, $y_{\min,i}$, $y_{\max,i}$ and $x_{\min,j}$, $x_{\max,j}$, $y_{\min,j}$, $y_{\max,j}$.

(i) Their vertical positions overlap.

$$\max(y_{\min,i},\, y_{\min,j}) < \min(y_{\max,i},\, y_{\max,j}) \quad (14)$$

(ii) The horizontal distance is no more than $\gamma_1$ times the average width of the two rectangles.

$$x_{\min,j} - x_{\max,i} < \gamma_1 \times \frac{(x_{\max,i} - x_{\min,i}) + (x_{\max,j} - x_{\min,j})}{2} \quad (15)$$

Here $\gamma_1$ is a parameter giving the ratio of the distance between character strings to the width of a character string; for ordinary printed documents it is 1 to 2.

For all rectangles representing character strings, the groups of rectangles that mutually satisfy conditions (i) and (ii) are found, and the rectangle circumscribing all the rectangles in each group is extracted as a "character area".
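The merging in process 603 can be sketched as follows, assuming rectangles in the form `(x_min, y_min, x_max, y_max)`. The function names and the default `gamma1=1.5` (chosen from the 1 to 2 range given in the text) are hypothetical. As a simplification, the sketch tests every pair of character strings rather than only neighbors, and it uses a small union-find structure to collect the groups; the patent does not prescribe how the groups are formed.

```python
def can_merge_horizontally(a, b, gamma1=1.5):
    """Conditions (i) and (ii) for two character-string rectangles."""
    left, right = (a, b) if a[0] <= b[0] else (b, a)
    # (i) vertical extents overlap, equation (14)
    overlap = max(left[1], right[1]) < min(left[3], right[3])
    # (ii) horizontal gap below gamma1 times the mean width, equation (15)
    gap = right[0] - left[2]
    mean_width = ((left[2] - left[0]) + (right[2] - right[0])) / 2
    return overlap and gap < gamma1 * mean_width


def group_text_areas(strings, gamma1=1.5):
    """Merge character strings into character areas (circumscribing rectangles)."""
    parent = list(range(len(strings)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(strings)):
        for j in range(i + 1, len(strings)):
            if can_merge_horizontally(strings[i], strings[j], gamma1):
                parent[find(i)] = find(j)

    groups = {}
    for i, rect in enumerate(strings):
        groups.setdefault(find(i), []).append(rect)
    return sorted(
        (min(r[0] for r in g), min(r[1] for r in g),
         max(r[2] for r in g), max(r[3] for r in g))
        for g in groups.values()
    )


# Two neighboring vertical text columns and one distant column.
columns = [(0, 0, 10, 100), (14, 0, 24, 100), (100, 0, 110, 100)]
print(group_text_areas(columns))
```

The horizontally written case (process 605) would follow the same pattern with the roles of the x and y coordinates exchanged and $\gamma_2$ in place of $\gamma_1$.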

[Process 604]: In the same way as process 602, groups of character components that mutually satisfy the horizontal connection condition are found, and the rectangle circumscribing all the rectangles in each group is obtained and treated as a "character string".

[Process 605]: For the rectangles representing the "character strings" obtained in process 604, two vertically adjacent rectangles can be merged into a character area when they satisfy the following conditions. Let the two adjacent rectangles be i and j (j being the neighbor below i).

(iii) Their horizontal positions overlap.

$$\max(x_{\min,i},\, x_{\min,j}) < \min(x_{\max,i},\, x_{\max,j}) \quad (15)$$

(iv) The vertical distance is no more than $\gamma_2$ times the average height of the two rectangles.

$$y_{\min,j} - y_{\max,i} < \gamma_2 \times \frac{(y_{\max,i} - y_{\min,i}) + (y_{\max,j} - y_{\min,j})}{2} \quad (16)$$

Here $\gamma_2$ is a parameter giving the ratio of the distance between character strings to the height of a character string; for ordinary printed documents it is 1 to 2.

For all rectangles representing character strings, the groups of rectangles that mutually satisfy conditions (iii) and (iv) are found, and the rectangle circumscribing all the rectangles in each group is extracted as a "character area".

[Effect of the invention]

The present invention provides the following effects.

(1) Because the circumscribed rectangles of connected components are the basic unit of processing, a document image can be divided into "character areas", "graphic areas", and "photographic areas" more efficiently than with processing that uses pixels or runs as the basic unit.

(2) Except for the separation of "graphic areas" from "photographic areas", the processing uses only the circumscribed rectangle information of connected components. Because the circumscribed rectangles of connected components can be extracted with a line memory, everything except that separation can be implemented with a line memory alone. Furthermore, if the feature used for the separation is one that can be computed at the same time as the circumscribed rectangles are extracted, such as the white-to-black pixel ratio, the entire processing can be implemented with only a line memory, without a memory that stores the whole image.
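As a small illustration of effect (2), the white-to-black pixel ratio of an area can be derived from table entries alone when each entry carries a black pixel count. The sketch below makes several assumptions: the function name and tuple layout are hypothetical, coordinates are inclusive pixel indices (hence the `+ 1`), every black pixel in the area belongs to one of the listed components, and the area contains at least one black pixel.

```python
def white_black_ratio(area, components):
    """area: (x_min, y_min, x_max, y_max) with inclusive pixel coordinates.
    components: (x_min, y_min, x_max, y_max, black_count) entries inside area.
    Returns numW / numB without touching the original image."""
    total = (area[2] - area[0] + 1) * (area[3] - area[1] + 1)
    black = sum(c[4] for c in components)
    white = total - black
    return white / black


area = (0, 0, 9, 9)                                  # 10 x 10 pixels
components = [(1, 1, 4, 4, 20), (6, 6, 7, 8, 5)]     # 25 black pixels in total
print(white_black_ratio(area, components))
```

A large ratio would point toward a mostly white graphic area and a small ratio toward a mostly black photographic area; the decision thresholds themselves are parameters of the method.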

[Brief explanation of drawings]

FIG. 1 is an overall flow diagram showing the procedure of the area division processing according to the present invention; FIG. 2 is an explanatory diagram of the area division method according to the present invention; FIG. 3 shows an example of a hardware configuration for performing the area division; FIGS. 4 and 5 are explanatory diagrams of the horizontal and vertical adjacency conditions, respectively; FIGS. 6-1, 6-2, 7, and 9 are detailed flow diagrams of parts of FIG. 1; and FIG. 8 shows the classification of areas by rectangle height and width.

10: input document image data; 11: connected component circumscribed rectangle extraction unit; 12: connected component circumscribed rectangle information table; 13: area division processing unit; 14: area division result; 20: CPU; 21: main memory; 22: display; 23: image output device; 24: image input device; 25: image memory; 26: image processing device; 27: image file.

[Details of the processing]

Using the information in the circumscribed rectangle information table 12, the area division processing unit 13 divides the image into areas following the flow of FIG. 1 and obtains the area division result 14. In this area division processing, the original image data is used to separate "graphic areas" from "photographic areas", but all other processing can be performed using only the circumscribed rectangle information stored in table 12.

FIG. 3 shows an example of a hardware configuration for performing the above processing. Document image data is input to the image memory 25 by the image input device 24. The image processing device 26 extracts connected components from the image data in the image memory 25 and writes their circumscribed rectangle information to the circumscribed rectangle information table 12 in the main memory 21. The CPU 20 executes the area division processing using this circumscribed rectangle information and the image data. The result of the area division is stored in the image file 27 together with the image data. The result can also be shown on the display 22 with a different pattern for each type of area or output to an image output device 23 such as a printer, and the original image can be displayed or output area by area. The CPU 20 also controls the operation of each of the elements described above. In this example a dedicated image processing device 26 is used for high-speed processing, but it can be omitted if the circumscribed rectangles of connected components are extracted by a program running on the CPU 20.

Next, the processing shown in FIG. 1 is described in detail. First, the horizontal and vertical adjacency conditions are defined as follows.

[Horizontal adjacency condition] For the rectangles i and j shown in FIG. 4:

(1) In the horizontal direction, neither rectangle contains the other.

$$(x_{\min,i} - x_{\min,j}) \times (x_{\max,i} - x_{\max,j}) > 0 \quad (1)$$

(2) The vertical positions (y coordinates) of both centroids lie within the vertical overlap of the two rectangles.

(3) The heights are similar.

$$1/\alpha < h_i / h_j < \alpha \quad (\alpha > 1) \quad (3)$$

Here $\alpha$ is a parameter specifying the allowed range of the height ratio.

[Vertical adjacency condition] For the rectangles k and l shown in FIG. 5:

(1) In the vertical direction, neither rectangle contains the other.

$$(y_{\min,k} - y_{\min,l}) \times (y_{\max,k} - y_{\max,l}) > 0 \quad (4)$$

(2) The horizontal positions (x coordinates) of both centroids lie within the horizontal overlap of the two rectangles.

$$\max(x_{\min,k},\, x_{\min,l}) < g_{x,k},\ g_{x,l} < \min(x_{\max,k},\, x_{\max,l}), \qquad g_{x,k} = \frac{x_{\min,k} + x_{\max,k}}{2}, \quad g_{x,l} = \frac{x_{\min,l} + x_{\max,l}}{2} \quad (5)$$

(3) The widths are similar.

[Process 100]: The circumscribed rectangles of connected components are extracted from the image data, and their information is written to the circumscribed rectangle information table.

[Process 200]: From the circumscribed rectangle information of the extracted connected components, it is determined as follows whether the document image is written vertically or horizontally. FIGS. 6-1 and 6-2 show the detailed flow of this determination. In the following description, the circumscribed rectangle of a connected component is simply called a rectangle.

[Process 201]: Variables are initialized. countH and distH are the number of pairs of rectangles that satisfy the horizontal adjacency condition and lie nearest to each other, and the sum of the distances between those rectangles; hsumH and wsumH are the sums of the heights and widths of those rectangles. countV, distV, hsumV, and wsumV are the corresponding values for the vertical direction.

[Process 202]: Processes 203 to 210 are repeated for all the rectangles obtained in process 100. When the repetition ends, process 211 is executed.

[Process 203]: For the rectangle of interest, the nearest rectangle on its right that satisfies the horizontal adjacency condition is found. The right-hand neighbor is used in order to avoid processing the same pair of rectangles twice; the left-hand neighbor may be used instead, or both neighbors may be used provided that duplicates are taken into account.

[Process 204]: If a rectangle satisfying the horizontal adjacency condition was found in process 203, process 205 is executed; otherwise process 207 is executed.

[Process 205]: The distance dh between the rectangle of interest and the rectangle found is computed. With $x_{\max,i}$ the maximum x coordinate of the rectangle of interest and $x_{\min,j}$ the minimum x coordinate of the rectangle found,

$$dh = x_{\min,j} - x_{\max,i} \quad (7)$$

dh is allowed to be negative.

[Process 206]: The number of horizontally adjacent rectangle pairs countH and the sum of inter-rectangle distances distH are updated. The width sum wsumH and the height sum hsumH are also updated with the width w and height h of the rectangle of interest.

[Process 207]: As in the horizontal direction, the nearest rectangle below the rectangle of interest that satisfies the vertical adjacency condition is found.

[Process 208]: If a rectangle satisfying the vertical adjacency condition was found, process 209 is executed; otherwise process 202 is executed.

[Process 209]: The distance dv between the rectangle of interest and the rectangle found is computed. With $y_{\max,i}$ the maximum y coordinate of the rectangle of interest and $y_{\min,j}$ the minimum y coordinate of the rectangle found,

$$dv = y_{\min,j} - y_{\max,i} \quad (8)$$

dv is allowed to be negative.

[Process 210]: The number of vertically adjacent rectangle pairs countV and the sum of inter-rectangle distances distV are updated. The width sum wsumV and the height sum hsumV are also updated with the width w and height h of the rectangle of interest.

[Process 211]: The average distance avrdH between horizontally adjacent rectangles and the average distance avrdV between vertically adjacent rectangles are computed.

[Process 212]: If avrdH - avrdV is less than $-\varepsilon$, the document is judged to be horizontally written; if it is greater than $\varepsilon$, the document is judged to be vertically written. If it is between $-\varepsilon$ and $\varepsilon$ inclusive, the orientation is judged to be undeterminable. Here $\varepsilon$ is a non-negative constant that sets the range in which no decision is made. When $\varepsilon = 0$, the orientation is undeterminable only when avrdH = avrdV.

[Process 300]: The character height and width are estimated from the result of process 200. FIG. 7 shows the detailed flow of this estimation.

[Process 301]: If the result of process 200 is vertical writing, processes 302 and 303 are executed; if it is horizontal writing, processes 304 and 305 are executed.

[Process 302], [Process 303]: The average height avrH and average width avrW of the rectangles judged in process 200 to form vertical character strings are computed and used as the estimates.

$$avrH = hsumV / countV \quad (9)$$

$$avrW = wsumV / countV \quad (10)$$

[Process 304], [Process 305]: The average height avrH and average width avrW of the rectangles judged in process 200 to form horizontal character strings are computed and used as the estimates.

$$avrH = hsumH / countH \quad (11)$$

$$avrW = wsumH / countH \quad (12)$$

[Process 400]: Using the character height and width estimates obtained in process 300, "graphic/photographic areas" are extracted. FIG. 8 shows the thresholds used for the extraction. Here $s_1$, $s_2$, $t_1$, and $t_2$ are parameters with $0 < s_1 < 1 < s_2$ and $0 < t_1 < 1 < t_2$. Rectangles whose width is greater than $s_2 \times avrW$ or whose height is greater than $t_2 \times avrH$ are treated as components of graphic or photographic areas; rectangles that overlap one another are merged into the rectangle circumscribing them, which is treated as a "graphic/photographic area".

[Process 500]: For each "graphic/photographic area" extracted in process 400, the white-to-black pixel ratio rwb within the area is computed. Computing this ratio generally requires the original image data, but if the number of black pixels is obtained at the time of connected component extraction, the ratio can be obtained without examining the original image.

$$rwb = numW / numB \quad (13)$$

Here numW is the number of white pixels and numB is the number of black pixels. When $rwb > u$, the area is judged to be a "graphic area"; when $rwb < v$, it is judged to be a "photographic area". $u$ and $v$ are parameters with $0 < u < v < \ldots$

[Process 600]: The rectangles judged in process 400 to be character components are merged to extract "character areas". FIG. 9 is the detailed flow diagram of process 600.
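Processes 201 to 212 and process 300 can be sketched together as follows. This is a hedged illustration rather than the patent's implementation. Rectangles are `(x_min, y_min, x_max, y_max)`; the names `h_adjacent`, `v_adjacent`, `accumulate`, and `estimate` are hypothetical; `alpha=2.0` is an assumed value; the vertical adjacency test is written as the horizontal test with the axes exchanged and reuses the same `alpha` for the width ratio; and treating an empty direction as an infinite average distance is a choice made for this example that the text does not specify.

```python
def h_adjacent(a, b, alpha=2.0):
    """Horizontal adjacency condition between two rectangles."""
    not_nested = (a[0] - b[0]) * (a[2] - b[2]) > 0            # neither contains the other in x
    lo, hi = max(a[1], b[1]), min(a[3], b[3])                # vertical overlap
    centers_ok = all(lo < (r[1] + r[3]) / 2 < hi for r in (a, b))
    ha, hb = a[3] - a[1], b[3] - b[1]
    similar = ha > 0 and hb > 0 and 1 / alpha < ha / hb < alpha
    return not_nested and centers_ok and similar


def transpose(r):
    """Swap the x and y axes of a rectangle."""
    return (r[1], r[0], r[3], r[2])


def v_adjacent(a, b, alpha=2.0):
    """Vertical adjacency condition: the horizontal test with axes exchanged."""
    return h_adjacent(transpose(a), transpose(b), alpha)


def accumulate(rects, adjacent, axis):
    """axis=0: nearest right neighbor (processes 203-206);
    axis=1: nearest neighbor below (processes 207-210)."""
    count = dist = wsum = hsum = 0
    for a in rects:
        candidates = [b for b in rects if b[axis] > a[axis] and adjacent(a, b)]
        if not candidates:
            continue
        nearest = min(candidates, key=lambda r: r[axis] - a[axis + 2])
        count += 1
        dist += nearest[axis] - a[axis + 2]    # dh or dv, may be negative
        wsum += a[2] - a[0]
        hsum += a[3] - a[1]
    return count, dist, wsum, hsum


def estimate(rects, eps=0.0):
    """Return (orientation, average width, average height)."""
    cH, dH, wH, hH = accumulate(rects, h_adjacent, 0)
    cV, dV, wV, hV = accumulate(rects, v_adjacent, 1)
    avrdH = dH / cH if cH else float("inf")    # assumption for empty directions
    avrdV = dV / cV if cV else float("inf")
    diff = avrdH - avrdV
    if diff < -eps:
        return ("horizontal", wH / cH, hH / cH)
    if diff > eps:
        return ("vertical", wV / cV, hV / cV)
    return ("undetermined", None, None)


# Two horizontal text lines of four 10 x 10 characters, 2 px apart, 10 px line gap.
rects = [(x, y, x + 10, y + 10) for y in (0, 20) for x in (0, 12, 24, 36)]
print(estimate(rects))
```

In this example the average horizontal gap (2) is smaller than the average vertical gap (10), so the page is judged to be horizontally written and the size estimates come from the horizontally adjacent rectangles.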

Claims

[Claims]

A document image area division method comprising: a step of focusing on the connected components in a document image that contains character areas and graphic or photographic areas, and extracting the rectangle circumscribing each connected component; a step of determining, from the adjacency relationships among the circumscribed rectangles, the average size of the character circumscribed rectangles that make up vertically or horizontally written character strings; and a step of distinguishing character areas from non-character graphic or photographic areas using the average size as a reference.