JPH0821057B2 - Document image analysis method

Info

Publication number: JPH0821057B2
Application number: JP62172199A
Authority: JP
Inventors: 善丈辻
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1987-07-10
Filing date: 1987-07-10
Publication date: 1996-03-04
Anticipated expiration: 2011-03-04
Also published as: JPS6415889A

Description

[Detailed Description of the Invention] (Field of Industrial Application) The present invention relates to a document image analysis method that divides a document image, such as a page of a book, into a predetermined page layout and automatically extracts desired regions. In particular, it relates to a structure analysis method for document images that, after dividing the image into basic elements such as character lines, characters, figures, and tables, generates the elements that make up the document image and the layout relationships among those elements.

(Prior Art) To store, retrieve, and transmit large volumes of existing document images efficiently, and to read general books automatically, it is not enough to read only predetermined character-image strings on fixed-format forms. A wide variety of document images must be analyzed, text regions must be separated from figure and table regions, and desired regions must be extracted automatically. Conventional structure analysis methods for such document images, and region segmentation methods in particular, include, for example, a method that uses the marginal (projection) distribution of a page composed of text regions together with tracing of connected components of black pixels (Transactions of the Institute of Electronics and Communication Engineers D, Vol. J66, No. 1, 1983-1, pp. 111-118). Another method raster-scans the document image and, from white/black run-length information, structures it into well-formed rectangular regions (21st National Convention on Information Processing, 7H-1).

(Problems to Be Solved by the Invention) The region segmentation methods described above can decompose a document image into basic elements such as character lines, figures, and tables, but they make it difficult to obtain the layout relationships among those basic elements, let alone the flow of the character-string information expressed as text. Moreover, methods that structure white/black run-length information into well-formed rectangular regions do not readily capture the overall composition of the document image, so format information and the physical parameters of the character lines making up the document image must be supplied in advance and used during segmentation. As a result, it has been difficult to analyze automatically, across a variety of document images, the layout relationships of the document components and the structure of the document.

As another structure analysis method for document images, "Structural Analysis of Page Images Based on the Split Detection Method" (Technical Report of the Institute of Electronics and Communication Engineers, Pattern Recognition and Learning, PRL85-17, 1985-6, pp. 63-70) determines the components of a document after segmenting it from global regions down to local regions while alternately extracting vertical and horizontal projection information. That method, however, presupposes reading the characters of the main text of a book. It does not yield the layout relationships that make up the document image, which are needed when, for example, the multi-column layouts found in general documents must be handled or only the characters in a specific region are to be read. It is therefore important to derive the elements of the document image and the layout relationships among them from the segmentation result.

An object of the present invention is to solve the above problems of the prior art by providing a method that generates and automatically determines, as a hierarchical structure, the elements constituting a document image and the above-below and left-right layout relationships among them. As guidelines it uses region information that hierarchically retains above-below or left-right layout relationships, together with the attribute values of the basic elements that make up the document, such as character lines.

Another object of the present invention is to provide a structure description method for document images that, by automatically generating the components of various document images and the layout relationships among them, makes it easy to detect the flow of information in text regions and the structure of the document.

A further object of the present invention is to provide a structure description method for document images that makes it easy to extract specific regions defined according to the purpose at hand.

(Constitution of the Invention) The present invention comprises two means. The first is region dividing means that divides a region to be divided from global regions down to local regions while alternately prescribing a uniquely determined division direction, either above-below or left-right. The second is document structure generating means that successively structures groups of basic elements and generates and determines, as a hierarchical structure, the elements constituting the document image and the layout relationships among them. It does so according to region information that hierarchically retains the above-below or left-right layout relationships determined by the division direction, and according to the attribute values of basic elements such as character lines.

(Embodiments) Embodiments of the present invention are described below with reference to the drawings.

Figs. 1 and 2 are examples used to explain the composition of document images written vertically and horizontally, respectively.

In the figures, black dots denote characters, and hatched rectangles denote elements such as figures, tables, and photographs. With a conventional region segmentation or line extraction method, one can extract the character line regions labeled S_i in Figs. 1 and 2 (i = 1, ..., 7 in Fig. 1; i = 1, ..., 15 in Fig. 2) and the figure/table/photograph region labeled F₁ in Fig. 1 (hereinafter called a pixel description region).

Turning to the flow of text information: in vertical writing, as in Fig. 1, text information normally flows from left to right across the vertically written character lines, and character information flows from top to bottom within each line. In other words, the character lines in Fig. 1 stand in a left-right relationship to one another. There are also layout relationships such as the pixel description region F₁ lying above character lines S₅, S₆, and S₇. Once the text regions labeled T_i (i = 1, 2) in Fig. 1 are detected, it is easy to see that the two text regions T₁ and T₂, each composed of character lines in a left-right relationship, are themselves in a left-right relationship, and that the pixel description region F₁ and the text region T₂ are in an above-below relationship. Extracting these layout relationships makes it easy, for example, to extract the characters in each line in order from text region T₁ to T₂ and convert them to character codes, or to extract only the text region below the pixel description region F₁.

Similarly, in horizontal writing, as in Fig. 2, information normally flows from right to left among the characters within a horizontally written line, and from top to bottom through a text region made up of such lines. For example, among the character lines S_i (i = 1, ..., 15) in the figure, lines S₁ to S₅, S₆ to S₁₀, and S₁₁ to S₁₅ are character lines in an above-below relationship that form text regions T₁, T₅, and T₆, respectively. Text regions T₅ and T₆ form a structure resembling a two-column layout. Because the information flow between them is a left-right relationship, text region T₄ can be regarded as being formed by the left-right-related text regions T₅ and T₆.

Also, as shown in Fig. 1, when text region T₁ contains character lines with different character pitches, T₁, which consists of character lines in an above-below relationship, can be further decomposed into text regions T₂ and T₃, and the above-below relationship still holds within each.

As explained above, the layout relationships among the components of a document image and the flow of text information can be expressed by hierarchically detecting and generating the relationships among the elements as above-below and left-right relationships (for vertical writing) or left-right relationships (for horizontal writing).

Figs. 3(a), (b), and (c) show an example of the method used in the present invention, which divides regions hierarchically while alternately prescribing above-below and left-right division directions.

One such segmentation method is described in the aforementioned "Structural Analysis of Page Images Based on the Split Detection Method". The details are therefore omitted here, and Figs. 3(a), (b), and (c) are used to explain mainly how this method divides regions hierarchically while preserving above-below and left-right relationships. Although that method uses features obtained from projection information, the segmentation method used in the present invention is not limited to it; run-length information and the like, or a combination of these, may also be used.

In Fig. 3(a), the region P of the document image to be segmented contains characters, shown as black dots, and a pixel description region, shown as a hatched rectangle, and has a structure similar to the vertically written document image of Fig. 1.

The symbol R_i(Lev) (Lev = 1, 2, 3, ...; i = 1, 2, 3, ...) used in Fig. 2 denotes a region obtained in the hierarchical segmentation process using projection distributions (the hatched shapes in the figure), where Lev denotes the division level. The division level Lev represents both the depth in the hierarchy and the direction in which the projection information was taken: regions divided by horizontal projection information have an odd division level, and regions divided by vertical projection information have an even division level. Clearly, regions divided by a horizontal (vertical) projection distribution preserve an above-below (left-right) relationship among themselves.
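The parity rule for the division level can be written down in a few lines. This is a minimal sketch rather than code from the patent; the function name `level_semantics` and the string labels are illustrative assumptions.

```python
def level_semantics(lev: int) -> tuple[str, str]:
    """Return (projection direction, sibling relationship) for a division level.

    Odd levels come from horizontal projections, so sibling regions are
    stacked above-below; even levels come from vertical projections, so
    sibling regions sit left-right.
    """
    if lev < 1:
        raise ValueError("division levels start at 1")
    if lev % 2 == 1:
        return ("horizontal", "above-below")
    return ("vertical", "left-right")


for lev in (1, 2, 3, 4):
    print(lev, level_semantics(lev))
```

Because the level alone determines the relationship, a region record only needs to store `Lev`; the direction does not have to be stored separately.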

First, the horizontal projection distribution H₁ is applied to the region P under analysis, yielding region R₁(1). Next, the vertical projection distribution V₂ is applied to R₁(1), yielding regions R₁(2), ..., R₅(2). These five regions at division level 2 clearly satisfy a left-right relationship in sequence. How each of the five regions is to be divided further is decided by examining the features of each region and the features between regions (gap widths and correlation ratios), case by case, according to the features of the parent region R₁(1) (for example, its identifier) and the features of the five regions R₁(2), ..., R₅(2).

In the case of Fig. 3(a), the region is divided into five regions. R₁(2) is given the identifier "undetermined", R₂(2) through R₅(2) are given the identifier "character line candidate", and these are stored.

Because the segmentation uses a depth-first search, region R₅(2) is taken next as the region to divide, and the same processing is repeated. Applying a horizontal projection distribution to R₅(2) yields multiple character candidate regions, so division of R₅(2) stops at this point and the result, including the character candidate regions, is stored. Likewise, the horizontal projection distribution is applied in turn to R₄(2), R₃(2), and R₂(2), which are processed in the same way as R₅(2).

Next, as shown in Fig. 3(b), the horizontal projection distribution H₂ is applied to region R₁(2), yielding two regions R₁(3) and R₂(3), which are processed in the same way as above.

In the case of Fig. 3(b), regions R₁(3) and R₂(3) are given the identifier "undetermined" and stored. R₁(3) and R₂(3) are in an above-below relationship, and their parent region is R₁(2). Next, as shown in Fig. 3(c), the vertical projection distribution V₄ is applied to R₂(3), yielding three character line candidate regions R₁(4), R₂(4), and R₃(4).

When the same kind of vertical projection distribution is applied to R₁(3), only one region results, so it cannot be divided further. Based on its size and other properties, it is given the identifier "pixel representation region" (figure, table, photograph, etc.). Repeating these operations until the depth-first search finishes produces the tree structure of region information shown in Fig. 4.
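The walkthrough of Fig. 3 can be approximated with a short recursive routine. The sketch below is a simplification under stated assumptions, not the patent's procedure: every blank row or column is treated as a split point (the patent instead examines gap widths, correlation ratios, and identifiers), children are visited in index order rather than starting from R₅(2), and the names `Node`, `runs`, and `divide` are hypothetical. Level 0 is used here for the page P itself.

```python
from dataclasses import dataclass, field

Box = tuple[int, int, int, int]  # (top, left, bottom, right), half-open


@dataclass
class Node:
    box: Box
    lev: int                      # division level; 0 stands for the page P
    children: list["Node"] = field(default_factory=list)


def runs(profile: list[int]) -> list[tuple[int, int]]:
    """Return half-open index ranges where the projection profile is non-zero."""
    out, start = [], None
    for i, v in enumerate(profile):
        if v and start is None:
            start = i
        elif not v and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(profile)))
    return out


def divide(img: list[list[int]], node: Node, max_lev: int = 6, stalled: bool = False) -> Node:
    """Depth-first division that alternates the projection direction per level."""
    top, left, bottom, right = node.box
    lev = node.lev + 1
    if lev > max_lev:
        return node
    if lev % 2 == 1:   # odd level: horizontal projection, above-below children
        profile = [sum(img[r][left:right]) for r in range(top, bottom)]
        boxes = [(top + a, left, top + b, right) for a, b in runs(profile)]
    else:              # even level: vertical projection, left-right children
        profile = [sum(img[r][c] for r in range(top, bottom)) for c in range(left, right)]
        boxes = [(top, left + a, bottom, left + b) for a, b in runs(profile)]
    no_split = len(boxes) <= 1
    if no_split and stalled:
        return node    # neither direction splits this region: keep it as a leaf
    for box in boxes:
        child = Node(box, lev)
        node.children.append(child)
        divide(img, child, max_lev, stalled=no_split)
    return node


# Toy binary image (1 = black): a solid block on the left, two bands on the right.
img = [
    [1, 1, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1, 1, 1],
]
root = divide(img, Node((0, 0, 4, 7), 0))
level1 = root.children[0]
print([c.box for c in level1.children])
```

As in the Fig. 3 walkthrough, the first horizontal projection does not split the page, the vertical projection at level 2 separates left from right, and only the right-hand region splits again at level 3.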

Note that the segmentation result in Fig. 4 was generated from the document image of Fig. 3(a). Because Fig. 3(a) has a structure similar to Fig. 1, the following explanation of Fig. 4 is given with reference to Fig. 1.

In Fig. 4, circles represent region information: the symbol S_i denotes a character line, the symbol F_i denotes a pixel representation region, and black dots denote character region information. Circles without a symbol represent undetermined region information.

Lev in the figure denotes the division level. When the division level is odd, the regions are in an above-below relationship; when it is even, they are in a left-right relationship.

In the hierarchical tree representation of Fig. 4, a containment relationship holds between division level i and division level i+1 (i = 1, 2, ...), and regions within the same division level are in a left-right relationship (shown by → in the figure) or an above-below relationship (shown by ↓ in the figure).

Next, an example of the attribute values of the region information described for Fig. 4 is explained with reference to Fig. 5.

The classification name in Fig. 5 indicates the region under analysis, a character line, a pixel representation region, a line segment, and so on. Position/size gives the position and size of the region. The logical classification name is used, for example, for labels such as body text and chapter title, as in the aforementioned "Structural Analysis of Page Images Based on the Split Detection Method" by the same applicant. It is also used when character lines are classified together with their character-pitch properties using a character pitch estimation method such as that in "Adaptive Character Segmentation Based on a Minimum Variance Criterion" by the same applicant (Transactions of the Institute of Electronics and Communication Engineers D, '85/8, Vol. J68-D, No. 8, pp. 1497-1504). The inter-region distance is the size of the blank gap between adjacent regions at the same division level.

Here, a description of the document structure, including the flow of text information explained for Fig. 1, must be generated from the segmentation result shown in Fig. 4.

That is, attribute values such as classification symbols must be determined for the undetermined regions in Fig. 4, and new region information, for example the text region T₁ of Fig. 1, must be generated from the segmentation result shown in Fig. 4.

Fig. 6 shows an example of a document structure generated automatically from the segmentation result of Fig. 4.

The document structure description in Fig. 6 is an example generated using the classification name as the attribute value of basic elements such as character lines. The classification name T_i (i = 1, 2, ...) in the figure denotes a text region.

The structuring condition in the case of Fig. 4 is vertical writing. In addition, a new text region is generated only for a run of consecutive regions with classification name S_i (character line).

Whether the document is written vertically or horizontally may be given in advance or determined automatically using conventional techniques. When the regions with classification name S_i account for all the children of a single parent region, no new region needs to be generated.

First, the region information is searched, and the basic elements with the largest division level (regions R₁(4), R₂(4), and R₃(4) in Fig. 4) are retrieved together with their parent region (only R₂(3) in Fig. 4).

Next, among the structuring conditions, the vertical/horizontal writing information and the division level Lev are checked. When Lev is odd, the regions sharing a parent are ordered by linking them, in the order in which the above-below relationship holds, using the above-below pointers shown in Fig. 5. When Lev is even, then in terms of text flow a left-right relationship (shown by ← in Fig. 6) holds for vertical writing, and a left-right relationship holds for horizontal writing. Accordingly, the regions sharing the parent R₂(3) (R₁(4), R₂(4), and R₃(4) in Fig. 4) are ordered by linking them, in the order in which the left-right relationship holds, using the left-right pointers shown in Fig. 5 (R₃(4) → R₂(4) → R₁(4) in Fig. 6). Needless to say, processing equivalent to this reordering by writing direction and division level Lev is easily carried out with the left-right and above-below pointers of each region information record shown in Fig. 5.
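The ordering step can be sketched as a sort keyed on the division level and the writing direction. The assumptions here are that boxes are `(x, y, width, height)` with x growing rightward and y growing downward, and that siblings at an even level are read right to left for vertical writing (following the R₅(2) ... R₁(2) order used in the Fig. 6 walkthrough) and left to right for horizontal writing. The function name is hypothetical, and the patent links regions with pointers rather than sorting a list.

```python
def order_siblings(boxes, lev, vertical_writing):
    """Sort sibling boxes (x, y, w, h) into text-flow order for their division level."""
    if lev % 2 == 1:
        # odd level: above-below relationship, read top to bottom
        return sorted(boxes, key=lambda b: b[1])
    # even level: left-right relationship; reversed for vertical writing
    return sorted(boxes, key=lambda b: b[0], reverse=vertical_writing)


columns = [(0, 0, 10, 50), (20, 0, 10, 50), (40, 0, 10, 50)]
print(order_siblings(columns, lev=4, vertical_writing=True))
print(order_siblings(columns, lev=4, vertical_writing=False))
```

The sorted list corresponds to the chain produced by following the left-right or above-below pointers from the first child.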

Next, text regions are generated according to the structuring condition. In the example of Fig. 4, all three regions have classification name S_i (i = 5, 6, 7), so their parent region R₂(3) itself becomes a text region and is given classification name T₂, and attributes such as the division level Lev and the number of child regions (unchanged in this case) are updated.

This completes the structuring of division level 4 in the case of Fig. 4.

Next, the regions at division level 3 (R₁(3) and R₂(3) in Fig. 6) and their parent region R₁(2) are retrieved. Division level 3 also contains the character regions shown as black dots in Fig. 6, but their parent regions are character lines, which are basic elements, so they are not retrieved.

Likewise, the structuring condition is checked for the two regions R₁(3) and R₂(3) at division level 3. Here R₁(3) and R₂(3) (the latter already given classification symbol T₂) are linked with above-below pointers. Because their classification names F₁ and T₂ do not produce a new region, their parent region R₁(2) is given classification name F*T (where F*T denotes a mixture of a pixel description region and a text region). This completes the structuring of division level 3.

Next, the regions at division level 2 (R₁(2), R₂(2), R₃(2), R₄(2), and R₅(2) in Fig. 6) and their parent region R₁(1) are retrieved.

Likewise, the structuring condition is checked, and left-right pointers are attached in the order R₅(2) through R₁(2) (a right-to-left relationship). Their classification names S₁, S₂, S₃, S₄, and F₁*T₂ are then examined in turn. Here S₁, S₂, S₃, and S₄ satisfy the structuring condition for a text region, and their parent region R₁(1) also contains a region with classification name F*T, so a new region with classification name T₁, shown as a rectangle in Fig. 6, is generated. Next, the left-right pointer between region R₂(2), which has classification symbol S₄, and region R₁(2), which has classification name F*T, is severed (that is, the left-right pointer of R₂(2) is set to NULL), and the address of R₁(2) is placed in the left-right pointer of the region with classification symbol T₁. The address of the T₁ region is stored in the parent region pointer (Fig. 5) of each of the regions S₁, S₂, S₃, and S₄, and the address of the S₁ region is stored, as the first child, in the child region pointer (Fig. 5) of the T₁ region. The division level of the T₁ region is then set to 2, the same as that of its children S₁ ... S₄, and its other attribute values are set. Next, starting from the newly generated T₁ region, the regions in a right-to-left relationship with it (here, region R₁(2) with classification symbol F*T) are retrieved in turn and the structuring condition is checked in the same way. In this case the parent region R₁(1) in Fig. 6 is given classification name F*T. Repeating the same operations yields the document structure description shown in Fig. 6.
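The bottom-up structuring of Fig. 6 can be illustrated with plain child lists instead of pointers. This is a sketch under several assumptions: children are already in text-flow order, a run needs at least two consecutive character lines to become a new text region, and a parent with mixed children is labeled by joining the distinct kinds with `*` (a generalization of the F*T label in the description). The names `Region` and `structure` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    label: str                                   # "S", "F", "T", "F*T", or "?" (undetermined)
    lev: int
    children: list["Region"] = field(default_factory=list)


def structure(parent: Region) -> Region:
    """Label `parent` from its already ordered, already labeled children."""
    kids = parent.children
    if not kids:
        return parent                            # basic element: nothing to structure
    if all(k.label == "S" for k in kids):
        parent.label = "T"                       # the parent itself becomes the text region
        return parent
    grouped, run = [], []
    for k in kids + [None]:                      # None is a sentinel that flushes the last run
        if k is not None and k.label == "S":
            run.append(k)
            continue
        if len(run) >= 2:                        # wrap a run of character lines in a new T region
            grouped.append(Region("T", run[0].lev, run))
        else:
            grouped.extend(run)
        run = []
        if k is not None:
            grouped.append(k)
    parent.children = grouped
    kinds = sorted({part for k in grouped for part in k.label.split("*")})
    parent.label = "*".join(kinds)
    return parent


# Mirror the Fig. 6 walkthrough: level 4 first, then level 3, then level 2.
text2 = structure(Region("?", 3, [Region("S", 4) for _ in range(3)]))
mixed = structure(Region("?", 2, [Region("F", 3), text2]))
page = structure(Region("?", 1, [Region("S", 2) for _ in range(4)] + [mixed]))
print(page.label, [c.label for c in page.children])
```

The new text region takes the division level of its children, as the description specifies for T₁, and the parent of the figure and T₂ ends up labeled F*T.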

Using the tree structure that represents the generated document structure description, as in Fig. 6, it becomes easy to read vertically written text in the prescribed order, or to extract and read a specific region, for example the text below a pixel representation region of approximately a given size.
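Both uses of the tree, reading in order and picking out the text below a figure, reduce to simple traversals. The sketch below assumes children are stored in text-flow order and that each region records the division level at which it sits among its siblings; `Region`, `reading_order`, and `text_below_figures` are hypothetical names, and the size test on the pixel region is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    label: str
    lev: int
    children: list["Region"] = field(default_factory=list)   # already in text-flow order


def reading_order(region: Region) -> list[str]:
    """Depth-first walk that returns character-line labels in reading order."""
    if not region.children:
        return [region.label] if region.label.startswith("S") else []
    return [s for child in region.children for s in reading_order(child)]


def text_below_figures(region: Region) -> list[Region]:
    """Find text regions that directly follow a pixel region in an above-below chain."""
    hits = []
    for prev, cur in zip(region.children, region.children[1:]):
        stacked = cur.lev % 2 == 1                            # odd level: above-below siblings
        is_figure = prev.label.startswith("F") and "*" not in prev.label
        if stacked and is_figure and cur.label.startswith("T"):
            hits.append(cur)
    for child in region.children:
        hits.extend(text_below_figures(child))
    return hits


# A tree shaped like the Fig. 6 description.
t1 = Region("T1", 2, [Region(f"S{i}", 2) for i in range(1, 5)])
t2 = Region("T2", 3, [Region(f"S{i}", 4) for i in range(5, 8)])
page = Region("F*T", 1, [t1, Region("F*T", 2, [Region("F1", 3), t2])])
print(reading_order(page))
print([r.label for r in text_below_figures(page)])
```

A size filter on the pixel region could be added by storing a bounding box on each region and comparing it against the target size.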

Fig. 7 is an example showing the result of segmenting the horizontally written document image of Fig. 2.

The segmentation result in Fig. 7 can be obtained with conventional techniques such as that described in the aforementioned "Structural Analysis of Page Images Based on the Split Detection Method" by the same applicant. Character regions are omitted from the figure.

Fig. 8 is an example of a document structure description generated from the segmentation result of Fig. 7.

The document structure description in Fig. 8 is an example generated using the classification name (S_i, T_i, etc.) and the character pitch value P as attributes of basic elements such as character lines.

The structuring condition is horizontal writing. In addition, a new text region is generated when consecutive regions have classification name S_i and their character pitches are approximately equal (as decided by thresholding).
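The "approximately equal pitch" test can be sketched as follows. The description says only that thresholding is used, so the relative tolerance, the choice to compare against the first line of the current run, and the sample pitch values are all assumptions made for illustration; `group_by_pitch` is a hypothetical name.

```python
def group_by_pitch(pitches: list[float], rel_tol: float = 0.15) -> list[list[int]]:
    """Split consecutive character lines into runs whose pitches are nearly equal.

    Returns lists of line indices. A line joins the current run when its pitch
    is within `rel_tol` (relative) of the pitch of the run's first line.
    """
    groups: list[list[int]] = []
    for i, p in enumerate(pitches):
        ref = pitches[groups[-1][0]] if groups else None
        if ref is not None and abs(p - ref) <= rel_tol * ref:
            groups[-1].append(i)
        else:
            groups.append([i])
    return groups


# Invented pitches for five consecutive character lines.
print(group_by_pitch([20.0, 20.5, 12.0, 12.2, 11.9]))
```

Each resulting run would become its own text region; with the pitch test disabled (a very large tolerance), all five lines would fall into a single run.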

The processing is the same as for Fig. 6, with two differences. First, when the division level is even, the left-right pointers are set according to the left-right relationship (for example, between the regions with classification names T₅ and T₆ in the figure). Second, the child region pointer in the region information stores the address of the first region satisfying the left-right relationship (for example, the child region pointer of the region with classification name T₄ points to the region with classification name T₅).

In FIG. 8, if the character pitch P were not used, the elements with classification names S₁, S₂, S₃, S₄, and S₅ would be represented as a single text area (T₁ in the figure), and the areas with classification names T₂ and T₃ would not be generated.

It is also easy to see that the text areas with classification names T₁ and T₄ differ in their internal structure.

That is, all the elements of the text area with classification name T₁ are related vertically, whereas the text area with classification name T₄ consists of two text areas related horizontally, and the elements within each of those text areas are related vertically.

Thus, as shown in FIG. 8, the document structure description generated by the present invention makes it easy to detect the flow of text information at the level of each basic element, and also makes it readily apparent that a particular part of the page has a multi-column layout.

Although an example using classification names and character pitch has been shown as the structuring conditions for FIGS. 6 and 7, the present invention is not limited to these.

FIG. 9 is a block diagram showing a specific embodiment of the present invention. In the figure, 1 is a document image memory that stores a document image as quantized image information, and 2 is an area dividing unit. As described with reference to FIG. 3, the area dividing unit 2 divides the document image held in the document image memory 1 from global areas into local areas while preserving the vertical and left-right arrangement relationships. The area information obtained during this division process, as shown in FIG. 4 or FIG. 5, is stored sequentially in the structured data storage unit 4.

The area dividing unit 2 also determines from the area division result whether the document image is written vertically or horizontally, and stores the result in the vertical/horizontal information storage unit 3.

The pointer values of each piece of area information stored in the structured data storage unit 4 (parent area pointer, child area pointer, vertical relation pointer, left-right relation pointer, and so on) are expressed as the relative positions of the areas within the structured data storage unit. Each area also stores its own relative position within the structured data storage unit 4 as an attribute value, in its relative position pointer. The relative position counter 11 initially holds the relative position that follows the last area stored in the structured data storage unit 4 by the area dividing unit 2, and each time its value is read it is incremented by one area's worth of attribute values.
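
As a rough illustration of this storage scheme, the following Python sketch models area records whose pointers are relative positions in a store, together with a counter that advances when read. The names `AreaRecord`, `StructuredStore`, and `RelativePositionCounter`, the field names, and the use of list indices as relative positions (one slot per area) are assumptions, not details given in the source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AreaRecord:
    name: str                                # classification name, e.g. "T1" or "S1"
    level: int                               # division level
    self_pos: Optional[int] = None           # relative position pointer (own position)
    parent: Optional[int] = None             # parent area pointer
    child: Optional[int] = None              # child area pointer
    next_vertical: Optional[int] = None      # vertical relation pointer
    next_horizontal: Optional[int] = None    # left-right relation pointer


class StructuredStore:
    """Stands in for the structured data storage unit; positions are list indices."""

    def __init__(self) -> None:
        self.records: List[AreaRecord] = []

    def append(self, area: AreaRecord) -> int:
        area.self_pos = len(self.records)
        self.records.append(area)
        return area.self_pos


class RelativePositionCounter:
    """Returns the next free relative position and advances by one record per read."""

    def __init__(self, start: int) -> None:
        self._next = start

    def read(self) -> int:
        value = self._next
        self._next += 1
        return value
```

In this sketch the counter would be initialised with `len(store.records)` once area division has finished, so that the first newly generated area receives the position just past the last stored one.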

Next, as shown in FIG. 6, the area information control unit 5 retrieves from the structured data storage unit 4, as pairs, the areas to be structured that have the largest division level (hereinafter called child areas, in correspondence with the parent areas described below; initially these are the basic element areas) and their parent areas. At the same time, using the division level and the vertical/horizontal writing information held in the vertical/horizontal information storage unit 3, it stores in the vertical relation pointer or left-right relation pointer of each area the relative position of the area linked to it. It also stores in the child area pointer of each parent area the relative position of the first child area found from that parent area, and stores the result in the target data storage unit 6.

Next, the area information control unit 5 reads a parent area and its child areas from the target data storage unit 6 and transfers them to the structured inspection unit 8.

As described with reference to FIGS. 6 and 8, the structured inspection unit 8 sequentially inspects the attribute values of the child areas in accordance with the condition storage unit 7, which stores the structuring conditions for the attribute values of multiple child areas. When this sequential inspection shows that a new area must be generated, the parent area and child areas are transferred to the area generation unit 9. The area generation unit 9 generates a new child area from the attribute values of those child areas that are to be structured. As described with reference to FIGS. 6 and 8, it sets the child area pointer of the new child area to the relative position of a predetermined one of the child areas to be structured, sets the parent area pointer of each child area to be structured to the relative position of the new child area, and severs the vertical or left-right links between the child areas to be structured and the uninspected child areas. The relative position pointer of the new child area is set to the value read from the relative position counter 11.

The parent area pointer of the new child area stores the relative position of its parent area.
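
The pointer updates performed when a new child area is generated can be sketched as follows. This is an illustrative reading rather than the patent's implementation: the function name `generate_area`, the dictionary layout, the choice of the first grouped child as the "predetermined" child, and severing only the link from the last grouped child are assumptions.

```python
def generate_area(store, parent_pos, group, new_pos, link="vertical"):
    """Create a new child area that takes over the child areas listed in `group`.

    store      : dict mapping relative position -> area record (a dict)
    parent_pos : relative position of the original parent area
    group      : relative positions of the consecutive child areas to be structured
    new_pos    : relative position for the new area (e.g. read from a counter)
    link       : key of the relation pointer in use ("vertical" or "horizontal")
    """
    new_area = {
        "pos": new_pos,        # relative position pointer of the new area
        "parent": parent_pos,  # new area hangs under the original parent
        "child": group[0],     # assumed: first grouped child is the one pointed to
        link: None,
    }
    for pos in group:
        # Grouped children now belong to the new area.
        store[pos]["parent"] = new_pos
    # Sever the link from the grouped children to the uninspected ones.
    store[group[-1]][link] = None
    store[new_pos] = new_area
    return new_area
```

Links inside the group are left untouched, so the grouped children still form a chain that can be followed from the new area's child pointer.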

When the above processing is complete, the area generation unit 9 writes the child areas that were structured as described above to their predetermined relative positions in the structured data storage unit 4. The newly generated child area, the uninspected child areas, and their parent area are then sent back to the structured inspection unit 8, where the same processing is repeated. When the sequential inspection in the structured inspection unit 8 shows that no new area needs to be generated for the child areas, the child areas and their parent area are transferred to the attribute determination unit 10.

The attribute determination unit 10 determines the attribute values of the parent area as described with reference to FIG. 6, and writes the parent area and its child areas to their predetermined relative positions in the structured data storage unit 4.

Next, if pairs of a parent area and child areas at the division level being structured remain in the target data storage unit 6, the area information control unit 5 transfers those pairs one after another to the structured inspection unit 8, as described above.

If, on the other hand, the target data storage unit 6 is empty, the division level is decremented by one, and the pairs of areas to be structured and their parent areas are retrieved from the structured data storage unit 4. The same operation is repeated until the areas at division level 1 have been retrieved as structuring targets. As a result, the structural description of the document stored in the document image memory 1 is generated automatically in the structured data storage unit 4.
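
The overall control flow, working from the deepest division level up to level 1, can be sketched as below. The function name `structure_bottom_up`, the record layout, and the `process_pair` callback (standing in for the inspection, area generation, and attribute determination steps) are hypothetical and only illustrate the ordering described above.

```python
from collections import defaultdict


def structure_bottom_up(areas, process_pair):
    """Visit (parent, children) pairs from the largest division level down to 1.

    areas        : list of dicts with keys "pos", "level", and "parent"
    process_pair : callback taking (parent_pos, child_positions)
    Returns the visiting order as (level, parent_pos, child_positions) tuples.
    """
    max_level = max(area["level"] for area in areas)
    order = []
    for level in range(max_level, 0, -1):
        # Collect the areas at this level, grouped by their parent area.
        pairs = defaultdict(list)
        for area in areas:
            if area["level"] == level:
                pairs[area["parent"]].append(area["pos"])
        # Process every pair at this level before moving one level up.
        for parent_pos, children in pairs.items():
            process_pair(parent_pos, children)
            order.append((level, parent_pos, children))
    return order
```

In this sketch the per-level dictionary plays the role of the target data storage unit: once all of its pairs have been handled, the level is decremented and the next set of pairs is collected.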

(Effect of the Invention) As described above, the document image analysis method of the present invention extracts the structure of a document, including the flow of its text information, by automatically determining the constituent elements of the document image and the arrangement relationships among them as a hierarchical structure. This determination is guided by area information that hierarchically holds the vertical or left-right arrangement relationships and by the attribute values of the basic elements that make up the document. Specific areas defined for various purposes can therefore be extracted easily.

[Brief description of drawings]

FIGS. 1 and 2 illustrate the layouts of document images written vertically and horizontally, respectively. FIG. 3 shows an example of a document area division method that divides areas hierarchically while alternately specifying vertical and left-right division directions. FIGS. 4 and 7 show examples of the area division results for the document images of FIGS. 1 and 2, respectively. FIG. 5 shows an example of area information. FIGS. 6 and 8 show examples in which the document structure is generated automatically from the area division results shown in FIGS. 4 and 7. FIG. 9 is a logical block diagram showing a specific embodiment of the present invention.

In the figures, 1 is a document image memory, 2 is an area dividing unit, 3 is a vertical/horizontal information storage unit, 4 is a structured data storage unit, 5 is an area information control unit, 6 is a target data storage unit, 7 is a condition storage unit, 8 is a structured inspection unit, 9 is an area generation unit, 10 is an attribute determination unit, and 11 is a relative position counter.

Claims

[Claims]

1. Area dividing means which, when a document image is divided into a plurality of basic element areas such as character lines, divides the areas to be divided from global areas into local areas while alternately specifying a uniquely determined vertical or left-right division direction; and document structure generating means which generates and determines, as a hierarchical structure, the elements constituting the document image and the arrangement relationships among those elements, in accordance with the attribute values of the plurality of basic elements and a plurality of pieces of area information that hierarchically hold the vertical or left-right arrangement relationships determined by the division direction.