JPH08202859A

JPH08202859A - Electronic filing apparatus and method thereof

Info

Publication number: JPH08202859A
Application number: JP7013789A
Authority: JP
Inventors: Takayuki Shimizu; 高幸清水
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1995-01-31
Filing date: 1995-01-31
Publication date: 1996-08-09

Abstract

(57) [Abstract]
[Purpose] To make documents easy to identify even when their layout has no distinctive features.
[Constitution] The image data of a read document is divided into areas, one character area is specified from among the divided areas, and image data of a specific size is cut out from the specified character area. The image data of the read document and the cut-out image data are then stored in association with each other, and the stored image data is displayed in response to a search request.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to an electronic filing apparatus, and a method therefor, that searches, displays, prints, and otherwise processes image data of documents stored in association with predetermined management information.

[0002]

[Prior Art] In recent years, electronic filing apparatuses have been announced that store the image data of a document, created for example by reading the document with a scanner, in association with document management information, and that search, display, and print this information. Such apparatuses have conventionally registered the document name, number of pages, registration date, keywords, and the like as document management information, and have displayed this document management information as the means of identifying documents in a document list or a search result list.

[0003] However, it is difficult to grasp the outline of a document from such document management information alone. For this reason, electronic filing apparatuses have been announced that create and register a reduced image of each document and display it in a document list or a search result list.

[0004] Such an electronic filing apparatus includes, for example, a document reading unit 1000, a reduced image creation unit 1001, a document storage unit 1002, and a display control unit 1003, as shown in FIG. 41. Of these, the document reading unit 1000 reads in the image data of a document scanned by a scanner device (not shown) or the like. The reduced image creation unit 1001 creates image data reduced to a suitable size from the image data read by the document reading unit 1000, and the document storage unit 1002 stores the image data of the document read by the document reading unit 1000 in association with the reduced image data created by the reduced image creation unit 1001. The display control unit 1003 controls the display of the document image data and reduced images stored in the document storage unit 1002 and, for example, displays a list of documents as reduced images, as shown in FIG. 42.

[0005] Furthermore, electronic filing apparatuses have recently been developed that store the image data of a created document in association with document management information such as keywords, and that search, display, and print this information. Such keywords previously had to be entered manually by the user, for example when registering a document. More recently, apparatuses have been announced that perform character recognition on the image data of a document, automatically register the character string corresponding to the recognition result in association with the document, and run a full-text search against that character string, so that keywords need not be registered separately.

[0006] In such an electronic filing apparatus, it is possible not only to search the character strings of registered documents but also to extract a specific character string from a document's character string and save it to a separate file or copy it to another application. In that case, the specific character string is selected from the document's character string.

[0007] An example of this type of electronic filing apparatus is the apparatus shown in FIG. 43. It includes a document reading unit 3001, a character recognition unit 3002, a document storage unit 3003, an image display unit 3004 consisting of a display device (not shown) or the like, a character reading unit 3005, a character display unit 3006 consisting of, for example, the same display device as the image display unit 3004, a selection range designation unit 3007 consisting of a pointing device or the like, and a selected character acquisition unit 3008.

[0008] Of the above components, the document reading unit 3001 reads in the image data of a document scanned by, for example, a scanner device, and supplies that image data to the character recognition unit 3002 and the document storage unit 3003. The character recognition unit 3002 performs character recognition on the image data from the document reading unit 3001, extracts the character string written in the document, and supplies the extracted character string to the document storage unit 3003. The document storage unit 3003 receives the image data of the document from the document reading unit 3001 and the character string of the document from the character recognition unit 3002, and stores them in association with other document management information.

[0009] The image display unit 3004 reads the image data of a document from the document storage unit 3003 and displays it. The character reading unit 3005 reads the character string of the document from the document storage unit 3003 and supplies it to the character display unit 3006. The character display unit 3006 receives the character string from the character reading unit 3005, displays it at a predetermined position, and supplies the character string and the display position information of each character string to the selected character acquisition unit 3008. When the operator selects a specific character string, the character display unit 3006 also receives the position information of the selected range from the selection range designation unit 3007 and displays the selected range in reverse video.

[0010] The selection range designation unit 3007 supplies the position information of the selection range designated by the operator on the display screen of the character display unit 3006 to the character display unit 3006 and to the selected character acquisition unit 3008. The selected character acquisition unit 3008 receives the character string of the document and the display position information of each character from the character display unit 3006, receives the position information of the selection range from the selection range designation unit 3007, and acquires the character string within the selection range from the character string of the document.
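The selected character acquisition described in [0009] and [0010] can be illustrated with a small sketch. The source does not say how a character is judged to lie inside the selection range, so the rule used here (the centre of the character's display box falls inside the selection rectangle) is an assumption, and the names `Rect`, `DisplayedChar`, and `acquire_selected_text` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    right: int    # exclusive
    bottom: int   # exclusive

    def contains_point(self, x: float, y: float) -> bool:
        return self.left <= x < self.right and self.top <= y < self.bottom


@dataclass(frozen=True)
class DisplayedChar:
    char: str
    box: Rect     # where the character display unit drew this character


def acquire_selected_text(chars: Iterable[DisplayedChar], selection: Rect) -> str:
    """Return the characters whose display box centre lies in the selection.

    Assumed hit-test rule; the characters are expected in reading order.
    """
    picked = []
    for c in chars:
        centre_x = (c.box.left + c.box.right) / 2
        centre_y = (c.box.top + c.box.bottom) / 2
        if selection.contains_point(centre_x, centre_y):
            picked.append(c.char)
    return "".join(picked)
```

For example, with four characters drawn side by side in boxes 10 pixels wide, a selection rectangle that covers the second and third boxes yields just those two characters.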

[0011] FIG. 44 shows an example of a document display screen in this type of electronic filing apparatus. As shown in the figure, the image display unit 3004 displays the image data of the document at the position indicated by reference numeral 4001, the character display unit 3006 displays the character string of the document at the position indicated by reference numeral 4002, and the selection range designation unit 3007 selects the character string within the range indicated by reference numeral 4003.

[0012]

[Problems to Be Solved by the Invention] In the conventional electronic filing apparatus described above, however, although the outline of a document can be identified from its reduced image in a document list or a search result list, about all that can be identified is the layout of the document as a whole; the characters written in the document cannot be made out. As a result, it is very difficult to distinguish documents with similar layouts, or documents whose layouts have few distinctive features because they contain no large characters or figures.

[0013] Furthermore, in the conventional electronic filing apparatus shown in FIG. 43, a specific character string can be selected from the character string of a document only in the character display unit that displays that character string. In an electronic filing apparatus, however, the operator normally first checks the content of the document in the image display unit, which displays the image data of the document, and therefore inevitably also locates the character string to be selected there. To select a character string, the operator must check the content of the document and the desired character string in the image display unit, then display the character string of the document in the character display unit, and find and select the desired character string all over again. The character selection operation is therefore cumbersome.

[0014] The present invention has been made in view of the above problems, and its object is to provide an electronic filing apparatus, and a method therefor, with which documents can be easily identified in a document list or a search result list even when their layouts are similar or have few distinctive features.

[0015] Another object of the present invention is to provide an electronic filing apparatus, and a method therefor, that can appropriately cut out and display the image data of a character area regardless of the typesetting direction (horizontal or vertical) of the character area.

[0016] A further object of the present invention is to provide an electronic filing apparatus, and a method therefor, with which characters can be easily selected from the character strings written in a document.

[0017]

[Means for Solving the Problems] and [Operation] To achieve the above objects, the present invention provides an electronic filing apparatus for searching a plurality of read documents, comprising: means for performing area division on the image data of a read document; means for discriminating character areas among the divided areas; specifying means for specifying one character area from among a plurality of character areas when there is more than one; cut-out means for cutting out image data of a specific size from the specified character area; storage means for storing the image data of the read document in association with the cut-out image data; and display means for displaying the stored image data in response to the search.

[0018] Another invention provides an electronic filing apparatus for searching a plurality of read documents, comprising: means for performing area division on the image data of a read document; means for discriminating character areas among the divided areas; means for performing character recognition on the image data of the character areas; means for extracting predetermined information on the basis of the character recognition; storage means for storing the image data of the read document in association with the predetermined information; means for displaying the image data of the read document; means for designating a specific range on the displayed image data of the document; and acquisition means for acquiring, from the predetermined information, the character string in the specific range.

[0019] With the above configuration, the apparatus functions so that documents with similar layouts, and documents whose layouts lack distinctive features, can be easily identified, and so that the image data of a character area is appropriately cut out and displayed regardless of the typesetting direction of the character area.

[0020] It also functions so that characters can be easily selected from a character string.

[0021]

[Embodiments] Preferred embodiments of the present invention will now be described in detail with reference to the accompanying drawings.

[First Embodiment] FIG. 1 is a block diagram showing the configuration of an electronic filing apparatus according to a first embodiment of the present invention. The electronic filing apparatus shown in the figure includes a document reading unit 10, an area dividing unit 1, a leading character area specifying unit 2, a character area cut-out unit 3, a document registration unit 4, a file device 5, and a display control unit 6.

[0022] The document reading unit 10 reads the image data of a document scanned by a scanner device (not shown) or the like into its built-in memory. The area dividing unit 1 extracts character areas, graphic areas, and the like from the image data read by the document reading unit 10. The leading character area specifying unit 2 orders the character areas extracted by the area dividing unit 1 and specifies the first character area. The character area cut-out unit 3 then cuts out image data of a specific size from the character area specified by the leading character area specifying unit 2.

[0023] The document registration unit 4 registers the image data of the document read by the document reading unit 10 in the file device 5 in association with the image data of the character area cut out by the character area cut-out unit 3. The file device 5 stores the image data of documents in association with document management information such as keywords, as a database. The display control unit 6 controls the display, on a display (not shown) or the like, of the character area image data stored in the file device 5.

[0024] In addition to the above, the electronic filing apparatus according to this embodiment is provided with various components for implementing functions such as searching and printing the document information stored in the file device 5, but their description is omitted here.

[0025] Next, the document registration processing in the electronic filing apparatus according to this embodiment, configured as described above, will be described with reference to the flowchart shown in FIG. 2.

[0026] First, the electronic filing apparatus starts the processing in response to a predetermined start instruction from the user, reads the image data of a document scanned by a scanner device or the like into memory (step S1), and performs area division on that image data (step S2).

[0027] The area division here divides the image data of the entire document into areas with various attributes, such as text, figures, and tables, and extracts them; it is usually performed as preprocessing for character recognition. Area attributes are broadly classified into character areas and image areas, and for each character area, attribute information such as the position of the area, the size of the area, and the average character size of the characters in the area is extracted. The method of area division is not directly related to the present invention, so its description is omitted.

[0028] Next, it is determined whether the read document contains a character area (step S3). If it does, the character areas are ordered and the first character area is specified (step S4). The first character area is specified because the first part of a document generally contains text that is characteristic of that document.

[0029] The ordering of character areas mentioned above is performed so that the character strings recognized in each divided character area can be joined appropriately, according to their context, and reproduced as a single text. Like the area division, it is performed as preprocessing for character recognition. This ordering method is also not the main point of the present invention, so its description is omitted.

[0030] FIG. 3 is a diagram for explaining the area division of a document and the ordering of character areas in this embodiment. The figure shows image areas and character areas divided and extracted by area division. Image areas are shown as rectangles filled with gray, and character areas as outlined rectangles (frames). The order assigned to the character areas is shown by the number inside each frame.

[0031] If, on the other hand, it is determined in step S3 that the document contains no character area, one suitable image area is specified (step S5). The condition for specifying this image area is chosen as appropriate from the position, size, and so on of the areas.

[0032] Next, based on the position and size of the area specified as described above, which are held as its attribute information, image data whose width and height do not exceed predetermined sizes is cut out starting from the upper-left corner of the specified area (step S6). The cut-out image data is registered in the database in association with the image data of the document, together with other document management information such as the number of pages and the registration date (step S7), and the processing ends.

[0033] The procedure by which the electronic filing apparatus configured as in FIG. 1 displays a list of the image data of character areas or image areas will now be described with reference to the flowchart shown in FIG. 4.

[0034] This processing starts in response to a start instruction from the user. First, the documents to be listed are determined from, for example, the result of classification by document management information such as the registration date and the number of pages (step S11). It is then determined whether any of the documents determined in step S11 has not yet been displayed (step S12). If there is such a document, the image data of the character area or image area stored in association with that document is read (step S13).

[0035] Next, the read image data is displayed at a predetermined position (step S14), after which the processing returns to step S12 and steps S12 to S14 are repeated. When all of the determined documents have been displayed (YES in step S12), the processing ends.
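The list display loop of steps S11 to S14 might look like the sketch below. The source only says that each image is displayed "at a predetermined position", so the fixed grid used here is an assumption, as are the dictionary keys `name` and `snippet` and the function name `lay_out_list`.

```python
from typing import Callable, Dict, List, Tuple


def lay_out_list(
    documents: List[Dict],
    wanted: Callable[[Dict], bool],
    columns: int = 2,
    cell_w: int = 220,
    cell_h: int = 80,
) -> List[Tuple[str, bytes, Tuple[int, int]]]:
    """Return (document name, snippet image, (x, y)) for each listed document."""
    # Step S11: decide which documents to list from their management information.
    selected = [d for d in documents if wanted(d)]
    placements = []
    shown = 0
    while shown < len(selected):          # step S12: any document not yet shown?
        doc = selected[shown]
        snippet = doc["snippet"]          # step S13: read the stored area image
        row, col = divmod(shown, columns)
        # Step S14: place the image in the next grid cell (assumed layout).
        placements.append((doc["name"], snippet, (col * cell_w, row * cell_h)))
        shown += 1
    return placements
```

A caller would pass a predicate over the management information, for example `lambda d: d["pages"] >= 2`, and hand the resulting placements to whatever drawing routine the display uses.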

[0036] FIG. 5 shows an example of the display screen of the document area image list produced by the list display processing described above. As shown in the figure, image data of a character area is displayed for three documents 11, 12, and 14, while image data of an image area is displayed for one document 13 because it has no character area.

[0037] As described above, according to this embodiment, registering the image data of a character area of each document and displaying it makes it possible to make out some of the characters written in the document in a document list or a search result list, so that documents with similar layouts, and documents whose layouts have few distinctive features, can be easily identified.

[0038] Moreover, registering and displaying the image data of the first character area of the document, which generally contains characteristic text, makes the document still easier to identify.

[0039] Furthermore, because the type of image data displayed differs depending on whether or not the document contains a character area, the user can tell at a glance whether a document contains no text.

[Second Embodiment] A second embodiment according to the present invention will now be described.

[0040] FIG. 6 is a block diagram showing the configuration of an electronic filing apparatus according to the second embodiment of the present invention. In the electronic filing apparatus shown in the figure, components having the same functions as those of the first embodiment shown in FIG. 1 are given the same reference numerals, and their description is omitted here.

[0041] The electronic filing apparatus according to this embodiment includes a document reading unit 10, an area dividing unit 1, a maximum character area specifying unit 7, a character area cut-out unit 3, a document registration unit 4, a file device 5, and a display control unit 6. Of these components, the maximum character area specifying unit 7 specifies, from among the character areas produced by the area dividing unit 1, the character area whose average character size, as given in its attribute information, is the largest.

[0042] Next, the document registration procedure in the electronic filing apparatus configured as described above will be described with reference to the flowchart shown in FIG. 7. The processing in steps S21 to S23 and S25 to S27 of FIG. 7 is the same as the processing in steps S1 to S3 and S5 to S7 of the document registration processing according to the first embodiment shown in FIG. 2, so its description is omitted here.

[0043] In step S24 of FIG. 7, one character area is specified from the average character size and the area position that were extracted as attribute information of the character areas in step S22.

[0044] The character area specifying processing in step S24 will now be described in more detail with reference to the flowchart shown in FIG. 8.

[0045] First, the average character sizes of the divided character areas are compared, and the character area with the largest average character size is specified (step S31 in FIG. 8). Next, it is determined whether the character area with the largest average character size has been narrowed down to one (step S32). If it has not, the vertical positions of the character areas specified in step S31 are compared, and the uppermost character area is specified (step S33).

[0046] In step S34, it is determined whether step S33 has narrowed the character areas down to one. If it has not, the horizontal positions of the character areas specified in step S33 are compared, and the leftmost character area is specified (step S35). Because one of these specifying steps always yields a single character area, the processing then proceeds to step S26 of FIG. 7.
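The narrowing procedure of steps S31 to S35 can be written as three successive filters. The sketch below assumes image coordinates in which y increases downward, so the uppermost area has the smallest y, and it compares average character sizes for exact equality, which the source does not specify; `TextRegion` and `pick_largest_text_region` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TextRegion:
    x: int                 # left edge
    y: int                 # top edge (assumed to grow downward)
    avg_char_size: float   # average character size from area division


def pick_largest_text_region(regions: Sequence[TextRegion]) -> TextRegion:
    """Steps S31 to S35: largest average size, then uppermost, then leftmost."""
    if not regions:
        raise ValueError("at least one character area is required")
    # Step S31: keep the areas with the largest average character size.
    largest = max(r.avg_char_size for r in regions)
    candidates = [r for r in regions if r.avg_char_size == largest]
    # Steps S32 and S33: if several remain, keep the uppermost ones.
    if len(candidates) > 1:
        top = min(r.y for r in candidates)
        candidates = [r for r in candidates if r.y == top]
    # Steps S34 and S35: if several still remain, keep the leftmost one.
    if len(candidates) > 1:
        left = min(r.x for r in candidates)
        candidates = [r for r in candidates if r.x == left]
    return candidates[0]
```

The same result could be obtained with a single `min` over the key `(-avg_char_size, y, x)`; the step-by-step form is kept here only because it mirrors the flowchart of FIG. 8.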

[0047] The procedure by which the electronic filing apparatus according to this embodiment, configured as shown in FIG. 6, displays a list of the image data of character areas or image areas is the same as the area image display processing according to the first embodiment shown in FIG. 4, so its description is omitted here.

[0048] Thus, according to this embodiment, in addition to the effects of the first embodiment, documents can be identified more easily by registering and displaying the image data of the character area with the largest character size in the document, which is where a character string characteristic of the document, such as its title, appears.

[Third Embodiment] FIG. 9 is a block diagram showing the configuration of an electronic filing apparatus according to a third embodiment of the present invention. As shown in the figure, the apparatus according to this embodiment includes a document reading unit 11, an area dividing unit 12, a character area specifying unit 13, a character area cut-out unit 14, a document registration unit 15, a file device 16, and a display control unit 17.

【００４９】これらの内、文書読み込み部１１は、スキ
ャナ装置（不図示）などから読み込まれた文書のイメー
ジデータを領域分割部１２と文字領域切り出し部１４、
及び文書登録部１５に供給する。領域分割部１２は、文
書読み込み部１１から文書のイメージデータを受け取
り、そのイメージデータに対して領域分割を行なう。そ
して、文字領域とその属性情報として領域の位置と大き
さ、組方向が抽出され、抽出したすべての文字領域の属
性情報が文字領域特定部１３に供給される。Of these, the document reading unit 11 supplies the image data of a document read from a scanner device (not shown) or the like to the area dividing unit 12, the character area cutout unit 14, and the document registration unit 15. The area dividing unit 12 receives the image data of the document from the document reading unit 11 and performs area division on it. The character areas are extracted together with the position, size, and set direction of each area as attribute information, and the attribute information of all the extracted character areas is supplied to the character area specifying unit 13.

【００５０】文字領域特定部１３は、領域分割部１２か
らすべての文字領域の属性情報を受け取り、それらの属
性情報から１つの文字領域を特定し、特定した文字領域
の属性情報を文字領域切り出し部１４に供給する。この
文字領域切り出し部１４は、文書読み込み部１１から文
書のイメージデータを、また、文字領域特定部１３か
ら、特定された文字領域の属性情報を受け取り、その文
字領域の位置、大きさ、組方向より決定される領域のイ
メージデータを文書のイメージデータから切り出し、そ
のイメージデータと組方向情報を文書登録部１５に供給
する。The character area specifying unit 13 receives the attribute information of all the character areas from the area dividing unit 12, specifies one character area from that attribute information, and supplies the attribute information of the specified character area to the character area cutout unit 14. The character area cutout unit 14 receives the image data of the document from the document reading unit 11 and the attribute information of the specified character area from the character area specifying unit 13, cuts out from the image data of the document the image data of the area determined by the position, size, and set direction of that character area, and supplies the cut-out image data and the set direction information to the document registration unit 15.

【００５１】文書登録部１５は、文書読み込み部１１か
ら文書のイメージデータを、また、文字領域切り出し部
１４から文字領域のイメージデータと組方向情報を受け
取り、これらを他の文書管理情報と関連付けてファイル
装置１６に登録する。なお、ファイル装置１６は、文書
のイメージデータとキーワードなどの文書管理情報とを
関連付けてデータベースとして記憶する。また、表示制
御部１７は、ファイル装置１６にて記憶した組方向情報
と文字領域のイメージデータを読み込み、読み込んだ組
方向情報に従って文字領域のイメージデータをディスプ
レイ（不図示）などへの表示を制御する。The document registration unit 15 receives the image data of the document from the document reading unit 11 and the image data of the character area and the set direction information from the character area cutout unit 14, and registers them in the file device 16 in association with other document management information. The file device 16 stores the image data of the document and document management information such as keywords in association with each other as a database. The display control unit 17 reads the set direction information and the image data of the character area stored in the file device 16, and controls the display of the image data of the character area on a display (not shown) or the like according to the read set direction information.

【００５２】なお、本実施例に係る電子ファイリング装
置においても、上記以外に、ファイル装置１６に記憶さ
れた文書の情報に対して、検索、印刷などの機能を実現
するための種々の構成要素が設けられているが、ここで
は、それらの説明を省略する。In addition to the above, the electronic filing apparatus according to the present embodiment also has various components for realizing functions such as search and print for the information of the document stored in the file apparatus 16. Although provided, the description thereof will be omitted here.

【００５３】次に、上記のように構成をとる、本実施例
に係る電子ファイリング装置での文書登録処理の動作に
ついて、図１０に示すフローチャートに従って説明す
る。Next, the operation of the document registration process in the electronic filing apparatus according to this embodiment having the above-mentioned configuration will be described with reference to the flowchart shown in FIG.

【００５４】まず、ユーザーの開始指示により処理を開
始し、処理スキャナ装置等（不図示）で読み込んだ文書
のイメージデータをメモリ上に読み込む（ステップＳ４
１）。そして、そのイメージデータに対して領域分割を
行ない、文字領域とその属性情報として位置、大きさ、
組方向を抽出する（ステップＳ４２）。First, processing is started by a start instruction from the user, and the image data of a document read by a scanner device or the like (not shown) is read into memory (step S41). Then, area division is performed on the image data, and the character areas are extracted together with their position, size, and set direction as attribute information (step S42).

【００５５】ここでの領域分割とは、文書全体のイメー
ジデータから、文章、図形、表などの様々な属性の領域
とその属性情報を抽出するものであり、文字認識の前処
理として行なわれている。なお、領域分割の方法につい
ては、詳細な説明を省略する。The area division here extracts areas of various attributes, such as text, figures, and tables, together with their attribute information from the image data of the entire document, and is performed as preprocessing for character recognition. A detailed description of the area division method is omitted.

【００５６】続く処理では、抽出したすべての文字領域
から最も左上に位置する１つの文字領域を特定する（ス
テップＳ４３）。この文字領域を特定する方法として
は、他にも、文字領域の属性情報から文字領域の順序付
けを行ない、最初の文字領域を選択したり、平均文字サ
イズが最大の文字領域を選択する方法などがあるが、こ
こでは、それらの詳細な説明を省略する。In the subsequent processing, one character area located at the upper left is specified from all the extracted character areas (step S43). Other methods to specify this character area include ordering the character areas from the attribute information of the character area and selecting the first character area or selecting the character area with the largest average character size. However, detailed description thereof will be omitted here.
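
As an illustration only, the selection in step S43 might be sketched in Python as follows. The description does not define "upper left-most" precisely, so this sketch assumes it means the topmost area, with ties broken by the leftmost position; the names Region and pick_upper_left are hypothetical and do not appear in the original.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Attribute information of one character area (hypothetical layout)."""
    x: int          # horizontal position of the upper left corner
    y: int          # vertical position of the upper left corner
    w: int          # width
    h: int          # height
    direction: str  # "horizontal" or "vertical"


def pick_upper_left(regions):
    # Assumption: compare the vertical position first, then the horizontal one.
    return min(regions, key=lambda r: (r.y, r.x))


regions = [
    Region(50, 10, 100, 20, "horizontal"),
    Region(5, 10, 30, 20, "horizontal"),
    Region(0, 40, 200, 300, "vertical"),
]
chosen = pick_upper_left(regions)
```

Other criteria mentioned in the text, such as the largest average character size, would only change the key function.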

【００５７】次に、特定された文字領域の位置、大き
さ、組方向から、あらかじめ定められた大きさを超えな
い幅と高さの領域を決定し、文書のイメージデータから
その領域のイメージデータを切り出す（ステップＳ４
４）。そして、切り出したイメージデータと文字領域の
組方向情報を、ページ数、登録日などの他の文書管理情
報とともに、文書のイメージデータと関連付けてデータ
ベースに登録し（ステップＳ４５）、本処理を終了す
る。Next, an area whose width and height do not exceed predetermined sizes is determined from the position, size, and set direction of the specified character area, and the image data of that area is cut out from the image data of the document (step S44). Then, the cut-out image data and the set direction information of the character area are registered in the database in association with the image data of the document, together with other document management information such as the number of pages and the registration date (step S45), and this processing ends.

【００５８】以下、図１１に示すフローチャートに従
い、本実施例における文書登録処理の内、図１０のステ
ップＳ４４での文字領域切り出し処理の詳細を説明す
る。Hereinafter, the details of the character area cutout processing in step S44 of FIG. 10 in the document registration processing of this embodiment will be described with reference to the flowchart shown in FIG.

【００５９】まず、上記の領域分割により抽出された組
方向情報により、文字領域が横書きか否かを判定し（ス
テップＳ５１）、それが横書きであれば、横書きに適し
た切り出し領域を確定する（ステップＳ５２）。また、
縦書きであれば、縦書きに適した切り出し領域を確定す
る（ステップＳ５３）。First, based on the set direction information extracted by the above area division, it is determined whether or not the character area is written horizontally (step S51). If it is written horizontally, a cutout area suitable for horizontal writing is determined (step S52); if it is written vertically, a cutout area suitable for vertical writing is determined (step S53).

【００６０】次に、上記のように確定された領域のイメ
ージデータを文書のイメージデータから切り出し（ステ
ップＳ５４）、その後、図１０のステップＳ４５の処理
に移行する。Next, the image data of the area determined as described above is cut out from the image data of the document (step S54), and then the process proceeds to step S45 of FIG.
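
The cut-out in step S54 can be pictured with a minimal Python sketch. It assumes, purely for illustration, that the image data is held as a list of pixel rows; the function name crop is hypothetical.

```python
def crop(image, x, y, w, h):
    """Return the w-by-h sub-image whose upper left corner is at (x, y).

    The image is assumed to be a list of rows, each row a list of pixels,
    with x increasing to the right and y increasing downward.
    """
    return [row[x:x + w] for row in image[y:y + h]]


# A 4-row, 5-column test image whose pixel value encodes (row, column).
image = [[r * 10 + c for c in range(5)] for r in range(4)]
patch = crop(image, 1, 2, 3, 2)
```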

【００６１】図１２は、図１１のステップＳ５２の横書
きの文字領域の切り出し領域確定処理の詳細フローチャ
ートである。なお、ここでは、座標系は、水平方向は右
方向が正で、垂直方向は下方向が正であるとする。ま
た、領域分割により抽出された文字領域の左上隅の水平
位置をｘ１、垂直位置をｙ１、幅をｗ１、高さをｈ１と
し、切り出す領域の左上隅の水平位置をｘ、垂直位置を
ｙ、幅をｗ、高さをｈとし、さらに、横書きの文字領域
から切り出す領域の幅の上限をａ、高さの上限をｂとす
る。これらの上限ａ，ｂは、あらかじめ定められた値、
あるいは、ユーザーが定める値である。FIG. 12 is a detailed flowchart of the cutout area determination processing for a horizontally written character area in step S52 of FIG. 11. Here, the coordinate system is positive to the right in the horizontal direction and positive downward in the vertical direction. The horizontal position of the upper left corner of the character area extracted by the area division is x1, its vertical position is y1, its width is w1, and its height is h1; the horizontal position of the upper left corner of the area to be cut out is x, its vertical position is y, its width is w, and its height is h. Further, the upper limit of the width of the area cut out from a horizontally written character area is a, and the upper limit of its height is b. These upper limits a and b are either predetermined values or values set by the user.

【００６２】最初の処理として、まず、ｘをｘ１に、ｙ
をｙ１に設定する（ステップＳ６１）。次に、ｗ１がａ
を超えているかどうかを判別し（ステップＳ６２）、ｗ
１がａを超えている場合は、ｗをａに設定し（ステップ
Ｓ６３）、ｗ１がａを超えていない場合には、ｗをｗ１
に設定する（ステップＳ６４）。First, x is set to x1 and y is set to y1 (step S61). Next, it is determined whether or not w1 exceeds a (step S62). If w1 exceeds a, w is set to a (step S63); if w1 does not exceed a, w is set to w1 (step S64).

【００６３】そして、ｈ１がｂを超えているか否かを判
別し（ステップＳ６５）、ｈ１がｂを超えている場合
は、ｈをｂに設定し（ステップＳ６６）、ｈ１がｂを超
えていない場合は、ｈをｈ１に設定して（ステップＳ６
７）、処理を図１１のステップＳ５４に処理を移す。Then, it is determined whether or not h1 exceeds b (step S65). If h1 exceeds b, h is set to b (step S66); if h1 does not exceed b, h is set to h1 (step S67). The process then moves to step S54 in FIG. 11.
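
Steps S61 to S67 amount to keeping the upper left corner and clamping the width and height to the limits a and b. The following Python sketch illustrates this under the symbols defined above; the function name cutout_horizontal and the tuple return value are assumptions made for the example.

```python
def cutout_horizontal(x1, y1, w1, h1, a, b):
    """Determine the cutout area (x, y, w, h) for a horizontally written area."""
    x, y = x1, y1                # S61: keep the upper left corner
    w = a if w1 > a else w1      # S62 to S64: limit the width to a
    h = b if h1 > b else h1      # S65 to S67: limit the height to b
    return x, y, w, h


# A wide, short title area clipped to at most 200 x 100.
area = cutout_horizontal(10, 20, 300, 50, 200, 100)
```

Because horizontal text starts at the upper left, clipping from that corner keeps the beginning of the character string.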

【００６４】図１３は、上記のステップＳ５２での処理
により、文書中の横書きの文字領域２０において切り出
し領域２１が確定された様子を示す図である。FIG. 13 is a diagram showing a state in which the cutout area 21 is fixed in the horizontally written character area 20 in the document by the processing in the above step S52.

【００６５】図１４は、図１１のステップＳ５３での縦
書きの文字領域の切り出し領域確定処理の動作を、さら
に詳細に説明するためのフローチャートである。なお、
ここでは、縦書きの文字領域から切り出す領域の幅の上
限をｃ、高さの上限をｄとする。これらの上限ｃ，ｄ
は、あからじめ定められた値、あるいは、ユーザーが定
める値である。また、その他の記号は、図１２における
記号と同じ意味を有する。FIG. 14 is a flowchart explaining in more detail the cutout area determination processing for a vertically written character area in step S53 of FIG. 11. Here, the upper limit of the width of the area cut out from a vertically written character area is c, and the upper limit of its height is d. These upper limits c and d are either predetermined values or values set by the user. The other symbols have the same meanings as in FIG. 12.

【００６６】図１４において、まず、ｙをｙ１に設定す
る（ステップＳ７１）。次に、ｗ１がｃを超えているか
どうかを判別し（ステップＳ７２）、ｗ１がｃを超えて
いる場合は、ｘをｘ１＋ｗ１−ｃに、また、ｗをｃに設
定する（ステップＳ７３）。また、ｗ１がｃを超えてい
ない場合は、ｘをｘ１に、ｗをｗ１に設定する（ステッ
プＳ７４）。In FIG. 14, first, y is set to y1 (step S71). Next, it is determined whether or not w1 exceeds c (step S72). When w1 exceeds c, x is set to x1 + w1-c and w is set to c (step S73). If w1 does not exceed c, x is set to x1 and w is set to w1 (step S74).

【００６７】そして、次に、ｈ１がｄを超えているか否
を判別し（ステップＳ７５）、ｈ１がｄを超えている場
合は、ｈをｄに設定し（ステップＳ７６）、ｈ１がｄを
超えていない場合には、ｈをｈ１に設定して（ステップ
Ｓ７７）、処理を図１１のステップＳ５４に移す。Then, it is determined whether h1 exceeds d (step S75). If h1 exceeds d, h is set to d (step S76), and h1 exceeds d. If not, h is set to h1 (step S77), and the process proceeds to step S54 in FIG.
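
Steps S71 to S77 differ from the horizontal case only in that an over-wide area is clipped from its right edge, which is where vertically written text begins. A minimal Python sketch, with the hypothetical function name cutout_vertical, might look like this:

```python
def cutout_vertical(x1, y1, w1, h1, c, d):
    """Determine the cutout area (x, y, w, h) for a vertically written area."""
    y = y1                       # S71: keep the top edge
    if w1 > c:                   # S72
        x, w = x1 + w1 - c, c    # S73: keep the rightmost c units of width
    else:
        x, w = x1, w1            # S74
    h = d if h1 > d else h1      # S75 to S77: limit the height to d
    return x, y, w, h


# A 300-wide vertical area clipped to at most 100 x 400.
area = cutout_vertical(10, 20, 300, 500, 100, 400)
```

In this example the right edge of the result, x + w = 310, coincides with the right edge of the original area, x1 + w1 = 310.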

【００６８】図１５は、上記ステップＳ５３での処理に
より、文書中の縦書きの文字領域２５において切り出し
領域２６が確定された様子を示したものである。FIG. 15 shows a state in which the cutout area 26 has been determined in the vertically written character area 25 in the document by the processing in step S53.

【００６９】次に、本実施例に係る電子ファイリング装
置における文字領域のイメージデータを一覧表示する処
理手順について、図１６に示すフローチャートに従って
説明する。Next, a processing procedure for displaying a list of image data of character areas in the electronic filing apparatus according to this embodiment will be described with reference to the flowchart shown in FIG.

【００７０】まず、ユーザーからの開始指示により処理
を開始し、文書の登録日やページ数などの文書管理情報
での分類分けなどの結果から、一覧表示する文書を確定
する（ステップＳ８１）。次に、ステップＳ８１で確定
した文書で、まだ表示していない文書があるかどうかを
判別し（ステップＳ８２）、まだ表示していない文書が
あれば、その文書に関連付けられて記憶されている組方
向情報と文字領域のイメージデータを読み込む（ステッ
プＳ８３，Ｓ８４）。First, the process is started by a start instruction from the user, and the documents to be listed are determined from, for example, the result of classification by document management information such as the registration date or the number of pages (step S81). Next, it is determined whether any of the documents determined in step S81 has not yet been displayed (step S82). If there is such a document, the set direction information and the image data of the character area stored in association with that document are read (steps S83 and S84).

【００７１】続く処理では、読み込んだイメージデータ
を縦書きと横書きとで分けて所定の位置に表示し（ステ
ップＳ８５）、ステップＳ８２の処理に戻る。そして、
ステップＳ８２〜ステップＳ８５の処理を繰り返し行な
い、ステップＳ８２で、確定したすべての文書の表示を
完了したと判定されたならば、本処理を終了する。In the subsequent processing, the read image data is displayed at a predetermined position, with vertically and horizontally written data kept separate (step S85), and the process returns to step S82. Steps S82 to S85 are repeated, and when it is determined in step S82 that all of the determined documents have been displayed, this processing ends.
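
The loop of steps S82 to S85 can be sketched in Python as a simple partition by set direction. The record layout (document id, direction, image) and the name layout_list are assumptions for illustration; actual drawing on a display is outside the scope of the sketch.

```python
def layout_list(records):
    """Group character-area images by set direction for the list display.

    Each record is assumed to be a (doc_id, direction, image) tuple, where
    direction is "vertical" or "horizontal".
    """
    columns = {"vertical": [], "horizontal": []}
    for doc_id, direction, image in records:        # S82 to S84: next document
        columns[direction].append((doc_id, image))  # S85: place by direction
    return columns


columns = layout_list([
    (1, "vertical", "imgA"),
    (2, "horizontal", "imgB"),
    (3, "vertical", "imgC"),
])
```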

【００７２】図１７は、上記のような一覧表示処理によ
り表示される文書領域画像一覧の表示画面の一例であ
る。同図は、ウィンドウの左側に縦書きの文字領域のイ
メージデータが表示され、右側に横書きの文字領域のイ
メージデータが表示されている様子を示している。FIG. 17 is an example of a display screen of the document area image list displayed by the list display processing as described above. In the figure, the image data of the vertically written character area is displayed on the left side of the window, and the image data of the horizontally written character area is displayed on the right side.

【００７３】以上説明したように、本実施例によれば、
上記第１実施例における効果に加えて、読み込んだ文書
のイメージデータに対して領域分割を行ない、文字領域
の位置、大きさ、組方向を抽出し、文字領域の組み方向
に適した領域のイメージデータを切り出して、記憶し、
表示することで、文字領域の組方向にかかわらず文字領
域のイメージデータを適切に切り出し、表示することが
できる。As described above, according to this embodiment, in addition to the effects of the first embodiment, area division is performed on the image data of the read document, the position, size, and set direction of the character area are extracted, and the image data of an area suited to the set direction of the character area is cut out, stored, and displayed. As a result, the image data of the character area can be cut out and displayed appropriately regardless of its set direction.

【００７４】また、組方向を記憶し、組方向ごとに分け
て文字領域のイメージデータを表示することで、識別し
やすい状態にて文字領域のイメージデータを表示するこ
とができる。［第４実施例］図１８は、本発明の第４の実施例に係る
電子ファイリング装置の構成を示すブロック図である。
同図に示す電子ファイリング装置は、文書読み込み部３
１、領域分割部３２、文字認識部３３、文書登録部３
４、ファイル装置３５、文字情報読み込み部３７、不図
示のディスプレイ装置などからなるイメージ表示部３
６、ポインティングデバイスなどからなる選択範囲指定
部３８、選択文字取得部３９を備えている。Further, by storing the set direction and displaying the image data of the character areas separately for each set direction, the image data of the character areas can be displayed in an easily identifiable state. [Fourth Embodiment] FIG. 18 is a block diagram showing the configuration of an electronic filing apparatus according to the fourth embodiment of the present invention. The electronic filing apparatus shown in the figure includes a document reading unit 31, an area dividing unit 32, a character recognition unit 33, a document registration unit 34, a file device 35, a character information reading unit 37, an image display unit 36 including a display device (not shown) or the like, a selection range designation unit 38 including a pointing device or the like, and a selected character acquisition unit 39.

【００７５】文書読み込み部３１は、スキャナ装置など
から読み込まれた文書のイメージデータを領域分割部３
２と文書登録部３４に供給する。この領域分割部３２
は、文書読み込み部３１から文書のイメージデータに対
して領域分割を行ない、すべての文字領域の左上隅の水
平位置、垂直位置、右下隅の水平位置、垂直位置とから
なる文字領域の位置情報と、文字領域の組方向（縦書き
／横書き）を文字領域情報として抽出し、抽出した文字
領域情報を文書のイメージデータとともに文字認識部３
３に供給する。The document reading unit 31 supplies the image data of a document read from a scanner device or the like to the area dividing unit 32 and the document registration unit 34. The area dividing unit 32 performs area division on the image data of the document received from the document reading unit 31, and extracts, as character area information, the position information of every character area (the horizontal and vertical positions of its upper left corner and the horizontal and vertical positions of its lower right corner) and the set direction of the character area (vertical writing or horizontal writing). It supplies the extracted character area information to the character recognition unit 33 together with the image data of the document.

【００７６】文字認識部３３は、領域分割部３２から文
書のイメージデータとすべての文字領域の文字領域情報
を受け取り、すべての文字領域のイメージデータに対し
て文字認識を行ない、各文字領域に記されている文字列
と、各文字のイメージデータの領域の左上隅の水平位
置、垂直位置、右下隅の水平位置、垂直位置を文字列情
報として抽出し、抽出した文字列情報を文字領域情報と
ともに文書登録部３４に供給する。The character recognition unit 33 receives the image data of the document and the character area information of all the character areas from the area dividing unit 32, and performs character recognition on the image data of all the character areas. It extracts, as character string information, the character string written in each character area and, for the image data area of each character, the horizontal and vertical positions of its upper left corner and of its lower right corner, and supplies the extracted character string information to the document registration unit 34 together with the character area information.

【００７７】文書登録部３４は、文書読み込み部３１か
ら文書のイメージデータを、また、文字認識部３３から
すべての文字領域の文字領域情報と文字列情報を受け取
り、これらを他の文書管理情報と関連付けてファイル装
置３５に登録する。このファイル装置３５は、文書のイ
メージデータと、文書名や登録日などの文書管理情報と
を関連付けてデータベースとして記憶する。The document registration unit 34 receives the image data of the document from the document reading unit 31 and the character area information and character string information of all the character areas from the character recognition unit 33, and registers them in the file device 35 in association with other document management information. The file device 35 stores the image data of the document and document management information such as the document name and the registration date in association with each other as a database.

【００７８】また、文字情報読み込み部３７は、ファイ
ル装置３５からすべての文字領域の文字領域情報と文字
列情報を読み込み、読み込んだ文字領域情報と文字列情
報をイメージ表示部３６と選択文字取得部３９に供給す
る。イメージ表示部３６は、ファイル装置３５から文書
のイメージデータを読み込み、そのイメージデータを表
示するとともに、文字情報読み込み部３７からすべての
文字領域の文字領域情報と文字列情報と受け取る。そし
て、オペレータがイメージ表示部３６の表示画面上で特
定の２点からなる範囲を指定した場合は選択範囲指定部
３８から指定範囲を受け取り、指定範囲を含む文字領域
のフレームを表示するとともに、指定範囲に含まれる文
字列の部分を反転表示する。The character information reading unit 37 reads the character area information and character string information of all the character areas from the file device 35, and supplies them to the image display unit 36 and the selected character acquisition unit 39. The image display unit 36 reads the image data of a document from the file device 35 and displays it, and receives the character area information and character string information of all the character areas from the character information reading unit 37. When the operator designates a range defined by two points on the display screen of the image display unit 36, the image display unit 36 receives the designated range from the selection range designation unit 38, displays the frame of the character area containing the designated range, and highlights the part of the character string included in the designated range.

【００７９】選択範囲指定部３８は、オペレータがイメ
ージ表示部３６の表示画面上で指定した範囲を取得し、
取得した指定範囲をイメージ表示部３６と選択文字取得
部３９に供給する。選択文字取得部３９は、文字情報読
み込み部３７からすべての文字領域の文字領域の文字領
域情報と文字列情報を受け取り、また、選択範囲指定部
３８から指定範囲を受け取って、指定位置を含む文字領
域の文字列を取得する。The selection range designation unit 38 acquires the range designated by the operator on the display screen of the image display unit 36, and supplies the acquired range to the image display unit 36 and the selected character acquisition unit 39. The selected character acquisition unit 39 receives the character area information and character string information of all the character areas from the character information reading unit 37 and the designated range from the selection range designation unit 38, and acquires the character string of the character area containing the designated position.

【００８０】なお、上記の電子ファイリング装置には、
上記以外にも、ファイル装置３５に記憶された文書の情
報に対して、検索、印刷などの機能を実現するための種
々の構成要素が設けられているが、ここでは、それらの
説明を省略する。In addition to the above, the electronic filing apparatus is provided with various components for realizing functions such as search and printing on the information of the documents stored in the file device 35, but their description is omitted here.

【００８１】次に、上記のような構成をとる、本実施例
に係る電子ファイリング装置における文書登録処理の動
作について、図１９に示すフローチャートに従って説明
する。Next, the operation of the document registration processing in the electronic filing apparatus according to this embodiment having the above configuration will be described with reference to the flowchart shown in FIG.

【００８２】まず、オペレータからの開始指示により処
理を開始、例えば、処理スキャナ装置などで読み込んだ
文書のイメージデータをメモリ上に読み込む（ステップ
Ｓ９１）。そして、読み込んだイメージデータに対して
領域分割を行ない、文字領域情報として、文書のイメー
ジデータのすべての文字領域の左上隅の水平位置、垂直
位置、右下隅の水平位置、垂直位置と、その組方向（縦
書き／横書き）を抽出する（ステップＳ９２）。First, processing is started by a start instruction from the operator, and the image data of a document read by, for example, a scanner device is read into memory (step S91). Area division is then performed on the read image data, and the horizontal and vertical positions of the upper left corner, the horizontal and vertical positions of the lower right corner, and the set direction (vertical writing or horizontal writing) of every character area in the image data of the document are extracted as character area information (step S92).

【００８３】図２０は、本実施例における領域分割の動
作を説明するための図である。同図は、文書のイメージ
データに対して領域分割を行ない、縦書きの文字領域と
して領域４０を抽出し、横書きの文字領域として領域４
１，４２を抽出した様子を示している。また、同図にお
いては、文字領域４２の左上隅の水平位置をleft、垂直
位置をtop、右下隅の水平位置をright、垂直位置をbott
omとして示している。FIG. 20 is a diagram for explaining the area division operation in this embodiment. It shows a state in which area division has been performed on the image data of a document, area 40 has been extracted as a vertically written character area, and areas 41 and 42 have been extracted as horizontally written character areas. In the figure, the horizontal position of the upper left corner of character area 42 is shown as left, its vertical position as top, the horizontal position of the lower right corner as right, and its vertical position as bottom.

【００８４】なお、領域分割は、文書のイメージデータ
から、文章、図形、表などの様々な属性の領域とその属
性情報を抽出するものであり、通常、文字認識の前処理
として行なわれるが、その領域分割の方法については、
詳細な説明を省略する。Area division extracts areas of various attributes, such as text, figures, and tables, together with their attribute information from the image data of a document, and is usually performed as preprocessing for character recognition. A detailed description of the area division method is omitted.

【００８５】次に、抽出した文字領域のイメージデータ
に対して文字認識を行ない、文字列情報として、各文字
イメージデータの領域の左上隅の水平位置、垂直位置、
右上隅の水平位置、垂直位置と、その文字コードを抽出
する（ステップＳ９３）。Next, character recognition is performed on the image data of the extracted character areas, and the horizontal and vertical positions of the upper left corner of the area of each character's image data, the horizontal and vertical positions of its upper right corner, and its character code are extracted as character string information (step S93).

【００８６】図２１は、本実施例における文字認識によ
り抽出される各文字のイメージデータの領域を説明する
ための図である。同図では、「さしすせそ」の各文字の
領域として５０，５１，５２，５３，５４を抽出した様
子を示している。また、「さ」の文字の領域の左上隅の
水平位置をleft、垂直位置をtop，右下隅の水平位置をr
ight、垂直位置をbottomとして示している。なお、ここ
での文字認識の方法については、周知の技術を使用すれ
ばよいので、その詳細な説明を省略する。FIG. 21 is a diagram for explaining the image data area of each character extracted by character recognition in this embodiment. It shows a state in which areas 50, 51, 52, 53, and 54 have been extracted for the characters of 「さしすせそ」 (sa, shi, su, se, so). The horizontal position of the upper left corner of the area of the character 「さ」 is shown as left, its vertical position as top, the horizontal position of the lower right corner as right, and its vertical position as bottom. Since a well-known technique may be used for character recognition here, its detailed description is omitted.

【００８７】本実施例に係る文書登録処理では、上記の
ステップＳ９２により抽出した文字領域情報と、ステッ
プＳ９３により抽出した文字列情報とから、図２２に示
すような文字情報データを作成し（ステップＳ９４）、
作成した文書の文字情報データを、ページ数、登録日な
どの他の文書管理情報とともに、文書のイメージデータ
と関連付けてデータベースに登録して（ステップＳ９
５）、本処理を終了する。In the document registration processing according to this embodiment, character information data as shown in FIG. 22 is created from the character area information extracted in step S92 and the character string information extracted in step S93 (step S94). The created character information data is registered in the database in association with the image data of the document, together with other document management information such as the number of pages and the registration date (step S95), and this processing ends.
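
The layout of the character information data in FIG. 22 is not reproduced in this text, so the following Python sketch is only one plausible arrangement that combines the character area information of step S92 with the character string information of step S93. The class names CharInfo and CharRegionInfo and all field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CharInfo:
    """One recognized character: its code and bounding box."""
    code: str
    left: int
    top: int
    right: int
    bottom: int


@dataclass
class CharRegionInfo:
    """One character area: bounding box, set direction, and its characters."""
    left: int
    top: int
    right: int
    bottom: int
    direction: str  # "horizontal" or "vertical"
    chars: list = field(default_factory=list)

    def text(self):
        # Character codes in reading order form the character string.
        return "".join(c.code for c in self.chars)


region = CharRegionInfo(0, 0, 100, 20, "horizontal", [
    CharInfo("さ", 0, 0, 20, 20),
    CharInfo("し", 20, 0, 40, 20),
])
```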

【００８８】次に、図１８に示すように構成された、本
実施例に係る電子ファイリング装置での文書表示処理の
動作について、図２３に示すフローチャートに従って説
明する。Next, the operation of the document display processing in the electronic filing apparatus according to the present embodiment configured as shown in FIG. 18 will be described with reference to the flowchart shown in FIG.

【００８９】まず、オペレータの開始指示により処理を
開始し、オペレータが文書の一覧表示画面において文書
を選択するなどして、表示する文書を確定する（ステッ
プＳ１００）。次に、このステップＳ１００で確定した
文書に関連付けられて記憶されている文書のイメージデ
ータと文字情報データをデータベースから読み込み（ス
テップＳ１０１）、読み込んだ文書のイメージデータを
表示して（ステップＳ１０２）、本処理を終了する。First, the process is started in response to an operator's start instruction, and the operator selects a document on the document list display screen to determine the document to be displayed (step S100). Next, the image data and the character information data of the document stored in association with the document determined in step S100 are read from the database (step S101), the image data of the read document is displayed (step S102), This process ends.

【００９０】以下、本実施例に係る電子ファイリング装
置における文書の文字列の選択処理の動作について、図
２４に示すフローチャートを参照して説明する。The operation of the process for selecting the character string of a document in the electronic filing apparatus according to this embodiment will be described below with reference to the flowchart shown in FIG.

【００９１】最初にオペレータが、上記の文書表示処理
（図２３参照）により文書のイメージデータを表示後、
その表示画面上でポインタの位置を所望の位置に合わせ
てポインティングデバイス（不図示）のボタンを押すこ
とにより処理を開始する。そして、そのときのポインタ
の水平位置と垂直位置を、それぞれバッファｓｔａｒｔ
Ｐｏｓ．ｘとｓｔａｒｔＰｏｓ．ｙに取得し（ステップ
Ｓ１１０）、点（ｓｔａｒｔＰｏｓ．ｘ，ｓｔａｒｔＰ
ｏｓ．ｙ）を含む文字領域があるかどうかを、文字情報
データの各文字領域の位置情報から判別する（ステップ
Ｓ１１１）。First, after the image data of a document has been displayed by the above document display processing (see FIG. 23), the operator starts the processing by moving the pointer to a desired position on the display screen and pressing the button of a pointing device (not shown). The horizontal and vertical positions of the pointer at that time are acquired in the buffers startPos.x and startPos.y, respectively (step S110), and it is determined from the position information of each character area in the character information data whether there is a character area containing the point (startPos.x, startPos.y) (step S111).
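
The containment check of step S111 can be illustrated with a short Python sketch. It assumes each character area is given as a (left, top, right, bottom) tuple and treats the edges as inside the area; the function name find_region is hypothetical.

```python
def find_region(regions, px, py):
    """Return the index of the first character area containing (px, py).

    Returns None when no area contains the point, which corresponds to
    ending the selection processing without doing anything.
    """
    for index, (left, top, right, bottom) in enumerate(regions):
        if left <= px <= right and top <= py <= bottom:
            return index
    return None


regions = [(0, 0, 100, 50), (0, 60, 100, 200)]
hit = find_region(regions, 30, 80)
miss = find_region(regions, 150, 80)
```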

【００９２】上記の点（ｓｔａｒｔＰｏｓ．ｘ，ｓｔａ
ｒｔＰｏｓ．ｙ）を含む文字領域がある場合には、その
文字領域のフレームを文字領域に位置情報により表示し
（ステップＳ１１２）、その文字領域の各文字の位置情
報から、その文字領域内の行を抽出して、行情報データ
を作成する（ステップＳ１１３）。そして、作成した行
情報データとその文字領域の各文字の位置情報から、点
（ｓｔａｒｔＰｏｓ．ｘ，ｓｔａｒｔＰｏｓ．ｙ）によ
り指定される文字を確定し、その文字番号をバッファｓ
ｅｌｅｃｔ１に取得する（ステップＳ１１４）。If there is a character area containing the point (startPos.x, startPos.y), the frame of that character area is displayed based on the position information of the character area (step S112), and the lines in the character area are extracted from the position information of each character in it to create line information data (step S113). Then, the character designated by the point (startPos.x, startPos.y) is determined from the created line information data and the position information of each character in the character area, and its character number is acquired in the buffer select1 (step S114).
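
The text does not spell out how step S114 maps a point to a character, so the following Python sketch shows one possible approach for a horizontally written area: first find the line whose vertical extent contains the point, then the character in that line whose horizontal extent contains it. The tuple layouts and the name char_at are assumptions; character numbers are 1-based as in the description.

```python
def char_at(lines, chars, px, py):
    """Return the 1-based number of the character at (px, py), or None.

    lines: list of (start, end, top, bottom), with 1-based character numbers.
    chars: list of (left, top, right, bottom), one entry per character.
    """
    for start, end, top, bottom in lines:
        if top <= py <= bottom:                  # the point lies in this line
            for number in range(start, end + 1):
                left, _, right, _ = chars[number - 1]
                if left <= px <= right:          # the point lies in this character
                    return number
    return None


chars = [(0, 0, 10, 10), (10, 0, 20, 10), (0, 12, 10, 22)]
lines = [(1, 2, 0, 10), (3, 3, 12, 22)]
select1 = char_at(lines, chars, 15, 5)
```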

【００９３】次に、オペレータがポインティングデバイ
スのボタンを押したまま、ポインタを移動することによ
り、そのときのポインタの水平位置と垂直位置を、それ
ぞれバッファｅｎｄＰｏｓ．ｘとｅｎｄＰｏｓ．ｙに取
得し（ステップＳ１１５）、上記のステップＳ１１４と
同様の処理で、点（ｅｎｄＰｏｓ．ｘ，ｅｎｄＰｏｓ．
ｙ）により指定される文字を確定し、その文字番号をバ
ッファｓｅｌｅｃｔ２に取得する（ステップＳ１１
６）。Next, when the operator moves the pointer while holding down the button of the pointing device, the horizontal and vertical positions of the pointer at that time are acquired in the buffers endPos.x and endPos.y, respectively (step S115). By the same processing as in step S114, the character designated by the point (endPos.x, endPos.y) is determined, and its character number is acquired in the buffer select2 (step S116).

【００９４】そして、ｓｅｌｅｃｔ１とｓｅｌｅｃｔ２
の間の行の領域を反転表示し（ステップＳ１１７）、オ
ペレータがポインティングデバイスのボタンを離したか
否かを判別する（ステップＳ１１８）。ここで、オペレ
ータがポインティングデバイスのボタンを離していない
場合は、処理をステップＳ１１５に戻し、オペレータが
ポインティングデバイスのボタンを離すまで、ステップ
Ｓ１１５からステップＳ１１８の処理を繰り返し行な
う。Then, the area of the lines between select1 and select2 is highlighted (step S117), and it is determined whether or not the operator has released the button of the pointing device (step S118). If the operator has not released the button, the process returns to step S115, and steps S115 to S118 are repeated until the operator releases the button.

【００９５】オペレータがポインティングデバイスのボ
タンを離した場合、文字列情報からｓｅｌｅｃｔ１とｓ
ｅｌｅｃｔ２の間の文字コードをバッファｓｔｒｉｎｇ
にコピーし（ステップＳ１１９）、本処理を終了する。
なお、上記のステップＳ１１１で、点（ｓｔａｒｔＰｏ
ｓ．ｘ，ｓｔａｒｔＰｏｓ．ｙ）を含む文字領域がない
と判断された場合は、何ら処理をせずに終了する。When the operator releases the button of the pointing device, the character codes between select1 and select2 are copied from the character string information into the buffer string (step S119), and this processing ends. If it is determined in step S111 that there is no character area containing the point (startPos.x, startPos.y), the processing ends without doing anything.
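
Step S119 can be sketched in Python as a slice over the recognized character codes. The sketch assumes that character numbers are 1-based, that both end characters are included, and that a drag made in the reverse direction should select the same characters; none of these details is stated in the description, and the name copy_selection is hypothetical.

```python
def copy_selection(codes, select1, select2):
    """Return the characters numbered select1 through select2, inclusive.

    codes is the recognized character string of the character area.
    """
    first, last = sorted((select1, select2))  # assumption: order-independent
    return "".join(codes[first - 1:last])


string = copy_selection("さしすせそ", 2, 4)
```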

【００９６】ここで、図２５に示すフローチャートに従
い、上記の文字列選択処理における行情報データの作成
処理（ステップＳ１１３）の動作について、さらに詳細
に説明する。The operation of the line information data creation process (step S113) in the character string selection process will now be described in more detail with reference to the flowchart shown in FIG.

【００９７】なお、ここでは、座標系は、水平方向は右
方向が正で、垂直方向は下方向が正であるとする。ま
た、文字列情報におけるｉ番目の文字の上の位置をｃｈ
ａｒ（ｉ）．ｔｏｐ、下の位置をｃｈａｒ（ｉ）．ｂｏ
ｔｔｏｍとし、作成する行情報データのｉ番目の行の先
頭文字の番号をｌｉｎｅ（ｉ）．ｓｔａｒｔ、最終文字
の番号をｌｉｎｅ（ｉ）．ｅｎｄ、上の位置をｌｉｎｅ
（ｉ）．ｔｏｐ、下の位置をｌｉｎｅ（ｉ）．ｂｏｔｔ
ｏｍとする。さらに、行番号のカウンタをｌ、文字番号
のカウンタをｍ，ｎとし、抽出中の行の上の位置をｔｏ
ｐ、下の位置をｂｏｔｔｏｍとする。Here, the coordinate system is positive to the right in the horizontal direction and positive downward in the vertical direction. The top position of the i-th character in the character string information is char(i).top and its bottom position is char(i).bottom. For the i-th line of the line information data to be created, the number of its first character is line(i).start, the number of its last character is line(i).end, its top position is line(i).top, and its bottom position is line(i).bottom. Further, the line number counter is l, the character number counters are m and n, and the top and bottom positions of the line being extracted are top and bottom, respectively.

【００９８】まず、文字領域の組方向が横書きかどうか
を判別し（ステップＳ１２０）、横書きの場合は、カウ
ンタｌ，ｍ，ｎを１に、ｔｏｐをｃｈａｒ（１）．ｔｏ
ｐに、ｂｏｔｔｏｍをｃｈａｒ（１）．ｂｏｔｔｏｍに
初期設定する（ステップＳ１２１）。そして、次に、ｃ
ｈａｒ（ｎ）．ｔｏｐがｂｏｔｔｏｍより下にあるかど
うかを判別し（ステップＳ１２２）、ｃｈａｒ（ｎ）．
ｔｏｐがｂｏｔｔｏｍより下にある場合は、その文字は
次の行にあるものとし、その行の行情報としてｌｉｎｅ
（ｌ）．ｓｔａｒｔにｍを、ｌｉｎｅ（ｌ）．ｅｎｄに
ｎ−１を、ｌｉｎｅ（ｌ）．ｔｏｐにｔｏｐを、ｌｉｎ
ｅ（ｌ）．ｂｏｔｔｏｍにｂｏｔｔｏｍを入れる（ステ
ップＳ１２３）。First, it is determined whether or not the set direction of the character area is horizontal writing (step S120). In the case of horizontal writing, the counters l, m, and n are initialized to 1, top to char(1).top, and bottom to char(1).bottom (step S121). Next, it is determined whether or not char(n).top is below bottom (step S122). If char(n).top is below bottom, the character is regarded as being on the next line, and the line information of the current line is recorded by setting line(l).start to m, line(l).end to n-1, line(l).top to top, and line(l).bottom to bottom (step S123).

【００９９】次に、次の行情報を抽出するため、上記の
ｌを１インクリメントし（ステップＳ１２４）、ｍにｎ
を、ｔｏｐにｃｈａｒ（ｎ）．ｔｏｐを、ｂｏｔｔｏｍ
にｃｈａｒ（ｎ）．ｂｏｔｔｏｍに入れる（ステップＳ
１２５）。Next, in order to extract the next line information, l is incremented by 1 (step S124), and m is set to n, top to char(n).top, and bottom to char(n).bottom (step S125).

【０１００】しかし、ｃｈａｒ（ｎ）．ｔｏｐがｂｏｔ
ｔｏｍより下にない場合（ステップＳ１２２での判断が
ＮＯ）は、その文字は、まだ同じ行にあるものとして、
ｃｈａｒ（ｎ）．ｔｏｐがｔｏｐより上にあるかどうか
を判別し（ステップＳ１２６）、ｃｈａｒ（ｎ）．ｔｏ
ｐがｔｏｐより上にある場合にのみ、ｔｏｐにｃｈａｒ
（ｎ）．ｔｏｐを入れる（ステップＳ１２７）。そし
て、ｃｈａｒ（ｎ）．ｂｏｔｔｏｍがｂｏｔｔｏｍより
下にある場合にのみ、ｂｏｔｔｏｍにｃｈａｒ（ｎ）．
ｂｏｔｔｏｍを入れる（ステップＳ１２９）。However, if char(n).top is not below bottom (NO in step S122), the character is taken to be still on the same line, and it is determined whether char(n).top is above top (step S126). Only if char(n).top is above top is top set to char(n).top (step S127). Then, only if char(n).bottom is below bottom is bottom set to char(n).bottom (step S129).

【０１０１】ステップＳ１３０では、その文字領域にｎ
＋１番目の文字があるかどうかを判別し、ｎ＋１番目の
文字ある場合、次の文字を調べるためにｎを１インクリ
メントして（ステップＳ１３１）、処理をステップＳ１
２２に戻す。そして、最後の文字を調べるまで、ステッ
プＳ１２２からステップＳ１３１の処理を繰り返し行な
う。しかし、その文字領域にｎ＋１番目の文字がない場
合には、その行の行情報としてｌｉｎｅ（ｌ）．ｓｔａ
ｒｔにｍを、ｌｉｎｅ（ｌ）．ｅｎｄにｎを、ｌｉｎｅ
（ｌ）．ｔｏｐにｔｏｐを、ｌｉｎｅ（ｌ）．ｂｏｔｔ
ｏｍにｂｏｔｔｏｍを入れ（ステップＳ１３２）、処理
を、図２４のステップＳ１１４に移す。In step S130, it is determined whether the character area contains an (n+1)-th character. If it does, n is incremented by 1 to examine the next character (step S131), and the process returns to step S122. The processing from step S122 to step S131 is repeated until the last character has been examined. If the character area has no (n+1)-th character, the line information for the current line is stored: m in line(l).start, n in line(l).end, top in line(l).top, and bottom in line(l).bottom (step S132), and the process moves to step S114 in FIG. 24.
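
The line extraction of steps S121 to S132 can be sketched in Python as follows. This is only an illustrative sketch, not the implementation of the patent: the names `CharBox`, `LineInfo`, and `build_line_info` are hypothetical, characters are assumed to arrive in reading order with top and bottom coordinates (y increasing downward), and character numbers are 1-based as in the text. The counter l is implicit in the position of each record in the returned list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CharBox:
    top: int     # y of the upper edge (y grows downward)
    bottom: int  # y of the lower edge


@dataclass
class LineInfo:
    start: int   # number of the first character in the line (1-based)
    end: int     # number of the last character in the line (1-based)
    top: int
    bottom: int


def build_line_info(chars: List[CharBox]) -> List[LineInfo]:
    """Group horizontally written characters into lines (steps S121 to S132)."""
    if not chars:
        return []
    lines: List[LineInfo] = []
    m = 1                                            # S121: first character of the line
    top, bottom = chars[0].top, chars[0].bottom      # S121
    for n in range(1, len(chars) + 1):
        c = chars[n - 1]
        if c.top > bottom:                           # S122: character starts below the line
            lines.append(LineInfo(m, n - 1, top, bottom))    # S123
            m, top, bottom = n, c.top, c.bottom      # S124, S125
        else:                                        # still on the same line
            top = min(top, c.top)                    # S126, S127
            bottom = max(bottom, c.bottom)           # S129
    lines.append(LineInfo(m, len(chars), top, bottom))       # S132
    return lines
```

A vertically written area (step S133) could presumably be handled the same way with left and right edges in place of top and bottom, which is how the text describes it.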

【０１０２】一方、ステップＳ１２０の判別で、組方向
が横書きでない場合は、縦書きの文字領域として行情報
データを作成し（ステップＳ１３３）、その後、図２４
のステップＳ１１４に処理を移す。なお、このステップ
Ｓ１３３での処理は、上記のステップＳ１２１からステ
ップＳ１３２の横書き文字領域の行情報データの作成処
理と同様の方法で行なえるので、ここでは、その説明を
省略する。On the other hand, if it is determined in step S120 that the writing direction is not horizontal, line information data is created for the area as a vertically written character area (step S133), and the process then moves to step S114 in FIG. 24. The processing in step S133 can be performed in the same manner as the creation of line information data for a horizontally written character area in steps S121 to S132 described above, so its description is omitted here.

【０１０３】図２６は、上記のような行情報データ作成
処理により作成される行情報データの構成を示したもの
である。同図に示すように、各行の行情報は、その行の
先頭文字の番号、最終文字の番号、上の位置、下の位置
からなり、すべての行についてこれらの情報を持つ。FIG. 26 shows a structure of line information data created by the above-described line information data creating process. As shown in the figure, the line information of each line includes the number of the first character, the number of the last character, the upper position, and the lower position of the line, and all the lines have this information.

【０１０４】また、図２７，図２８は、横書きの文字領
域から実際に作成される行情報データを説明するための
図である。図２７では、横書きで２５文字を含む文字領
域６０において、１番目から７番目の文字まで、８番目
から１２番目の文字まで、１３番目から２１番目の文字
まで、そして、２２番目から２５番目の文字までをそれ
ぞれ行として抽出し、そのときの行の上の位置をｔｏｐ
１からｔｏｐ４とし、行の下の位置をｂｏｔｔｏｍ１か
らｂｏｔｔｏｍ４として抽出した様子を示している。ま
た、図２８は、図２７に示す文字領域から作成される行
情報データの内容を示している。FIGS. 27 and 28 are diagrams for explaining the line information data actually created from a horizontally written character area. FIG. 27 shows a character area 60 containing 25 horizontally written characters, in which the 1st to 7th characters, the 8th to 12th characters, the 13th to 21st characters, and the 22nd to 25th characters are each extracted as a line, with the top positions of the lines extracted as top1 to top4 and the bottom positions as bottom1 to bottom4. FIG. 28 shows the contents of the line information data created from the character area shown in FIG. 27.
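
For the 25-character example of FIG. 27, the resulting line information data can be pictured as four records. The sketch below is illustrative only: the start and end numbers come from the text, while the pixel values standing in for top1 to top4 and bottom1 to bottom4, and the helper `line_of`, are made-up assumptions.

```python
# Line information for character area 60 of FIG. 27.
# start/end follow the text; the top/bottom pixel values are invented placeholders.
LINE_INFO = [
    {"start": 1,  "end": 7,  "top": 10,  "bottom": 30},   # top1, bottom1
    {"start": 8,  "end": 12, "top": 40,  "bottom": 60},   # top2, bottom2
    {"start": 13, "end": 21, "top": 70,  "bottom": 90},   # top3, bottom3
    {"start": 22, "end": 25, "top": 100, "bottom": 120},  # top4, bottom4
]


def line_of(char_number: int) -> int:
    """Return the 1-based number of the line that contains the given character."""
    for l, info in enumerate(LINE_INFO, start=1):
        if info["start"] <= char_number <= info["end"]:
            return l
    raise ValueError(f"character {char_number} is not in the character area")
```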

【０１０５】次に、図２９に示すフローチャートに従
い、上記の文字列選択処理（図２４参照）におけるステ
ップＳ１１４及びステップＳ１１６の指定文字の確定処
理の動作について、さらに詳細に説明する。Next, with reference to the flow chart shown in FIG. 29, the operation of the designated character confirmation process of steps S114 and S116 in the above character string selection process (see FIG. 24) will be described in more detail.

【０１０６】なお、ここでは、文字列情報におけるｉ番
目の文字の左の位置をｃｈａｒ（ｉ）．ｌｅｆｔ、右の
位置をｃｈａｒ（ｉ）．ｒｉｇｈｔとする。また、ポイ
ンティングデバイスの位置をｐｏｉｎｔ．ｘ，ｐｏｉｎ
ｔ．ｙ、確定した文字の番号をｓｅｌｅｃｔとする。他
の記号については、図２５に示すフローチャートでの説
明における記号と同じである。Here, the left position of the i-th character in the character string information is char(i).left and its right position is char(i).right. The position of the pointing device is point.x, point.y, and the number of the confirmed character is select. The other symbols are the same as those used in the description of the flowchart shown in FIG. 25.

【０１０７】まず、文字領域の組方向が横書きかどうか
を判別し（ステップＳ１４０）、それが横書きの場合に
は、ｌを１に初期設定する（ステップＳ１４１）。そし
て、ｐｏｉｎｔ．ｙがｌｉｎｅ（ｌ）．ｔｏｐより上に
あるかどうかを判別し（ステップＳ１４２）、ｐｏｉｎ
ｔ．ｙがｌｉｎｅ（ｌ）．ｔｏｐより上にない場合は、
次に、ｐｏｉｎｔ．ｙがｌｉｎｅ（ｌ）．ｔｏｐとｌｉ
ｎｅ（ｌ）．ｂｏｔｔｏｍの間にあるかどうかを判別す
る（ステップＳ１４３）。First, it is determined whether the writing direction of the character area is horizontal (step S140). If it is horizontal, l is initialized to 1 (step S141). It is then determined whether point.y is above line(l).top (step S142). If point.y is not above line(l).top, it is next determined whether point.y lies between line(l).top and line(l).bottom (step S143).

【０１０８】上記のｐｏｉｎｔ．ｙがｌｉｎｅ（ｌ）．
ｔｏｐとｌｉｎｅ（ｌ）．ｂｏｔｔｏｍの間にない場合
は、ｌ＋１番目の行があるか否かを判別し（ステップＳ
１４４）、ｌ＋１番目の行がある場合には、次の行を調
べるために、ｌを１インクリメントして（ステップＳ１
４５）、処理をステップＳ１４２に戻す。そして、最後
の行を調べるまで、ステップＳ１４２からステップＳ１
４５の処理を繰り返し行ない、ｐｏｉｎｔ．ｙが最後の
行よりも下にある場合、ｓｅｌｅｃｔに文字領域の最後
の文字の番号ｌｉｎｅ（ｌ）．ｅｎｄを入れて（ステッ
プＳ１４６）、次のステップに処理を移す。If point.y does not lie between line(l).top and line(l).bottom, it is determined whether there is an (l+1)-th line (step S144). If there is, l is incremented by 1 to examine the next line (step S145), and the process returns to step S142. The processing from step S142 to step S145 is repeated until the last line has been examined. If point.y is below the last line, select is set to line(l).end, the number of the last character in the character area (step S146), and the process moves to the next step.

【０１０９】しかし、ｐｏｉｎｔ．ｙがｌｉｎｅ
（ｌ）．ｔｏｐより上にある場合には、ｓｅｌｅｃｔに
文字領域の最初の文字の番号ｌｉｎｅ（ｌ）．ｓｔａｒ
ｔを入れ（ステップＳ１４７）、次のステップに処理を
移す。However, if point.y is above line(l).top, select is set to line(l).start, the number of the first character in the character area (step S147), and the process moves to the next step.

【０１１０】一方、ｐｏｉｎｔ．ｙがｌｉｎｅ（ｌ）．
ｔｏｐとｌｉｎｅ（ｌ）．ｂｏｔｔｏｍの間にある場合
（ステップＳ１４３での判定がＹＥＳ）は、ｍにｌｉｎ
ｅ（ｌ）．ｓｔａｒｔを入れ（ステップＳ１４８）、次
の処理にて、ｐｏｉｎｔ．ｘがｃｈａｒ（ｍ）．ｒｉｇ
ｈｔより左にあるかどうかを判別する（ステップＳ１４
９）。そして、ｐｏｉｎｔ．ｘがｃｈａｒ（ｍ）．ｒｉ
ｇｈｔより左にない場合（ステップＳ１４９での判断が
ＮＯ）は、１番目の行にｍ＋１番目の文字があるか否か
を判別し（ステップＳ１５０）、そこにｍ＋１番目の文
字がある場合には、次の文字を調べるためにｍを１イン
クリメントしてから（ステップＳ１５１）、ステップＳ
１４９の処理に戻る。行の最後の文字を調べるまで、ス
テップＳ１４９からステップＳ１５１の処理を繰り返し
行なう。On the other hand, if point.y lies between line(l).top and line(l).bottom (YES in step S143), m is set to line(l).start (step S148), and it is then determined whether point.x is to the left of char(m).right (step S149). If point.x is not to the left of char(m).right (NO in step S149), it is determined whether the l-th line contains an (m+1)-th character (step S150). If it does, m is incremented by 1 to examine the next character (step S151), and the process returns to step S149. The processing from step S149 to step S151 is repeated until the last character in the line has been examined.

【０１１１】ｐｏｉｎｔ．ｘがｃｈａｒ（ｍ）．ｒｉｇ
ｈｔより左にある場合、または、ｍ＋１番目の文字がな
い場合には、ｓｅｌｅｃｔにｍを入れ（ステップＳ１５
２）、次のステップ（図２４における指定文字の確定処
理に続く処理）に処理を移す。If point.x is to the left of char(m).right, or if there is no (m+1)-th character, select is set to m (step S152), and the process moves to the next step (the processing that follows the designated character confirmation process in FIG. 24).
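
The designated character confirmation of steps S141 to S152 can be sketched as follows. This is a hedged illustration rather than the patent's code: `CharBox`, `LineInfo`, and `confirm_char` are hypothetical names, "between top and bottom" is assumed to include both edges, and the line information is assumed to have been built beforehand with 1-based character numbers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CharBox:
    left: int
    right: int


@dataclass
class LineInfo:
    start: int   # number of the first character in the line (1-based)
    end: int     # number of the last character in the line (1-based)
    top: int
    bottom: int


def confirm_char(x: int, y: int, chars: List[CharBox], lines: List[LineInfo]) -> int:
    """Return the number (select) of the character designated by the point (x, y)."""
    if not lines:
        raise ValueError("the character area has no lines")
    for line in lines:                       # l = 1, 2, ... (S141, S144, S145)
        if y < line.top:                     # S142: point is above this line
            return line.start                # S147
        if y <= line.bottom:                 # S143: point is inside this line
            m = line.start                   # S148
            # S149 to S151: advance while the point is not left of char(m).right
            # and the line still has an (m+1)-th character.
            while x >= chars[m - 1].right and m < line.end:
                m += 1
            return m                         # S152
    return lines[-1].end                     # S146: point is below the last line
```

Calling `confirm_char` once for each of the two designated points yields the pair of character numbers between which the character string is taken.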

【０１１２】ステップＳ１４０での判別で、組方向が縦
書きでない場合は、縦書きの文字領域として指定文字を
確定し（ステップＳ１５３）、次のステップに処理を移
す。なお、ステップＳ１５３での処理は、上記ステップ
Ｓ１４１からステップＳ１５２による横書きの文字領域
の指定文字の確定処理と同様の方法で行なえるので、こ
こでは、その説明を省略する。If it is determined in step S140 that the writing direction is not horizontal, the designated character is confirmed for the area as a vertically written character area (step S153), and the process moves to the next step. The processing in step S153 can be performed in the same manner as the designated character confirmation process for a horizontally written character area in steps S141 to S152 described above, so its description is omitted here.

【０１１３】図３０は、上記のような文字列選択処理の
表示画面の一例である。同図では、オペレータが文書の
イメージデータの表示画面上で、符号７０にて示される
文字領域内の範囲７１の文字列を指定した様子を示して
いる。FIG. 30 is an example of a display screen of the character string selection processing as described above. The figure shows a state in which the operator designates a character string in a range 71 within the character area indicated by reference numeral 70 on the display screen of the image data of the document.

【０１１４】以上説明したように、本実施例によれば、
抽出した文字領域情報と文字列情報とから文字情報デー
タを作成し、文書のイメージデータと文字情報データと
を関連付けて記憶するとともに、表示された文書のイメ
ージデータ上で２点を指定し、最初に指定した点を含む
文字領域内の行情報データを作成して指定点の文字を確
定し、確定した２つの文字の間の行の領域を反転表示
し、確定した２つの文字の間の文字列を取得すること
で、文書の文字列の表示画面上であらためて選択したい
文字列を探して選択することなく、文書のイメージデー
タの表示画面上で文書の文字列からの文字選択を容易に
行なうことができる。As described above, according to this embodiment, character information data is created from the extracted character area information and character string information, and the image data of the document and the character information data are stored in association with each other. Two points are designated on the displayed image data of the document, line information data is created for the character area containing the first designated point, the character at each designated point is confirmed, the line areas between the two confirmed characters are highlighted, and the character string between the two confirmed characters is acquired. As a result, characters can easily be selected from the document's character string on the display screen of the document's image data, without having to search for and select the desired character string again on a display screen of the document's character string.

【０１１５】また、特定の文字領域を確定し、その文字
領域内の任意の文字列を選択することで、異なる文字領
域にまたがった、文章として意味をなさない文字列を選
択することなく、任意の文字列を容易に選択することが
できる。＜変形例１＞以下、上記第４実施例の変形例１について
説明する。Further, by confirming a specific character area and selecting an arbitrary character string within that character area, an arbitrary character string can be selected easily without selecting a character string that spans different character areas and makes no sense as text. <Modification 1> Modification 1 of the fourth embodiment will be described below.

【０１１６】図３１は、本変形例に係る電子ファイリン
グ装置の構成を示すブロック図である。なお、同図にお
いて、図１８に示す上記第４実施例に係る装置と同一構
成要素には同一符号を付し、ここでは、簡単にそれらを
説明する。FIG. 31 is a block diagram showing the structure of the electronic filing apparatus according to this modification. In the figure, the same components as those of the device according to the fourth embodiment shown in FIG. 18 are designated by the same reference numerals, and they will be briefly described here.

【０１１７】図３１において、領域分割部３２は、すべ
ての文字領域の左上隅の水平位置、垂直位置、幅、高さ
からなる領域情報を抽出する。また、文字認識部３３
は、すべての文字領域の領域情報と文字列を文書登録部
３４に供給する。さらに、イメージ表示部３６は、文書
のイメージデータとずべての文字領域のフレームを表示
するとともに、イメージ表示部３６の表示画面上の特定
位置を指定した場合は、文字領域指定部７５から指定位
置を受け取り、その指定位置を含む文字領域のフレーム
を強調表示する。In FIG. 31, the area dividing unit 32 extracts area information consisting of the horizontal position and vertical position of the upper left corner, the width, and the height of every character area. The character recognition unit 33 supplies the area information and character strings of all the character areas to the document registration unit 34. Further, the image display unit 36 displays the image data of the document together with the frames of all the character areas, and when a specific position on its display screen is designated, it receives the designated position from the character area designation unit 75 and highlights the frame of the character area containing that position.

【０１１８】次に、上記のような構成をとる、本変形例
に係る電子ファイリング装置における文書登録処理の動
作について、図３２に示すフローチャートに従って説明
する。Next, the operation of the document registration process in the electronic filing apparatus according to this modification, configured as described above, will be described with reference to the flowchart shown in FIG. 32.

【０１１９】まず、オペレータからの開始指示により処
理を開始、例えば、処理スキャナ装置などで読み込んだ
文書のイメージデータをメモリ上に読み込む（ステップ
Ｓ１６１）。そして、読み込んだイメージデータに対し
て領域分割を行ない、文字領域情報として、文書のイメ
ージデータのすべての文字領域の左上隅の水平位置、垂
直位置、幅、高さからなる領域情報を抽出する（ステッ
プＳ１６２）。First, processing starts in response to a start instruction from the operator, and the image data of a document read by, for example, a scanner device is loaded into memory (step S161). Area division is then performed on the loaded image data, and area information consisting of the horizontal position and vertical position of the upper left corner, the width, and the height of every character area in the document's image data is extracted as character area information (step S162).

【０１２０】なお、ここでの領域分割は、上記第１実施
例において説明した領域分割と同様であるため、その説
明を省略する。Since the area division here is the same as the area division described in the first embodiment, the description thereof will be omitted.

【０１２１】次に、抽出した文字領域のイメージデータ
に対して文字認識を行ない、その文字領域に記されてい
る文字列を抽出し、図３３に示すような、文字領域の領
域情報と文字列からなる文字情報データを作成する（ス
テップＳ１６３）。そして、ステップＳ１６３で作成し
た文書の文字情報データを、ページ数、登録日などの他
の文書管理情報とともに、文書のイメージデータと関連
付けてデータベースに登録し（ステップＳ１６４）、本
処理を終了する。Next, character recognition is performed on the image data of the extracted character areas to extract the character string written in each character area, and character information data consisting of the area information and character string of each character area, as shown in FIG. 33, is created (step S163). The character information data created in step S163 is then registered in the database in association with the image data of the document, together with other document management information such as the number of pages and the registration date (step S164), and the process ends.
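
The character information data of FIG. 33 and its registration in step S164 can be pictured roughly as below. This is a hedged sketch: the record fields follow those named in the text, but the class names, the in-memory dictionary standing in for the database, and the function `register_document` are hypothetical, and area division and character recognition themselves are not shown.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class TextArea:
    x: int        # horizontal position of the upper left corner
    y: int        # vertical position of the upper left corner
    width: int
    height: int
    text: str     # character string recognized in this area


@dataclass
class DocumentRecord:
    image: bytes                      # image data of the document
    pages: int                        # document management information
    registered: date
    areas: List[TextArea] = field(default_factory=list)  # character information data


# Hypothetical stand-in for the database of the file device.
DATABASE: Dict[str, DocumentRecord] = {}


def register_document(doc_id: str, image: bytes, pages: int,
                      areas: List[TextArea],
                      registered: Optional[date] = None) -> DocumentRecord:
    """Store the image, management information, and character information together (S164)."""
    record = DocumentRecord(image, pages, registered or date.today(), list(areas))
    DATABASE[doc_id] = record
    return record
```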

【０１２２】なお、本変形例における文書表示処理は、
図２３に示す、上記第４実施例に係る装置における文書
表示処理と同じであるため、ここでは、その説明を省略
する。The document display process in this modification is the same as the document display process in the apparatus according to the fourth embodiment shown in FIG. 23, so its description is omitted here.

【０１２３】次に、図３１に示すような構成を有する電
子ファイリング装置における文書の文字列の選択処理の
動作について、図３４に示すフローチャートに従って説
明する。Next, the operation of the document character string selection process in the electronic filing apparatus configured as shown in FIG. 31 will be described with reference to the flowchart shown in FIG. 34.

【０１２４】まず、オペレータの開始指示により処理を
開始し、文書表示処理により読み込まれた文字情報デー
タの文字領域の領域情報をもとに、すべての文字領域の
矩形領域のフレームを文書のイメージデータ上に重ねて
表示する（ステップＳ１７１）。そして、オペレータが
文書のイメージデータの表示画面上でポインタを所望の
位置に合わせてポインティングデバイスのボタンを押し
たかどうかを判別し（ステップＳ１７２）、それを押し
ていない場合は、オペレータがそのボタンを押すまで、
ステップＳ１７２の処理を繰り返し行なう。First, processing starts in response to the operator's start instruction, and, based on the area information of the character areas in the character information data read by the document display process, the rectangular frames of all the character areas are displayed superimposed on the image data of the document (step S171). It is then determined whether the operator has moved the pointer to a desired position on the display screen of the document's image data and pressed the button of the pointing device (step S172). If the button has not been pressed, step S172 is repeated until the operator presses it.

【０１２５】オペレータが文書のイメージデータの表示
画面上でポインティングデバイスのボタンを押した場合
は、ボタンを押した位置をバッファselectPointに取得
する（ステップＳ１７３）。When the operator presses the button of the pointing device on the display screen of the image data of the document, the position where the button is pressed is acquired in the buffer selectPoint (step S173).

【０１２６】次に、文字情報データに、まだ、以下に述
べるステップＳ１７４以降の処理をしていない文字情報
データがあるか否かを判別し（ステップＳ１７４）、未
処理の文字情報データがある場合は、まず、１番目の文
字領域の領域情報を取得して（ステップＳ１７５）、そ
の水平位置、垂直位置、高さ、幅からなる文字領域がse
lectPointを含むかどうかを判別する（ステップＳ１７
６）。Next, it is determined whether the character information data still contains an entry that has not undergone the processing from step S174 onward described below (step S174). If there is an unprocessed entry, the area information of the character area is acquired, starting with the first character area (step S175), and it is determined whether the character area defined by its horizontal position, vertical position, height, and width contains selectPoint (step S176).

【０１２７】上記の文字領域が指定点selectPointを含
まない場合は、処理をステップＳ１７４に戻し、次の文
字領域に対して、ステップＳ１７４からステップＳ１７
６の処理を繰り返し行ない、最後の文字領域の領域情報
まで調べたならば、本処理を終了する。If the character area does not contain the designated point selectPoint, the process returns to step S174, and steps S174 to S176 are repeated for the next character area. When the area information of the last character area has been examined, the process ends.

【０１２８】しかし、上記の文字領域が指定点selectPo
intを含む場合（ステップＳ１７６での判定がＹＥＳ）
は、その文字領域のフレームを強調表示し（ステップＳ
１７７）、その文字領域の文字列を取得して（ステップ
Ｓ１７８）、本処理を終了する。However, if the character area contains the designated point selectPoint (YES in step S176), the frame of that character area is highlighted (step S177), the character string of that character area is acquired (step S178), and the process ends.
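
The search over character areas in steps S174 to S178 amounts to a point-in-rectangle test. The following is a minimal sketch under stated assumptions: `TextArea` and `find_selected_area` are hypothetical names, and treating the right and bottom edges as exclusive is an arbitrary choice that the text does not specify.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TextArea:
    x: int        # horizontal position of the upper left corner
    y: int        # vertical position of the upper left corner
    width: int
    height: int
    text: str     # character string of the area


def find_selected_area(areas: List[TextArea],
                       select_point: Tuple[int, int]) -> Optional[TextArea]:
    """Return the first character area containing select_point, or None."""
    px, py = select_point
    for area in areas:                                    # S174, S175
        inside = (area.x <= px < area.x + area.width and
                  area.y <= py < area.y + area.height)    # S176
        if inside:
            # The caller would highlight the frame (S177) and take area.text (S178).
            return area
    return None                                           # no area contains the point
```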

【０１２９】図３５は、上記のような文字烈選択処理の
表示画面の一例である。同図では、文書のイメージデー
タ上に、文字領域を示すフレーム８０，８１，８２，８
３，８４が表示され、オペレータが文字領域８２を指定
すると、その文字領域が強調表示され、その文字領域の
文字列が選択される様子を示している。FIG. 35 is an example of a display screen of the character string selection process described above. In the figure, frames 80, 81, 82, 83, and 84 indicating character areas are displayed on the image data of the document, and when the operator designates character area 82, that character area is highlighted and its character string is selected.

【０１３０】このように、文書の文字領域の位置情報と
文字領域に記されている文字列を抽出し、文書のイメー
ジデータと文字領域のフレームを重ねて表示した画面上
でオペレータがポインティングデバイスで指定した文字
領域のフレームを強調表示後、その文字領域の文字列を
取得することで、文書の文字列の表示画面上であらため
て選択したい文字列を捜して選択することなく、文書の
イメージデータの表示画面上で文書の文字列からの文字
選択を容易に行なうことができる。In this way, the position information of the document's character areas and the character strings written in them are extracted, and on a screen on which the document's image data and the character area frames are displayed superimposed, the frame of the character area designated by the operator with the pointing device is highlighted and the character string of that area is acquired. As a result, characters can easily be selected from the document's character string on the display screen of the document's image data, without having to search for and select the desired character string again on a display screen of the document's character string.

【０１３１】また、文字領域ごとに文字列を選択するこ
とができるので、纏まりのある文章の文字選択を容易に
行なうことができる。＜変形例２＞以下、上記第４実施例の変形例２について
説明する。Further, since a character string can be selected character area by character area, the characters of a coherent passage of text can be selected easily. <Modification 2> Modification 2 of the fourth embodiment will be described below.

【０１３２】図３６は、本変形例に係る電子ファイリン
グ装置の構成を示すブロック図である。なお、同図に示
す電子ファイリング装置において、図１８に示す、上記
第４実施例に係る装置と同一構成要素には同一符号を付
し、それらについては、簡単に説明する。FIG. 36 is a block diagram showing the structure of the electronic filing apparatus according to this modification. In the electronic filing apparatus shown in the figure, the same components as those of the apparatus according to the fourth embodiment shown in FIG. 18 are denoted by the same reference numerals and are described only briefly.

【０１３３】図３６において、文字認識部３３は、文書
読み込み部３１からの文書のイメージデータを受け取
り、そのイメージデータに対して文字認識を行ない、文
書に記されている文字列と、各文字のイメージデータの
領域の位置と大きさからなる位置を抽出し、抽出した文
字列と位置情報を文書登録部３４に供給する。In FIG. 36, the character recognition unit 33 receives the image data of the document from the document reading unit 31, performs character recognition on it, extracts the character string written in the document and position information consisting of the position and size of the image area of each character, and supplies the extracted character string and position information to the document registration unit 34.

【０１３４】文書登録部３４は、文書読み込み部３１か
ら文書のイメージデータを、文字認識部３３から文字列
と位置情報を受け取り、これらを他の文書管理情報と関
連付けてファイル装置３５に登録する。The document registration unit 34 receives the image data of the document from the document reading unit 31, the character string and the position information from the character recognition unit 33, and registers these in the file device 35 in association with other document management information.

【０１３５】ここでは、文字情報読み込み部３７は、フ
ァイル装置３５からの文字列と位置情報を読み込み、そ
れらを選択文字取得部３９に供給する。イメージ表示部
３６は、ファイル装置３５から文書のイメージデータを
読み込み、そのイメージデータを表示するとともに、オ
ペレータが特定の領域を選択した場合は、選択範囲指定
部３８から選択領域の位置情報を受け取り、その領域の
フレームを文書のイメージデータに重ねて表示する。Here, the character information reading unit 37 reads the character string and position information from the file device 35 and supplies them to the selected character acquisition unit 39. The image display unit 36 reads the image data of the document from the file device 35 and displays it, and when the operator selects a specific area, it receives the position information of the selected area from the selection range designation unit 38 and displays the frame of that area superimposed on the image data of the document.

【０１３６】選択範囲指定部３８は、オペレータにより
指定される選択領域の位置情報をイメージ表示部３６と
選択文字取得部３９に供給する。そして、選択文字取得
部３９は、文字情報読み込み部３７からの文書の文字列
と位置情報と文字列情報を、また、選択範囲指定部３８
からの選択領域の位置情報を受け取って、文書の文字列
から選択領域の文字列を取得する。The selection range designation unit 38 supplies the position information of the selection area designated by the operator to the image display unit 36 and the selected character acquisition unit 39. The selected character acquisition unit 39 receives the document's character string and position information from the character information reading unit 37 and the position information of the selection area from the selection range designation unit 38, and acquires the character string of the selection area from the document's character string.

【０１３７】次に、上記の構成を有する、本変形例に係
る電子ファイリング装置における文書登録処理の動作に
ついて、図３７に示すフローチャートに従って説明す
る。Next, the operation of the document registration process in the electronic filing apparatus according to this modification, configured as described above, will be described with reference to the flowchart shown in FIG. 37.

【０１３８】まず、オペレータの開始指示により処理を
開始し、例えば、処理スキャナ装置などで読み込んだ文
書のイメージデータをメモリ上に読み込む（ステップＳ
１８１）。そして、そのイメージデータに対して文字認
識を行ない、文書に記されている文字列と、各文字のイ
メージ領域を抽出し、図３８に示すような、文字コー
ド、文字のイメージ領域の左上隅の水平位置、左上隅の
垂直位置、幅、高さからなる文書の文字情報データを作
成する（ステップＳ１８２）。First, processing starts in response to the operator's start instruction, and the image data of a document read by, for example, a scanner device is loaded into memory (step S181). Character recognition is then performed on the image data to extract the character string written in the document and the image area of each character, and character information data for the document, consisting of the character code and the horizontal position of the upper left corner, vertical position of the upper left corner, width, and height of each character's image area, as shown in FIG. 38, is created (step S182).

【０１３９】上記の文字認識処理においては、従来よ
り、文字切りと呼ばれる処理で各文字のイメージ領域を
抽出し、それから各文字のイメージ領域のイメージデー
タにおいて文字認識を行ない、その文字の文字コードを
抽出する。しかし、ここでは、これら文字認識の方法に
ついては、その詳細な説明は省略する。In the character recognition processing described above, as is conventional, the image area of each character is first extracted by a process called character segmentation, character recognition is then performed on the image data of each character's image area, and the character code of that character is extracted. A detailed description of these character recognition methods is omitted here.

【０１４０】次に、ステップＳ１８２で作成した文書の
文字情報データを、ページ数、登録日などの他の文書管
理情報とともに、文書のイメージデータと関連付けてデ
ータベースに登録し（ステップＳ１８３）、本処理を終
了する。Next, the character information data of the document created in step S182 is registered in the database in association with the image data of the document, together with other document management information such as the number of pages and the registration date (step S183), and the process ends.

【０１４１】なお、本変形例に係る文書表示処理につい
ても、図２３に示す、上記第４実施例に係る装置におけ
る文書表示処理と同じであるため、ここでは、その説明
を省略する。The document display process according to the present modification is also the same as the document display process in the apparatus according to the fourth embodiment shown in FIG. 23, and therefore its description is omitted here.

【０１４２】以下、本変形例に係る電子ファイリング装
置における文書の文字列の選択処理の動作について、図
３９に示すのフローチャートに従って説明する。The operation of the document character string selection process in the electronic filing apparatus according to this modification will be described below with reference to the flowchart shown in FIG. 39.

【０１４３】まず、オペレータが、文書表示処理によ
り、文書のイメージデータの表示画面上でポインタを所
望の位置に合わせて、ポインティングデバイスのボタン
を押すことにより処理を開始し、そのときのポインタの
水平位置と垂直位置を、それぞれバッファstartPos.xと
startPos.yに取得する（ステップＳ１９１）。次に、オ
ペレータがポインティングデバイスのボタンを押したま
ま、その位置を移動することにより、そのときのポイン
タの水平位置と垂直位置を、それぞれバッファendPos.x
とendPos.yに取得する（ステップＳ１９２）。このとき
のstartPos.x、startPos.y、endPos.x、endPos.yからな
る矩形領域のフレームを文書のイメージデータに重ねて
表示する（ステップＳ１９３）。First, on the display screen of the document's image data produced by the document display process, the operator moves the pointer to a desired position and presses the button of the pointing device, which starts the processing; the horizontal and vertical positions of the pointer at that moment are acquired in the buffers startPos.x and startPos.y, respectively (step S191). Next, as the operator moves the pointer while holding down the button, the horizontal and vertical positions of the pointer at that time are acquired in the buffers endPos.x and endPos.y, respectively (step S192). The frame of the rectangular area defined by startPos.x, startPos.y, endPos.x, and endPos.y is displayed superimposed on the image data of the document (step S193).

【０１４４】そして、オペレータがポインティングデバ
イスのボタンを離したかどうかを判別し（ステップＳ１
９４）、オペレータが、まだ、ポインティングデバイス
のボタンを離していない場合は、処理をステップＳ１９
２に戻し、オペレータがポインティングデバイスのボタ
ンを離すまで、上記のステップＳ１９２からステップＳ
１９４の処理を繰り返し行なう。It is then determined whether the operator has released the button of the pointing device (step S194). If the button has not yet been released, the process returns to step S192, and steps S192 to S194 are repeated until the operator releases the button.

【０１４５】一方、オペレータがポインティングデバイ
スのボタンを離したならば、文書表示処理により読み込
まれた文字情報データに、まだ、後述するステップＳ１
９５以降の処理をしていないデータがあるか否かを判別
し（ステップＳ１９５）、未処理の文字情報データがあ
る場合は、まず、１番目の要素の文字情報を文字コー
ド、水平位置、垂直位置、高さ、幅からなる構造体text
Infoに取得する（ステップＳ１９６）。On the other hand, once the operator has released the button of the pointing device, it is determined whether the character information data read by the document display process still contains an element that has not undergone the processing from step S195 onward described below (step S195). If there is an unprocessed element, the character information of the element is acquired, starting with the first element, into a structure textInfo consisting of a character code, horizontal position, vertical position, height, and width (step S196).

【０１４６】次に、上記の水平位置、垂直位置、高さ、
幅からなる文字の領域が、startPos.x、startPos.y、en
dPos.x、endPos.yからなる選択領域に完全に含まれるか
どうかを判別する（ステップＳ１９７）。そして、文字
の領域が完全に含まれる場合は、その文字の文字コード
をバッファstringにコピーし（ステップＳ１９８）、処
理をステップＳ１９５に戻す。Next, it is determined whether the character's area, defined by this horizontal position, vertical position, height, and width, is completely contained in the selection area defined by startPos.x, startPos.y, endPos.x, and endPos.y (step S197). If the character's area is completely contained, the character code of that character is copied to the buffer string (step S198), and the process returns to step S195.

【０１４７】また、文字の領域が選択領域に完全に含ま
れない場合は、何も処理せずに、ステップＳ１９５の処
理に戻る。そして、文字情報のデータのすべての要素に
対して、上記のステップＳ１９５からステップＳ１９８
の処理を繰り返し行ない、最後の要素まで調べたなら
ば、本処理を終了する。なお、このとき、stringには、
選択領域に完全に含まれるすべての文字の文字コードが
取得されている。If the character's area is not completely contained in the selection area, nothing is done and the process returns to step S195. Steps S195 to S198 are repeated for all elements of the character information data, and when the last element has been examined, the process ends. At this point, string holds the character codes of all the characters that are completely contained in the selection area.
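
The loop of steps S195 to S198 can be sketched as a containment test over per-character records. This is an illustrative sketch only: `TextInfo` mirrors the fields of the textInfo structure named in the text, `select_string` is a hypothetical name, and normalizing the two corners so that the drag direction does not matter is an added assumption that the text does not state.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TextInfo:
    code: str     # character code
    x: int        # horizontal position of the upper left corner
    y: int        # vertical position of the upper left corner
    width: int
    height: int


def select_string(chars: List[TextInfo],
                  start_pos: Tuple[int, int],
                  end_pos: Tuple[int, int]) -> str:
    """Collect the codes of all characters completely inside the dragged rectangle."""
    # Assumption: order the corners so any drag direction gives the same rectangle.
    left, right = sorted((start_pos[0], end_pos[0]))
    top, bottom = sorted((start_pos[1], end_pos[1]))
    string = []
    for info in chars:                                         # S195, S196
        fully_inside = (left <= info.x and info.x + info.width <= right and
                        top <= info.y and info.y + info.height <= bottom)   # S197
        if fully_inside:
            string.append(info.code)                           # S198
    return "".join(string)
```

Characters that only partly overlap the rectangle are skipped, matching the "completely contained" condition of step S197.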

【０１４８】図４０は、上記のような文字列選択処理に
おける表示画面の一例を示す図である。同図では、オペ
レータが、文書のイメージデータの表示画面上で、領域
９０を指定し、その領域内の文字列が選択された様子を
示している。FIG. 40 is a diagram showing an example of a display screen in the character string selection processing as described above. In the figure, the operator specifies the area 90 on the display screen of the image data of the document, and the character string in the area is selected.

【０１４９】このように、文書のイメージデータから文
字認識を行ない、文書の文字列と各文字のイメージ領域
の位置情報を抽出して、文書のイメージデータの表示画
面上の選択領域の位置情報と文字情報から選択文字を取
得することで、文書の文字列の表示画面上であらためて
選択したい文字列を捜して選択することなく、文書のイ
メージデータの表示画面上で直接文書の文字列を選択す
ることができ、文書に記されている文字列からの文字選
択を容易に行なうことができる。In this way, character recognition is performed from the image data of the document, the character string of the document and the position information of the image area of each character are extracted, and the position information of the selected area on the display screen of the image data of the document is extracted. By obtaining the selected character from the character information, you can directly select the character string of the document on the display screen of the image data of the document without searching for and selecting the character string you want to select on the display screen of the character string of the document. It is possible to easily select a character from the character string written in the document.

【０１５０】なお、本発明は、複数の機器から構成され
るシステムに適用しても、１つの機器から成る装置に適
用しても良い。また、本発明は、システムあるいは装置
にプログラムを供給することによって達成される場合に
も適用できることは言うまでもない。The present invention may be applied to a system composed of a plurality of devices or an apparatus composed of one device. Further, it goes without saying that the present invention can be applied to the case where it is achieved by supplying a program to a system or an apparatus.

【０１５１】[0151]

【発明の効果】以上説明したように、本発明によれば、
特定された文書の文字領域のイメージデータを表示する
ことで、文書に記されている文字の一部の識別や、似た
レイアウト、特徴のないレイアウトの文書の識別が容易
になり、また、文字領域の組方向にかかわらず適切な表
示が可能となる。As described above, according to the present invention, displaying the image data of the specified character area of a document makes it easy to identify part of the text written in the document and to distinguish documents with similar layouts or featureless layouts, and an appropriate display is possible regardless of the writing direction of the character area.

【０１５２】また、他の発明によれば、文書のイメージ
データ上で指定された特定範囲や文字領域より文字列を
取得することで、文書の文字列からの文字選択が容易に
行なえる。According to another invention, a character string is acquired from a specific range or a character area designated on the image data of the document, so that the character selection from the character string of the document can be easily performed.

【０１５３】[0153]

[Brief description of drawings]

【図１】本発明の第１実施例に係る電子ファイリング装
置の構成を示すブロック図である。FIG. 1 is a block diagram showing a configuration of an electronic filing device according to a first embodiment of the present invention.

【図２】第１実施例に係る電子ファイリング装置におけ
る文書登録処理を示すフローチャートである。FIG. 2 is a flowchart showing a document registration process in the electronic filing device according to the first embodiment.

【図３】第１実施例における文書の領域分割と文字領域
の順序付けを説明するための図である。FIG. 3 is a diagram for explaining area division of a document and ordering of character areas in the first embodiment.

【図４】イメージデータ領域のイメージデータを一覧表
示する処理手順を示すフローチャートである。FIG. 4 is a flowchart showing a processing procedure for displaying a list of image data in an image data area.

【図５】一覧表示処理にて表示される文書領域画像一覧
の表示画面の一例を示す図である。FIG. 5 is a diagram showing an example of a display screen of a document area image list displayed in a list display process.

【図６】第２の実施例に係る電子ファイリング装置の構
成を示すブロック図である。FIG. 6 is a block diagram showing a configuration of an electronic filing device according to a second embodiment.

【図７】第２実施例における文書登録処理手順を示すフ
ローチャートである。FIG. 7 is a flowchart showing a document registration processing procedure in the second embodiment.

【図８】文字領域特定処理の動作の詳細手順を示すフロ
ーチャートである。FIG. 8 is a flowchart showing a detailed procedure of an operation of character area specifying processing.

【図９】第３の実施例に係る電子ファイリング装置の構
成を示すブロック図である。FIG. 9 is a block diagram showing a configuration of an electronic filing device according to a third embodiment.

【図１０】第３実施例における文書登録処理の動作を示
すフローチャートである。FIG. 10 is a flowchart showing the operation of document registration processing in the third embodiment.

【図１１】文字領域切り出し処理の詳細手順を示すフロ
ーチャートである。FIG. 11 is a flowchart showing a detailed procedure of a character area cutout process.

【図１２】横書きの文字領域の切り出し領域確定処理の
詳細フローチャートである。FIG. 12 is a detailed flowchart of a cutout area determination process for a horizontally written character area.

【図１３】文書中の横書きの文字領域において切り出し
領域が確定された様子を示す図である。FIG. 13 is a diagram showing a state in which a cutout area has been determined in a horizontally written character area in a document.

【図１４】縦書きの文字領域の切り出し領域確定処理の
動作を詳細に示すフローチャートである。FIG. 14 is a flowchart showing in detail the operation of a cutout area determination process for a vertically written character area.

【図１５】文書中の縦書きの文字領域において切り出し
領域が確定された様子を示す図である。FIG. 15 is a diagram showing a state in which a cutout area has been determined in a vertically written character area in a document.

【図１６】文字領域のイメージデータを一覧表示する処
理手順を示すフローチャートである。FIG. 16 is a flowchart showing a processing procedure for displaying a list of image data of character areas.

【図１７】一覧表示処理により表示される文書領域画像
一覧の表示画面の一例を示す図である。FIG. 17 is a diagram showing an example of a display screen of a document area image list displayed by a list display process.

【図１８】第４の実施例に係る電子ファイリング装置の
構成を示すブロック図である。FIG. 18 is a block diagram showing a configuration of an electronic filing device according to a fourth embodiment.

【図１９】第４実施例における文書登録処理の動作を示
すフローチャートである。FIG. 19 is a flowchart showing the operation of document registration processing in the fourth embodiment.

【図２０】第４実施例における領域分割の動作を説明す
るための図である。FIG. 20 is a diagram for explaining an operation of area division in the fourth embodiment.

【図２１】文字認識により抽出される各文字のイメージ
データの領域を説明するための図である。FIG. 21 is a diagram illustrating an area of image data of each character extracted by character recognition.

【図２２】作成された文字情報データを示す図である。FIG. 22 is a diagram showing created character information data.

【図２３】第４実施例における文書表示処理の動作を示
すフローチャートである。FIG. 23 is a flowchart showing an operation of document display processing in the fourth embodiment.

【図２４】文書の文字列の選択処理の動作を示すフロー
チャートである。FIG. 24 is a flowchart showing the operation of a document character string selection process.

【図２５】文字列選択処理における行情報データの作成
処理の詳細を示すフローチャートである。FIG. 25 is a flowchart showing details of a line information data creation process in the character string selection process.

【図２６】行情報データ作成処理により作成される行情
報データの構成を示す図である。FIG. 26 is a diagram showing a structure of line information data created by a line information data creating process.

【図２７】横書きの文字領域から実際に作成される行情
報データを説明するための図である。FIG. 27 is a diagram for explaining line information data actually created from a horizontally written character area.

【図２８】横書きの文字領域から実際に作成される行情
報データを説明するための図である。FIG. 28 is a diagram for explaining line information data actually created from a horizontally written character area.

【図２９】文字列選択処理における指定文字の確定処理
の詳細フローチャートである。FIG. 29 is a detailed flowchart of a designated character determination process in the character string selection process.

【図３０】文字列選択処理の表示画面の一例を示す図で
ある。FIG. 30 is a diagram showing an example of a display screen of a character string selection process.

【図３１】第４実施例の変形例１に係る電子ファイリン
グ装置の構成を示すブロック図である。FIG. 31 is a block diagram showing the configuration of an electronic filing device according to Modification 1 of the fourth embodiment.

【図３２】変形例１における文書登録処理を示すフロー
チャートである。FIG. 32 is a flowchart showing a document registration process in the first modification.

【図３３】文字領域の領域情報と文字列からなる文字情
報データを示す図である。FIG. 33 is a diagram showing character information data including area information of a character area and a character string.

【図３４】文書の文字列の選択処理を示すフローチャー
トである。FIG. 34 is a flowchart showing a process of selecting a character string of a document.

【図３５】文字列選択処理の表示画面の一例を示す図で
ある。FIG. 35 is a diagram showing an example of a display screen of a character string selection process.

【図３６】第４実施例の変形例２に係る電子ファイリン
グ装置の構成を示すブロック図である。FIG. 36 is a block diagram showing the configuration of an electronic filing device according to Modification 2 of the fourth embodiment.

【図３７】変形例２に係る文書登録処理を示すフローチ
ャートである。FIG. 37 is a flowchart showing a document registration process according to Modification 2.

【図３８】文書の文字情報データを示す図である。FIG. 38 is a diagram showing character information data of a document.

【図３９】変形例２における文書の文字列の選択処理を
示すフローチャートである。FIG. 39 is a flowchart showing a process of selecting a character string of a document in the second modification.

【図４０】文字列選択処理における表示画面の一例を示
す図である。FIG. 40 is a diagram showing an example of a display screen in a character string selection process.

【図４１】従来の電子ファイリング装置の構成を示すブ
ロック図である。FIG. 41 is a block diagram showing a configuration of a conventional electronic filing device.

【図４２】従来の装置における縮小画像による文書の一
覧表示を示す図である。FIG. 42 is a diagram showing a list display of documents by reduced images in a conventional apparatus.

【図４３】従来の電子ファイリング装置の他の構成を示
すブロック図である。FIG. 43 is a block diagram showing another configuration of a conventional electronic filing device.

【図４４】従来の電子ファイリング装置における文書の
表示画面例を示す図である。FIG. 44 is a diagram showing an example of a document display screen in a conventional electronic filing device.

[Explanation of symbols]

１，３２ 領域分割部　２ 先頭文字領域特定部　３ 文字領域切り出し部　４，３４ 文書登録部　５，３５ ファイル装置　６ 表示制御部　７ 最大文字領域特定部　１０，３１ 文書読み込み部　３７ 文字情報読み込み部　３９ 選択文字取得部
1, 32: area dividing section
2: first character area specifying section
3: character area cutting section
4, 34: document registration section
5, 35: file device
6: display control section
7: maximum character area specifying section
10, 31: document reading section
37: character information reading section
39: selected character acquisition section

Claims

[Claims]

1. An electronic filing apparatus for searching among a plurality of read documents, comprising: means for dividing the image data of a read document into regions; means for discriminating character areas from the divided regions; specifying means for specifying, when there are a plurality of character areas, one character area from among the plurality of character areas; cutout means for cutting out image data of a specific size from the specified character area; storage means for storing the image data of the read document and the cut-out image data in association with each other; and display means for displaying the stored image data in response to the search.

2. The electronic filing apparatus according to claim 1, further comprising means for ordering the plurality of character areas, wherein the specifying means specifies the character area that is first in the ordering.

3. The electronic filing apparatus according to claim 1, further comprising means for extracting an average character size in each of the plurality of character areas, and means for comparing the extracted average character sizes with each other, wherein the specifying means specifies, based on the comparison, the character area having the largest average character size.

4. The electronic filing apparatus according to claim 1, further comprising means for extracting area information and a set direction of the character area, wherein the cutout means cuts out an area determined by the area information and the set direction of the character area.

5. The electronic filing apparatus according to claim 4, wherein the storage means stores the cut-out image data in association with the set direction, and the display means displays the stored image data in an orientation according to the set direction.

6. An electronic filing apparatus for searching among a plurality of read documents, comprising: means for dividing the image data of a read document into regions; means for discriminating a character area from the divided regions; means for performing character recognition on the image data of the character area; means for extracting predetermined information based on the character recognition; storage means for storing the image data of the read document and the predetermined information in association with each other; means for displaying the image data of the read document; means for designating a specific range on the image data of the displayed document; and an acquisition unit for acquiring a character string of the specific range from the predetermined information.

7. The electronic filing apparatus according to claim 6, wherein the predetermined information is character area information including the area information and a set direction of the character area, and character string information including area information of a character string described in the character area and image data of each character.

8. The electronic filing apparatus according to claim 6, wherein the predetermined information is area information of the character area and a character string of the character area.

9. The electronic filing apparatus according to claim 6, wherein the predetermined information is a character string written in the document and area information of image data of each character.

10. The electronic filing apparatus according to claim 6, further comprising means for creating line information data for the character area that includes the first of two points designated as the specific range on the image data of the displayed document, wherein the acquisition unit acquires the character string based on the line information data.

11. The electronic filing apparatus according to claim 6, wherein the character area corresponding to the specific range is displayed in reverse video on the image data of the displayed document.

12. The electronic filing apparatus according to claim 6, wherein the character area corresponding to the specific range is highlighted on the image data of the displayed document.

13. An electronic filing method for searching among a plurality of read documents, comprising: a step of dividing the image data of a read document into regions; a step of discriminating character areas from the divided regions; a step of specifying, when there are a plurality of character areas, one character area from among the plurality of character areas; a step of cutting out image data of a specific size from the specified character area; a step of storing the image data of the read document and the cut-out image data in association with each other; and a step of displaying the stored image data in response to the search.

14. An electronic filing method for searching among a plurality of read documents, comprising: a step of dividing the image data of a read document into regions; a step of discriminating a character area from the divided regions; a step of performing character recognition on the image data of the character area; a step of extracting predetermined information based on the character recognition; a step of storing the image data of the read document and the predetermined information in association with each other; a step of displaying the image data of the read document; a step of designating a specific range on the image data of the displayed document; and a step of acquiring a character string of the specific range from the predetermined information.