JPH0242561A - Document logical structure extraction method

Info

Publication number: JPH0242561A
Application number: JP63192744A
Authority: JP
Inventors: Akihiro Kaneko; 明弘金子; Yasuyuki Takahashi; 高橋　康幸; Hidefumi Iwami; 岩見　秀文
Original assignee: Hitachi Ltd; Hitachi Microcomputer Engineering Ltd
Current assignee: Hitachi Microcomputer System Ltd; Hitachi Ltd
Priority date: 1988-08-03
Filing date: 1988-08-03
Publication date: 1990-02-13

Abstract

PURPOSE: To enable analysis of the logical structure of a document produced in a free format by providing specific processing steps in a document editing device that uses a multi-window system. CONSTITUTION: In a first step, a character string entered by an operator according to template specifications is stored in a logical structure file 4 as a logical structure template. In a second step, when the operator instructs the start of logical structure analysis, a list of the names of previously registered logical structure templates is displayed in a template menu window. At the same time, the logical structure template is analyzed according to the template specifications and the resulting hierarchical structure is stored in a logical structure memory 15. For each logical element name of the corresponding template, the operator designates a target character string in a document display window, and the designation is stored as the content of the hierarchical structure in the memory 15.

Description

[Detailed Description of the Invention]

[Industrial Application Field]

The present invention relates to a method for extracting a logical structure from an existing document, and in particular to a logical structure extraction method suited to extracting a logical structure from a document written without any format restrictions.

[Conventional technology]

When a computer processes a document, the document can be represented as a logical structure with a hierarchy of titles, chapters, sections, and so on. To handle that logical structure, the logical elements (semantically delimited units of a document, such as chapters, sections, subsections, figures, and tables) must be specified in advance. One conventional method of specifying logical elements is as follows.

The Proceedings of the 34th National Convention of the Information Processing Society of Japan (first half of 1987), Vol. II, p. 1309, describe a method in which two rules are written into the system and the logical structure of a document is extracted automatically: Rule 1, which identifies whether a unit is a heading or body content, and Rule 2, which identifies the hierarchical structure of chapters, sections, and so on from the relationships between sentences.

In this method, the input document is plain running text that has not been layout-edited, and everything up to a line feed code is treated as one unit. The sentences are also assumed to be arranged in the order of logical development. The logical structure of the document is extracted automatically from an input document of this form. Rules 1 and 2 for automatic extraction are written inside the system. First, each unit is analyzed according to Rule 1 to identify whether the sentence is a heading or body content. Then the connections among the sentences in the document are analyzed according to Rule 2 to identify the parent-child and sibling relationships of each sentence, and the hierarchical structure of chapters, sections, and so on is extracted. These processes are performed in batch mode.

[Problem to be solved by the invention]

The above prior art analyzes a document using a rule that identifies whether a unit is a heading or body content and a rule that identifies the hierarchical structure of chapters, sections, and so on from the relationships between sentences.

Consequently, a document must be written according to these rules before its logical structure can be extracted. A logical structure therefore cannot be extracted from an existing document that was not written according to the rules, and a newly created document cannot be written in a free format.

An object of the present invention is to solve these problems of the conventional method and to enable logical structure analysis of documents created in a free format.

[Means to solve the problem]

In a document processing system using multiple windows, the logical structure of a document registered in terms of logical element names (title, chapter, section, references, etc.) is called a logical structure template.

As a first step, a character string that the operator enters in a template registration window according to the template specifications is stored in a logical structure file as a logical structure template.

As a second step, in response to the operator's instruction to start logical structure analysis, a list of the names of the previously registered logical structure templates is displayed in a template menu window. When the operator selects an appropriate template from the menu, the template menu window is erased and the template is displayed in a logical element menu window. Internally, the logical structure template is analyzed according to the template specifications, and the hierarchical structure is stored in the logical structure memory. For each logical element name of the template, the operator designates the target character string in the document display window, and the result is stored as the content of the hierarchical structure in the logical structure memory.

The first and second steps above make it possible to analyze the logical structure of a document created in a free format.

As a third step, after the document is displayed in the document display window and the template is displayed in the logical element menu window, the character string of the logical structure template is modified according to the operator's corrections to the template. The modified data is then analyzed according to the template specifications and stored in memory.

By adding this third step, a standard logical structure template can be registered in the first step and then modified for use, which reduces the operator's work while achieving the object of the present method.

[Effect]

A logical structure is created by analyzing a logical structure template according to the template specifications to generate a hierarchical structure, and then storing in that hierarchical structure the target character strings that the operator selects on the screen for each element name.

Because the document data itself is never analyzed by rules, a logical structure can be generated even if the document is written in a free format.

In addition, because the logical structure template can be modified while viewing the document, a standard logical structure template registered in advance can simply be modified and used, which reduces the operator's template registration work.

[Example]

First, the overall configuration of a document editing system to which the present invention is applied will be described with reference to FIG. 1. The system comprises: a main controller (microprocessor) 1 that controls overall system operation; a memory 2 that stores the various programs executed by the main controller 1; a work memory 3 that temporarily stores data generated while these programs run; a template file 4 that stores the logical structure template data created and referenced by the system; a data file 16 that stores document data, logical structure data, and the like; a document data memory 17 that stores document data loaded from the data file 16; a refresh memory (bitmap memory) 5 that stores the data to be displayed on a display screen 7; a display controller 6 that sequentially reads the contents of the refresh memory 5 and outputs them to the display screen 7; a virtual screen memory 8 that stores the data records 41 to 47 shown in FIG. 4 for each virtual screen; a window management table memory 9 that stores the window management table 30 shown in FIG. 3; a memory 10 that stores the character fonts corresponding to character codes; a bitmap processor (BMP) 11 that expands display data on the virtual screens into bitmap data in the refresh memory 5; a BMP command memory 12 that stores the various commands for operating the bitmap processor 11; a logical structure memory 15 that stores logical structure data; a keyboard 13 for entering control instructions and data into the system; and a pointing device 14, such as a mouse, for specifying a position on the display screen with a cursor.

Next, the multi-window system used by this system will be described with reference to FIGS. 2, 3, and 4.

FIG. 2 shows the relationship between window display areas 23 and 24 on virtual screens 20 and 21, which are stored in the virtual screen memory 8, and windows 25 and 26 set on the real display screen 7. In this example, the position and size of each of windows 25 and 26 are expressed by the XY coordinates (X, Y) of the upper-left corner of the window rectangle and the XY coordinates (X', Y') of its lower-right corner.

Data located within window display areas 23 and 24 of virtual screens 20 and 21 is displayed in windows 25 and 26. The position and size of window display areas 23 and 24 are expressed, as for windows 25 and 26, by the xy coordinates of the upper-left and lower-right corners of the rectangle.

The correspondence between windows 25 and 26 and window display areas 23 and 24 shown in FIG. 2 is stored in the window management table 30 shown in FIG. 3.

FIG. 3 shows the data items of the window management table 30. Item 31 is the display priority used when windows overlap. Item 32 is the XY coordinates of the upper-left corner of the window, with the upper-left corner of the real display screen as the origin, and item 33 is the X'Y' coordinates of its lower-right corner. Item 34 is the xy coordinates of the upper-left corner of the window display area, with the upper-left corner of the virtual screen as the origin, and item 35 is the x'y' coordinates of its lower-right corner.
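The window management table can be pictured as one row per window. The following is a minimal Python sketch, not the patent's implementation: the class name `WindowEntry`, the field names, and the helper `real_to_virtual` are all hypothetical, and the sketch assumes that a window and its display area have the same size, so that a point maps one-to-one without scaling.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[int, int]


@dataclass
class WindowEntry:
    """One row of the window management table (items 31 to 35 of FIG. 3)."""
    priority: int             # 31: display priority when windows overlap
    win_top_left: Point       # 32: (X, Y) on the real display screen
    win_bottom_right: Point   # 33: (X', Y') on the real display screen
    area_top_left: Point      # 34: (x, y) on the virtual screen
    area_bottom_right: Point  # 35: (x', y') on the virtual screen


def real_to_virtual(entry: WindowEntry, px: int, py: int) -> Optional[Point]:
    """Map a real-screen point to virtual-screen coordinates.

    Returns None when the point lies outside the window. Assumes a 1:1
    mapping between window and display area (an assumption of this sketch).
    """
    (wx, wy), (wx2, wy2) = entry.win_top_left, entry.win_bottom_right
    if not (wx <= px <= wx2 and wy <= py <= wy2):
        return None
    ax, ay = entry.area_top_left
    return (ax + (px - wx), ay + (py - wy))
```

A mapping of this kind is what would let a mouse position in a window be related back to data held on the virtual screen.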

FIG. 4 shows the data items of the virtual screen memory 8. In the figure, item 41 is the horizontal and vertical size of the virtual screen, and item 42 is the total number of regions within the virtual screen. Data within a virtual screen is managed in units of the regions described next. Item 43 is the horizontal and vertical coordinates of a region, with the upper-left corner of the virtual screen as the origin, and item 44 is the size of the region. Item 45 is the data type of the region (text, figure, or image), and item 46 is the data attributes of the region. When data type 45 is text, the attributes are the horizontal or vertical writing direction, line pitch, character pitch, and so on. When data type 45 is figure, they are the number of figures and so on. When data type 45 is image, they are the compression format, number of gray levels, and so on. Item 47 is the region data: a character code string for text, a figure command string for a figure, and image data for an image.
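The record layout of FIG. 4 might be modelled as follows. This is a sketch under stated assumptions: the names `Region`, `VirtualScreen`, and `PAYLOAD_TYPES` are hypothetical, and the Python types chosen for the region data (a `str` for a character code string, a `list` for figure commands, `bytes` for image data) are illustrative stand-ins only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union

# Python stand-in for the payload expected for each data type 45 (assumed).
PAYLOAD_TYPES = {"text": str, "figure": list, "image": bytes}


@dataclass
class Region:
    origin: Tuple[int, int]        # 43: position relative to the virtual screen origin
    size: Tuple[int, int]          # 44: width and height of the region
    data_type: str                 # 45: "text", "figure", or "image"
    attributes: Dict[str, object]  # 46: e.g. writing direction, pitch, compression
    data: Union[str, list, bytes]  # 47: region data

    def __post_init__(self) -> None:
        # Reject a payload that does not match the declared data type.
        expected = PAYLOAD_TYPES.get(self.data_type)
        if expected is None:
            raise ValueError(f"unknown data type: {self.data_type!r}")
        if not isinstance(self.data, expected):
            raise TypeError(f"{self.data_type} region needs {expected.__name__} data")


@dataclass
class VirtualScreen:
    size: Tuple[int, int]                                 # 41: screen width and height
    regions: List[Region] = field(default_factory=list)

    @property
    def region_count(self) -> int:                        # 42: total number of regions
        return len(self.regions)
```

Deriving the region count from the list, rather than storing it separately, is a simplification of this sketch; the source describes item 42 as a stored field.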

FIG. 5 shows the logical structure data table stored in the logical structure memory 15. The logical structure table 50 consists of a record 59 for each object (document, title, chapter, section, etc.). Each record contains: a type 51 indicating whether the object is at the top, an intermediate, or the bottom level of the hierarchy; an identifier 52 indicating the hierarchical position of the object; a dependent count 53 indicating the number of objects subordinate to it; a content flag 54 indicating whether the object has document content; a user-visible name 55 giving the logical element name of the object (abstract, chapter 1, etc.); a layout style identifier 56 that links to layout information, such as which frame the document content is displayed in; and a start pointer 57 and an end pointer 58 indicating the position of the display data 59 in the document data memory 17.
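As a rough illustration of one record of the logical structure table, the fields 51 to 58 could be laid out as below. The class name `LogicalRecord`, the field names, and the helper `content_of` are hypothetical. The sketch also assumes that the start and end pointers are character offsets into the document text, with the end offset exclusive; the source does not state the pointer format.

```python
from dataclasses import dataclass


@dataclass
class LogicalRecord:
    kind: int = 0              # 51: top, intermediate, or bottom level
    identifier: str = "0"      # 52: position in the hierarchy
    dependents: int = 0        # 53: number of subordinate objects
    has_content: bool = False  # 54: whether the object has document content
    visible_name: str = ""     # 55: logical element name shown to the user
    layout_id: int = 0         # 56: link to layout information
    start: int = 0             # 57: start pointer (assumed: character offset)
    end: int = 0               # 58: end pointer (assumed: exclusive offset)


def content_of(record: LogicalRecord, document: str) -> str:
    """Return the document text that a record points at, or "" if it has none."""
    if not record.has_content:
        return ""
    return document[record.start:record.end]
```

Because the record stores only pointers, the document text itself stays untouched in the document data memory, which is what allows the document to be written in any format.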

FIG. 6 shows an example of a logical structure with identifiers 52 attached and is used here to explain the meaning of the object identifiers. An identifier gains one digit each time the hierarchy level goes down by one. For objects at the same level that derive from the same object, the last digit increases by 1 in order of occurrence. For example, if the object that appears after identifier "3210" is at a lower level, its identifier is "32100"; if it is at the same level, its identifier is "3211". In this way, the identifier makes the position of an object in the hierarchy unambiguous.
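The identifier rule can be written down in a few lines. The function name `next_identifier` and the `relation` argument are hypothetical. Since each level occupies exactly one digit, this sketch assumes at most ten objects per parent and raises an error beyond that; the source does not say how such a case is handled.

```python
def next_identifier(previous: str, relation: str) -> str:
    """Derive the identifier of the object that follows `previous`.

    relation == "child":   the next object is one level lower (append a digit).
    relation == "sibling": the next object is at the same level (bump last digit).
    """
    if not previous.isdigit():
        raise ValueError("identifier must be a non-empty digit string")
    if relation == "child":
        return previous + "0"
    if relation == "sibling":
        if previous[-1] == "9":
            # One digit per level cannot express an eleventh sibling (assumption).
            raise ValueError("no single digit left for another sibling")
        return previous[:-1] + str(int(previous[-1]) + 1)
    raise ValueError(f"unknown relation: {relation!r}")
```

With the example from the text, `next_identifier("3210", "child")` gives `"32100"` and `next_identifier("3210", "sibling")` gives `"3211"`.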

Next, the control flow of the document logical structure extraction process, which uses the logical structure data structure shown in FIG. 5, will be described with reference to the program flowcharts in FIGS. 14, 7, 8, and 9. FIG. 10 shows the keyboard 13 used for the processing, and FIGS. 11 and 12 show examples of the display screen 7 during processing.

First, FIG. 14 shows the main routine of this embodiment. The routine waits for the operator to enter a command (step 141). If the template registration key 101 is entered (step 142), the template registration control routine is started (step 143). If the logical structure analysis key 102 is entered (step 142), the logical structure extraction control routine is started (step 144). When the end key 103 is entered, control ends.
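The main routine amounts to a small key dispatch loop, sketched below. The function and parameter names are hypothetical, the key codes reuse the reference numerals 101 to 103 purely for readability, and the sketch assumes that control returns to waiting for a command after each routine finishes and that unrecognized keys are ignored.

```python
from typing import Callable

TEMPLATE_REGISTRATION_KEY = 101
LOGICAL_STRUCTURE_ANALYSIS_KEY = 102
END_KEY = 103


def main_routine(
    read_key: Callable[[], int],
    register_template: Callable[[], None],
    extract_structure: Callable[[], None],
) -> None:
    """Wait for commands and dispatch them until the end key is entered."""
    while True:
        key = read_key()                           # step 141: wait for a command
        if key == TEMPLATE_REGISTRATION_KEY:       # step 142: which key?
            register_template()                    # step 143
        elif key == LOGICAL_STRUCTURE_ANALYSIS_KEY:
            extract_structure()                    # step 144
        elif key == END_KEY:
            return                                 # end of control
        # Any other key is ignored (assumption of this sketch).
```

Passing the two routines in as callables keeps the dispatch logic separate from the window handling described in FIGS. 7 and 8.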

FIG. 7 is a program flowchart showing the details of the template registration control routine 143.

First, a template registration window 111 is set in the window management table memory 9 (step 72). When the operator types a template name with the character code keys 104 and presses the input key 107 (step 73), the character code string of the template name is stored in the work memory 3 (step 74).

The operator then enters the logical element names with the character code keys 104, pressing the line feed key 108 at the end of each element name. The entries follow the template specifications, which allow the computer to analyze the template (step 75).

A document can be divided into three parts: a document opening part such as the title and author name, a chapter and section part, and a document closing part such as the afterword and references. The template specifications consist of the following four items concerning these three parts.

The first item restricts the logical element names used in the document opening and closing parts (for example, synonymous names such as "title" and "heading" are unified as "title").

The second item requires that logical element names in the chapter and section part be written with digits and periods (for example, "3.1" or "2.").

The third item requires that, to express the subordination between chapters and sections, one space be placed before the logical element name each time the section level goes down by one (for example, "1.", then " 1.1" with one leading space, then "  1.1.1" with two leading spaces).

The fourth item requires that logical element names be entered in the order of logical development.
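Items 2 and 3 of the template specifications lend themselves to a mechanical check. The sketch below is one possible reading rather than the patent's procedure: the function name `check_section_line` and the regular expression are hypothetical, and the sketch assumes that the number of leading spaces must equal the numbering depth minus one, as the example "1.", " 1.1", "  1.1.1" suggests.

```python
import re

# Item 2: digits separated by periods, with an optional trailing period.
SECTION_NAME = re.compile(r"\d+(\.\d+)*\.?")


def check_section_line(line: str) -> bool:
    """Return True if a chapter/section line satisfies items 2 and 3."""
    name = line.lstrip(" ")
    indent = len(line) - len(name)            # item 3: count of leading spaces
    if not SECTION_NAME.fullmatch(name):      # item 2: digits and periods only
        return False
    depth = len([part for part in name.split(".") if part])
    return indent == depth - 1                # assumed: one space per level down


# A template in the assumed layout: opening part, sections, closing part.
SAMPLE_TEMPLATE = "Title\nAuthor\n1.\n 1.1\n 1.2\n2.\nReferences"
```

Lines such as "Title" and "References" belong to the opening and closing parts, so they fall under item 1 and are not expected to pass this check.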

When the operator presses the end key 103 (step 76), the entered data is temporarily stored in the work memory 3 (step 77), and the template is stored in the template file 4 using the previously entered template name as the file name (step 78).

FIG. 11 shows an example of the display screen 7 while the operator is entering logical elements.

Next, a message window 123 is set in the window management table memory 9 (step 79), and the character string "Register another template?" is written to the virtual screen memory 8 corresponding to the message window 123 (step 710). If the operator presses the YES key 105, processing returns to step 73. If the operator presses the NO key 106 (step 711), the template registration window and the message window are erased, and the management data for those windows is deleted from the window management table memory 9 (step 712). This completes the template registration process.
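Stripped of the window handling, the registration routine of FIG. 7 is a loop that stores one template per pass. In this minimal sketch a plain dictionary stands in for the template file 4, and the callables `read_name`, `read_lines`, and `ask_another` are hypothetical placeholders for the operator's keyboard input.

```python
from typing import Callable, Dict, List


def register_templates(
    read_name: Callable[[], str],
    read_lines: Callable[[], List[str]],
    ask_another: Callable[[], bool],
    template_file: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Register templates until the operator declines to add another."""
    while True:
        name = read_name()                  # steps 73-74: template name
        lines = read_lines()                # steps 75-77: logical element names
        template_file[name] = list(lines)   # step 78: store under the template name
        if not ask_another():               # steps 79-711: "Register another template?"
            break                           # step 712: windows are erased here
    return template_file
```

Storing under an existing name overwrites the earlier template in this sketch; the source does not describe how name collisions are treated.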

FIG. 8 is a program flowchart showing the details of the logical structure extraction control routine 144.

In response to the operator pressing the logical structure analysis key 102, a template menu window is set in the window management table memory 9 (step 91). Next, the list of template file names is read from the template file 4 into the work memory 3 and written to the virtual screen memory 8 corresponding to the template menu window (step 92). The file whose template name the operator designates with the mouse is read from the template file 4 into the work memory 3 (steps 93, 94), and its data is analyzed by the logical structure analysis routine (see FIG. 9) (step 95). Next, a logical element menu window 122 is set in the window management table memory 9 (step 96), and the template in the work memory 3 is output to the virtual screen memory 8 corresponding to the logical element menu window 122 (step 97).

While the system is waiting for command input (step 918), if the operator designates a logical element name with the mouse in the logical element menu window 122 (step 919), the logical structure table in the logical structure memory 15 is searched and the identifier 52 of the designated logical element name is stored in the work memory 3 (step 99).

Then a message window 123 is set in the window management table memory 9 (step 910).

A character code string for a message prompting the operator to designate the target character string is written to the virtual screen memory 8 corresponding to the message window (step 911). FIG. 12 shows an example.

When the operator designates a character string displayed in the document display window 121 (step 912), the start pointer and end pointer of that character string in the document data memory 17 are looked up (step 913). The results are stored in the start pointer 57 and end pointer 58 corresponding to the identifier previously stored in the work memory 3 (step 914), and processing continues at step 918.
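Steps 99 and 912 to 914 can be sketched as a single function that ties a selected string to a logical element. Everything here is illustrative: the function name `assign_selection` and the dictionary keys are hypothetical, and `str.find` merely stands in for the system's lookup of the mouse selection in the document data memory. The pointers are assumed to be character offsets with an exclusive end.

```python
from typing import Dict, List


def assign_selection(
    table: List[Dict[str, object]],
    document: str,
    element_name: str,
    selected_text: str,
    search_from: int = 0,
) -> Dict[str, object]:
    """Store the position of `selected_text` in the record for `element_name`."""
    # Step 99: find the record for the designated logical element name.
    record = next((r for r in table if r["name"] == element_name), None)
    if record is None:
        raise KeyError(element_name)
    # Step 913: locate the designated string in the document data.
    start = document.find(selected_text, search_from)
    if start < 0:
        raise ValueError("selected text not found in document")
    # Step 914: store the start and end pointers in the record.
    record["start"] = start
    record["end"] = start + len(selected_text)
    return record
```

Note that the document text is only searched for the selection and is never parsed, which matches the point that the document body is not analyzed by rules.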

When the operator presses the end key 103, the message window 123 and the logical element menu window 122 are erased, and the management data for those windows is deleted from the window management table memory 9 (step 917).

FIG. 9 shows a flowchart of the logical structure template analysis routine 95.

First, the logical structure table 50 in the logical structure memory 15 is initialized by writing 0 to items 51 to 54 and 56 to 58 and the null character code to item 55 (step 80). Next, a logical element name written in the template is read from the work memory 3, treating everything up to a line feed code as one logical element name, and is stored at another address in the work memory 3 (step 82).

If the data is exhausted, control returns to the flow of FIG. 8 (steps 83, 84).

Next, the first character code of the logical element name is examined and classified as a digit, a space character, or any other character (step 85). If the first character is a space, the character codes are examined to determine how many leading characters are spaces (step 86). Then, as in the digit and other-character cases, the identifier is computed according to the rule shown in FIG. 6 (step 87).

The result is stored in the logical structure table in the logical structure memory 15, and processing returns to step 82 (step 88). This completes the logical structure extraction process.
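The analysis routine of FIG. 9 can be approximated as a single pass over the template lines. This sketch rests on several assumptions that the source does not confirm: the function name `analyze_template` and the dictionary keys are hypothetical, the document itself is given a root record with the arbitrary identifier "1", a line with n leading spaces is placed n + 1 levels below the root, and each level is limited to ten objects because it occupies one digit.

```python
from typing import Dict, List


def analyze_template(text: str, root_id: str = "1") -> List[Dict[str, object]]:
    """Build a logical structure table with identifiers from a template text."""
    table: List[Dict[str, object]] = [
        {"identifier": root_id, "name": "document", "dependents": 0, "start": 0, "end": 0}
    ]
    stack = [0]  # stack[d] is the table index of the latest record at depth d
    for line in text.split("\n"):          # step 82: one name per line feed
        if not line.strip():
            continue
        name = line.lstrip(" ")            # steps 85-86: count leading spaces
        depth = (len(line) - len(name)) + 1
        if depth > len(stack):
            raise ValueError(f"level skipped before {name!r}")
        del stack[depth:]                  # forget records deeper than the parent
        parent = table[stack[-1]]
        if parent["dependents"] >= 10:
            raise ValueError("more than ten objects under one parent")
        # Step 87: the parent's identifier plus one digit in order of occurrence.
        identifier = str(parent["identifier"]) + str(parent["dependents"])
        parent["dependents"] += 1
        table.append(                      # step 88: store the new record
            {"identifier": identifier, "name": name, "dependents": 0, "start": 0, "end": 0}
        )
        stack.append(len(table) - 1)
    return table
```

The first child of a parent receives a trailing "0" and later siblings count upward, which agrees with the "3210", "32100", "3211" example given for FIG. 6.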

According to the above embodiment, extracting a logical structure requires no analysis of the data in the document data memory 17 other than looking up pointers to character data, so the logical structure can be extracted even if the document is written freely.

However, in the first embodiment described above, a new template must be registered whenever the logical structure of the document does not match a registered template. The number of template files can then become very large, and template registration may become a burden on the operator.

A second embodiment, which alleviates this problem by allowing templates to be modified, is described next.

The control flow of FIG. 13 is a program flowchart of an improved version of the logical structure extraction control routine of FIG. 8. It is described below with reference to FIG. 13.

In FIG. 13, a message window 123 is first set in the window management table memory 9 (step 131), and the character code string "Use a registered template?" is written to the virtual screen memory 8 corresponding to the message window 123 (step 132).

If the operator presses the NO key 106 (step 133), a template registration window 111 is set in the window management table memory 9. The operator types a template name with the character code keys 104 and then presses the input key 107 (steps 134, 135).

The template name is stored in the work memory 3. The operator then enters the logical element names with the character code keys 104, pressing the line feed key 108 after each element; the entries follow the template specifications (step 1316). When the operator presses the end key 103, the entered data is temporarily stored in the work memory 3, and the template is stored in the template file 4 using the previously entered template name as the file name (steps 1317 to 1319).

Thereafter, processing is the same as from step 94 of FIG. 8 onward.

If the YES key 105 is pressed in response to the message (step 133), a template menu window is first set in the window management table memory 9 (step 137).

Then the list of template names is read from the template file 4 into the work memory 3 and written to the virtual screen memory 8 corresponding to the template menu window (step 138). The operator designates a template name with the mouse (step 139).

Next, a logical element menu window 122 is set in the window management table memory 9, and the data of the file designated by the operator is read from the template file 4 into the virtual screen memory 8 corresponding to the window 122 (steps 1310, 1311). A character code string for a message asking whether to modify the template is then written to the virtual screen memory 8 corresponding to the message window (step 1312). If the operator presses the NO key 106, processing continues at step 94.

If the YES key 105 is pressed, a template registration window 111 is set in the window management table memory 9, and the file designated by the operator is read from the template file 4 into the virtual screen memory 8 corresponding to the window 111 (steps 1314, 1315). Then, as in new registration, steps 1313 to 1319 are performed and processing continues at step 94.
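The branching of the second embodiment can be summarized as follows. This is a sketch only: a dictionary again stands in for the template file 4, all function and parameter names are hypothetical, and the sketch assumes that a modified template is stored back under its original name, which the source does not spell out.

```python
from typing import Callable, Dict, List, Tuple


def choose_template(
    template_file: Dict[str, List[str]],
    use_registered: Callable[[], bool],
    pick_name: Callable[[List[str]], str],
    want_modify: Callable[[], bool],
    edit_lines: Callable[[List[str]], List[str]],
    new_template: Callable[[], Tuple[str, List[str]]],
) -> str:
    """Return the name of the template to analyze, registering or editing as needed."""
    if not use_registered():                    # step 133: NO key
        name, lines = new_template()            # steps 134-135, 1316: enter a new template
        template_file[name] = list(lines)       # steps 1317-1319: store it
        return name
    name = pick_name(sorted(template_file))     # steps 137-139: choose from the menu
    if want_modify():                           # step 1312: "Modify the template?"
        # Steps 1314-1319: edit the chosen template and store it again
        # (assumed: under the same name).
        template_file[name] = list(edit_lines(template_file[name]))
    return name                                 # processing continues at step 94
```

In every branch the caller ends up with a template name whose contents can be passed to the analysis routine, which mirrors how all paths in FIG. 13 rejoin at step 94.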

According to the second embodiment described above, once a standard logical structure template has been registered in advance, a logical structure can be extracted merely by modifying that template. For documents with an unusual logical structure, a new template can also be registered easily. The operator's work is thus reduced, and a wide variety of documents can be handled flexibly.

[Effect of the invention]

As is clear from the above description of the embodiments, according to the present invention the logical structure of a document is registered in advance by a simple method, and the correspondence between text units in the document and the registered logical element units can be established interactively. The invention thus provides a means for easily extracting the logical structure of documents in various formats.
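The two stages summarized above, registering a hierarchy of logical elements and then interactively associating document text with each element, can be illustrated with a short sketch. The indentation-based template format and the names `parse_template`, `find`, and `assign` are assumptions made purely for illustration; the template specification actually used by the invention is the one defined in the embodiments.

```python
def parse_template(text):
    """Parse an indentation-based template (hypothetical format) into a
    tree of nodes, standing in for the logical structure memory."""
    root = {"name": None, "children": [], "content": None}
    stack = [(-1, root)]  # (indent depth, node) pairs for open ancestors
    for line in text.splitlines():
        if not line.strip():
            continue
        depth = len(line) - len(line.lstrip(" "))
        node = {"name": line.strip(), "children": [], "content": None}
        # Close ancestors that are at the same depth or deeper.
        while stack[-1][0] >= depth:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((depth, node))
    return root["children"]


def find(nodes, name):
    """Depth-first search for the first logical element called `name`."""
    for node in nodes:
        if node["name"] == name:
            return node
        hit = find(node["children"], name)
        if hit is not None:
            return hit
    return None


def assign(nodes, name, string):
    """Record the character string the operator pointed at for `name`."""
    node = find(nodes, name)
    if node is None:
        raise KeyError(name)
    node["content"] = string


template = "document\n  title\n  chapter\n    section"
structure = parse_template(template)
assign(structure, "title", "Annual Report")
assign(structure, "section", "1.1 Background")
```

Here each call to `assign` plays the role of one operator action: choosing a logical element name in the menu window and pointing at the matching character string in the document window.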

[Brief explanation of the drawings]

FIG. 1 is a block diagram showing the overall configuration of a multi-window system according to the present invention.
FIG. 2 is a diagram showing the relationship between the window display areas on the virtual screens and the windows on the real display screen.
FIG. 3 is an explanatory diagram of the window management table.
FIG. 4 is an explanatory diagram showing the configuration of the virtual screen memory.
FIG. 5 is an explanatory diagram of the logical structure data table.
FIG. 6 is an explanatory diagram showing an example of a logical structure with identifiers attached.
FIG. 7 is a flowchart of the program executed for template registration.
FIG. 8 is a flowchart of the program executed for extracting the logical structure.
FIG. 9 is a flowchart detailing the logical structure template analysis routine in FIG. 8.
FIG. 10 is a plan view of an example of the keyboard used in this system.
FIG. 11 is a diagram showing the display screen during template registration.
FIG. 12 is a diagram showing the display screen during logical structure extraction.
FIG. 13 is a flowchart of the program executed in the second embodiment of the present invention.
FIG. 14 is [description not legible in the source].

Claims

1. A document logical structure extraction method for a document editing device that has input means and uses a multi-window system in which a plurality of windows are set on a display screen, the method comprising: a first step of displaying, in a first window set on the display screen, the logical structure of a document input by an operator according to a description specification that expresses the hierarchical relationship of logical element identification names, and storing the same input data in a logical structure file; and a second step of reading the data stored in the first step from the logical structure file, storing the hierarchical structure of logical element identification names in a logical structure memory based on the specification used when the data was input in the first step, displaying the logical element identification names in the first window, displaying the document in a second window, and generating a logical structure by storing, in the logical element identification name areas of the hierarchical structure in the logical structure memory, the results of the operator's instructions associating each logical element identification name in the first window with a character string for each logical element of the document in the second window.

2. The document logical structure extraction method according to claim 1, further comprising a third step of displaying the document in a first window set on the display screen, reading the data stored in the first step from the logical structure file, displaying the logical element name data in a second window, modifying the data in the logical structure file based on the operator's instructions to modify the logical element name data in the second window, and, when the modification is completed, reading the data from the logical structure file and storing the hierarchical structure of logical element identification names in the logical structure memory based on the specification.