JPH1115852A

JPH1115852A - Image database center device, image database registration / search method, and recording medium

Info

Publication number: JPH1115852A
Application number: JP9171377A
Authority: JP
Inventors: Masahiro Asakawa; 雅洋浅川; Takafumi Hagiwara; 貴文萩原; Teruaki Shibata; 輝昭柴田; Takahiro Oohora; 崇裕大洞
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: NTT Inc
Priority date: 1997-06-27
Filing date: 1997-06-27
Publication date: 1999-01-22

Abstract

(57) [Abstract]

[Problem] To provide an image database center device that is versatile, does not restrict who may use it, and supports efficient registration and search, together with a registration/search method for that device.

[Solution] The image database 12 can be accessed from a sheet that can be transmitted by fax terminals 16 and 17 or a personal computer 18, or from a sheet that can be read by a scanner 11. Keyword character strings and image data may be written together on the same page of the same sheet, in a free format that does not restrict where each is written. The keyword character strings are recognized by character recognition, and registration in and search of the image database 12 are performed on the basis of the keywords. In addition, by using separate dictionaries for kana and for digits, applying dakuten/handakuten processing to characters that carry a dakuten (voiced sound mark) or handakuten (semi-voiced sound mark), and applying separated-character processing to characters that separate into parts, a high character recognition rate and hit rate are achieved for free-format character strings that mix kana and digits.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to an image database center device and an image database registration/search method, and more particularly to an image database center device that is accessed remotely from a fax terminal, searches its image database for the desired image data, and transmits that data, and to an image database registration/search method provided for the device.

[0002]

[Prior Art] A conventional system of this kind is the facsimile guidance service system. To register image data in this service, the user dials a registration service number and the BOX number of the registration destination from a fax terminal and then transmits the image data. To retrieve registered image data, the user dials a search service number and a BOX number, and the service center retrieves the image data corresponding to the request from the database and transmits it to the requester. This system has several drawbacks: looking up the BOX number takes time and effort; because the image data registered under a BOX number is not evident, a menu screen must first be searched to find out what is registered there; the telephone number of the registrant, which is needed for the search, is not evident; and because the registration and search operations (dialing and so on) are hard to remember, a manual is required.

[0003] In contrast, there are systems in which no service number or BOX number is dialed. Instead, the mark sheet transmitted from the fax terminal is restricted to a mark sheet dedicated to each database center, and the database is accessed by recognizing the marks and characters written on that dedicated mark sheet.

[0004] In a system that builds a database by character recognition, characters must be written on the dedicated mark sheet and, as a further constraint, must be written in a prescribed size inside predetermined frames at predetermined positions. The recognition rate was kept high by requiring the dedicated mark sheet and by imposing these constraints on the user.

[0005] One character recognition technique used in such systems simply concatenates the individual candidate characters returned as recognition results to generate a candidate character string. Database systems that use the generated candidate character string as a keyword also exist, but they generally generate only one candidate character string per item of image data. Some systems check the generated candidate character strings against a word list and keep only those that match as candidate character strings.

[0006]

[Problems to Be Solved by the Invention] However, the conventional database systems based on character recognition described above require a dedicated mark sheet and impose constraints on the user. As a result they lack versatility, and the users who can access the database are restricted, which is a disadvantage.

[0007] The conventional character recognition technology used in these systems also has problems. Recognition results are poor: katakana may be misrecognized as digits, and digits as katakana. Matching is difficult because only one candidate character string is generated at registration and one at search. And when a dakuten or handakuten is missed, or when segmentation of a character that separates into parts fails, the candidate character strings differ in length, which again makes matching difficult.

[0008] An object of the present invention is to solve the problems of conventional image database systems, namely the need for a mark sheet dedicated to each database center, the restrictions on access to the image data stored in the database, the low character recognition accuracy, and the difficulty of matching, and to provide an image database center device, and an image database registration/search method for it, that is versatile, does not restrict who may use it, recognizes characters with high accuracy, matches readily at search time, and can therefore efficiently transmit the image data that the user wants.

[0009]

[Means for Solving the Problems] The present invention solves the above problems by means (1) to (12) below.

[0010] (1) An image database center device that exchanges image data with communication media and registers and searches image data, comprising: first means for receiving image data from a registration requester, either from a communication medium via a communication network or from an image scanner; second means for extracting candidate character strings from the image data received from the registration requester and generating one or more high-accuracy candidate character strings that may contain uncertain characters; third means for registering the candidate character strings generated by the second means in an image database in association with the image data from which they were extracted; fourth means for receiving image data from a search requester from a communication medium via the communication network; fifth means for extracting candidate character strings from the image data received from the search requester and generating one or more high-accuracy candidate character strings that may contain uncertain characters; sixth means for computing the match rate between the candidate character strings generated by the fifth means and the candidate character strings registered in the image database, and retrieving from the image database the image data corresponding to each candidate character string judged to match by comparing that match rate with a preset match rate; and seventh means for transmitting the retrieved image data to the communication medium of the search requester.

[0011] (2) The image database center device, wherein the second and fifth means each comprise character recognition means, an absolute reference value determination unit, a fixed reference value determination unit, and a relative reference value determination unit, and generate candidate character strings on the basis of the determination results of those units. The character recognition means ranks a plurality of candidate characters as the character recognition result and returns the candidate characters and their corresponding distance values to the absolute, fixed, and relative reference value determination units. The absolute reference value determination unit compares the distance value of each ranked candidate character with a preset absolute reference value and determines, according to which is larger, whether that candidate character is to be used in combining candidate character strings. The fixed reference value determination unit takes the difference between the distance values of a higher-ranked and a lower-ranked candidate character, compares the difference with a preset fixed reference value, and determines, according to which is larger, whether only the higher-ranked candidate character or both candidate characters are to be used in combining candidate character strings. The relative reference value determination unit takes the same difference, compares it with a preset relative reference value, and determines, according to which is larger, whether the higher- and lower-ranked candidate characters are to be replaced by an uncertain character for combination or are each to be used in combining candidate character strings.

[0012] (3) The image database center device, wherein, when comparing the one or more candidate character strings generated from the image data received from the registration requester with the one or more candidate character strings generated from the image data received from the search requester, the sixth means computes a match rate for each pairing of an individual character string generated at registration with an individual character string generated at search, and retrieves from the image database the image data corresponding to any candidate character strings containing a character string judged to match by comparing its match rate with the preset match rate.

[0013] (4) The image database center device, wherein the relative reference value determination unit comprises: means for replacing a candidate character with an uncertain character; and means for searching a dictionary database for character strings on the basis of a candidate character string containing the uncertain character, and additionally generating the matching character strings from the dictionary database as candidate character strings.

[0014] (5) The image database center device, wherein the second and fifth means comprise: means for additionally generating, for a candidate character string containing a dakuten or handakuten, a candidate character string with the dakuten or handakuten removed; and means for additionally generating, for a candidate character string containing a separated character, a candidate character string in which that character is replaced by the two characters into which it separates.

[0015] (6) An image database registration/search method for an image database center that exchanges image data with communication media and registers and searches image data, comprising: a first procedure of receiving image data from a registration requester, either from a communication medium via a communication network or from an image scanner; a second procedure of extracting candidate character strings from the image data received from the registration requester by means of character recognition software and generating one or more high-accuracy candidate character strings that may contain uncertain characters; a third procedure of registering the candidate character strings generated in the second procedure in an image database in association with the image data from which they were extracted; a fourth procedure of receiving image data from the communication medium of a search requester via the communication network; a fifth procedure of extracting candidate character strings from the image data received from the search requester by means of character recognition software and generating one or more high-accuracy candidate character strings that may contain uncertain characters; a sixth procedure of computing the match rate between the candidate character strings generated in the fifth procedure and the candidate character strings registered in the image database, and retrieving from the image database the image data corresponding to each candidate character string judged to match by comparing that match rate with a preset match rate; and a seventh procedure of transmitting the retrieved image data to the communication medium of the search requester.

[0016] (7) The image database registration/search method, wherein, when recognizing a character string in which katakana and digits are mixed, the character recognition software in the second and fifth procedures relies on a delimiter character placed between the katakana and the digits in that string, treats the portions before and after the delimiter separately as katakana and digits, and performs character recognition using a katakana-only dictionary for the katakana portion and a digits-only dictionary for the digit portion.

[0017] (8) The image database registration/search method, wherein the second and fifth procedures comprise: a procedure of ranking a plurality of candidate characters as the character recognition result of the character recognition software and returning the candidate characters and their corresponding distance values; an absolute reference value determination procedure of comparing the distance value of each ranked candidate character with a preset absolute reference value and determining, according to which is larger, whether that candidate character is to be used in combining candidate character strings; a fixed reference value determination procedure of taking the difference between the distance values of a higher-ranked and a lower-ranked candidate character, comparing the difference with a preset fixed reference value, and determining, according to which is larger, whether only the higher-ranked candidate character or both candidate characters are to be used in combining candidate character strings; and a relative reference value determination procedure of taking the same difference, comparing it with a preset relative reference value, and determining, according to which is larger, whether the higher- and lower-ranked candidate characters are to be replaced by an uncertain character for combination or are each to be used in combining candidate character strings; and wherein the candidate characters are combined on the basis of these determinations to generate the candidate character strings.

[0018] (9) The image database registration/search method, wherein, in the sixth procedure, when comparing the one or more candidate character strings generated from the image data received from the registration requester with the one or more candidate character strings generated from the image data received from the search requester, a match rate is computed for each pairing of an individual character string in the candidate group generated at registration with an individual character string in the candidate group generated at search, and the image data corresponding to any candidate character strings containing a character string judged to match, by comparing its match rate with the preset match rate, is retrieved from the image database.

[0019] (10) The image database registration/search method, wherein the relative reference value determination procedure comprises: a procedure of replacing a candidate character with an uncertain character; and a procedure of searching a dictionary database for character strings on the basis of a candidate character string containing the uncertain character, and additionally generating the matching character strings from the dictionary database as candidate character strings.

[0020] (11) The image database registration/search method, wherein the second and fifth procedures comprise: a procedure of additionally generating, for a candidate character string containing a dakuten or handakuten, a candidate character string with the dakuten or handakuten removed; and a procedure of additionally generating, for a candidate character string containing a separated character, a candidate character string in which that character is replaced by the two characters into which it separates.

[0021] (12) A recording medium on which a program that causes a computer to execute the procedures of the image database registration/search method of any one of (6) to (11) above is recorded in a form readable by the computer.
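The three reference value determinations, the combination of candidate characters, and the dictionary lookup for uncertain characters described in the means above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation disclosed in the patent: the text says each decision depends on "which is larger" without fixing the direction, so the sketch assumes that a smaller distance value means a better match, that candidates above the absolute reference value are dropped, that a gap at or above the fixed reference value confirms the top candidate, and that a gap below the relative reference value makes the top two indistinguishable. The names `Thresholds`, `select_candidates`, `combine`, `expand_with_dictionary`, and the `?` placeholder are hypothetical.

```python
from dataclasses import dataclass
from itertools import product
import re

UNCERTAIN = "?"  # hypothetical placeholder standing in for an uncertain character


@dataclass
class Thresholds:
    absolute: float  # assumed: candidates whose distance exceeds this are dropped
    fixed: float     # assumed: gap >= fixed confirms the top candidate alone
    relative: float  # assumed: gap < relative means the top two cannot be told apart


def select_candidates(ranked, th):
    """Choose the candidate characters for one character position.

    ranked: list of (char, distance) pairs sorted by ascending distance.
    """
    kept = [(c, d) for c, d in ranked if d <= th.absolute]
    if not kept:
        return [UNCERTAIN]  # assumption: no reliable candidate, mark as uncertain
    if len(kept) == 1:
        return [kept[0][0]]
    (top, d_top), (second, d_second) = kept[0], kept[1]
    gap = d_second - d_top
    if gap >= th.fixed:
        return [top]            # top candidate is clearly better
    if gap < th.relative:
        return [UNCERTAIN]      # too close to call: replace with uncertain character
    return [top, second]        # keep both as combination targets


def combine(per_position):
    """Build every candidate string from the per-position candidate lists."""
    return ["".join(chars) for chars in product(*per_position)]


def expand_with_dictionary(candidate, dictionary):
    """Return dictionary words that fit a candidate containing uncertain characters."""
    if UNCERTAIN not in candidate:
        return []
    pattern = re.compile(
        "".join("." if ch == UNCERTAIN else re.escape(ch) for ch in candidate)
    )
    return [word for word in dictionary if pattern.fullmatch(word)]
```

With hypothetical thresholds `Thresholds(absolute=100, fixed=30, relative=5)`, a position whose top two distances are 10 and 80 keeps only the top character, one with 20 and 35 keeps both, and one with 40 and 42 becomes an uncertain character. The resulting strings can then be passed to `expand_with_dictionary` to add full dictionary words as extra candidates.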

[0022] The present invention is characterized in that the image database can be accessed from a sheet (access sheet) that can be transmitted by a fax terminal or personal computer, or from a sheet that can be read by a scanner device; that a keyword character string and image data may be written together on the same page of the same sheet, in a free format that does not restrict the areas in which they are written; and that a plurality of candidate character strings can be generated for the keyword, with registration in and search of the image database performed on the basis of those candidate character strings. This yields a system that is versatile and does not restrict who may use it, and makes matching easier when the image database is accessed. The prior art required a mark sheet dedicated to each database center and fixed the areas in which the keyword character string and the image data were written. The present invention differs from it in the format in which the access sheet (mark sheet or the like) is filled in, and in that a plurality of candidate character strings can be generated both at registration and at search.
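How several candidate character strings on each side make matching easier can be illustrated with a small sketch of the match-rate comparison named in means (3) and (9). The patent text up to this point does not define the match-rate formula, so the position-by-position rate below, the treatment of an uncertain character as matching anything, the `?` placeholder, and the 0.8 threshold are all assumptions made for illustration.

```python
UNCERTAIN = "?"  # hypothetical placeholder for an uncertain character


def match_rate(a, b):
    """Fraction of positions that agree; strings of different length score 0.

    Assumption: an uncertain character on either side counts as agreement.
    """
    if len(a) != len(b) or not a:
        return 0.0
    hits = sum(1 for x, y in zip(a, b) if x == y or UNCERTAIN in (x, y))
    return hits / len(a)


def is_hit(registered, queried, threshold=0.8):
    """True if any registered/queried pair reaches the preset match rate."""
    return any(
        match_rate(r, q) >= threshold for r in registered for q in queried
    )
```

Because every pairing is tried, a single good pair is enough for a hit: `is_hit(["ハウイ", "ハワイ"], ["ハワイ"])` succeeds through the second registered string even though the first one falls below the threshold.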

[0023] The present invention is further characterized by using separate dictionaries for katakana and for digits, by applying dakuten/handakuten processing to characters that carry a dakuten or handakuten, and by applying separated-character processing to characters that separate into parts. This avoids mistaking katakana for digits and digits for katakana and raises the character recognition rate, so that free-format character strings mixing katakana and digits, which demand more capable character recognition than the prior art, can be recognized with a high recognition rate. It also compensates for a missed dakuten, a missed handakuten, or a failed segmentation of a separated character, so that the candidate character strings do not differ in length. The principal feature is the resulting improvement in the search hit rate.
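The dakuten/handakuten processing and the separated-character processing can be sketched as the generation of extra candidate strings. This is a minimal illustration, not the patented procedure: it assumes that removing the marks can be modeled with Unicode canonical decomposition, and the table of characters that separate into two parts (`SPLIT_TABLE`) is purely hypothetical, since the text up to this point does not list which characters are treated that way.

```python
import unicodedata

# Combining dakuten (U+3099) and combining handakuten (U+309A).
_VOICING_MARKS = {"\u3099", "\u309a"}

# Hypothetical table: a character and the two characters it may be segmented into.
# A real system would hold many more entries.
SPLIT_TABLE = {"ル": "ノレ"}


def strip_voicing(text):
    """Remove dakuten and handakuten, e.g. a voiced kana becomes its plain form."""
    decomposed = unicodedata.normalize("NFD", text)
    plain = "".join(ch for ch in decomposed if ch not in _VOICING_MARKS)
    return unicodedata.normalize("NFC", plain)


def extra_candidates(candidate):
    """Additional candidate strings generated for one candidate string."""
    extras = []
    stripped = strip_voicing(candidate)
    if stripped != candidate:
        extras.append(stripped)  # variant with the marks dropped
    for i, ch in enumerate(candidate):
        if ch in SPLIT_TABLE:
            # variant in which the character is replaced by its two parts
            extras.append(candidate[:i] + SPLIT_TABLE[ch] + candidate[i + 1:])
    return extras
```

Registering these variants alongside the original string means that a query whose recognition missed a mark, or split one character into two, can still find a registered string of the same length.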

[0024] With the prior-art means of accessing a database, the mark sheet needed for access is not general-purpose, and the users of the database are therefore restricted. The means of the present invention, which uses a free-format access sheet, relies on character recognition to classify the sheet contents into keyword character strings and image data, so a keyword character string and image data can be mixed on the same access sheet. This achieves the object of the invention: the image data that the user wants can be transmitted efficiently.

[0025] Compared with the conventional facsimile guidance service system, the present invention has the following advantages: no BOX number is needed, which saves time and effort; image data can be registered under a title name instead of a BOX number, so what is registered is evident and no menu screen needs to be searched; the registrant's telephone number is not needed for search, which is more efficient; and registration and search operations (dialing and so on) require only a single telephone number, so no manual is needed.

[0026]

[Embodiments of the Invention] Embodiments of the present invention are described in detail below with reference to the drawings.

[0027] FIG. 1 is a system configuration diagram that includes one embodiment of the image database center device of the present invention. The image database center device 10 comprises a scanner 11, an image database 12, an image database center main unit 13, and a line board 14. The line board 14 is connected to a communication network 15, through which the communication media, namely fax terminals 16 and 17 and a personal computer (hereinafter, PC) 18, can access the image database center device 10. Fax terminal 16 is an example of a fax terminal that registers data in the image database 12, and fax terminal 17 is an example of a fax terminal that searches the image database 12.

[0028] FIG. 2 is a functional block diagram of the image database center device 10. The image database center device 10 has an image data transmitting/receiving unit that exchanges image data with the communication media 16, 17, and 18 via the communication network 15; an image data receiving unit that receives image data input from the scanner 11; character recognition software that uses a kana dictionary and a digit dictionary; a delimiter deletion processing block; an absolute reference value determination unit; a fixed reference value determination unit; a relative reference value determination unit; a candidate character string combination processing block; a dakuten/handakuten processing block; a separated character processing block; a dictionary DB (database) comparison processing block; a dictionary DB; and an image data registration/search comparison block. The function and operation of each are explained in the description of the processing below.
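The interplay of the kana dictionary, the digit dictionary, and the delimiter deletion processing block can be pictured with the following sketch. It is a simplification under stated assumptions: the real device would apply separate recognition dictionaries, whereas this sketch only filters per-position candidate lists by character class; the delimiter symbol `/`, the ordering (katakana before the delimiter, digits after it), and the function names are hypothetical.

```python
DELIMITER = "/"  # hypothetical delimiter symbol written between katakana and digits


def is_katakana(ch):
    """True for full-width katakana letters and the long-vowel mark."""
    return "\u30a1" <= ch <= "\u30fa" or ch == "\u30fc"


def restrict_by_segment(positions):
    """Keep katakana candidates before the delimiter and digit candidates after it.

    positions: one candidate list per character position; the delimiter position
    is assumed to have been recognized unambiguously as [DELIMITER].
    The delimiter itself is removed from the output.
    """
    restricted = []
    in_digit_part = False
    for candidates in positions:
        if candidates == [DELIMITER]:
            in_digit_part = True  # switch from the katakana part to the digit part
            continue              # delimiter deletion
        allowed = str.isdigit if in_digit_part else is_katakana
        restricted.append([c for c in candidates if allowed(c)])
    return restricted
```

For example, a position in the katakana part whose candidates are `イ` and `1` keeps only `イ`, while the same pair in the digit part keeps only `1`, which is the kind of katakana/digit confusion the two dictionaries are meant to avoid.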

【００２９】図３は、上記画像データベースセンタ装置
によるサービスイメージを示した図である。画像データ
として、例えばハワイに関する旅行情報、現地詳細情報
等を登録／検索する場合のサービスイメージを説明す
る。旅行情報、現地詳細情報等の情報提供者は、提供す
る情報をフリーフォーマットのデータベースアクセスシ
ートに記載して、ファックス端末１６から電話回線を介
してサービスセンタに備えた画像データベースセンタ装
置１０に入力するか、あるいは、サービスセンタに備え
たスキャナ１１を用いて画像データベースセンタ装置１
０に入力する。画像データベースセンタ装置１０は、入
力された提供情報を文字認識してキーワードの候補を１
ないし複数抽出し、抽出元の画像データに対応させて画
像データベースに登録する。一方、ハワイに関する旅行
情報、現地詳細情報等を入手したい一般市民は、ファッ
クス端末１７からキーワードをフリーフォーマットのデ
ータベースアクセスシートに記載して、電話回線を介し
て画像データベースセンタ装置１０にアクセスする。あ
るいは、インターネット等を通してパソコンから画像デ
ータベースセンタ装置１０にアクセスする。画像データ
ベースセンタ装置１０は、入力されたデータベースアク
セスシートから文字認識によりキーワードの候補を１な
いし複数抽出し、このキーワードの候補で画像データベ
ースを検索し、画像データベース内に登録されているキ
ーワードの候補との一致率により該当するハワイに関す
る旅行情報、現地詳細情報等を電話回線を介してアクセ
スしたファックス端末１７にファックス出力する。ある
いは、インターネットを通してアクセスしたパソコンに
出力する。FIG. 3 illustrates the service provided by the image database center device, using as an example the registration and retrieval of travel information and detailed local information about Hawaii as image data. A provider of such information writes it on a free-format database access sheet and either sends the sheet from fax terminal 16 over a telephone line to the image database center device 10 installed at the service center, or feeds it into the image database center device 10 through the scanner 11 installed at the service center. The image database center device 10 performs character recognition on the submitted information, extracts one or more keyword candidates, and registers them in the image database in association with the image data they were extracted from. Conversely, a member of the public who wants travel information or detailed local information about Hawaii writes keywords on a free-format database access sheet and accesses the image database center device 10 from fax terminal 17 over a telephone line, or accesses it from a personal computer over the Internet or a similar network. The image database center device 10 extracts one or more keyword candidates from the received database access sheet by character recognition, searches the image database with them, and, according to the match rate against the keyword candidates registered in the image database, faxes the corresponding travel and local information about Hawaii to the fax terminal 17 that called in over the telephone line, or outputs it to the personal computer that connected over the Internet.

【００３０】図４に、データベースアクセスシートの記
載例を示す。２０は用紙、２１，２３はキーワードデー
タ記載例、２２はキーワードを抽出するための黒線（二
重線）、２４はイメージデータ記載可能領域を示す。図
４（ａ），（ｂ）はキーワード２１を二重線２２で示し
た記載例であり、図４（ｃ）はキーワードを抽出するイ
メージデータ記載可能領域２４にキーワード２３を記載
した例である。FIG. 4 shows examples of entries on a database access sheet. Reference numeral 20 denotes the sheet, 21 and 23 denote example keyword data entries, 22 denotes the black line (double line) used to extract a keyword, and 24 denotes the area in which image data can be written. FIGS. 4(a) and 4(b) are examples in which keyword 21 is marked with the double line 22, and FIG. 4(c) is an example in which keyword 23 is written in the image-data writable area 24 from which keywords are extracted.

【００３１】図５に、上記データベースアクセスシート
に記載が可能な文字種の例として、（ａ）数字、
（ｂ），（ｃ）カタカナ文字を示す。FIG. 5 shows examples of the character types that can be written on the database access sheet: (a) numerals, and (b), (c) katakana characters.

【００３２】図６から図１２までは、上記構成の画像デ
ータベースセンタ装置におけるプログラム処理の一実施
形態例を示すフロー図である。以下、該プログラムの処
理の実施形態例を説明する。FIGS. 6 to 12 are flowcharts showing an embodiment of the program processing in the image database center apparatus having the above-mentioned configuration. Hereinafter, an embodiment of the processing of the program will be described.

【００３３】［実施形態例１］本実施形態例では、画像
データベース登録までの基本的な処理例を説明する（処
理フロー１０１〜１５２）。[Embodiment 1] In this embodiment, a basic processing example up to registration of an image database will be described (processing flows 101 to 152).

【００３４】以下の実施形態例では、登録に用いるメデ
ィアをファックス端末とし、予めシステム内に設定した
認識文字種別選定を「カナのみ」モードに設定している
事を例に説明する。また、ファックス端末より登録する
キーワードデータの文字列を「マルチメテ゛イア」を例
とする。In the following embodiment, the medium used for registration is a fax terminal, and the recognition character type selection preset in the system is set to the "kana only" mode. The keyword character string registered from the fax terminal is taken to be "マルチメテ゛イア" ("multimedia").

【００３５】１０１：画像データベース登録に用いるメ
ディアによって分岐する。本実施形態例では、ファック
ス端末（図１参照）を用いるので、この場合は１０２の
ＦＡＸ送受信ソフトで処理される。なお、メディアとし
てスキャナを用いる場合は、２０１の外部入出力機能、
２０２のスキャナ登録、２０３のユーティリティソフト
により、スキャナから入力された画像データを受信す
る。101: Processing branches according to the medium used for image database registration. This embodiment uses a fax terminal (see FIG. 1), so the data is handled by the FAX transmission/reception software of 102. When a scanner is used as the medium, the image data input from the scanner is received by the external input/output function of 201, the scanner registration of 202, and the utility software of 203.

【００３６】１０２：ＦＡＸ送受信ソフトではファック
ス端末、または、携帯情報端末または、パソコンより送
信されてきた画像データを受信する。本実施形態例で
は、ファックス端末より送信されてきた画像データを受
信する。102: FAX transmission / reception software receives image data transmitted from a facsimile terminal, a portable information terminal, or a personal computer. In the present embodiment, the image data transmitted from the fax terminal is received.

【００３７】１０３：１０２のＦＡＸ送受信ソフトまた
は２０３のユーティリティソフトで受信した画像データ
よりｘｂｍ（ｘビットマップ）ファイルを作成する。At step 103, an xbm (x bitmap) file is created from the image data received by the fax transmission / reception software at 102 or the utility software at 203.

【００３８】１０４：ＧＵＩ（グラフィカルユーザー
インターフェース）で設定したシステム設定（図１
３、図１４のシステム設定画面例（図１４は図１３の続
き画面）参照）の状態を参照し、システム設定状態によ
って以下の１０５及び１０６で分岐する。104: The system settings configured through the GUI (graphical user interface) are consulted (see the example system setting screens in FIGS. 13 and 14; FIG. 14 is the continuation of FIG. 13), and processing branches at 105 and 106 below according to those settings.

【００３９】１０５：区切り文字「＃」が有るか無いか
のモードによって分岐する。１０４のシステム設定にお
いて、認識文字種別選定（図１３参照）で「カナのみ」
または「数字のみ」または「カナ＋数字」または「数字
＋カナ」を選択している場合は、１０６のカナ文字有無
モードで処理される。また、認識文字種別選定で「カナ
＋（＃）＋数字」または「数字＋（＃）＋カナ」を選択
している場合は、１０７−１のキーワードデータ抽出と
イメージデータの分離で処理される。本実施形態例で
は、「カナのみ」モードを設定しており、「＃」が無い
モードなので、この場合は１０６のカナ文字有無モード
で処理される。105: Processing branches according to whether the mode uses the delimiter "#". If "kana only", "digits only", "kana + digits", or "digits + kana" is selected in the recognition character type selection (see FIG. 13) in the system settings of 104, processing continues with the kana presence mode check of 106. If "kana + (#) + digits" or "digits + (#) + kana" is selected, processing continues with the keyword data extraction and image data separation of 107-1. This embodiment uses the "kana only" mode, which has no "#", so processing continues with the kana presence mode check of 106.

【００４０】１０６：カナ文字が有るか無いかのモード
によって分岐する。１０４のシステム設定において、認
識文字種別選定（図１３参照）で「カナのみ」または
「カナ＋数字」または「数字＋カナ」を選択している場
合は１０７−１のキーワードデータ抽出とイメージデー
タの分離で処理される。本実施形態例では「カナのみ」
モードを設定しているので、この場合は１０７−１のキ
ーワードデータ抽出とイメージデータの分離で処理され
る。106: Processing branches according to whether the mode includes kana characters. If "kana only", "kana + digits", or "digits + kana" is selected in the recognition character type selection (see FIG. 13) in the system settings of 104, processing continues with the keyword data extraction and image data separation of 107-1. This embodiment uses the "kana only" mode, so processing continues at 107-1.
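The branching at steps 105 and 106 can be sketched as a small routing function. This is only an illustration: the English mode labels are renderings of the settings named above, and the function name `next_step` and the returned labels are hypothetical. The destination for the digits-only setting at step 106 is not described in this passage, so the sketch returns a placeholder for it.

```python
# English renderings of the recognition character type settings (step 104).
NO_DELIMITER_MODES = {"kana only", "digits only", "kana + digits", "digits + kana"}
DELIMITER_MODES = {"kana + (#) + digits", "digits + (#) + kana"}
KANA_MODES = {"kana only", "kana + digits", "digits + kana"}


def next_step(mode: str) -> str:
    """Return the step that handles `mode` after the checks at 105 and 106."""
    # Step 105: modes that use the "#" delimiter go straight to 107-1.
    if mode in DELIMITER_MODES:
        return "107-1"
    if mode not in NO_DELIMITER_MODES:
        raise ValueError(f"unknown recognition mode: {mode!r}")
    # Step 106: modes that contain kana also continue at 107-1.
    if mode in KANA_MODES:
        return "107-1"
    # "digits only": the target step is not given in this passage.
    return "not described here"
```

In the embodiment's "kana only" setting, both checks lead to 107-1.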

【００４１】１０７−１：キーワードデータ抽出とイメ
ージデータの分離を行う。まず、画像データよりキーワ
ードデータのみを抽出する。抽出したキーワードデータ
のみが１０７−２の文字認識ＡＰで処理される。また、
抽出したキーワードデータ以外の画像データをイメージ
データと判断し、そのイメージデータは１０７−２の文
字認識ＡＰでは処理をせずに、直接キーワードデータ対
応に画像データベースへ登録する。107-1: Keyword data is extracted and separated from the image data. First, only the keyword data is extracted from the image data, and only this extracted keyword data is passed to the character recognition AP of 107-2. Everything in the image data other than the extracted keyword data is treated as image data; it bypasses the character recognition AP of 107-2 and is registered directly in the image database in association with the keyword data.

【００４２】ここで、キーワードデータ抽出とイメージ
データの分離についての具体的な方法について説明す
る。Here, a specific method for extracting keyword data and separating image data will be described.

【００４３】まず、データベースアクセスシートとして
使用できる用紙は特定せず、フリーフォーマットとす
る。用紙にはターゲットとするキーワードデータとイメ
ージデータを同一頁内に混在して記載する事が可能であ
る。記載方法としては、（１）図１５（ａ）に例示する
ようにキーワードデータの下部に二重アンダーラインを
引く方法と、（２）図１５（ｂ）例示するように、キー
ワードデータを記載する場所（範囲）を指定する方法の
二通りがある。First, no particular paper is required for the database access sheet; the format is free. Target keyword data and image data can be written together on the same page. There are two ways of marking the keyword: (1) drawing a double underline beneath the keyword data, as illustrated in FIG. 15(a), and (2) writing the keyword data in a designated location (area), as illustrated in FIG. 15(b).

【００４４】次に、上記用紙に記載されたターゲットと
するキーワードデータを、以下の二通りの抽出方法、す
なわち（１）二重アンダーライン抽出方法、（２）記入
場所指定抽出方法のいずれかにより抽出する。Next, the target keyword data written on the sheet is extracted by one of two methods: (1) the double underline extraction method or (2) the entry location designation extraction method.

【００４５】（１）二重アンダーライン抽出方法とは、
図１６（ａ），（ｂ），（ｃ）に例示するように、ター
ゲットとするキーワードデータの下部に引かれた二重ア
ンダーラインを元にキーワードデータを抽出する方法で
ある。具体的には、一頁内にある二重アンダーラインを
探し出し、その二重アンダーラインの上部に記載された
文字列をターゲットとするキーワードデータとして抽出
し、その文字列を文字認識する。文字認識の結果につい
ては、候補文字列として画像データベースへキーワード
データとして登録する。また、ターゲットとするキーワ
ードデータ以外の画像データをイメージデータとして判
断し、キーワードデータとリンクさせて画像データベー
スへ登録する。(1) The double underline extraction method extracts keyword data on the basis of a double underline drawn beneath the target keyword data, as illustrated in FIGS. 16(a), 16(b), and 16(c). Specifically, the page is searched for a double underline, the character string written above it is extracted as the target keyword data, and character recognition is performed on that string. The recognition results are registered in the image database as candidate character strings serving as keyword data. The image data other than the target keyword data is treated as image data, linked to the keyword data, and registered in the image database.

【００４６】（２）記入場所指定抽出方法とは、図１６
（ｃ）に例示するように、ある決められた範囲の中だけ
でターゲットとするキーワードデータを抽出する方法で
ある。具体的には、一頁内にある、決められた広さの範
囲の中だけを対象に、その範囲の中に記載された文字列
をターゲットとするキーワードデータとして抽出し、そ
の文字列を文字認識する。文字認識の結果については、
候補文字列として画像データベースへキーワードデータ
として登録する。また、ターゲットとするキーワードデ
ータ以外の画像データをイメージデータとして判断し、
キーワードデータとリンクさせて画像データベースへ登
録する。(2) The entry location designation extraction method extracts the target keyword data only from within a predetermined area, as illustrated in FIG. 16(c). Specifically, only an area of predetermined size on the page is examined; the character string written inside it is extracted as the target keyword data, and character recognition is performed on that string. The recognition results are registered in the image database as candidate character strings serving as keyword data. The image data other than the target keyword data is treated as image data, linked to the keyword data, and registered in the image database.
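The two extraction methods can be sketched on a binary bitmap as follows. This is a rough illustration under stated assumptions, not the patented implementation: the page is modelled as rows of 0/1 pixels, an underline is detected as a long horizontal run of black pixels, and the names `longest_run`, `find_double_underline`, `keyword_rows_above`, and `crop_region` as well as the thresholds `min_run` and `max_gap` are all hypothetical.

```python
from typing import List, Optional, Tuple

Bitmap = List[List[int]]  # 1 = black pixel, 0 = white pixel


def longest_run(row: List[int]) -> int:
    """Length of the longest horizontal run of black pixels in a row."""
    best = current = 0
    for pixel in row:
        current = current + 1 if pixel else 0
        best = max(best, current)
    return best


def find_double_underline(page: Bitmap, min_run: int = 50,
                          max_gap: int = 3) -> Optional[Tuple[int, int]]:
    """Method (1): return (top_row, bottom_row) of the first double underline.

    A row belongs to a line when it holds a run of at least `min_run`
    black pixels; two such bands at most `max_gap` rows apart form a
    double underline. Both thresholds are assumptions.
    """
    is_line = [longest_run(row) >= min_run for row in page]
    bands = []  # (start, end) row indices of consecutive line rows
    start = None
    for y, flag in enumerate(is_line + [False]):
        if flag and start is None:
            start = y
        elif not flag and start is not None:
            bands.append((start, y - 1))
            start = None
    for (a_start, a_end), (b_start, b_end) in zip(bands, bands[1:]):
        if b_start - a_end - 1 <= max_gap:
            return a_start, b_end
    return None


def keyword_rows_above(page: Bitmap, underline_top: int) -> Bitmap:
    """Return the block of non-blank rows directly above the underline."""
    y = underline_top - 1
    while y >= 0 and not any(page[y]):  # skip the white gap above the line
        y -= 1
    end = y
    while y >= 0 and any(page[y]):      # walk up through the text rows
        y -= 1
    return page[y + 1:end + 1]


def crop_region(page: Bitmap, top: int, left: int,
                bottom: int, right: int) -> Bitmap:
    """Method (2): keep only the pixels inside a predetermined rectangle."""
    return [row[left:right] for row in page[top:bottom]]
```

In either case the extracted rows would be handed to character recognition, while the rest of the page is kept as image data.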

【００４７】１０７−２：文字認識ＡＰ（アプリケーシ
ョンプログラム）では、１０７−１で抽出されたキー
ワードデータを文字認識できる形式にファイル化（ｘｂ
ｍファイル化）し、そのファイル（ｘｂｍファイル）を
文字認識する事により、一文字単位に候補文字とそれに
対応する距離値を返却する。図１７に文字認識ＡＰ出力
結果例を示す。ここで、文字認識ＡＰでは、第一位から
第十位までの十個の文字を候補文字として候補に挙げ、
その候補文字毎に距離値を一文字単位に付与する。距離
値とは、文字認識ＡＰが認識し返却した候補文字の確か
らしさを数量的に表現した値を指す。つまり、候補文字
毎の確からしさ（正確度）を数値で表し、その値が小さ
ければ小さいほど正確度が高い。なお、１０５で「＃」
有りモードと判定された場合には、３０１の区切り文字
削除処理により、３０２で区切り文字「＃」を削除した
後、１０８で処理する。107-2: The character recognition AP (application program) converts the keyword data extracted at 107-1 into a file format suitable for character recognition (an xbm file), performs character recognition on that file, and returns, for each character, the candidate characters and their corresponding distance values. FIG. 17 shows an example of the character recognition AP output. For each character, the character recognition AP lists ten candidate characters, ranked first through tenth, and assigns a distance value to each candidate. The distance value expresses numerically how plausible a returned candidate character is: the smaller the value, the higher the accuracy. If the "#" mode was detected at 105, the delimiter deletion processing of 301 removes the delimiter "#" at 302 before processing continues at 108.
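One way to picture the recognition output is as a list of columns, one per character position, each holding rank-ordered candidates with their distance values. The sketch below is an assumed representation, not the format actually returned by the character recognition AP; the names `Candidate`, `Column`, and `rank` are hypothetical. The sample values are the ones quoted for the first three columns in the walkthrough of steps 109 to 120, limited to three candidates per column.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    char: str       # recognized character
    distance: int   # smaller means more likely


# One list per character position ("column"), ordered by rank.
Column = List[Candidate]


def rank(candidates: List[Candidate], keep: int = 10) -> Column:
    """Order candidates by ascending distance and keep the best `keep`."""
    return sorted(candidates, key=lambda c: c.distance)[:keep]


# Only the candidates whose values are quoted in the text are listed.
example = [
    rank([Candidate("ア", 8500), Candidate("マ", -6162), Candidate("コ", 12500)]),
    rank([Candidate("ル", -14814), Candidate("カ", 28323), Candidate("ホ", 16921)]),
    rank([Candidate("キ", 3000), Candidate("チ", 316), Candidate("テ", 19355)]),
]

# Reading off the first-place candidate of each column.
first_place = "".join(column[0].char for column in example)
```

Because a smaller distance means a better match, sorting in ascending order puts the first-place candidate at index 0.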

【００４８】１０８：絶対基準値判定部では、１０７−
２の文字認識ＡＰより返却された候補文字とその距離値
（図１７参照）により、候補文字列の作成対象となる候
補文字を精査する。具体的には、第一位から第十位まで
の候補文字とそれに対応する距離値の値とシステム内に
設定した絶対基準値の値を比較し、距離値の値と絶対基
準値の値の大小により、その候補文字を候補文字列の組
み合わせ対象とするかしないかを判定する。本実施形態
例では、「カナのみ」モードを設定しているので、この
場合は１０４のシステム設定（図１３参照）において、
絶対基準値（カナ）の「１９０００」を参照しながら、
以下の１０９〜１１１を処理する。なお、絶対基準値と
は、候補文字を候補文字列の組み合わせ対象とするかし
ないかを判断する基準の値を指す。108: The absolute reference value determination unit uses the candidate characters and distance values returned by the character recognition AP of 107-2 (see FIG. 17) to screen the candidate characters that will be used to build candidate character strings. Specifically, the distance value of each of the first- through tenth-place candidates is compared with the absolute reference value set in the system, and the result of that comparison decides whether the candidate is included when candidate character strings are combined. This embodiment uses the "kana only" mode, so steps 109 to 111 below refer to the absolute reference value (kana) of 19000 in the system settings of 104 (see FIG. 13). The absolute reference value is the threshold for deciding whether a candidate character is eligible for combination into candidate character strings.

【００４９】１０９：一列目の第一位候補文字から最終
列の第一位候補文字に対して１０７−２の文字認識ＡＰ
から返却された距離値と１０４のシステム設定で設定さ
れた絶対基準値を比較し、その値の大小により分岐す
る。本実施形態例では、一列目の第一位候補文字は
「マ」でその距離値として「−６１６２」（図１７参
照）が返却され、二列目の第一位候補文字は「ル」でそ
の距離値として「−１４８１４」（図１７参照）が返却
されたことを参照する。同様の見方で最終列の第一位候
補文字である「タ」まで参照を続ける。この結果、一列
目の第一候補文字「マ」から最終列の第一候補文字
「タ」までの距離値は、全て「１９０００」より小さい
値であったので引き続き１１０で処理される。なお、１
文字でも距離値が絶対基準値より大きい場合には、４０
１で第一位候補文字のみ組み合わせて候補文字列を作成
し、１３３で処理する。109: For the first-place candidate character of every column, from the first column through the last, the distance value returned by the character recognition AP of 107-2 is compared with the absolute reference value set in the system settings of 104, and processing branches on the result. In this embodiment, the first-place candidate in the first column is "マ" (ma) with a returned distance value of -6162 (see FIG. 17), and the first-place candidate in the second column is "ル" (ru) with a distance value of -14814 (see FIG. 17). The check continues in the same way through "タ" (ta), the first-place candidate in the last column. Because all of these distance values are smaller than 19000, processing continues at 110. If the distance value of even one character exceeds the absolute reference value, a candidate character string is built at 401 from the first-place candidates only, and processing continues at 133.
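The check at step 109 amounts to testing every first-place distance against one threshold. The following is a minimal sketch; the function names are hypothetical, and treating a distance exactly equal to the threshold as a failure is an assumption, since the text only speaks of values being smaller or larger.

```python
ABSOLUTE_REFERENCE_KANA = 19000  # value quoted from the system settings


def all_first_place_within(first_distances, limit=ABSOLUTE_REFERENCE_KANA):
    """Step 109: True when every first-place distance is below the limit."""
    return all(distance < limit for distance in first_distances)


def step_after_109(first_distances):
    """Return the label of the step that follows the check at 109."""
    # All within the limit: go on to the lower-ranked candidates (110).
    # Otherwise: build the string from first-place characters only (401).
    return "110" if all_first_place_within(first_distances) else "401"
```

With the first-place distances quoted for the first two columns (-6162 and -14814), the function returns "110".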

【００５０】１１０：一列目の文字に対し、第二位候補
文字から第十位候補文字に対して１０７−２の文字認識
ＡＰから返却された距離値と１０４のシステム設定で設
定された絶対基準値を比較し、その値の大小により分岐
する。本実施形態例では、一列目の第二位候補文字は
「ア」でその距離値として「８５００」（図１７参照）
が返却され、同一列目の第三位候補文字は「コ」でその
距離値として「１２５００」が返却されたとする。同様
に第四位から第十位までの候補文字が絶対基準値（カ
ナ）で設定した「１９０００」より小さい値であるの
で、１１１で処理される。110: For the character in the first column, the distance values returned by the character recognition AP of 107-2 for the second- through tenth-place candidates are compared with the absolute reference value set in the system settings of 104, and processing branches on the result. In this embodiment, suppose the second-place candidate in the first column is "ア" (a) with a returned distance value of 8500 (see FIG. 17) and the third-place candidate in the same column is "コ" (ko) with a distance value of 12500. The fourth- through tenth-place candidates likewise have distance values smaller than the absolute reference value (kana) of 19000, so processing continues at 111.

【００５１】１１１：１０７−２の文字認識ＡＰから返
却された距離値と１０４のシステム設定で設定された絶
対基準値を比較した結果、第一列目の第一位候補文字か
ら第十位候補文字までの十文字を対象に１１２の確定基
準値判定部で処理される。111: As a result of comparing the distance values returned by the character recognition AP of 107-2 with the absolute reference value set in the system settings of 104, all ten candidates in the first column, from first place through tenth, are passed to the fixed reference value determination unit of 112.

【００５２】１１２：確定基準値判定部では、１０７−
２の文字認識ＡＰより返却された候補文字とその距離値
（図１７参照）により候補文字列の作成対象となる候補
文字を精査する。具体的には、上位と下位の候補文字と
それに対応した上位と下位の距離値の値を減算し、減算
した値とシステム内に設定した確定基準値の値を比較
し、減算した値と確定基準値の値の大小により、その上
位の候補文字のみを候補文字列の組み合わせ対象とする
か、又は、その上位と下位の候補文字をそれぞれ候補文
字列の組み合わせ対象とするかを判定する。本実施形態
例では「カナのみ」モードを設定しているので、この場
合は１０４のシステム設定（図１３参照）において、確
定基準値（カナ）の「６０００」を参照しながら、以下
の１１３〜１１６を処理する。なお、確定基準値とは、
正確度の高い候補文字のみを候補文字列の組み合わせ対
象とするかしないかを判断する基準の値を指す。112: The fixed reference value determination unit uses the candidate characters and distance values returned by the character recognition AP of 107-2 (see FIG. 17) to screen the candidate characters used to build candidate character strings. Specifically, it takes the difference between the distance values of a higher-ranked and a lower-ranked candidate and compares that difference with the fixed reference value set in the system. Depending on the result, either only the higher-ranked candidate is used when combining candidate character strings, or both the higher- and lower-ranked candidates are used. This embodiment uses the "kana only" mode, so steps 113 to 116 below refer to the fixed reference value (kana) of 6000 in the system settings of 104 (see FIG. 13). The fixed reference value is the threshold for deciding whether only a high-accuracy candidate character is used when combining candidate character strings.

【００５３】１１３：一列目の第一位候補文字の距離値
と第二位候補文字の距離値を減算処理し、その差の値を
算出値として１１４で処理する。本実施形態例では、一
列目の第一候補文字は「マ」でその距離値として「−６
１６２」が、第二位候補文字は「ア」でその距離値とし
て「８５００」が返却されたので、減算処理の結果、算
出値が「１４６６２」となる。113: The difference between the distance values of the first- and second-place candidates in the first column is computed and passed to 114 as the calculated value. In this embodiment, the first-place candidate in the first column is "マ" (ma) with a distance value of -6162 and the second-place candidate is "ア" (a) with a distance value of 8500, so the calculated value is 8500 - (-6162) = 14662.

【００５４】１１４：１１３の算出値と１０４のシステ
ム設定で設定された確定基準値（図１３参照）を比較
し、その値の大小により分岐する。本実施形態例では、
算出値「１４６６２」が確定基準値（カナ）で設定した
「６０００」より大きい値であるので１１５で処理され
る。114: The calculated value from 113 is compared with the fixed reference value (see FIG. 13) set in the system settings of 104, and processing branches on the result. In this embodiment, the calculated value 14662 is larger than the fixed reference value (kana) of 6000, so processing continues at 115.
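Steps 113 and 114 reduce to a subtraction followed by a threshold test. The sketch below uses hypothetical function names; the direction of the subtraction (second place minus first place) is inferred from the worked value 14662, and treating a gap exactly equal to the threshold as not confirmed is an assumption.

```python
FIXED_REFERENCE_KANA = 6000  # value quoted from the system settings


def distance_gap(first: int, second: int) -> int:
    """Step 113: difference between second- and first-place distances."""
    return second - first


def first_place_is_confirmed(first: int, second: int,
                             limit: int = FIXED_REFERENCE_KANA) -> bool:
    """Step 114: a gap larger than the limit confirms the first place."""
    return distance_gap(first, second) > limit


# First column of the example: "マ" (-6162) against "ア" (8500).
gap = distance_gap(-6162, 8500)
confirmed = first_place_is_confirmed(-6162, 8500)
```

A large gap means the runner-up is much less plausible than the leader, which is why the leader alone is kept.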

【００５５】１１５：１１４で算出値が確定基準値より
大きい値であったので、１１４の算出値と確定基準値の
比較を終了し、引き続き１１６で処理する。115: At 114, the calculated value is larger than the fixed reference value, so the comparison between the calculated value at 114 and the fixed reference value is terminated, and the process is continued at 116.

【００５６】１１６：１１５で確定した候補文字を候補
文字列の組み合わせ対象とする。本実施形態例では、候
補文字「マ」が確定基準値により確定したので、候補文
字「マ」を候補文字列の組み合わせ対象とする。116: The candidate character confirmed at 115 is used when combining candidate character strings. In this embodiment, the candidate character "マ" (ma) was confirmed by the fixed reference value, so "マ" is used when combining candidate character strings.

【００５７】１１７：１１６で確定した候補文字が最終
列かにより分岐する。本実施形態例では、候補文字
「マ」は最終列ではないので、１１８の次列の第二位候
補文字に対して絶対基準値判定部より処理する。117: Processing branches on whether the candidate character confirmed at 116 is in the last column. In this embodiment, the candidate character "マ" (ma) is not in the last column, so processing moves to 118, where the second-place candidate of the next column is handled starting from the absolute reference value determination unit.

【００５８】１１８：次列の第二位候補文字に対して絶
対基準値判定部より処理する。本実施形態例では、二列
目の第二位候補文字「ホ」に対して１０８の絶対基準値
判定部で処理する。118: The second-place candidate character of the next column is processed starting from the absolute reference value determination unit. In this embodiment, the second-place candidate "ホ" (ho) in the second column is processed by the absolute reference value determination unit of 108.

【００５９】１０８：本実施形態例では「カナのみ」モ
ードを設定しているので、この場合は１０４のシステム
設定（図１３参照）において、絶対基準値（カナ）の
「１９０００」を参照しながら、以下の１０９〜１１１
を処理する。108: This embodiment uses the "kana only" mode, so steps 109 to 111 below refer to the absolute reference value (kana) of 19000 in the system settings of 104 (see FIG. 13).

【００６０】１０９：本処理は一列目の第一位候補文字
から最終列の第一位候補文字に対する処理なので、二列
目の第二位候補文字については、１１０で処理する。109: This process is for the first candidate character in the first column to the first candidate character in the last column, so the second candidate character in the second column is processed in 110.

【００６１】１１０：二列目の文字に対し、第二位候補
文字から第十位候補文字に対して１０７−２の文字認識
ＡＰから返却された距離値と１０４のシステム設定で設
定された絶対基準値を比較し、その値の大小により分岐
する。本実施形態例では、二列目の第二位候補文字は
「ホ」でその距離値として「１６９２１」（図１７参
照）が返却され、同二列目の第三位候補文字は「カ」で
その距離値として「２８３２３」が返却されたとする。
この時点で第三位候補文字「カ」は絶対基準値（カナ）
で設定した「１９０００」より大きい値であるので１１
９で処理される。110: For the character in the second column, the distance values returned by the character recognition AP of 107-2 for the second- through tenth-place candidates are compared with the absolute reference value set in the system settings of 104, and processing branches on the result. In this embodiment, suppose the second-place candidate in the second column is "ホ" (ho) with a returned distance value of 16921 (see FIG. 17) and the third-place candidate in the same column is "カ" (ka) with a distance value of 28323. Because the distance value of the third-place candidate "カ" exceeds the absolute reference value (kana) of 19000, processing continues at 119.

【００６２】１１９：１１０で二列目の第三位候補文字
「カ」が距離値「２８３２３」で絶対基準値（カナ）で
設定した「１９０００」より大きい値であったので、距
離値と絶対基準値の比較を終了し、引き続き１２０で処
理される。119: At 110, the third-place candidate "カ" (ka) in the second column had a distance value of 28323, which exceeds the absolute reference value (kana) of 19000, so the comparison of distance values against the absolute reference value ends and processing continues at 120.

【００６３】１２０：１１０の結果、距離値が絶対基準
値内であった候補文字のみを確定基準値判定の対象とす
る。本実施形態例では、二列目の文字に対し、第二位候
補文字「ホ」が距離値「１６９２１」で絶対基準値（カ
ナ）で設定した「１９０００」より小さい値であるので
「ホ」を確定基準値判定の対象とする。また、第三位位
候補文字「カ」が距離値「２８３２３」で絶対基準値
（カナ）で設定した「１９０００」より大きい値である
ので「カ」を確定基準値判定の対象外とする。この結
果、第二列目については第二位候補文字の一文字を対象
に１１２の確定基準値判定部で処理される。120: Only the candidate characters whose distance values were within the absolute reference value at 110 are passed to the fixed reference value determination. In this embodiment, for the second column, the second-place candidate "ホ" (ho) has a distance value of 16921, which is smaller than the absolute reference value (kana) of 19000, so "ホ" is subject to the fixed reference value determination. The third-place candidate "カ" (ka) has a distance value of 28323, which is larger than 19000, so "カ" is excluded from it. As a result, for the second column, the single second-place candidate is passed to the fixed reference value determination unit of 112.
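Steps 110, 119, and 120 can be read as scanning the lower-ranked candidates in rank order and stopping at the first one that exceeds the absolute reference value. The sketch below is an assumed rendering of that behaviour; the function name and the (character, distance) tuple layout are hypothetical.

```python
from itertools import takewhile

ABSOLUTE_REFERENCE_KANA = 19000  # value quoted from the system settings


def lower_candidates_within(lower, limit=ABSOLUTE_REFERENCE_KANA):
    """Keep lower-ranked candidates up to the first one over the limit.

    `lower` is a rank-ordered list of (char, distance) pairs for places
    two to ten. Scanning stops at the first distance that is not below
    the limit, mirroring how the comparison ends at step 119.
    """
    return list(takewhile(lambda pair: pair[1] < limit, lower))


# Second column of the example: "ホ" survives, "カ" ends the scan.
second_column_lower = [("ホ", 16921), ("カ", 28323)]
kept = lower_candidates_within(second_column_lower)
```

Because candidates are ranked by ascending distance, stopping at the first failure gives the same result as testing every candidate.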

【００６４】１１２：確定基準値判定部では、１０７−
２の文字認識ＡＰより返却された候補文字とその距離値
（図１７参照）により候補文字列の作成対象となる候補
文字を精査する。本実施形態例では「カナのみ」モード
を設定しているので、この場合は１０４のシステム設定
（図１３参照）において、確定基準値（カナ）の「６０
００」を参照しながら、以下の１１３〜１１６を処理す
る。112: The fixed reference value determination unit uses the candidate characters and distance values returned by the character recognition AP of 107-2 (see FIG. 17) to screen the candidate characters used to build candidate character strings. This embodiment uses the "kana only" mode, so steps 113 to 116 below refer to the fixed reference value (kana) of 6000 in the system settings of 104 (see FIG. 13).

【００６５】１１３：二列目の第一位候補文字の距離値
と第二位候補文字の距離値を減算処理し、その差の値を
算出値として１１４で処理する。本実施形態例では、二
列目の第一候補文字は「ル」でその距離値として「−１
４８１４」が、第二位候補文字は「ホ」でその距離値と
して「１６９２１」が返却されたので、減算処理の結
果、算出値が「３１７３５」となる。113: The difference between the distance values of the first- and second-place candidates in the second column is computed and passed to 114 as the calculated value. In this embodiment, the first-place candidate in the second column is "ル" (ru) with a distance value of -14814 and the second-place candidate is "ホ" (ho) with a distance value of 16921, so the calculated value is 16921 - (-14814) = 31735.

【００６６】１１４：１１３の算出値と１０４のシステ
ム設定で設定された確定基準値を比較し、その値の大小
により分岐する。本実施形態例では、算出値「３１７３
５」が確定基準値（カナ）で設定した「６０００」より
大きい値であるので１１５で処理される。114: The calculated value from 113 is compared with the fixed reference value set in the system settings of 104, and processing branches on the result. In this embodiment, the calculated value 31735 is larger than the fixed reference value (kana) of 6000, so processing continues at 115.

【００６７】１１５：１１４で算出値が確定基準値より
大きい値であったので、１１４の算出値と確定基準値の
比較を終了し、引き続き１１６で処理する。115: Since the calculated value at 114 was larger than the fixed reference value, the comparison between the calculated value and the fixed reference value ends and processing continues at 116.

【００６８】１１６：１１５で確定した候補文字を候補
文字列の組み合わせ対象とする。116: The candidate character confirmed at 115 is used when combining candidate character strings.

【００６９】本実施形態例では、候補文字「ル」が確定
基準値により確定したので、候補文字「ル」を候補文字
列の組み合わせ対象とする。In this embodiment, the candidate character "ル" (ru) was confirmed by the fixed reference value, so "ル" is used when combining candidate character strings.

【００７０】１１７：１１６で確定した候補文字が最終
列かにより分岐する。本実施形態例では、候補文字
「ル」は最終列ではないので、１１８の次列の第二位候
補文字に対して絶対基準値判定部より処理する。117: Processing branches on whether the candidate character confirmed at 116 is in the last column. In this embodiment, the candidate character "ル" (ru) is not in the last column, so processing moves to 118, where the second-place candidate of the next column is handled starting from the absolute reference value determination unit.

【００７１】１１８：次列の第二位候補文字に対して絶
対基準値判定部より処理する。本実施形態例では、三列
目の第二位候補文字「キ」に対して１０８の絶対基準値
判定部で処理する。118: The second-place candidate character of the next column is processed starting from the absolute reference value determination unit. In this embodiment, the second-place candidate "キ" (ki) in the third column is processed by the absolute reference value determination unit of 108.

【００７２】１０８：本実施形態例では「カナのみ」モ
ードを設定しているので、この場合は１０４のシステム
設定（図１３参照）において、絶対基準値（カナ）の
「１９０００」を参照しながら、以下の１０９〜１１１
を処理する。108: This embodiment uses the "kana only" mode, so steps 109 to 111 below refer to the absolute reference value (kana) of 19000 in the system settings of 104 (see FIG. 13).

【００７３】１０９：本処理は一列目の第一位候補文字
から最終列の第一位候補文字に対する処理なので、三列
目の第二位候補文字については、１１０で処理する。109: Since this process is a process from the first candidate character in the first column to the first candidate character in the last column, the second candidate character in the third column is processed in 110.

【００７４】１１０：三列目の文字に対し、第二位候補
文字から第十位候補文字に対して１０７−２の文字認識
ＡＰから返却された距離値と１０４のシステム設定で設
定された絶対基準値を比較し、その値の大小により分岐
する。本実施形態例では、三列目の第二位候補文字は
「キ」でその距離値として「３０００」（図１７参照）
が返却され、同三列目の第三位候補文字は「テ」でその
距離値として「１９３５５」が返却されたとする。この
時点で第三位候補文字「テ」は絶対基準値（カナ）で設
定した「１９０００」より大きい値であるので１１９で
処理される。110: For the character in the third column, the distance values returned by the character recognition AP of 107-2 for the second- through tenth-place candidates are compared with the absolute reference value set in the system settings of 104, and processing branches on the result. In this embodiment, suppose the second-place candidate in the third column is "キ" (ki) with a returned distance value of 3000 (see FIG. 17) and the third-place candidate in the same column is "テ" (te) with a distance value of 19355. Because the distance value of the third-place candidate "テ" exceeds the absolute reference value (kana) of 19000, processing continues at 119.

【００７５】１１９：１１０で三列目の第三位候補文字
「テ」が距離値「１９３５５」で絶対基準値（カナ）で
設定した「１９０００」より大きい値であったので、距
離値と絶対基準値の比較を終了し、引き続き１２０で処
理される。119: At 110, the third-place candidate "テ" (te) in the third column had a distance value of 19355, which exceeds the absolute reference value (kana) of 19000, so the comparison of distance values against the absolute reference value ends and processing continues at 120.

【００７６】１２０：１１０の結果、距離値が絶対基準
値内であった候補文字のみを確定基準値判定の対象とす
る。本実施形態例では、三列目の文字に対し、第二位候
補文字「キ」が距離値「３０００」で絶対基準値（カ
ナ）で設定した「１９０００」より小さい値であるので
「キ」を確定基準値判定の対象とする。また、第三位位
候補文字「テ」が距離値「１９３５５」で絶対基準値
（カナ）で設定した「１９０００」より大きい値である
ので「テ」を確定基準値判定の対象外とする。この結
果、第三列目については第二位候補文字の一文字を対象
に１１２の確定基準値判定部で処理される。120: Only the candidate characters whose distance values were within the absolute reference value at 110 are passed to the fixed reference value determination. In this embodiment, for the third column, the second-place candidate "キ" (ki) has a distance value of 3000, which is smaller than the absolute reference value (kana) of 19000, so "キ" is subject to the fixed reference value determination. The third-place candidate "テ" (te) has a distance value of 19355, which is larger than 19000, so "テ" is excluded from it. As a result, for the third column, the single second-place candidate is passed to the fixed reference value determination unit of 112.

【００７７】１１２：確定基準値判定部では、１０７−
２の文字認識ＡＰより返却された候補文字とその距離値
（図１７参照）により候補文字列の作成対象となる候補
文字を精査する。本実施形態例では「カナのみ」モード
を設定しているので、この場合は１０４のシステム設定
（図１３参照）において、確定基準値（カナ）の「６０
００」を参照しながら、以下の１１３〜１１６を処理す
る。112: The fixed reference value determination unit uses the candidate characters and distance values returned by the character recognition AP of 107-2 (see FIG. 17) to screen the candidate characters used to build candidate character strings. This embodiment uses the "kana only" mode, so steps 113 to 116 below refer to the fixed reference value (kana) of 6000 in the system settings of 104 (see FIG. 13).

【００７８】１１３：三列目の第一位候補文字の距離値
と第二位候補文字の距離値を減算処理し、その差の値を
算出値として１１４で処理する。本実施形態例では、三
列目の第一候補文字は「チ」でその距離値として「３１
６」が、第二位候補文字は「キ」でその距離値として
「３０００」が返却されたので、減算処理の結果、算出
値が「２６８４」となる。113: The difference between the distance values of the first- and second-place candidates in the third column is computed and passed to 114 as the calculated value. In this embodiment, the first-place candidate in the third column is "チ" (chi) with a distance value of 316 and the second-place candidate is "キ" (ki) with a distance value of 3000, so the calculated value is 3000 - 316 = 2684.

【００７９】１１４：１１３の算出値と１０４のシステ
ム設定で設定された確定基準値を比較し、その値の大小
により分岐する。本実施形態例では、算出値「２６８
４」が確定基準値（カナ）で設定した「６０００」より
小さい値であるので１２１で処理される。114: The calculated value from 113 is compared with the fixed reference value set in the system settings of 104, and processing branches on the result. In this embodiment, the calculated value 2684 is smaller than the fixed reference value (kana) of 6000, so processing continues at 121.

【００８０】121: Because the calculated value was smaller than the fixed reference value at 114, the comparison ends and the upper and lower candidate characters become targets of the relative reference value determination. In the present embodiment, the two characters in the third column, the first-place candidate "チ" and the second-place candidate "キ", are passed to the relative reference value determination unit 122.
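The branch at steps 113 and 114 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name and the treatment of a difference exactly equal to the reference value are assumptions, and the second call uses hypothetical distance values.

```python
FIXED_REFERENCE_KANA = 6000  # fixed reference value (kana) quoted in the text


def needs_relative_check(first_distance: int, second_distance: int,
                         fixed_reference: int = FIXED_REFERENCE_KANA) -> bool:
    """Return True when the gap between the two best candidates is too small
    to settle on the first-place candidate alone (branch to step 121)."""
    calculated = second_distance - first_distance  # step 113: subtraction
    # Step 114: compare with the fixed reference value.
    # Assumption: a gap equal to the reference counts as "settled".
    return calculated < fixed_reference


# Third column in the text: "チ" (316) versus "キ" (3000), gap 2684.
print(needs_relative_check(316, 3000))     # True
# Hypothetical distances with a gap of 20514: first-place candidate only.
print(needs_relative_check(1000, 21514))   # False
```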

【００８１】122: The relative reference value determination unit examines which candidate characters are used to create candidate character strings, based on the candidate characters returned from the character recognition AP 107-2 and their distance values (see FIG. 17). Specifically, it subtracts the distance values of the upper and lower candidate characters and compares the difference with the relative reference value set in the system. Depending on which is larger, it decides whether the upper and lower candidate characters are replaced with an uncertain character before combination, or whether both candidate characters are used for combination as they are. Since the "kana only" mode is set in the present embodiment, steps 123 to 128 below are processed with reference to the relative reference value (kana) "3500" in the system settings 104 (see FIG. 13). The relative reference value is the threshold for deciding whether a low-accuracy candidate character is replaced with an uncertain character when candidate character strings are combined.

【００８２】123: The distance value of the first-place candidate character in the third column is subtracted from that of the second-place candidate character, and the difference is passed to 124 as the calculated value. In the present embodiment, the first-place candidate character in the third column is "チ" with the distance value "316" and the second-place candidate character is "キ" with the distance value "3000", so the subtraction yields the calculated value "2684".

【００８３】124: The calculated value from 123 is compared with the relative reference value set in the system settings 104, and processing branches according to which is larger. In the present embodiment, the calculated value "2684" is smaller than the "3500" set as the relative reference value (kana), so processing continues at 125.

【００８４】125: Because the calculated value was smaller than the relative reference value at 124, the comparison ends and processing continues at 126.

【００８５】126: The candidate characters from 125 are replaced with the uncertain character "＠", which is added to the targets for candidate character string combination. In the present embodiment, based on the relative reference value, the first-place candidate "チ" and the second-place candidate "キ" in the third column are replaced with the candidate character "＠", which is added to the combination targets.

【００８６】127: Processing branches depending on whether the candidate character settled at 126 belongs to the last column. In the present embodiment, the candidate character "＠" is not in the last column, so at 128 the second-place candidate character of the next column is processed starting from the absolute reference value determination unit.

【００８７】128: The second-place candidate character of the next column is processed starting from the absolute reference value determination unit. In the present embodiment, the second-place candidate character "ヌ" in the fourth column is passed to the absolute reference value determination unit 108, and the same processing as above is repeated in order from 108. As a result, for the first-place candidate "メ" and the second-place candidate "ヌ" in the fourth column, the subtraction of distance values yields the calculated value "20514", which is larger than the "6000" set as the fixed reference value (kana), so only the first-place candidate "メ" becomes a combination target. Likewise, in the fifth column, of the first-place candidate "テ" and the second-place candidate "ヲ", only "テ" becomes a combination target according to the fixed reference value. The sixth and seventh columns are handled in the same way: only the first-place candidate "゛" in the sixth column and "イ" in the seventh column become combination targets. The second-place candidate character "ア" in the eighth (last) column is then passed to the absolute reference value determination unit 108.

【００８８】108: Since the "kana only" mode is set in the present embodiment, steps 109 to 111 below are processed with reference to the absolute reference value (kana) "19000" in the system settings 104 (see FIG. 13).

【００８９】109: This step handles the first-place candidate characters from the first column through the last column, so the second-place candidate character of the eighth column is processed at 110.

【００９０】110: For the character in the eighth column, the distance values returned from the character recognition AP 107-2 for the second-place through tenth-place candidate characters are compared with the absolute reference value set in the system settings 104, and processing branches according to which is larger. In the present embodiment, assume that the second-place candidate in the eighth column is "ア" with the distance value "11200" (see FIG. 17) and that the third-place candidate is "マ" with the distance value "19275". At this point the third-place candidate "マ" exceeds the "19000" set as the absolute reference value (kana), so processing continues at 119.

【００９１】119: Because at 110 the third-place candidate "マ" in the eighth column had the distance value "19275", which is larger than the "19000" set as the absolute reference value (kana), the comparison of distance values with the absolute reference value ends and processing continues at 120.

【００９２】120: As a result of 110, only candidate characters whose distance value is within the absolute reference value become targets of the fixed reference value determination. In the present embodiment, for the character in the eighth column, the second-place candidate "ア" has the distance value "11200", which is smaller than the "19000" set as the absolute reference value (kana), so "ア" becomes a target of the fixed reference value determination. The third-place candidate "マ" has the distance value "19275", which is larger than "19000", so "マ" is excluded from the fixed reference value determination. As a result, for the eighth column, the single second-place candidate character is passed to the fixed reference value determination unit 112.
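The absolute reference value filtering of steps 110, 119, and 120 can be sketched as below. The function name and data layout are hypothetical, and the sketch assumes that the lower-ranked candidates arrive sorted by ascending distance and that a distance equal to the reference value still counts as "within" it; the text does not state either point.

```python
ABSOLUTE_REFERENCE_KANA = 19000  # absolute reference value (kana) quoted in the text


def within_absolute_reference(lower_candidates, absolute_reference=ABSOLUTE_REFERENCE_KANA):
    """Keep the second-place and later candidates whose distance value does
    not exceed the absolute reference value.

    lower_candidates: list of (character, distance) pairs in rank order.
    """
    kept = []
    for char, distance in lower_candidates:
        if distance > absolute_reference:
            break  # step 119: stop comparing at the first candidate beyond the threshold
        kept.append(char)
    return kept


# Eighth column in the text: second place "ア" (11200), third place "マ" (19275).
eighth_column_lower = [("ア", 11200), ("マ", 19275)]
print(within_absolute_reference(eighth_column_lower))
```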

【００９３】112: The fixed reference value determination unit examines which candidate characters are used to create candidate character strings, based on the candidate characters returned from the character recognition AP 107-2 and their distance values (see FIG. 17). Since the "kana only" mode is set in the present embodiment, steps 113 to 116 below are processed with reference to the fixed reference value (kana) "6000" in the system settings 104 (see FIG. 13).

【００９４】113: The distance value of the first-place candidate character in the eighth column is subtracted from that of the second-place candidate character, and the difference is passed to 114 as the calculated value. In the present embodiment, the first-place candidate character in the eighth column is "タ" with the distance value "7000" and the second-place candidate character is "ア" with the distance value "11200", so the subtraction yields the calculated value "4200".

【００９５】114: The calculated value from 113 is compared with the fixed reference value set in the system settings 104, and processing branches according to which is larger. In the present embodiment, the calculated value "4200" is smaller than the "6000" set as the fixed reference value (kana), so processing continues at 121.

【００９６】121: Because the calculated value was smaller than the fixed reference value at 114, the comparison ends and the upper and lower candidate characters become targets of the relative reference value determination. In the present embodiment, the two characters in the eighth column, the first-place candidate "タ" and the second-place candidate "ア", are passed to the relative reference value determination unit 122.

【００９７】122: The relative reference value determination unit examines which candidate characters are used to create candidate character strings, based on the candidate characters returned from the character recognition AP 107-2 and their distance values (see FIG. 17). Since the "kana only" mode is set in the present embodiment, steps 123 to 128 below are processed with reference to the relative reference value (kana) "3500" in the system settings 104 (see FIG. 2).

【００９８】123: The distance value of the first-place candidate character in the eighth column is subtracted from that of the second-place candidate character, and the difference is passed to 124 as the calculated value. In the present embodiment, the first-place candidate character in the eighth column is "タ" with the distance value "7000" and the second-place candidate character is "ア" with the distance value "11200", so the subtraction yields the calculated value "4200".

【００９９】124: The calculated value from 123 is compared with the relative reference value set in the system settings 104, and processing branches according to which is larger. In the present embodiment, the calculated value "4200" is larger than the "3500" set as the relative reference value (kana), so processing continues at 129.

【０１００】129: Because the calculated value was larger than the relative reference value at 124, the upper and lower candidate characters both become targets for candidate character string combination. In the present embodiment, according to the relative reference value, the two candidate characters in the eighth column, the first-place candidate "タ" and the second-place candidate "ア", are added to the combination targets.
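The two outcomes of the relative reference value determination (steps 123 to 126 and 129) can be sketched as one function. The names are hypothetical. One point is an interpretation rather than something the text states outright: paragraph 0085 says the pair is replaced with "＠", while the candidate list in paragraph 0104 shows "チ" kept alongside "＠" for the third column, so this sketch keeps the first-place character and substitutes "＠" for the lower one.

```python
UNCERTAIN = "＠"              # uncertain character used in the text
RELATIVE_REFERENCE_KANA = 3500  # relative reference value (kana) quoted in the text


def apply_relative_reference(first, second, relative_reference=RELATIVE_REFERENCE_KANA):
    """Return the characters that one column contributes to the combination.

    first, second: (character, distance) pairs for the first- and
    second-place candidates of the column.
    """
    (first_char, first_distance), (second_char, second_distance) = first, second
    calculated = second_distance - first_distance  # step 123
    if calculated < relative_reference:            # step 124 -> 125, 126
        # Assumption: the first-place character stays and the lower
        # candidate becomes the uncertain character.
        return [first_char, UNCERTAIN]
    return [first_char, second_char]               # step 124 -> 129


print(apply_relative_reference(("チ", 316), ("キ", 3000)))    # third column, gap 2684
print(apply_relative_reference(("タ", 7000), ("ア", 11200)))  # eighth column, gap 4200
```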

【０１０１】130: Processing branches depending on whether any lower-ranked candidate character remains as a target of the relative reference value determination. In the present embodiment, the candidates ranked after the second-place candidate "ア" in the eighth column have been excluded, so processing continues at 127.

【０１０２】127: Processing branches depending on whether the candidate characters settled at 130 belong to the last column. In the present embodiment, the two candidate characters of the eighth column, the first-place candidate "タ" and the second-place candidate "ア", belong to the last column, so processing continues with the candidate character string combination process 131.

【０１０３】131: Candidate character strings are created by combining the candidate characters examined by the absolute reference value determination unit 108, the fixed reference value determination unit 112, and the relative reference value determination unit 122.

【０１０４】132: Candidate character strings are created from the candidate characters selected as combination targets. In the present embodiment, the candidates are the single character "マ" for the first column, the single character "ル" for the second column, the two characters "チ" and "＠" for the third column, the single character "メ" for the fourth column, the single character "テ" for the fifth column, the single character "゛" for the sixth column, the single character "イ" for the seventh column, and the two characters "タ" and "ア" for the last column. Four candidate character strings are created in the present embodiment: "マルチメテ゛イタ", "マル＠メテ゛イタ", "マルチメテ゛イア", and "マル＠メテ゛イア".
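The combination in step 132 amounts to a Cartesian product over the per-column candidates. The sketch below is illustrative only; the function name is hypothetical, and reversing the columns is simply a way to reproduce the order in which the text lists the four strings (leftmost varying column changes fastest).

```python
from itertools import product


def combine_candidates(columns):
    """Build every candidate character string from per-column candidates.

    columns: list of lists, one list of candidate characters per column.
    """
    # product() varies its last argument fastest, so feed the columns in
    # reverse and flip each tuple back to left-to-right order.
    combos = product(*reversed(columns))
    return ["".join(reversed(combo)) for combo in combos]


columns = [["マ"], ["ル"], ["チ", "＠"], ["メ"], ["テ"], ["゛"], ["イ"], ["タ", "ア"]]
for candidate in combine_candidates(columns):
    print(candidate)
```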

【０１０５】133: The allowable uncertain character reference value determination unit calculates the proportion of uncertain characters "＠" in each candidate character string created at 132 and screens the candidate character strings against the allowable range. Since the "kana only" mode is set in the present embodiment, step 134 below is processed with reference to the conversion allowable ratio (kana) "84" in the system settings 104 (see FIG. 14).

【０１０６】134: Processing branches depending on whether there is more than one candidate character string. In the present embodiment there are several (four), so processing continues at 135. If there is only one, processing continues at 138.

【０１０７】135: Processing branches on a comparison of the uncertain character ratio with the allowable uncertain character reference value. Since the "kana only" mode is set in the present embodiment, the number of "＠" characters allowed in one candidate character string is determined from the "8" set as the minimum number of characters (kana) under the maximum uncertain character reference value and the "84" set as the conversion allowable ratio (kana) in the system settings 104 (see FIG. 14). The calculation is as follows: of the 8 characters in one candidate character string, 84% must not be "＠"; conversely, 16% of an 8-character candidate character string may be "＠". In this case, therefore, up to two "＠" characters are allowed in an 8-character candidate character string. The first candidate character string from 132, "マルチメテ゛イタ", contains no uncertain character "＠" and is therefore below the allowable uncertain character reference value, so processing continues at 136.
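The arithmetic in step 135 can be sketched as follows. The text gives the result (two "＠" allowed in eight characters at a ratio of 84) but not the rounding rule; 16% of 8 is 1.28, so this sketch assumes rounding up, which reproduces the stated figure. The function names, the rounding rule, and the choice to accept a string whose count equals the limit are all assumptions.

```python
UNCERTAIN = "＠"
CONVERSION_ALLOWABLE_RATIO_KANA = 84  # conversion allowable ratio (kana) quoted in the text


def max_uncertain_allowed(length, allowable_ratio_percent=CONVERSION_ALLOWABLE_RATIO_KANA):
    """Number of uncertain characters tolerated in a string of this length."""
    numerator = length * (100 - allowable_ratio_percent)
    # Integer ceiling division; rounding up is an assumption chosen to
    # match the "two characters out of eight" figure in the text.
    return -(-numerator // 100)


def is_acceptable(candidate, allowable_ratio_percent=CONVERSION_ALLOWABLE_RATIO_KANA):
    """True when the candidate string stays within the allowed count."""
    limit = max_uncertain_allowed(len(candidate), allowable_ratio_percent)
    return candidate.count(UNCERTAIN) <= limit


print(max_uncertain_allowed(8))          # 2
print(is_acceptable("マルチメテ゛イタ"))  # True: no uncertain character
```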

【０１０８】136: As a result of 135, the candidate character string "マルチメテ゛イタ" is retained as a candidate character string, and processing continues at 137.

【０１０９】137: Processing branches depending on whether the candidate character string is the last one. In the present embodiment, the candidate character string "マルチメテ゛イタ" is not the last one, so processing returns to 134.

【０１１０】134: Processing branches depending on whether there is more than one candidate character string. In the present embodiment there are several (four), so processing continues at 135.

【０１１１】135: Processing branches on a comparison of the uncertain character ratio with the allowable uncertain character reference value.

【０１１２】Since the "kana only" mode is set in the present embodiment, the number of "＠" characters allowed in one candidate character string is determined from the "8" set as the minimum number of characters (kana) under the maximum uncertain character reference value and the "84" set as the conversion allowable ratio (kana) in the system settings 104 (see FIG. 14). The calculation is as follows: of the 8 characters in one candidate character string, 84% must not be "＠"; conversely, 16% of an 8-character candidate character string may be "＠". In this case, therefore, up to two "＠" characters are allowed in an 8-character candidate character string. The second candidate character string from 132, "マル＠メテ゛イタ", contains only one uncertain character "＠" and is therefore below the allowable uncertain character reference value, so processing continues at 136.

【０１１３】136: As a result of 135, the candidate character string "マル＠メテ゛イタ" is retained as a candidate character string, and processing continues at 137. Likewise, the comparison at 135 of the uncertain character ratio with the allowable uncertain character reference value is performed for the third candidate character string "マルチメテ゛イア" and the fourth candidate character string "マル＠メテ゛イア" from 132. As a result, in the present embodiment all four candidate character strings, from the first through the fourth, "マルチメテ゛イタ", "マル＠メテ゛イタ", "マルチメテ゛イア", and "マル＠メテ゛イア", are retained and processed at 137.

【０１１４】137: Processing branches depending on whether the candidate character string is the last one. The four candidate character strings from 136 have all been retained and the last one has been reached, so processing continues at 138.

【０１１５】138: Processing branches by mode. Since the "kana only" mode is set in the present embodiment, steps 139 to 146 below are processed with reference to "kana only" as the recognized character type in the system settings 104 (see FIG. 13).

【０１１６】139: The processing of the dakuten/handakuten (voiced and semi-voiced sound mark) processing unit depends on whether dakuten/handakuten is set to "ON" or "OFF" in the system settings 104 (see FIG. 13).

【０１１７】140: Processing branches depending on whether dakuten/handakuten is ON or OFF in the system settings 104 (see FIG. 13). In the present embodiment it is set to "ON", so processing continues at 141. If it is set to "OFF", processing continues at 143.

【０１１８】141: Processing branches depending on whether the first candidate character string contains a dakuten "゛" or a handakuten "゜". In the present embodiment, the first candidate character string "マルチメテ゛イタ" contains the dakuten "゛", so processing continues at 142.

【０１１９】142: One additional candidate character string is created with the dakuten "゛" and handakuten "゜" removed. In the present embodiment, the candidate character string "マルチメテイタ", obtained by removing the dakuten "゛" from the first candidate character string "マルチメテ゛イタ", is added, and processing continues at 143.
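Steps 141 and 142 can be sketched as a small helper that appends one stripped variant of the first candidate character string. The function name is hypothetical, and the sketch assumes the marks appear as the standalone characters "゛" and "゜", as they do in the strings quoted in the text.

```python
DAKUTEN = "゛"      # voiced sound mark as a standalone character
HANDAKUTEN = "゜"   # semi-voiced sound mark as a standalone character


def add_unvoiced_variant(candidates):
    """Append a copy of the first candidate with both marks removed,
    if the first candidate contains either mark (steps 141 and 142)."""
    first = candidates[0]
    if DAKUTEN in first or HANDAKUTEN in first:
        stripped = first.replace(DAKUTEN, "").replace(HANDAKUTEN, "")
        return list(candidates) + [stripped]
    return list(candidates)


print(add_unvoiced_variant(["マルチメテ゛イタ"]))
```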

【０１２０】143: The processing of the separated character processing unit depends on whether separated characters are set to "ON" or "OFF" in the system settings 104 (see FIG. 13).

【０１２１】144: Processing branches depending on whether separated characters are ON in the system settings 104 (see FIG. 13). In the present embodiment, separated characters are set to "ON" in the system settings 104, so processing continues at 145.

【０１２２】145: Processing branches depending on whether the first candidate character string contains the separated character "ル", or "ノ" and "レ". In the present embodiment, the first candidate character string "マルチメテ゛イタ" contains the separated character "ル", so processing continues at 146. If no separated character is contained, processing continues at 147.

【０１２３】146: One additional candidate character string is created in which the separated character "ル" is split into "ノ" and "レ". In the present embodiment, the candidate character string "マノレチメテ゛イタ", obtained by splitting the single character "ル" of the first candidate character string "マルチメテ゛イタ" into the two characters "ノ" and "レ", is added, and processing continues at 147.
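Step 146 can be sketched as a substitution on the first candidate character string. The table and function name are hypothetical. Only the split of "ル" into "ノ" and "レ" shown in the text is covered; whether the system also merges "ノレ" back into "ル" is not stated here, so the sketch leaves that out.

```python
# Separated character table: one glyph that a recognizer may read as two.
SEPARATED = {"ル": "ノレ"}


def add_separated_variant(candidates):
    """Append a copy of the first candidate with each separated character
    replaced by its two-character form (step 146)."""
    first = candidates[0]
    variant = first
    for whole, parts in SEPARATED.items():
        variant = variant.replace(whole, parts)
    if variant != first:
        return list(candidates) + [variant]
    return list(candidates)


print(add_separated_variant(["マルチメテ゛イタ"]))
```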

【０１２４】At this point, when the processing of 146 has finished, six candidate character strings have been created: the first candidate character string "マルチメテ゛イタ", the second "マル＠メテ゛イタ", the third "マルチメテ゛イア", the fourth "マル＠メテ゛イア", the fifth "マルチメテイタ", and the sixth "マノレチメテ゛イタ".

【０１２５】147: The dictionary DB (database) comparison process depends on whether the keyword dictionary is set to "used" or "unused" in the system settings 104 (see FIG. 14).

【０１２６】148: Processing branches depending on whether a dictionary DB for matching keyword data is used in the system settings 104 (see FIG. 14). In the present embodiment, "Dictionary A" is set to "in use" as the dictionary DB for matching keyword data in the system settings 104, so processing continues at 149. If it is set to "unused", processing continues at 151.

【０１２７】149: Processing branches depending on whether a candidate character string contains the uncertain character "＠". In the present embodiment, the second candidate character string "マル＠メテ゛イタ" and the fourth candidate character string "マル＠メテ゛イア" contain the uncertain character "＠", so processing continues at 150. If no candidate character string contains "＠", processing continues at 151.

【０１２８】150: Using "Dictionary A", the dictionary DB set to "in use", the character strings registered in that dictionary DB are searched for the portion occupied by the uncertain character "＠", and each matching character string is added as a candidate character string. In the present embodiment, the dictionary DB (Dictionary A) is searched for the second candidate character string "マル＠メテ゛イタ" and the fourth candidate character string "マル＠メテ゛イア", both of which contain the uncertain character "＠". As shown in the dictionary DB example of FIG. 18, assume that the character string "マルチメテ゛イア" is registered in advance in the dictionary DB (Dictionary A) as a noun expected to be registered in the image database. For the second candidate character string "マル＠メテ゛イタ", apart from the uncertain character "＠", no registered character string matches the eighth-column character "タ", so nothing is added. For the fourth candidate character string "マル＠メテ゛イア", there is a registered character string in which all characters other than the uncertain character "＠" match, so "マルチメテ゛イア" is added as one more candidate character string. Concrete usage of the dictionary DB and a processing example are described in Embodiment 3 below.
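The dictionary lookup of step 150 can be sketched as position-by-position matching in which "＠" stands for any single character. The function names and the in-memory list standing in for Dictionary A are hypothetical, and the requirement that a dictionary entry have the same length as the candidate is an assumption.

```python
UNCERTAIN = "＠"


def matches(pattern, word):
    """True when word equals pattern at every position that is not "＠"."""
    if len(pattern) != len(word):
        return False  # assumption: "＠" covers exactly one character
    return all(p == UNCERTAIN or p == w for p, w in zip(pattern, word))


def resolve_uncertain(candidates, dictionary):
    """Return the dictionary entries matched by candidates containing "＠"."""
    added = []
    for candidate in candidates:
        if UNCERTAIN not in candidate:
            continue  # step 149: only strings with "＠" are looked up
        added.extend(word for word in dictionary if matches(candidate, word))
    return added


dictionary_a = ["マルチメテ゛イア"]  # stand-in for Dictionary A (FIG. 18)
candidates = ["マル＠メテ゛イタ", "マル＠メテ゛イア"]
print(resolve_uncertain(candidates, dictionary_a))
```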

【０１２９】151: Processing branches depending on whether the candidate character strings are for registration or for search. In the present embodiment, the purpose is registration in the image database, so processing continues at 152.

【０１３０】152: The candidate character strings are registered in the image database. In the present embodiment, the candidate character strings finally created are the first candidate character string "マルチメテ゛イタ", the second "マル＠メテ゛イタ", the third "マルチメテ゛イア", the fourth "マル＠メテ゛イア", the fifth "マルチメテイタ", the sixth "マノレチメテ゛イタ", and the seventh "マルチメテ゛イア". These seven candidate character strings are linked to the image data from which they were extracted, as shown in FIG. 19, and registered in the image database.

【０１３１】[Embodiment 2] This embodiment describes a basic processing example up to the image database search (processing flow 153 to 155).

【０１３２】In the following embodiment, the medium used for the search is a fax terminal, and the recognized character type selection preset in the system is set to the "kana only" mode. The keyword character string searched for from the fax terminal is "マルチメテ゛イア". This embodiment also executes the same processing as 101 to 151 described in Embodiment 1, but the description is omitted here. For this example, however, assume that only one candidate character string, the first candidate character string "アルトメテ゛イヤ", has been created when the processing of 151 finishes.

[0133] 151: Processing branches depending on whether the candidate character strings are to be registered or used for a search. In this embodiment the purpose is a search of the image database, so processing continues at 153.

[0134] 153: The image database is searched with the candidate character strings. In this embodiment, the character strings registered in advance in the image database are the seven candidate character strings registered in 152 (see FIG. 19). The keyword to be searched for has a single candidate character string, "アルトメテ゛イヤ".

[0135] 154: Processing branches on a comparison of the correct-answer rate of the candidate character string with the search match rate. Because this embodiment uses the "kana only" mode, the hit condition between the candidate character strings registered in the image database and the candidate character string used for the search is determined by two values set in the DB search section of the system settings in 104 (see FIG. 14): "8" for the allowable value (kana) and "68" for the match rate (kana). Under these settings, 68% of the 8 characters of a candidate character string must match. Conversely, up to 32% of an 8-character candidate character string may mismatch, which means that a candidate still hits with up to two mismatched characters out of eight. As a result, the second candidate character string "マル＠メテ゛イタ" and the fourth candidate character string "マル＠メテ゛イア" registered in 152 each differ from the search string "アルトメテ゛イヤ" in two characters, so both hit and processing continues at 155. If the correct-answer rate of the candidate character string is lower than the search match rate in the system settings, a notification that no corresponding data exists in the image database is returned to the search source medium in 801, and processing ends.

[0136] The specific method for the image database search, and for comparing the match rate (correct-answer rate) of the candidate character strings with the match rate preset in the system, is described in the fourth embodiment.

[0137] 155: The data matched by the character strings registered in the image database is returned to the search source medium, and processing ends. In this embodiment, the image data matched by the second candidate character string "マル＠メテ゛イタ" and the fourth candidate character string "マル＠メテ゛イア" registered in the image database is returned to the search source fax terminal. Although two candidate character strings hit, both belong to the same image data, so only one copy of the image data is returned to the search source fax terminal.
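The hit rule of step 154 can be checked with a short calculation. The sketch below is illustrative only and rests on two assumptions that this embodiment does not state explicitly: a position holding the uncertain character "＠" is not counted as a mismatch (this reproduces the two-mismatch count given above), and only registered strings with the same number of characters as the query are compared (as described in the fourth embodiment). All function names are hypothetical.

```python
import math

UNCERTAIN = "＠"  # placeholder the recognizer emits for an uncertain character


def allowed_mismatches(length, match_rate_percent):
    """Largest number of mismatched characters that still meets the rate."""
    required = math.ceil(length * match_rate_percent / 100)
    return length - required


def mismatches(registered, query):
    """Count differing positions; positions holding the uncertain character
    are skipped (an assumption, see the lead-in)."""
    return sum(1 for r, q in zip(registered, query)
               if r != q and UNCERTAIN not in (r, q))


def hits(registered_strings, query, match_rate_percent):
    """Return registered strings of the query's length within the limit."""
    limit = allowed_mismatches(len(query), match_rate_percent)
    return [s for s in registered_strings
            if len(s) == len(query) and mismatches(s, query) <= limit]


# The seven strings registered in step 152 and the single query of step 153.
REGISTERED = [
    "マルチメテ゛イタ", "マル＠メテ゛イタ", "マルチメテ゛イア", "マル＠メテ゛イア",
    "マルチメテイタ", "マノレチメテ゛イタ", "マルチメテ゛イア",
]
result = hits(REGISTERED, "アルトメテ゛イヤ", 68)
```

With a match rate of 68 and 8 characters, at least 6 characters must match, so 2 mismatches are allowed; under the assumptions above only the second and fourth candidates stay within that limit.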

[0138] [Embodiment 3] This embodiment is a specific example of how the dictionary DB shown in the flowchart of FIG. 12 is used and processed. FIG. 20 is an explanatory diagram of this method, in which (1) to (8) denote keyword data (candidate character strings).

[0139] As described above, the nouns expected to be registered as keyword data in the image database are registered in the dictionary DB together with their character counts. When keyword data is registered in the image database, the nouns registered in the dictionary DB are consulted as needed. Specifically, when the keyword data to be registered in the image database contains the uncertain character "＠", the dictionary DB is consulted, and if similar keyword data has been registered in the dictionary DB in advance, that word is taken and additionally registered in the image database as keyword data. The operation is described below following steps (a) to (e), with reference to FIG. 20.

[0140] (a) It is determined whether any keyword data after character recognition contains the uncertain character "＠". For keyword data that does not contain it (the cases of (7) and (8)), the dictionary DB is not consulted and only the recognized keyword data is registered in the image database as is. If keyword data containing the uncertain character "＠" exists, as in (3), the dictionary DB is consulted.

[0141] (b) The dictionary DB is searched by the character count of the keyword data, and only entries with the same character count are compared. In this embodiment the recognized keyword data (3) is "＠ルチメテ゛イア", which contains the uncertain character "＠", so the dictionary DB is consulted.

[0142] (c) Two 8-character entries are registered in the dictionary DB, (4) "マルチメテ゛イア" and (5) "コンヒ゜ユーター", and these are the entries to be compared.

[0143] (d) First, the recognized keyword data (3) "＠ルチメテ゛イア" is compared with dictionary entry (4) "マルチメテ゛イア". The comparison is made character by character and covers every character except the uncertain character "＠". For dictionary entry (4) "マルチメテ゛イア", every character other than the uncertain character "＠" matches, so "マルチメテ゛イア" is added as one more item of keyword data and registered in the image database. Next, (3) "＠ルチメテ゛イア" is compared with dictionary entry (5) "コンヒ゜ユーター". Some characters other than the uncertain character "＠" do not match, so (5) "コンヒ゜ユーター" is not additionally registered in the image database as keyword data.

[0144] (e) As the final result of steps (a) to (d), four items of keyword data are registered in the image database: the recognized strings (1) "アルチメテ゛イア", (2) "アルチメテイア" and (3) "＠ルチメテ゛イア", and (4) "マルチメテ゛イア" obtained from the comparison with the dictionary DB.
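Steps (a) to (e) can be sketched as follows. This is an illustrative reading of the procedure, not the patent's code: the function names are hypothetical, and the dictionary DB is reduced to a plain list of strings whose length stands in for the stored character count.

```python
UNCERTAIN = "＠"  # placeholder for a character the recognizer could not decide


def matches_entry(keyword, entry):
    """True if entry has the same length as keyword and agrees with it at
    every position except those holding the uncertain character."""
    return len(keyword) == len(entry) and all(
        k == e for k, e in zip(keyword, entry) if k != UNCERTAIN)


def expand_keywords(recognized, dictionary):
    """Return the recognized keywords plus every dictionary entry that
    matches a keyword containing the uncertain character."""
    result = list(recognized)
    for keyword in recognized:
        if UNCERTAIN not in keyword:      # step (a): no dictionary lookup
            continue
        for entry in dictionary:          # steps (b) to (d)
            if matches_entry(keyword, entry):
                result.append(entry)
    return result


RECOGNIZED = ["アルチメテ゛イア", "アルチメテイア", "＠ルチメテ゛イア"]  # (1) to (3)
DICTIONARY = ["マルチメテ゛イア", "コンヒ゜ユーター"]                   # (4), (5)
keywords = expand_keywords(RECOGNIZED, DICTIONARY)
```

Here `keywords` holds the three recognized strings followed by "マルチメテ゛イア", which corresponds to the four items listed in step (e).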

[0145] [Embodiment 4] This embodiment is a specific example of the image database search method shown in FIG. 12 and of the processing that compares the match rate of the candidate character strings with the match rate preset in the system. FIG. 21 is an explanatory diagram of this processing, in which (1) to (11) denote keyword data (candidate character strings).

[0146] This processing searches the image database using the keyword data registered in advance in the image database and the keyword data of the search request. In this embodiment, three items of image data are registered in advance in the image database: "マルチメテ゛イア" (image data A), "ハ゜ソコン" (image data B) and "コンヒ゜ユーター" (image data C). Five items of keyword data have been created for "マルチメテ゛イア", two for "ハ゜ソコン" and two for "コンヒ゜ユーター". For the search request, two items of keyword data, "マルチメテ゛イア" and "マルチメテイア", have been created for one item of image data (image data D), and the search is performed with these two. The processing is described below following steps (a) to (h), with reference to FIG. 21.

[0147] (a) First, the image database is searched with keyword data (1) "マルチメテ゛イア".

[0148] (b) Keyword data (1) "マルチメテ゛イア" has 8 characters (a dakuten also counts as one character). When the search request has 8 characters, only the 8-character keyword data registered in the image database is subject to this search. That is, the keyword data search first checks the character count of the keyword data and searches only the data whose character count matches.

[0149] (c) For the search request with (1) "マルチメテ゛イア", four items of keyword data become comparison targets for the match rate calculation: (3) "マルチメテ゛イア", (4) "アルチメテ゛イア", (5) "ヤルチメテ゛イマ" and (10) "コンヒ゜ユーター". The match rate is defined as the value obtained by comparing the keyword data of the search request with the retrieved keyword data character by character and expressing the result as a percentage. The formula is: match rate [%] = (number of matched characters, from comparing the keyword data of the search request with the keyword data in the image database) ÷ (number of characters in the keyword data) × 100. For example, if the keyword data of the search request is "マルチメテ゛イア" and the retrieved keyword data is also "マルチメテ゛イア", 8 of 8 characters match, so the match rate is 100 [%].

[0150] (d) As in (c), the keyword data of the search request is compared, one item at a time, with the four items of keyword data that became comparison targets in (c). The results are as follows. For (3) "マルチメテ゛イア", 8 of 8 characters match, so the match rate is 100 [%]. For (4) "アルチメテ゛イア", 7 of 8 characters match, so the match rate is 87.5 [%]. For (5) "ヤルチメテ゛イマ", 6 of 8 characters match, so the match rate is 75 [%]. For (10) "コンヒ゜ユーター", 0 of 8 characters match, so the match rate is 0 [%].

[0151] (e) Separately from the match rate described in (d), which is calculated by comparing the keyword data of the search request with the registered keyword data character by character, there is a match rate that is set in the system in advance. This value must be entered through the GUI on the system settings screen of FIG. 14 and is used to decide whether to return the search result to the search requester. Specifically, the calculated match rate is compared with the match rate set in the system settings. If the calculated match rate is higher, the keyword data is judged to have hit and the image data is returned to the search requester. If the calculated match rate is lower, a notification that "no corresponding data exists" is returned. However, even when the calculated match rate is lower, the notification is not returned if at least one other item of keyword data has hit.

[0152] In this embodiment, the match rate set in the system settings is 80. In this case, (3) "マルチメテ゛イア" has a match rate of 100 [%] and is judged to have hit. (4) "アルチメテ゛イア" has a match rate of 87.5 [%] and is judged to have hit. (5) "ヤルチメテ゛イマ" has a match rate of 75 [%] and is judged not to have hit. (10) "コンヒ゜ユーター" has a match rate of 0 [%] and is judged not to have hit. Although (5) and (10) did not hit, (3) and (4) did, so the notification that "no corresponding data exists" is not returned. Because (3) and (4) belong to the same image data (image data A), the same image is not returned twice. As a result, a single image, image data A, is returned to the search requester.

[0153] (f) In the same way as in (a) to (e), the image database is searched with (2) "マルチメテイア".

[0154] (g) Keyword data (2) "マルチメテイア" has 7 characters, so the keyword data compared in the match rate calculation is (6) "マルチメテイア" and (11) "コンヒユーター". For (6) "マルチメテイア", 7 of 7 characters match, so the match rate is 100 [%]. For (11) "コンヒユーター", 0 of 7 characters match, so the match rate is 0 [%]. For the comparison between the calculated match rate and the match rate set in the system, the match rate set in the system settings is again 80. In this case, (6) "マルチメテイア" has a match rate of 100 [%] and is judged to have hit. (11) "コンヒユーター" has a match rate of 0 [%] and is judged not to have hit. Although (6) hit, it belongs to the same image data (image data A) as (3) above, so the same image is not returned twice. Although (11) did not hit, (3) did, so the notification that "no corresponding data exists" is not returned.

[0155] (h) As the final result of (a) to (g), a single image, image data A, is returned to the search requester.
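The search of steps (a) to (h) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names and the `(keyword, image_id)` list are hypothetical, a rate equal to the threshold is treated as a hit (the text only covers "higher" and "lower"), and handling of the uncertain character is left out because none appears in this example.

```python
def match_rate(query, registered):
    """Percentage of positions at which two equal-length strings agree."""
    if len(query) != len(registered):
        raise ValueError("match rate is defined only for equal-length strings")
    matched = sum(1 for q, r in zip(query, registered) if q == r)
    return 100.0 * matched / len(query)


def search(entries, queries, threshold_percent):
    """Return the distinct image IDs hit by any query keyword.

    entries is a list of (keyword, image_id) pairs. An empty result stands
    for the "no corresponding data exists" notification.
    """
    hit_images = []
    for query in queries:
        for keyword, image_id in entries:
            if len(keyword) != len(query):            # step (b): length filter
                continue
            if (match_rate(query, keyword) >= threshold_percent
                    and image_id not in hit_images):  # one copy per image
                hit_images.append(image_id)
    return hit_images


# Registered keywords named in the text; the remaining entries of FIG. 21
# are not given in the text and are omitted here.
ENTRIES = [
    ("マルチメテ゛イア", "A"),   # (3)
    ("アルチメテ゛イア", "A"),   # (4)
    ("ヤルチメテ゛イマ", "A"),   # (5)
    ("マルチメテイア", "A"),     # (6)
    ("コンヒ゜ユーター", "C"),   # (10)
    ("コンヒユーター", "C"),     # (11)
]
result = search(ENTRIES, ["マルチメテ゛イア", "マルチメテイア"], 80)
```

With a threshold of 80, keywords (3), (4) and (6) hit, all of them for image data A, so `result` contains that image once.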

[0156] The processing procedures of the program processing shown in the first to fourth embodiments described above can be executed by a computer, and a program for causing a computer to execute those procedures can be recorded on a computer-readable medium, such as a floppy disk or a CD-ROM, and distributed.

【０１５７】[0157]

[Effects of the Invention] According to the present invention, a database access sheet can be sent to the image database center as image data, so the image database center can be accessed from existing fax terminals. The database access sheet can also be read as image data by a scanner, so the image database center can be accessed from a scanner as well. Because character recognition is used, keyword data and image data can be written together on the same page of the same database access sheet, and the writing area is free-format. This makes the system versatile, does not tie it to particular users, and makes access to the image database easy. When katakana and numerals are used, recognizing the two separately with a dedicated dictionary for each achieves high-accuracy character recognition. Generating multiple candidate character strings, or additionally generating candidate character strings from which the dakuten or handakuten has been removed, makes matching easier at search time. Therefore, according to the present invention, the processing result from the image database center that the search requester wants can be sent efficiently to the search requester's fax terminal.

[Brief description of the drawings]

FIG. 1 is a system configuration diagram including an image database center device according to a first embodiment of the present invention.

FIG. 2 is a functional block diagram of the image database center device in the above embodiment.

FIG. 3 is a diagram showing the service concept provided by the image database center device in the above embodiment.

FIG. 4 is a diagram showing examples of how keywords are written on the database access sheet in the above embodiment; (a) and (b) show examples in which the keyword is marked with a double line, and (c) shows an example in which the keyword is written in the image data writable area from which keywords are extracted.

FIG. 5 is a diagram showing examples of character types that can be written on the database access sheet in the above embodiment; (a) shows numerals, and (b) and (c) show katakana characters.

FIGS. 6(a) and 6(b) are flowcharts (part 1) showing an embodiment of program processing in the image database center device of the above embodiment.

FIG. 7 is a flowchart (part 2) showing an embodiment of program processing in the image database center device of the above embodiment.

FIG. 8 is a flowchart (part 3) showing an embodiment of program processing in the image database center device of the above embodiment.

FIG. 9 is a flowchart (part 4) showing an embodiment of program processing in the image database center device of the above embodiment.

FIG. 10 is a flowchart (part 5) showing an embodiment of program processing in the image database center device of the above embodiment.

FIG. 11 is a flowchart (part 6) showing an embodiment of program processing in the image database center device of the above embodiment.

FIG. 12 is a flowchart (part 7) showing an embodiment of program processing in the image database center device of the above embodiment.

FIG. 13 is a diagram showing an example of the system settings screen in the above embodiment.

FIG. 14 is a diagram showing the continuation of the example system settings screen in the above embodiment.

FIGS. 15(a) and 15(b) are diagrams (part 1) explaining the method of extracting keywords written on the database access sheet in the above embodiment.

FIGS. 16(a), 16(b), 16(c) and 16(d) are diagrams (part 2) explaining the method of extracting keywords written on the database access sheet in the above embodiment.

FIG. 17 is a diagram showing an example of the output of the character recognition AP in the above embodiment.

FIG. 18 is a diagram showing an example of entries registered in the dictionary DB in the above embodiment.

FIG. 19 is an explanatory diagram of the image database, used to explain the second embodiment of the present invention.

FIG. 20 is an explanatory diagram showing an example of use of the dictionary DB, used to explain the third embodiment of the present invention.

FIG. 21 is a diagram explaining the keyword search method, used to explain the fourth embodiment of the present invention.

[Explanation of symbols]

10 ... Image database center device
11 ... Scanner
12 ... Image database
13 ... Image database center main device
14 ... Line board
15 ... Communication network
16, 17 ... Fax terminals
18 ... Personal computer (PC)
20 ... Paper
21 ... Keyword data writing example
22 ... Black line (double line) for extracting keyword data
23 ... Keyword data writing example
24 ... Image data writable area

Continued from the front page: (72) Inventor: Takahiro Oohora, c/o Nippon Telegraph and Telephone Corporation, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo

Claims

[Claims]

1. An image database center device that transmits and receives image data to and from communication media and registers and retrieves image data, comprising: first means for receiving image data from a registration requester, either from a communication medium via a communication network or from an image scanner; second means for extracting candidate character strings from the image data received from the registration requester and generating one or more high-accuracy candidate character strings including an uncertain character; third means for registering the candidate character strings generated by the second means in an image database in association with the image data from which the candidate character strings were extracted; fourth means for receiving image data from a communication medium of a search requester via the communication network; fifth means for extracting candidate character strings from the image data received from the search requester and generating one or more high-accuracy candidate character strings including an uncertain character; sixth means for comparing the match rate between the candidate character strings generated by the fifth means and the candidate character strings registered in the image database with a preset match rate, and retrieving from the image database the image data corresponding to a candidate character string judged to match; and seventh means for transmitting the retrieved image data to the communication medium of the search requester.

2. The image database center device according to claim 1, wherein the second and fifth means comprise character recognition means, an absolute reference value determination unit, a fixed reference value determination unit, and a relative reference value determination unit, and generate the candidate character strings; the character recognition means ranks a plurality of candidate characters as a character recognition result and passes the plurality of candidate characters and their corresponding distance values to the absolute reference value determination unit, the fixed reference value determination unit, and the relative reference value determination unit; the absolute reference value determination unit compares the distance value corresponding to each of the ranked candidate characters with a preset absolute reference value and determines, according to which of the distance value and the absolute reference value is larger, whether or not the candidate character is to be combined into a candidate character string; the fixed reference value determination unit subtracts the distance values corresponding to an upper-ranked and a lower-ranked candidate character, compares the difference with a preset fixed reference value, and determines, according to which of the difference and the fixed reference value is larger, whether only the upper-ranked candidate character is to be combined into a candidate character string or the upper-ranked and lower-ranked candidate characters are each to be combined into candidate character strings; and the relative reference value determination unit subtracts the distance values corresponding to the upper-ranked and lower-ranked candidate characters, compares the difference with a preset relative reference value, and determines, according to which of the difference and the relative reference value is larger, whether the upper-ranked and lower-ranked candidate characters are to be replaced with the uncertain character and combined into a candidate character string or are each to be combined into candidate character strings.

3. The image database center device according to claim 1 or 2, wherein the sixth means, when comparing match rates between the one or more candidate character strings generated from the image data received from the registration requester and the one or more candidate character strings generated from the image data received from the search requester, calculates the match rate for each pairing of an individual candidate character string generated at the time of the registration request with an individual candidate character string generated at the time of the search request, and retrieves from the image database the image data corresponding to a candidate character string that includes a character string judged, on the basis of the preset match rate, to match.

4. The image database center device according to claim 1, further comprising means for, when the relative reference value determination unit has replaced a candidate character with the uncertain character, searching a dictionary database for a character string corresponding to the candidate character string including the uncertain character, and additionally generating the matching character string from the dictionary database as a candidate character string.

5. The image database center device according to any one of claims 1 to 4, comprising: means for additionally generating, for a candidate character string including a dakuten or handakuten, a candidate character string from which the dakuten or handakuten has been removed; and means for additionally generating, for a candidate character string including a separated character, a candidate character string in which that character is replaced with the characters obtained by separating it into two characters.

6. An image database registration and search method for an image database center that transmits and receives image data to and from communication media and registers and retrieves image data, comprising: a first procedure of receiving image data from a registration requester, either from a communication medium via a communication network or from an image scanner; a second procedure of extracting candidate character strings from the image data received from the registration requester and generating one or more high-accuracy candidate character strings including an uncertain character; a third procedure of registering the candidate character strings generated in the second procedure in an image database in association with the image data from which the candidate character strings were extracted; a fourth procedure of receiving image data from a communication medium of a search requester via the communication network; a fifth procedure of extracting candidate character strings from the image data received from the search requester by character recognition software and generating one or more high-accuracy candidate character strings including an uncertain character; a sixth procedure of comparing the match rate between the candidate character strings generated in the fifth procedure and the candidate character strings registered in the image database with a preset match rate, and retrieving from the image database the image data corresponding to a candidate character string judged to match; and a seventh procedure of transmitting the retrieved image data to the communication medium of the search requester.

7. The image database registration / search method according to claim 6, wherein, when the character recognition software of the second and fifth procedures recognizes a character string in which katakana and numerals are mixed, the mixed character string includes a delimiter between the katakana and the numerals, the katakana and the numerals are distinguished as the parts before and after the delimiter, and character recognition uses a katakana-only dictionary for the katakana part and a numeral-only dictionary for the numeral part.
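
As an illustration only, the dictionary switching of claim 7 can be sketched by filtering per-position recognition candidates. The claim does not define the data structures, so the function name `restrict_candidates`, the ASCII hyphen as delimiter, and the assumption that the katakana part precedes the numeral part are all hypothetical.

```python
# Hypothetical sketch: restrict OCR candidates to a katakana-only set before
# the delimiter and to a numeral-only set after it.
KATAKANA = {chr(c) for c in range(0x30A1, 0x30FB)} | {"ー"}
DIGITS = set("0123456789")


def restrict_candidates(positions, delimiter="-"):
    """positions: one candidate list per character position, best first.

    Assumption: the katakana part comes first; the position whose best
    candidate is the delimiter switches to the numeral-only set.
    """
    allowed = KATAKANA
    result = []
    for candidates in positions:
        if candidates and candidates[0] == delimiter:
            allowed = DIGITS
            result.append([delimiter])
            continue
        # Keep only candidates that belong to the active dictionary.
        result.append([c for c in candidates if c in allowed])
    return result


recognized = [["ア", "7"], ["1", "イ"], ["-"], ["イ", "1"], ["7", "ア"]]
print(restrict_candidates(recognized))
```

In this sketch a numeral misread in the katakana part (or the reverse) is simply discarded, which is one plausible way a per-part dictionary could raise the recognition rate.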

8. The image database registration / search method according to claim 6 or 7, wherein the second and fifth procedures include:
a procedure of ranking a plurality of candidate characters as the character recognition result of the character recognition software and returning the plurality of candidate characters together with their corresponding distance values;
an absolute reference value determination procedure of comparing the distance value of each ranked candidate character with a preset absolute reference value and determining, from the magnitudes of the distance value and the absolute reference value, whether the candidate character is to be combined into a candidate character string;
a fixed reference value determination procedure of subtracting the distance values of an upper-ranked and a lower-ranked candidate character, comparing the difference with a preset fixed reference value, and determining, from the magnitudes of the difference and the fixed reference value, whether only the upper-ranked candidate character is to be combined into a candidate character string or the upper-ranked and lower-ranked candidate characters are each to be combined into candidate character strings;
a relative reference value determination procedure of subtracting the distance values of the upper-ranked and lower-ranked candidate characters, comparing the difference with a preset relative reference value, and determining, from the magnitudes of the difference and the relative reference value, whether the upper-ranked and lower-ranked candidate characters are to be replaced with an uncertain character and combined into a candidate character string or are each to be combined into candidate character strings; and
a procedure of generating the candidate character strings by combining the candidate characters on the basis of these determinations.
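
The three determinations of claim 8 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: a smaller distance value is taken to mean a better match, the direction of each comparison is assumed, the relative reference value is assumed to be smaller than the fixed reference value, and the `?` marker and the function names are hypothetical.

```python
from itertools import product

UNCERTAIN = "?"  # hypothetical marker for an uncertain character


def choose_characters(ranked, absolute_ref, fixed_ref, relative_ref):
    """ranked: list of (character, distance) sorted by ascending distance."""
    # Absolute reference: drop candidates whose distance is too large (assumed).
    kept = [(ch, d) for ch, d in ranked if d <= absolute_ref]
    if not kept:
        return [UNCERTAIN]  # assumption: nothing reliable at this position
    if len(kept) == 1:
        return [kept[0][0]]
    (top, d_top), (second, d_second) = kept[0], kept[1]
    gap = d_second - d_top
    # Fixed reference: a large gap means the top candidate clearly wins.
    if gap >= fixed_ref:
        return [top]
    # Relative reference: a very small gap means the two are indistinguishable.
    if gap < relative_ref:
        return [UNCERTAIN]
    return [top, second]


def candidate_strings(positions, absolute_ref, fixed_ref, relative_ref):
    """Combine the per-position choices into every candidate string."""
    choices = [
        choose_characters(r, absolute_ref, fixed_ref, relative_ref)
        for r in positions
    ]
    return ["".join(combo) for combo in product(*choices)]


positions = [
    [("カ", 10), ("ガ", 80)],  # gap 70: only the top candidate is kept
    [("ハ", 20), ("バ", 35)],  # gap 15: both candidates are kept
    [("1", 40), ("7", 42)],    # gap 2: replaced with the uncertain marker
]
print(candidate_strings(positions, absolute_ref=100, fixed_ref=30, relative_ref=5))
```

Because every position contributes one or two choices, the number of generated strings grows multiplicatively, which suggests why the thresholds would be used to prune unlikely candidates early.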

9. The image database registration / search method according to any one of claims 6 to 8, wherein, in the sixth procedure, when comparing the one or more candidate character strings generated from the image data received from the registration requester with the one or more candidate character strings generated from the image data received from the search requester, a match rate is calculated between each individual character string of the candidate character string group generated at the time of the registration request and each individual character string of the candidate character string group generated at the time of the search request, and the image database is searched for the image data corresponding to a candidate character string group that includes a character string determined to match on the basis of a preset match rate.
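
The pairwise comparison of claim 9 can be sketched as below. The claim does not define how the match rate is computed, so the position-by-position agreement ratio used here, the dictionary-shaped database, and the names `match_rate` and `search` are assumptions made for illustration.

```python
def match_rate(a, b):
    """Fraction of positions at which two strings agree (assumed definition)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    same = sum(1 for x, y in zip(a, b) if x == y)
    return same / longest


def search(database, query_candidates, threshold):
    """Return ids of images whose registered candidate group contains a string
    that matches any query candidate at or above the preset match rate."""
    hits = []
    for image_id, registered in database.items():
        if any(
            match_rate(r, q) >= threshold
            for r in registered
            for q in query_candidates
        ):
            hits.append(image_id)
    return hits


# Hypothetical registered data: image id -> candidate strings stored at registration.
database = {
    "img1": ["カメラ12", "カメラ17"],
    "img2": ["テレビ30"],
}
print(search(database, ["カメラ72", "カメラ12"], threshold=0.8))
```

Comparing every registered string with every query string means an image is found even when only one of its stored candidates agrees with one of the query candidates.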

10. The image database registration / search method according to claim 8, wherein the relative reference value determination procedure includes a procedure of, when a candidate character has been replaced with an uncertain character, searching a dictionary database for the candidate character string including the uncertain character and additionally generating, as a candidate character string, a matching character string from the dictionary database.
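
The dictionary lookup of claim 10 can be sketched by treating the uncertain character as a single-character wildcard. The `?` marker, the list-based dictionary, and the function names are hypothetical; the claim does not specify the matching rule.

```python
UNCERTAIN = "?"  # hypothetical marker for an uncertain character


def matches(pattern, word):
    """True if word equals pattern, with each uncertain character matching any one character."""
    return len(pattern) == len(word) and all(
        p == UNCERTAIN or p == w for p, w in zip(pattern, word)
    )


def expand_with_dictionary(candidates, dictionary):
    """Append dictionary words that match a candidate containing an uncertain character."""
    extra = []
    for cand in candidates:
        if UNCERTAIN not in cand:
            continue
        for word in dictionary:
            if matches(cand, word) and word not in candidates and word not in extra:
                extra.append(word)
    return list(candidates) + extra


print(expand_with_dictionary(["カ?ラ"], ["カメラ", "カルタ", "カラ"]))
```

The original candidate is kept alongside the dictionary matches, so the comparison at search time can still use it if no dictionary word applies.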

11. The image database registration / search method according to any one of claims 6 to 10, wherein the second and fifth procedures include: a procedure of additionally generating, for a candidate character string including a voiced sound mark (dakuten) or a semi-voiced sound mark (handakuten), a candidate character string from which the voiced or semi-voiced sound mark has been removed; and a procedure of additionally generating, for a candidate character string including a separated character, a candidate character string in which that character is replaced with the two characters obtained by separating it.
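
The two expansions of claim 11 can be sketched with Unicode normalization. This is an illustration under assumptions: the claim does not say how the marks are removed, the `SEPARATED` table contains a single hypothetical entry, and the function names are invented for the example.

```python
import unicodedata

# Combining voiced (U+3099) and semi-voiced (U+309A) sound marks.
VOICING_MARKS = {"\u3099", "\u309A"}

# Hypothetical table of characters that a recognizer might split in two.
SEPARATED = {"ル": "ノレ"}


def strip_voicing(text):
    """Remove dakuten and handakuten, e.g. バ or パ becomes ハ."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if ch not in VOICING_MARKS)
    return unicodedata.normalize("NFC", stripped)


def split_separated(text):
    """Replace each separated character with its two component characters."""
    return "".join(SEPARATED.get(ch, ch) for ch in text)


def expand_variants(candidates):
    """Add the mark-free and the separated variant of every candidate string."""
    result = list(candidates)
    for cand in candidates:
        for variant in (strip_voicing(cand), split_separated(cand)):
            if variant not in result:
                result.append(variant)
    return result


print(expand_variants(["ビル"]))
```

Decomposing to NFD first exposes the sound marks as separate combining code points, so no per-character lookup table is needed for the voiced and semi-voiced katakana.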

12. A computer-readable recording medium on which is recorded a program for causing a computer to execute the procedures of the image database registration / search method according to any one of claims 6 to 11.