JPH06103404A

JPH06103404A - Business card recognition device

Info

Publication number: JPH06103404A
Application number: JP4275243A
Authority: JP
Inventors: Hitoshi Sato; 仁佐藤
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1992-09-18
Filing date: 1992-09-18
Publication date: 1994-04-15

Abstract

PURPOSE: To correct errors in an address read from a business card more accurately. CONSTITUTION: The device comprises storage means that stores an address and the telephone number at that address in association with each other, an image scanner 5 as reading means that reads the image of a business card, and a character recognition device 6 as detection means that detects the address and the telephone number from the image read by the image scanner 5. The storage means is searched for a telephone number matching the one detected by the character recognition device 6, and the address detected by the character recognition device 6 is corrected on the basis of the address associated with that telephone number. The correction is made more accurate by also using a postal code.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a business card recognition device suitable for use in correcting errors in an address read from a business card.

[0002]

[Prior Art] In a conventional business card recognition device, the image of a business card is read, character recognition is performed on the image, and character strings corresponding to, for example, a name, an address, a telephone number, or a postal code are extracted from the business card.

[0003] The name, address, telephone number, postal code, and so on are then recorded as character strings on a recording medium such as a disk and compiled into a database, so that, for example, a telephone number can be looked up when needed.

[0004]

[Problems to Be Solved by the Invention] An address has many characters and mixes kanji, hiragana, katakana, and numerals, so recognizing it accurately from the image of a business card has been more difficult than recognizing a telephone number or a postal code, which consist almost entirely of numerals.

[0005] One approach is to correct the address using the postal code, which is related to the address. That is, the addresses to which the postal code narrows the possibilities are compared with the character-recognized address, and the recognized address is corrected so that it matches one of them.

[0006] With this method, however, address errors can be corrected only to a certain extent, and correcting them more accurately has been difficult.

[0007] The present invention has been made in view of this situation, and makes it possible to correct errors in an address read from a business card more accurately.

[0008]

[Means for Solving the Problems] The business card recognition device according to claim 1 comprises: a RAM 13 as storage means that stores an address and the telephone number at that address in association with each other; an image scanner 5 as reading means that reads the image of a business card; a character recognition device 6 as detection means that detects an address and a telephone number from the image of the business card read by the image scanner 5; and processing steps S2 to S9 of a program as correction means that searches the RAM 13 for a telephone number matching the telephone number detected by the character recognition device 6 and corrects the address detected by the character recognition device 6 on the basis of the address associated with that telephone number.

[0009] In the business card recognition device according to claim 2, the RAM 13 further stores the address and the postal code at that address in association with each other, the character recognition device 6 further detects a postal code from the image of the business card, and processing steps S2 to S9 of the program search the RAM 13 for a postal code matching the postal code detected by the character recognition device 6 and further correct the address detected by the character recognition device 6 on the basis of the address associated with that postal code.

[0010] In the business card recognition device according to claim 3, the RAM 13 hierarchically stores the address, the telephone number at that address, or the postal code at that address.

[0011]

[Operation] In the business card recognition device configured as above, the RAM 13 stores an address in association with the telephone number or postal code at that address. The image of a business card is read, and an address, a telephone number, or a postal code is detected from the image. The RAM 13 is then searched for a telephone number or postal code matching the telephone number or postal code detected by the character recognition device 6, and the address detected by the character recognition device 6 is corrected on the basis of the address associated with that telephone number or postal code. Errors in the address read from the business card can therefore be corrected more accurately.

[0012]

[Embodiment] FIG. 1 is a block diagram showing the configuration of an embodiment of the business card recognition device of the present invention, and FIG. 2 is an overall view of it. A WS (workstation) 1 comprises a CPU 11, a ROM 12, and a RAM 13, and an interface 14 and an image board 15 are inserted into its expansion slots (not shown). The CPU 11 controls the entire device and reads and executes the system program stored in the ROM 12 or a user program stored in the RAM 13.

[0013] The ROM 12 stores the system program. The RAM 13 stores the user program as well as data needed for the operation of the device. The RAM 13 also stores a dictionary, such as that shown in FIG. 3, in which each address is associated with the postal code at that address and with part of the telephone number at that address, for example the area code.

[0014] In the figure, the symbol "*" appended to the end of a postal code is the postal code keyword, and the symbol "+" appended to the end of an area code is the area code keyword.

[0015] The RAM 13 also stores a keyword table (FIG. 5) consisting of address keywords, for example 「都」 (to), 「道」 (do), 「府」 (fu), 「県」 (ken), 「区」 (ku, ward), 「郡」 (gun, county), 「市」 (shi, city), 「村」 (mura, village), and 「町」 (machi, town), together with the postal code keyword "*", the telephone number keyword "+", and so on.

[0016] The interface 14 manages input and output with the character recognition device 6 (interface 22) on behalf of the CPU 11. The image board 15 manages input and output with the image scanner 5 or the printer 7 on behalf of the CPU 11.

[0017] The display 2 displays character data and graphic data supplied from the WS 1. The keyboard 3 or the mouse 4 is operated to enter, for example, commands or data needed for the operation of the device.

[0018] The image scanner 5 reads the image of a business card set in it and outputs the image data to the WS 1 (interface 14).

[0019] The character recognition device 6 comprises an OCR (Optical Character Reader) board 21 and an interface 22. The OCR board 21 recognizes the image data (character data) of the business card that was read by the image scanner 5 and is supplied from the WS 1 (RAM 13). The interface 22 manages the input of data to the character recognition device 6 and the output of data from it.

[0020] The printer 7 prints out characters or graphics according to print data supplied from the WS 1.

[0021] The operation is described next with reference to the flowchart of FIG. 4. First, a business card is set in the image scanner 5, and its image is read to generate image data. The image data of the business card is stored in the RAM 13 of the WS 1 via the image board 15 and then supplied to the character recognition device 6 via the interface 14.

[0022] In the character recognition device 6 (OCR board 21), the character patterns stored in its built-in memory (not shown) are referenced, and character recognition is performed on the image data of the business card supplied from the WS 1 (RAM 13). The character recognition device 6 then extracts, from the character-recognized image data, the character strings corresponding to the postal code, the telephone number, and the address.

[0023] Then, in step S1 of FIG. 4, the dictionary stored in the RAM 13 (FIG. 3) is expanded as shown in FIG. 5. In the dictionary, the address keywords 「都」, 「道」, 「府」, and 「県」 are first treated as special keywords belonging to the top level of the hierarchy and, together with the prefecture names to which they attach, are classified into four categories. Below each special keyword, the place names that follow the prefecture name (for example, names of municipalities, wards, and counties) are expanded hierarchically (in a tree structure) together with the keywords that attach to them (for example 「市」, 「町」, 「村」, 「区」, or 「郡」).

[0024] Each postal code or area code is expanded at the level below its corresponding address, together with the keyword "*" or "+", respectively.

[0025] Within one category, keywords or place names belonging to the same level are each linked by a single chain (for example, in the figure, the keywords 「区」, 「村」, and 「市」 in the level below Tokyo are linked by one chain). In addition, so that keywords or place names can be searched across categories, identical keywords in different categories (for example, the keyword 「市」 in the level below Tokyo and the keyword 「市」 in the level below Saitama Prefecture) are also linked by a single chain.

[0026] The addresses (pointers) of the keywords in the dictionary expanded in this way are stored, together with the keywords, in the keyword table in the RAM 13 (FIG. 5).
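The expansion of step S1 can be pictured with a minimal Python sketch. It assumes the flat dictionary of FIG. 3 is available as rows of (address parts, postal code, area code). The names `Node`, `expand`, and `ROWS`, the row layout, and the sample values are hypothetical and are not taken from the patent figures. The keyword table here maps each keyword to the nodes that carry it, standing in for the chained pointers.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    name: str                       # place name, postal code, or area code
    keyword: str                    # e.g. "都", "区", "*" (postal) or "+" (area)
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def expand(rows):
    """Expand flat rows into a tree and build a keyword -> nodes table."""
    roots = []
    keyword_table = defaultdict(list)

    def get_child(parent, name, keyword):
        # Reuse an existing node at this level, otherwise create and register it.
        pool = roots if parent is None else parent.children
        for node in pool:
            if (node.name, node.keyword) == (name, keyword):
                return node
        node = Node(name, keyword, parent)
        pool.append(node)
        keyword_table[keyword].append(node)
        return node

    for parts, postal_code, area_code in rows:
        node = None
        for name, keyword in parts:
            node = get_child(node, name, keyword)
        # Postal code and area code hang one level below their address.
        get_child(node, postal_code, "*")
        get_child(node, area_code, "+")
    return roots, keyword_table


# Illustrative rows only; not the contents of FIG. 3.
ROWS = [
    ([("東京", "都"), ("葛飾", "区")], "125", "03"),
    ([("東京", "都"), ("八王子", "市")], "192", "0426"),
    ([("埼玉", "県"), ("川口", "市")], "332", "048"),
]
roots, keyword_table = expand(ROWS)
```

Looking up `keyword_table["市"]` returns the 「市」 nodes of both Tokyo and Saitama, which is the cross-category search that the shared chain is meant to allow.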

[0027] After the dictionary is expanded in step S1, area code extraction, address keyword extraction, and postal code extraction are performed in parallel in steps S2, S3, and S4, respectively.

[0028] In step S2, keywords specific to telephone numbers (for example, the character strings "TEL", "Tel", 「電話」, 「電話番号」, or a telephone mark) are first removed from the character string corresponding to the telephone number of the business card output by the character recognition device 6, and any kanji numerals it contains are converted to Arabic numerals.

[0029] If the first character of the resulting character string is the numeral "0", the portion from the first character up to the character immediately before the delimiter "-" or "(" is extracted; if there is no delimiter "-" or "(", the first through sixth characters of the character string are extracted.

[0030] If the first character of the resulting character string is "(" and the next character is the numeral "0", the portion from the second character (the numeral "0") up to the character immediately before the delimiter ")" is extracted; if that delimiter is absent, the first through sixth characters of the character string are extracted.

[0031] The area code is thus extracted from the telephone number, the keyword "+" is appended to its end, and the process proceeds to step S5.

[0032] If the first character of the resulting character string is neither the numeral "0" nor "(", no area code is extracted from the telephone number.
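The extraction rules of step S2 can be sketched as follows. This is a minimal illustration that assumes the recognized string uses ASCII digits and the ASCII delimiters "-", "(" and ")"; the function name `extract_area_code`, the label list, and the use of "☎" for the telephone mark are assumptions. The fallback for a missing ")" follows the text literally and takes the first six characters.

```python
# Kanji numerals mapped to Arabic numerals.
KANJI_DIGITS = str.maketrans("〇一二三四五六七八九", "0123456789")
# Longer labels come first so that "電話番号" is removed before "電話".
PHONE_LABELS = ("電話番号", "電話", "TEL", "Tel", "☎")


def extract_area_code(text):
    """Return the area code with the keyword "+" appended, or None."""
    for label in PHONE_LABELS:
        text = text.replace(label, "")
    text = text.translate(KANJI_DIGITS).strip()

    if text.startswith("0"):
        # Cut just before the first "-" or "(", whichever comes first.
        cuts = [i for i in (text.find("-"), text.find("(")) if i != -1]
        code = text[:min(cuts)] if cuts else text[:6]
    elif text.startswith("(0"):
        end = text.find(")")
        # Literal reading of the text when ")" is absent: first six characters.
        code = text[1:end] if end != -1 else text[:6]
    else:
        return None  # no area code is extracted
    return code + "+"
```

For example, `extract_area_code("TEL 03-1234-5678")` yields `"03+"`, and a string that starts with neither "0" nor "(" yields `None`. The telephone numbers here are placeholders.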

[0033] In step S3, meanwhile, each character of the character string corresponding to the address of the business card output by the character recognition device 6 is first compared with the keywords stored in the keyword table, and the address keywords and their positions (character indices) are detected in the address string.

[0034] If no address keyword can be detected in the character string corresponding to the address, the candidate characters (characters similar to each character) prepared in advance (for example, ten of them) for each character of the string are compared with the keywords stored in the keyword table. If no keyword is detected among the candidate characters either, the processing is aborted.

[0035] After the address keywords and their positions have been detected in the character string corresponding to the address, character strings for address search are cut out of that string.

[0036] That is, the portion from the beginning of the address string up to the first address keyword is cut out, and then strings obtained by removing the leading character from the cut-out string are cut out one after another until only one character remains in front of the address keyword.

[0037] The same processing is applied in turn to the second and subsequent address keywords in the address string, the character strings for address search are cut out, and the process proceeds to step S6.

[0038] For example, suppose the character string corresponding to the address output by the character recognition device 6 is 「束京都葛飾区柴又３−１９−３」 (the correct address is 「東京都葛飾区柴又３−１９−３」, that is, 3-19-3 Shibamata, Katsushika-ku, Tokyo, but the character recognition device 6 has misrecognized the first character 「東」 as 「束」). In step S3, the address keywords 「都」 and 「区」 are first detected in this string, and on the basis of these keywords the character strings for address search (search strings) 「束京都」, 「京都」, 「束京都葛飾区」, 「京都葛飾区」, 「都葛飾区」, 「葛飾区」, and 「飾区」 are cut out of the address string.
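The cutting of search strings in step S3 can be sketched in a few lines. This is a minimal illustration: the function name `cut_search_strings` is hypothetical, the keyword set is limited to the nine address keywords listed for the keyword table, and the candidate-character fallback is omitted.

```python
ADDRESS_KEYWORDS = "都道府県区郡市村町"


def cut_search_strings(address):
    """Cut out every search string that ends at an address keyword."""
    results = []
    for pos, ch in enumerate(address):
        if ch not in ADDRESS_KEYWORDS:
            continue
        # Drop leading characters one at a time; stop while one character
        # still remains in front of the keyword.
        for start in range(pos):
            results.append(address[start:pos + 1])
    return results


search_strings = cut_search_strings("束京都葛飾区柴又3-19-3")
# ['束京都', '京都', '束京都葛飾区', '京都葛飾区', '都葛飾区', '葛飾区', '飾区']
```

The result reproduces the seven search strings listed in the example, including 「京都」 and 「飾区」, which do not name the intended places but are still tried against the dictionary.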

[0039] At the same time, in step S4, the postal code keyword "*" is appended to the end of the character string corresponding to the postal code of the business card output by the character recognition device 6, and the process proceeds to step S7.

[0040] After the processing in steps S2 to S4 has been performed in parallel in this way, the process proceeds to steps S5 to S7, where addresses are selected from the dictionary expanded in step S1 on the basis of the area code, the search strings, and the postal code, respectively.

[0041] In step S5, the keyword "+" appended to the end of the area code extracted in step S2 is first looked up in the keyword table, and the address at which that keyword "+" is stored in the dictionary (FIG. 5) is read from the table. On the basis of that address, the area codes stored together with the keyword "+" are read from the dictionary and compared with the area code extracted in step S2.

[0042] If the comparison finds that an area code matching the one extracted in step S2 is stored in the dictionary, the address (place names) belonging to the upper levels linked to the level of that area code is read out and stored as a candidate address (a candidate for the correct address) in a predetermined area reserved in the RAM 13, and the process proceeds to step S8.

[0043] If no area code matching the one extracted in step S2 is stored in the dictionary, the candidate characters (characters similar to each character) prepared in advance (for example, ten of them) for each character (each numeral) of the area code extracted in step S2 are compared with the characters of the area codes stored in the dictionary, and the area code extracted in step S2 is thereby matched against the area codes stored in the dictionary.
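The candidate-character fallback can be sketched as below. The sketch assumes the character recognition device supplies, for each position, a list of similar characters, and that a dictionary entry matches only when it has the same length and each of its characters appears among the candidates for that position. The function names and the length rule are assumptions; the text does not spell out the comparison in this detail.

```python
def matches_with_candidates(candidates, entry):
    """candidates[i] lists the OCR candidate characters for position i."""
    if len(candidates) != len(entry):
        return False
    return all(ch in cands for cands, ch in zip(candidates, entry))


def lookup_with_candidates(candidates, dictionary_entries):
    """Return the dictionary entries compatible with the candidate lists."""
    return [e for e in dictionary_entries if matches_with_candidates(candidates, e)]


# "03" misread as "O3": the candidates for the first position include "0".
ocr_candidates = [["O", "0", "D"], ["3", "8", "B"]]
matched = lookup_with_candidates(ocr_candidates, ["03", "06", "045"])
```

The same comparison applies to place names in step S6 and to postal codes in step S7, with the dictionary entries swapped accordingly.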

[0044] In step S6, meanwhile, the last character of each search string cut out in step S3, that is, the address keyword, is first looked up in the keyword table, and the address at which that keyword is stored in the dictionary (FIG. 5) is read from the table. On the basis of that address, the place names (including prefecture names) stored together with the address keyword are read from the dictionary and compared with the search string cut out in step S3 minus its last character (the address keyword).

[0045] If the comparison finds that a place name matching the search string minus its last character (the address keyword) is stored in the dictionary, all place names belonging to the upper and lower levels linked to the level of that place name are read out, the addresses formed by combining these place names are stored as candidate addresses (candidates for the correct address) in a predetermined area reserved in the RAM 13, and the process proceeds to step S8.

[0046] For example, if the search string is 「葛飾区」, the string 「葛飾」 obtained by removing its last character (the address keyword) 「区」 matches the place name 「葛飾」. All place names (addresses) belonging to the upper and lower levels linked to the level of 「葛飾」 (the shaded portion in FIG. 6) are read out, and the addresses formed by combining these place names are stored as candidate addresses (candidates for the correct address) in a predetermined area reserved in the RAM 13.
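The way step S6 combines upper and lower place names can be sketched with a small nested dictionary. The tree contents, the choice to store each place name together with its keyword, and the function names are hypothetical; postal code and area code leaves are left out for brevity.

```python
# Hypothetical miniature of the expanded dictionary.
TREE = {
    "東京都": {
        "葛飾区": {"柴又": {}, "亀有": {}},
        "八王子市": {},
    },
    "埼玉県": {"川口市": {}},
}


def paths_below(subtree):
    """Yield every chain of names below a node, starting with the empty chain."""
    yield []
    for name, child in subtree.items():
        for rest in paths_below(child):
            yield [name] + rest


def candidate_addresses(tree, target, prefix=()):
    """Combine a matched place name with all its upper and lower place names."""
    found = []
    for name, child in tree.items():
        here = prefix + (name,)
        if name == target:
            found += ["".join(here) + "".join(rest) for rest in paths_below(child)]
        found += candidate_addresses(child, target, here)
    return found


candidates = candidate_addresses(TREE, "葛飾区")
# ['東京都葛飾区', '東京都葛飾区柴又', '東京都葛飾区亀有']
```

Each candidate carries the upper-level name 「東京都」 even though the search string 「葛飾区」 did not contain it, which is what later allows the misread first character to be replaced.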

[0047] As in step S5, if no place name matching the search string cut out in step S3 minus its last character (the address keyword) is stored in the dictionary, step S6 compares the candidate characters (characters similar to each character) prepared in advance (for example, ten of them) for each character of that string with the characters of the place names stored in the dictionary, and thereby matches the search string minus its last character against the place names stored in the dictionary.

[0048] At the same time, in step S7, the keyword "*" appended to the end of the postal code extracted in step S4 is first looked up in the keyword table, and the address at which that keyword "*" is stored in the dictionary (FIG. 5) is read from the table. On the basis of that address, the postal codes stored together with the keyword "*" are read from the dictionary and compared with the postal code extracted in step S4.

[0049] If the comparison finds that a postal code matching the one extracted in step S4 is stored in the dictionary, the address (place names) belonging to the upper levels linked to the level of that postal code is read out and stored as a candidate address (a candidate for the correct address) in a predetermined area reserved in the RAM 13, and the process proceeds to step S8.

[0050] As in step S5, if no postal code matching the one extracted in step S4 is stored in the dictionary, step S7 compares the candidate characters (characters similar to each character) prepared in advance (for example, ten of them) for each character (each numeral) of the postal code extracted in step S4 with the characters of the postal codes stored in the dictionary, and thereby matches the postal code extracted in step S4 against the postal codes stored in the dictionary.

[0051] Candidate addresses are selected in steps S5 to S7 in this way. If the same candidate address is selected more than once in steps S5 to S7, the duplicates are deleted (ignored), keeping, for example, only the one selected first.
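Merging the three candidate lists while keeping only the first copy of each address can be sketched as follows. The function name `merge_candidates` and the sample lists are hypothetical.

```python
def merge_candidates(*candidate_lists):
    """Concatenate candidate lists, keeping the first occurrence of each address."""
    merged = {}  # dicts preserve insertion order
    for candidates in candidate_lists:
        for address in candidates:
            merged.setdefault(address, None)
    return list(merged)


from_area_code = ["東京都葛飾区"]
from_search_strings = ["東京都葛飾区", "東京都葛飾区柴又"]
from_postal_code = ["東京都葛飾区"]
unique = merge_candidates(from_area_code, from_search_strings, from_postal_code)
```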

[0052] The process then proceeds to step S8, where every candidate address stored in the RAM 13 is compared with the character string corresponding to the address output by the character recognition device 6, and the candidate that corresponds to the address output by the character recognition device 6 is determined from among them.

[0053] In step S8, the characters of each candidate address stored in the RAM 13 and the characters of the address string output by the character recognition device 6 are compared one by one from the beginning, and it is determined whether they match. For each candidate address stored in the RAM 13, the correct answer rate p defined by the following expression is then calculated:

p = (number of characters determined to match) / (number of characters compared)

[0054] In the address string output by the character recognition device 6, characters or character strings unrelated to the true (correct) address may have been erroneously added at the beginning during processing in that device, or characters or character strings belonging to the true (correct) address may have been erroneously deleted from the beginning.

[0055] Consequently, if the address string output by the character recognition device 6 and a candidate address are compared sequentially from their respective first characters, an accurate correct answer rate p may not be obtained.

[0056] Therefore, the candidate address string stored in the RAM 13 is compared with the address string output by the character recognition device 6 while characters are deleted one at a time from the beginning of the candidate, the address string output by the character recognition device 6 is likewise compared with the candidate address string while characters are deleted one at a time from its beginning, and the correct answer rate p is calculated.

[0057] Then, letting p_max be the maximum value of the correct answer rate p, the candidate address with the largest number of characters is detected from among the candidate addresses whose correct answer rate p is (p_max - C) or higher, and the process proceeds to step S9.

[0058] C is a predetermined coefficient, for example 0.135, determined in advance by a statistical method.
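The selection in step S8 can be sketched as below. The text does not say how the number of compared characters is counted or how the rates from the shifted comparisons are combined, so this sketch assumes that the compared count is the length of the overlap and that each candidate keeps its best rate over all shifts. The function names are hypothetical; C = 0.135 is the example value from the text.

```python
C = 0.135  # example coefficient given in the text


def rate(a, b):
    """Fraction of aligned positions that agree, comparing from the head."""
    n = min(len(a), len(b))
    return sum(x == y for x, y in zip(a, b)) / n if n else 0.0


def best_rate(candidate, recognized):
    """Best rate over the alignments obtained by dropping leading characters."""
    rates = [rate(candidate[i:], recognized) for i in range(len(candidate))]
    rates += [rate(candidate, recognized[i:]) for i in range(1, len(recognized))]
    return max(rates, default=0.0)


def select_address(candidates, recognized, c=C):
    """Among candidates with p >= p_max - c, pick the one with most characters."""
    rates = {cand: best_rate(cand, recognized) for cand in candidates}
    p_max = max(rates.values())
    eligible = [cand for cand, p in rates.items() if p >= p_max - c]
    return max(eligible, key=len)


recognized = "束京都葛飾区柴又3-19-3"
pool = ["東京都", "京都府", "東京都葛飾区", "東京都葛飾区柴又"]
chosen = select_address(pool, recognized)
```

Under these assumptions the rates are 2/3, 2/3, 5/6, and 7/8, so p_max - C is 0.74 and the two longest candidates qualify; the longer one, 「東京都葛飾区柴又」, is chosen. Preferring the longest qualifying candidate keeps as much of the address as the dictionary can confirm.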

[0059] In step S9, the address output by the character recognition device 6 is corrected using the candidate address detected in step S8, and the processing ends.

[0060] As a result of the above processing, if the address string output by the character recognition device 6 is, for example, 「束京都葛飾区柴又３−１９−３」, its first character 「束」 is corrected to 「東」, and the correct address 「東京都葛飾区柴又３−１９−３」 (3-19-3 Shibamata, Katsushika-ku, Tokyo) is output.

[0061] The corrected address is then recorded on, for example, a magnetic disk (not shown) and used as a database.

[0062] In this embodiment, the area code portion of the telephone number is associated with the address, but the local exchange number may also be associated with the address in addition to the area code.

[0063]

[Effects of the Invention] As described above, according to the business card recognition device of the present invention, the storage means stores an address in association with the telephone number or postal code at that address. The image of a business card is read, and an address, a telephone number, or a postal code is detected from the image. The storage means is then searched for a telephone number or postal code matching the telephone number or postal code detected by the detection means, and the address detected by the detection means is corrected on the basis of the address associated with that telephone number or postal code. Errors in the address read from the business card can therefore be corrected more accurately.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of an embodiment of the business card recognition device of the present invention.

FIG. 2 is an overall view of the embodiment of FIG. 1.

FIG. 3 is a diagram showing a dictionary in which addresses are associated with the postal codes and area codes at those addresses.

FIG. 4 is a flowchart for explaining the operation of the embodiment of FIG. 1.

FIG. 5 is a diagram showing the dictionary of FIG. 3 expanded into a hierarchical structure.

FIG. 6 is a diagram for explaining the selection of addresses in step S6 of the flowchart of FIG. 4.

[Explanation of symbols]

1 WS (workstation)
2 Display
3 Keyboard
4 Mouse
5 Image scanner
6 Character recognition device
7 Printer
11 CPU
12 ROM
13 RAM
14 Interface
15 Image board
21 OCR (Optical Character Reader) board
22 Interface

Claims

[Claims]

1. A business card recognition device comprising: storage means for storing an address and a telephone number at the address in association with each other; reading means for reading an image of a business card; detecting means for detecting an address and a telephone number from the image of the business card read by the reading means; and correcting means for searching the storage means for a telephone number that matches the telephone number detected by the detecting means, and correcting the address detected by the detecting means based on the address associated with that telephone number.

2. The business card recognition device according to claim 1, wherein the storage means further stores the address and a postal code at the address in association with each other, the detecting means further detects a postal code from the image of the business card, and the correcting means searches the storage means for a postal code that matches the postal code detected by the detecting means and further corrects the address detected by the detecting means based on the address associated with that postal code.

3. The business card recognition device according to claim 2, wherein the storage means hierarchically stores the address, the telephone number at the address, or the postal code at the address.