JPS60181987A

JPS60181987A - Optical reader of character

Info

Publication number: JPS60181987A
Application number: JP59038409A
Authority: JP
Inventors: Kunihiko Fujii; 邦彦藤井
Original assignee: Tokyo Sanyo Electric Co Ltd; Tokyo Electric Co Ltd
Current assignee: Tokyo Sanyo Electric Co Ltd; Toshiba Tec Corp
Priority date: 1984-02-29
Filing date: 1984-02-29
Publication date: 1985-09-17

Abstract

(57) [Abstract] This publication contains application data filed before electronic filing, so no abstract data is recorded.

Description

Detailed Description of the Invention

Technical Field of the Invention

The present invention relates to an optical character reading device that optically reads letters, numerals, symbols, and the like printed on forms, price tags, and similar media.

Technical Background and Problems

In devices of this type, the medium on which characters are printed is generally illuminated by a lamp, and whether a character is present is determined from the amount of reflected light. As a result, even something that is plainly not a character to the human eye is often recognized as a character if it changes the amount of reflected light.

In particular, when the medium is a price tag or the like, there is little space between the edge of the paper and the printed data, so the shadow cast at the paper edge is easily judged to be a character. Paper inside a plastic bag and coated paper also tend to produce unusual reflections, which are likewise easily judged to be characters. If this phenomenon appears within the normal data portion, it has little effect because the change in reflected light there is large. At either end of the data, however, the phenomenon acts directly and is often output as a character. Similarly, when characters or symbols not intended for reading are printed near the characters to be read, they tend to attach to both ends of the read data as noise.

For these reasons, one or more unwanted erroneous characters tend to appear at both ends of an input data character string.

One means of eliminating the influence of these erroneous characters at both ends is the technique described in Japanese Examined Patent Publication No. 58-36391. In that technique, a signal caused by the paper edge or the like at either end of the normal data is accepted as a signal but treated as an unidentifiable signal; when the format check section recognizes an unidentifiable signal, it removes it only if it lies at either end of the data character string. The influence of misreading at both ends of the data character string is thus prevented by removing unidentifiable signals.

In practice, however, a misreading caused by the paper edges or the like is more often read as a symbol such as - (minus), . (period), , (comma), or space than recognized as an unidentifiable character. When it is read as one of these specific characters, it is appended to the normal data, and an error results because the number of digits does not match, even though the normal data itself has been read correctly.
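The shortcoming described above can be illustrated with a minimal Python sketch. The reject marker `?`, the helper names `strip_rejects` and `format_ok`, and the eight-character field length are assumptions made for illustration; the cited publication does not specify them.

```python
REJECT = "?"  # hypothetical marker for an unidentifiable character


def strip_rejects(chars: str) -> str:
    """Prior-art style clean-up: drop only reject markers at either end."""
    return chars.strip(REJECT)


def format_ok(chars: str, expected_len: int = 8) -> bool:
    """Toy format check: the field must have exactly the expected length."""
    return len(chars) == expected_len


# Edge noise recognized as a reject is removed, so the check passes.
print(format_ok(strip_rejects("?A123-456?")))  # True
# Edge noise recognized as real symbols survives, so the digit count is wrong.
print(format_ok(strip_rejects("-A123-456.")))  # False
```

The second call fails even though A123-456 itself was read correctly, which is the case the invention addresses.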

Object of the Invention

An object of the present invention is to provide an optical character reading device that can improve the reading rate by deleting noise without damaging the content of the true data.

Summary of the Invention

In the present invention, the specific characters that a change in reflected light at a paper edge or the like causes to be read as recognizable characters, such as - (minus), . (period), , (comma), and space, are set in advance, and when they occur at either end of a data character string they are removed. The true data is thus left intact and no error signal is generated, so that noise at both ends has no effect and the reading rate is improved.

Embodiment of the Invention

The scanner (1) is a handheld type that scans along a character string and optically inputs a plurality of characters sequentially from one direction. A lens (3) and a lamp (4) face the reading aperture (2), and a CCD (5) serving as the photoelectric conversion element is provided inside.

The scanner also contains an LSI (6) for the CCD (5) and circuit components (7). The scanner (1) is connected, in order, to a quantization circuit (8), a detection and extraction circuit (9), a sampling circuit (10), a similarity calculation circuit (11), and a list processing and editing circuit (12).

The optically read analog data, shown at arrow A, is converted by the quantization circuit (8) into digital binary data, shown at arrow B.
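The conversion from analog to binary data can be sketched as simple thresholding. This is only an illustration: the fixed threshold of 0.5, the reflectance scale of 0 to 1, and the function name `quantize` are assumptions, since the specification does not state how the quantization circuit (8) sets its threshold.

```python
def quantize(samples, threshold=0.5):
    """Map each reflectance sample to 1 (dark, ink) or 0 (bright, paper).

    Both the 0..1 reflectance scale and the fixed threshold are assumed.
    """
    return [1 if s < threshold else 0 for s in samples]


# One hypothetical scan line crossing a single dark stroke.
scan_line = [0.92, 0.88, 0.31, 0.12, 0.35, 0.90]
print(quantize(scan_line))  # [0, 0, 1, 1, 1, 0]
```

A shadow at the paper edge lowers the reflectance in the same way ink does, which is why such a scheme alone cannot tell edge noise from a printed stroke.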

This binary data changes within the detection frame (13) as illustrated. Supposing that data representing the character "2" is being recognized, the signal within the detection frame (13) changes sequentially from left to right in the illustration and is extracted by the detection and extraction circuit (9). The extracted data is then normalized into data 16 bits high and 12 bits wide.
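Normalization to a fixed 16 by 12 grid can be sketched as follows. Nearest-neighbour resampling is an assumption chosen for brevity; the specification states only the output size, not the method, and the function name `normalize` is hypothetical.

```python
def normalize(glyph, rows=16, cols=12):
    """Resample a binary glyph (list of equal-length rows) to rows x cols.

    Uses nearest-neighbour sampling, which is an assumed method.
    """
    src_rows, src_cols = len(glyph), len(glyph[0])
    return [
        [glyph[r * src_rows // rows][c * src_cols // cols] for c in range(cols)]
        for r in range(rows)
    ]


# A small hypothetical extracted glyph, 4 rows by 3 columns.
small = [[0, 1, 0],
         [1, 1, 1],
         [0, 1, 0],
         [0, 1, 0]]
out = normalize(small)
print(len(out), len(out[0]))  # 16 12
```

A fixed grid size lets every extracted character be compared against stored reference patterns of the same dimensions in the later similarity calculation.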

Next, the recognized character data output from the list processing and editing circuit (12) is input to an I/O port (14). This I/O port is connected to a CPU (15), which in turn is connected via a data bus (20) and an address bus (21) to a ROM (16), a RAM (17), a counter (18), and an I/O port (19) that outputs to the host. The chip select (CS) of the counter (18) is connected to the I/O port (14), the ROM (16), the RAM (17), and the I/O port (19).

The medium is, for example, a price tag (22), on which the data A123-456 is printed. The specific characters that may be read as meaningful symbols because of the edge (23) of the price tag (22), dirt, or the like are taken to be - (minus), . (period), , (comma), and space. These specific characters were chosen because they were found to occur readily in actual tests under a variety of conditions.

Next, the operation is described with reference to FIG. 4.

First, after initialization, input data is accepted and character recognition is performed, and a check is made as to whether all characters have been input. Once all characters have been input, n, which denotes the digit position, is set to 1, and the first digit from the left end of the input characters is checked to see whether it is a specific character; if it is, that character is deleted. Then n is set to 2, and the process repeats until a character that is not a specific character is recognized. For example, suppose the characters read from the price tag (22) in FIG. 3 consist of the true data A123-456 with specific characters such as - and . attached to both ends. Since the true data is A123-456, this reading would result in an error if no processing were performed. As a result of the processing described above, the three specific characters at the left end are removed.

When the left-side processing reaches the character A, which is not a specific character, the digit position n is next set to 1 counting from the right, and the first digit from the right end of the input characters is checked to see whether it is a specific character. If it is, that character is removed; n is then incremented by 1 and the second digit, third digit, and so on are checked in turn. In the example above, the specific characters at the right end, which include - and ., are removed.

Removing the characters on the left and right in this way leaves the data A123-456. After various function checks are performed on this data, an error signal is output if there is an error arising from some other cause, and the data is output if it is normal. This noise deletion operation is carried out in the list processing and editing circuit (12).
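The left-then-right deletion described above can be sketched in Python as below. The function name `strip_edge_noise`, the constant `SPECIFIC_CHARS`, and the sample input string are assumptions for illustration; the embodiment performs this step in the list processing and editing circuit (12), not in software of this form.

```python
# Preset specific characters: minus, period, comma, space.
SPECIFIC_CHARS = frozenset("-., ")


def strip_edge_noise(chars: str, specific=SPECIFIC_CHARS) -> str:
    """Remove runs of specific characters from both ends of the string.

    Characters inside the string are never touched, so a hyphen that is
    part of the true data is preserved.
    """
    # Scan from the left until a non-specific character is found.
    left = 0
    while left < len(chars) and chars[left] in specific:
        left += 1
    # Then scan from the right in the same way.
    right = len(chars)
    while right > left and chars[right - 1] in specific:
        right -= 1
    return chars[left:right]


# Hypothetical reading: three noise characters on each side of the data.
print(strip_edge_noise("- -A123-456-.-"))  # A123-456
```

Because only the ends are scanned, the hyphen between 123 and 456 survives, and the subsequent format check sees the expected number of digits.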

Effects of the Invention

As described above, when a change in the amount of reflected light at either end of the characters or symbols printed on a medium causes something to be read as a specific character such as - (minus), . (period), , (comma), or space, the present invention removes those specific characters on the condition that they lie at either end of the whole character string as one digit or several consecutive digits. Noise can therefore be removed effectively without damaging the content of the true data, and the reading rate can be improved.

[Brief explanation of the drawing]

The drawings show an embodiment of the present invention. FIG. 1 is a block diagram, FIG. 2 is a block diagram of the circuit section centered on the CPU, FIG. 3 is a plan view of a price tag, and FIG. 4 is a flowchart.

5: CCD (photoelectric conversion element); 8: quantization circuit; 9: detection and extraction circuit; 10: sampling circuit; 12: list processing and editing circuit; 22: price tag (medium).

Applicant: Tokyo Electric Co., Ltd.

Claims

[Claims]

An optical character reading device comprising: a photoelectric conversion element that optically reads characters or symbols printed on a medium and converts them into electrical signals; a quantization circuit that converts the output of the photoelectric conversion element into a binary digital signal; a detection and extraction circuit that extracts image data; a sampling circuit that eliminates the effects of noise and the like in the image data; and a list processing and editing circuit that checks the format of the entire input data and, when one or more preset specific symbols occur consecutively at either end of the input data, removes those specific symbols before output.