JPH0150954B2

Info

Publication number: JPH0150954B2
Application number: JP57097606A
Authority: JP
Inventors: Akihiro Asada
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1982-06-09
Filing date: 1982-06-09
Publication date: 1989-11-01
Also published as: JPS58214972A

Description

[Detailed description of the invention]

The present invention relates to an online handwritten character recognition device, and in particular to one that reduces the burden on the writer and prevents a drop in character entry speed.

The prior art and its problems are explained with reference to FIGS. 1 and 2. In FIG. 1, 1 is a tablet (coordinate input device), 2 is a character entry sheet, 3 is an input pen, 4 is a preprocessing section, 5 is a feature extraction circuit, 6 is a matching circuit, 7 is a minimum value selection circuit, 8 is a code conversion circuit, 9 is an output terminal, 10 is an UP/DOWN detection circuit, 11 is a stroke count detection circuit, 12 is a standard pattern memory, 13 is a position comparison circuit, 14 is an indication frame position memory, and 15 is an uppercase/lowercase flag memory.

Using the input pen 3, the writer writes characters in the character entry frames 2-1 of the character entry sheet 2 placed on the tablet 1. The tablet 1 outputs the XY coordinates of the pen tip on output line 1-3 at fixed time intervals (the sampling period). The input pen 3 also contains a switch that detects whether the pen is pressed against the character entry sheet 2. The output of this switch is emitted on output line 1-3 every sampling period as Z-axis information, together with the XY coordinates.

The X, Y, and Z information is supplied to the preprocessing section 4 as handwriting information. The preprocessing section 4 first examines the Z-axis information, takes in only the samples for which the input pen 3 is pressed against the character entry sheet 2 (hereinafter, handwriting points), and applies the normalization described below.

The sequence of handwriting points contains redundant points: because the pen tip does not move at constant speed while writing, some adjacent handwriting points lie very close together. Redundant points are therefore removed. A stroke is the single line (sequence of handwriting points) drawn from the moment the input pen 3 touches the character entry sheet 2 until it leaves it. Starting from the first point of a stroke, the handwriting point that lies a fixed distance away is taken as a resampled point; the handwriting point a fixed distance from that resampled point becomes the next resampled point; and this is repeated up to the end of the stroke. In other words, the handwriting points of each stroke are converted from a sequence spaced uniformly in time to a sequence spaced uniformly in distance (hereinafter, resampling).
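The pen-down selection and the resampling step described above can be sketched roughly as follows. This is a minimal illustration, not the circuit of the patent: the function names `pen_down_points` and `resample_stroke`, the `(x, y, z)` sample layout, and the rule "keep the first handwriting point at least `step` away from the last kept point" (with no interpolation, and with the start point always kept) are assumptions made for the example.

```python
import math


def pen_down_points(samples):
    """Keep the (x, y) part of every sample whose z flag says the pen is pressed.

    `samples` is assumed to be a list of (x, y, z) tuples, one per sampling period.
    """
    return [(x, y) for x, y, z in samples if z]


def resample_stroke(points, step):
    """Thin one stroke so that consecutive kept points are at least `step` apart.

    Assumption: a raw handwriting point is kept as soon as its distance from
    the previously kept point reaches `step`; no new points are interpolated.
    """
    if not points:
        return []
    kept = [points[0]]  # the start point of the stroke is always kept
    for point in points[1:]:
        if math.dist(kept[-1], point) >= step:
            kept.append(point)
    return kept
```

With a slow start and a fast finish, the densely sampled slow part is thinned while widely spaced points survive, which is the redundancy removal the text describes.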
Next, the preprocessing section 4 normalizes the position and size of the resampled data for one character. The XY coordinates of the handwriting points depend on which character entry frame 2-1 of the character entry sheet 2 the character is written in, and also on where within the frame it is written. Position normalization therefore transforms the coordinates, character by character, so that the centroid of the character lies at a fixed position. Likewise, because the size of the characters written in the character entry frame 2-1 varies from writer to writer, size normalization transforms the coordinates of each resampled point so that the character size becomes constant. This is done by making the average distance from the character's centroid to the resampled points constant.

The feature extraction circuit 5 then represents the preprocessed input character data in a form with a reduced amount of information, so that subsequent processing is easier.
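The position and size normalization described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `normalize` and the parameter `target_radius` are hypothetical, and the distance to the centroid is taken as Euclidean here because this passage does not fix the distance measure.

```python
import math


def normalize(points, target_radius=1.0):
    """Move the centroid to the origin and scale the mean centroid distance
    to `target_radius` (a hypothetical parameter for the fixed size)."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    mean_r = sum(math.hypot(x, y) for x, y in centered) / n
    if mean_r == 0:  # all points coincide; nothing to scale
        return centered
    scale = target_radius / mean_r
    return [(x * scale, y * scale) for x, y in centered]


# The same shape written small and written large (and elsewhere on the sheet)
small = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
large = [(10.0 + 5.0 * x, 20.0 + 5.0 * y) for x, y in small]
same = all(
    math.isclose(a, b, abs_tol=1e-9)
    for p, q in zip(normalize(small), normalize(large))
    for a, b in zip(p, q)
)
```

After normalization the two inputs coincide, so any information about the written size is no longer available to later stages.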
For example, an input character $\mathbf{I}$ consisting of $M$ strokes is represented in stroke writing order as

$$\mathbf{I} = (I_1, I_2, \ldots, I_M)$$

where $I_m$ is the $m$-th stroke written.
Each stroke $I_1$ to $I_M$ is in turn represented by a sequence of $N+1$ polyline approximation points that divide the line of the stroke, from its start point (the first handwriting point) to its end point (the last handwriting point), into $N$ equal parts.
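The division of a stroke into $N$ equal parts can be sketched as follows. The sketch assumes that "equal parts" means equal arc length along the polyline through the resampled points and that the approximation points are obtained by linear interpolation; the text does not spell out either detail, and the function name `equal_division_points` is hypothetical.

```python
import math


def equal_division_points(stroke, n_parts):
    """Return n_parts + 1 points that split the polyline `stroke` into
    n_parts pieces of equal arc length (assumed interpretation)."""
    if len(stroke) < 2:
        return [stroke[0]] * (n_parts + 1)
    seg = [math.dist(a, b) for a, b in zip(stroke, stroke[1:])]
    total = sum(seg)
    result = []
    i = 0          # index of the segment currently being walked
    walked = 0.0   # arc length from the stroke start to the start of segment i
    for k in range(n_parts + 1):
        target = total * k / n_parts
        while i < len(seg) - 1 and walked + seg[i] < target:
            walked += seg[i]
            i += 1
        t = 0.0 if seg[i] == 0 else (target - walked) / seg[i]
        t = min(max(t, 0.0), 1.0)  # guard against rounding at the stroke end
        (x0, y0), (x1, y1) = stroke[i], stroke[i + 1]
        result.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return result
```

Because every stroke is reduced to the same number of points, strokes of different lengths become directly comparable point by point.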
That is, the $m$-th stroke $I_m$ is expressed by the sequence of polyline approximation points $P_{m1}, P_{m2}, \ldots, P_{m,N+1}$ as

$$I_m = (P_{m1}, P_{m2}, \ldots, P_{m,N+1})$$

where each polyline approximation point is an XY coordinate pair, $P_{mn} = (x_{mn}, y_{mn})$.

The input character described in this way by the feature extraction circuit 5 is supplied to one input of the matching circuit 6. The other input receives standard patterns from the standard pattern memory 12. A standard pattern is prepared in advance for each character to be recognized: it is the average of that character as written by many writers, after the same preprocessing and feature extraction as applied to the input character. Let the standard pattern for character $\theta$ be

$$\mathbf{S}^{\theta} = (S^{\theta}_1, S^{\theta}_2, \ldots, S^{\theta}_M)$$

where $M$ is the number of strokes of character $\theta$ and $S^{\theta}_m$ is its $m$-th stroke,

$$S^{\theta}_m = (P^{\theta}_{m1}, P^{\theta}_{m2}, \ldots, P^{\theta}_{m,N+1}).$$

$P^{\theta}_{mn} = (x^{\theta}_{mn}, y^{\theta}_{mn})$ is the XY coordinate pair of the $n$-th of the polyline approximation points that divide the line of the $m$-th stroke into $N$ equal parts.

The matching circuit 6 calculates the distance $D(\theta)$ between the input character $\mathbf{I}$ and a standard pattern $\mathbf{S}^{\theta}$ whose stroke count equals the stroke count $M$ of $\mathbf{I}$:

$$D(\theta) = \sum_{m=1}^{M} d_S(S^{\theta}_m, I_m) \qquad (7)$$

Here $I_m$ is the $m$-th stroke of the input character $\mathbf{I}$, $S^{\theta}_m$ is the $m$-th stroke of the standard pattern $\mathbf{S}^{\theta}$, and $d_S(S^{\theta}_m, I_m)$ is the distance between the $m$-th strokes of the two patterns:

$$d_S(S^{\theta}_m, I_m) = \sum_{n=1}^{N+1} d_P(P^{\theta}_{mn}, P_{mn}) \qquad (8)$$

$N+1$ is the number of polyline approximation points per stroke, $P^{\theta}_{mn}$ and $P_{mn}$ are the $n$-th polyline approximation points of the $m$-th strokes of the two patterns, and $d_P(P^{\theta}_{mn}, P_{mn})$ is the distance between them:

$$d_P(P^{\theta}_{mn}, P_{mn}) = \sqrt{(x^{\theta}_{mn} - x_{mn})^2 + (y^{\theta}_{mn} - y_{mn})^2} \qquad (9)$$

Combining equations (7) to (9), the matching circuit 6 calculates the inter-pattern distance $D(\theta)$ as

$$D(\theta) = \sum_{m=1}^{M} \sum_{n=1}^{N+1} \sqrt{(x^{\theta}_{mn} - x_{mn})^2 + (y^{\theta}_{mn} - y_{mn})^2} \qquad (10)$$

where $x^{\theta}_{mn}$ and $x_{mn}$ are the X coordinates of the $n$-th polyline approximation point of the $m$-th stroke, and $y^{\theta}_{mn}$ and $y_{mn}$ are the corresponding Y coordinates.

If there are $L$ standard patterns whose stroke count equals the stroke count $M$ of the input character $\mathbf{I}$, the matching circuit 6 calculates the inter-pattern distance $D(\theta)$ to the input character for each of these $L$ standard patterns in turn and supplies the results to the minimum value selection circuit 7.

The stroke count of the input character $\mathbf{I}$ is obtained as follows. The UP/DOWN detection circuit 10 detects pen UP and pen DOWN of the input pen 3 from the Z-axis information supplied by the tablet 1, and the stroke count detection circuit 11 counts the UP-to-DOWN transitions over one character. The output of the stroke count detection circuit 11 controls the standard pattern memory 12 so that the standard patterns $\mathbf{S}^{\theta}$ with stroke count $M$ are selected and supplied to the matching circuit 6.

The minimum value selection circuit 7 detects the minimum among the $L$ inter-pattern distances $D(\theta_1)$ to $D(\theta_L)$ supplied in sequence. If the detected minimum is $D(\theta_l)$, the input character $\mathbf{I}$ is recognized as the character whose standard pattern is $\mathbf{S}^{\theta_l}$; the character code corresponding to that standard pattern is fetched from the standard pattern memory 12 and output to the code conversion circuit 8.

The problem with this conventional online handwritten character recognition method lies in how uppercase and lowercase characters are entered and recognized, for example the uppercase 「キ」 and the lowercase 「ヤ」, 「ユ」, 「ヨ」 in the kana 「キヤ」, 「キユ」, 「キヨ」.
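The matching of equation (10), the stroke counting from pen UP/DOWN transitions, and the minimum selection can be sketched as follows. The function names, the list-of-strokes data layout, and the integer character codes used in the usage lines are placeholders chosen for the illustration; none of them come from the patent.

```python
import math


def count_strokes(pen_down_flags):
    """Count UP-to-DOWN transitions; the pen is assumed UP before the first sample."""
    strokes = 0
    previous = False
    for down in pen_down_flags:
        if down and not previous:
            strokes += 1
        previous = bool(down)
    return strokes


def pattern_distance(standard, observed):
    """Equation (10): sum, over strokes and polyline points, of the Euclidean
    distance between corresponding points of the two patterns."""
    return sum(
        math.dist(p_std, p_in)
        for s_std, s_in in zip(standard, observed)
        for p_std, p_in in zip(s_std, s_in)
    )


def recognize(observed, standard_patterns):
    """Return the code of the nearest standard pattern with the same stroke count.

    `standard_patterns` is a list of (character_code, strokes) pairs; None is
    returned when no pattern has the observed stroke count (an assumption, the
    text does not discuss this case).
    """
    m = len(observed)
    candidates = [(code, s) for code, s in standard_patterns if len(s) == m]
    if not candidates:
        return None
    best_code, _ = min(candidates, key=lambda c: pattern_distance(c[1], observed))
    return best_code


# Toy standard patterns with placeholder codes 1, 2, 3 (N = 2, so 3 points per stroke)
patterns = [
    (1, [[(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]]),             # horizontal bar
    (2, [[(0.0, -1.0), (0.0, 0.0), (0.0, 1.0)]]),             # vertical bar
    (3, [[(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)],
         [(0.0, -1.0), (0.0, 0.0), (0.0, 1.0)]]),             # two-stroke cross
]
written = [[(-0.9, 0.1), (0.0, 0.0), (1.1, 0.0)]]
result = recognize(written, patterns)
```

Restricting the candidates by stroke count before computing distances mirrors the role the text gives to the stroke count detection circuit 11.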
This is because such lowercase characters have exactly the same shape as the uppercase characters and differ only in size. In the prior art, differences in the size of the input character are removed by normalization in the preprocessing section, so the recognition section sees exactly the same character and cannot determine whether it is uppercase or lowercase.

For example, when the kana 「ツ」 is written to fill the character entry frame 2-1, as in (a) of FIG. 2, and when it is written small in a corner of the character entry frame 2-1, as in (c), the preprocessing results, shown in (b) and (d), are exactly the same pattern centered on the centroid $W_0$. Characters whose uppercase and lowercase patterns have the same shape exist not only in Japanese katakana and hiragana but also in the Latin alphabet.

To deal with this, the conventional approach has been for the writer to supply, through the tablet 1 at writing time, the information on whether a character is uppercase or lowercase. For example, as shown in FIG. 1, the tablet 1 is provided with an uppercase indication frame 1-1 and a lowercase indication frame 1-2. Pressing one of the frames with the input pen 3 declares that the characters entered from then on are uppercase, or lowercase. In the recognition section, the position comparison circuit 13 uses the coordinates of the indication frames 1-1 and 1-2 (stored in the indication frame position memory 14) as the reference for comparison, detects which indication frame the input pen 3 has pressed, or that neither has been pressed, and sets the uppercase/lowercase flag memory 15 accordingly.

Let the output F of the uppercase/lowercase flag memory 15 be F = 1 for uppercase and F = 0 for lowercase. Let the character code of the uppercase 「ツ」 be A5C4 and that of the lowercase 「ツ」 be A5C3, and suppose the uppercase code A5C4 is assigned to the standard pattern for 「ツ」. When 「ツ」 is written with the input pen 3, the minimum value selection circuit 7 outputs the character code A5C4.
This character code is supplied to the code conversion circuit 8. When the uppercase/lowercase flag memory 15 indicates F = 1 (uppercase), the code conversion circuit 8 outputs the character code A5C4 unchanged; when it indicates F = 0 (lowercase), the circuit outputs the corresponding lowercase character code A5C3. That is, the code conversion circuit 8 holds an internal table that maps each uppercase character code to its lowercase character code, and when F = 0 it uses this table to convert the uppercase code to the lowercase code before output.

However, this conventional method requires the writer to indicate uppercase or lowercase in addition to writing the characters, which not only burdens the writer but also lowers the input speed.

The object of the present invention is to solve these problems of the prior art and to provide an online handwritten character recognition device that reduces the burden on the writer, prevents a drop in character entry speed, and improves recognition efficiency for alphabetic characters.

To achieve this object, the present invention is characterized by comprising:
character size detection means for detecting the size of a written character; character size determination means for comparing the detected value with a set value and judging the character to be uppercase when the detected value is larger than the set value and lowercase when it is smaller; and code conversion means that receives as input the character code resulting from pattern recognition of the written character, performed independently of its size, together with the size determination result signal, and that outputs the character code unchanged when the determination is uppercase and applies a predetermined code conversion to the received character code when the determination is lowercase.

An embodiment of the present invention is described below with reference to FIG. 3. In FIG. 3, 4-1 is a resampling circuit, 4-2 is a position normalization circuit, 4-3 is a centroid extraction circuit, 4-4 is a size normalization circuit, 4-5 is an average radius extraction circuit, and 16 is a comparison circuit; the other elements are the same as in FIG. 1.

The handwriting information of an input character is fed from the tablet's output line 1-3 to the resampling circuit 4-1. As described above, this circuit removes the redundant handwriting points of each stroke and converts each stroke's point sequence from a sequence spaced uniformly in time to one spaced uniformly in distance. That is, the line from the start point to the end point of each stroke is resampled at fixed distance intervals.

The $m$-th stroke $I_m$ of the resampled input character $\mathbf{I}$ is expressed by the sequence of resampled points $Q_{m1}, Q_{m2}, \ldots, Q_{mE(m)}$ as

$$I_m = (Q_{m1}, Q_{m2}, \ldots, Q_{mE(m)})$$

where $Q_{me}$ is the $e$-th resampled point of the $m$-th stroke and $E(m)$ is the number of resampled points in the $m$-th stroke. Each resampled point is an XY coordinate pair, $Q_{me} = (X_{me}, Y_{me})$.

The resampled input character data is supplied to the centroid extraction circuit 4-3, which extracts the centroid $W_0$ of the input character. The coordinates of the centroid are the mean $X_0$ of the X coordinates and the mean $Y_0$ of the Y coordinates of all resampled points $Q_{me}$ of the character ($m = 1, \ldots, M$; $e = 1, \ldots, E(m)$):

$$W_0 = (X_0, Y_0)$$

Next, the position normalization circuit 4-2 transforms the coordinates of each resampled point $Q_{me}$ so that the centroid $W_0$ becomes the origin of a new XY coordinate system:

$$Q_{me} = (X_{me} - X_0,\; Y_{me} - Y_0)$$

That is, $X_0$ and $Y_0$ are subtracted from the coordinates $X_{me}$, $Y_{me}$ of each resampled point. Writing $x_{me} = X_{me} - X_0$ and $y_{me} = Y_{me} - Y_0$, the centroid lies at $x = 0$, $y = 0$.

For the position-normalized input character data, the average radius extraction circuit 4-5 then obtains the size of the input character, here its average radius $R$:

$$R = \frac{1}{U} \sum_{m=1}^{M} \sum_{e=1}^{E(m)} \left( |x_{me}| + |y_{me}| \right)$$

where $U = \sum_{m=1}^{M} E(m)$ is the number of resampled points of the input character, $M$ is its stroke count, and $|x_{me}|$, $|y_{me}|$ are the absolute values of the X and Y coordinates of the $e$-th resampled point of the $m$-th stroke with the centroid $W_0$ as origin. In other words, $R$ is the average distance of the resampled points $Q_{me}$ from the centroid $W_0$, with the distance measured as the sum of the absolute X and Y offsets. The average radius $R$ grows as the character is written larger, so it is a parameter that represents the size of the input character.

The size normalization circuit 4-4 transforms the coordinates of each resampled point $Q_{me}$ so that the average radius $R$ becomes the set value $R_0$. If $x'_{me}$, $y'_{me}$ denote the coordinates of $Q_{me}$ after this transformation, size normalization is

$$x'_{me} = \frac{R_0}{R}\, x_{me}, \qquad y'_{me} = \frac{R_0}{R}\, y_{me}$$

that is, each XY coordinate is normalized (divided) by the average radius $R$ of the input character.

The average radius $R$ is also applied to one input of the comparison circuit 16, where it is compared with the set value $R_{th}$ applied to the other input. The comparison circuit 16 thus determines whether the input character is larger than the set value. The determination result is supplied to the code conversion circuit 8 and controls its operation.

The output of the size normalization circuit 4-4 is supplied to the feature extraction circuit 5, where, as explained for the prior art, the input character is represented in information-compressed form. The matching circuit 6 then performs the matching calculation (calculation of the inter-pattern distance) against the standard patterns.

The standard pattern memory 12 stores uppercase characters only: for alphabetic characters the uppercase letters A, B, C, and so on, and for kana likewise the uppercase ア, イ, ウ, ..., ツ, and so on. For each character it holds the average of the patterns written by many writers and processed with the preprocessing and feature extraction described above, together with the character code of that character. The patterns are stored classified by stroke count.

Suppose that, as in panels (1) and (2) of FIG. 4, the uppercase letter "A" is written in a character entry frame 2-1 with two strokes, the first stroke $I_1$ "∧" and the second stroke $I_2$ "-". The handwriting information on output line 1-3 undergoes preprocessing and feature extraction, the matching circuit 6 performs the matching calculation (calculation of the inter-pattern distance) against the two-stroke standard patterns in the standard pattern memory, and the results are supplied in turn to the minimum value selection circuit 7. The minimum value selection circuit 7 detects the minimum of the inter-pattern distance $D(\theta)$ and supplies the character code of the corresponding standard pattern to the code conversion circuit 8.

Assume that the standard patterns and the character codes of their (uppercase) characters are set as in the left part of Table 1, and that the character codes of the corresponding lowercase characters are set as in the right part of Table 1.
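The average radius of the embodiment, the size normalization to $R_0$, and the comparison against $R_{th}$ can be sketched as follows. The function names are hypothetical, and treating a value equal to the threshold as uppercase is an assumption: the claims speak only of "larger" and "smaller", while the closing summary speaks of "a certain value or more".

```python
def average_radius(strokes):
    """R = (1/U) * sum(|x| + |y|) over all resampled points, measured from
    the centroid W0 = (X0, Y0) of the character."""
    points = [p for stroke in strokes for p in stroke]
    u = len(points)
    x0 = sum(x for x, _ in points) / u
    y0 = sum(y for _, y in points) / u
    return sum(abs(x - x0) + abs(y - y0) for x, y in points) / u


def size_normalize(strokes, r0):
    """Translate the centroid to the origin and scale so that R becomes r0."""
    points = [p for stroke in strokes for p in stroke]
    u = len(points)
    x0 = sum(x for x, _ in points) / u
    y0 = sum(y for _, y in points) / u
    scale = r0 / average_radius(strokes)
    return [[((x - x0) * scale, (y - y0) * scale) for x, y in stroke]
            for stroke in strokes]


def is_uppercase(strokes, r_th):
    """Size decision of the comparison circuit (ties counted as uppercase here)."""
    return average_radius(strokes) >= r_th


big = [[(0.0, 0.0), (2.0, 0.0)], [(2.0, 2.0), (0.0, 2.0)]]
tiny = [[(x * 0.25, y * 0.25) for x, y in stroke] for stroke in big]
```

Note that the size decision uses $R$ as measured before size normalization; after normalization both inputs have the same average radius, which is why the decision has to be taken in parallel with the matching path.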

[Table 1]

When the inter-pattern distance for the uppercase letter "A" is the minimum, the minimum value selection circuit 7 supplies the character code A3C1 assigned to the uppercase "A" to the code conversion circuit 8. For the uppercase "A" written as in panels (1) and (2) of FIG. 4, the comparison circuit 16 compares the average radii $R_{(1)}$ and $R_{(2)}$ extracted by the average radius extraction circuit 4-5 with the set value $R_{th}$. Suppose the result is

$$R_{th} < R_{(1)} \quad \text{(panel (1))}, \qquad R_{th} > R_{(2)} \quad \text{(panel (2))}.$$

For panel (1), the code conversion circuit 8 outputs the input character code A3C1 (the code for "A") unchanged. For panel (2), the "A" is judged to have been written small, so the circuit starts from the input character code A3C1 (the code for "A"), looks up the uppercase-to-lowercase character code correspondence table of Table 1 (built into the code conversion circuit 8), and outputs the character code A3E1 of the corresponding lowercase letter "a".

The same applies when the katakana 「ツ」 is written as in panels (3) and (4) of FIG. 4. These cases are summarized in Table 2.
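The code conversion step can be sketched as a table lookup controlled by the size decision. Only the two code pairs quoted in the text (A3C1 to A3E1 for "A", A5C4 to A5C3 for 「ツ」) are included; the remaining entries of Table 1 are not reproduced here. The function name `convert_code` is hypothetical, and passing a code through unchanged when it has no lowercase entry is an assumption, since the text does not cover that case.

```python
# Uppercase-to-lowercase code table; only the pairs quoted in the text.
UPPER_TO_LOWER = {
    0xA3C1: 0xA3E1,  # "A" -> "a"
    0xA5C4: 0xA5C3,  # full-size katakana tsu -> small katakana tsu
}


def convert_code(code, uppercase):
    """Output the recognized (uppercase) code as is, or its lowercase counterpart.

    `uppercase` is the result of the size comparison; codes without a table
    entry are passed through unchanged (assumed behavior).
    """
    if uppercase:
        return code
    return UPPER_TO_LOWER.get(code, code)


panel_1 = convert_code(0xA3C1, uppercase=True)    # "A" written large
panel_2 = convert_code(0xA3C1, uppercase=False)   # "A" written small
```

Keeping only uppercase patterns in the recognizer and deriving lowercase codes afterwards is what lets a single standard pattern serve both sizes.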

[Table 2]
(Note) "Large" means written to fill the character entry frame; "small" means written small relative to the character entry frame.

In the embodiment above, position normalization places the centroid of the input character at the origin, the size of the input character is taken to be the average distance between the centroid and the resampled points (the average radius), and size normalization is based on this average radius. The present invention is not limited to this configuration. Alternatively, position normalization may place the center of the rectangle circumscribing the input character at the origin, the size of the input character may be taken to be the diagonal length of that circumscribing rectangle, and size normalization may be performed so that the diagonal length becomes constant.

As explained above, according to the present invention the size of the input character is detected, the character is judged to be uppercase when the detected value is equal to or larger than a certain value and lowercase when it is smaller, and the character code corresponding to the character so determined is output. This makes it unnecessary to indicate uppercase or lowercase by pressing the input pen inside the uppercase or lowercase indication frame, as was conventionally required, so the burden on the writer is removed and a drop in character input speed is prevented. In addition, because lowercase alphabetic letters, which consist mainly of curves, are excluded from the recognition targets, the invention also prevents the drop in recognition rate that used to occur because such curved lowercase letters are deformed in many different ways from writer to writer.
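The circumscribing-rectangle variant mentioned above can be sketched as follows. The function name `bounding_box_normalize` and the parameter `target_diagonal` are hypothetical, and the rectangle is assumed to be axis-aligned, which the text does not state explicitly.

```python
import math


def bounding_box_normalize(strokes, target_diagonal=1.0):
    """Center the character on its circumscribing rectangle and scale the
    rectangle's diagonal to `target_diagonal`.

    Returns (normalized_strokes, diagonal); the diagonal measured before
    scaling plays the role of the character size.
    """
    points = [p for stroke in strokes for p in stroke]
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2
    diagonal = math.hypot(max(xs) - min(xs), max(ys) - min(ys))
    scale = target_diagonal / diagonal if diagonal else 1.0
    normalized = [[((x - cx) * scale, (y - cy) * scale) for x, y in stroke]
                  for stroke in strokes]
    return normalized, diagonal


strokes = [[(0.0, 0.0), (3.0, 0.0)], [(3.0, 4.0), (0.0, 4.0)]]
normalized, diagonal = bounding_box_normalize(strokes, target_diagonal=10.0)
```

As with the average radius, the returned diagonal would be compared with a set value to decide between uppercase and lowercase before the size information is normalized away.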

[Brief explanation of drawings]

FIG. 1 is a block diagram of a conventional example, FIG. 2 is a diagram showing input characters and their preprocessing results, FIG. 3 is a block diagram of an embodiment of the present invention, and FIG. 4 is a diagram showing examples of characters written with the present invention.

Explanation of reference numerals: 1: tablet; 2: character entry sheet; 2-1: character entry frame; 3: input pen; 4: preprocessing section; 4-4: size normalization circuit; 4-5: average radius extraction circuit; 5: feature extraction circuit; 7: minimum value selection circuit; 8: code conversion circuit; 12: standard pattern memory; 16: comparison circuit.

Claims

[Scope of Claims]

1. An online handwritten character recognition device in which characters are written on a tablet with an input pen and the written characters are recognized on the basis of handwriting information of the written characters output from the tablet, the device comprising: character size detection means for detecting the size of a written character; character size determination means for comparing the detected value with a set value and judging the character to be uppercase when the detected value is larger than the set value and lowercase when it is smaller; and code conversion means that receives as input the character code corresponding to the result of pattern recognition of the written character and the size determination result signal, outputs the character code corresponding to the written character unchanged when the size determination result is uppercase, and, when the size determination result is lowercase, performs a predetermined code conversion based on the received character code and outputs the converted code.

2. The device according to claim 1, wherein the character size detection means takes as the size of the written character the average distance between the centroid of the written character and each of the resampled points, the centroid being obtained from the resampled points extracted at fixed distance intervals from the line of each stroke of the written character.

3. The device according to claim 1, wherein the character size detection means takes as the size of the written character the diagonal length of the rectangle circumscribing the written character.