JPH04149665A

JPH04149665A - Method and device for processing character

Info

Publication number: JPH04149665A
Application number: JP2271128A
Authority: JP
Inventors: Yukie Kinugawa (衣川 幸恵); Junichi Kubota (久保田 淳市)
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1990-10-09
Filing date: 1990-10-09
Publication date: 1992-05-22

Abstract

(57) [Summary] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed description of the invention]

[Industrial field of application]
The present invention relates to a character processing method and device intended for document processing such as document creation and management.

[Prior art]
In Japanese text, katakana is used to write loanwords. However, there is no fixed way of writing a loanword in katakana, and several notations often coexist within a single text. This is said to harm the consistency of the text and impair its readability. To address this, devices that automatically detect variations in katakana notation have been devised in recent years (Japanese Unexamined Patent Publication No. 62-290965).

Fig. 2 is a block diagram of a conventional katakana notation variation detection device. 21 is a text storage unit, which stores the input text; an IC memory, a magnetic disk device, or the like is used for it. 22 is a katakana string extraction unit, which extracts katakana strings from the text stored in the text storage unit 21. 23 is a katakana string storage unit, which stores the katakana strings extracted by the katakana string extraction unit 22 together with their positions in the text stored in the text storage unit 21; like the text storage unit 21, it uses an IC memory, a magnetic disk device, or the like. 24 is a katakana string transformation unit, which transforms each katakana string stored in the katakana string storage unit 23 by deleting or replacing katakana characters or partial katakana strings. 25 is a transformation result storage unit, which stores the results produced by the katakana string transformation unit 24 in association with the katakana strings stored in the katakana string storage unit 23. 26 is a transformation result comparison unit, which detects groups of katakana strings whose transformation results stored in the transformation result storage unit 25 are identical. 27 is a katakana string comparison unit, which compares the pre-transformation katakana strings stored in the katakana string storage unit 23 within each such group and thereby detects groups whose transformation results match but whose original katakana strings differ. 28 is a variation display unit, which displays the groups of katakana strings detected by the katakana string comparison unit 27; a CRT display, a liquid crystal display, or the like is used for it. The detected katakana strings can be shown, for example, in reverse video or in color. Other components exist, but they are omitted because they are not needed for comparison with the present invention.

The conventional katakana notation variation detection device configured as described above detects variations in katakana notation in documents whose creation has already been completed.

[Problem to be solved by the invention]
Because the conventional katakana notation variation detection device works only on completed documents, detecting variations required running the detection process as a separate step after document creation had finished.

In view of this problem of the conventional device, inventions 1 and 2 aim to provide a character processing device that can efficiently produce documents free of katakana notation variations: during document creation, at the stage where a character string is input, any katakana notation in the input that differs from a preset katakana notation is unified to the preset notation.
Inventions 3 and 4 aim to provide a character processing device that can unify notation variations that arise in katakana notations already used in the document, even when no katakana strings have been set in advance.

Further, inventions 5 and 6 aim to provide a character processing device in which, when a katakana notation variation occurs, the operator can select which notation to unify to: the preset katakana notation or the katakana notation already used in the document, or the katakana notation currently being input.

[Means for solving the problem]
The first invention is a character processing device comprising: an input character string temporary storage unit that temporarily stores an input character string; a katakana character string storage unit that stores katakana character strings; a variant string creation unit that creates, for the katakana character strings stored in the katakana character string storage unit, katakana character strings recognized as notation variants of them; a variant string temporary storage unit that temporarily stores the variant strings created by the variant string creation unit in association with the katakana character strings stored in the katakana character string storage unit; a character string determination unit that compares the character string temporarily stored in the input character string temporary storage unit with the variant strings temporarily stored in the variant string temporary storage unit and determines whether any part matches; and a notation unification unit that, when the character string determination unit finds a match, replaces the matching part of the input character string with the katakana character string that the variant string temporary storage unit holds in association with the matched variant string.

The second invention is a character processing method comprising: an input character string temporary storage step of temporarily storing an input character string; a variant string creation step of creating, for a katakana character string, katakana character strings recognized as notation variants of it; a variant string temporary storage step of temporarily storing the variant strings created in the variant string creation step in association with the katakana character strings stored in the katakana character string storage step; a character string determination step of comparing the character string temporarily stored in the input character string temporary storage step with the variant strings temporarily stored in the variant string temporary storage step and determining whether any part matches; and a notation unification step of, when the character string determination step finds a match, replacing the matching part of the input character string with the katakana character string temporarily stored in association with the matched variant string.

The third invention is a character processing device to which the following are added: a kana-kanji conversion unit that converts the character string input through the input unit into a character string of mixed kanji and kana; a converted character string storage unit that stores the mixed kanji and kana character string produced by the kana-kanji conversion unit; and a katakana character string extraction unit that extracts katakana character strings from the character string stored in the converted character string storage unit and stores them in the katakana character string storage unit.
The fourth invention is a character processing method to which the following are added: a kana-kanji conversion step of converting the character string input in the input step into a character string of mixed kanji and kana; a converted character string storage step of storing the mixed kanji and kana character string produced in the kana-kanji conversion step; and a katakana character string extraction step of extracting katakana character strings from the character string stored in the converted character string storage step and storing them in the katakana character string storage unit.

The fifth invention is a character processing device to which the following are added: a kana-kanji conversion unit that converts the character string input through the input unit into a character string of mixed kanji and kana; a converted character string storage unit that stores the mixed kanji and kana character string produced by the kana-kanji conversion unit; a katakana character string extraction unit that extracts katakana character strings from the character string stored in the converted character string storage unit and stores them in the katakana character string storage unit; a unified notation instruction unit that, when the character string comparison unit finds a matching part, has the operator select whether to unify to the katakana character string stored in association with the variant string or to the variant string; and an extended notation unification unit that, when the unified notation instruction unit decides on the katakana character string stored in association with the variant string, replaces the part of the input character string that matched the variant string with that stored character string, and, when it decides on the variant string, replaces every part of the character string held in the converted character string storage unit that matches the katakana character string stored in association with the variant string with the katakana form of the variant string.

The sixth invention is a character processing method to which the following are added: a kana-kanji conversion step of converting the character string input in the input step into a character string of mixed kanji and kana; a converted character string storage step of storing the mixed kanji and kana character string produced in the kana-kanji conversion step; a katakana character string extraction step of extracting katakana character strings from the character string stored in the converted character string storage step and storing them in the katakana character string storage unit; a unified notation instruction step of, when the character string comparison step finds a matching part, having the operator select whether to unify to the katakana character string stored in association with the variant string or to the variant string; and an extended notation unification step of, when the unified notation instruction step decides on the katakana character string stored in association with the variant string, replacing the part of the input character string that matched the variant string with that stored character string, and, when it decides on the variant string, replacing every part of the character string stored in the converted character string storage step that matches the katakana character string stored in association with the variant string with the katakana form of the variant string.

[Operation]
With the configuration described above, the first and second inventions operate as follows. The variant string creation unit creates, for the katakana character strings stored in the katakana character string storage unit, katakana character strings recognized as notation variants of them. The variant string temporary storage unit temporarily stores the variant strings created by the variant string creation unit in association with the katakana character strings stored in the katakana character string storage unit. The character string determination unit compares the character string held in the input character string temporary storage unit with the variant strings held in the variant string temporary storage unit. When a part matches, the notation unification unit replaces the part of the temporarily stored input character string that matched the variant string with the katakana character string stored in the katakana character string storage unit.

In the third and fourth inventions, the converted character string storage unit stores the mixed kanji and kana character string produced by the kana-kanji conversion unit. The katakana character string extraction unit extracts katakana character strings from the stored character string and stores them in the katakana character string storage unit. Notation variations are then unified in the same way as in the first and second inventions.

In the fifth and sixth inventions, the unified notation instruction unit has the operator select whether to unify to the katakana character string stored in the katakana character string storage unit or to the variant string, after which the extended notation unification unit unifies the entire document to the katakana character string decided on through the unified notation instruction unit.
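The operator-driven unification described for the fifth and sixth inventions can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `resolve_variation`, the `choose` callback standing in for the operator prompt, and the use of plain substring replacement directly on katakana text are all hypothetical choices made for the example.

```python
from typing import Callable, Tuple


def resolve_variation(
    input_text: str,
    document: str,
    registered: str,
    variant: str,
    choose: Callable[[str, str], str],
) -> Tuple[str, str, str]:
    """Unify one registered/variant pair according to the operator's choice.

    input_text : text still in the input buffer (hypothetical stand-in for
                 the input character string temporary storage unit)
    document   : text already converted and stored
    registered : notation held in the katakana character string storage unit
    variant    : variant notation that was found in the input
    choose     : callback that returns either `registered` or `variant`
    """
    chosen = choose(registered, variant)
    if chosen == registered:
        # Operator keeps the registered notation: fix the new input only.
        input_text = input_text.replace(variant, registered)
    else:
        # Operator prefers the variant: rewrite the stored document instead.
        document = document.replace(registered, variant)
    return input_text, document, chosen
```

Passing the prompt as a callback keeps the sketch testable without a user interface; a real device would display both notations and wait for the operator.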
[Embodiment]
An embodiment of the present invention is described below with reference to the drawings. Fig. 1 is a block diagram of a character processing device according to one embodiment of the present invention.

In Fig. 1, 101 is an input character string temporary storage unit, which temporarily stores the hiragana character string input from a keyboard or the like. 102 is a kana-kanji conversion unit, which converts the hiragana character string held in the input character string temporary storage unit 101 into a character string of mixed kanji and kana. 103 is a converted character string storage unit, which stores the result of conversion by the kana-kanji conversion unit 102. 104 is a katakana character string storage unit, which stores katakana character strings. 105 is a katakana character string extraction unit, which extracts each run of katakana characters from the mixed kanji and kana character string stored in the converted character string storage unit 103 and adds it to the katakana character string storage unit 104, skipping duplicates. 106 is a variant string creation unit, which creates, for the katakana character strings stored in the katakana character string storage unit 104, katakana character strings recognized as notation variants of them. It holds groups of strings that are prone to variation as transformation rules, and creates a variant by replacing any part of a katakana character string that matches a rule. 107 is a variant string temporary storage unit, which temporarily stores the variant strings created by the variant string creation unit 106 in association with the katakana character strings stored in the katakana character string storage unit 104. 108 is a character string determination unit, which determines whether the hiragana character string input to the input character string temporary storage unit 101 contains anything that matches a variant string held in the variant string temporary storage unit 107. 109 is a unified notation instruction unit, which, when the character string determination unit 108 finds a match, displays the notation stored in the katakana character string storage unit 104 and the variant string and has the operator select which one to unify to. 110 is a notation unification unit, which unifies the entire document to the katakana character string selected by the operator through the unified notation instruction unit 109. In addition, when the document is unified to the variant string, the variant string held in the variant string temporary storage unit 107 is deleted.

The operation of the character processing device of this embodiment, configured as described above, is explained below using an example set of transformation rules for the variant string creation unit 106.
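As a reading aid for the walkthrough, the storage units of Fig. 1 can be pictured as one small state object. The sketch below is only an illustration under stated assumptions: the class name `ProcessorState`, its field names, and the regular expression used to find katakana runs are hypothetical, and the comments map each field to the reference numeral it loosely stands for.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List

# Assumption: a "katakana string" is a maximal run of katakana letters,
# including the long-vowel mark.
KATAKANA_RUN = re.compile(r"[ァ-ヶー]+")


@dataclass
class ProcessorState:
    input_buffer: str = ""                                     # 101
    converted_text: str = ""                                   # 103
    registered: List[str] = field(default_factory=list)        # 104
    variant_table: Dict[str, int] = field(default_factory=dict)  # 107

    def store_converted(self, text: str) -> None:
        """Store converted text (103) and register its katakana runs
        in 104, skipping duplicates, as unit 105 is described to do."""
        self.converted_text += text
        for word in KATAKANA_RUN.findall(text):
            if word not in self.registered:
                self.registered.append(word)
```

Keeping the registered strings in a list preserves insertion order, so a string's position can double as the identification number used later in the walkthrough.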
The example transformation rules are the following groups of katakana strings that are prone to variation:

(ジェ, ゼ) (エー, エイ, ェイ) (ウェ, ウエ)

The variant string creation unit 106 searches a katakana character string for any string that appears in the transformation rules and converts it to another string in the same group. Assume that the katakana character string storage unit 104 holds "プロゼクト" as one of its katakana notations, and that the following text is then input.

[Example text]
プロジェクト運営において、この問題は大きなウエートを占める。ところが、別のテーマが占めるウェイトの方が大きいようだ。

First, the variant string creation unit 106 applies the transformation rule (ジェ, ゼ) to "プロゼクト", which is stored in the katakana character string storage unit 104, and creates the variant string "プロジェクト". The variant string temporary storage unit 107 converts "プロジェクト" to hiragana. The result stored in the variant string temporary storage unit 107 is then as follows, where the number in [ ] is the serial number of the variant string and the number in ( ) is the identification number of the corresponding entry in the katakana character string storage unit 104.

[1] ぷろじぇくと (1)

Next, the input character string temporary storage unit 101 temporarily stores the following hiragana character string and, once input is complete, issues a kana-kanji conversion instruction.

[Input character string]
ぷろじぇくとうんえいにおいて、このもんだいはおおきなうえーとをしめる。

The character string determination unit 108 compares the input character string with the character strings held in the variant string temporary storage unit 107.
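The variant creation and hiragana conversion steps above can be sketched as follows. This is a hedged illustration, not the device's actual logic: `RULE_GROUPS` is a reconstruction of the example rule groups and should be treated as an assumption, and the names `make_variants`, `to_hiragana`, and `build_table` are hypothetical. Each rule is applied once to the registered string, and the katakana-to-hiragana step relies on the fixed 0x60 offset between the two Unicode blocks.

```python
from typing import List, Tuple

# Assumed rule groups: strings in one group are treated as interchangeable.
RULE_GROUPS = [("ジェ", "ゼ"), ("エー", "エイ", "ェイ"), ("ウェ", "ウエ")]


def make_variants(word: str) -> List[str]:
    """Create variants by swapping each rule string found in `word`
    for every other member of the same group (one rule at a time)."""
    variants: List[str] = []
    for group in RULE_GROUPS:
        for old in group:
            if old not in word:
                continue
            for new in group:
                candidate = word.replace(old, new)
                if new != old and candidate not in variants:
                    variants.append(candidate)
    return variants


def to_hiragana(text: str) -> str:
    """Shift katakana letters to hiragana; other characters
    (including the long-vowel mark) are left unchanged."""
    return "".join(
        chr(ord(c) - 0x60) if "ァ" <= c <= "ヶ" else c for c in text
    )


def build_table(registered: List[str]) -> List[Tuple[str, int]]:
    """Pair each variant reading with the 1-based id of its source string."""
    return [
        (to_hiragana(variant), reg_id)
        for reg_id, word in enumerate(registered, start=1)
        for variant in make_variants(word)
    ]
```

With "プロゼクト" registered, `build_table` yields a single entry whose reading can then be searched for in the hiragana input by a plain substring test.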
In this case the input character string contains a part that matches "ぷろじぇくと", so the unified notation instruction unit 109 displays "プロゼクト" and "プロジェクト" and has the operator select one. If "プロゼクト" is selected, the "ぷろじぇくと" part of the input character string is replaced with "プロゼクト". If "プロジェクト" is selected, "ぷろじぇくと" is deleted from the variant string temporary storage unit. The remaining hiragana is then converted into a character string of mixed kanji and kana by the kana-kanji conversion unit 102, and the result is stored in the converted character string storage unit 103. When "プロゼクト" has been selected, the stored result is as follows.

[Stored converted character string]
プロゼクト運営において、この問題は大きなウエートを占める。

Next, the katakana character string extraction unit 105 extracts katakana character strings from the character string stored in the converted character string storage unit 103 and stores them in the katakana character string storage unit 104, and the variant string creation unit 106 creates variant strings. In this case "プロゼクト" and "ウエート" are extracted, and the katakana character string storage unit 104 holds the following.

(1) プロゼクト
(2) ウエート

The variant string creation unit 106 creates "プロジェクト" from "プロゼクト", and "ウェイト", "ウエイト", and "ウェート" from "ウエート". The variant string temporary storage unit 107 then holds the following.

[1] ぷろじぇくと
[2] うぇいと
[3] うえいと
[4] うぇーと

The rest of the text is then input as follows.

[Input character string]
ところが、べつのてーまがしめるうぇいとのほうがおおきいようだ。

Here "うぇいと" in the input character string matches entry [2] "うぇいと" in the variant string temporary storage unit. When the operator selects, through the unified notation instruction unit 109, to unify to the variant string "ウェイト", every "ウエート" in the character string stored in the converted character string storage unit 103 is replaced with "ウェイト", which unifies the katakana notation of the entire document.

As described above, according to this embodiment, katakana strings that are prone to variation are converted to katakana before kana-kanji conversion is performed. As a result, even katakana words that are not registered in the kana-kanji conversion dictionary can be converted into a character string of mixed kanji and kana without word boundaries being misjudged.

In this embodiment katakana character strings are represented as sequences of katakana or hiragana characters, but they may also be represented with alphabetic characters or by other means.
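The last step of the walkthrough, matching the second input against the stored variant readings and then rewriting the stored document, can be replayed with a short sketch. The table contents are copied from the listing above; the function names `find_hits`, `to_katakana`, and `unify_document` are hypothetical, and plain substring search stands in for whatever matching the device actually performs.

```python
from typing import Dict, List, Tuple

# Registered notations (identification number -> katakana string).
REGISTERED: Dict[int, str] = {1: "プロゼクト", 2: "ウエート"}

# Variant readings in serial order, each paired with its registered id.
VARIANT_TABLE: List[Tuple[str, int]] = [
    ("ぷろじぇくと", 1),
    ("うぇいと", 2),
    ("うえいと", 2),
    ("うぇーと", 2),
]


def find_hits(reading_input: str, table: List[Tuple[str, int]]) -> List[Tuple[int, str, int]]:
    """Return (serial number, reading, registered id) for every table
    entry whose reading occurs in the hiragana input."""
    return [
        (serial, reading, reg_id)
        for serial, (reading, reg_id) in enumerate(table, start=1)
        if reading in reading_input
    ]


def to_katakana(text: str) -> str:
    """Shift hiragana letters to katakana; other characters are unchanged."""
    return "".join(
        chr(ord(c) + 0x60) if "ぁ" <= c <= "ゖ" else c for c in text
    )


def unify_document(document: str, registered_word: str, chosen_word: str) -> str:
    """Replace every occurrence of the registered notation with the chosen one."""
    return document.replace(registered_word, chosen_word)
```

For the second input sentence, `find_hits` reports only entry [2]; converting its reading back with `to_katakana` gives the notation that replaces the registered one when the operator chooses the variant.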
[Effects of the invention]
In the character processing device and method of the first and second inventions, a katakana character string storage unit, a variant string creation unit, a variant string temporary storage unit, a character string determination unit, and a notation unification unit are provided. When a character string is input, it is compared with the variant strings, and if it contains a notation that differs from the katakana notation stored in the katakana character string storage unit, it is unified to the stored notation. Documents free of katakana notation variations can therefore be created efficiently.

In the character processing device and method of the third and fourth inventions, a kana-kanji conversion unit, a converted character string storage unit, and a katakana character string extraction unit are provided. Katakana character strings are extracted once kana-kanji conversion is confirmed, and variant strings are created for the extracted strings as well. As a result, even when no katakana character strings have been stored in the katakana character string storage unit in advance, variations that arise in the katakana notations used in the document can be unified, which is of great practical value.

Furthermore, in the character processing device and method of the fifth and sixth inventions, a unified notation instruction unit is provided so that, when a katakana notation variation occurs, the operator can select which notation to unify to: the preset katakana notation or the katakana notation already used in the document, or the katakana notation currently being input. The device can therefore adapt flexibly to the notation the operator wants, which is of great practical value.

[Brief explanation of drawings]

Fig. 1 is a block diagram of a character processing device according to one embodiment of the present invention. Fig. 2 is a block diagram of a conventional character processing device.

101 ... input character string temporary storage unit
102 ... kana-kanji conversion unit
103 ... converted character string storage unit
104 ... katakana character string storage unit
105 ... katakana character string extraction unit
106 ... variant string creation unit
107 ... variant string temporary storage unit
108 ... character string determination unit
109 ... unified notation instruction unit
110 ... extended notation unification unit

Agent: Patent attorney Akira Kokaji and two others

Claims

[Claims]

(1) A character processing device comprising: an input character string temporary storage unit that temporarily stores an input character string; a katakana character string storage unit that stores katakana character strings; a variant string creation unit that creates, for the katakana character strings stored in the katakana character string storage unit, katakana character strings recognized as notation variants of them; a variant string temporary storage unit that temporarily stores the variant strings created by the variant string creation unit in association with the katakana character strings stored in the katakana character string storage unit; a character string determination unit that compares the character string temporarily stored in the input character string temporary storage unit with the variant strings temporarily stored in the variant string temporary storage unit and determines whether any part matches; and a notation unification unit that, when the character string determination unit finds a match, replaces the matching part of the input character string with the katakana character string that the variant string temporary storage unit holds in association with the matched variant string.

(2) A character processing method comprising: an input character string temporary storage step of temporarily storing an input character string; a variant string creation step of creating, for a katakana character string, katakana character strings recognized as notation variants of it; a variant string temporary storage step of temporarily storing the variant strings created in the variant string creation step in association with the katakana character strings stored in the katakana character string storage step; a character string determination step of comparing the character string temporarily stored in the input character string temporary storage step with the variant strings temporarily stored in the variant string temporary storage step and determining whether any part matches; and a notation unification step of, when the character string determination step finds a match, replacing the matching part of the input character string with the katakana character string temporarily stored in association with the matched variant string.

(3) The character processing device according to claim 1, further comprising: a kana-kanji conversion unit that converts a character string entered at an input unit into a character string containing kanji and kana; a converted character string storage unit that stores the character string containing kanji and kana produced by the kana-kanji conversion unit; and a katakana character string extraction unit that extracts a katakana character string from the character strings stored in the converted character string storage unit and stores it in the katakana character string storage unit.

(4) The character processing method according to claim 2, further comprising: a kana-kanji conversion step of converting a character string entered in an input step into a character string containing kanji and kana; a converted character string storage step of storing the character string containing kanji and kana produced in the kana-kanji conversion step; and a katakana character string extraction step of extracting a katakana character string from the character strings stored in the converted character string storage step and storing it in a katakana character string storage unit.

(5) The character processing device according to claim 1, further comprising: a kana-kanji conversion unit that converts a character string entered at an input unit into a character string containing kanji and kana; a converted character string storage unit that stores the character string containing kanji and kana produced by the kana-kanji conversion unit; a katakana character string extraction unit that extracts a katakana character string from the character strings stored in the converted character string storage unit and stores it in the katakana character string storage unit; a unified notation instruction unit that, when the character string comparison unit determines that there is a matching part, lets the operator select whether to unify the notation to the katakana character string stored in association with the spelling-variant ("yure") character string or to the variant character string itself; and an extended notation unification unit that, when unification to the associated katakana character string is selected, replaces the part of the input character string that matches the variant character string with the katakana character string stored in association with that variant character string, and, when unification to the variant character string is selected, replaces each part of the character strings stored in the converted character string storage unit that matches the katakana character string stored in association with the variant character string with the katakana character string of the variant character string.

(6) The character processing method according to claim 2, further comprising: a kana-kanji conversion step of converting a character string entered in an input step into a character string containing kanji and kana; a converted character string storage step of storing the character string containing kanji and kana produced in the kana-kanji conversion step; a katakana character string extraction step of extracting a katakana character string from the character strings stored in the converted character string storage step and storing it in the katakana character string storage unit; a unified notation instruction step in which, when the character string comparison step determines that there is a matching part, the operator selects whether to unify the notation to the katakana character string stored in association with the spelling-variant ("yure") character string or to the variant character string itself; and an extended notation unification step of, when unification to the associated katakana character string is selected, replacing the part of the input character string that matches the variant character string with the katakana character string stored in association with that variant character string, and, when unification to the variant character string is selected, replacing each part of the character strings stored in the converted character string storage step that matches the katakana character string stored in association with the variant character string with the katakana character string of the variant character string.
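
For illustration only, the flow recited in claims (1)/(2) and the operator-selected direction of claims (5)/(6) might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the claims do not say how spelling variants are generated, so the rules in `VARIATION_RULES` are hypothetical, as are all function names, and matching on whole katakana runs (rather than arbitrary substrings) is an assumption made so that an already-unified word is not rewritten.

```python
import re

# Hypothetical variation rules (NOT given in the claims): (pattern, replacement).
VARIATION_RULES = [("ー", ""), ("ヴァ", "バ"), ("ティ", "チ")]

# Assumption: a "katakana character string" is a maximal run of katakana
# letters (U+30A1..U+30FA) and the long-vowel mark (U+30FC).
KATAKANA_RUN = re.compile("[\u30a1-\u30fa\u30fc]+")


def create_variants(katakana):
    """Variant creation: apply each rule once at each position where it fits."""
    variants = set()
    for old, new in VARIATION_RULES:
        start = katakana.find(old)
        while start != -1:
            variants.add(katakana[:start] + new + katakana[start + len(old):])
            start = katakana.find(old, start + 1)
    variants.discard(katakana)
    return variants


def build_variant_table(registered):
    """Variant temporary storage: map each variant to its registered string."""
    table = {}
    for word in registered:
        for variant in create_variants(word):
            if variant not in registered:
                table[variant] = word
    return table


def unify_input(text, table):
    """Claims (1)/(2): replace matched variants in the input with the registered string."""
    return KATAKANA_RUN.sub(lambda m: table.get(m.group(), m.group()), text)


def unify_extended(text, document, table, prefer_variant):
    """Claims (5)/(6): the operator chooses which notation wins.

    prefer_variant=False rewrites the input; prefer_variant=True instead
    rewrites the already-converted document to the variant that was typed.
    """
    if not prefer_variant:
        return unify_input(text, table), document
    matched = [run for run in KATAKANA_RUN.findall(text) if run in table]
    reverse = {table[run]: run for run in matched}
    document = KATAKANA_RUN.sub(
        lambda m: reverse.get(m.group(), m.group()), document
    )
    return text, document
```

With `table = build_variant_table({"コンピューター"})`, calling `unify_input("新しいコンピュータを買う", table)` rewrites the typed variant to the registered spelling, while `unify_extended(..., prefer_variant=True)` leaves the input alone and rewrites the stored document instead.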