JPH0863463A

JPH0863463A - Document reading device

Info

Publication number: JPH0863463A
Application number: JP6200444A
Authority: JP
Inventors: Isamu Iwai (岩井 勇); Kenichiro Kobayashi (小林 賢一郎)
Original assignee: Toshiba Corp; Toshiba AVE Co Ltd
Current assignee: Toshiba Corp; Toshiba AVE Co Ltd
Priority date: 1994-08-25
Filing date: 1994-08-25
Publication date: 1996-03-08

Abstract

(57) [Abstract] [Purpose] To always read a document aloud naturally, with readings suited to its context and situation. [Constitution] The analysis result buffer 7 stores the result of the Japanese-language analysis, performed by the Japanese analysis unit 4, of a designated document in the document data file 1. When the voice data generation unit 8 creates voice data from this analysis result, it assigns a special reading to any numeric character string located before or after a special character detected by the special character detection unit 17, following the corresponding rule in the special character processing rule table 21. Likewise, when the special pattern detection unit 19 detects a numeric character string matching a special pattern, the voice data generation unit 8 assigns a special reading to it, following the corresponding rule in the special pattern processing rule table 22. The resulting voice data is stored in the voice data file 10. The voice synthesizer 11 converts the voice data into a voice signal, which the voice output unit 13 outputs, so that numeric character strings such as telephone numbers and postal codes in the document are also read aloud in the appropriate manner.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a document reading device that reads a document aloud by performing Japanese-language analysis on the document to be read, generating voice data, and outputting the speech obtained by synthesizing that voice data.

[0002]

[Prior Art] In conventional speech reading devices, a string of digits found in the analysis result obtained by Japanese-language analysis of the target document data was converted to voice data using either a digit-by-digit reading (bō-yomi) or a place-value reading (keta-yomi), depending on a preset mode. As a concrete example, when the analysis result contains "03(123)4567", the voice data generation unit would normally generate the reading 「ぜろさ＾ん／か＾っこ／ひゃくに＾じゅうさん／か＾っこ／よんせ＾ん／ごひゃくろくじゅうな＾な」 (zero-san / kakko / hyaku-nijū-san / kakko / yonsen / gohyaku-rokujū-nana, that is, "zero three, parenthesis, one hundred twenty-three, parenthesis, four thousand five hundred sixty-seven"). It never generated the customary reading for a telephone number, 「ぜろさ＾んの／いちに＾いさんの／よんご−ろくな＾な」 (zero-san no / ichi-nii-san no / yon-gō-roku-nana).
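The two preset modes described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the romanized readings, and the 0 to 9999 range limit are assumptions made for illustration, and accent marks and phrase boundaries are ignored.

```python
# Hypothetical sketch of the two preset reading modes (romanized, no accent marks).
DIGITS = ["zero", "ichi", "ni", "san", "yon", "go", "roku", "nana", "hachi", "kyuu"]
TENS = ["", "juu", "nijuu", "sanjuu", "yonjuu", "gojuu",
        "rokujuu", "nanajuu", "hachijuu", "kyuujuu"]
HUNDREDS = ["", "hyaku", "nihyaku", "sanbyaku", "yonhyaku", "gohyaku",
            "roppyaku", "nanahyaku", "happyaku", "kyuuhyaku"]
THOUSANDS = ["", "sen", "nisen", "sanzen", "yonsen", "gosen",
             "rokusen", "nanasen", "hassen", "kyuusen"]


def read_digit_by_digit(digits: str) -> str:
    """Bo-yomi: read every digit separately."""
    return " ".join(DIGITS[int(c)] for c in digits)


def read_place_value(digits: str) -> str:
    """Keta-yomi: read the string as one number (this sketch covers 0 to 9999 only)."""
    n = int(digits)
    if not 0 <= n <= 9999:
        raise ValueError("this sketch only covers 0 to 9999")
    if n == 0:
        return DIGITS[0]
    parts = [
        THOUSANDS[n // 1000],
        HUNDREDS[n // 100 % 10],
        TENS[n // 10 % 10],
        DIGITS[n % 10] if n % 10 else "",
    ]
    return " ".join(p for p in parts if p)


print(read_place_value("123"))      # hyaku nijuu san
print(read_digit_by_digit("123"))   # ichi ni san
```

Neither mode on its own yields the customary telephone-number reading, which also replaces the parentheses with "no".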

[0003] Similarly, when the analysis result contains a digit string with a symbol that calls for a special reading, such as "10:13", which can be read in several ways, the voice data generation unit must decide from the neighboring words whether to read it as 「じゅった＾い／じゅ＾うさん」 (juttai jūsan, "ten to thirteen") or as 「じゅ＾うじ／＾じゅうさ＾んぷん」 (jūji jūsanpun, "ten thirteen" as a time of day). Here too, only one of the two readings, fixed by a preset setting, was available. Consequently, when a conventional document reading device read document data containing such examples, it did not necessarily use a natural reading suited to the context or situation. This sounded unnatural to the listener and in some cases caused the listener to misunderstand the meaning. In this description, digit strings that contain special symbols such as parentheses or ":" are also collectively referred to as numeric character strings.

[0004]

[Problems to Be Solved by the Invention] In conventional speech reading devices such as those described above, when the Japanese analysis result for the target document data is a numeric character string representing a telephone number or postal code, or a numeric character string containing a special symbol that indicates a time or a ratio, only the preset reading is used. The reading is therefore sometimes not the natural one for the context or situation, which sounds unnatural to the listener and in some cases causes the listener to misunderstand the meaning.

[0005] In view of the above, an object of the present invention is to provide a document reading device that can always read aloud, in a natural manner suited to the context and situation, numeric character strings in the target document data that are customarily given a special reading, and that allows the user to define those special readings freely.

[0006]

[Means for Solving the Problems] The invention of claim 1 is a document reading device that generates voice data, in accordance with voice data generation rules, from the analysis result obtained by Japanese-language analysis of the document data to be read aloud, converts the voice data into an electrical voice signal with a voice synthesizer, and outputs the resulting voice signal as speech through a voice output device. The device comprises: first storage means for storing a list of a plurality of special words; special word detection means for detecting, in the Japanese analysis result, a special word stored in the first storage means; extraction means for extracting a numeric character string located immediately before or after the special word detected by the special word detection means; second storage means for storing a list of the rules defined for each of the special words stored in the first storage means; voice data generation means for generating voice data from the numeric character string extracted by the extraction means by assigning it a reading that follows the rule stored in the second storage means for the special word detected by the detection means; and first registration means for registering special words in the first storage means and the rules in the second storage means.

[0007] The invention of claim 2 is a document reading device that generates voice data, in accordance with voice data generation rules, from the analysis result obtained by Japanese-language analysis of the document data to be read aloud, converts the voice data into an electrical voice signal with a voice synthesizer, and outputs the resulting voice signal as speech through a voice output device. The device comprises: third storage means for storing a list of a plurality of special numeric character string patterns; special pattern detection means for detecting, in the Japanese analysis result, a special numeric character string pattern stored in the third storage means; fourth storage means for storing a list of the rules defined for each of the special numeric character string patterns stored in the third storage means; voice data generation means for generating voice data from the numeric character string pattern detected by the special pattern detection means by assigning it a reading that follows the rule stored in the fourth storage means for that pattern; and second registration means for registering special numeric character string patterns in the third storage means and the rules in the fourth storage means.

[0008] The invention of claim 3 is a document reading device that generates voice data, in accordance with voice data generation rules, from the analysis result obtained by Japanese-language analysis of the document data to be read aloud, converts the voice data into an electrical voice signal with a voice synthesizer, and outputs the resulting voice signal as speech through a voice output device. The device comprises: first storage means for storing a list of a plurality of special words; special word detection means for detecting, in the Japanese analysis result, a special word stored in the first storage means; extraction means for extracting a numeric character string located immediately before or after the special word detected by the special word detection means; second storage means for storing a list of the rules defined for each of the special words stored in the first storage means; first voice data generation means for generating voice data from the numeric character string extracted by the extraction means by assigning it a reading that follows the rule stored in the second storage means for the special word detected by the detection means; first registration means for registering special words in the first storage means and the rules in the second storage means; third storage means for storing a list of a plurality of special numeric character string patterns; special pattern detection means for detecting, in the Japanese analysis result, a special numeric character string pattern stored in the third storage means; fourth storage means for storing a list of the rules defined for each of the special numeric character string patterns stored in the third storage means; voice data generation means for generating voice data from the numeric character string pattern detected by the special pattern detection means by assigning it a reading that follows the rule stored in the fourth storage means for that pattern; and second registration means for registering special numeric character string patterns in the third storage means and the rules in the fourth storage means.

[0009]

[Operation] In the document reading device of the invention of claim 1, the first storage means stores a list of a plurality of special words. The special word detection means detects, in the Japanese analysis result, a special word stored in the first storage means. The extraction means extracts a numeric character string located immediately before or after the special word detected by the special word detection means. The second storage means stores a list of the rules defined for each of the special words stored in the first storage means. The voice data generation means generates voice data from the numeric character string extracted by the extraction means by assigning it a reading that follows the rule stored in the second storage means for the special word detected by the detection means. The first registration means registers special words in the first storage means and the rules in the second storage means.

[0010] In the document reading device of the invention of claim 2, the third storage means stores a list of a plurality of special numeric character string patterns. The special pattern detection means detects, in the Japanese analysis result, a special numeric character string pattern stored in the third storage means. The fourth storage means stores a list of the rules defined for each of the special numeric character string patterns stored in the third storage means. The voice data generation means generates voice data from the numeric character string pattern detected by the special pattern detection means by assigning it a reading that follows the rule stored in the fourth storage means for that pattern. The second registration means registers special numeric character string patterns in the third storage means and the rules in the fourth storage means.

[0011] In the document reading device of the invention of claim 3, the first storage means stores a list of a plurality of special words. The special word detection means detects, in the Japanese analysis result, a special word stored in the first storage means. The extraction means extracts a numeric character string located immediately before or after the special word detected by the special word detection means. The second storage means stores a list of the rules defined for each of the special words stored in the first storage means. The first voice data generation means generates voice data from the numeric character string extracted by the extraction means by assigning it a reading that follows the rule stored in the second storage means for the special word detected by the detection means. The first registration means registers special words in the first storage means and the rules in the second storage means. The third storage means stores a list of a plurality of special numeric character string patterns. The special pattern detection means detects, in the Japanese analysis result, a special numeric character string pattern stored in the third storage means. The fourth storage means stores a list of the rules defined for each of the special numeric character string patterns stored in the third storage means. The voice data generation means generates voice data from the numeric character string pattern detected by the special pattern detection means by assigning it a reading that follows the rule stored in the fourth storage means for that pattern. The second registration means registers special numeric character string patterns in the third storage means and the rules in the fourth storage means.

[0012]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram showing an embodiment of the document reading device of the present invention. In the figure, 1 is a document data file that stores document data in a form a computer can handle; 2 is an input device for entering various settings used during reading; 3 is a control unit that performs overall control when document data is read aloud; 4 is a Japanese analysis unit that analyzes the document data to be read morphologically, syntactically, and semantically with reference to the word dictionary 6; 5 is a setting buffer that holds the various settings used during reading; 6 is a word dictionary that collects, as a list, the headwords, parts of speech, readings, accents, meanings, and other information needed to analyze document data; 7 is an analysis result buffer that holds the analysis result produced by the Japanese analysis unit 4; 8 is a voice data generation unit that generates voice data corresponding to the analysis result of the Japanese analysis unit 4; 9 is a voice data generation rule file that stores the voice data generation rules consulted when the voice data for the voice data file 10 is generated; 10 is a voice data file that holds the voice data generated by the voice data generation unit 8; 11 is a voice synthesizer that synthesizes a voice signal by rule from the voice data read out of the voice data file 10; 12 is a special character table that holds a list of special characters, such as "郵便番号" (postal code) or "比率" (ratio), for which the numeric character string before or after the characters is read in a special way; 13 is a voice output unit, such as a speaker, that outputs the voice signal; 14 is a display device, such as a CRT or LCD, that shows display data on a screen; 15 is a display data creation unit that creates, from the voice data, the display data shown on the display device 14; 16 is a display data file that holds the display data; 17 is a special character detection unit that detects, in the Japanese analysis result, special characters held in the special character table 12; 18 is a special pattern table that holds a list of patterns of special numeric character strings, such as a numeric character string representing a telephone number; 19 is a special pattern detection unit that detects, in the Japanese analysis result, special patterns held in the special pattern table 18; 20 is a special numeric character processing unit that, when voice data is processed for a special numeric character string, reads the rule information from the special character processing rule table 21 or the special pattern processing rule table 22 and passes it to the voice data generation unit 8; 21 is a special character processing rule table that holds a list of the voice data generation rules for the special characters; 22 is a special pattern processing rule table that holds a list of the voice data generation rules for the special numeric character string patterns; and 23 is a confirmation data generation unit that generates confirmation data with which the operator can check the special characters, special patterns, and special reading rules registered in the special character table 12 and the special pattern table 18.

[0013] Next, the operation of this embodiment is described. Before using the document reading device shown in FIG. 1, the operator first registers how numeric character strings that require a special reading are to be read. On receiving a registration instruction from the operator through the input device 2, the control unit 3 puts the device into registration mode and displays the special character registration screen shown in FIG. 2 on the display device 14. Looking at this screen, the operator enters "〒", the symbol for a postal code, as a special character from the input device 2, and the control unit 3 displays this "〒" on the screen of the display device 14 as shown in FIG. 2. The operator then enters, from the input device 2, a special reading rule stating that the "−" in the numeric character pattern following "〒" is read as "の" (no) and that the digits are read digit by digit. The control unit 3 then displays the special reading rule for this numeric character pattern on the screen of the display device 14, as shown in FIG. 2. Here, a special character is a word or symbol that restricts how the numeric character pattern before or after it is read, for example words such as "郵便番号" (postal code) or "電話" (telephone), or a symbol representing them.

[0014] At the same time as the display processing above, the control unit 3 registers the entered "〒" in the special character table 12 as shown in FIG. 13 and generates a connector named Rule A, as shown in FIG. 13, to link the entry to the special character processing rule table 21. The control unit 3 then registers the special reading rule that is entered next in the special character processing rule table 21 under the connector Rule A, as shown in FIG. 14.
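The link between the special character table 12 and the special character processing rule table 21 can be sketched in Python as two dictionaries joined by the connector name. This is only an illustration under stated assumptions: the dictionary layout, the function names, the romanized readings, and the sample postal code are hypothetical, and the sketch handles only ASCII digits that directly follow the special character.

```python
import re

DIGITS = ["zero", "ichi", "ni", "san", "yon", "go", "roku", "nana", "hachi", "kyuu"]

# Hypothetical stand-ins for table 12 (special character -> connector)
# and table 21 (connector -> special reading rule).
special_char_table = {}
special_char_rules = {}


def register_special_char(char, connector, symbol_readings):
    """Register a special character and its rule, linked by the connector."""
    special_char_table[char] = connector
    special_char_rules[connector] = {"digits": "digit_by_digit",
                                     "symbols": symbol_readings}


def read_after_special_char(text):
    """Return the special reading of the digits following a registered character."""
    for char, connector in special_char_table.items():
        match = re.search(re.escape(char) + r"\s*([0-9][0-9\-]*)", text)
        if match:
            rule = special_char_rules[connector]
            words = [DIGITS[int(c)] if c.isdigit() else rule["symbols"][c]
                     for c in match.group(1)]
            return " ".join(words)
    return None  # no registered special character found


# The registration from the text: "-" is read "no", digits are read one by one.
register_special_char("〒", "Rule A", {"-": "no"})
print(read_after_special_char("〒105-91"))  # ichi zero go no kyuu ichi
```

Keeping the rule in a separate table keyed by the connector lets several special characters share one rule, which is one plausible reason for the indirection.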

[0015] When registration of the special character is finished, the control unit 3 displays the special numeric pattern registration screen shown in FIG. 3 on the display device 14. Looking at this screen, the operator enters "数字（数字）数字" (digits (digits) digits) as a special numeric pattern from the input device 2, and the control unit 3 displays this pattern on the screen of the display device 14 as shown in FIG. 3. The operator then enters, from the input device 2, a special reading rule stating that "（" and "）" in this numeric pattern are read as "の" (no) and that the digits are read digit by digit. The control unit 3 then displays the special reading rule for this numeric character pattern on the screen of the display device 14, as shown in FIG. 3. Here, a special numeric pattern is a numeric character string that is read in a special way, such as a telephone number.

[0016] At the same time as the display processing above, the control unit 3 registers the entered "数字（数字）数字" in the special pattern table 18 as shown in FIG. 16 and generates a connector named Rule AP, as shown in FIG. 16, to link the entry to the special pattern processing rule table 22. The control unit 3 then registers the special reading rule that is entered next in the special pattern processing rule table 22 under the connector Rule AP, as shown in FIG. 17.
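One way to picture how a registered pattern such as "数字（数字）数字" could be matched and read is to turn it into a regular expression. The Python sketch below is an assumption-laden illustration, not the patent's method: the regex translation, the NFKC normalization of full-width characters, the table layout, and the function names are all hypothetical.

```python
import re
import unicodedata

DIGITS = ["zero", "ichi", "ni", "san", "yon", "go", "roku", "nana", "hachi", "kyuu"]
PLACEHOLDER = "数字"  # stands for "one or more digits" in a registered pattern

# Hypothetical stand-ins for table 18 (pattern -> connector)
# and table 22 (connector -> readings for the symbols in the pattern).
special_pattern_table = {}
special_pattern_rules = {}


def register_pattern(pattern, connector, symbol_readings):
    special_pattern_table[pattern] = connector
    special_pattern_rules[connector] = symbol_readings


def compile_pattern(pattern):
    """Turn a registered pattern into a regex over NFKC-normalized text."""
    normalized = unicodedata.normalize("NFKC", pattern)
    literal_parts = [re.escape(p) for p in normalized.split(PLACEHOLDER)]
    return re.compile("[0-9]+".join(literal_parts))


def read_special_pattern(text):
    """Return the special reading of the first registered pattern found in text."""
    normalized = unicodedata.normalize("NFKC", text)  # full-width -> ASCII
    for pattern, connector in special_pattern_table.items():
        match = compile_pattern(pattern).search(normalized)
        if match:
            symbols = special_pattern_rules[connector]
            words = [DIGITS[int(c)] if c.isdigit() else symbols[c]
                     for c in match.group(0)]
            return " ".join(words)
    return None


# The registration from the text: both parentheses are read "no".
register_pattern("数字（数字）数字", "Rule AP", {"(": "no", ")": "no"})
print(read_special_pattern("０３（１２３）４５６７"))
# zero san no ichi ni san no yon go roku nana
```

With this rule the sample string is read in the customary telephone-number style rather than as place-value numbers separated by "kakko".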

[0017] After the registration described above, when the control unit 3 receives from the input device 2 an instruction from the operator to check the registered rules, it starts the confirmation data generation unit 23. The confirmation data generation unit 23 refers to the special character table 12 and the special character processing rule table 21 for an entry the operator registered as described above, for example "〒" and its special reading rule, generates confirmation data such as that shown in FIG. 3, and sends it to the control unit 3. The control unit 3 processes this confirmation data in the same way as an ordinary document, as described later, and has it read aloud from the voice output unit 13, which lets the operator check the registered content. When creating the confirmation data, the confirmation data generation unit 23 automatically generates numbers such as the 123 and 456 shown in FIG. 3 by selecting suitable digits from numeric data it holds internally.
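As a rough illustration of how confirmation data might be produced, the Python sketch below fills each digit placeholder of a registered entry with sample numbers held internally. The sample list, the placeholder convention, and the function name are assumptions; the patent only states that suitable digits are selected from internal numeric data.

```python
from itertools import cycle

# Hypothetical internal numeric data from which sample digits are drawn.
SAMPLE_NUMBERS = ["123", "456", "7890"]


def make_confirmation_text(entry, placeholder="数字"):
    """Replace each digit placeholder in a registered entry with sample digits."""
    samples = cycle(SAMPLE_NUMBERS)
    pieces = entry.split(placeholder)
    text = pieces[0]
    for piece in pieces[1:]:
        text += next(samples) + piece
    return text


print(make_confirmation_text("数字（数字）数字"))  # 123（456）7890
print(make_confirmation_text("〒数字-数字"))       # 〒123-456
```

The generated string can then be passed through the normal reading path, so the operator hears exactly what a real document would produce.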

[0018] Next, the document reading operation of the document reading device shown in FIG. 1 is described with reference to the flowchart in FIG. 5. The document data stored in the document data file 1 consists of a plurality of documents stored in a form a computer can process. These documents were either created with a document creation device such as a Japanese word processor or read into the computer by OCR or similar means.

[0019] The document reading process starts when the control unit 3 receives an instruction from the input device 2, such as a keyboard. At this point, input by the operator from the input device 2 gives the control unit 3 the document to be read aloud and the conditions for reading it. The operator can make these settings by following the guide that the control unit 3 displays on the display device 14, such as a CRT or LCD.

[0020] In response to the operator's actions described above, in step 501 of FIG. 5 the control unit 3 reads the document designated through the input device 2 from the document data file 1, thereby selecting the document to be read, and passes the selected document to the Japanese analysis unit 4. Next, in step 502, the control unit 3 writes the reading conditions entered from the input device 2 into the setting buffer 5. Part of the document data designated by the operator's input from the input device 2 is, for example, as shown in FIG. 6. The operator can also set the reading conditions from the input device 2 by following the guide that the control unit 3 displays on the display device 14. The items that can be set include the basic reading speed, voice quality, pitch, and loudness; the time at which reading is to end; whether emphasized characters are given a special reading; what changes when emphasized characters are given a special reading; whether reading aloud is performed; the length of pauses; and whether the readings of special characters are substituted. These condition data are stored in the setting buffer 5 as shown in FIG. 7.
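The reading conditions listed above map naturally onto a small record type. The Python sketch below is one hypothetical layout for the contents of the setting buffer 5: the field names, types, and default values are assumptions and are not taken from FIG. 7.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ReadingSettings:
    """Hypothetical layout for the reading conditions held in the setting buffer."""
    speed: int = 5                            # basic reading speed
    voice_quality: str = "standard"           # voice quality
    pitch: int = 5                            # pitch
    loudness: int = 5                         # loudness
    end_after_minutes: Optional[int] = None   # time at which reading ends (None = no limit)
    emphasis_special_reading: bool = False    # special reading for emphasized characters
    emphasis_change: str = "slower"           # what changes for emphasized characters
    pause_length: int = 1                     # length of pauses
    swap_special_readings: bool = True        # substitute readings of special characters


# Writing the operator's choices into the buffer, here modeled as a plain dict.
setting_buffer = asdict(ReadingSettings(speed=7))
print(setting_buffer["speed"], setting_buffer["swap_special_readings"])  # 7 True
```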

【００２１】制御部３より指示を受けた日本語解析部４
は、ステップ５０３にて前記渡された文書に対して形態
的、構文的、意味的に解析を、単語の形態情報、読み情
報、アクセント情報、単語間の共起情報等を収めた単語
辞書６を参照しながら行なうことにより、文書を単語単
位に切り分け、単語毎に分割、解析された結果を解析結
果バッファ７に対して書き込んだ後、制御部３に対して
文書の解析が終了したことを示す信号を送る。On receiving an instruction from the control unit 3, the Japanese analysis unit 4 analyzes the passed document morphologically, syntactically, and semantically in step 503, referring to the word dictionary 6, which holds word morphological information, reading information, accent information, co-occurrence information between words, and the like. It thereby segments the document into words, writes the word-by-word analysis results into the analysis result buffer 7, and then sends the control unit 3 a signal indicating that document analysis is complete.

【００２２】ここで、前記日本語解析部４が用いる単語
辞書６は図８に示すようなデータ構造例を有しており、
又、日本語解析部４による解析結果が格納される解析結
果バッファ７は図９に示すようなデータ構造例を有して
いる。尚、上記した日本語解析部４が文書を解析して読
みを導く際に、一つの単語に対して複数の読みが存在す
る場合には、上記した単語辞書中に記述されている共起
情報を基づいて適切な読みを決定するものとする。The word dictionary 6 used by the Japanese analysis unit 4 has a data structure such as the example shown in FIG. 8, and the analysis result buffer 7, which stores the analysis results of the Japanese analysis unit 4, has a data structure such as the example shown in FIG. 9. When the Japanese analysis unit 4 analyzes the document and derives readings, and a word has more than one possible reading, it selects the appropriate reading based on the co-occurrence information described in the word dictionary.
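The selection of a reading from co-occurrence information can be pictured with a minimal Python sketch. The dictionary layout, the example word 「市場」, its co-occurring words, and the function name `choose_reading` are all illustrative assumptions; the actual structure of the word dictionary 6 is the one shown in FIG. 8, which is not reproduced here.

```python
# Hypothetical dictionary entries: each candidate reading of a word lists
# words it typically co-occurs with. The entries are illustrative only.
WORD_DICTIONARY = {
    "市場": [
        {"reading": "しじょう", "cooccur": {"株式", "金融", "経済"}},
        {"reading": "いちば", "cooccur": {"魚", "野菜", "朝"}},
    ],
}


def choose_reading(word, context_words):
    """Return the candidate reading with the most co-occurring context words."""
    context = set(context_words)
    candidates = WORD_DICTIONARY[word]
    # max() keeps the first candidate on ties, so the first entry acts as default.
    best = max(candidates, key=lambda c: len(c["cooccur"] & context))
    return best["reading"]
```

With these assumed entries, a context containing 「魚」 selects いちば, while a context with no listed word falls back to the first candidate.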

【００２３】文書の解析が終了したことを示す信号を受
けた制御部３は音声データ生成部８に対して起動をかけ
る。音声データ生成部８は音声データ生成規則ファイル
９内のデータを参照して解析結果バッファ７内の日本語
解析結果に対応する音声データを生成して、これを音声
データファイル１０に格納する。この際、音声データ生
成部８は制御部３を介して特殊文字テーブル１２を参照
して、解析結果バッファ７内の日本語解析結果の中に図
１３に示すような特殊文字テーブル１２に登録されてい
る特殊文字（単語）が存在するかどうかをステップ５０
４にて判定し、存在する場合は前記特殊文字テーブル１
２内の前記検索された特殊文字に対応するルール（接続
子）を読み出した後、ステップ５０７に進み、存在しな
い場合はステップ５０５に進む。On receiving the signal indicating that document analysis is complete, the control unit 3 activates the voice data generation unit 8. The voice data generation unit 8 refers to the data in the voice data generation rule file 9, generates voice data corresponding to the Japanese analysis result in the analysis result buffer 7, and stores it in the voice data file 10. In doing so, the voice data generation unit 8 refers to the special character table 12, shown in FIG. 13, via the control unit 3 and determines in step 504 whether the Japanese analysis result in the analysis result buffer 7 contains a special character (word) registered in that table. If one is found, the unit reads the rule (connector) associated with that special character from the special character table 12 and proceeds to step 507; if none is found, it proceeds to step 505.

【００２４】音声データ生成部８はステップ５０７に進
んだ場合、前記検出された特殊文字の前後にある数字文
字列を前記日本語解析結果から抽出し、この抽出した数
字文字列と前記特殊文字テーブル１２から読み出した前
記ルールを制御部３を介して特殊数字文字処理部１５に
与えて、これに起動をかける。これにより、特殊数字文
字処理部２０はステップ５０８にて図１４に示すような
特殊文字処理規則テーブル２１内のデータの中で、前記
与えられたルールに対応する規則を制御部３を介して読
み出して、これを制御部３を介して音声データ生成部８
に与える。音声データ生成部８はステップ５０９にて前
記与えられた数字文字列に対応して読みとアクセントを
前記規則に従って付与することにより、音声データを生
成する。On proceeding to step 507, the voice data generation unit 8 extracts the numeric character strings before and after the detected special character from the Japanese analysis result, passes the extracted numeric character string and the rule read from the special character table 12 to the special numeric character processing unit 20 via the control unit 3, and activates that unit. In step 508, the special numeric character processing unit 20 reads, via the control unit 3, the rule in the special character processing rule table 21 (shown in FIG. 14) that corresponds to the given rule, and passes it to the voice data generation unit 8 via the control unit 3. In step 509, the voice data generation unit 8 generates voice data by assigning a reading and accent to the given numeric character string in accordance with that rule.

【００２５】一方、音声データ生成部８はステップ５０
５に進んだ場合、図１６に示すような特殊パターンテー
ブル１７を制御部３を介して参照して、解析結果バッフ
ァ７内の日本語解析結果の中に特殊パターンテーブル１
８に登録されている特殊パターンが存在するかどうかを
ステップ５０５にて判定し、存在する場合は前記特殊パ
ターンテーブル１７内の前記検索された特殊パターンに
対応するルールを読み出した後、ステップ５０６に進
み、存在しない場合はステップ５０９に進む。音声デー
タ生成部８はステップ５０６に進んだ場合、前記特殊パ
ターンテーブル１８から制御部３を介して読み出したル
ールを特殊数字文字処理部２０に制御部３を介して与え
ることにより、これに起動をかける。これにより、特殊
数字文字処理部２０はステップ５０８にて図１７に示す
ような構造の特殊パターン処理規則テーブル２２内のデ
ータの中で、前記与えられたルールに対応する規則を制
御部３を介して読み出して、これを音声データ生成部８
に制御部３を介して与える。音声データ生成部８はステ
ップ５０９にて前記検出された特殊数字文字列に対応し
て読みとアクセントを前記規則に従って付与することに
より、音声データを生成する。On proceeding to step 505, on the other hand, the voice data generation unit 8 refers to the special pattern table 18, shown in FIG. 16, via the control unit 3 and determines whether the Japanese analysis result in the analysis result buffer 7 contains a special pattern registered in that table. If one is found, the unit reads the rule associated with that special pattern from the special pattern table 18 and proceeds to step 506; if none is found, it proceeds to step 509. On proceeding to step 506, the voice data generation unit 8 passes the rule read from the special pattern table 18 to the special numeric character processing unit 20 via the control unit 3, thereby activating it. In step 508, the special numeric character processing unit 20 reads, via the control unit 3, the rule in the special pattern processing rule table 22 (structured as shown in FIG. 17) that corresponds to the given rule, and passes it to the voice data generation unit 8 via the control unit 3. In step 509, the voice data generation unit 8 generates voice data by assigning a reading and accent to the detected special numeric character string in accordance with that rule.
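The branching in steps 504 and 505 can be sketched roughly as follows. This is only an illustration under stated assumptions: the table contents, the rule identifiers such as `"digits_hyphen_no"`, the use of regular expressions for patterns, and the function name `select_rule` are hypothetical and are not taken from FIG. 13 or FIG. 16. ASCII digits and symbols are used for simplicity.

```python
import re

# Hypothetical stand-in for the special character table 12: word -> rule id.
SPECIAL_CHAR_TABLE = {
    "郵便番号": "digits_hyphen_no",
    "比率": "digits_colon_tai",
}

# Hypothetical stand-in for the special pattern table 18: pattern -> rule id.
SPECIAL_PATTERN_TABLE = [
    (re.compile(r"\d+\(\d+\)\d+"), "phone"),
    (re.compile(r"\d+:\d+"), "time"),
]


def select_rule(sentence):
    """Return (source, rule_id) following the order of steps 504 and 505."""
    # Step 504: is a registered special word present?
    for word, rule_id in SPECIAL_CHAR_TABLE.items():
        if word in sentence:
            return ("special_character", rule_id)
    # Step 505: does a registered numeric pattern occur?
    for pattern, rule_id in SPECIAL_PATTERN_TABLE:
        if pattern.search(sentence):
            return ("special_pattern", rule_id)
    # Neither applies: fall through to ordinary voice data generation.
    return ("default", None)
```

Because the special-word check runs first, a sentence that contains both a registered word and a matching pattern resolves to the special-word rule in this sketch.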

【００２６】次に、上記したステップ５０４〜５０９の
具体的処理内容を説明する。例えば、図１２の上段に示
すような「彼の郵便番号は１２３−４５だ。」という文
に対して特殊文字列として「郵便番号」が特殊文字テー
ブル１２に登録されていない場合は、数字文字列部分の
「１２３−４５」は「ひゃくに＾じゅう／さん／まいな
す／よ＾んじゅう／ご」という読みが付与されるが、
「郵便番号」が図１３に示すような特殊文字テーブル１
２に登録してある場合には、特殊文字検出部１７により
「郵便番号」が特殊文字として検出され、特殊数字文字
処理部２０に対して起動がかかり、特殊数字文字処理部
２０は特殊文字処理規則２１を参照して読みの生成を行
なう。この例では特殊文字処理規則２１には隣接する数
字の読みを棒読みにし、「−」に対しては「の」という
読みを振るという規則が書かれており、この規則に従う
と、前記１２３−４５の読みは図１２の下段に示すよう
に「いちに＾い／さんの／よんご＾−」となる。Next, the processing in steps 504 to 509 is described concretely. Take the sentence 「彼の郵便番号は１２３−４５だ。」 ("His postal code is 123-45."), shown in the upper part of FIG. 12. If 「郵便番号」 ("postal code") is not registered in the special character table 12 as a special character string, the numeric character string 「１２３−４５」 is given the reading 「ひゃくに＾じゅう／さん／まいなす／よ＾んじゅう／ご」 (hyakuni^juu/san/mainasu/yo^njuu/go, "one hundred twenty-three minus forty-five"). If 「郵便番号」 is registered in the special character table 12 as shown in FIG. 13, the special character detection unit 17 detects it as a special character, the special numeric character processing unit 20 is activated, and that unit generates the reading by referring to the special character processing rules 21. In this example, the special character processing rules 21 say to read the adjacent digits straight through, digit by digit, and to read 「−」 as 「の」 (no). Under this rule, the reading of 「１２３−４５」 becomes 「いちに＾い／さんの／よんご＾−」 (ichini^i/sanno/yongo^-), as shown in the lower part of FIG. 12.
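The digit-by-digit rule for the postal code example can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the function name `read_digit_by_digit` and the mapping layout are hypothetical, ASCII characters stand in for the full-width ones, and the accent marks (「＾」), phrase breaks (「／」), and vowel lengthening that appear in the reading of FIG. 12 are deliberately omitted.

```python
# Kana reading of each digit when read one digit at a time.
DIGIT_KANA = {
    "0": "ぜろ", "1": "いち", "2": "に", "3": "さん", "4": "よん",
    "5": "ご", "6": "ろく", "7": "なな", "8": "はち", "9": "きゅう",
}


def read_digit_by_digit(text, symbol_readings):
    """Read digits one by one; read symbols as the given rule prescribes."""
    parts = []
    for ch in text:
        if ch in DIGIT_KANA:
            parts.append(DIGIT_KANA[ch])
        elif ch in symbol_readings:
            parts.append(symbol_readings[ch])
        else:
            raise ValueError(f"no reading defined for {ch!r}")
    return "".join(parts)


# Rule assumed for the special word "郵便番号": read "-" as "の".
postal_reading = read_digit_by_digit("123-45", {"-": "の"})
```

Here `postal_reading` is "いちにさんのよんご", which corresponds to the reading in the lower part of FIG. 12 without its prosodic marks.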

【００２７】又、図１５の上段に示すような文字列「０
３（１２３）４５６７」が文書中にあった場合、特殊パ
ターン検出部１９が参照する特殊パターンテーブル１８
に該当するパターンが登録されていない時には、「ぜろ
さ＾ん／か＾っこ／ひゃくに＾じゅうさん／か＾っこ／
よんせ＾ん／ごひゃくろくじゅうな＾な」という読みが
付与される。しかし、図１６に示すような構造の特殊パ
ターンテーブル１８に「数字（数字）数字」という登録
がある場合は、特殊文字検出部１７によりパターンの照
らし合わせが行なわれ、入力文字列が特殊パターンテー
ブル１８内のデータにあると、特殊数字文字処理部２０
に対して起動がかかり、特殊数字文字処理部２０は図１
７に示すような特殊文字処理規則テーブル２１を参照し
て読みの生成を行なう。この例では、特殊文字処理規則
テーブル２１には隣接する数字の読みを棒読みにし、
「（」「）」に対しては「の」という読みを当てるいう
規則が書かれており、この規則に従うと、入力文字列に
対する読みは「ぜろさ＾んの／いちに＾いさんの／よん
ご−ろくな＾な」となる。従って、文字列「０３（１２
３）４５６７」に付与される読みは図１５の下段に示す
ようになる。Similarly, suppose the character string 「０３（１２３）４５６７」, shown in the upper part of FIG. 15, appears in a document. If no matching pattern is registered in the special pattern table 18 referred to by the special pattern detection unit 19, the string is given the reading 「ぜろさ＾ん／か＾っこ／ひゃくに＾じゅうさん／か＾っこ／よんせ＾ん／ごひゃくろくじゅうな＾な」 (zerosa^n/ka^kko/hyakuni^juusan/ka^kko/yonse^n/gohyakurokujuuna^na). If, however, the special pattern table 18, structured as shown in FIG. 16, contains the entry 「数字（数字）数字」 ("digits (digits) digits"), the special pattern detection unit 19 compares the input against the registered patterns; when the input character string matches an entry in the special pattern table 18, the special numeric character processing unit 20 is activated and generates the reading by referring to the special pattern processing rule table 22 shown in FIG. 17. In this example, the rule says to read the adjacent digits straight through, digit by digit, and to read 「（」 and 「）」 as 「の」 (no). Under this rule, the reading of the input character string becomes 「ぜろさ＾んの／いちに＾いさんの／よんご−ろくな＾な」 (zerosa^nno/ichini^isanno/yongo-rokuna^na). The reading assigned to 「０３（１２３）４５６７」 is therefore as shown in the lower part of FIG. 15.
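One way to picture how an entry such as 「数字（数字）数字」 could be matched is to compile it into a regular expression. The sketch below is an assumption for illustration, not the mechanism of the special pattern detection unit 19: the function names are hypothetical, ASCII parentheses and digits replace the full-width ones, and prosodic marks are omitted from the generated reading.

```python
import re

DIGIT_KANA = {
    "0": "ぜろ", "1": "いち", "2": "に", "3": "さん", "4": "よん",
    "5": "ご", "6": "ろく", "7": "なな", "8": "はち", "9": "きゅう",
}


def compile_pattern(entry):
    """Turn a table entry such as '数字(数字)数字' into a regular expression.

    Each occurrence of '数字' becomes one or more digits; every other
    character is matched literally.
    """
    literal_parts = entry.split("数字")
    return re.compile(r"\d+".join(re.escape(p) for p in literal_parts))


def read_with_rule(text, symbol_readings):
    """Read digits one by one and symbols according to the given rule."""
    parts = []
    for ch in text:
        if ch in DIGIT_KANA:
            parts.append(DIGIT_KANA[ch])
        else:
            parts.append(symbol_readings[ch])  # KeyError if no reading exists
    return "".join(parts)


PHONE_PATTERN = compile_pattern("数字(数字)数字")
phone_reading = read_with_rule("03(123)4567", {"(": "の", ")": "の"})
```

`PHONE_PATTERN` matches "03(123)4567" but not "123-45", and `phone_reading` is "ぜろさんのいちにさんのよんごろくなな".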

【００２８】同様に、読み上げ対象文書中の「１０：０
０」という文字列に対しては、特殊パターンテーブル１
８に「数字：数字」というパターンの登録がある場合
は、特殊パターン検出部１９により前記文字列とパター
ンの照らし合わせにより、前記「数字：数字」というパ
ターンが検出され、特殊数字文字処理部２０に対して起
動がかかる。これにより、特殊数字文字処理部２０は特
殊パターン処理規則テーブル２２内のデータを参照して
読みの生成を行なう。この例では「：」に対して「じ」
という読みを振り、後ろの数字の後に「ふん」という読
みを降るという規則が書かれており、「じゅ＾うじ／ぜ
ろぜろ＾ふん」という読みになる。尚、この例では、特
殊文字処理規則テーブル２１に記述されている規則によ
って「じゅ＾うじ／ぜろ＾ふん」又は「じゅ＾うじ」と
いう読みを生成することができる。音声データ作成部８
は上記読みに対して音声データをステップ２０９にて作
成することになる。Similarly, for the character string 「１０：００」 in the document to be read aloud, if the pattern 「数字：数字」 ("digits:digits") is registered in the special pattern table 18, the special pattern detection unit 19 compares the string against the registered patterns, detects the 「数字：数字」 pattern, and activates the special numeric character processing unit 20. The special numeric character processing unit 20 then generates the reading by referring to the data in the special pattern processing rule table 22. In this example, the rule says to read 「：」 as 「じ」 (ji, "o'clock") and to append 「ふん」 (fun, "minutes") after the trailing number, giving the reading 「じゅ＾うじ／ぜろぜろ＾ふん」 (ju^uji/zerozero^fun). Depending on the rules described in the special character processing rule table 21, the reading 「じゅ＾うじ／ぜろ＾ふん」 (ju^uji/zero^fun) or 「じゅ＾うじ」 (ju^uji) can be generated instead. The voice data generation unit 8 creates voice data for this reading in step 509.

【００２９】ここで、特殊数字文字処理部２０に送られ
る数字文字列として特殊文字検出部１７と特殊パターン
検出部１９の両方に適合する文字列がある場合は、特殊
文字検出部１７の規則を優先する。例えば「その比率は
１０：１３である。」という文に関して、数字文字列
「１０：１３」は特殊パターンテーブル１８内のデータ
に「数字：数字」という登録がある場合は、特殊パター
ン検出部１９によりパターンの照らし合わせが行なわ
れ、入力文字列が特殊パターンデータと一致すると、特
殊数字文字処理部２０に対して起動がかかり、特殊数字
文字処理部２０は特殊パターン処理規則テーブル２２の
データを参照して読みの生成を行なう。この例で
は「：」に対して「じ」という読みを振り、後ろの数字
の後に「ふん」という読みをたすという規則が書かれて
おり、「じゅ＾うじ／じゅうさ＾んぷん」という読みに
なる。音声データ作成部８は上記読みに対して音声デー
タをステップ５０９にて作成することになる。If a numeric character string sent to the special numeric character processing unit 20 matches both the special character detection unit 17 and the special pattern detection unit 19, the rule of the special character detection unit 17 takes priority. Take the sentence 「その比率は１０：１３である。」 ("The ratio is 10:13."). If 「数字：数字」 is registered in the special pattern table 18, the special pattern detection unit 19 compares the numeric character string 「１０：１３」 against the registered patterns; when the input matches the special pattern data, the special numeric character processing unit 20 is activated and generates the reading by referring to the data in the special pattern processing rule table 22. In this example, the rule says to read 「：」 as 「じ」 (ji) and to append 「ふん」 (fun) after the trailing number, giving the reading 「じゅ＾うじ／じゅうさ＾んぷん」 (ju^uji/juusa^npun, "ten o'clock, thirteen minutes"). The voice data generation unit 8 would then create voice data for this reading in step 509.

【００３０】しかし、特殊文字テーブル１２に「比率」
が登録されている場合は特殊文字検出部１７により前記
「比率」が特殊文字テーブル１２から特殊文字として検
出され、特殊数字文字処理部２０に対して起動がかか
り、特殊数字文字処理部２０は特殊文字処理規則テーブ
ル２１を参照して、読みの生成を行なう。この例では特
殊文字処理規則テーブル２１に、隣接する数字の読みを
棒読みにし、「：」に対しては「たい」という読みを振
るという規則が書かれており、この規則に従うと「１
０：１３」に対する読みは「じゅった＾い／じゅ＾うさ
ん」となり、全体の読みは「そのひりつは／じゅうた＾
い／じゅ＾うさんで／あ＾る」という読みが得られる。
このような場合、前述したように特殊文字検出部１７の
検出により得られた読みが優先するため、音声データ生
成部８は「そのひりつは／じゅうた＾い／じゅ＾うさん
で／あ＾る」という読みを採用し、これら読みに対する
音声データをステップ５０９にて作成することになる。However, if 「比率」 ("ratio") is registered in the special character table 12, the special character detection unit 17 detects it as a special character, the special numeric character processing unit 20 is activated, and that unit generates the reading by referring to the special character processing rule table 21. In this example, the rule in the special character processing rule table 21 says to read the adjacent numbers straight through and to read 「：」 as 「たい」 (tai, "to"). Under this rule, the reading of 「１０：１３」 becomes 「じゅった＾い／じゅ＾うさん」 (jutta^i/ju^usan), and the whole sentence is read 「そのひりつは／じゅうた＾い／じゅ＾うさんで／あ＾る」 (sonohiritsuwa/juuta^i/ju^usande/a^ru). In this case, because the reading obtained through the special character detection unit 17 takes priority as described above, the voice data generation unit 8 adopts the reading 「そのひりつは／じゅうた＾い／じゅ＾うさんで／あ＾る」 and creates voice data for it in step 509.

【００３１】尚、上記した特殊文字の検出と、特殊パタ
ーンの検出とが同一の数字文字列に対して同時に発生し
た場合の優先順位は入力装置２から制御部３に予め設定
できるようになっており、制御部３がこの優先順位を音
声データ生成部８にセットすることにより、上記処理が
行われるようになっている。従って、制御部３に前記優
先順位を逆に設定しておけば、特殊パターン検出部１９
の検出による音声データ生成処理が優先されて、実行さ
れることになる。The priority applied when special-character detection and special-pattern detection occur simultaneously for the same numeric character string can be preset in the control unit 3 from the input device 2. The control unit 3 sets this priority in the voice data generation unit 8, and the processing described above is carried out accordingly. If the priority is set the other way round in the control unit 3, the voice data generation processing based on detection by the special pattern detection unit 19 takes priority and is executed.
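The configurable priority between the two detection results can be sketched as a small selection function. The function name `pick_reading`, the `prefer` parameter, and its two string values are hypothetical; the readings are written without accent marks.

```python
def pick_reading(char_reading, pattern_reading, prefer="special_character"):
    """Choose between the reading from special-word detection and the
    reading from special-pattern detection.

    Either argument may be None when that detector found nothing.
    `prefer` models the priority that the operator presets.
    """
    if prefer not in ("special_character", "special_pattern"):
        raise ValueError(f"unknown priority: {prefer!r}")
    if prefer == "special_character":
        first, second = char_reading, pattern_reading
    else:
        first, second = pattern_reading, char_reading
    # Use the preferred reading when it exists, otherwise the other one.
    return first if first is not None else second


# The "10:13" example: the ratio rule versus the time rule.
ratio = "じゅうたいじゅうさん"
clock = "じゅうじじゅうさんぷん"
default_choice = pick_reading(ratio, clock)
reversed_choice = pick_reading(ratio, clock, prefer="special_pattern")
```

With the default priority the ratio reading is chosen; reversing the priority selects the time reading, mirroring the reversed setting described above.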

【００３２】音声データ生成部８は解析結果バッファ７
内の日本語解析結果データの特殊文字又は特殊パターン
でない部分については、ステップ５０８にて音声データ
生成規則ファイル９を参照してから音声データファイル
１０を生成し、前記特殊文字又は特殊パターンについて
はこれらに当てられた読みに対して音声データを生成し
て、音声データファイル１０に格納する。音声データ生
成部８は音声データの作成が終了すると、音声データの
作成が終了したことを示す信号を制御部３に対して送
る。For the portions of the Japanese analysis result data in the analysis result buffer 7 that are neither special characters nor special patterns, the voice data generation unit 8 refers to the voice data generation rule file 9 in step 508 and generates voice data; for the special characters or special patterns, it generates voice data for the readings assigned to them. It stores the resulting voice data in the voice data file 10. When voice data creation is finished, the voice data generation unit 8 sends the control unit 3 a signal indicating that creation of the voice data is complete.

【００３３】ここで、音声データ生成規則の一部を図１
０に示し、出力される音声データファイル１０を図１１
に示す。図１０は、五段動詞でアクセントの形がＯ型で
ない場合で、その活用形が未然形の場合はそのアクセン
トの形をＯ型にするという規則の例である。又、図１１
には音声データファイル１０のフォーマットを示す。但
し、読み上げ文字列データにおいてカタカナ文字は音声
データを表し「＾」はアクセントの位置を表し「．」は
設定バッファ５に設定されている長さの休みを表す。Part of the voice data generation rules is shown in FIG. 10, and the output voice data file 10 is shown in FIG. 11. FIG. 10 is an example of a rule stating that, for a godan (five-grade) verb whose accent type is not the O type, the accent type is changed to the O type when the verb is in its mizenkei (irrealis) form. FIG. 11 shows the format of the voice data file 10. In the read-aloud character string data, katakana characters represent the voice data, 「＾」 represents the accent position, and 「．」 represents a pause of the length set in the setting buffer 5.
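The two markers named in this format description can be interpreted with a small parser, sketched below. Several things are assumptions rather than details taken from FIG. 11: the function name `parse_voice_data`, the event tuples, the choice to attach an accent marker to the character just before it, and the sample string. Both ASCII and full-width forms of the markers are accepted, and any other character is passed through as a sound.

```python
def parse_voice_data(text, pause_length):
    """Split a voice-data string into (kind, value, accented) events.

    kind is "sound" for an ordinary character or "pause" for a rest whose
    length comes from the reading conditions.
    """
    events = []
    for ch in text:
        if ch in ("^", "＾"):
            # Assumption: the marker applies to the sound just before it.
            if events and events[-1][0] == "sound":
                kind, value, _ = events[-1]
                events[-1] = (kind, value, True)
        elif ch in (".", "．"):
            events.append(("pause", pause_length, False))
        else:
            events.append(("sound", ch, False))
    return events


# Illustrative input only; not a string taken from FIG. 11.
sample_events = parse_voice_data("カ^レ.ノ", pause_length=3)
```

For the sample string, the first sound is flagged as accented and a pause of length 3 falls between the second and third sounds.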

【００３４】音声データ作成が終了したことを示す信号
を受けた制御部３は音声データファイル１０内の音声デ
ータを音声合成装置１１に渡す。音声合成装置１１はス
テップ５１０にて設定バッファ５内に設定された速度、
音質、高さ、強さにより前記音声データを電気信号に変
換することにより、スピーカ等の音声出力部１３から音
声を出力する。ここで、音声合成装置１１は音韻例と、
特殊制御コードからなるデータを入力すると、これを電
気的な音声信号に変換する装置であり、前記データは音
声データファイル１０内の音声データと同じ形式をとっ
ている。即ち、音声合成装置１１は音声データファイル
１０のフォーマットの文字列を受けると、音声の規則合
成を行なえる装置とも言え、指定された速度、音質、高
さ、強さによりそれに続いて送られてくる文字列に対し
て規則合成を行なう。On receiving the signal indicating that voice data creation is complete, the control unit 3 passes the voice data in the voice data file 10 to the voice synthesizer 11. In step 510, the voice synthesizer 11 converts the voice data into an electrical signal according to the speed, voice quality, pitch, and intensity set in the setting buffer 5, and the speech is output from the voice output unit 13, such as a speaker. The voice synthesizer 11 is a device that, given data consisting of a phoneme string and special control codes, converts that data into an electrical voice signal; the data has the same format as the voice data in the voice data file 10. In other words, the voice synthesizer 11 can be regarded as a device that performs rule-based speech synthesis on character strings in the format of the voice data file 10, applying the specified speed, voice quality, pitch, and intensity to the character strings sent to it subsequently.

【００３５】図５に示したステップ５０３〜５１０一連
の処理により文書の読み上げが行なわれる。制御部３は
ステップ５１１にて読み上げ対象文書が最後まで読み上
げられたかを判定し、読み上げられていない場合は、ス
テップ５０３に戻り、読み上げが終了した場合は処理を
終了する。The document is read aloud through the series of steps 503 to 510 shown in FIG. 5. In step 511, the control unit 3 determines whether the document has been read aloud to the end; if it has not, processing returns to step 503, and if reading is finished, processing ends.

【００３６】本実施例によれば、日本語解析結果から音
声データを作成する際に、電話番号や郵便番号等のよう
な桁読みでない特殊な読み上げ方をする数字文字列と通
常の桁読みを行う数字文字列とを識別し、前記特殊な読
み上げ方をする数字文字列に対しては、この数字文字列
のパーターン、又はこの数字文字列の前後にある前記電
話番号や郵便番号等のような特殊文字によって、その文
脈や場面に相応しい読みを当てることができるため、上
記のような数字文字列に対して常に適切な読み上げを行
うことができる。これにより、「（、）」や「−」等の
特殊な文字に条件を満たす数字が隣接している場合に、
「ぜろさんの／いちにいさんの／よんご−ろくな
な」等のように、電話番号や郵便番号の読み上げに適し
た自然な文書読み上げを実現でき、文書読み上げ時に聴
取者に違和感を与えたり、又は意味を取り違えるような
読み方をなくして、文書読上装置の性能を向上させるこ
とができる。しかも、前記数字文字列のパーターンとこ
れに関わる特殊読み規則や特殊文字とこれに関わる特殊
読み規則をオペレータが任意に登録できるため、読み上
げ対象文書の種類やオペレータサイドの事情によって、
上記のような特殊な数字文字列を自由に決定できると共
に、その特殊な読み方も自由に決定することができ、文
書読み上げ時の自由度を更に向上させることができる。According to this embodiment, when voice data is created from the Japanese analysis result, numeric character strings that are read in a special way rather than by digit position, such as telephone numbers and postal codes, are distinguished from numeric character strings read in the ordinary digit-position manner. A string of the former kind can be given a reading suited to its context or situation, based either on the pattern of the numeric character string or on a special word, such as "telephone number" or "postal code", located before or after it, so such strings are always read aloud appropriately. As a result, when digits satisfying the conditions are adjacent to special characters such as 「（」, 「）」, or 「−」, natural readings suited to telephone numbers and postal codes, such as 「ぜろさんの／いちにいさんの／よんご−ろくなな」 (zerosanno/ichiniisanno/yongo-rokunana), can be produced. This eliminates readings that sound unnatural to the listener or cause the meaning to be misunderstood, and so improves the performance of the document reading device. Moreover, because the operator can freely register numeric character string patterns with their special reading rules, as well as special characters with their special reading rules, both the special numeric character strings and their special readings can be chosen freely according to the type of document to be read and the operator's circumstances, further increasing flexibility when reading documents aloud.

【００３７】[0037]

【発明の効果】以上記述した如く請求項１乃至３記載の
文書読上装置によれば、読み上げ対象文書データ中の慣
例的に特殊な読み方をする数字文字列に対して常に文脈
や場面に適した自然な読み上げを行うことができると共
に、前記特殊な読み方をユーザが任意に設定できる。As described above, according to the document reading device of claims 1 to 3, numeric character strings in the document data that are conventionally read in a special way can always be read aloud naturally, in a manner suited to the context and situation, and the user can freely configure those special readings.

[Brief description of drawings]

【図１】本発明の文書読上装置の一実施例を示したブロ
ック図。FIG. 1 is a block diagram showing an embodiment of a document reading apparatus according to the present invention.

【図２】図１に示した表示装置に表示される特殊文字及
びそれに関わる特殊読み規則を登録する登録画面例を示
した図。FIG. 2 is a diagram showing an example of the registration screen, displayed on the display device shown in FIG. 1, for registering special characters and their associated special reading rules.

【図３】図１に示した表示装置に表示される特殊数字パ
ターン及びそれに関わる特殊読み規則を登録する登録画
面例を示した図。FIG. 3 is a diagram showing an example of the registration screen, displayed on the display device shown in FIG. 1, for registering special numeric patterns and their associated special reading rules.

【図４】図１に示した確認データ生成部により生成され
た登録内容確認のためのデータ例を示した図。FIG. 4 is a diagram showing an example of the data for confirming registered content generated by the confirmation data generation unit shown in FIG. 1.

【図５】図１に示した装置の文書読み上げ処理を示した
フローチャート。FIG. 5 is a flowchart showing the document reading process of the device shown in FIG. 1.

【図６】図１に示した文書データファイル内の文書デー
タの一部を示した図。FIG. 6 is a diagram showing part of the document data in the document data file shown in FIG. 1.

【図７】図１に示した設定バッファの内容例を示した
図。FIG. 7 is a diagram showing an example of the contents of the setting buffer shown in FIG. 1.

【図８】図１に示した単語辞書の構造例を示した図。FIG. 8 is a diagram showing an example of the structure of the word dictionary shown in FIG. 1.

【図９】図１に示した解析結果バッファの内容例を示し
た図。FIG. 9 is a diagram showing an example of the contents of the analysis result buffer shown in FIG. 1.

【図１０】図１に示した音声データ生成規則ファイル内
のデータ例を示した図。FIG. 10 is a diagram showing an example of the data in the voice data generation rule file shown in FIG. 1.

【図１１】図１に示した音声データファイル内の音声デ
ータの一例を示した図。FIG. 11 is a diagram showing an example of the voice data in the voice data file shown in FIG. 1.

【図１２】文書中の特殊文字例と特殊文字処理規則の適
用により生成された音声データ例を示した図。FIG. 12 is a diagram showing an example of a special character in a document and an example of the voice data generated by applying a special character processing rule.

【図１３】図１に示した特殊文字テーブル内の特殊文字
例を示した図。FIG. 13 is a diagram showing examples of special characters in the special character table shown in FIG. 1.

【図１４】図１に示した特殊文字処理規則バッファ内の
特殊文字処理規則例を示した図。FIG. 14 is a diagram showing an example of the special character processing rules in the special character processing rule buffer shown in FIG. 1.

【図１５】文書中の特殊パターン例と特殊パターン処理
規則の適用により生成された音声データの例を示した
図。FIG. 15 is a diagram showing an example of a special pattern in a document and an example of the voice data generated by applying a special pattern processing rule.

【図１６】図１に示した特殊パターンテーブル内の特殊
パターン例を示した図。FIG. 16 is a diagram showing examples of special patterns in the special pattern table shown in FIG. 1.

【図１７】図１に示した特殊パターン処理規則バッファ
内の特殊パターン処理規則例を示した図。FIG. 17 is a diagram showing an example of the special pattern processing rules in the special pattern processing rule buffer shown in FIG. 1.

[Explanation of symbols]

１…文書データファイル２…入力装置３…制御部４…日本語解析
部５…設定バッファ６…単語辞書７…解析結果バッファ８…音声データ
生成部９…音声データ生成規則ファイル１０…音声デー
タファイル１１…音声合成装置１２…特殊文字
テーブル１３…音声出力部１４…表示装置１５…表示データ生成部１６…表示デー
タファイル１７…特殊文字検出部１８…特殊パタ
ーンテーブル１９…特殊パターン検出部２０…特殊数字
文字処理部２１…特殊文字処理規則テーブル２２…特殊パタ
ーン処理規則テーブル２３…確認データ生成部
1: document data file; 2: input device; 3: control unit; 4: Japanese analysis unit; 5: setting buffer; 6: word dictionary; 7: analysis result buffer; 8: voice data generation unit;
9: voice data generation rule file; 10: voice data file; 11: voice synthesizer; 12: special character table; 13: voice output unit; 14: display device; 15: display data generation unit; 16: display data file;
17: special character detection unit; 18: special pattern table; 19: special pattern detection unit; 20: special numeric character processing unit; 21: special character processing rule table; 22: special pattern processing rule table; 23: confirmation data generation unit

Claims

[Claims]

1. A document reading device that generates voice data, in accordance with voice data generation rules, from an analysis result obtained by performing Japanese analysis on document data to be read aloud, converts the voice data into an electrical voice signal with a voice synthesizer, and outputs the resulting voice signal externally as speech through a voice output device, the document reading device comprising: first storage means for storing data listing a plurality of special words; special word detection means for detecting, in the Japanese analysis result, a special word stored in the first storage means; extraction means for extracting a numeric character string located before and after the special word detected by the special word detection means; second storage means for storing data listing rules defined for each of the plurality of special words stored in the first storage means; voice data generation means for generating voice data from the numeric character string extracted by the extraction means by assigning it a reading in accordance with the rule stored in the second storage means that corresponds to the special word detected by the special word detection means; and first registration means for registering special words in the first storage means and registering the rules in the second storage means.

2. A document reading device that generates voice data, in accordance with voice data generation rules, from an analysis result obtained by performing Japanese analysis on document data to be read aloud, converts the voice data into an electrical voice signal with a voice synthesizer, and outputs the resulting voice signal externally as speech through a voice output device, the document reading device comprising: third storage means for storing data listing a plurality of special numeric character string patterns; special pattern detection means for detecting, in the Japanese analysis result, a special numeric character string pattern stored in the third storage means; fourth storage means for storing data listing rules defined for each of the plurality of special numeric character string patterns stored in the third storage means; voice data generation means for generating voice data from the special numeric character string pattern detected by the special pattern detection means by assigning it a reading in accordance with the rule stored in the fourth storage means that corresponds to that special numeric character string pattern; and second registration means for registering special numeric character string patterns in the third storage means and registering the rules in the fourth storage means.

3. A document reading device that generates voice data, in accordance with voice data generation rules, from an analysis result obtained by performing Japanese analysis on document data to be read aloud, converts the voice data into an electrical voice signal with a voice synthesizer, and outputs the resulting voice signal externally as speech through a voice output device, the document reading device comprising: first storage means for storing data listing a plurality of special words; special word detection means for detecting, in the Japanese analysis result, a special word stored in the first storage means; extraction means for extracting a numeric character string located before and after the special word detected by the special word detection means; second storage means for storing data listing rules defined for each of the plurality of special words stored in the first storage means; first voice data generation means for generating voice data from the numeric character string extracted by the extraction means by assigning it a reading in accordance with the rule stored in the second storage means that corresponds to the special word detected by the special word detection means; first registration means for registering special words in the first storage means and registering the rules in the second storage means; third storage means for storing data listing a plurality of special numeric character string patterns; special pattern detection means for detecting, in the Japanese analysis result, a special numeric character string pattern stored in the third storage means; fourth storage means for storing data listing rules defined for each of the plurality of special numeric character string patterns stored in the third storage means; voice data generation means for generating voice data from the special numeric character string pattern detected by the special pattern detection means by assigning it a reading in accordance with the rule stored in the fourth storage means that corresponds to that special numeric character string pattern; and second registration means for registering special numeric character string patterns in the third storage means and registering the rules in the fourth storage means.