JPH0143320B2

Info

Publication number: JPH0143320B2
Application number: JP54039051A
Authority: JP
Inventors: Shigeaki Masuzawa; Shinya Shibata; Hiroshi Myazaki
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1979-03-30
Filing date: 1979-03-30
Publication date: 1989-09-20
Also published as: JPS55130598A

Description

[Detailed description of the invention]

The present invention relates to a method of outputting numerical information as speech in a voice output device (for example, a voice-announcing clock, a voice-output calculator, or a voice-announcing tester) that delivers its output by voice. Its object is to keep the volume of data needed for speech synthesis as small as possible while still obtaining high-quality synthesized speech.

In general, when numerical information is output as speech, the spoken number can be built by combining several phonemes. For example, 2534, read "ni-sen go-hyaku san-ju yon", can be constructed from the seven phonemes "ni", "sen", "go", "hyaku", "san", "ju", and "yon". A known method stores the required number of such basic phonemes in memory, retrieves them in the desired order, and performs speech synthesis to output the desired numerical information as speech.
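The known method described above can be sketched as a simple lookup and concatenation. This is only an illustration under stated assumptions: the function name `naive_units`, the romanized unit names, and the restriction to integers from 1 to 9999 are not from the patent, and euphonic readings of certain numbers are deliberately ignored.

```python
# Hypothetical romanized names for the stored sound units (assumption).
DIGITS = {1: "ichi", 2: "ni", 3: "san", 4: "yon", 5: "go",
          6: "roku", 7: "nana", 8: "hachi", 9: "kyu"}
# Place-value words, most significant first.
PLACES = [(1000, "sen"), (100, "hyaku"), (10, "ju")]


def naive_units(n: int) -> list[str]:
    """Split an integer (1..9999) into the sequence of units to be spoken.

    One stored unit per word is assumed, with no context-dependent variants.
    """
    if not 1 <= n <= 9999:
        raise ValueError("this sketch only covers 1..9999")
    units = []
    for weight, word in PLACES:
        digit, n = divmod(n, weight)
        if digit == 0:
            continue            # nothing is spoken for an empty place
        if digit > 1:
            units.append(DIGITS[digit])  # a leading 1 is not spoken before sen/hyaku/ju
        units.append(word)
    if n:
        units.append(DIGITS[n])  # remaining ones digit
    return units


print(naive_units(2534))
```

Because every occurrence of a word maps to the same stored unit, this approach needs little memory, which is the trade-off the following paragraphs examine.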

However, our research has shown that simply combining such basic phonemes causes problems in some cases.

For example, the "hyaku" in 123 ("hyaku ni-ju san"), the "hyaku" in 234 ("ni-hyaku san-ju yon"), and the "hyaku" of 100 are all read the same way, but their pitch and amplitude contours differ subtly. If a single phoneme is shared among them, the pronunciation can sound quite unnatural in some cases. High-quality speech therefore cannot be output unless three variants of the "hyaku" phoneme are used according to the case. Along similar lines, the present applicant has already proposed storing the "ju" of the tens (10 to 19) and the "ju" of the twenties and thirties as different word data. The present invention examines these relationships in greater detail and provides several variants not only of the place-value words but also of the digit sounds, so as to obtain synthesized speech of higher quality.
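The three-way distinction for "hyaku" can be illustrated with a small selector. This is a sketch, not the patent's circuit: the variant labels and the function name are hypothetical, and the boundaries (exactly 100, 101 to 199, 200 to 999) are an assumption extrapolated from the three examples 100, 123, and 234.

```python
def hyaku_variant(n: int) -> str:
    """Pick which stored "hyaku" recording to use for a number in 100..999.

    Labels are hypothetical; the grouping follows the three examples in the text.
    """
    if not 100 <= n <= 999:
        raise ValueError("no 'hyaku' is spoken for this sketch outside 100..999")
    if n == 100:
        return "hyaku_alone"        # "hyaku" spoken on its own
    if n < 200:
        return "hyaku_leading"      # starts the number, more words follow (e.g. 123)
    return "hyaku_after_digit"      # preceded by a digit word (e.g. 234)


for value in (100, 123, 234):
    print(value, hyaku_variant(value))
```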

FIG. 1 shows each phoneme and examples of its use. For example, the "ju" in 20, 30, and 40 uses different data from the "ju" in 50, 60, 80, and so on. Likewise, the "go" in 50 and 5,000, the "go" in 500 and 50,000, and the "go" in 500,000,000 each use different data. FIG. 1a shows examples of place-value phonemes. In FIG. 1b, n₁, n₃, n₇, n₁₀, n₁₂, n₁₈, n₂₂, n₂₅, n₂₉, and n₃₂ are the sounds of digits spoken on their own, used, for example, when the digits after the decimal point are read out one by one. FIG. 2 uses fewer variants than FIG. 1 and is effective when the storage capacity for speech data must be kept to a minimum. In FIG. 2, for example, "3" uses a single data item in all cases.
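The grouping stated for "go" can be written as a table keyed by place. The sketch below encodes only the groupings named in the text; the labels `go_a`, `go_b`, and `go_c` and the function name are hypothetical, and places not mentioned in the text are rejected rather than guessed.

```python
# Place exponent (value = digit * 10**exp) -> hypothetical label of the stored recording.
GO_VARIANT = {
    1: "go_a",  # 50
    3: "go_a",  # 5,000
    2: "go_b",  # 500
    4: "go_b",  # 50,000
    8: "go_c",  # 500,000,000
}


def go_variant(place_exp: int) -> str:
    """Return the label of the "go" recording for a 5 in the given place."""
    try:
        return GO_VARIANT[place_exp]
    except KeyError:
        raise ValueError(f"grouping for 10**{place_exp} is not stated in the text") from None


print(go_variant(1), go_variant(2), go_variant(8))
```

A reduced table in the spirit of FIG. 2 would map more places to the same label, trading naturalness for memory.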

An embodiment of the present invention is described below with reference to the drawings. FIG. 3 is its block diagram. X is a numerical data storage register that stores a numerical value, such as a calculation result, as digit data, and x is a decimal point position storage register that stores the decimal point position of the value held in X; both are implemented in RAM (random access memory). OC is an output control unit that takes the contents of the X register in speech-output order and outputs them one digit at a time to the one-digit buffer B. J₁ determines the place of the digit about to be spoken from the decimal point position stored in x and the signal S₁, which indicates which digit of the X register OC has output to buffer B, and, depending on the contents of buffer B, outputs signals S_2o and S_2d that specify which data to use. If the digit output to B is the first decimal place, J₁ also outputs signal S₃ together with S_2o and S_2d.
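The place determination performed from x and S₁ can be sketched in a few lines. The assumptions here are that X is modelled as a digit string with the most significant digit first, that x counts the digits after the decimal point (consistent with the 24356 / 2 example that yields 243.56), and that S₁ is modelled as an index into that string. The function names are hypothetical.

```python
def place_exponent(digits: str, point: int, idx: int) -> int:
    """Return e such that the digit at position idx has weight 10**e.

    digits: contents of register X, most significant digit first (assumption)
    point:  contents of register x, the number of fractional digits (assumption)
    idx:    which digit has been sent to buffer B, modelling signal S1
    """
    return (len(digits) - 1 - idx) - point


def s3_asserted(digits: str, point: int, idx: int) -> bool:
    """S3 is raised only for the first decimal place."""
    return place_exponent(digits, point, idx) == -1


for i in range(5):
    print(i, place_exponent("24356", 2, i), s3_asserted("24356", 2, i))
```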

When the output gate GO receives S₃, it outputs codes to VCC in the order CG₂ output, then CG₁ output; when it does not receive S₃, it outputs them in the order CG₁ output, then CG₂ output. J₂ determines the place of the digit about to be spoken from the decimal point position stored in x and the signal S₁, which indicates which digit of the X register OC has output to buffer B, and instructs CG₂ which place-value data to use. CG₁ and CG₂ are code generators. CG₁ is the digit voice code generator: based on the contents of buffer B and the selection signal S_2o from J₁, it outputs the code of the phoneme to be spoken to VCC through gate GO. CG₂ is the place-value voice code generator: based on the place-value signal from J₂ and the selection signal S_2d from J₁, it outputs the code of the phoneme to be spoken to VCC through gate GO.

VCC is the speech synthesis section, in which CC is a code converter, AC is an address counter, AD is an address decoder, and VR is a memory that stores the underlying speech data. For example, when the code for "ichi" is input to CC, CC converts it into the code of the first address at which the speech data for "ichi" is stored, AC counts up the address, and the "ichi" speech data is output sequentially from VR to D/A. D/A is a digital-to-analog converter that produces the final audio signal, which is output as sound from the speech output section SP. JE determines whether an END code has been output from VR and outputs a detection signal to OC and GO.
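The readout path through CC, AC, VR, and JE can be modelled as an address lookup followed by a count-up until an END marker is found. This is a minimal software analogy under assumptions: the memory contents, the addresses, and the use of `None` as the END code are invented for illustration and do not reflect the actual data format.

```python
END = None  # hypothetical stand-in for the END code stored after each phoneme

# VR: speech data memory; each phoneme is a run of samples closed by END (made-up values).
VR = [11, 12, 13, END, 21, 22, END]

# CC: converts a phoneme code into the first address of its data in VR.
CC = {"ichi": 0, "ni": 4}


def play(code: str) -> list[int]:
    """Return the samples that would be sent to the D/A converter for one phoneme."""
    addr = CC[code]              # CC supplies the start address
    samples = []
    while VR[addr] is not END:   # JE watches for the END code
        samples.append(VR[addr])
        addr += 1                # AC counts the address up
    return samples


print(play("ichi"), play("ni"))
```

In the described device the END detection is also what tells OC and GO to move on to the next code.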

For example, if the content of register X is 24356 and the content of register x is 2, X and x together hold the value 243.56. The output control unit OC first puts the hundreds digit "2" into buffer B. The digit voice code generator CG₁ then generates code n₅ of FIG. 1 based on signal S_2o and the contents of buffer B, and the place-value voice code generator CG₂ generates code d₁₂ of FIG. 1 based on signal S_2d and the place-value selection signal from J₂. Because signal S₃ is not asserted, the output gate GO first outputs the CG₁ code to VCC, and "ni" is spoken. When this speech output ends, GO receives the end signal Se and then outputs the CG₂ code to VCC, and "hyaku" is spoken. The output control unit OC then outputs the tens digit to buffer B. In this case CG₁ generates code n₁₁ and CG₂ generates d₈; since signal S₃ is not asserted, the output gate outputs the codes to VCC in the order CG₁ output, then CG₂ output, and "yon" and "ju" are spoken.

When the ones digit is output to buffer B, CG₁ generates n₈ of FIG. 1 but CG₂ generates no code, so "san" is spoken. Next, when the first decimal place is output to buffer B, CG₁ outputs code n₁₃ and CG₂ outputs code d₂₁. At this time J₁ outputs signal S₃, so the output gate GO outputs the codes to VCC in the order CG₂ output, then CG₁ output, and "ten" (the decimal point) and "go" are spoken. When the second decimal place is output to buffer B, CG₁ generates code n₁₈ but CG₂ generates no code, so "roku" is spoken. In this way "ni", "hyaku", "yon", "ju", "san", "ten", "go", and "roku" are spoken in sequence.
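The worked example for 243.56 can be traced end to end with a table-driven sketch. The code table below contains only the entries the example itself states (n₅, d₁₂, n₁₁, d₈, n₈, n₁₃, d₂₁, n₁₈); the rest of FIG. 1 is not reproduced here, so other inputs are not supported. The function name, the romanized sounds, and the modelling of registers X and x as a digit string and a fractional-digit count are assumptions.

```python
# (place exponent, digit) -> (CG1 code, CG2 code); None means no code is generated.
CODES = {
    (2, 2): ("n5", "d12"),
    (1, 4): ("n11", "d8"),
    (0, 3): ("n8", None),
    (-1, 5): ("n13", "d21"),
    (-2, 6): ("n18", None),
}

# Code -> romanized sound, as named in the example.
SOUND = {"n5": "ni", "d12": "hyaku", "n11": "yon", "d8": "ju",
         "n8": "san", "n13": "go", "d21": "ten", "n18": "roku"}


def speak(digits: str, point: int) -> list[str]:
    """Trace the spoken sequence for the value held in X (digits) and x (point)."""
    spoken = []
    for idx, ch in enumerate(digits):
        exp = (len(digits) - 1 - idx) - point      # place of this digit
        cg1, cg2 = CODES[(exp, int(ch))]           # codes from CG1 and CG2
        s3 = exp == -1                             # first decimal place
        order = (cg2, cg1) if s3 else (cg1, cg2)   # gate GO swaps the order on S3
        spoken.extend(SOUND[c] for c in order if c is not None)
    return spoken


print(speak("24356", 2))
```

The swap on S₃ is what places "ten" before the first fractional digit, while every other place speaks the digit before its place-value word.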

As is clear from the above description, the codes of FIG. 1 are stored in the code generators CG₁ and CG₂, and the code selected by an external selection signal is output, so that the necessary phonemes are combined. Even for the same reading, a different voice code is selected depending on conditions such as the digits that follow, and the most suitable sound is output each time. Synthesized speech of very high quality is thus obtained, and numerical values can be announced in a natural-sounding voice.

Embodiments of the present invention include the following.

(1) A voice output device comprising means for storing the "hyaku" of the 100s and the "hyaku" of the 200s to 900s as different data, and means for determining whether the value is in the 100s or in the 200s to 900s and outputting a different sound according to the corresponding data.

(2) A voice output device comprising means for storing the "hyaku" of a plain 100 as data different from the "hyaku" of the 100s to 900s, and means for determining whether the value is exactly 100 or in the 100s to 900s and outputting a different sound according to the corresponding data.

(3) A voice output device comprising means for storing the "ju" of a plain 10 as data different from the "ju" of the 10s to 90s, and means for determining whether the value is exactly 10 or in the 10s to 90s and outputting a different sound according to the corresponding data.

(4) A voice output device comprising means for storing, as different data, the sound of a digit spoken as a standalone word with no word attached before or after it and the sound of a digit that is not, and means for determining whether the digit is a standalone one and outputting a different sound according to the corresponding data.

(5) A voice output device comprising means for storing the "sen" of 4000 and the "sen" of 1000, 2000, and 5000 to 9000 as different data, and means for determining whether the "sen" is that of 4000 and outputting a different sound according to the corresponding data.

(6) A voice output device comprising means for storing the "hyaku" of 200 and 500 and the "hyaku" of 400, 700, and 900 as different data, and means for distinguishing between them and outputting a different sound according to the corresponding data.

As described above, the voice output device of the present invention is provided with means for generating a plurality of types of voice code signals for the same digit word and for the same place-value word. It determines the combination of the numerical information and its place-value information and, based on the determination output, drives the code signal generating means to select and output a predetermined code. Speech can therefore be output automatically from the given digit-word information and its place-value-word information, with the advantage that the quality of the synthesized speech is markedly improved.

[Brief explanation of drawings]

FIGS. 1 and 2 show phonemes and examples of their use, and FIG. 3 is a block diagram of a voice output device according to the present invention. In the figures: X: X register; x: x register; OC: output control circuit; B: one-digit buffer; J₁: digit discrimination section; J₂: place-value discrimination section; CG₁: digit voice code generator; CG₂: place-value voice code generator; GO: output gate; VCC: speech synthesis section; CC: code converter; AC: address counter; AD: address decoder; VR: speech data storage section; JE: END code detection section; D/A: digital-to-analog converter; SP: speech output section.

Claims

[Scope of Claims]

1. A voice output device configured to speak a numerical value using the digits constituting the value and place-value words, comprising: a pronunciation data storage unit that stores pronunciations for each place of the numerical value; a numerical data storage register that stores the numerical value; a decimal point position storage register that stores the decimal point position of the numerical value stored in the numerical data storage register; output control means for sequentially outputting digits from the numerical data storage register; discriminating means for determining the place of the digit output from the numerical data storage register according to the decimal point position; and a speech synthesis section that selectively outputs a pronunciation, synthesizes it into speech, and outputs the speech, whereby the numerical value in the numerical data storage register is output as speech together with its place-value words.