JPS6237799B2


Info

Publication number: JPS6237799B2
Application number: JP55019940A
Authority: JP
Inventors: Hitoshi Takase
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 1980-02-19
Filing date: 1980-02-19
Publication date: 1987-08-14
Also published as: JPS56116099A

Description

[Detailed Description of the Invention]

The present invention relates to a method for creating memory information in a speech synthesis device that uses compressed speech information (hereinafter referred to as speech data).

In recent years, research on speech has advanced rapidly. Speech data compression techniques such as delta modulation, formant information extraction, and partial autocorrelation coefficient extraction have been developed, and at the same time speech synthesis devices that synthesize speech from speech data compressed by these techniques have been realized.

FIG. 1 shows one such speech synthesis device, which uses speech data compression based on delta modulation. In the figure, 1 is a first memory that stores speech data. This speech data preserves the changes of the speech waveform: it is a binary code, addressed in 8-bit units, in which each sample of a speech waveform sampled at about 64 kHz is coded as 1 if it is greater than the immediately preceding sample and as 0 if it is smaller. 2 is an address register that specifies the address of the first memory 1. 3 is a second memory that stores the order in which the address register 2 specifies addresses. 4 is a multiplexer that sequentially outputs, one bit at a time, the codes of the speech data read from the specified address of the first memory 1; the period of this output is equal to the sampling period of the speech waveform described above. 5 is a speech synthesizer that synthesizes a speech waveform from the one-bit codes obtained from the multiplexer 4. 6 is an amplifier that amplifies the speech signal from the speech synthesizer and drives a speaker 7. 8 is a control unit that controls the operation of the address register 2, the multiplexer 4, the first memory 1, and the second memory 3.
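The 1-bit coding and 8-bit addressing described for the first memory 1 can be illustrated with a minimal Python sketch. The function names, the MSB-first bit order, the coding of equal samples as 0, and the zero padding of the last word are all assumptions made for illustration; the text specifies none of them.

```python
from typing import List


def delta_encode(samples: List[int]) -> List[int]:
    """Code each sample after the first as 1 if it exceeds its predecessor, else 0.

    Equal samples are coded as 0 here; the text does not say how ties are handled.
    """
    return [1 if cur > prev else 0 for prev, cur in zip(samples, samples[1:])]


def pack_words(bits: List[int]) -> List[int]:
    """Pack bits MSB-first into 8-bit words (bit order and zero padding are assumptions)."""
    padded = bits + [0] * (-len(bits) % 8)
    words = []
    for start in range(0, len(padded), 8):
        word = 0
        for bit in padded[start:start + 8]:
            word = (word << 1) | bit
        words.append(word)
    return words


def unpack_words(words: List[int]) -> List[int]:
    """Play the role of the multiplexer 4: emit each word one bit at a time, MSB first."""
    return [(word >> shift) & 1 for word in words for shift in range(7, -1, -1)]
```

Packing followed by unpacking returns the original bit stream whenever its length is a multiple of 8, which corresponds to the multiplexer 4 restoring the 64 kHz bit sequence from the stored words.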

Such a speech synthesis device exploits the fact that a speech waveform is stationary over short intervals of 10 to 20 msec. Of the portions within such a stationary interval that can be regarded as the same waveform, only one portion of the speech data is stored in the first memory 1, and the second memory 3 is provided as the means for using the speech data of that portion repeatedly when the speech waveform is resynthesized. The speech data as a whole is thereby compressed. However, because two memories are required, the number of data lines such as address lines increases, which complicates the configuration and is also disadvantageous in terms of cost.
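As a rough illustration of the conventional two-memory arrangement, the following Python sketch computes how many 8-bit words a stationary interval occupies at 64 kHz and replays a first memory in the order held by a second memory. The names and the list-based memory model are hypothetical; only the 64 kHz rate, the 8-bit word size, and the 10 to 20 msec interval come from the text.

```python
from typing import List

SAMPLE_RATE_HZ = 64_000  # sampling frequency given in the text
BITS_PER_WORD = 8        # speech data is addressed in 8-bit units


def words_in_interval(duration_ms: int) -> int:
    """Number of 8-bit words needed to hold duration_ms of 1-bit samples."""
    return SAMPLE_RATE_HZ * duration_ms // 1000 // BITS_PER_WORD


def play_two_memories(first_memory: List[int], second_memory: List[int]) -> List[int]:
    """Read the first memory in the address order stored in the second memory."""
    return [first_memory[address] for address in second_memory]
```

Under these assumptions a 10 msec interval occupies 80 words and a 20 msec interval 160 words, and repeating an address in the second memory reuses a stored word without storing it again.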

The present invention has been made in view of these points. It provides a method for creating memory information that makes it possible to realize a speech synthesis device using a single memory in which control information, including processing instructions for the speech data, is stored together with the speech data described above.

FIG. 2 is a block diagram showing one embodiment of a speech synthesis device that employs the method of the present invention. In the figure, 4 to 7 are the multiplexer, speech synthesizer, amplifier, and speaker, which are the same as in FIG. 1. 9 is a read-only memory (hereinafter abbreviated as ROM) with 8 bits per word. It stores, addressed in a predetermined order, speech data delta-modulated in the same way as in the device of FIG. 1 (sampling frequency 64 kHz) together with control information, such as branch instructions and branch-and-loop instructions, that controls the flow of the speech data, that is, the reading of the speech data. 10 is an address counter that specifies the address of the ROM 9. 11 is a counter that counts at the sampling period of the speech waveform and sets the period at which the multiplexer 4 outputs one bit at a time. 12 is a decoder that decodes the control information of the ROM 9 specified by the address counter 10. 13 is a control unit that controls the operation of the address counter 10 and the counter 11 based on the control information obtained from the decoder 12.

The memory information stored in the ROM 9, which is the characteristic feature of the present invention, is now described in more detail. Control information such as the branch-and-loop instruction is assigned specific codes chosen from among the binary codes consisting of one 1 and seven 0s and the binary codes consisting of one 0 and seven 1s, which occur infrequently as speech data. In addition, any speech data that would originally be represented by one of these specific binary codes is converted, when the memory information is created, into different approximate speech data that is substantially close to it.
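The pool of codes from which control codes are drawn can be listed with a short Python sketch. The function name is hypothetical, and the sketch only enumerates the candidates; which of them are actually assigned to instructions is left open by the text.

```python
from typing import List


def candidate_control_codes() -> List[int]:
    """Return the 8-bit words that contain exactly one 1 or exactly one 0."""
    single_one = [1 << bit for bit in range(8)]        # one 1 and seven 0s
    single_zero = [0xFF ^ word for word in single_one]  # one 0 and seven 1s
    return single_one + single_zero
```

Of the 256 possible 8-bit words, 16 are candidates under this rule, so at most 16 speech words ever need to be replaced by approximations.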

In this embodiment, the speech data is a binary code obtained by delta-modulating the speech waveform. It is therefore appropriate to approximate speech data represented by one of the specific binary codes with the binary code in which the single 1 or 0 among the 8 bits is shifted by a few bit positions; approximating it with the all-0s or all-1s binary code also causes no practical problem. For example, if 00100000, 00000100, 11011111, and 11111011 are assigned to control information, speech data originally represented by these binary codes is instead represented as 01000000, 00000010, 10111111, and 11111101, respectively. This is the best approximation, since the 1-bit displacement is preserved. More simply, the first two may be represented by the all-0s code and the last two by the all-1s code, which is still an approximation with no practical problem.
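The conversion step in this example can be sketched in Python as a table lookup applied to the speech words before they are written to the ROM. The table values are the ones given in the example; the names `SHIFTED`, `SIMPLE`, and `sanitize` are hypothetical, and this is only one way the memory information might be prepared.

```python
from typing import Dict, List

# Reserved control codes from the example, mapped to the shifted approximations in the text.
SHIFTED: Dict[int, int] = {
    0b00100000: 0b01000000,
    0b00000100: 0b00000010,
    0b11011111: 0b10111111,
    0b11111011: 0b11111101,
}

# Simpler alternative from the text: all 0s for the single-1 codes, all 1s for the single-0 codes.
SIMPLE: Dict[int, int] = {
    code: (0x00 if bin(code).count("1") == 1 else 0xFF) for code in SHIFTED
}


def sanitize(speech_words: List[int], table: Dict[int, int]) -> List[int]:
    """Replace any speech word that collides with a control code by its approximation."""
    return [table.get(word, word) for word in speech_words]
```

After `sanitize` has been applied, none of the four reserved codes remains in the speech data, so a decoder can treat every occurrence of them in the ROM as control information.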

Next, the specific operation of a speech synthesis device employing the method of the present invention is described. In FIG. 2, speech data is transferred in 8-bit units from the ROM 9 to the multiplexer 4 at a frequency of 8 kHz, according to the addresses specified by the address counter 10. In the multiplexer 4, this speech data is passed to the speech synthesizer one bit at a time, according to the 64 kHz count frequency of the counter 11. Speech is therefore synthesized at the same frequency as the sampling frequency used for delta modulation (64 kHz). Suppose that during this operation a code such as 00100000 is read from address 100 of the ROM 9 and the decoder 12 determines that it is not speech data but a branch-and-loop instruction. The action of the counter 11 on the multiplexer 4 is then inhibited, which stops the output of the multiplexer 4, and the count period of the counter 11 is supplied to the address counter 10 so that the address counter 10 counts at 64 kHz. As a result, at eight times the speed used when addressing speech data, the control unit 13 stores in turn the first address of the jump destination held at the next address 101 (for example, 50), the last address of the jump destination held at address 102 (for example, 70), and the repeat count held at address 103 (for example, 3). The address that follows these instruction addresses, 104, is temporarily held in the address counter 10. Immediately afterward, under the direction of the control unit 13, the counter 11 acts on the multiplexer 4 again and the address counter 10 specifies address 50 of the ROM 9, returning to the normal operation of addressing speech data. After addresses 50 through 70 have been specified three times in succession, the saved address 104 is specified and the previous address sequence resumes. In this way, the speech in a stationary interval of the speech waveform is resynthesized.

Reading the control information stored at addresses 100 through 104 of the ROM 9 at 64 kHz effectively shortens the time for which speech synthesis is interrupted. It is nevertheless preferable to provide means for feeding the speech synthesizer, during this time, a 64 kHz square-wave pulse corresponding to a code such as 10101, so that the height of the speech waveform can be maintained even while the control information is being processed.
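The address flow for the branch-and-loop instruction can be modeled with a small Python sketch. It assumes a ROM represented as a list of 8-bit words, an instruction layout of opcode, first address, last address, and repeat count in four consecutive words (as in the example at addresses 100 to 103), and a loop body that contains only speech data. The names `BRANCH_AND_LOOP` and `play` are hypothetical, and timing (8 kHz versus 64 kHz reads) is not modeled.

```python
from typing import List

BRANCH_AND_LOOP = 0b00100000  # example code from the text; the actual assignment is a design choice


def play(rom: List[int], start: int, stop: int) -> List[int]:
    """Return the speech words sent to the multiplexer for addresses start..stop-1."""
    output: List[int] = []
    address = start
    while address < stop:
        word = rom[address]
        if word == BRANCH_AND_LOOP:
            # Operands are read as plain numbers, not decoded as instructions (assumption).
            first, last, count = rom[address + 1], rom[address + 2], rom[address + 3]
            return_address = address + 4  # address following the instruction words
            for _ in range(count):
                output.extend(rom[first:last + 1])  # last address is inclusive
            address = return_address
        else:
            output.append(word)
            address += 1
    return output
```

With the instruction `[BRANCH_AND_LOOP, 50, 70, 3]` stored at addresses 100 to 103, playback that reaches address 100 emits the 21 words at addresses 50 through 70 three times and then continues from address 104.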

As is clear from the above description, in the method for creating memory information for a speech synthesis device according to the present invention, a memory is provided that stores both speech data and control information, specific codes that occur infrequently as speech data are assigned to the control information, and speech data that would originally be represented by these specific codes is converted into approximate speech data other than that speech data. A speech synthesis device that can control the flow of speech data while using a single memory can therefore be realized, and such a device can synthesize speech with substantially no distortion using a simple configuration.

Further, in the method for creating memory information for a speech synthesis device according to the present invention, a binary code obtained by delta-modulating the changes of the speech waveform is used as the speech data, and the specific codes for the control information are selected from among the specific binary codes consisting of one 1 combined with a specific number of 0s and of one 0 combined with a specific number of 1s. Only speech data that occurs infrequently is therefore converted into approximate speech data, and adverse effects on the synthesized speech can be almost entirely eliminated.

Furthermore, in the method for creating memory information for a speech synthesis device according to the present invention, speech data represented by the all-0s and all-1s binary codes is used as the approximate speech data. Conversion from the specific binary codes to the approximate speech data is therefore easy, and creation of the memory information can be simplified.

As is clear from the effects described above, the method of the present invention makes it possible to give a speech synthesis device that uses compressed speech data a simpler configuration, and its practical benefit is considerable.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the configuration of a conventional speech synthesis device, and FIG. 2 is a block diagram showing the configuration of a speech synthesis device that employs the memory information creation method of the present invention. 4 is a multiplexer, 5 is a speech synthesizer, 9 is a ROM, 10 is an address counter, 11 is a counter, 12 is a decoder, and 13 is a control unit.

Claims

[Scope of Claims]

1. A method for creating memory information for a speech synthesis device comprising a memory that stores speech information obtained by encoding a speech waveform together with control information including processing instructions for the speech information, a control unit that controls reading of the speech information in accordance with the control information in the memory, a speech synthesizer that synthesizes a speech waveform based on the speech information read from the memory, and a speaker that converts the speech signal obtained from the speech synthesizer into sound, the method being characterized in that a specific code that occurs infrequently as speech information is assigned to the control information, and speech information that would originally be represented by this specific code is converted into approximate speech information other than that speech information.

2. The method for creating memory information for a speech synthesis device according to claim 1, characterized in that the speech information consists of a binary code in which the changes of the speech waveform are encoded by delta modulation, and the specific code assigned to the control information is selected from among specific binary codes, infrequent as speech information, formed by combining one 1 with a specific number of 0s or one 0 with a specific number of 1s.

3. The method for creating memory information for a speech synthesis device according to claim 2, characterized in that speech information represented by the all-0s and all-1s binary codes is used as the approximate speech information in place of the specific code assigned to the control information.