JPH0225516B2

Info

Publication number: JPH0225516B2
Application number: JP55160402A
Authority: JP
Inventors: Yutaka Yasui; Shuichi Hashimoto; Shigeki Sagayama
Original assignee: Fujitsu Ltd; Nippon Telegraph and Telephone Corp
Current assignee: Fujitsu Ltd; NTT Inc
Priority date: 1980-11-14
Filing date: 1980-11-14
Publication date: 1990-06-04
Also published as: JPS5784499A

Description

[Detailed Description of the Invention] The present invention relates to a speech synthesis system that synthesizes speech data and outputs a speech signal, and in particular to a method for starting the synthesis of speech data.

Known speech synthesis methods include waveform coding methods such as PCM (pulse code modulation) and DPCM (differential PCM); methods that exploit the characteristics of speech to represent it as parameters, such as the PARCOR (partial autocorrelation) method and the LPC (linear predictive coding) method; and the LSP (line spectrum pair) method. Among these, the LSP method encodes the features of the frequency spectrum using frequency-domain parameters. Compared with the LPC method it improves sound quality and reduces the amount of information, and compared with waveform coding methods that encode the speech waveform as PCM or the like it achieves a higher degree of data compression.

In speech synthesis by a parameter coding method, one set of speech data (for example, 6 bytes) is converted by a predetermined procedure every frame period. The result is interpolated, for example linearly, and the interpolated output for each sample period is supplied to a digital filter as its coefficients. A pulse train or white noise from the sound source section is applied to the digital filter as the excitation, and the synthesized speech output is obtained. Normally the speech data for the next frame is read in while the current frame is being processed. When the system or the speech synthesis circuit is first started, however, the states of the flip-flops contained in the shift registers, arithmetic circuits, and other circuits inside the speech synthesis circuit are undefined. The arithmetic circuits and the like must therefore be initialized (their flip-flops cleared) before the speech synthesis operation starts.

To eliminate the undefined circuit states described above, the object of the present invention is to initialize the speech synthesis circuit within a predetermined time, to shorten the time from receipt of the start request until speech synthesis actually begins, and to make speech synthesis start stably.

To achieve this object, the present invention concerns a speech synthesis device using a parameter coding method, which reads speech data and synthesizes speech from parameterized speech features through processing such as conversion, interpolation, and arithmetic on that data. The invention is characterized in that the speech synthesis circuit is initialized for a predetermined period by a start signal that activates it from outside; when the speech data has been received before the initialization is complete, the speech synthesis operation is started as soon as the initialization completes; and when the speech data is received after the initialization is complete, the speech synthesis operation is started as soon as reception of that speech data completes. It is further characterized in that, in the speech synthesis operation immediately after the start, the result of converting the speech data is input to the digital filter section and processed without interpolation.

The present invention is described in detail below with reference to an embodiment.

FIG. 1 is a block diagram of a system that performs speech synthesis under the control of a microprocessor. 1 is the microprocessor, 2 is a memory storing speech data, 3 is the speech synthesis circuit, 4 is a crystal resonator that generates the source signal for timing generation and the like, 5 is a digital-to-analog converter, 6 is a low-pass filter, 7 is an amplifier, 8 and 10 are speakers, and 9 is a transformer.

In response to an address signal from the microprocessor 1, speech data is read from the memory 2 and applied to the speech synthesis circuit 3. Control information from the microprocessor 1 controls the start and stop of synthesis in the speech synthesis circuit 3, the selection of the frame period, and so on, and status information from the speech synthesis circuit 3 is sent to the microprocessor 1. When the speech output of the speech synthesis circuit 3 is taken as a digital serial output, it is converted into an analog speech signal by the digital-to-analog converter 5 and applied through the low-pass filter 6 to the amplifier 7, whose output drives the speaker 8. Because the speech synthesis circuit 3 also contains a simple digital-to-analog converter, it can drive the speaker 10 through the transformer 9 as well. Speech quality is somewhat lower in that case, because the conversion to an analog speech signal is done by the simple digital-to-analog converter.

FIG. 2 is a block diagram of the speech synthesis circuit as implemented in an integrated circuit. 11 is a data buffer stack consisting of a group of shift registers, into which speech data D₀ to D₇ is set by the signal DL and held until the speech data is converted; 12 is an interface section; 13 is a control register into which control signals are set by the control register load signal CRL, namely the start/stop signal ST, the repeat signal RPT that stops interpolation and specifies repeated synthesis of the same data, the frame period setting signals T₀ and T₁, the variable/fixed frame length mode signal MODE, and the stop bit enable signal SBE that lets information in the speech data stop synthesis; 14 is a status register that outputs, under the signal SE, the speech data request signal, the fault information signal, and the synthesis-in-progress signal as status signals; 15 is a speech conversion section; 16 is a conversion section, consisting of a read-only memory (ROM) or the like, that converts speech data into filter coefficients; 17 is an interpolation section that interpolates the filter coefficients supplied every frame period and outputs them every sample period; 18 is a sound source section; 19 is a digital filter section; 20 is a simple digital-to-analog converter; and 21 is a control section that generates timing signals and the like and controls each section.

The symbols at the terminals denote signals. V_DD, V_SS, and V_SA are power supply voltages; SOUT is a digital speech output signal, for example 16-bit serial; POUT and NOUT are analog speech output signals produced by the simple digital-to-analog converter 20, which is for example an 8-bit converter; XOUT is the basic clock for internal logic operations; WSYN is the synchronization signal for SOUT; MR is the reset signal for the internal flip-flops and counters; CKST is the control signal for the crystal oscillator circuit; CLK and XTAL are the crystal resonator connection terminals; and FP is the output terminal for the frame pulse, which marks the first sample period of each frame.

In the case of the LSP method, one set of speech data is, for example, 6 bytes: a 1-bit start bit and 7 bits of pitch period data; 2 frame-length designation bits and 6 bits of amplitude data; and eight LSP parameters of 4 bits each. The frame length can be specified as, for example, 5, 10, 20, or 40 ms.
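The 6-byte frame described above can be illustrated with a short Python sketch. The field widths come from the text; everything else is an assumption made for illustration: the order of the fields within each byte (most significant bits first), the order of the two LSP parameters within a byte, the mapping of the 2-bit frame-length code to 5, 10, 20, and 40 ms, and the names `LspFrame` and `unpack_frame`.

```python
from dataclasses import dataclass
from typing import List

# Assumed mapping of the 2-bit frame-length code; the text only lists the lengths.
FRAME_LENGTHS_MS = (5, 10, 20, 40)


@dataclass
class LspFrame:
    start_bit: int        # 1 bit
    pitch: int            # 7 bits of pitch period data
    frame_length_ms: int  # decoded from 2 frame-length bits
    amplitude: int        # 6 bits of amplitude data
    lsp: List[int]        # eight 4-bit LSP parameters


def unpack_frame(data: bytes) -> LspFrame:
    """Split one 6-byte frame into its fields (bit layout is assumed)."""
    if len(data) != 6:
        raise ValueError("one set of speech data is exactly 6 bytes")
    start_bit = data[0] >> 7
    pitch = data[0] & 0x7F
    length_code = data[1] >> 6
    amplitude = data[1] & 0x3F
    lsp = []
    for byte in data[2:]:
        lsp.append(byte >> 4)    # assumed: first parameter in the high nibble
        lsp.append(byte & 0x0F)  # assumed: second parameter in the low nibble
    return LspFrame(start_bit, pitch, FRAME_LENGTHS_MS[length_code], amplitude, lsp)
```

The field widths add up as the text states: 8 bits for the start bit and pitch, 8 bits for the frame length and amplitude, and 32 bits for the eight LSP parameters, for 48 bits or 6 bytes in total.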

Speech data of this kind is read from the memory into the data buffer stack 11, where it is set by the signal DL (the data load pulse). The conversion section 16 converts it into filter coefficients using the LSP parameters and the amplitude data and sends them to the interpolation section 17. The interpolation section 17 interpolates linearly every sample period and sends the filter coefficients to the digital filter section 19. The sound source section 18 generates a pulse train in accordance with the pitch period data in the speech data and applies it to the digital filter section 19.

The digital filter section 19 includes adder-subtractor circuits, multiplier circuits, and the like. By operating on the pulse train from the sound source section 18 with the filter coefficients from the interpolation section 17, it acts as a speech synthesis filter and outputs a digital speech signal, for example 16-bit serial.

FIG. 3 is a block diagram of the main part of an embodiment of the present invention. The same reference numerals as in FIG. 2 denote the same elements. 31 is a start control circuit that controls the speech synthesis operation, the interpolation calculation, and so on; 32 is a holding register for the sound source pitch information; 33 is an excitation source generator; 34 is a difference value register; 35 is a 1/2ⁿ circuit; 36 is an interpolation value register; 37 and 41 are adder circuits; 38 and 43 are inhibit gates; 39 and 42 are AND gates; and 40, 44, and 45 are OR gates.

The speech data D₀ to D₇ read from the memory is set in the data buffer stack 11 by the data load pulse DL. This speech data is read out by the conversion read timing pulse t₁; the amplitude data and LSP parameters are sent to the conversion section 16, and the pitch period data is sent to the holding register 32 of the sound source section 18. The pitch period data update timing pulse t₂ transfers the contents of the holding register 32 to the excitation source generator 33. The excitation source generator 33 outputs a pulse train in accordance with the pitch period data held in the register 32 (periodic pulses, or pulses equivalent to white noise), and this is applied to the digital filter section 19.
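The two kinds of excitation mentioned above can be sketched in Python as follows. This is an illustration only, not the circuit of the excitation source generator 33: the text does not say how the choice between periodic pulses and noise-like pulses is encoded, so the `voiced` flag, the unit pulse amplitude, the random ±1 noise model, and the function name `excitation` are all assumptions.

```python
import random
from typing import List


def excitation(pitch_period: int, n_samples: int,
               voiced: bool = True, seed: int = 0) -> List[int]:
    """Return n_samples of excitation for the digital filter (illustrative model)."""
    if voiced:
        if pitch_period <= 0:
            raise ValueError("pitch_period must be positive for periodic pulses")
        # Periodic pulses: one unit pulse every pitch_period samples.
        return [1 if i % pitch_period == 0 else 0 for i in range(n_samples)]
    # Noise-like pulses: random-sign unit pulses as a stand-in for white noise.
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(n_samples)]
```

For example, `excitation(4, 8)` produces one pulse every four samples, while `excitation(4, 8, voiced=False)` produces a sequence of random signs whose exact values depend on the seed.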

Normally, that is, while synthesis is in progress, the conversion output of the conversion section 16 is applied to the adder circuit 37 of the interpolation section 17, which outputs its difference from the filter coefficients of the previous frame. The difference is set in the difference value register 34 by the difference calculation timing pulse t₃ and divided by the 1/2ⁿ circuit 35 into per-sample increments. In other words, every frame period the adder circuit 37 finds the difference between the coefficients supplied by the conversion section 16 and those of the previous frame; the 1/2ⁿ circuit divides this difference by the number of samples in one frame to give the increment per sample; and the adder circuit 41 adds that increment to the contents of the interpolation value register 36, so that a filter coefficient is output to the digital filter section 19 for every sample. During this time the output of the start control circuit 31 is "0", which closes gate 42 and opens gate 43.
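The per-sample interpolation described above amounts to dividing the frame-to-frame difference by the number of samples and accumulating the result. The Python sketch below shows that arithmetic under stated assumptions: the number of samples per frame is taken to be a power of two (`2**shift`), floating point stands in for the fixed-point registers of the hardware, and the names `interpolate_frame`, `prev_coeffs`, `new_coeffs`, and `shift` are hypothetical.

```python
from typing import List, Sequence


def interpolate_frame(prev_coeffs: Sequence[float],
                      new_coeffs: Sequence[float],
                      shift: int) -> List[List[float]]:
    """Return one coefficient set per sample, moving linearly from prev to new."""
    n_samples = 1 << shift  # assumed: samples per frame is 2**shift
    # Difference from the previous frame, divided into per-sample increments
    # (the roles of adder 37, difference register 34, and the 1/2^n circuit 35).
    steps = [(new - prev) / n_samples
             for prev, new in zip(prev_coeffs, new_coeffs)]
    current = list(prev_coeffs)  # plays the role of interpolation value register 36
    output = []
    for _ in range(n_samples):
        # Accumulate the increment each sample (the role of adder 41).
        current = [c + s for c, s in zip(current, steps)]
        output.append(list(current))
    return output
```

With `shift=2`, a coefficient moving from 0.0 to 4.0 takes the values 1.0, 2.0, 3.0, and 4.0 over the four samples of the frame, so the last sample lands exactly on the new frame's coefficient.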

At the start of speech synthesis, on the other hand, that is, when the start signal ST changes from "0" to "1", the conversion result output from the conversion section 16 at the first frame pulse FP is not interpolated; gate 42 is opened and the result is sent as is to the digital filter section 19. The start control circuit 31 starts synthesis of this first frame by controlling gates 42 and 43. The start control circuit 31 determines, on the basis of the timing pulse t₄, whether the start signal ST has been set, fixes the initialization time from the frame pulse FP and the sampling clock D, and outputs "0" to each circuit. CRL is the control register load signal that sets the start signal, and MR is a reset signal that resets the flip-flops of the speech synthesis circuit from outside.

FIG. 4 is a block diagram of the main part of the start control circuit 31. FF₁ is a holding-type flip-flop; FF₂, FF₃, and FF₉ are delay-type flip-flops operated by the basic clock t₄′; FF₃, FF₄, FF₅, and FF₇ are delay-type flip-flops operated by the sample clock D; FF₆ is a delay-type flip-flop operated by the basic clock t₄; FF₁₀ is an asynchronous flip-flop; 51, 54, and 60 are inverters; 52, 55, 56, 58, and 59 are AND gates; 53 is an OR gate; 57 is a NAND gate; 61 is an inhibit gate; and RDF is a data ready signal, generated in the interface section, indicating that the speech data is complete.

When the start signal ST is set by the control register load signal CRL and held in FF₁ while speech data has not yet been set in the data buffer stack, gate 56 stays open from the input of the basic clock pulses t₄′ and t₄″ until FF₃, FF₄, and FF₅ have been driven in turn by the sample clock D, and the initialization signal INI is sent to the control section 21 and to the INI input of FF₁₀. Because the start signal ST reaches FF₅ on the second sample clock, the initialization signal INI is output for two to three samples. When speech data is set after INI has been output, the data ready signal RDF is set and gates 59 and 58 open, sending a start signal to the digital filter section 19, the control section, and so on to begin speech synthesis. In response to this start signal ST, the control section generates a frame pulse FP, which is input to flip-flop FF₉ and also to gate 61. Because FF₁₀ has been set by the initialization signal INI, gate 42 of the interpolation section in FIG. 3 is held open until the output of gate 61 goes to "1" at the trailing edge of the frame pulse FP and resets FF₁₀. When FF₁₀ is reset by the reset signal R, gate 43 opens and interpolation begins; in the first frame, however, the difference value is "0", so the conversion result is in effect sent to the digital filter section 19 at every sample.
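The first-frame behaviour can be modelled in software as a coefficient path that skips interpolation when it holds no valid previous frame. The Python sketch below is a behavioural illustration, not a model of the gate-level circuit: `None` stands in for the undefined register contents after start-up, the frame is assumed to contain `2**shift` samples, floating point replaces fixed-point arithmetic, and the class name `CoefficientPath` and its members are hypothetical.

```python
from typing import List, Optional, Sequence


class CoefficientPath:
    """Delivers per-sample filter coefficients, bypassing interpolation on frame 1."""

    def __init__(self, shift: int) -> None:
        self.n_samples = 1 << shift  # assumed: samples per frame is 2**shift
        # None models "no valid previous frame" right after initialization.
        self.held: Optional[List[float]] = None

    def frame(self, target: Sequence[float]) -> List[List[float]]:
        if self.held is None:
            # First frame: pass the conversion result straight through, so
            # nothing is interpolated from undefined register contents.
            self.held = list(target)
            return [list(target) for _ in range(self.n_samples)]
        # Later frames: linear interpolation from the held value to the target.
        step = [(t - h) / self.n_samples for t, h in zip(target, self.held)]
        output = []
        for _ in range(self.n_samples):
            self.held = [h + s for h, s in zip(self.held, step)]
            output.append(list(self.held))
        return output
```

In this model the first call to `frame` returns the same coefficient set for every sample, which matches the statement that the difference value is "0" in the first frame; the second call ramps linearly toward the next frame's coefficients.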

If speech data has been input to the data buffer stack before the start signal ST is input, the data ready signal RDF is already set. The first frame pulse is therefore generated at the same moment the initialization signal INI ends (for example, goes from "1" to "0"), and the conversion result is sent to the interpolation section and also to the digital filter.

In this way, an initialization signal is generated for at least two sampling pulse periods after flip-flop FF₁ is set (for example, to "1"). If data reception has completed by the time initialization completes, synthesis of the first frame starts at the same moment initialization completes; if data reception has not completed by then, the first frame starts at the first sampling pulse after data reception completes.
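The start-timing rule summarized above reduces to a small decision, sketched here in Python. Times are measured in sampling periods from the moment the start signal is set; the function name `first_frame_start`, the parameter names, and the default of two initialization samples are assumptions for illustration, and a data-ready time that falls exactly on a sampling pulse is treated as starting on that pulse, which the text does not specify.

```python
import math


def first_frame_start(data_ready_time: float, init_samples: int = 2) -> int:
    """Return the sampling pulse index at which the first frame starts.

    data_ready_time: when data reception completes, in sampling periods after
        the start signal is set (negative if data arrived before the start signal).
    init_samples: length of the initialization period in sampling pulses.
    """
    init_done = init_samples
    if data_ready_time <= init_done:
        # Data was already there: start as soon as initialization completes.
        return init_done
    # Data arrived late: start at the first sampling pulse after it is complete.
    return math.ceil(data_ready_time)
```

Under these assumptions, data loaded ahead of the start signal gives a start at pulse 2, while data arriving 7.3 sampling periods after the start signal gives a start at pulse 8.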

FIG. 5 is a time chart for the case where data is received before the speech synthesis circuit is started. The data request signal REQ goes low each time data is read out of the data buffer stack at a frame pulse, thereby requesting data (DATA), and goes high when data has been stored in the data buffer stack. The fault signal goes low if, after the start signal ST has been input, no data is set while a data request is pending. (FIG. 5 shows an example in which it remains at the high level H throughout.)

Here, a is the synthesis-stopped state. Because data has been set before the start signal is input, the internal initialization b is carried out in the short time before the first frame pulse is generated. Synthesis then proceeds in order: conversion c of the first data #1; digital filter operation d on the conversion result of data #1; conversion of the second data #2 and calculation e of the #1→#2 difference; interpolation of the #1→#2 data and digital filter operation f; conversion of data #3 and calculation g of the #2→#3 difference; and interpolation of the #2→#3 data and digital filter operation h.

FIG. 6 is a time chart for the case where data is received after the speech synthesis circuit is started.

The symbols used here have the same meanings as in FIG. 5. The difference from FIG. 5 is that the speech data is stored in the data buffer stack after the start signal ST has been set.

The fault signal is at the low level until data is stored, and goes high together with the data request signal once data has been set. For the first synthesis, as explained with reference to FIG. 4, initialization is performed for two to three sampling periods after the start signal ST is set; when data #1 is subsequently stored, the first frame pulse is generated and speech synthesis starts. As in FIG. 5, synthesis at this first frame pulse outputs the conversion result of data #1 to the digital filter and performs filter operation d; subsequent synthesis performs the data difference calculation, interpolation, and digital filter operations f and h.

As described above, according to the present invention, the speech data converted in the first frame at the start of synthesis is sent to the digital filter section without interpolation or similar processing. Nothing is interpolated from the random initial contents of the flip-flop circuits and the like, so no abnormal signal is generated. In addition, initialization is performed for two to three sample periods after the start signal is input; if data has been set in advance, a frame pulse is generated immediately, and if data is set later, the first frame pulse is generated at the first sampling pulse after reception of that data completes. The speech synthesis start operation is therefore very fast, the start time does not vary, and speech synthesis starts stably.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of a system that performs speech synthesis under the control of a microprocessor; FIG. 2 is a block diagram of the speech synthesis circuit; FIG. 3 is a block diagram of the main part of an embodiment of the present invention; FIG. 4 is a block diagram of the main part of the start control circuit of the embodiment; and FIGS. 5 and 6 are time charts for explaining the speech synthesis start method of the present invention, FIG. 5 showing the case where data is received before the speech synthesis circuit is started and FIG. 6 the case where data is received after it is started.

D₀ to D₇: speech data; 11: data buffer stack; 16: conversion section; 18: sound source section; 17: interpolation section; 19: digital filter; 31: start control circuit; DL: data load pulse; ST: start signal; FP: frame pulse; D: sampling clock; t₁ to t₄: basic clocks.

Claims

[Scope of Claims]

1. A speech synthesis start method for a speech synthesis device using a parameter coding method, the device reading speech data and synthesizing speech from parameterized speech features through processing such as conversion, interpolation, and arithmetic on the speech data, characterized in that: the speech synthesis circuit is initialized for a predetermined period by a start signal that activates the speech synthesis circuit from outside; when the speech data has been received before completion of the initialization, the speech synthesis operation is started upon completion of the initialization; and when the speech data is received after completion of the initialization, the speech synthesis operation is started upon completion of reception of that speech data.

2. The speech synthesis start method according to claim 1, characterized in that, in the speech synthesis operation immediately after the start of the speech synthesis operation, the result of converting the speech data is input to a digital filter section and subjected to arithmetic processing without interpolation.