JPH0363697A

JPH0363697A - speech synthesizer

Info

Publication number: JPH0363697A
Application number: JP1199249A
Authority: JP
Inventors: Hiroya Fujisaki; 藤崎　博也; Keikichi Hirose; 広瀬　啓吉; Mikio Yamaguchi; 山口　幹雄
Original assignee: Sumitomo Electric Industries Ltd
Current assignee: Sumitomo Electric Industries Ltd
Priority date: 1989-08-02
Filing date: 1989-08-02
Publication date: 1991-03-19

Abstract

(57) [Abstract] This publication contains application data filed before electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention] [Field of Industrial Application] The present invention relates to a speech synthesizer that synthesizes speech, and in particular to a speech synthesizer that can be used in a text-to-speech device or a rule-based speech synthesis device that takes sentences or phonetic symbols as input and outputs them as speech.

[Prior Art] As an example of a conventional speech synthesizer, FIG. 3 shows the circuit configuration of the speech synthesizer proposed by D. H. Klatt in "Software for a cascade/parallel formant synthesizer," The Journal of the Acoustical Society of America 67(3), Mar. 1980, pp. 971-995.

In FIG. 3, when a voiced sound is synthesized, the impulse waveform generator 21 generates the source waveform for the voiced sound, which in this example is an impulse waveform signal.

Resonators 22, 23, and 24 give this impulse waveform signal a frequency characteristic that depends on the type of sound to be generated, and intensity control units 25 and 26 variably set the signal intensity. The resulting analog signal, which simulates the source waveform of a voiced sound (the glottal source waveform), is articulated by resonators 32 and 34 to 38 and anti-resonator 33, is then given a radiation characteristic by the first-order differentiator 55, and is output as a speech signal.
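The resonator stage in this chain can be sketched digitally. This is a minimal sketch, not the circuit of FIG. 3: it assumes the second-order digital resonator recurrence used in Klatt's paper, and the sampling rate, centre frequency, and bandwidth below are illustrative values that do not appear in this publication.

```python
import math

def resonator_coeffs(freq_hz, bw_hz, fs_hz):
    """Coefficients of y[n] = a*x[n] + b*y[n-1] + c*y[n-2].

    a is chosen so that a + b + c = 1, i.e. unity gain at 0 Hz.
    """
    t = 1.0 / fs_hz
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c
    return a, b, c

def resonate(x, freq_hz, bw_hz, fs_hz):
    """Run the samples in x through one resonator."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, fs_hz)
    y1 = y2 = 0.0  # previous two output samples
    out = []
    for s in x:
        y = a * s + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out

FS = 10000                          # assumed sampling rate in Hz
impulse = [1.0] + [0.0] * 199       # stand-in for the impulse source
response = resonate(impulse, 500.0, 60.0, FS)  # assumed 500 Hz peak, 60 Hz bandwidth
```

Feeding the output of one `resonate` call into the next gives a cascade of the kind the text describes.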

When an unvoiced sound is synthesized, on the other hand, the noise waveform generated by the random number generator 27 is output through the intensity modulator 28 and the low-pass filter 29. This analog output serves as the source signal for unvoiced sounds, that is, for the noise produced by turbulence when exhaled air passes through a narrow part of the vocal tract.
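As a rough illustration of the unvoiced source path (random numbers, intensity stage, low-pass filter), the sketch below uses a fixed gain and a one-pole low-pass filter. Both are simplifying assumptions: the publication does not specify the behaviour of intensity modulator 28 or the design of low-pass filter 29, and the gain and filter coefficient are made-up values.

```python
import random

def noise_source(n_samples, seed=0):
    """Uniform random samples in [-1, 1]; seeded so the result is repeatable."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]

def scale(x, gain):
    """Fixed-gain stand-in for the intensity stage (an assumption)."""
    return [gain * s for s in x]

def one_pole_lowpass(x, alpha):
    """y[n] = (1 - alpha) * x[n] + alpha * y[n-1], with 0 <= alpha < 1."""
    y_prev = 0.0
    out = []
    for s in x:
        y_prev = (1.0 - alpha) * s + alpha * y_prev
        out.append(y_prev)
    return out

unvoiced_source = one_pole_lowpass(scale(noise_source(500), gain=0.3), alpha=0.5)
```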

Next, a signal processing path is selected according to the type of sound to be generated, for example an aspirated sound or a fricative. The intensity of the analog signal is adjusted by intensity control unit 30 or 31, and within the selected path the signal is articulated by the resonators and anti-resonators selected for the phoneme to be generated. The first-order differentiator 55 then adds the radiation characteristic and the signal is output.

When a plosive is generated, whether voiced or unvoiced, a signal generator (not shown) generates a step-like waveform that decays exponentially, as shown in FIG. 4, and this signal is added to the source signal as the signal representing the burst portion of the plosive.
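The conventional burst signal, a step that decays exponentially, can be written down in a few lines. This is only a sketch of the waveform shape; the onset position and time constant are assumed values, since FIG. 4 is not reproduced here and the text gives no numbers.

```python
import math

def decaying_step(n_samples, onset, tau_samples, amplitude=1.0):
    """Zero before `onset`, then a jump to `amplitude` that decays as exp(-n/tau)."""
    out = []
    for n in range(n_samples):
        if n < onset:
            out.append(0.0)
        else:
            out.append(amplitude * math.exp(-(n - onset) / tau_samples))
    return out

# Assumed values: 100 samples, step at sample 10, time constant of 20 samples.
burst = decaying_step(n_samples=100, onset=10, tau_samples=20.0)
```

Because the same waveform is used for every plosive, its spectrum is the same whether the target is [p], [t], or [k].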

In this way, the speech signal produced by the electrical circuit described above, which simulates the process of human speech production, was output as sound from a loudspeaker or similar device.

[Problem to Be Solved by the Invention] In conventional speech synthesizers of this type, however, consonants that contain a plosive burst, such as [p], [t], and [k], do not match natural human speech well.

This point is explained in detail below.

FIG. 5 shows the speech waveform of the phoneme /pa/ extracted from natural speech, and FIG. 6 shows the frequency characteristics of its burst portion.

FIG. 7 shows the speech waveform of the phoneme /ta/ extracted from natural speech, and FIG. 8 shows the frequency characteristics of its burst portion.

FIG. 9 shows the speech waveform of the phoneme /ka/ extracted from natural speech, and FIG. 10 shows the frequency characteristics of its burst portion.

As FIGS. 6, 8, and 10 show, the frequency components of the burst waveform differ with the type of consonant. The conventional device ignores this difference and represents the burst portion with the single waveform shown in FIG. 4, and as a result the quality of the synthesized consonants is degraded.

An object of the present invention is therefore to eliminate this problem and to provide a high-quality speech synthesizer whose synthesized consonants are closer to natural speech.

[Means for Solving the Problem] To achieve this object, the present invention is characterized by comprising: first waveform generating means for generating the source waveform of a plosive; frequency component indicating means for indicating, according to the type of plosive to be generated, the frequency components to be emphasized and/or attenuated in the plosive source waveform generated by the first waveform generating means; and second waveform generating means for generating the waveform of the burst portion of the plosive by emphasizing and/or attenuating, in the plosive source waveform generated by the first waveform generating means, the frequency components indicated by the frequency component indicating means.

[Operation] The present invention focuses on the fact that the frequency characteristics of the burst portion of a plosive in natural speech differ with the type of plosive. For each type of plosive, the indicating means indicates the frequency components corresponding to that type, and the second waveform generating means emphasizes and attenuates the indicated frequency components of the plosive source waveform to generate the waveform of the burst portion. As a result, a plosive closer to natural human speech can be synthesized than with the conventional approach of using a waveform of fixed shape for the burst portion.
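The idea of choosing the emphasized and attenuated components by plosive type can be sketched as a lookup table that drives a chain of resonators and anti-resonators. Everything numeric here is hypothetical: the publication gives no centre frequencies or bandwidths, so the `BURST_SHAPING` table only illustrates that each plosive type selects its own settings. The filter recurrences are the standard second-order resonator and its inverse, assumed here rather than taken from the text.

```python
import math

FS = 10000  # assumed sampling rate in Hz

# Hypothetical (kind, centre frequency in Hz, bandwidth in Hz) per plosive type.
BURST_SHAPING = {
    "p": [("anti", 3000.0, 400.0)],
    "t": [("res", 3500.0, 300.0)],
    "k": [("res", 1800.0, 200.0), ("anti", 3500.0, 400.0)],
}

def _coeffs(freq_hz, bw_hz):
    t = 1.0 / FS
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    return 1.0 - b - c, b, c

def resonator(x, freq_hz, bw_hz):
    """Emphasize a band: y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    a, b, c = _coeffs(freq_hz, bw_hz)
    y1 = y2 = 0.0
    out = []
    for s in x:
        y = a * s + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def anti_resonator(x, freq_hz, bw_hz):
    """Attenuate a band: the exact inverse of resonator() with the same settings."""
    a, b, c = _coeffs(freq_hz, bw_hz)
    x1 = x2 = 0.0
    out = []
    for s in x:
        out.append((s - b * x1 - c * x2) / a)
        x2, x1 = x1, s
    return out

def shape_burst(source, plosive):
    """Apply the stages selected for this plosive type to the burst source."""
    signal = list(source)
    for kind, freq_hz, bw_hz in BURST_SHAPING[plosive]:
        stage = resonator if kind == "res" else anti_resonator
        signal = stage(signal, freq_hz, bw_hz)
    return signal

source = [1.0] + [0.0] * 63  # impulse-like plosive source
bursts = {p: shape_burst(source, p) for p in BURST_SHAPING}
```

In this sketch the table plays the role of the indicating means and `shape_burst` plays the role of the second waveform generating means; the same source yields a different burst waveform for each of the three plosive types.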

[Embodiment] An embodiment of the present invention is described in detail below with reference to the drawings.

FIG. 1 shows the basic configuration of an embodiment of the present invention.

In FIG. 1, reference numeral 1000 denotes a sound source section, which consists of a glottal source waveform generator 1, a fricative source waveform generator 2, and a plosive source waveform generator 3.

The glottal source waveform generator 1 is activated by an operation signal from an external device when a voiced sound is to be generated, and generates a signal having a glottal source waveform. In this example, the glottal source waveform is a waveform whose intensity can be expressed as a polynomial function of time.

The fricative source waveform generator 2 is activated by the operation signal when a fricative is to be generated, and generates a signal having a fricative source waveform. In this example a typical random waveform is used.

The plosive source waveform generator 3, which serves as the first waveform generating means, is activated by the operation signal when a plosive is to be generated, and generates a signal having a plosive source waveform. In this example a waveform with a step-like change is used.

Reference numeral 2000 denotes an articulation section, which articulates the source signal generated in the sound source section 1000 according to the type of phoneme to be synthesized. The articulation section 2000 consists mainly of a buzz-bar/nasal branch 11 and its intensity control unit 5, an aspirated/voiced-sound branch 12 and its intensity control units 6 and 7, a fricative branch 13 and its intensity control unit 8, and a burst-portion branch 14, serving as the second waveform generating means, with its intensity control unit 9 (the indicating means).

The buzz-bar/nasal branch 11 receives the glottal source waveform signal through intensity control unit 5 and articulates buzz bars and nasals by emphasizing or attenuating specific frequency components of that signal.

The aspirated/voiced-sound branch 12 receives, according to the type of phoneme to be synthesized, the glottal source waveform signal when a voiced sound is generated and the fricative source waveform signal when an aspirated sound is generated, and articulates aspirated and voiced sounds.

The fricative branch 13 receives the fricative source waveform signal and articulates fricatives.

The burst-portion branch 14 receives the plosive source waveform signal and articulates plosives.

Intensity control units 5, 6, 7, 8, and 9 receive parameters specific to the phoneme to be synthesized from an external device such as a microcomputer, variably set the intensity of the input source signal, and select the resonators and/or anti-resonators corresponding to the type of phoneme.

The modulator 4 amplitude-modulates the fricative source waveform signal in synchronization with the fundamental period of the glottal source waveform. When no glottal source waveform signal is output, that is, when the vocal cords are not vibrating, the modulator 4 does not perform amplitude modulation. The signals output from branches 11, 12, 13, and 14 are output to the radiation characteristic section 3000 through the adder 15. The radiation characteristic section 3000 emphasizes the high-frequency range of its input and outputs a speech signal to which the radiation characteristic has been added.
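The behaviour of the modulator can be illustrated as follows. This is a sketch under assumptions: the publication states only that the fricative source is amplitude-modulated in synchronization with the fundamental period and left alone when the vocal cords are not vibrating. The modulation law used here (a gain that follows the magnitude of the glottal source, with a hypothetical `depth` parameter) is invented for illustration.

```python
import random

def modulate_fricative(noise, glottal, depth=0.5):
    """Amplitude-modulate `noise` in step with `glottal`.

    The gain runs from (1 - depth) where the glottal source is zero up to 1.0
    at its peak. If the glottal source is silent, the noise passes unchanged.
    Both sequences are assumed to have the same length.
    """
    peak = max((abs(g) for g in glottal), default=0.0)
    if peak == 0.0:
        return list(noise)
    return [s * (1.0 - depth + depth * abs(g) / peak)
            for s, g in zip(noise, glottal)]

rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(8)]
glottal = [0.0, 0.5, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0]  # one made-up pitch period

voiced_fricative = modulate_fricative(noise, glottal)
voiceless_fricative = modulate_fricative(noise, [0.0] * 8)
```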

FIG. 2 shows an example of a specific circuit configuration for the circuit shown in FIG. 1. In this circuit, the radiation characteristic is applied in each source signal generator.

In FIG. 2, reference numeral 101 denotes a polynomial waveform generator, which in this example generates a glottal source waveform signal obtained by applying high-frequency emphasis to a fourth-order polynomial waveform. A waveform obtained from another formula, or a waveform extracted from natural speech by analysis, can also be used.
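A fourth-order polynomial pulse of the kind mentioned here might look like the sketch below. The particular polynomial, 16 t^2 (1 - t)^2 over the open phase, is a hypothetical choice: the publication does not give its coefficients, and the high-frequency emphasis step is left out.

```python
def glottal_pulse(n_open, n_period):
    """One pitch period of a polynomial glottal pulse.

    During the open phase (the first n_open samples) the value is
    16 * t^2 * (1 - t)^2 with t = n / n_open, which is zero at both ends
    and peaks at 1.0 for t = 0.5. The closed phase is zero.
    """
    out = []
    for n in range(n_period):
        if n < n_open:
            t = n / n_open
            out.append(16.0 * t * t * (1.0 - t) ** 2)
        else:
            out.append(0.0)
    return out

# Assumed timing: 100-sample period with a 40-sample open phase.
pulse = glottal_pulse(n_open=40, n_period=100)
```

Repeating `pulse` back to back gives a voiced source at a fixed pitch; changing `n_period` changes the pitch.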

Reference numeral 102 denotes a random number generator, which generates a random waveform from random numbers and outputs a fricative source waveform signal with high-frequency emphasis applied.

Reference numeral 103 denotes an impulse waveform generator, which generates a plosive source waveform signal obtained by applying high-frequency emphasis to a step waveform, that is, an impulse waveform.
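Why an emphasized step becomes an impulse can be seen with a first difference, a common discrete stand-in for a first-order differentiator. Treating the high-frequency emphasis as a first difference is an assumption made for this sketch; the publication does not give the emphasis filter.

```python
def first_difference(x):
    """y[n] = x[n] - x[n-1], taking x[-1] as 0."""
    out = []
    prev = 0.0
    for s in x:
        out.append(s - prev)
        prev = s
    return out

step = [0.0] * 3 + [1.0] * 5
impulse = first_difference(step)
# impulse is [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
```

The only nonzero output sample sits at the position where the step changes value.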

In each of the buzz-bar/nasal branch 11, the aspirated/voiced-sound branch 12, the fricative branch 13, and the burst-portion branch 14 shown in FIG. 1, resonators that emphasize a predetermined frequency region of the input signal and anti-resonators that attenuate a predetermined frequency region are connected in cascade and operate as instructed by the corresponding control unit. The aspirated/voiced-sound branch 12 in particular is provided with resonators 126 and 128 and anti-resonators 127 and 129 for creating waveforms that exhibit the pole-zero pairs of vowels.

Next, the operation of the circuit shown in FIG. 2 is described.

Waveform generators 101 to 103 are activated by operation signals according to the speech to be synthesized, and the corresponding intensity control units adjust the intensity of the source signals.

The resonators and anti-resonators of each branch then emphasize and attenuate specific frequency bands, and the adder 15 outputs a speech signal that has undergone articulation processing and the addition of the radiation characteristic.

The number of resonators shown in FIG. 2 is suited to representing frequency components up to about 6 kHz. The number of resonators and anti-resonators can be increased or decreased according to how closely the synthesized signal is to simulate natural speech, for example finely or coarsely.

When the range of frequency components to be represented is narrower, for example up to 5 kHz, the number of resonators and anti-resonators can be reduced.

In addition to the embodiment described above, the following variations are possible.

1) In this embodiment the resonators and anti-resonators in each branch are connected in cascade in a fixed order, but the connection positions of the resonators and anti-resonators can also be made interchangeable. The connection order of the intensity control units, resonators, and anti-resonators can likewise be changed.

2) Besides being implemented with analog circuits, this embodiment can represent the waveforms as digital signals and perform the digital operations with dedicated digital circuits, or it can be implemented in software using a microprocessor, a digital signal processor, or the like.

Signals do not always flow through all of branches 11 to 14 of this embodiment. For example, the plosive branch 14 contributes to the synthesized sound only while the initial burst portion of a plosive is being synthesized. When speech synthesis is performed by digital processing, therefore, the processing of each of branches 11 to 14 can be implemented as a subroutine, and selecting only the subroutines needed for the type of phoneme to be generated shortens the overall speech synthesis time.
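The subroutine selection described here amounts to a dispatch table keyed by phoneme class. The sketch below uses placeholder branch functions and a hypothetical mapping from phoneme class to branches; the class names, the mapping, and the gains are all assumptions made for illustration, and only the branch numbers 11 to 14 come from the text.

```python
executed = []  # records which branch subroutines actually ran

def make_branch(number, gain):
    """Build a placeholder branch subroutine; a real one would run its filter chain."""
    def branch(source):
        executed.append(number)
        return [gain * s for s in source]
    return branch

BRANCH_11 = make_branch(11, 1.0)  # buzz bar / nasal
BRANCH_12 = make_branch(12, 1.0)  # aspirated / voiced
BRANCH_13 = make_branch(13, 1.0)  # fricative
BRANCH_14 = make_branch(14, 1.0)  # burst portion of a plosive

# Hypothetical mapping: which subroutines are needed for each phoneme class.
SUBROUTINES = {
    "nasal": (BRANCH_11,),
    "voiced": (BRANCH_12,),
    "aspirated": (BRANCH_12,),
    "fricative": (BRANCH_13,),
    "plosive_burst": (BRANCH_14,),
}

def synthesize_frame(source, phoneme_class):
    """Run only the selected branches and sum their outputs (the adder's role)."""
    outputs = [branch(source) for branch in SUBROUTINES[phoneme_class]]
    return [sum(samples) for samples in zip(*outputs)]

frame = synthesize_frame([1.0, 0.0, -1.0], "plosive_burst")
```

Only branch 14 runs for this frame, so the cost of the other three filter chains is avoided.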

[Effects of the Invention] As described above, the present invention focuses on the fact that the frequency characteristics of the burst portion of a plosive in natural speech differ with the type of plosive. For each type of plosive, the indicating means indicates the frequency components corresponding to that type, and the second waveform generating means emphasizes and attenuates the indicated frequency components of the plosive source waveform to generate the waveform of the burst portion. As a result, a plosive closer to natural human speech can be synthesized than with the conventional approach of using a waveform of fixed shape for the burst portion.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing the basic configuration of an embodiment of the present invention. FIG. 2 is a block diagram showing a specific circuit configuration of the embodiment. FIG. 3 is a block diagram showing the circuit configuration of a conventional example. FIG. 4 is a waveform diagram showing the shape of the decaying waveform of the conventional example. FIGS. 5, 7, and 9 are waveform diagrams showing natural speech waveforms used to explain the synthesized sound of the conventional example. FIGS. 6, 8, and 10 are characteristic diagrams showing the frequency components of the speech waveforms shown in FIGS. 5, 7, and 9, respectively.

1: glottal source waveform generator
2: fricative source waveform generator
3: plosive source waveform generator
4: modulator
5 to 9: intensity control units
11: buzz-bar/nasal branch
12: aspirated/voiced-sound branch
13: fricative branch
14: burst-portion branch
15: adder
101: polynomial waveform generator
102: random number generator
103: impulse waveform generator
111 to 115, 121 to 126, 128, 131, 133, 135, 141, 143, 145, 146: resonators
116, 127, 129, 132, 134, 142, 144: anti-resonators

Claims

[Scope of Claims] A speech synthesizer comprising: first waveform generating means for generating the source waveform of a plosive; frequency component indicating means for indicating, according to the type of plosive to be generated, the frequency components to be emphasized and/or attenuated in the plosive source waveform generated by the first waveform generating means; and second waveform generating means for generating the waveform of the burst portion of the plosive by emphasizing and/or attenuating, in the plosive source waveform generated by the first waveform generating means, the frequency components indicated by the frequency component indicating means.