JPH0236000B2

Info

Publication number: JPH0236000B2
Application number: JP57185197A
Authority: JP
Inventors: Satoru Taguchi
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1982-10-21
Filing date: 1982-10-21
Publication date: 1990-08-14
Also published as: JPS5974599A

Description

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a line spectrum vocoder, and in particular to improving the quantization and interpolation characteristics of line spectrum coefficients.

A vocoder is used, among other purposes, to convert a speech signal into a low-rate digital code sequence. It analyzes the speech signal at fixed intervals and extracts speech feature parameters, which consist of spectral parameters representing the vocal tract transfer characteristics and excitation (sound source) parameters representing vocal cord vibration and the like. These feature parameters are transmitted in place of the waveform, and the waveform is resynthesized from them. Information compression is achieved by exploiting two facts: the speech feature parameters change relatively slowly over time, and relatively few bits are needed to quantize them without distortion of practical significance.

Because it is desirable for a vocoder to transmit the speech feature parameters at intervals that are as long as possible, the feature parameters are generally interpolated on the synthesis side. FIGS. 1a and 1b are waveform diagrams illustrating the effect of interpolation: FIG. 1a shows the spectral distortion along the time axis when no interpolation is performed, and FIG. 1b shows it when interpolation is performed.

In FIG. 1a, curve 101 shows how the actual speech spectrum changes over time. Points 102 to 107 indicate the spectra corresponding to the speech feature parameters analyzed at fixed intervals. The staircase of line segments 108 is the reproduced speech spectrum obtained when each set of feature parameters is held for the length of one analysis period. The area of the hatched region enclosed by curve 101 and staircase 108 represents the magnitude of the distortion along the time axis when no interpolation is performed.

In FIG. 1b, curve 101 and points 102 to 107 are the same as in FIG. 1a. Polyline 109 is the reproduced speech spectrum obtained when the speech feature parameters are interpolated ideally, in the sense described below. The area of the hatched region enclosed by curve 101 and polyline 109 represents the magnitude of the distortion along the time axis when interpolation is performed. As FIGS. 1a and 1b show, ideal interpolation greatly reduces the distortion along the time axis compared with no interpolation.
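
The comparison between staircase 108 and polyline 109 can be illustrated numerically. The following is a minimal sketch, not taken from the patent: the trajectory `true_spectrum`, the sampling period of 0.5, and the use of plain linear interpolation on a single scalar value are all assumptions chosen only to show how the two hatched areas might be compared.

```python
import math

def true_spectrum(t):
    # Hypothetical smooth trajectory standing in for curve 101.
    return math.sin(t)

def reproduced(t, period, interpolate):
    """Value rebuilt from samples of true_spectrum taken every `period`."""
    k = math.floor(t / period)
    left = true_spectrum(k * period)
    if not interpolate:
        return left                      # staircase 108: hold the last sample
    right = true_spectrum((k + 1) * period)
    frac = t / period - k
    return left + frac * (right - left)  # polyline 109: linear interpolation

def distortion_area(period, interpolate, duration=6.0, steps=6000):
    """Midpoint-rule estimate of the area between the true and rebuilt curves."""
    dt = duration / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += abs(true_spectrum(t) - reproduced(t, period, interpolate)) * dt
    return total

hold_area = distortion_area(0.5, interpolate=False)
interp_area = distortion_area(0.5, interpolate=True)
print(f"hold: {hold_area:.3f}  interpolated: {interp_area:.3f}")
```

For this assumed trajectory the interpolated area comes out well below the held area, which is the qualitative point made by FIGS. 1a and 1b.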

Next, the ideal interpolation mentioned above is explained. FIG. 2 is a spectrum diagram illustrating ideal interpolation. In the figure, solid line 201 is the power spectrum envelope at time t₁, and solid line 202 is the power spectrum envelope at time t₂. Dotted line 203 is the power spectrum envelope at time (t₁ + t₂)/2 obtained by ideal interpolation. Dotted line 203 is calculated as follows: when the power density at frequency f₁ is P₁(f₁) at time t₁ and P₂(f₁) at time t₂, the logarithmic power density at time (t₁ + t₂)/2 is taken to be (Log(P₁(f₁)) + Log(P₂(f₁)))/2. Provided that (t₂ − t₁) is reasonably short, the power spectrum envelope obtained by this interpolation approximates comparatively well the power spectrum envelope that would actually be analyzed at time (t₁ + t₂)/2.
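
The midpoint calculation for dotted line 203 can be sketched in a few lines. This is an illustrative sketch only: it assumes that the interpolated logarithmic power density is the average of the two logarithmic densities at each frequency, and the function name and sample values are hypothetical.

```python
import math

def ideal_midpoint_envelope(p1, p2):
    """Interpolate two power spectrum envelopes in the log domain.

    p1, p2: power densities at times t1 and t2, sampled at the same
    frequencies. Returns the assumed envelope at time (t1 + t2) / 2.
    """
    if len(p1) != len(p2):
        raise ValueError("envelopes must cover the same frequencies")
    # Averaging the logarithms is the same as taking the geometric mean.
    return [math.exp(0.5 * (math.log(a) + math.log(b))) for a, b in zip(p1, p2)]

# Hypothetical envelopes at three frequencies.
envelope_t1 = [1.0, 4.0, 9.0]
envelope_t2 = [4.0, 16.0, 1.0]
midpoint = ideal_midpoint_envelope(envelope_t1, envelope_t2)
print(midpoint)
```

Working in the log domain means that a peak that rises by a given number of decibels between t₁ and t₂ is placed halfway in decibels at the midpoint, rather than halfway in linear power.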

Normally, the spectral envelope at each observation time is not represented directly; instead it is represented, for example, by the coefficients of a filter whose frequency response matches the spectral envelope. Known representations of the spectral envelope by filter coefficients based on P-th order linear predictive analysis (P being an integer) include α parameters, K parameters, and line spectrum frequencies. A vocoder normally obtains parameters such as line spectrum frequencies, which represent the spectral envelope indirectly, on the analysis side and interpolates them on the synthesis side. In general, the spectral envelope corresponding to the result of parameter-level interpolation does not coincide with the ideal interpolation.

One object of the present invention is to obtain, in a simple manner, a result close to ideal interpolation when interpolating line spectrum coefficients.

Next, the quantization of the feature parameters is described briefly. Given the purpose of a vocoder, the feature parameters must be quantized with as few bits as possible; on the other hand, reducing the number of quantization bits increases the distortion of the spectral envelope caused by quantization. This distortion can be reduced to some extent by so-called nonlinear quantization, in which the quantization step size is varied to take account of the distribution of the parameters and of the fact that sensitivity to quantization depends on where the parameter lies. However, the choice of the domain in which the parameters are interpolated is also an important factor.

Another object of the present invention is to improve the method of quantizing line spectrum frequencies.

The relative merits of the α parameters, K parameters, and line spectrum frequencies obtained by linear predictive analysis have long been discussed with respect to quantization sensitivity and interpolation characteristics. As for quantization sensitivity, the α parameters are regarded as the most sensitive (in other words, the least suited to quantization), followed by the K parameters, with the line spectrum frequencies the least sensitive. As for interpolation characteristics, the K parameters give the largest distortion along the time axis (that is, they are the least suited to interpolation), followed by the α parameters, with the line spectrum frequencies giving the smallest. Many recent vocoders therefore use line spectrum frequencies, which are more advantageous than the other parameters in both quantization and interpolation.

Although line spectrum frequencies are superior in both respects, as described above, they are not perfect and have some problems of their own. A typical problem is that the P-th order (P being an integer) line spectrum frequencies ω₁ to ω_P are not mutually independent. For example, the stability of a synthesis filter based on the K parameters K₁ to K_P is guaranteed if each parameter satisfies |K_i| < 1 (i = 1, 2, …, P), whereas the stability of a synthesis filter based on the line spectrum frequencies ω₁ to ω_P requires 0 < ω₁ < ω₂ < … < ω_P < π (in radians). In addition, a synthesis filter based on the line spectrum frequencies ω₁ to ω_P has extremely high selectivity when adjacent coefficients ω_i and ω_{i+1} satisfy ω_i ≈ ω_{i+1} (i = 1, 2, …, P − 1).
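
The difference between the two stability conditions can be made concrete with a short check. This is a minimal sketch; the function names and the sample values are hypothetical. It shows that the K-parameter condition can be tested one parameter at a time, while the line spectrum frequency condition ties each value to its neighbors.

```python
import math

def k_parameters_stable(k):
    """Each K parameter is checked on its own: |K_i| < 1."""
    return all(abs(ki) < 1.0 for ki in k)

def lsf_stable(omega):
    """Line spectrum frequencies must satisfy 0 < w1 < w2 < ... < wP < pi."""
    if not omega:
        return False
    bounded = 0.0 < omega[0] and omega[-1] < math.pi
    increasing = all(a < b for a, b in zip(omega, omega[1:]))
    return bounded and increasing

# Hypothetical 4th-order examples.
print(k_parameters_stable([0.9, -0.5, 0.2, 0.1]))  # every |K_i| < 1
print(lsf_stable([0.30, 0.34, 1.00, 1.40]))        # ordered
print(lsf_stable([0.34, 0.30, 1.00, 1.40]))        # first pair swapped
```

Because the ordering constraint couples neighboring frequencies, an error in one value can violate the condition even when every value lies between 0 and π.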

FIG. 3 shows 8th-order line spectrum frequencies obtained by analyzing approximately 1.9 seconds of speech. FIG. 3 shows, for example, that there are many places where the interval between ω₁ and ω₂ is extremely narrow. It follows that if ω₁ and ω₂ are quantized independently, even a slight quantization error in each changes the interval between them by a relatively large amount, which causes the selectivity of the synthesis filter to fluctuate greatly. Furthermore, because a single line spectrum frequency does not by itself correspond to a parameter with a clear physical meaning, such as a formant, its behavior over time is unclear, and linear interpolation or the like does not necessarily give adequate interpolation characteristics.
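
A small numerical illustration of this problem follows. It is only a sketch: the step size of 0.05 rad and the pair (0.33, 0.37) are hypothetical values chosen so that the two frequencies are close together, as in the narrow regions of FIG. 3.

```python
def quantize(value, step):
    """Uniform (linear) quantizer: snap to the nearest multiple of `step`."""
    return step * round(value / step)

STEP = 0.05            # hypothetical step size in radians
w1, w2 = 0.33, 0.37    # hypothetical closely spaced pair

q1 = quantize(w1, STEP)
q2 = quantize(w2, STEP)

true_gap = w2 - w1     # about 0.04
coded_gap = q2 - q1    # both values land on the same level
print(true_gap, coded_gap)
```

Each frequency moves by less than half a step, yet the interval between them collapses from about 0.04 to zero, so the condition ω₁ < ω₂ is no longer met after independent quantization.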

On the other hand, the relationship between ω₁ and ω₂ in FIG. 3 makes it clear that a highly selective pole exists near the midpoint between ω₁ and ω₂, and this pole is considered to correspond to the first formant.

Because conventional line spectrum vocoders quantize the line spectrum frequencies independently, they have a first drawback: quantization errors change, for example, the interval between ω₁ and ω₂ by a relatively large amount, so the selectivity of the synthesis filter fluctuates greatly. Because they also interpolate the line spectrum frequencies independently, they have a second drawback: the interpolated values do not necessarily correspond to formants and the like, so fully satisfactory interpolation characteristics are not obtained.

Accordingly, an object of the present invention is to provide a high-quality line spectrum vocoder by improving the quantization and interpolation characteristics of the line spectrum frequencies.

In the line spectrum vocoder of the present invention, the analysis side treats each odd-order line spectrum frequency (ω₁, ω₃, …, ω_{P−1}) and the adjacent even-order line spectrum frequency (ω₂, ω₄, …, ω_P) as a pair, and its principal component is means for calculating and quantizing the intermediate value and the interval of the line spectrum frequencies constituting each pair. The principal component of the synthesis side is means for interpolating the intermediate value and the interval independently.
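
The pairing described above can be sketched as a transform to (intermediate value, interval) and back. This is an illustrative sketch under stated assumptions, not the patented implementation: the function names, the 4th-order sample values, and the step sizes (0.05 rad for intermediate values, 0.01 rad for intervals) are all hypothetical.

```python
def to_mid_interval(omega):
    """Split ordered line spectrum frequencies into per-pair mids and gaps."""
    if len(omega) % 2:
        raise ValueError("the order P must be even to form pairs")
    mids = [(omega[i] + omega[i + 1]) / 2 for i in range(0, len(omega), 2)]
    gaps = [omega[i + 1] - omega[i] for i in range(0, len(omega), 2)]
    return mids, gaps

def from_mid_interval(mids, gaps):
    """Rebuild the frequencies: odd = mid - gap/2, even = mid + gap/2."""
    omega = []
    for mid, gap in zip(mids, gaps):
        omega.extend([mid - gap / 2, mid + gap / 2])
    return omega

def quantize(value, step):
    """Uniform (linear) quantizer."""
    return step * round(value / step)

omega = [0.33, 0.37, 1.00, 1.30]                 # hypothetical values (rad)
mids, gaps = to_mid_interval(omega)
q_mids = [quantize(m, 0.05) for m in mids]       # hypothetical step sizes
q_gaps = [quantize(g, 0.01) for g in gaps]
restored = from_mid_interval(q_mids, q_gaps)
print(restored)
```

Since the interval is coded as its own quantity, the order within a pair is preserved whenever the quantized interval stays positive; the error in the intermediate value only shifts the pair as a whole.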

The present invention relates to a line spectrum vocoder. By quantizing and interpolating each adjacent odd-order and even-order line spectrum frequency together, it has a first effect of reducing the change in the interval between line spectrum frequencies caused by quantization, thereby improving the stability of the synthesis filter, and a second effect of giving good interpolation characteristics because the interpolated quantities correspond comparatively well to formants.

Next, an embodiment of the present invention is described with reference to the drawings. FIG. 4 is a block diagram illustrating the embodiment. The first embodiment consists of an analysis side 410, a transmission line 430, and a synthesis side 440.

The analysis side 410 consists of low-pass filter 1 (411), A/D converter 412, window processor 413, autocorrelation coefficient measuring unit 414, linear prediction analyzer 415, residual power calculator 416, line spectrum frequency analyzer 417, line spectrum frequency quantizer 418, residual power quantizer 419, pitch/v-uv analyzer 420, pitch/v-uv quantizer 421, and multiplexer 422.

The synthesis side 440 consists of demultiplexer 441, pitch/v-uv decoder 442, pulse generator 443, noise generator 444, v-uv changeover switch 445, variable gain amplifier 446, residual power decoder 447, line spectrum frequency decoder 448, intermediate frequency interpolator 449, frequency interval interpolator 450, line spectrum frequency restorer 451, LSP synthesis filter 452, D/A converter 453, and low-pass filter 2 (454).

A speech signal is input to low-pass filter 1 (411) through waveform input terminal 401. Low-pass filter 1 (411) band-limits the speech signal to, for example, 3.4 kHz and outputs it to A/D converter 412. A/D converter 412 samples the band-limited speech signal at, for example, 8 kHz, quantizes it, and outputs it to window processor 413. Window processor 413 forms blocks of, for example, 240 samples (30 ms) of the sampled and quantized speech signal and outputs them block by block to autocorrelation coefficient measuring unit 414 and pitch/v-uv analyzer 420. The blocks are output at a period of, for example, 20 ms.

Autocorrelation coefficient measuring unit 414 calculates the autocorrelation coefficients of the blocked speech signal from lag 0 to lag P (P being an integer). The lag-0 autocorrelation coefficient is the power of the blocked speech signal. Autocorrelation coefficient measuring unit 414 outputs the P + 1 autocorrelation coefficients from lag 0 to lag P to linear prediction analyzer 415 and the lag-0 autocorrelation coefficient to residual power calculator 416. Linear prediction analyzer 415 performs linear predictive analysis by the autocorrelation method on the P + 1 autocorrelation coefficients to calculate the P-th order α parameters and the normalized prediction residual power. It then outputs the normalized prediction residual power to residual power calculator 416 and the P-th order α parameters to line spectrum frequency analyzer 417. Residual power calculator 416 calculates the residual power from the lag-0 autocorrelation coefficient (that is, the power) supplied by autocorrelation coefficient measuring unit 414 and the normalized prediction residual power supplied by linear prediction analyzer 415, and outputs the result to residual power quantizer 419. Residual power quantizer 419 quantizes the residual power, for example by log-linear quantization, and outputs the result to multiplexer 422.

Line spectrum frequency analyzer 417 obtains the P-th order line spectrum frequencies ω₁, ω₂, …, ω_P (0 < ω₁ < ω₂ < … < ω_P < π) from the input P-th order α parameters, for example by the method described in Japanese Patent Application No. 57-114195, "Line spectrum type speech analysis and synthesis apparatus", and outputs these frequencies to line spectrum frequency quantizer 418. Line spectrum frequency quantizer 418 calculates the P/2 intermediate values of the line spectrum frequency pairs, (ω₁ + ω₂)/2, (ω₃ + ω₄)/2, …, (ω_{P−1} + ω_P)/2, and the P/2 intervals of the pairs, (ω₂ − ω₁), (ω₄ − ω₃), …, (ω_P − ω_{P−1}), and linearly quantizes each of these values independently. Line spectrum frequency quantizer 418 outputs the quantization results to multiplexer 422.
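
The chain from autocorrelation coefficients to α parameters and residual power can be sketched as follows. This is a hedged illustration, not the circuitry of units 414 to 416: it uses the Levinson-Durbin recursion as one common way to carry out the autocorrelation method, assumes the sign convention A(z) = 1 + Σ α_i z⁻ⁱ, and assumes that the residual power is the product of the lag-0 autocorrelation and the normalized prediction residual power. The function names and sample values are hypothetical.

```python
def autocorrelation(block, order):
    """Autocorrelation coefficients of one block for lags 0..order."""
    n = len(block)
    return [sum(block[i] * block[i + lag] for i in range(n - lag))
            for lag in range(order + 1)]

def levinson_durbin(r):
    """Return (alpha, normalized_residual) from r[0..P].

    alpha[i-1] is the coefficient of z^-i in A(z) = 1 + sum(alpha_i z^-i)
    (an assumed sign convention).
    """
    if r[0] <= 0.0:
        raise ValueError("block has no energy")
    order = len(r) - 1
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = sum(a[j] * r[m - j] for j in range(m))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, m):
            new_a[j] = a[j] + k * a[m - j]
        new_a[m] = k
        a = new_a
        err *= 1.0 - k * k                  # prediction error shrinks each step
    return a[1:], err / r[0]

# Hypothetical autocorrelation values of a first-order process.
r = [1.0, 0.5, 0.25]
alpha, normalized_residual = levinson_durbin(r)
residual_power = r[0] * normalized_residual  # assumed combination rule
print(alpha, normalized_residual, residual_power)
```

In the embodiment's example figures, `autocorrelation` would be applied to each 240-sample block, with a new block every 160 samples (20 ms at 8 kHz).
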

Pitch/v-uv analyzer 420 analyzes the blocked speech signal supplied by window processor 413 to obtain the pitch period and a voiced/unvoiced decision signal, and outputs the results to pitch/v-uv quantizer 421. Pitch/v-uv quantizer 421 quantizes the pitch period, for example linearly, and outputs the result to multiplexer 422. It handles the voiced/unvoiced decision signal by setting the pitch period to "0" when the speech is unvoiced. Multiplexer 422 multiplexes the quantized residual power information supplied by residual power quantizer 419, the quantized P/2 intermediate values and P/2 intervals of the line spectrum frequencies supplied by line spectrum frequency quantizer 418, and the quantized pitch period information supplied by pitch/v-uv quantizer 421, and outputs the multiplexed result to demultiplexer 441 through transmission line 430.

Demultiplexer 441 separates the multiplexed quantized data and outputs the quantized residual power information to residual power decoder 447, the quantized P/2 intermediate values and P/2 intervals of the line spectrum frequencies to line spectrum frequency decoder 448, and the quantized pitch period information to pitch/v-uv decoder 442. Pitch/v-uv decoder 442 decodes the quantized pitch period information and generates the voiced/unvoiced decision signal by judging the speech to be unvoiced if the decoded result is "0" and voiced otherwise. Pitch/v-uv decoder 442 outputs the decoded pitch period information to pulse generator 443 and the voiced/unvoiced decision signal to v-uv changeover switch 445. Pulse generator 443 generates a pulse train whose period corresponds to the supplied pitch period information and outputs it to v-uv changeover switch 445.

Noise generator 444 generates white noise and outputs it to v-uv changeover switch 445. The v-uv changeover switch 445 selects the pulse train if the voiced/unvoiced decision signal supplied by pitch/v-uv decoder 442 indicates voiced speech and the white noise if it indicates unvoiced speech, and outputs the selected signal to variable gain amplifier 446. Residual power decoder 447 decodes the quantized residual power and outputs the result to variable gain amplifier 446. Variable gain amplifier 446 amplifies the pulse train or white noise supplied through v-uv changeover switch 445 with a gain proportional to the square root of the residual power supplied by residual power decoder 447, and outputs the amplified signal to LSP synthesis filter 452.

Line spectrum frequency decoder 448 decodes the quantized P/2 intermediate values of the line spectrum frequencies supplied by demultiplexer 441 and outputs them to intermediate frequency interpolator 449. It also decodes the quantized P/2 intervals of the line spectrum frequencies supplied by demultiplexer 441 and outputs them to frequency interval interpolator 450. Intermediate frequency interpolator 449 linearly interpolates each of the P/2 intermediate values supplied by line spectrum frequency decoder 448 independently. The interpolation interval is, for example, one sampling period, 125 μs. Intermediate frequency interpolator 449 outputs the interpolated P/2 intermediate values to line spectrum frequency restorer 451. Frequency interval interpolator 450 linearly interpolates each of the P/2 intervals supplied by line spectrum frequency decoder 448 independently. The interpolation interval is, for example, one sampling period, 125 μs. Frequency interval interpolator 450 outputs the interpolated P/2 intervals to line spectrum frequency restorer 451.

Line spectrum frequency restorer 451 restores the P-th order line spectrum frequencies from the P/2 interpolated intermediate values supplied by intermediate frequency interpolator 449 and the P/2 interpolated intervals supplied by frequency interval interpolator 450, and outputs the restored line spectrum frequencies as filter coefficients to LSP synthesis filter 452. LSP synthesis filter 452 takes the signal supplied by variable gain amplifier 446 as input, resynthesizes the speech waveform using the line spectrum frequencies as filter coefficients, and outputs the resynthesized speech waveform to D/A converter 453. D/A converter 453 converts the resynthesized speech waveform, which is a sampled sequence, into a continuous speech waveform and outputs it to low-pass filter 2 (454). Low-pass filter 2 (454) removes the aliasing components contained in the continuous speech waveform and outputs the resynthesized speech to waveform output terminal 460.

It is clear that, in this embodiment, instead of linearly quantizing the line spectrum frequency intervals, nonlinear quantization that takes account of the quantization sensitivity of the intervals and of their distribution may be performed.
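
The work of interpolators 449 and 450 and restorer 451 can be sketched per sample. This is a minimal sketch under stated assumptions: 160 samples per frame (20 ms at 8 kHz, the example figures given in the embodiment), an interpolation fraction that starts at the previous frame's values, and hypothetical decoded values for two consecutive frames. The function names are illustrative only.

```python
def lerp(a, b, frac):
    """Linear interpolation between a and b."""
    return a + frac * (b - a)

def interpolated_lsf(prev, curr, samples_per_frame=160):
    """Yield one restored line spectrum frequency vector per sample.

    prev, curr: (mids, gaps) decoded for two consecutive frames.
    """
    prev_mids, prev_gaps = prev
    curr_mids, curr_gaps = curr
    for n in range(samples_per_frame):
        frac = n / samples_per_frame             # assumed alignment
        omega = []
        for pm, cm, pg, cg in zip(prev_mids, curr_mids, prev_gaps, curr_gaps):
            mid = lerp(pm, cm, frac)             # role of interpolator 449
            gap = lerp(pg, cg, frac)             # role of interpolator 450
            omega.extend([mid - gap / 2, mid + gap / 2])  # role of restorer 451
        yield omega

# Hypothetical decoded values (rad) for a 4th-order example.
prev_frame = ([0.35, 1.15], [0.04, 0.30])
curr_frame = ([0.45, 1.25], [0.08, 0.20])
track = list(interpolated_lsf(prev_frame, curr_frame))
print(track[0], track[80])
```

As long as both decoded intervals of a pair are positive, every interpolated interval is positive too, so the order within each pair is kept at every sample.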

[Brief explanation of drawings]

FIGS. 1a and 1b are waveform diagrams illustrating the effect of interpolation; FIG. 2 is a spectrum diagram illustrating ideal interpolation; FIG. 3 is a waveform diagram illustrating the change over time of line spectrum frequencies obtained by analyzing real speech; and FIG. 4 is a block diagram illustrating an embodiment of the present invention.

401: waveform input terminal; 410: analysis side; 411: low-pass filter 1; 412: A/D converter; 413: window processor; 414: autocorrelation coefficient measuring unit; 415: linear prediction analyzer; 416: residual power calculator; 417: line spectrum frequency analyzer; 418: line spectrum frequency quantizer; 419: residual power quantizer; 420: pitch/v-uv analyzer; 421: pitch/v-uv quantizer; 422: multiplexer; 430: transmission line; 440: synthesis side; 441: demultiplexer; 442: pitch/v-uv decoder; 443: pulse generator; 444: noise generator; 445: v-uv changeover switch; 446: variable gain amplifier; 447: residual power decoder; 448: line spectrum frequency decoder; 449: intermediate frequency interpolator; 450: frequency interval interpolator; 451: line spectrum frequency restorer; 452: LSP synthesis filter; 453: D/A converter; 454: low-pass filter 2; 460: waveform output terminal.

Claims

1. A line spectrum vocoder that analyzes and synthesizes speech by extracting line spectrum frequencies from linear prediction coefficients obtained by linear predictive analysis of an input speech signal, characterized in that the analysis side has means for treating adjacent odd-order and even-order line spectrum frequencies as pairs and for calculating and quantizing the intermediate value and the interval of the line spectrum frequencies constituting each pair, and the synthesis side has means for interpolating the intermediate value and the interval independently.