JPS6041100A

JPS6041100A - Multipulse type vocoder

Info

Publication number: JPS6041100A
Application number: JP58149007A
Authority: JP
Inventors: 哲田口
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1983-08-15
Filing date: 1983-08-15
Publication date: 1985-03-04
Also published as: JPH0242240B2

Abstract

(57) [Summary] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a multipulse vocoder. Vocoders are well known in which an input speech signal is analyzed, the spectral envelope information and the excitation (sound source) information that make up its speech information are extracted on the analysis side, and this information is sent over a transmission path to the synthesis side, where the input speech signal is reproduced.

The spectral envelope information mentioned above represents the spectral distribution of the vocal tract system that produces the input speech signal. It is usually expressed by as many LPC coefficients as the analysis order of the LPC analysis, for example α parameters or K parameters. The excitation information represents the fine structure of the spectral envelope and is known as the residual signal, that is, the input speech signal with the spectral distribution information removed. It contains the excitation strength, the pitch period, and the voiced/unvoiced decision of the input speech signal, and it is also well known that this information is usually extracted, for each analysis frame, via the autocorrelation coefficients of the input speech signal.

On the synthesis side of the vocoder, the spectral envelope information is used as the coefficients of an LPC synthesizer, which usually uses an all-pole digital filter to form an approximate vocal tract system. The excitation information is used as the driving excitation of this digital filter, and the digital filter synthesizes the input speech signal.
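As a rough illustration of this synthesis step, the sketch below drives an all-pole filter with an excitation sequence. The function name `lpc_synthesize` and the sign convention (each output sample equals the excitation sample plus a weighted sum of past outputs) are assumptions made for the example, not details taken from the publication.

```python
def lpc_synthesize(excitation, a):
    """Drive an all-pole filter 1 / (1 - sum_k a[k-1] * z^-k) with `excitation`.

    `a` holds the (assumed) predictor coefficients a_1 .. a_p.
    """
    out = []
    for n, d in enumerate(excitation):
        y = d
        # Recursive part: feedback from previously synthesized samples.
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                y += ak * out[n - k]
        out.append(y)
    return out


if __name__ == "__main__":
    # Impulse response of a one-pole filter with a_1 = 0.5.
    print(lpc_synthesize([1.0, 0.0, 0.0, 0.0], [0.5]))  # [1.0, 0.5, 0.25, 0.125]
```

Feeding a single unit impulse, as in the usage line, returns the impulse response of the filter, which is the quantity the later correlation-based pulse search works with.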

The conventional LPC vocoder obtained in this way can synthesize speech even at low bit rates of about 4 Kb (kilobits) or less and is widely used, but it has the drawback that high-quality speech synthesis is difficult even at high bit rates. The cause lies in how the excitation information is modeled. The model is a simple one: voiced sounds are approximated by a single impulse train at the extracted pitch period, and unvoiced sounds, which have random periods, are approximated by white noise. As a result, the excitation information of the input speech signal is not extracted faithfully, and the waveform information of the input speech signal contained in the excitation information is neither analyzed nor synthesized.

The multipulse vocoder has recently become well known as one of the vocoders that synthesize the input speech signal by transmitting waveform information, in order to overcome the problems caused by not transmitting the waveform.

Fig. 1 is a block diagram showing the basic configuration of the analysis side of a conventional multipulse vocoder.

The LPC synthesizer 1 includes an all-pole digital filter that simulates the vocal tract. Its coefficients are the LPC coefficients obtained when the LPC analyzer 2 analyzes, frame by frame, the input speech signal x(n) (n = 1, 2, 3, ...) entered through input terminal 2001.

The excitation pulse generator 3 derives, from the excitation information of the input speech signal, a driving excitation sequence V(n) consisting of a plurality of impulses, that is, multipulses, and supplies it as the driving excitation of the LPC synthesizer 1.

The LPC synthesizer 1 uses the LPC coefficients supplied in this way as the coefficients of its synthesis filter, usually an all-pole digital filter, is driven by the multipulses as its driving excitation, and outputs the synthesized signal x̂(n). Because the multipulses carry the waveform information of the input speech signal, the LPC synthesizer 1 synthesizes the input speech signal including its waveform information.

The synthesized signal x̂(n) output from the LPC synthesizer 1 is then subtracted from the input speech signal x(n) in the subtractor 4, and the resulting error e(n) is sent to the perceptual weighting unit 5.

The perceptual weighting unit 5 applies perceptual weighting to the error e(n) by means of a weighting filter having the characteristic W(z) shown in equation (1) below, and sends the result to the squared error minimizer 6.

$$W(z) = \frac{1 - \sum_{k=1}^{p} a_k z^{-k}}{1 - \sum_{k=1}^{p} a_k r^{k} z^{-k}} \tag{1}$$

In equation (1), a_k are the LPC coefficients to be used as the coefficients of the all-pole digital filter of the LPC synthesizer 1, p is the order of that filter and therefore the LPC analysis order, r is the weighting coefficient, and z is the variable z = exp(jλ) of the z-transform transfer function H(z^{-1}) of the all-pole digital filter. Here λ = 2πΔTf, where ΔT is the sampling period of the analysis frame and f is the frequency.

In equation (1), the weighting coefficient r is set in the range 0 ≤ r ≤ 1.

W(z) in equation (1) varies between W(z) = 1 for r = 1 and W(z) = 1 − P(z) for r = 0, where P(z) = Σ a_k z^{-k} (k = 1, ..., p) is the predictor polynomial. The value of r is chosen within this range according to how strongly the excessive signal level that appears in the formant regions of the frequency spectrum of the error e(n) is to be suppressed. It affects the perceived quality of the signal to be synthesized, and its optimum value is usually selected in advance by listening tests.
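A minimal sketch of such a weighting filter follows. It assumes the form W(z) = (1 − Σ a_k z^-k) / (1 − Σ a_k r^k z^-k), which reproduces the two limiting cases stated above (W(z) = 1 for r = 1 and W(z) = 1 − P(z) for r = 0). The function names are hypothetical.

```python
def weighting_filter_coeffs(a, r):
    """Return (numerator, denominator) coefficients of W(z) in powers of z^-1."""
    num = [1.0] + [-ak for ak in a]
    den = [1.0] + [-ak * r ** k for k, ak in enumerate(a, start=1)]
    return num, den


def apply_weighting(signal, a, r):
    """Filter `signal` through W(z) using the direct-form difference equation."""
    num, den = weighting_filter_coeffs(a, r)
    out = []
    for n in range(len(signal)):
        # Feed-forward part (numerator of W(z)).
        acc = sum(num[k] * signal[n - k] for k in range(len(num)) if n - k >= 0)
        # Feedback part (denominator of W(z)); den[0] is 1 by construction.
        acc -= sum(den[k] * out[n - k] for k in range(1, len(den)) if n - k >= 0)
        out.append(acc)
    return out


if __name__ == "__main__":
    error = [1.0, 0.0, 0.0]
    print(apply_weighting(error, [0.5], 1.0))  # r = 1: the error passes unchanged
    print(apply_weighting(error, [0.5], 0.0))  # r = 0: the error is filtered by 1 - P(z)
```

Intermediate values of r interpolate between these two cases.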

The error e(n) weighted in this way is sent to the squared error minimizer 6 in order to determine the driving excitation sequence V(n) output from the excitation pulse generator 3, that is, the optimum time positions and amplitudes of the multipulses. The squared error ε is computed by equation (2) below, and the driving excitation sequence V(n) is selected so as to minimize ε.

$$\varepsilon = \sum_{n=1}^{N} \left[ \{ x(n) - \hat{x}(n) \} * w(n) \right]^{2} \tag{2}$$

In equation (2), the symbol * denotes convolution with the weighting filter of the perceptual weighting unit 5, and N denotes the length of the interval over which the multipulses are computed.
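The error criterion can be sketched as follows, assuming that the weighting filter is represented by a finite impulse response `w` and that the convolution is truncated to the frame. Both simplifications and all function names are assumptions for illustration.

```python
def convolve_truncated(signal, w):
    """Convolve `signal` with impulse response `w`, keeping len(signal) samples."""
    return [
        sum(signal[n - k] * w[k] for k in range(len(w)) if n - k >= 0)
        for n in range(len(signal))
    ]


def weighted_squared_error(x, x_hat, w):
    """Sum over the frame of the squared, perceptually weighted error."""
    e = [xi - yi for xi, yi in zip(x, x_hat)]   # e(n) = x(n) - x_hat(n)
    ew = convolve_truncated(e, w)               # e(n) * w(n)
    return sum(v * v for v in ew)


if __name__ == "__main__":
    print(weighted_squared_error([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], [1.0, 0.5]))  # 7.25
```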

The processing described above is repeated for each pulse of the multipulse sequence. This is the so-called Analysis-by-Synthesis method (hereinafter abbreviated as the A-b-S method), in which synthesis-based analysis is carried out pulse by pulse. As is clear from the description above, the A-b-S method runs pulse generation, squared error calculation, and adjustment of pulse position and amplitude in a loop for every single pulse. Although it is an effective technique at low bit rates, it therefore has the drawback that the amount of computation becomes extremely large.

The A-b-S method is described in detail in B. S. Atal et al., "A New Model of LPC Excitation for Producing Natural-Sounding Speech at Low Bit Rates", Proc. ICASSP 82, pp. 614-617 (1982).

To address this drawback of the conventional A-b-S method, the following algorithm, which computes the optimum multipulses efficiently on the basis of correlation operations, has recently been introduced.

That is, the input speech signal x(n) is divided into processing frames of N samples each, and the multipulses are computed for each frame as a whole.

Suppose now that one analysis frame contains K excitation pulses, and that the i-th pulse is located at time position m_i from the frame boundary and has amplitude g_i. The driving excitation d(n) of the LPC synthesis filter is then given by equation (3) below.

$$d(n) = \sum_{i=1}^{K} g_i \, \delta_{n, m_i} \tag{3}$$

In equation (3), δ_{n,m_i} is the Kronecker delta: δ_{n,m_i} = 1 for n = m_i and δ_{n,m_i} = 0 for n ≠ m_i.

The LPC synthesis filter is driven by this driving excitation d(n) and outputs the synthesized signal x̂(n).

Assume that the LPC synthesis filter is, for example, an all-pole digital filter whose transfer function is represented by the impulse response h(n) (0 ≤ n ≤ M − 1). The synthesized signal x̂(n) is then given by equation (4) below.

$$\hat{x}(n) = \sum_{l=1}^{N} d(l) \, h(n - l) \tag{4}$$
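To make the pulse model concrete, the following sketch builds the driving excitation from a list of (position, amplitude) pairs and convolves it with an impulse response to obtain the synthesized frame. Positions are 1-based as in the text. The function names and the truncation of the response at the frame end are assumptions.

```python
def excitation(pulses, frame_len):
    """Build d(n) for one frame from (position, amplitude) pairs; positions are 1-based."""
    d = [0.0] * frame_len
    for m, g in pulses:
        d[m - 1] += g
    return d


def synthesize(d, h):
    """Convolve the excitation d(n) with the impulse response h(n), truncated to the frame."""
    out = [0.0] * len(d)
    for l, dl in enumerate(d):
        if dl == 0.0:
            continue  # only the few non-zero pulses contribute
        for j, hj in enumerate(h):
            if l + j < len(out):
                out[l + j] += dl * hj
    return out


if __name__ == "__main__":
    d = excitation([(2, 1.0), (4, -2.0)], 6)
    print(d)                          # [0.0, 1.0, 0.0, -2.0, 0.0, 0.0]
    print(synthesize(d, [1.0, 0.5]))  # [0.0, 1.0, 0.5, -2.0, -1.0, 0.0]
```

Because the excitation is zero except at the pulse positions, the synthesized frame is simply a sum of scaled, shifted copies of the impulse response.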

In equation (4), d(l) denotes the driving excitation. Next, let e_w(n) be the perceptually weighted error between the input speech signal x(n) and the synthesized signal x̂(n). Then e_w(n) is given by equation (5) below.

$$e_w(n) = \{ x(n) - \hat{x}(n) \} * w(n) \tag{5}$$

The squared error is then derived from equation (5) and can be expressed by equation (6) below.

$$\varepsilon = \sum_{n=1}^{N} e_w(n)^{2} \tag{6}$$

In equation (6), N denotes the number of samples in the interval over which the error is minimized, for example the length of one analysis frame.

The multipulses, as the optimum excitation pulse train, are obtained by finding the g_i that minimize equation (6). From equations (3), (4), and (6), g_i is derived as equation (7) below.

$$g_i(m_i) = \frac{\sum_{n=1}^{N} x_w(n)\, h_w(n - m_i) - \sum_{l=1}^{i-1} g_l \sum_{n=1}^{N} h_w(n - m_l)\, h_w(n - m_i)}{\sum_{n=1}^{N} h_w(n - m_i)\, h_w(n - m_i)} \tag{7}$$

In equation (7), x_w(n) denotes x(n) * w(n) and h_w(n) denotes h(n) * w(n). The first term of the numerator on the right-hand side of equation (7) is the cross-correlation function φ_hx(m_i) between x_w(n) and h_w(n) at time lag m_i, and the inner sum of the second term is the covariance function φ_hh(m_l, m_i) (1 ≤ m_l, m_i ≤ N) of h_w(n). The covariance function φ_hh(m_l, m_i) equals the autocorrelation function R_hh(|m_l − m_i|), so equation (7) can be rewritten as equation (8) below.

$$g_i(m_i) = \frac{\phi_{hx}(m_i) - \sum_{l=1}^{i-1} g_l \, R_{hh}(|m_l - m_i|)}{R_{hh}(0)} \tag{8}$$

According to equation (8), when a pulse is generated at time position m_i, the amplitude g_i(m_i) can be determined as the optimum amplitude. In equation (8), 1 ≤ m_i ≤ N.
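The step from the squared error to the amplitude formula can be made explicit. The following is a sketch of that derivation, not text from the publication. It assumes that the pulses are determined one at a time, so that only the already fixed pulses l = 1, ..., i − 1 contribute when pulse i is placed, and that all sums run over the N samples of the frame. Here x_w(n) = x(n) * w(n), h_w(n) = h(n) * w(n), φ_hx is their cross-correlation, and R_hh is the autocorrelation of h_w(n).

```latex
% Weighted squared error with pulses 1..i placed at positions m_1..m_i
\varepsilon = \sum_{n=1}^{N} \Bigl[ x_w(n) - \sum_{l=1}^{i} g_l \, h_w(n - m_l) \Bigr]^{2}

% Setting the derivative with respect to g_i to zero
\frac{\partial \varepsilon}{\partial g_i}
  = -2 \sum_{n=1}^{N} \Bigl[ x_w(n) - \sum_{l=1}^{i} g_l \, h_w(n - m_l) \Bigr] h_w(n - m_i) = 0

% Solving for g_i
g_i \sum_{n=1}^{N} h_w(n - m_i)^{2}
  = \sum_{n=1}^{N} x_w(n) \, h_w(n - m_i)
  - \sum_{l=1}^{i-1} g_l \sum_{n=1}^{N} h_w(n - m_l) \, h_w(n - m_i)

% Replacing the sums by correlation functions
g_i(m_i) = \frac{\phi_{hx}(m_i) - \sum_{l=1}^{i-1} g_l \, R_{hh}(|m_l - m_i|)}{R_{hh}(0)}
```

The last line treats the covariance of h_w(n) as a function of the lag difference only, which holds when both shifted responses lie inside the frame.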

In other words, for a given excitation pulse, the amplitude is computed by equation (8) at each candidate time position, and the position that maximizes the absolute value of the amplitude gives the pulse that minimizes the squared error of equation (6). Repeating this procedure yields a plurality of excitation pulses.

This computation algorithm is described in detail in Ozawa, Araseki, and Ono, "A Study of Multipulse-Excited Speech Coding", Technical Group on Communication Systems, Institute of Electronics and Communication Engineers of Japan, March 1983.

When multipulses are generated on the basis of this algorithm, the optimum multipulses can be computed from the cross-correlation function, the autocorrelation function, and a maximum search. The configuration is therefore greatly simplified, and a multipulse vocoder with a much smaller amount of computation can be realized.

However, even a multipulse vocoder improved in this way still has the following drawback.

According to the algorithm of Ozawa et al., the time positions and amplitudes of the multipulses are determined by the following procedure. First, φ_hx(m_i) is obtained. The waveform of Fig. 2(A) is a measured example of φ_hx(m_i) for speech uttered by a male speaker. Next, R_hh is obtained. Fig. 3 is likewise a measured example of R_hh. The position of the first pulse of the multipulse sequence is determined as the position where the absolute value of the waveform of Fig. 2(A) is largest (m_i = 72), and the amplitude of the pulse is determined as the value of φ_hx at m_i = 72 (g_1(72) = −5.3).

Next, the contribution of the first pulse is removed from φ_hx(m_i). This amounts to subtracting from the waveform of Fig. 2(A) the waveform of Fig. 3 multiplied by −5.3 and centered at m_i = 72. The waveform of Fig. 2(B) shows the result of removing the contribution of the first pulse from the waveform of Fig. 2(A). The position and amplitude of the second pulse are determined from the waveform of Fig. 2(B) in the same way as for the first pulse. The contribution of the second pulse is then removed from the waveform of Fig. 2(B) (the result is shown in Fig. 2(C)), and the above operations are repeated to determine the positions and amplitudes of the third, fourth, ..., l-th, ... pulses.

As described above, the algorithm of Ozawa et al. searches each waveform of Fig. 2 for the m_i at which the absolute value is largest, takes the value g_i(m_i) of φ_hx at m_i (or of φ_hx with the contributions of earlier pulses removed), and adopts m_i and g_i(m_i) as the position and amplitude of the pulse. However, the shape of φ_hx(m_i) near m_i is not necessarily similar to the shape of R_hh. For example, the waveform near m_i = 159 in Fig. 2(F) differs greatly in shape from the waveform of Fig. 3. As a result, in the waveform of Fig. 2(G) the values of φ_hx near m_i = 163 are larger than in Fig. 2(F), which indirectly causes a pulse to be generated at m_i = 163 in Fig. 2(I).

As described above, the algorithm of Ozawa et al. takes the m_i at which the absolute value of φ_hx(m_i), or of φ_hx(m_i) with the pulse contributions removed, is largest, together with the corresponding φ_hx(m_i), as the time position and amplitude of each pulse. Consequently, especially where the shapes of φ_hx(m_i) and R_hh differ greatly, φ_hx(m_i) is not necessarily reduced optimally, the number of pulses increases unnecessarily, and coding efficiency falls.

The object of the present invention is to eliminate the drawback described above and to provide a multipulse vocoder that improves the efficiency of multipulse coding. It does so by determining the positions and amplitudes of the pulses in consideration of the similarity between two quantities. The first is the cross-correlation coefficients φ_hx(m_i) between x_w(n) (the convolution x(n) * w(n) of the input speech signal x(n) with the impulse response w(n) of the perceptual weighting unit 5) and h_w(n) (the combined impulse response of the LPC synthesizer 1 and the perceptual weighting unit 5). The second is the autocorrelation coefficients R_hh of h_w(n). This removes the drawback of the algorithm of Ozawa et al.

The multipulse vocoder of the present invention analyzes and synthesizes an input speech signal as follows. The input speech signal is LPC-analyzed frame by frame, and the extracted LPC coefficients are used as spectral envelope information. The excitation information, which together with the spectral envelope information constitutes the speech information of the input speech signal, is represented for each analysis frame by a sequence of a plurality of impulses (multipulses) whose time positions and amplitudes correspond to the characteristics of the excitation information. On the analysis side, the vocoder comprises means for computing the sequence of cross-correlation coefficients between the input speech signal and the impulse response of the speech synthesis filter, means for computing the sequence of autocorrelation coefficients of that impulse response, and means for computing the similarity between the cross-correlation sequence and the autocorrelation sequence. It further comprises, also on the analysis side, means for searching for the maximum of the similarity and computing the amplitudes and positions of the impulse sequence (multipulses) in a forward manner.

The present invention will now be described in detail with reference to the drawings. Fig. 4 is a block diagram showing an embodiment of the analysis side of the multipulse vocoder according to the present invention, and Fig. 5 is a block diagram showing an embodiment of its synthesis side.

The analysis side of the multipulse vocoder according to the present invention, shown in Fig. 4, comprises an LPC analyzer 7, a cross-correlation function calculator 8, an encoder (1) 9, an autocorrelation function calculator 10, a similarity calculator 11, an encoder (2) 12, and a multiplexer 13.

The input speech signal entered through input terminal 7001 is supplied to the LPC analyzer 7 and to the cross-correlation function calculator 8.

For each analysis frame, the LPC analyzer 7 quantizes the input speech signal into a digital quantity with a preset number of bits, performs LPC analysis on the quantized speech signal to extract p-th order K parameters (partial autocorrelation coefficients) as the LPC coefficients, and supplies them to the encoder (1) 9 through output line 701. In this embodiment the analysis frame is set to 20 msec.

The encoder (1) 9 quantizes and encodes the supplied LPC coefficients and sends them to the multiplexer 13 through output line 901.

The LPC analyzer 7 also computes the impulse response h(n) (0 ≤ n ≤ M − 1) from the LPC coefficients and supplies it, via output line 702, the encoder (1) 9, and output line 902, to the cross-correlation function calculator 8 and the autocorrelation function calculator 10.
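One way to obtain such an impulse response is sketched below: the K parameters are converted to predictor coefficients with the standard step-up recursion, and the all-pole filter is then driven by a unit impulse. The sign convention of the K parameters and both function names are assumptions; the publication does not state how the analyzer performs this computation.

```python
def parcor_to_lpc(k):
    """Step-up recursion: K (reflection) parameters -> predictor coefficients a_1..a_p.

    Assumed convention: a_i^(i) = k_i and
    a_j^(i) = a_j^(i-1) - k_i * a_(i-j)^(i-1) for j < i.
    """
    a = []
    for i, ki in enumerate(k, start=1):
        a = [a[j] - ki * a[i - 2 - j] for j in range(i - 1)] + [ki]
    return a


def impulse_response(a, length):
    """First `length` samples of the all-pole filter 1 / (1 - sum_k a_k z^-k)."""
    h = []
    for n in range(length):
        y = 1.0 if n == 0 else 0.0  # unit impulse input
        for j, aj in enumerate(a, start=1):
            if n - j >= 0:
                y += aj * h[n - j]
        h.append(y)
    return h


if __name__ == "__main__":
    a = parcor_to_lpc([0.5, 0.2])
    print(a)
    print(impulse_response(a, 4))
```

In practice the response would be truncated at the length over which it remains significant.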

The cross-correlation function calculator 8 computes the cross-correlation function φ_hx from the input speech signal and the impulse response h(n), and sends it to the similarity calculator 11 through output line 801.

The autocorrelation function calculator 10 computes the autocorrelation function R_hh of the supplied impulse response h(n) and sends it to the similarity calculator 11 through output line 1001.

Using the cross-correlation function φ_hx and the autocorrelation function R_hh supplied in this way for each analysis frame, the similarity calculator 11 carries out the similarity computation described later, obtains a predetermined number of excitation pulses, and sends the amplitude and position information of these pulses to the encoder (2) 12 through output line 1101. The encoder (2) 12 quantizes and encodes this information and sends it to the multiplexer 13 through output line 1201.

The LPC coefficients and multipulse data that have been quantized, encoded, and sent to the multiplexer 13 in this way represent the spectral envelope and the excitation information of the input speech signal. The multiplexer 13 time-division multiplexes them in a predetermined format, and they are transmitted over transmission path 1301 from the analysis side shown in Fig. 4 to the synthesis side shown in Fig. 5.

The synthesis side shown in Fig. 5 synthesizes the input speech signal from the data transmitted from the analysis side over transmission path 1301. It comprises a demultiplexer 14, a decoder (1) 15, a decoder (2) 16, an LPC synthesizer 17, an LPF (Low Pass Filter) 18, and so on.

The demultiplexer 14 restores the data received over transmission path 1301 to the state they were in before conversion to the time-division transmission format of the multiplexer 13. The LPC coefficient data are supplied to the decoder (1) 15 through output line 141, and the multipulse data are supplied to the decoder (2) 16 through output line 142. The decoders decode the data and send them out on output lines 151 and 161, respectively.

The LPC synthesizer 17 uses the multipulses supplied in this way as the excitation information, that is, as the driving excitation of a p-th order all-pole digital filter. It uses the p-th order LPC coefficient data supplied through output line 151 as the coefficients of this all-pole digital filter, and with the LPC synthesis filter controlled in this way it synthesizes the input speech signal. The result is sent through output line 211 to the LPF 18, which applies predetermined low-pass filtering and sends the analog synthesized speech to output line 181.

The similarity calculator 11 will now be described in detail with reference to the drawings. Fig. 6 is a block diagram showing an embodiment of the similarity calculator 11.

The cross-correlation function φ_hx is stored in the cross-correlation coefficient memory 19 via transmission line 801. The autocorrelation function R_hh is supplied to the autocorrelation normalizer 20 via transmission line 1001. The autocorrelation normalizer 20 computes, by equation (9) below, a normalization coefficient a corresponding to the power of R_hh when R_hh is regarded as a waveform.

$$a = R_{hh}^{2}(0) + 2 \sum_{s=1}^{N_B} R_{hh}^{2}(s) \tag{9}$$

Here R_hh(x) denotes the component of R_hh at lag x, and N_B denotes the practical duration of the impulse response h_w(n) mentioned above. The autocorrelation normalizer 20 then normalizes each element of R_hh(x) by a and outputs the result, as the normalized autocorrelation coefficients R'_hh, to the autocorrelation coefficient memory 21 via transmission line 201. The product-sum calculator 22 computes, by equation (10) below, the sum of products b_{m_i} of the elements of the cross-correlation function φ_hx supplied via transmission line 191, taken N_B elements before and after lag m_i, and the normalized autocorrelation coefficients R'_hh supplied via transmission line 211.

$$b_{m_i} = \sum_{s=-N_B}^{N_B} \phi_{hx}(m_i + s) \, R'_{hh}(s) \tag{10}$$

The product-sum calculator 22 computes b_{m_i} in turn over the interval in which the cross-correlation function φ_hx is defined (240 in this embodiment), that is, for m_i = 1 to 240, and outputs the results to the maximum value searcher 23 via transmission line 221. The maximum value searcher 23 searches the sequence b_{m_i} for the element with the largest absolute value and so determines the delay time z_1 (corresponding to the time position of the first pulse) and the amplitude b_{z_1}. It outputs z_1 and b_{z_1} to the cross-correlation corrector 24 and the multipulse memory 25 via transmission lines 231 and 232. The cross-correlation corrector 24 corrects φ_hx, supplied from the cross-correlation coefficient memory 19 via transmission line 191, around the delay z_1 by equation (11) below, using R'_hh supplied from the autocorrelation coefficient memory 21 via transmission line 211 and the amplitude b_{z_1}.

$$\phi_{hx}(z_1 + t) = \phi_{hx}(z_1 + t) - b_{z_1} \, R'_{hh}(t) \tag{11}$$

Here t ranges over the correction interval, which is set to −N_B to +N_B. The cross-correlation corrector 24 then supplies the result of equation (11) to the cross-correlation coefficient memory 19 via transmission line 241.

The above processing is repeated until the required number of multipulses is reached, and the results are stored one after another in the multipulse memory 25. When the repetition is complete, the multipulse memory 25 outputs the time positions and amplitudes of the multipulses to transmission line 1101.
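The loop formed by the normalizer, product-sum calculator, maximum value searcher, and corrector can be sketched as follows. Several points are assumptions rather than statements of the publication: the normalization coefficient is taken as the sum of squared autocorrelation values over lags −N_B to +N_B, lags are 0-based, and the correction step subtracts the unnormalized R_hh scaled by the amplitude, chosen here so that a neighbourhood whose shape exactly matches R_hh is cancelled completely. The function names are hypothetical.

```python
def normalize_autocorrelation(r, nb):
    """Divide R_hh by its power over lags -nb..nb (assumed reading of the normalization)."""
    a = r[0] ** 2 + 2.0 * sum(r[s] ** 2 for s in range(1, nb + 1))
    return [r[s] / a for s in range(nb + 1)]


def similarity(phi, rn, m, nb):
    """Sum of products of phi around lag m with the normalized autocorrelation."""
    total = 0.0
    for s in range(-nb, nb + 1):
        if 0 <= m + s < len(phi):
            total += phi[m + s] * rn[abs(s)]
    return total


def similarity_search(phi, r, nb, num_pulses):
    """Pick pulses where the similarity between phi and R_hh is largest in magnitude."""
    phi = list(phi)  # working copy, corrected after every pulse
    rn = normalize_autocorrelation(r, nb)
    pulses = []
    for _ in range(num_pulses):
        b = [similarity(phi, rn, m, nb) for m in range(len(phi))]
        z = max(range(len(b)), key=lambda m: abs(b[m]))
        pulses.append((z, b[z]))
        # Correction step (assumption: unnormalized R_hh scaled by the amplitude).
        for t in range(-nb, nb + 1):
            if 0 <= z + t < len(phi):
                phi[z + t] -= b[z] * r[abs(t)]
    return pulses


if __name__ == "__main__":
    phi = [0.0, 1.0, 2.5, 1.0, 0.0, -0.5, -1.25, -0.5, 0.0, 0.0]
    for z, amp in similarity_search(phi, [1.25, 0.5], 1, 2):
        print(z, round(amp, 6))
```

Unlike a plain maximum search on the cross-correlation, the score at each lag depends on the whole neighbourhood of that lag, so an isolated peak whose shape does not resemble the autocorrelation receives a lower score.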

Fig. 7 shows measured examples of φ_hx and of the corrected φ_hx obtained with the configuration of Fig. 6 (the speech sample is the same as in the example of Fig. 2). Fig. 7(A) shows the cross-correlation coefficients φ_hx together with the time position and amplitude determined for the first pulse. Fig. 7(B) shows the cross-correlation coefficients φ_hx corrected by the first pulse together with the time position and amplitude determined for the second pulse. Similarly, Figs. 7(C) to 7(K) show the corrected φ_hx together with the time positions and amplitudes of the pulses determined.

Unlike the algorithm of Ozawa et al., the present invention searches for the maximum of the similarity between the cross-correlation coefficients φ_hx and the normalized autocorrelation coefficients R'_hh. As a result, no pulse is placed at a time position where the shapes of φ_hx and R_hh differ greatly, as in Fig. 2(F), and the number of pulses therefore does not increase unnecessarily.

Next, FIG. 8 compares the S/N ratio of the output speech (the speech signal at terminal 181) relative to the input speech (the speech signal at terminal 7001) obtained with the present invention, measured while varying the number of multipulses, with the S/N ratio obtained with the algorithm of Tezawa et al. and measured by the same method. In FIG. 8, × denotes the S/N ratio of the conventional method and ・ denotes the S/N ratio of the present invention.
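For readers who want to reproduce a comparison of this kind, a waveform S/N ratio can be sketched as below. The text does not give the formula used for FIG. 8, so this assumes the common definition, the ratio of input-signal energy to the energy of the input-minus-output error, expressed in decibels; the function name `snr_db` is hypothetical.

```python
import math


def snr_db(reference, output):
    """S/N in dB of `output` measured against `reference` (equal-length sequences)."""
    signal = sum(x * x for x in reference)
    noise = sum((x - y) ** 2 for x, y in zip(reference, output))
    if signal == 0.0:
        raise ValueError("reference signal has zero energy")
    if noise == 0.0:
        return math.inf  # output reproduces the reference exactly
    return 10.0 * math.log10(signal / noise)
```

Evaluating such a function for several pulse counts per frame would give one curve per method, which is the form of comparison the text attributes to FIG. 8.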

As is clear from FIG. 8, the present invention achieves higher coding efficiency than the algorithm of Tezawa et al.

In the above, the product-sum of the cross-correlation coefficient φhx (or φhx corrected for the influence of the pulses already determined) and the normalized autocorrelation coefficient Rhh was shown as an example of the similarity, but the similarity is not necessarily limited to this product-sum.

For example, one may compute, for each lag m_i, the coefficient C_{m_i} that minimizes the magnitude between φhx and Rhh given by equation (12) below, and then search for the m_i at which this magnitude is smallest, that is, at which the similarity is largest.

$C_{m_i} = \min \sum_{s=-N_B}^{N_B} \left| \phi_{hx}(m_i + s) - C_{m_i} R_{hh}(s) \right|$ ............ (12)

When the magnitude is used as the similarity, the autocorrelation normalizer 20 is not necessarily required. It is also evident that the similarity can then be computed by replacing the product-sum calculator 22 and the maximum value searcher 23 with a minimum magnitude estimator and a minimum value searcher, respectively.
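The magnitude-based alternative of equation (12) can be sketched as follows. This is a hedged illustration, not the estimator of the patent: the function names are hypothetical, and the inner minimization over the coefficient is solved here with a weighted median of the ratios φhx(m+s)/Rhh(s), which is one standard way to minimize a sum of absolute deviations for a single scale factor.

```python
def best_scale_l1(y, r):
    """Return (c, magnitude) where c minimizes sum(|y[k] - c * r[k]|)."""
    # The minimizer is the weighted median of y[k]/r[k] with weights |r[k]|.
    pairs = sorted((yk / rk, abs(rk)) for yk, rk in zip(y, r) if rk != 0.0)
    half = sum(w for _, w in pairs) / 2.0
    acc, c = 0.0, 0.0
    for ratio, w in pairs:
        acc += w
        if acc >= half:
            c = ratio
            break
    magnitude = sum(abs(yk - c * rk) for yk, rk in zip(y, r))
    return c, magnitude


def min_magnitude_search(phi_hx, r_hh, n_b):
    """Return (lag, coefficient, magnitude) for the lag with the smallest magnitude."""
    window = [r_hh[abs(s)] for s in range(-n_b, n_b + 1)]
    best = None
    for m in range(n_b, len(phi_hx) - n_b):
        segment = phi_hx[m - n_b : m + n_b + 1]
        c, mag = best_scale_l1(segment, window)
        if best is None or mag < best[2]:
            best = (m, c, mag)
    return best
```

Note that a segment consisting only of zeros matches trivially with a zero coefficient, so a practical version of this sketch would probably need a guard against such degenerate lags.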

The above description assumed that the perceptual weighting shown in equation (1) is applied, but the perceptual weighting need not necessarily be applied (this corresponds to γ = 1.0 in equation (1)). When perceptual weighting is not applied, the present invention can be carried out with φhx taken as the cross-correlation function between the input speech waveform and the impulse response of the speech synthesis filter, and with Rhh taken as the autocorrelation function of the impulse response of the speech synthesis filter.
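For the unweighted case, the two correlation sequences can be sketched directly from the filter coefficients. The following is a minimal sketch under assumptions that the text does not fix: the synthesis filter is taken to be all-pole with the sign convention H(z) = 1 / (1 - Σ a_k z^-k), the cross-correlation is taken as φhx(m) = Σ_n h(n) x(m+n), and all function names are hypothetical.

```python
def impulse_response(alpha, length):
    """First `length` samples of h(n) for H(z) = 1 / (1 - sum_k alpha[k-1] z^-k)."""
    h = []
    for n in range(length):
        v = 1.0 if n == 0 else 0.0
        for k, a in enumerate(alpha, start=1):
            if n - k >= 0:
                v += a * h[n - k]
        h.append(v)
    return h


def cross_correlation(x, h):
    """phi_hx(m) = sum_n h[n] * x[m + n], for m = 0 .. len(x) - 1."""
    n_x = len(x)
    return [
        sum(h[n] * x[m + n] for n in range(len(h)) if m + n < n_x)
        for m in range(n_x)
    ]


def normalized_autocorrelation(h, max_lag):
    """R_hh(lag) for lag = 0 .. max_lag, scaled so that R_hh(0) = 1."""
    r0 = sum(v * v for v in h)
    return [
        sum(h[n] * h[n + lag] for n in range(len(h) - lag)) / r0
        for lag in range(max_lag + 1)
    ]
```

With these definitions, an input that is a single copy of the impulse response starting at sample 3 produces a cross-correlation whose largest value is at m = 3.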

In the embodiments of the present invention shown in FIGS. 4 and 5, K parameters are used as the LPC coefficients, but other LPC coefficients, such as α parameters, may be used instead. It is also clear that the encoder and the multiplexer, and likewise the decoder and the demultiplexer, may each be integrated into a single unit, and that the invention can be carried out in substantially the same way when the LPC synthesis filter is replaced with a digital filter of a type other than the all-pole type.

As described above, according to the present invention, a multipulse vocoder is provided on the analysis side with means for calculating a cross-correlation coefficient sequence between the input speech signal and the impulse response of the speech synthesis filter, means for calculating an autocorrelation coefficient sequence of the impulse response, and means for calculating the similarity between the cross-correlation coefficient sequence and the autocorrelation coefficient sequence, and is further provided on the analysis side with means for searching for the maximum value of the similarity and calculating the amplitudes and positions of the impulse sequence (multipulses) in a forward manner. This makes it possible to remove the influence of the multipulses from the cross-correlation coefficient sequence efficiently, and has the effect of improving the efficiency of multipulse coding.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the basic configuration of a conventional multipulse vocoder. FIGS. 2(A) to 2(K) are waveform diagrams showing the relationship between the cross-correlation coefficient φhx and the multipulses in the conventional method. FIG. 3 is a waveform diagram showing the autocorrelation coefficient Rhh of the impulse response. FIG. 4 is a block diagram showing an embodiment of the analysis side of the multipulse vocoder according to the present invention. FIG. 5 is a block diagram showing an embodiment of the synthesis side of the multipulse vocoder according to the present invention. FIG. 6 is a block diagram for explaining the similarity calculator 11 in detail. FIGS. 7(A) to 7(K) are waveform diagrams showing the relationship between the cross-correlation coefficient φhx and the multipulse determination procedure according to the present invention. FIG. 8 is a characteristic diagram that evaluates, in terms of S/N, the improvement in coding efficiency of the present invention in comparison with the conventional method.

1: LPC synthesizer; 2: LPC analyzer; 3: sound source pulse generator; 4: subtractor; 5: perceptual weighting unit; 6: squared error minimizer; 7: LPC analyzer; 8: cross-correlation function calculator; 9: encoder (1); 10: autocorrelation function calculator; 11: similarity calculator; 12: encoder (2); 13: multiplexer; 14: demultiplexer; 15: decoder (1); 16: decoder (2); 17: LPC synthesizer; 18: LPF; 19: cross-correlation coefficient memory; 20: autocorrelation normalizer; 21: autocorrelation coefficient memory; 22: product-sum calculator; 23: maximum value searcher; 24: cross-correlation corrector; 25: multipulse memory.

Claims

[Claims]

A multipulse vocoder that analyzes and synthesizes an input speech signal by subjecting the input speech signal to LPC (Linear Prediction Coefficient) analysis for each analysis frame and using the extracted LPC coefficients as spectral envelope information, and by expressing, together with this spectral envelope information, the sound source information constituting the speech information of the input speech signal as an impulse sequence (multipulses) of a plurality of impulses whose time positions and amplitudes are generated for each analysis frame in accordance with the characteristics of the sound source information, characterized in that the analysis side comprises means for calculating a cross-correlation coefficient sequence between the input speech signal and an impulse response of a speech synthesis filter, means for calculating an autocorrelation coefficient sequence of the impulse response, and means for calculating a similarity between the cross-correlation coefficient sequence and the autocorrelation coefficient sequence, and further comprises means for searching for the maximum value of the similarity and calculating the amplitudes and positions of the impulse sequence (multipulses) in a forward manner.