JPH0517700U - Voice recognizer

Info

Publication number: JPH0517700U
Application number: JP073815U
Authority: JP
Inventors: 貢一佐藤
Original assignee: Alpine Electronics Inc
Current assignee: Alpine Electronics Inc
Priority date: 1991-08-21
Filing date: 1991-08-21
Publication date: 1993-03-05

Abstract

(57) [Summary] [Purpose] To suppress mismatches caused by noise when pattern matching is performed by comparing a registered voice with an input voice. [Structure] A low-pass filter 2 that detects the noise level of the input voice is connected, and the gain of a voice amplifier circuit 4 is controlled according to the detected noise level. When pattern matching is performed between the output voice of the voice amplifier circuit 4 and the registered voice, the output voice has been amplified under a gain control that keeps it constant and free from the influence of noise, so the influence of noise is avoided and mismatches are reduced.

Description

[Detailed description of the device]

[0001]

[Industrial applications]

The present device relates to a voice recognition device that is little affected by ambient noise.

[0002]

[Prior Art]

A voice recognition method is known in which a plurality of words to be recognized are registered by voice in advance, and the voice input at recognition time is compared with the registered voices by pattern matching to recognize a specific speaker. Recently, this voice recognition method has come to be used for entering data by voice and for controlling devices by voice.

[0003] In such voice registration, the input voice signal is amplified by an AGC (automatic gain control) circuit, which controls the gain, and is then registered. At recognition time as well, the gain of the input voice signal is controlled by the AGC circuit before the signal is compared with the registered voice. For such voice recognition, it is desirable that the ambient sound conditions be the same at registration and at recognition, so that the gain applied to the voice is controlled in the same way in both cases.

[0004]

[Problems to be solved by the device]

In conventional voice recognition devices, however, the AGC circuit often does not control the gain for the voice in the same way at registration and at recognition. As a result, pattern matching between the input voice and the registered voice does not work well, and mismatches occur.

[0005] That is, voice registration is performed in a quiet, noise-free place, whereas voice recognition is often performed in a place with a lot of ambient noise, so the noise causes the gain control to differ from that at registration. For example, when an in-vehicle audio device is controlled by voice recognition, recognition is performed while the vehicle is running, so the influence of road noise is unavoidable.

[0006] Because gain control of the voice signal is applied over each spoken word, a gain difference arises between the low-power portions of the voice (consonants) and the high-power portions (vowels). At recognition time, however, road noise is superimposed, so the input voice is masked by the road noise and the level difference between the low-power and high-power portions shrinks, and the gain difference disappears with it. The pattern of the input voice therefore differs from that of the registered voice, pattern matching does not work well, and mismatches occur.
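The masking effect described in paragraph [0006] can be illustrated numerically. The following Python sketch is not part of the original document: the power values are hypothetical, and the road noise is assumed to be uncorrelated with the voice so that it simply adds in power to both segments.

```python
import math


def level_difference_db(vowel_power, consonant_power, noise_power=0.0):
    """Level difference in dB between a vowel and a consonant segment
    when the same noise power is added to both (illustrative model)."""
    return 10.0 * math.log10(
        (vowel_power + noise_power) / (consonant_power + noise_power)
    )


# Hypothetical powers: the vowel is 100 times stronger than the consonant.
quiet = level_difference_db(1.0, 0.01)       # about 20 dB without noise
noisy = level_difference_db(1.0, 0.01, 0.5)  # about 4.7 dB with road noise

print(f"quiet: {quiet:.1f} dB, noisy: {noisy:.1f} dB")
```

With these assumed values the consonant-to-vowel contrast falls from about 20 dB to under 5 dB, which is the kind of pattern change the paragraph attributes to road noise.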

[0007] The present device has been made in view of the above circumstances, and its object is to provide a voice recognition device that eliminates the influence of ambient noise and reduces mismatches during voice recognition.

[0008]

[Means for Solving the Problems]

To achieve the above object, the present device is a voice recognition device in which a plurality of voices to be recognized are registered in advance and a specific speaker is recognized by pattern matching between an input voice and the registered voices, characterized by comprising: a noise detection unit that detects the ambient noise level when a voice is input; a voice amplification unit that controls the gain of the input voice according to the detected noise level; a noise component removal unit that removes the noise component from the output of the voice amplification unit; and a voice recognition unit that recognizes the specific speaker by comparing the output of the noise component removal unit with the registered voices.

[0009]

[Action]

During voice recognition, the ambient noise level accompanying the input voice is detected, and the input voice is amplified with a gain controlled according to the detected noise level. The noise component is removed from the amplified voice output, which is then compared with the registered voices for pattern matching. Because the gain applied to the input voice is controlled according to the noise level, amplification always matches the ambient noise. This eliminates the influence of ambient noise and reduces mismatches during voice recognition.

[0010]

[Example]

An embodiment of the present device is described below with reference to the drawings.

[0011] FIG. 1 is a block diagram showing an embodiment of the voice recognition device of the present device. Reference numeral 1 denotes a microphone for voice input; 2, a low-pass filter that detects road noise; 3, a rectifying and smoothing circuit that converts the output of the low-pass filter 2 to DC; 4, a VCA (voltage-controlled amplifier) type voice amplifier circuit that amplifies the input voice with a gain controlled according to the noise level output by the rectifying and smoothing circuit 3; and 5, a high-pass filter that removes the main noise component from the output of the voice amplifier circuit 4.

[0012] Reference numeral 6 denotes an A/D converter that converts the analog signal output from the high-pass filter 5 into a digital signal; 7, a voice registration unit in which a plurality of voices are registered and stored in advance; 8, a microcomputer-based registration/recognition processing unit; and 9, an operation unit for selecting voice registration or voice recognition.

[0013] The low-pass filter 2 is for detecting, in particular, the road noise component of the input voice. Measuring road noise in the passenger compartment gave the distribution shown in FIG. 2. The road noise has a distribution with fc = 200 Hz and a slope of 12 dB/oct, with its energy concentrated below 200 Hz. Road noise can therefore be detected by using a low-pass filter 2 with fc = 200 Hz.
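As a rough digital analogue of the noise-detecting low-pass filter 2, the sketch below implements a second-order low-pass section with fc = 200 Hz. It is an illustration only: the document describes an analog circuit and does not state the filter order, so the second-order Butterworth response (12 dB/oct slope), the 8000 Hz sample rate, and all function names are assumptions.

```python
import math


def lowpass_coeffs(fc, fs, q=1 / math.sqrt(2)):
    """Second-order low-pass biquad coefficients (bilinear-transform
    design, Butterworth response when q = 1/sqrt(2))."""
    w0 = 2 * math.pi * fc / fs
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    a0 = 1 + alpha
    b = [(1 - cos_w0) / 2 / a0, (1 - cos_w0) / a0, (1 - cos_w0) / 2 / a0]
    a = [1.0, -2 * cos_w0 / a0, (1 - alpha) / a0]
    return b, a


def biquad_filter(samples, b, a):
    """Apply the biquad in direct form I."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def tone(freq, fs, n):
    """Unit-amplitude sine wave of n samples."""
    return [math.sin(2 * math.pi * freq * i / fs) for i in range(n)]


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


FS = 8000  # assumed sample rate in Hz
b, a = lowpass_coeffs(200.0, FS)
low = rms(biquad_filter(tone(50.0, FS, FS), b, a))     # road-noise band
high = rms(biquad_filter(tone(1600.0, FS, FS), b, a))  # speech band
print(f"50 Hz rms: {low:.3f}, 1600 Hz rms: {high:.3f}")
```

A 50 Hz component passes almost unchanged while a 1600 Hz component is strongly attenuated, so the filter output tracks mainly the low-frequency road noise rather than the voice.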

[0014] Next, the operation of this embodiment will be described.

[0015] First, a quiet, noise-free place is chosen for voice registration, and voice registration is selected with the operation unit 9. In this state, because there is no noise, the voice input from the microphone 1 is amplified by the voice amplifier circuit 4 under a predetermined gain control regardless of the operation of the low-pass filter 2 and the rectifying and smoothing circuit 3; it then passes through the high-pass filter 5 and is converted into a digital signal by the A/D converter 6.

[0016] The digital signal is input to the registration/recognition processing unit 8 and, under its control, is recorded in the memory of the voice registration unit 7. Thereafter, the same signal processing is performed each time a voice is input from the microphone 1, so that a plurality of voices are registered in the memory of the voice registration unit 7.

[0017] Voice recognition, on the other hand, is often performed in a place with a lot of ambient noise, such as inside a moving vehicle, and is therefore affected by noise. In this case, voice recognition is selected with the operation unit 9. In this state, the voice input from the microphone 1 is fed both to the voice amplifier circuit 4 and to the low-pass filter 2.

[0018] Because the low-pass filter 2 is set to fc = 200 Hz, the main component of the road noise, at roughly 200 Hz and below, passes through the low-pass filter 2 and is converted to DC by the rectifying and smoothing circuit 3, and this DC signal is output to the voice amplifier circuit 4.
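The conversion of the filtered noise to a DC level can be sketched in software as full-wave rectification followed by a one-pole smoothing filter. This is an illustrative stand-in for the rectifying and smoothing circuit 3, not the circuit itself; the 0.1 s time constant, the sample rate, and the function name are assumptions.

```python
import math


def rectify_and_smooth(samples, fs, time_constant=0.1):
    """Full-wave rectify the signal and smooth it with a one-pole
    low-pass filter, yielding a slowly varying DC-like noise level."""
    alpha = 1.0 - math.exp(-1.0 / (fs * time_constant))
    level = 0.0
    out = []
    for x in samples:
        level += alpha * (abs(x) - level)  # move toward the rectified value
        out.append(level)
    return out


FS = 8000  # assumed sample rate in Hz
noise = [math.sin(2 * math.pi * 100.0 * i / FS) for i in range(FS)]
dc = rectify_and_smooth(noise, FS)
print(f"settled level: {dc[-1]:.3f}")  # close to 2/pi for a unit sine
```

For a unit-amplitude sine the settled level approaches the mean of the rectified waveform, 2/π, with only a small residual ripple.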

[0019] The voice amplifier circuit 4 thus operates so that its gain for the input voice is controlled according to the noise level: the gain is lowered when the noise level is high and raised when the noise level is low. As a result, the gain applied to the phonemes within a word of the input voice is controlled to be constant. As noted above, when the noise level is low, as during voice registration, the gain of the voice amplifier circuit 4 is controlled to be high.
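The gain behaviour described here (lower gain at high noise, higher gain at low noise) can be sketched as a simple control law. The document gives no formula for the VCA, so the inverse relationship, the limits, and the parameter names below are hypothetical choices made only for illustration.

```python
def vca_gain(noise_level, max_gain=10.0, min_gain=1.0, sensitivity=20.0):
    """Hypothetical control law: gain falls monotonically as the detected
    noise level rises, and is clamped to [min_gain, max_gain]."""
    if noise_level < 0:
        raise ValueError("noise_level must be non-negative")
    gain = max_gain / (1.0 + sensitivity * noise_level)
    return max(min_gain, gain)


def amplify(samples, noise_levels, **kwargs):
    """Scale each sample by the gain derived from the concurrent noise level."""
    return [x * vca_gain(n, **kwargs) for x, n in zip(samples, noise_levels)]


print(vca_gain(0.0))   # quiet, as at registration: maximum gain
print(vca_gain(0.1))   # moderate road noise: reduced gain
print(vca_gain(10.0))  # very loud noise: clamped to the minimum gain
```

Because the gain depends on the slowly varying noise level rather than on the instantaneous speech level, consonants and vowels within one word receive the same gain, which matches the constant per-phoneme gain described in the paragraph above.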

[0020] The output of the voice amplifier circuit 4, whose gain has been controlled according to the noise level, is applied to the high-pass filter 5, where the main noise component is removed, and is then converted into a digital signal by the A/D converter 6. This digital signal represents the voice with the influence of road noise removed.
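The noise-removal step can be illustrated with a first-order digital high-pass filter. The document does not give the cutoff or order of the high-pass filter 5; the 200 Hz cutoff (chosen here to mirror the low-pass filter 2), the first-order design, and the 8000 Hz sample rate are all assumptions.

```python
import math


def highpass_first_order(samples, fs, fc=200.0):
    """First-order RC-style high-pass filter (discrete approximation)."""
    rc = 1.0 / (2 * math.pi * fc)
    dt = 1.0 / fs
    alpha = rc / (rc + dt)
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


def tone(freq, fs, n):
    """Unit-amplitude sine wave of n samples."""
    return [math.sin(2 * math.pi * freq * i / fs) for i in range(n)]


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


FS = 8000  # assumed sample rate in Hz
road = rms(highpass_first_order(tone(50.0, FS, FS), FS))     # attenuated
voice = rms(highpass_first_order(tone(1000.0, FS, FS), FS))  # mostly kept
print(f"50 Hz rms: {road:.3f}, 1000 Hz rms: {voice:.3f}")
```

A steeper filter would suppress the sub-200 Hz band more strongly; the first-order form is used here only to keep the sketch short.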

[0021] The digital signal is input to the registration/recognition processing unit 8, which compares the voice pattern characterizing this input voice with the voice patterns of the registered voices recorded in the memory of the voice registration unit 7, thereby performing pattern matching. When a registered voice that matches is found, the specific speaker has been recognized, and the registration/recognition processing unit 8 outputs a control signal to the device to be controlled.
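The matching step can be sketched as a nearest-template search with an acceptance threshold. The document does not specify how voice patterns are represented or compared, so the fixed-length feature vectors, the Euclidean distance, the threshold, and the example words below are all hypothetical.

```python
import math


def match_pattern(input_pattern, registered, threshold=0.5):
    """Return the label of the closest registered pattern, or None if
    no registered pattern lies within the threshold distance."""
    best_label = None
    best_dist = float("inf")
    for label, template in registered.items():
        if len(template) != len(input_pattern):
            continue  # skip templates with a different feature length
        dist = math.dist(input_pattern, template)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= threshold else None


# Hypothetical registered voice patterns (feature vectors).
registered = {
    "play": [1.0, 0.0, 0.5],
    "stop": [0.0, 1.0, 0.2],
}

print(match_pattern([0.9, 0.1, 0.5], registered))  # close to "play"
print(match_pattern([5.0, 5.0, 5.0], registered))  # no match: None
```

The threshold plays the role of deciding whether pattern matching "is established"; an input that is far from every registered pattern produces no control signal.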

[0022] Thus, according to this embodiment, the noise level of the voice input during recognition is detected by the low-pass filter 2, and the gain of the voice amplifier circuit 4 is controlled according to this noise level, so the voice amplifier circuit 4 always outputs the voice at a constant level. Pattern matching with the registered voices is therefore performed on an input voice free of the influence of road noise, so voice recognition with few mismatches is possible. It also prevents saturation caused by the voice amplifier circuit 4 over-amplifying the road noise.

[0023]

[Effect of the device]

As described above, according to the present device, the voice is amplified with a gain controlled according to the noise level, and pattern matching is performed by comparing this voice output with the registered voices, so voice recognition with few mismatches can be performed.

[Brief description of drawings]

FIG. 1 is a block diagram showing an embodiment of the voice recognition device of the present device.

FIG. 2 is an energy distribution diagram of road noise, explaining the operating principle of the present device.

[Explanation of symbols]

2 Low-pass filter
4 Voice amplifier circuit
5 High-pass filter
7 Voice registration unit
8 Registration/recognition processing unit

Claims

[Scope of utility model registration request]

1. A voice recognition device in which a plurality of voices to be recognized are registered in advance and a specific speaker is recognized by pattern matching between an input voice and the registered voices, the device comprising: a noise detection unit that detects the ambient noise level when a voice is input; a voice amplification unit that controls the gain of the input voice according to the detected noise level; a noise component removal unit that removes the noise component from the output of the voice amplification unit; and a voice recognition unit that recognizes the specific speaker by comparing the output of the noise component removal unit with the registered voices.