JPH079598B2

JPH079598B2 - Method for correcting standard parameters in voice recognition device

Info

Publication number: JPH079598B2
Application number: JP60288953A
Authority: JP
Inventors: 正典宮武
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 1985-12-20
Filing date: 1985-12-20
Publication date: 1995-02-01
Anticipated expiration: 2010-02-01
Also published as: JPS62147492A

Description

[Detailed description of the invention]

[Industrial field of application]

The present invention relates to a voice recognition device, and more specifically to a method for correcting standard parameters in a device that recognizes consonants and vowels separately.

Voice recognition is a technique in which predetermined voices are input in advance, their feature parameters are extracted and registered as standard parameters for a plurality of voices, and, when an unknown voice is input, its feature parameters are compared with the standard parameters to find the closest standard parameter. The input is then identified as the voice corresponding to that standard parameter.
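The matching step described above can be sketched as a nearest-template search. This is a minimal illustration only: the patent does not name a distance measure, so the Euclidean distance, the function names `distance` and `recognize`, and the toy three-dimensional feature vectors are all assumptions made for the example.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(unknown, standard_parameters):
    """Return the label whose registered standard parameter is closest to `unknown`."""
    return min(
        standard_parameters,
        key=lambda label: distance(unknown, standard_parameters[label]),
    )


# Hypothetical registered standard parameters (toy values, not from the patent).
standard_parameters = {
    "KA": [0.9, 0.1, 0.4],
    "SA": [0.2, 0.8, 0.4],
    "TA": [0.5, 0.5, 0.9],
}

# An unknown input is identified as the label of the nearest standard parameter.
result = recognize([0.8, 0.2, 0.5], standard_parameters)
```

Because the decision depends entirely on the registered vectors, a poorly registered standard parameter shifts every later decision, which is why the correction procedure discussed in the rest of the document matters.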

If misrecognition is frequent, the registered standard parameters are regarded as having been inappropriate and must be corrected.

One such method of correction or re-registration works as follows: when a voice input is not recognized correctly, the voice is input again, and if it is still not recognized correctly, the parameters of the voice at the second input are registered as the standard parameters.

Meanwhile, methods that perform recognition in syllable units or phoneme units have recently been under development, in order to recognize a larger vocabulary or to eliminate vocabulary restrictions altogether.

[Problems to be solved by the invention]

In recognition in syllable units, the consonant portion is particularly difficult to recognize, and the parameters obtained vary from utterance to utterance, so the standard parameter correction described above cannot absorb this utterance-to-utterance variation. One method for solving this is disclosed in Japanese Patent Laid-Open No. 58-31398, but even with that method, inappropriate parameters obtained because of noise or segmentation errors (errors in separating consonants and vowels) could adversely affect recognition.

[Means for solving problems]

The present invention has been made to solve these problems. Its object is to provide a standard parameter correction method that avoids the influence of noise and the like: when both the consonant and the vowel of a voice input for correction are determined incorrectly, the voice input is regarded as inappropriate and no correction is performed.

The method for correcting standard parameters according to the present invention is applied to a voice recognition device comprising: a consonant standard parameter memory and a vowel standard parameter memory, which respectively store the parameters serving as standards for determining the consonant and the vowel of an input voice; a consonant identification unit that determines the consonant by comparing a consonant feature parameter extracted from the input voice with the parameters stored in the consonant standard parameter memory; a vowel identification unit that determines the vowel by comparing a vowel feature parameter extracted from the input voice with the parameters stored in the vowel standard parameter memory; and a standard parameter correction unit that corrects the contents of the consonant standard parameter memory or the vowel standard parameter memory using the consonant feature parameter or the vowel feature parameter extracted from the input voice. The method is characterized in that, when a syllable is input in order to have the standard parameter correction unit correct the contents of the consonant standard parameter memory or the vowel standard parameter memory, the correction by the standard parameter correction unit is prohibited if the determination result of the consonant identification unit differs from the consonant of the input syllable and the determination result of the vowel identification unit also differs from the vowel of the input voice.

[Operation]

If noise is input together with the voice during correction, or if segmentation is poor, neither the consonant identification unit nor the vowel identification unit can produce a determination that matches the consonant and vowel of the input syllable. Needless to say, it is undesirable to register the input syllable as a standard parameter in such a case, and the present invention avoids doing so.

[Example]

The present invention is described in detail below with reference to the drawings showing an embodiment thereof.

FIG. 1 is an overall block diagram of a voice recognition device according to the present invention.

A syllable input from the microphone 1 undergoes processing such as high-frequency emphasis in the preprocessing unit 2 and is then input to the parameter extraction unit 3, where feature parameters are extracted. The feature parameters used are the frequency spectrum obtained by FFT, the LPC cepstrum, or power information, the number of zero crossings, and autocorrelation coefficients.
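As a rough illustration of the simpler feature parameters named above (power, number of zero crossings, autocorrelation coefficients) and of high-frequency emphasis, the following sketch computes them for one frame of samples. The first-order pre-emphasis filter, its coefficient 0.97, the normalization of the autocorrelation by its lag-0 value, and all function names are assumptions for the example; the patent does not specify these details.

```python
def pre_emphasis(samples, coeff=0.97):
    """First-order high-frequency emphasis: y[n] = x[n] - coeff * x[n-1] (assumed form)."""
    if not samples:
        return []
    return [samples[0]] + [
        samples[i] - coeff * samples[i - 1] for i in range(1, len(samples))
    ]


def power(frame):
    """Mean squared amplitude of the frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossings(frame):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def autocorrelation(frame, max_lag):
    """Autocorrelation coefficients for lags 0..max_lag, normalized by the lag-0 value."""
    r0 = sum(s * s for s in frame)
    if r0 == 0:
        return [0.0] * (max_lag + 1)
    return [
        sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag)) / r0
        for lag in range(max_lag + 1)
    ]


def extract_features(frame, max_lag=2):
    """Concatenate the features into one vector (hypothetical layout)."""
    return [power(frame), float(zero_crossings(frame))] + autocorrelation(frame, max_lag)[1:]
```

A rapidly alternating frame such as `[1.0, -1.0, 1.0, -1.0]` yields a high zero-crossing count and a negative lag-1 autocorrelation, the kind of cue that helps distinguish noisy consonant sections from vowel sections.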

The feature parameters are input to the phoneme determination unit 4, which determines their phonemic properties. Using this determination result together with the spectral change and the duration, the segmentation unit 7 determines the consonant section and the vowel section and supplies the result to the consonant parameter creation unit 5 and the vowel parameter creation unit 6. The feature parameters are also input to the consonant parameter creation unit 5 and the vowel parameter creation unit 6, which create the consonant parameter and the vowel parameter, respectively, according to the determination result of the segmentation unit 7. When the device is operating in the standard parameter registration mode, the consonant parameter and the vowel parameter are input to the consonant standard parameter memory 11 and the vowel standard parameter memory 12, respectively, and are registered there as the consonant standard parameter and the vowel standard parameter for the input syllable. Consonant and vowel standard parameters are registered in this way, in advance, for each of the required syllables.

To perform voice recognition, the operation unit 15 is used to switch from the registration mode to the recognition mode. The parameters created by the consonant parameter creation unit 5 and the vowel parameter creation unit 6 in the same manner as before are input, as an unrecognized consonant parameter and an unrecognized vowel parameter, to the consonant identification unit 8 and the vowel identification unit 9, respectively. The consonant identification unit 8 and the vowel identification unit 9 read out, one after another, the consonant standard parameters and vowel standard parameters stored in advance in the consonant standard parameter memory 11 and the vowel standard parameter memory 12, compare them with the unrecognized consonant parameter and vowel parameter, respectively, and determine the most similar consonant standard parameter and vowel standard parameter. They then pass the result to the output unit 13, treating the corresponding consonant and vowel as the ones that were input. The output unit 13 combines the consonant and the vowel into a syllable and outputs it to an appropriate external device 14.
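The recognition mode described above, with separate consonant and vowel identification followed by composition into a syllable, might be sketched as follows. The squared Euclidean distance as the similarity measure, the dictionary representation of the two memories, and the names `identify` and `recognize_syllable` are assumptions for illustration; the toy vectors are not from the patent.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors (assumed similarity measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def identify(parameter, memory):
    """Scan every standard parameter in `memory` and return the most similar label."""
    return min(memory, key=lambda label: squared_distance(parameter, memory[label]))


def recognize_syllable(consonant_parameter, vowel_parameter, consonant_memory, vowel_memory):
    """Identify consonant and vowel independently, then combine them into a syllable."""
    consonant = identify(consonant_parameter, consonant_memory)  # role of identification unit 8
    vowel = identify(vowel_parameter, vowel_memory)              # role of identification unit 9
    return consonant + vowel                                     # role of output unit 13


# Hypothetical contents of memories 11 and 12 (toy two-dimensional vectors).
consonant_memory = {"K": [0.9, 0.1], "S": [0.1, 0.9]}
vowel_memory = {"A": [0.8, 0.3], "I": [0.2, 0.7]}

syllable = recognize_syllable([0.8, 0.2], [0.7, 0.4], consonant_memory, vowel_memory)
```

Keeping the two memories separate means the number of stored templates grows with the number of consonants plus vowels rather than with the number of syllables, which fits the stated goal of lifting vocabulary restrictions.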

Next, the configuration for the correction mode is described together with its flowchart, shown in FIG. 2. When the correction mode is commanded from the operation unit 15, the standard parameter correction unit 10 composes a syllable (for example, KA) determined by a combination of the contents registered in the consonant standard parameter memory 11 and the vowel standard parameter memory 12, and has the display unit 16 present it. When the operator inputs this syllable through the microphone 1, a consonant parameter and a vowel parameter are created as before and are input to the consonant identification unit 8 and the vowel identification unit 9, respectively.

As in the recognition mode, the consonant identification unit 8 and the vowel identification unit 9 compare the contents of the consonant standard parameter memory 11 and the vowel standard parameter memory 12, one after another, with the parameters input from the consonant parameter creation unit 5 and the vowel parameter creation unit 6, respectively, and determine the consonant and the vowel of the input syllable.

The standard parameter correction unit 10 reads in the result of this determination. If the consonant determination result is not K and the vowel determination result is not A, the standard parameters are not corrected: neither the contents registered for K in the consonant standard parameter memory 11 nor the contents registered for A in the vowel standard parameter memory 12 are modified.

In contrast, if both the consonant and the vowel are correctly determined to be K and A, the standard parameters for K and A are corrected. That is, the feature parameters obtained from the newly input syllable are written into the consonant standard parameter memory 11 and the vowel standard parameter memory 12 as the standard parameters. If only one of the consonant and the vowel is determined correctly, either the standard parameter for the correctly determined one alone may be corrected, as shown in the flowchart of FIG. 2, or a message requesting re-input may be shown on the display unit 16.
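The decision logic of the correction mode can be summarized in a short sketch. The rule itself (no correction when both determinations are wrong, correction of both when both are right, and either a partial correction or a re-input request when only one is right) follows the description above; the squared Euclidean distance, the dictionary-based memories, the `update_partial` flag, and the returned status strings are assumptions introduced for the example.

```python
def identify(parameter, memory):
    """Return the label of the most similar standard parameter (assumed squared distance)."""
    return min(
        memory,
        key=lambda label: sum((x - y) ** 2 for x, y in zip(parameter, memory[label])),
    )


def correct_standard_parameters(prompted_consonant, prompted_vowel,
                                consonant_parameter, vowel_parameter,
                                consonant_memory, vowel_memory,
                                update_partial=True):
    """Apply the correction rule for one prompted syllable and report what was done."""
    consonant_ok = identify(consonant_parameter, consonant_memory) == prompted_consonant
    vowel_ok = identify(vowel_parameter, vowel_memory) == prompted_vowel

    if not consonant_ok and not vowel_ok:
        # Both determinations differ from the prompt: treat the input as unsuitable
        # (noise, bad segmentation) and leave both memories untouched.
        return "rejected"

    if consonant_ok and vowel_ok:
        # Both match: overwrite both standard parameters with the new input.
        consonant_memory[prompted_consonant] = list(consonant_parameter)
        vowel_memory[prompted_vowel] = list(vowel_parameter)
        return "both"

    if not update_partial:
        # Alternative allowed by the description: ask the operator to speak again.
        return "reinput"

    # Only one matched: correct only the correctly determined side.
    if consonant_ok:
        consonant_memory[prompted_consonant] = list(consonant_parameter)
        return "consonant"
    vowel_memory[prompted_vowel] = list(vowel_parameter)
    return "vowel"
```

Calling the function with the prompt K, A and an input whose consonant is identified as S and whose vowel is identified as I returns `"rejected"` and changes nothing, which is the safeguard the invention claims.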

[Effect]

According to the present invention as described above, there is no longer any risk that, when standard parameters are being corrected, the parameters obtained under high noise, or obtained when the utterance conditions caused a faulty segmentation result, will be registered as standard parameters. Recognition accuracy is improved as a result.

[Brief description of drawings]

FIG. 1 is a block diagram of a voice recognition device according to the present invention, and FIG. 2 is a flowchart showing the method of the present invention.

5 ... consonant parameter creation unit, 6 ... vowel parameter creation unit, 7 ... segmentation unit, 8 ... consonant identification unit, 9 ... vowel identification unit, 10 ... standard parameter correction unit, 11 ... consonant standard parameter memory, 12 ... vowel standard parameter memory

Claims

[Claims]

1. A method for correcting standard parameters in a voice recognition device comprising: a consonant standard parameter memory and a vowel standard parameter memory, which respectively store parameters serving as standards for determining a consonant and a vowel of an input voice; a consonant identification unit that determines the consonant by comparing a consonant feature parameter extracted from the input voice with the parameters stored in the consonant standard parameter memory; a vowel identification unit that determines the vowel by comparing a vowel feature parameter extracted from the input voice with the parameters stored in the vowel standard parameter memory; and a standard parameter correction unit that corrects the contents of the consonant standard parameter memory or the vowel standard parameter memory using the consonant feature parameter or the vowel feature parameter extracted from the input voice, the method being characterized in that, when a syllable is input and the contents of the consonant standard parameter memory or the vowel standard parameter memory are to be corrected by the standard parameter correction unit, the correction by the standard parameter correction unit is prohibited if the determination result of the consonant identification unit differs from the consonant of the input syllable and the determination result of the vowel identification unit differs from the vowel of the input voice.