JPH0257320B2

Info

Publication number: JPH0257320B2
Application number: JP58170248A
Authority: JP
Inventors: Yasuo Sato
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1983-09-14
Filing date: 1983-09-14
Publication date: 1990-12-04
Also published as: JPS6061799A

Description

[Detailed Description of the Invention]

(A) Technical Field of the Invention

The present invention relates to a speech recognition device, and in particular to a speech recognition device having a dictionary in which a plurality of standard feature patterns are registered for each item to be recognized. Standard feature patterns that were registered on the basis of, for example, mistakenly uttered speech are deleted automatically in the registration/practice mode or in the recognition mode, so that the quality of the dictionary is improved.

(B) Prior Art and Problems

In speech recognition generally, improving the recognition rate depends on which feature parameters are extracted from the speech information and used for matching. It is equally important, given the feature extraction defined by the system, to prepare in the dictionary the most suitable standard feature patterns representing each item. However good the feature extraction and matching methods are, the recognition rate will not improve if the standard feature patterns registered in the dictionary include many defective patterns, such as noise-contaminated patterns and unclear-utterance patterns, or many erroneous patterns caused by utterance mistakes, for example uttering "i" when "a" should have been registered.

Standard feature patterns are stored in the dictionary as digital information. They are numerous, they cannot be seen the way mechanical parts can, and not all of them are used uniformly. Once registered, therefore, the defective and erroneous standard feature patterns described above are not easy to detect.

Conventionally, every standard feature pattern, once registered, was treated as correct. When a recognition error occurred, it was generally assumed that the input speech was poor or that the limit of recognition had been reached, and the error was regarded as unavoidable. Learning methods have also been proposed that improve the quality of the dictionary by so-called averaging of the input feature pattern, extracted from the input speech that caused the misrecognition, with the standard feature patterns already registered. Such methods, however, presuppose that the registered standard feature patterns are correct to some extent, and they converge slowly when a standard feature pattern is wrong.

(C) Object and Configuration of the Invention

The present invention addresses the above problems. Its object is to detect and automatically delete any inappropriate standard feature pattern during the registration or practice mode, or during the recognition mode, thereby improving the quality of the dictionary and raising the recognition rate while placing as little burden as possible on the dictionary creator or the user. To that end, the speech recognition device of the present invention has a dictionary storing one or more standard feature patterns for each item to be recognized, and performs recognition by matching an input feature pattern, obtained by acoustic analysis of unknown input speech, against the standard feature patterns in the dictionary. The device is characterized by comprising: an error detection unit that detects an error in the recognition result; a similarity determination unit that, when the error detection unit detects a recognition error, compares (i) the similarity between the most similar standard feature pattern, that is, the standard feature pattern most similar to the input feature pattern, and the other standard feature patterns of the item to which it belongs with (ii) the similarity between the most similar standard feature pattern and the standard feature patterns of the correct item corresponding to the input speech; and a registered pattern deletion unit that, based on the determination result of the similarity determination unit, deletes the most similar standard feature pattern from the dictionary when the difference or ratio of these similarities is greater than a predetermined reference value. Embodiments are described below with reference to the drawings.

(D) Embodiments of the Invention

FIG. 1 is a diagram explaining the relationship between the distribution of speech patterns and the standard feature patterns; FIG. 2 is a diagram outlining the processing according to the present invention; FIG. 3 shows the configuration of an embodiment of the present invention; FIGS. 4 to 6 are diagrams each explaining the processing of an embodiment of the similarity determination unit; and FIG. 7 is a diagram explaining the processing of an embodiment of the pattern deletion unit.

In FIG. 1, the regions enclosed by the solid lines A, B, and C show the distributions of actual speech patterns in the pattern space. A₁ and A₂ are the standard feature patterns registered for word A (a "word" here includes a monosyllable; the same applies hereinafter), B₁ to B₃ are the standard feature patterns for word B, and C₁ is the standard feature pattern for word C. A single word item is sometimes covered by one standard feature pattern, as with C, but usually, as with A and B, a plurality of standard feature patterns is prepared for each item so as to cover the distribution range of the speech patterns to be recognized. When, for example, an input feature pattern X is extracted from unknown input speech, the matching distance between X and each of the standard feature patterns A₁, A₂, B₁, ... is computed, and the item to which the standard feature pattern with the smallest distance belongs is taken as the recognition result.
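
The nearest-pattern recognition described for FIG. 1 can be sketched as follows. This is only an illustrative sketch: the two-dimensional feature vectors, the use of Euclidean distance in place of the matching distance, and the names `dictionary`, `distance`, and `recognize` are all assumptions introduced here and do not appear in the specification.

```python
import math

# Hypothetical dictionary: item name -> list of standard feature patterns.
# Fixed-length 2-D vectors are used purely for illustration.
dictionary = {
    "A": [(0.0, 0.0), (1.0, 0.0)],               # A1, A2
    "B": [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0)],   # B1, B2, B3
    "C": [(10.0, 0.0)],                          # C1
}


def distance(p, q):
    """Euclidean distance, standing in for the matching distance (assumption)."""
    return math.dist(p, q)


def recognize(x, dictionary):
    """Return (item, pattern_index, distance) for the nearest standard pattern."""
    best = None
    for item, patterns in dictionary.items():
        for index, pattern in enumerate(patterns):
            d = distance(x, pattern)
            if best is None or d < best[2]:
                best = (item, index, d)
    return best


item, index, d = recognize((5.2, 5.1), dictionary)
print(item, index)
```

Returning the index of the winning pattern, and not only the item name, is a design choice of this sketch; it lets later processing identify which individual standard feature pattern caused a misrecognition.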

If the standard feature patterns registered in the dictionary include defective or erroneous patterns that lie outside the distribution of speech patterns, the recognition rate deteriorates. The present invention seeks to improve the recognition rate by deleting such inappropriate standard feature patterns.

For example, as shown in FIG. 2, suppose that the speech patterns of the word "Shibuya" are distributed as S and those of the word "Hibiya" as H. Suppose further that, while several standard feature patterns were being registered for each word during dictionary creation, an operating or utterance mistake caused "Hibiya" to be uttered where "Shibuya" should have been, and that the resulting standard feature pattern S₃ was registered. Although S₃ is in fact a speech pattern of "Hibiya", it is stored in the dictionary as belonging to the word "Shibuya".

Once the pattern has been registered in this way, recognition of an utterance of "Shibuya", for example, matches only standard feature patterns S₁ and S₂, and never pattern S₃. The fact that S₃ is wrong goes undetected; it is simply treated as though no utterance corresponding to S₃ has been made. Now suppose that, as shown in FIG. 2, "Hibiya" is uttered and yields input feature pattern X. Because the distance d₁ between X and standard feature pattern S₃ is smaller than the distance d₂ between X and standard feature pattern H₃, X is recognized as the word "Shibuya". In this case, a conventional learning method would judge not that S₃ is wrong but that the standard feature patterns H₁, H₂, and H₃ of the word "Hibiya" are inadequate, and would add to or modify the standard feature patterns belonging to "Hibiya". The erroneous standard feature pattern S₃ would therefore be left in the dictionary as it is.

In the present invention, when a recognition error is detected, standard feature pattern S₃ is checked for validity as described below, and is erased from the dictionary if found invalid. In the registration mode or the practice mode, the system knows what the input word is, so a recognition error can be detected immediately. In the recognition mode as well, an error can be detected from the user's indication, provided there is a means of indicating the recognition error and the correct answer.

When the speech recognition of input feature pattern X is found to be erroneous, the most similar standard feature pattern S₃, which caused the error, is examined for its similarity to the standard feature patterns S₁ and S₂ in the same dictionary item and for its similarity to the standard feature patterns H₁, H₂, and H₃ in the different dictionary item. The validity of S₃ is judged from these similarities. As the similarity measure, for example, the average of the distances D₁ and D₂ shown in FIG. 2 and the average of the distances D′₁, D′₂, and D′₃ are used, and the validity of S₃ is judged according to whether the difference or ratio of these averages is greater than a predetermined reference value.

Alternatively, the minimum of the distances D₁ and D₂ and the minimum of the distances D′₁, D′₂, and D′₃ may be used as the similarity measure. When S₃ is thus found to be invalid, it is erased from the dictionary. Various additional conditions, such as confirmation by the user, may be imposed as conditions for deleting the registration.
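
The validity check on the most similar standard feature pattern might be sketched as below, covering both the average-distance and the minimum-distance variants. Several points are assumptions made for illustration: Euclidean distance stands in for the matching distance, the score is taken as (distance to same-item patterns) minus, or divided by, (distance to correct-item patterns), and the function name `is_invalid` and its parameters are hypothetical. The specification states only that the difference or ratio is compared with a reference value.

```python
import math
from statistics import mean


def is_invalid(candidate, same_item, correct_item, threshold,
               use_min=False, use_ratio=False):
    """Judge whether `candidate` looks misplaced in its own dictionary item.

    candidate:    the most similar standard feature pattern (e.g. S3)
    same_item:    other patterns registered under the same item (e.g. S1, S2)
    correct_item: patterns of the correct item for the input (e.g. H1..H3)
    threshold:    reference value for the difference or ratio
    """
    reduce_fn = min if use_min else mean
    d_same = reduce_fn([math.dist(candidate, p) for p in same_item])
    d_other = reduce_fn([math.dist(candidate, p) for p in correct_item])
    if use_ratio:
        # Assumed direction: large ratio means far from own item, near the other.
        score = d_same / d_other if d_other > 0 else math.inf
    else:
        score = d_same - d_other
    return score > threshold


# Illustrative data: S3 was registered under "Shibuya" but lies among "Hibiya".
s1, s2, s3 = (0.0, 0.0), (1.0, 0.0), (5.0, 5.0)
hibiya = [(5.0, 6.0), (6.0, 5.0), (6.0, 6.0)]

print(is_invalid(s3, [s1, s2], hibiya, threshold=1.0))
print(is_invalid(s3, [s1, s2], hibiya, threshold=1.0, use_min=True))
```

With this sign convention a pattern that sits close to its own item's patterns yields a negative difference and is never flagged when the threshold is 0 or greater.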

FIG. 3 is a block diagram showing the configuration of an embodiment of the present invention. In the figure, 1 is a microphone, 2 an acoustic analysis unit, 3 a pattern extraction unit, 4 a switching unit, 5 a pattern addition unit, 6 a dictionary, 7 a matching determination unit, 8 a result display unit, 9 a keyboard, 10 an error detection unit, 11 a similarity determination unit, and 12 a pattern deletion unit.

The speech signal input from microphone 1 is frequency-analyzed in acoustic analysis unit 2. Acoustic analysis unit 2 has, for example, a bank of band-pass filters and a parameter extraction circuit. It extracts feature quantities (parameters) of the input speech, such as the moment M₁ corresponding to the first formant frequency, the moment M₂ corresponding to the second formant frequency, and the low-band and high-band power, determines sample points for these feature quantities, and obtains time-series information on the feature quantities.

The parameter time-series information obtained by acoustic analysis unit 2 is input to pattern extraction unit 3, which extracts from it an input feature pattern representing the characteristics of the input speech. Switching unit 4 switches between registration and matching of pattern information in response to, for example, a mode-switching instruction from keyboard 9. When registration is instructed, pattern addition unit 5 additionally registers the input feature pattern extracted by pattern extraction unit 3 in dictionary 6 in association with its item name. Dictionary 6 is, for example, an external storage device such as a magnetic disk unit, and stores the names of the items to be recognized together with the standard feature pattern information.

For recognition, the output of pattern extraction unit 3 is supplied to matching determination unit 7. Matching determination unit 7 reads the contents of dictionary 6 in sequence and matches the input feature pattern against the standard feature patterns registered in the dictionary by, for example, the well-known dynamic programming (DP) matching. The recognition result is shown on result display unit 8, such as a display. By looking at the displayed result, the user can confirm whether the input speech was recognized correctly. If it was not, the user indicates, for example from keyboard 9, that the recognition result is wrong and what the correct answer is. Error detection unit 10 can thereby detect the recognition error. In the registration/practice mode, or whenever the word to be uttered is known in advance, error detection unit 10 can also detect an error directly from the determination result of matching determination unit 7.
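
The specification names DP matching without fixing a particular variant. As a rough illustration, the textbook dynamic time warping recurrence is sketched below; the local cost (Euclidean distance between frames), the unconstrained step pattern, and the function name `dp_matching_distance` are assumptions made here, and an actual implementation may use slope constraints or path-length normalization.

```python
import math


def dp_matching_distance(a, b):
    """Dynamic-time-warping distance between two feature sequences.

    a, b: non-empty lists of feature vectors of equal dimension,
          one vector per analysis frame.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # frame of `a` repeated against b[j-1]
                cost[i][j - 1],      # frame of `b` repeated against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# The same contour spoken at two different speeds aligns with zero cost.
slow = [(0.0,), (1.0,), (1.0,), (2.0,)]
fast = [(0.0,), (1.0,), (2.0,)]
print(dp_matching_distance(slow, fast))
```

A function of this kind could serve as the matching distance between an input feature pattern and each standard feature pattern, since both are time series of feature quantities.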

On detecting a recognition error, error detection unit 10 notifies similarity determination unit 11. Similarity determination unit 11 executes processing such as that shown in the flowchart of FIG. 4 or FIG. 5 and checks the validity of the standard feature pattern most similar to the input feature pattern. The reference value T_Ld for the validity judgment may be fixed uniformly for the system, or may be set in advance according to the two dictionary items involved. When the difference or ratio of the similarity measures described with reference to FIG. 2, such as the average distance or the minimum distance, is greater than the predetermined reference value T_Ld, which is 0 or greater, pattern deletion unit 12 is activated. To reduce the risk of deleting a correct standard feature pattern by mistake, the number of times N that the most similar standard feature pattern has been selected in a recognition error may be counted, as shown for example in FIG. 6, and the pattern deleted only when N exceeds a predetermined value N_L. This determination, by which a standard feature pattern is deleted only after recognition errors have occurred several times, may be executed by a processing unit other than similarity determination unit 11.
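
The safeguard of FIG. 6, in which a pattern becomes eligible for deletion only after it has caused more than N_L recognition errors, could be sketched as follows. The class name `ErrorCounter`, the method `record_error`, and the use of an arbitrary hashable pattern identifier are hypothetical choices made for this sketch.

```python
from collections import Counter


class ErrorCounter:
    """Counts, per standard feature pattern, how often it caused a misrecognition."""

    def __init__(self, limit):
        self.limit = limit        # plays the role of the specified value N_L
        self.counts = Counter()   # pattern identifier -> error count N

    def record_error(self, pattern_id):
        """Register one recognition error; return True once N exceeds N_L."""
        self.counts[pattern_id] += 1
        return self.counts[pattern_id] > self.limit


counter = ErrorCounter(limit=2)
for _ in range(3):
    eligible = counter.record_error(("Shibuya", 2))
print(eligible)
```

Keeping the counter separate from the similarity check mirrors the remark that this determination may be carried out by a different processing unit.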

Pattern deletion unit 12 deletes from dictionary 6 the most similar standard feature pattern notified by similarity determination unit 11 and, if necessary, reports the deletion on result display unit 8. To keep the deletion function from operating indiscriminately, for example when the input speech is unclear or when recognition is performed in a noisy environment, it is convenient to be able to select, for example from keyboard 9, either a state in which deletion of registered patterns is permitted or a state in which it is prohibited. A state storage unit (not shown) that records whether deletion is currently permitted or prohibited is therefore preferably provided. In that case, as shown for example in FIG. 7, pattern deletion unit 12 refers to the state storage unit, confirms that deletion is currently permitted, and only then deletes the registration of the most similar standard feature pattern. This state check may be executed by another processing unit.
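
A minimal sketch of the permission-gated deletion of FIG. 7 follows. The class `PatternDeleter`, its attribute `deletion_permitted` (standing in for the state storage unit), and the in-memory dictionary layout are assumptions for illustration only; in the embodiment the dictionary resides on an external storage device.

```python
class PatternDeleter:
    """Deletes a standard feature pattern only while deletion is permitted."""

    def __init__(self, dictionary):
        self.dictionary = dictionary      # item name -> list of patterns
        self.deletion_permitted = False   # stands in for the state storage unit

    def set_permission(self, permitted):
        """Switch between the permitted and prohibited states (e.g. from a keyboard)."""
        self.deletion_permitted = bool(permitted)

    def delete(self, item, index):
        """Remove pattern `index` of `item`; return True only if it was deleted."""
        if not self.deletion_permitted:
            return False
        patterns = self.dictionary.get(item)
        if patterns is None or not 0 <= index < len(patterns):
            return False
        del patterns[index]
        return True


patterns = {"Shibuya": ["S1", "S2", "S3"]}
deleter = PatternDeleter(patterns)
deleter.set_permission(True)
deleter.delete("Shibuya", 2)
print(patterns)
```

Defaulting to the prohibited state is a conservative choice made in this sketch; the specification does not say which state is the initial one.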

The processing for deleting standard feature patterns described above may be performed only in the registration/practice mode, or may also be performed in the normal recognition mode.

(E) Effects of the Invention

As described above, the present invention makes it possible to delete automatically both defective standard feature patterns, such as noise-contaminated patterns and unclear-utterance patterns, and erroneous standard feature patterns caused by mistaken utterances at registration, thereby improving the quality of the dictionary and the recognition rate.

[Brief Description of the Drawings]

FIG. 1 is a diagram explaining the relationship between the distribution of speech patterns and the standard feature patterns; FIG. 2 is a diagram outlining the processing according to the present invention; FIG. 3 shows the configuration of an embodiment of the present invention; FIGS. 4 to 6 are diagrams each explaining the processing of an embodiment of the similarity determination unit; and FIG. 7 is a diagram explaining the processing of an embodiment of the pattern deletion unit. In the figures, 2 is an acoustic analysis unit, 3 a pattern extraction unit, 6 a dictionary, 7 a matching determination unit, 10 an error detection unit, 11 a similarity determination unit, and 12 a pattern deletion unit.

Claims

[Claims]

1. A speech recognition device having a dictionary in which one or more standard feature patterns are stored for each item to be recognized, and performing recognition by matching an input feature pattern, obtained by acoustic analysis of unknown input speech, against the standard feature patterns in the dictionary, the device being characterized by comprising: an error detection unit that detects an error in the recognition result; a similarity determination unit that, when the error detection unit detects a recognition error, compares the similarity between the most similar standard feature pattern, which is the standard feature pattern most similar to the input feature pattern, and the other standard feature patterns of the item to which the most similar standard feature pattern belongs with the similarity between the most similar standard feature pattern and the standard feature patterns of the correct item corresponding to the input speech; and a registered pattern deletion unit that, based on the determination result of the similarity determination unit, deletes the most similar standard feature pattern from the dictionary when the difference or ratio of the similarities is greater than a predetermined reference value.

2. The speech recognition device according to claim 1, characterized in that, in comparing the similarities, the similarity determination unit uses the difference or ratio between the average distance or minimum distance from the most similar standard feature pattern to the standard feature patterns in the same dictionary item and the average distance or minimum distance from the most similar standard feature pattern to the standard feature patterns in the different dictionary item.

3. The speech recognition device according to claim 1, characterized in that the number of recognition errors is stored for each standard feature pattern, and the corresponding most similar standard feature pattern is deleted when that number exceeds a predetermined number.

4. The speech recognition device according to claim 1, characterized in that the error detection unit, the similarity determination unit, or the registered pattern deletion unit is provided with means for selecting either a registration-deletion permitted state or a registration-deletion prohibited state, and the most similar standard feature pattern is deleted only in the registration-deletion permitted state.