JPH01112298A

JPH01112298A - Voice recognition equipment

Info

Publication number: JPH01112298A
Application number: JP62271148A
Authority: JP
Inventors: Hiromi Fujii; 藤井　浩美
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1987-10-26
Filing date: 1987-10-26
Publication date: 1989-04-28
Also published as: JPH0577079B2

Abstract

PURPOSE: To perform tree pruning efficiently in response to changes in the speaker or environment, and to execute speech recognition processing at high speed, by learning the pruning threshold from previously uttered speech and from information indicating whether each recognition result was correct. CONSTITUTION: The cumulative distance on the optimal path is obtained for past utterances, and the parameters that determine a suitable threshold are learned from those values. Learning from the cumulative distance on the optimal path is performed only when the recognition result is correct; when the recognition result is erroneous, the tree-pruning threshold is raised to prevent consecutive errors. Specifically, the correctness of the recognition result is input from a result checking part 8, and when the result is reported as an error, the currently set threshold parameters α, β are read from a threshold parameter storing part 3 and updated so that the threshold increases. Consequently, an optimal threshold for tree pruning can be learned, a threshold suited to changes in the speaker or environment can be set, the recognition speed can be increased, and the recognition rate can be improved.

Description

DETAILED DESCRIPTION OF THE INVENTION

(Field of Industrial Application) The present invention relates to an improvement of a speech recognition device that recognizes uttered speech at high speed.

(Prior Art) Speech recognition is an important technology for realizing good man-machine interfaces, and speech recognition devices are already in use in a variety of fields. Most current devices adopt a recognition scheme based on pattern matching.

In this scheme, utterances of the words to be recognized are stored in advance as standard patterns, the pattern of an input utterance (hereinafter, the input pattern) is compared with the stored standard patterns, and the word name of the most similar standard pattern is taken as the recognition result. To obtain the distance between two patterns while aligning their time axes, the DP matching method is used, which performs a nonlinear alignment by dynamic programming. DP matching is described in detail in "Two-stage DP matching for efficient recognition of continuously uttered word speech," Nikkei Electronics, November 7, 1983 issue, pp. 171-208 (hereinafter, Reference 1). According to this reference, the distance D(A, B) between patterns A and B is defined as follows.

$$D(A, B) = \min_{j = j(i)} \left[ \sum_{i=1}^{I} d(i, j) \right]$$

Here d(i, j) is the distance between the vectors a_i and b_j. The inter-pattern distance D can be obtained, for example, by the following recurrence formula.

... (1)

In equation (1), g(i, j) is the minimum cumulative value of the inter-vector distance d from the point (1, 1) to the point (i, j) in the i-j plane spanned by I and J, and is hereinafter called the cumulative distance. D(A, B) is obtained as g(I, J) by carrying out this recurrence for i = 1, ..., I and j = 1, ..., J.

Consider the i-j plane shown in FIG. 2. As shown in FIG. 2, the above recurrence allows three paths (a), (b), and (c) that reach (i, j) from (i-1, j), (i-1, j-1), and (i-1, j-2), and finds the route of (i, j) from lattice point (1, 1) to (I, J) that minimizes the sum of the inter-vector distances d(i, j) (hereinafter, the optimal path).

The optimal path is obtained as follows. During the calculation of equation (1), path information h(i, j), indicating which of the paths (a), (b), and (c) was selected, is retained for every (i, j). After D has been obtained, backtracking is performed, following the retained paths from (I, J) back to (1, 1). The method of obtaining the optimal path by backtracking is described in detail in "Applications of Dynamic Programming in Speech Recognition," bit, Vol. 15, No. 8, pp. 131-142.
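The DP matching and backtracking described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the implementation of Reference 1: equation (1) is not legible in this text, so the sketch assumes the recurrence g(i, j) = d(i, j) + min{g(i-1, j), g(i-1, j-1), g(i-1, j-2)}, which matches the three paths (a), (b), (c), and it assumes the initial value g(1, 1) = d(1, 1). The names `dp_match` and `dist` are hypothetical, and indices are 0-based.

```python
import math

def dp_match(A, B, dist):
    """Return (D, path) for input pattern A and standard pattern B.

    Assumed recurrence (equation (1) is not legible in the text):
        g(i, j) = d(i, j) + min(g(i-1, j), g(i-1, j-1), g(i-1, j-2))
    """
    I, J = len(A), len(B)
    INF = math.inf
    g = [[INF] * J for _ in range(I)]   # cumulative distance g(i, j)
    h = [[None] * J for _ in range(I)]  # path information h(i, j): step taken in j
    g[0][0] = dist(A[0], B[0])          # assumed start at lattice point (1, 1)
    for i in range(1, I):
        for j in range(J):
            best, step = INF, None
            for k in (0, 1, 2):         # predecessors (i-1, j), (i-1, j-1), (i-1, j-2)
                if j - k >= 0 and g[i - 1][j - k] < best:
                    best, step = g[i - 1][j - k], k
            if step is not None:
                g[i][j] = dist(A[i], B[j]) + best
                h[i][j] = step
    if g[I - 1][J - 1] == INF:
        return INF, []                  # (I, J) is unreachable with these paths
    # Backtrack from (I, J) to (1, 1) using the retained path information.
    path, j = [], J - 1
    for i in range(I - 1, -1, -1):
        path.append((i, j))
        if i > 0:
            j -= h[i][j]
    path.reverse()
    return g[I - 1][J - 1], path
```

For scalar features, `dp_match([0, 1, 2, 3], [0, 1, 2, 3], lambda x, y: abs(x - y))` follows the diagonal. The cumulative distance along the optimal path, written g_opt(i) later in this document, can be read from `g` at the returned path points.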

Many improvements have been made to the DP-matching recognition algorithm described above. One of them is the clockwise DP method described in JP-A-58-98796. In this method, g^n(i, j) is computed with the time axis i of the input pattern as the outermost loop, which realizes matching synchronized with the input of the feature vectors a_i and improves real-time performance. That is, at each time i of the input pattern, g^n(i, j) is obtained for every word n and every time j on the standard pattern of word n.

Processing has been further accelerated by introducing the idea of pruning into the clockwise DP method, as described in Japanese Patent Application No. 62-61732 and Japanese Patent Application No. 62-219460. These methods are briefly explained below.

In the method of Japanese Patent Application No. 62-61732, within the clockwise DP method, the recurrence calculation from time i+1 onward is omitted for any (n, i, j) whose cumulative distance g^n(i, j) at time i is greater than or equal to a threshold θ(i). A point (n, i, j) with a large g^n(i, j) is regarded as unlikely to lie on the optimal path, so its recurrence calculation is skipped. This greatly reduces the number of recurrence calculations and speeds up recognition. The threshold θ(i) can be set in the following ways.

(a) θ(i) = αi + β
(b) θ(i) = g_min(i) + α
(α and β are constants)

Method (a) assumes that the optimal cumulative distance increases with i and defines θ(i) as a linear, monotonically increasing function of i. Method (b) defines θ by adding a margin α to the minimum value g_min(i) of the cumulative distances g^n(i, j), j = 1, ..., J^n, n = 1, ..., N, at each i. In this method, however, the threshold parameters α and β used to obtain the threshold are fixed, so an inappropriate threshold θ(i) could cause recognition errors or fail to reduce the amount of computation.
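The two threshold settings (a) and (b) and the pruning test g^n(i, j) < θ(i) can be illustrated with a short Python sketch. The data layout (a dictionary mapping each word n to a dictionary of j to g^n(i, j)) and all function names are assumptions made for illustration only.

```python
def theta_linear(i, alpha, beta):
    """Method (a): threshold as a linear, increasing function of time i."""
    return alpha * i + beta

def theta_from_min(g_at_i, alpha):
    """Method (b): minimum cumulative distance at time i plus a margin alpha."""
    g_min = min(g for per_word in g_at_i.values() for g in per_word.values())
    return g_min + alpha

def surviving_cells(g_at_i, theta):
    """Return the (n, j) whose recurrence is continued at time i + 1."""
    return {(n, j)
            for n, per_word in g_at_i.items()
            for j, g in per_word.items()
            if g < theta}

# Cumulative distances at one time i for two hypothetical words.
g_at_i = {"yes": {1: 2.0, 2: 5.0}, "no": {1: 9.0}}
kept_a = surviving_cells(g_at_i, theta_linear(3, alpha=1.0, beta=1.5))
kept_b = surviving_cells(g_at_i, theta_from_min(g_at_i, alpha=4.0))
```

With these numbers, method (a) gives θ = 4.5 and keeps only ("yes", 1), while method (b) gives θ = 6.0 and also keeps ("yes", 2). A tighter threshold saves more computation but risks discarding the correct word.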

The method of Japanese Patent Application No. 62-219460 addresses this problem of Japanese Patent Application No. 62-61732 by learning the threshold parameters from past utterances. The threshold parameters are learned by the following procedure. First, after the recognition result is output, matching is performed between the input pattern and the standard pattern that gave the recognition result, and the optimal path is obtained by backtracking. Next, the cumulative distance g_opt(i), i = 1, ..., I, on the optimal path is obtained, and threshold parameters α', β' are determined so that the threshold θ(i) satisfies g_opt(i) < θ(i) for all i. The α and β used in the next recognition process are obtained from the threshold parameters α'(x), β'(x), x = 1, ..., X, of the past X utterances (X being one or more).

(Problems to be Solved by the Invention) In the pruning of the conventional method, the parameters α and β were learned in the same way whether the recognition result was correct or erroneous.

In the case of misrecognition, however, the optimal path is computed by matching against a standard pattern different from the word actually uttered, so the correct path for the correct word is not obtained. Consequently, when a misrecognition occurred, inappropriate threshold parameters α and β could be learned, which led to further misrecognitions.

An object of the present invention is to eliminate the above problem and provide a speech recognition device that can always set an appropriate pruning threshold θ.

(Means for Solving the Problems) The speech recognition device according to the present invention comprises the following units: a standard pattern storage unit that holds, as a standard pattern, the feature vector time series B^n = b^n_1 ... b^n_j ... b^n_{J^n} of the speech of each word n; a threshold parameter storage unit that stores the threshold parameters, that is, the parameters from which the pruning threshold is obtained; an input pattern storage unit that sequentially reads the feature vector a_i of the input speech at time i and holds the time-series pattern A = a_1 ... a_i ... a_I; a matching unit that, at each time i, obtains the cumulative distance g^n(i, j) of the distance d^n(i, j) between the input speech feature a_i and the standard pattern b^n_j in the standard pattern storage unit, for the values of (n, j) that satisfy the pruning condition determined by the parameters in the threshold parameter storage unit; a determination unit that outputs, as the recognition result, the word n giving the minimum of the cumulative distances g^n(I, J) obtained by the matching unit at time I; a result confirmation unit that supplies whether the recognition result is correct; an optimal path calculation unit that, when the result is correct, reads the input pattern A from the input pattern storage unit and the standard pattern B^n of the recognition result and obtains the optimal path; and a threshold parameter determination unit that, when the result is correct, updates the threshold parameters using the cumulative distances on the optimal path obtained by the optimal path calculation unit and, when the result is erroneous, updates the threshold parameters in the threshold parameter storage unit so as to raise the threshold.

(Operation) The speech recognition device according to the present invention learns the threshold from previously uttered speech and from information on whether the recognition results were correct, and thereby performs pruning efficiently and recognition at high speed in response to changes in the speaker or environment.

As described above, pruning is performed using the threshold θ(i) at each time i of the input pattern. It is therefore desirable to set θ(i) so that it does not fall below the cumulative distance on the optimal path of the correct word, yet is not too high. A feature of the present invention is that the cumulative distance on the optimal path is obtained for past utterances, and the parameters that yield an appropriate threshold θ(i) are learned from those values. A further feature is that learning from the cumulative distance on the optimal path is performed only when the recognition result is correct; when the recognition result is an error, the pruning threshold is raised to prevent consecutive errors. The operating principle is explained below.

After recognition by the conventional scheme has been performed and the result output, the user enters whether the result is correct, following a prompt that asks for this input. If the recognition result is correct, the DP matching described in Reference 1 is performed between the standard pattern B^n of the recognized word n and the retained input pattern. During matching, the path information h(i, j) selected in the recurrence calculation and the cumulative distance g(i, j) are retained for every (i, j), and the optimal path is obtained by backtracking.

The cumulative distance g_opt(i) on the optimal path is obtained as the cumulative distance g(i, j(i)) along the optimal path j_opt(i) = j(1) ... j(i) ... j(I).

The cumulative distance g_opt(i) on the optimal path obtained in this way for each i can be regarded as the optimal value of the pruning threshold θ(i) at time i of the immediately preceding input speech. By using this information to correct the current threshold parameters, a more appropriate threshold can be set for the next recognition process. Moreover, when the speaker or environment changes, the parameters used until then may yield an inappropriate threshold θ; even in that case, the above principle allows a more appropriate θ to be set with each utterance.

The above is the processing for a correct recognition result. In the case of misrecognition, it is considered highly likely that the optimal path for the correct word was pruned away, so the threshold parameters α and β are updated so that a threshold higher than the current threshold θ is set.

(Embodiment) An embodiment of the present invention is described in detail below with reference to the drawings. FIG. 1 is a block diagram showing one embodiment of the present invention.

In the standard pattern storage unit 2 of FIG. 1, the time-series data of each word n to be recognized, uttered in advance, is stored as the standard pattern B^n, and in the threshold parameter storage unit 3 the threshold parameters α and β used to obtain the pruning threshold θ(i) are stored in advance. The uttered input pattern A is analyzed in real time and input sequentially to the matching unit 4 as time-series data of feature vectors a_i.

At the same time, a_i is stored sequentially in the input pattern storage unit 1 and held until the next input. For each input a_i, the matching unit 4 performs the recurrence calculation over n and j to obtain g^n(i, j). Matching uses the scheme in which pruning is introduced into the conventional clockwise DP method (Japanese Patent Application No. 62-61732). Here, the pruning threshold θ is obtained with the linear, monotonically increasing function θ(i) = αi + β. After computing the cumulative distances at time i, the matching unit 4 reads α and β from the threshold parameter storage unit 3, computes θ(i), and finds the (n, j) that satisfy g^n(i, j) < θ(i). When a_{i+1} is input, the recurrence calculation is performed for the (n, j) that satisfied the pruning criterion at time i. Pruning in this way, the matching unit 4 processes up to time I and obtains the inter-pattern distances between the input pattern A and all standard patterns B^n.

The determination unit 5 outputs, as the result, the standard pattern giving the minimum among the inter-pattern distances between the input pattern A and all standard patterns B^n, n = 1, ..., N, obtained by the matching unit 4. The user then enters, through the result confirmation unit 8, whether this recognition result is correct. The result confirmation unit 8 has means for entering correct or incorrect; for example, a device consisting of two keys, one for each, can be used. Once the correctness of the result has been entered, the threshold parameters are learned accordingly.
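The matching and determination steps of this embodiment can be sketched as a frame-synchronous loop in Python. This is an illustrative sketch only: the function name `recognize`, the dictionary of templates, and the `dist` callable are hypothetical. It assumes non-empty inputs, the three-predecessor recurrence g(i, j) = d(i, j) + min{g(i-1, j), g(i-1, j-1), g(i-1, j-2)}, paths that start at (1, 1) and end at the last frame of each standard pattern, and the pruning rule g^n(i, j) < θ(i) with θ(i) = αi + β.

```python
import math

def recognize(frames, templates, dist, alpha, beta):
    """Return the word with the smallest final cumulative distance, or None."""
    INF = math.inf
    # g_prev[n][j]: cumulative distance at the previous time (INF = pruned or unreached)
    g_prev = {n: [INF] * len(B) for n, B in templates.items()}
    g_cur = g_prev
    for i, a in enumerate(frames, start=1):      # time i is the outermost loop
        g_cur = {}
        for n, B in templates.items():
            row = [INF] * len(B)
            for j in range(len(B)):
                if i == 1:
                    best = 0.0 if j == 0 else INF    # assumed start at (1, 1)
                else:
                    # predecessors j, j-1, j-2 at the previous time
                    best = min(g_prev[n][max(j - 2, 0): j + 1])
                if best < INF:
                    row[j] = dist(a, B[j]) + best
            g_cur[n] = row
        theta = alpha * i + beta
        # Cells with g >= theta(i) are not extended at time i + 1.
        g_prev = {n: [g if g < theta else INF for g in row]
                  for n, row in g_cur.items()}
    scores = {n: row[-1] for n, row in g_cur.items()}   # final distances, before pruning
    word = min(scores, key=scores.get)
    return word if scores[word] < INF else None
```

A return value of `None` means every candidate was pruned before the last frame, which is the situation the error-case update described later is meant to avoid.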

First, the case where the result is correct. As in the conventional method (Japanese Patent Application No. 62-61732), learning in this case uses the cumulative distance on the optimal path in the matching between the input pattern and the standard pattern that gave the recognition result. When the user indicates that the result is correct, the optimal path calculation unit 6 reads the input pattern A from the input pattern storage unit 1 and the standard pattern B^n that gave the result from the standard pattern storage unit 2, and performs one-to-one matching using the DP matching method described in Reference 1. During matching, the cumulative distance g(i, j) at (i, j) and the path h(i, j) leading to (i, j) are retained for every (i, j). The optimal path is obtained by backtracking through the path information h from (I, J). The cumulative distance on the optimal path obtained in this way is denoted g_opt(i), i = 1, ..., I.

The parameter determination unit 7 then learns the threshold parameters α and β. Because g_opt(i) is the cumulative distance on the optimal path, the pruning threshold during matching must always be at least this value. The values of α and β can be obtained, for example, as the coefficients of the least-squares line fitted to g_opt. FIG. 3 shows g_opt and the θ(i) given by the obtained α and β. In the figure, β is larger than the value obtained as the coefficient of the least-squares line by a margin Δβ.
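The least-squares step can be sketched as follows. This is a minimal example, assuming times i = 1, ..., I on the horizontal axis and a caller-supplied margin; the names `fit_threshold_params` and `delta_beta` are hypothetical, and the text does not say how Δβ is chosen.

```python
def fit_threshold_params(g_opt, delta_beta=0.0):
    """Fit theta(i) = alpha * i + beta to g_opt(1..I) and add a margin to beta."""
    I = len(g_opt)
    xs = range(1, I + 1)
    mean_x = sum(xs) / I
    mean_y = sum(g_opt) / I
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, g_opt))
    alpha = sxy / sxx if sxx else 0.0           # slope of the least-squares line
    beta = mean_y - alpha * mean_x + delta_beta  # intercept plus margin
    return alpha, beta

alpha, beta = fit_threshold_params([2.0, 4.0, 6.0, 8.0], delta_beta=1.0)
```

A least-squares line by itself passes through the middle of the data, so some g_opt(i) lie above it. One way to satisfy θ(i) ≥ g_opt(i) for every i is to choose `delta_beta` no smaller than the largest residual g_opt(i) - (αi + β); this choice is an assumption, not something the text prescribes.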

The parameter determination unit 7 stores the α and β obtained in this way for the past X utterances (X ≥ 0). New threshold parameters are derived from these values and stored in the threshold parameter storage unit 3. Here the threshold parameters are obtained by taking the maximum of the X values of α and of β. Besides taking the maximum over the past X utterances, other methods, such as taking the average over the past X utterances, can also be used.
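Keeping the parameters of the past X utterances and combining them by maximum or average can be sketched with a bounded queue. The class name `ThresholdHistory` and its methods are hypothetical; the text specifies only that the last X values are stored and combined.

```python
from collections import deque

class ThresholdHistory:
    """Holds (alpha, beta) for the most recent X correctly recognized utterances."""

    def __init__(self, max_utterances):
        # A deque with maxlen discards the oldest entry automatically.
        self._params = deque(maxlen=max_utterances)

    def add(self, alpha, beta):
        self._params.append((alpha, beta))

    def combined(self, mode="max"):
        """Return the (alpha, beta) to store for the next recognition, or None."""
        if not self._params:
            return None
        alphas = [a for a, _ in self._params]
        betas = [b for _, b in self._params]
        if mode == "max":
            return max(alphas), max(betas)
        if mode == "mean":
            return sum(alphas) / len(alphas), sum(betas) / len(betas)
        raise ValueError("mode must be 'max' or 'mean'")

history = ThresholdHistory(max_utterances=3)
for a, b in [(1.0, 5.0), (2.0, 4.0), (3.0, 3.0), (0.5, 6.0)]:
    history.add(a, b)
```

After the four additions only the last three pairs remain, so `history.combined("max")` returns (3.0, 6.0). Taking the maximum is the more conservative choice; the average follows changes more smoothly but can fall below a recently observed optimal-path distance.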

Next, the case of misrecognition. When the user reports the recognition result as an error, the currently set threshold parameters α and β are read from the threshold parameter storage unit 3 and updated so that the threshold θ(i) increases. The updated value of α can be obtained, for example, by α_new = k·α_old or α_new = α_old + T, and the updated value of β likewise by β_new = k·β_old or β_new = β_old + T. In the above example, in which a linear, monotonically increasing function is used for θ, giving k a value of 1 or more, or T a positive value, sets α and β such that θ_new > θ_old.

The above embodiment was described using a linear, monotonically increasing function to obtain the threshold θ. As an alternative, the case based on the minimum value at each i (θ(i) = g_min(i) + α) is described next. In this case the parameter α can be learned as follows. First, the minimum cumulative distance g_min(i), i = 1, ..., I, used at each i for pruning in the matching unit 4 during recognition is stored in the parameter determination unit 7 at each time i. Processing then proceeds as in the above embodiment: after the optimal path calculation unit 6 obtains g_opt(i), i = 1, ..., I, the parameter determination unit 7 computes g_err(i) = g_opt(i) - g_min(i), i = 1, ..., I, and finds the maximum value of g_err.

The parameter determination unit 7 stores the maximum value of g_err obtained in this way for the past X utterances (X ≥ 0). α can be determined from the average or the maximum of these values.
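Learning the margin α for the threshold θ(i) = g_min(i) + α can be sketched as below. The function names are hypothetical, and the symbol for the difference between the optimal-path distance and the per-frame minimum is garbled in the source text, so it is called the margin here.

```python
def margin_for_utterance(g_opt, g_min):
    """Largest gap between the optimal-path distance and the per-frame minimum."""
    if len(g_opt) != len(g_min):
        raise ValueError("g_opt and g_min must cover the same frames")
    return max(go - gm for go, gm in zip(g_opt, g_min))

def alpha_from_margins(margins, mode="max"):
    """Combine the margins of the past X utterances into the parameter alpha."""
    if mode == "max":
        return max(margins)
    if mode == "mean":
        return sum(margins) / len(margins)
    raise ValueError("mode must be 'max' or 'mean'")

# One utterance: the correct word trails the best candidate by at most 3.0.
margin = margin_for_utterance([3.0, 5.0, 9.0], [2.0, 5.0, 6.0])
alpha = alpha_from_margins([margin, 1.0, 2.0], mode="max")
```

If α is at least this margin, the optimal path of that utterance satisfies g_opt(i) ≤ g_min(i) + α at every frame, so it reaches the threshold at most with equality. Because the pruning test in the text is strict (g < θ), a small extra allowance may be needed in practice; that allowance is an assumption.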

(Effects of the Invention) In the high-speed speech recognition device according to the present invention described above, the optimal pruning threshold can be learned without a misrecognition leading to a series of further misrecognitions. A threshold adapted to changes in the speaker, environment, and so on can therefore be set. This resolves the conventional problems caused by an inappropriate threshold, makes recognition faster, and improves the recognition rate.

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing an embodiment of the present invention, FIG. 2 is a diagram illustrating matching in the conventional scheme, and FIG. 3 is a diagram illustrating the processing performed by the parameter determination unit in the embodiment of FIG. 1.

1: input pattern storage unit, 2: standard pattern storage unit, 3: threshold parameter storage unit, 4: matching unit, 5: determination unit, 6: optimal path calculation unit, 7: parameter determination unit, 8: result confirmation unit.

Claims

[Claims]

A speech recognition device comprising: a standard pattern storage unit that holds, as a standard pattern, the feature vector time series B^n = b^n_1 ... b^n_j ... b^n_{J^n} of the speech of each word n; a threshold parameter storage unit that stores a threshold parameter, which is a parameter for determining a pruning threshold; an input pattern storage unit that sequentially reads the feature vector a_i of the input speech at time i and holds the time-series pattern A = a_1 ... a_i ... a_I; a matching unit that, at each time i, obtains the cumulative distance g^n(i, j) of the distance d^n(i, j) between the input speech feature a_i and the standard pattern b^n_j in the standard pattern storage unit, for the values of (n, j) that satisfy the pruning condition determined by the parameter in the threshold parameter storage unit; a determination unit that outputs, as the recognition result, the word n giving the minimum value of the cumulative distance g^n(I, J) obtained by the matching unit at time I; a result confirmation unit that supplies whether the recognition result is correct; an optimal path calculation unit that, when the result is correct, reads the input pattern A in the input pattern storage unit and the standard pattern B^n of the recognition result and obtains the optimal path; and a threshold parameter determination unit that, when the result is correct, updates the threshold parameter using the value of the cumulative distance on the optimal path obtained by the optimal path calculation unit and, when the result is incorrect, updates the threshold parameter in the threshold parameter storage unit so as to increase the threshold.