JPS6320359B2

Info

Publication number: JPS6320359B2
Application number: JP57178482A
Authority: JP
Inventors: Takao Irumano; Kunio Akiba; Hisanori Kanezashi
Original assignee: DENSHI KEISANKI KIPPON GIJUTSU KENKYU KUMIAI
Current assignee: DENSHI KEISANKI KIPPON GIJUTSU KENKYU KUMIAI
Priority date: 1982-10-13
Filing date: 1982-10-13
Publication date: 1988-04-27
Also published as: JPS5968796A

Description

[Detailed description of the invention]

(Field of industrial application) The present invention relates to a word speech recognition method that first performs phoneme recognition on input speech and then recognizes the word by matching the recognized phoneme sequence against a word dictionary written in phoneme notation.

(Configuration of the conventional example and its problems) A conventional word speech recognition method is described with reference to FIG. 1. As shown in FIG. 1, the input speech is first analyzed, the features of the input word speech are extracted, and the phonemes that make up the input word speech are recognized. The recognized phoneme sequence is then matched against the dictionary phoneme sequence of each dictionary item in the word dictionary. The likelihood between the two phoneme sequences is calculated by obtaining the recognition probability of each phoneme from an inter-phoneme confusion matrix (hereinafter abbreviated as C.M.), and the dictionary item that maximizes the likelihood between the phoneme sequences is taken as the recognized word.

Table 1 shows an example of a word dictionary used in this word speech recognition method; each word is written according to the phoneme notation shown in Table 2. FIG. 2 shows part of the C.M. In FIG. 2, the vertical axis lists the phonemes in the word dictionary and the horizontal axis lists the recognized phonemes. The numbers in FIG. 2 give, as percentages, the probability that each phoneme in the word dictionary is recognized as each phoneme. For example, FIG. 2 shows that the phoneme I in the word dictionary is recognized as I with a probability of 75%, as U with a probability of 5%, and as A with a probability of 0%, that it is dropped with a probability of 8%, and so on.
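The conventional matching step described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure of the patent: the text says the likelihood is obtained from per-phoneme recognition probabilities in the C.M. but does not say how the two sequences are aligned, so the dynamic-programming alignment, the log-domain scoring, the probability floor, and the names `CM`, `DEL`, `FLOOR`, `sequence_log_likelihood`, and `recognize` are all hypothetical. Only the four values in the `I` row (75%, 5%, 0%, and 8% for dropping) come from the FIG. 2 example; every other number is invented.

```python
import math

DEL = "-"      # hypothetical symbol meaning "dictionary phoneme was dropped"
FLOOR = 1e-6   # assumed floor so a 0% entry does not give log(0)

# Confusion matrix rows: dictionary phoneme -> {recognized phoneme: probability}.
# The "I" row uses the values quoted from FIG. 2; the other rows are invented.
CM = {
    "I": {"I": 0.75, "U": 0.05, "A": 0.00, DEL: 0.08},
    "U": {"U": 0.70, "I": 0.10, "A": 0.02, DEL: 0.10},
    "A": {"A": 0.85, "I": 0.01, "U": 0.02, DEL: 0.05},
}


def log_prob(dict_ph, rec_ph):
    """Log of the C.M. probability that dict_ph is recognized as rec_ph."""
    p = CM.get(dict_ph, {}).get(rec_ph, 0.0)
    return math.log(max(p, FLOOR))


def sequence_log_likelihood(dict_seq, rec_seq):
    """Best-alignment log-likelihood of rec_seq given dict_seq.

    Each dictionary phoneme is either recognized as one phoneme
    (substitution or correct recognition) or dropped. Inserted phonemes
    are not modelled in this sketch, so a recognized sequence longer
    than the dictionary sequence scores -inf.
    """
    n, m = len(dict_seq), len(rec_seq)
    best = [[float("-inf")] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(1, n + 1):
        d = dict_seq[i - 1]
        for j in range(m + 1):
            # Case 1: dictionary phoneme d was dropped.
            score = best[i - 1][j] + log_prob(d, DEL)
            # Case 2: d was recognized as rec_seq[j - 1].
            if j > 0:
                score = max(score,
                            best[i - 1][j - 1] + log_prob(d, rec_seq[j - 1]))
            best[i][j] = score
    return best[n][m]


def recognize(rec_seq, dictionary):
    """Return the dictionary item with the maximum sequence likelihood."""
    return max(dictionary,
               key=lambda w: sequence_log_likelihood(dictionary[w], rec_seq))
```

With a hypothetical three-word dictionary `{"AI": ["A", "I"], "UI": ["U", "I"], "AU": ["A", "U"]}`, calling `recognize(["A", "I"], dictionary)` selects `"AI"`, because its per-phoneme probabilities (0.85 and 0.75) give the largest product.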

[Table]

Claims

[Claims]

1. A word speech recognition method in which phoneme recognition is performed on input speech to obtain a recognized phoneme sequence, and a word is recognized by calculating the likelihood between this recognized phoneme sequence and the dictionary phoneme sequence of each dictionary item in a word dictionary written in phoneme notation, characterized in that, when the likelihood between the phoneme sequences is calculated for the input speech, a weighted likelihood value is calculated by adding to, or multiplying, the likelihood for each dictionary item a likelihood weight value that is set in advance to a different value for each dictionary item according to which dictionary items tend to attain the maximum likelihood, and the dictionary item for which this weighted likelihood value is maximum is determined to be the recognized word.
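The weighting step of the claim can be illustrated with a short Python sketch. It assumes the per-item likelihoods have already been computed, and it covers both the additive and the multiplicative variant. The function name `recognize_weighted`, the `mode` parameter, the neutral default for items without a weight, and all numeric values are hypothetical; the claim states only that the weights are preset per dictionary item, not how their values are chosen.

```python
def recognize_weighted(likelihoods, weights, mode="multiply"):
    """Pick the dictionary item with the largest weighted likelihood.

    likelihoods: dictionary item -> likelihood for the current input
    weights:     dictionary item -> preset likelihood weight value
    mode:        "multiply" or "add", the two combinations named in the claim
    """
    if mode not in ("multiply", "add"):
        raise ValueError("mode must be 'multiply' or 'add'")

    def weighted(item):
        if mode == "multiply":
            # Items without a preset weight keep their likelihood unchanged.
            return likelihoods[item] * weights.get(item, 1.0)
        return likelihoods[item] + weights.get(item, 0.0)

    return max(likelihoods, key=weighted)


# Invented example values: "UI" scores slightly higher before weighting,
# but its preset weight lowers it below "AI".
likelihoods = {"AI": 0.30, "UI": 0.32}
print(recognize_weighted(likelihoods, {}))                          # no weights set
print(recognize_weighted(likelihoods, {"AI": 1.0, "UI": 0.8}))      # multiplicative
print(recognize_weighted(likelihoods, {"UI": -0.05}, mode="add"))   # additive
```

In this invented case the unweighted decision is `UI`, while both weighted variants select `AI`, which shows how a preset per-item weight can change which dictionary item attains the maximum.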