JPH04308923A

JPH04308923A - Voice inputting device

Info

Publication number: JPH04308923A
Application number: JP3073010A
Authority: JP
Inventors: Mitsuhiro Inazumi; 稲積満広
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 1991-04-05
Filing date: 1991-04-05
Publication date: 1992-10-30

Abstract

(57) [Abstract] This publication contains application data from before the introduction of electronic filing, so no abstract data is recorded.

Description

[Detailed description of the invention]

[0001]

[Field of Industrial Application] The present invention relates to a voice input device.

[0002]

[Prior Art] Voice input is considered a very efficient means of entering information. Nevertheless, voice input devices have not yet become a common input method, because conventional approaches suffer from serious problems.

[0003] In the prior art, using a voice input device required preparing in advance both hardware that included the device and application software capable of handling voice input. This is clearly a severe restriction. Hardware that can handle voice input is scarce, and acquiring it requires a very large hardware investment. The result is often a vicious circle: people do not use voice input because they lack the hardware, and they do not buy the hardware because they do not use voice input.

[0004] An even more serious problem is the critical shortage of application software that can handle speech recognition. In the prior art, speech recognition had to be supported by the application software itself. However, creating software requires far more human investment than creating hardware. Writing software for hardware with low market penetration is therefore practically impossible.

[0005] Moreover, even if existing software were rewritten to support speech recognition input, any resulting change in how the software is used would be a major reason for users to reject it.

[0006] In addition to the problems above, speech recognition technology has a problem of its own: extracting utterance segments is difficult even at the noise level of an ordinary office. Furthermore, speech recognition consumes considerable computing resources, so keeping it running at all times may adversely affect other processing.

[0007] In short, as described above, the prior art has many hardware and software problems, and even if all of them were solved, it would still be very difficult with the prior art to provide an environment that is easier to use.

[0008]

[Problem to Be Solved by the Invention] The problems to be solved by the present invention are those described above: in the prior art, using a speech recognition device requires a dedicated computer system and application software that supports speech recognition, and even with these it is difficult to extract utterance segments. An object of the present invention is to solve these problems by realizing a voice input device that requires only a minimum of voice input hardware added to existing hardware, that can be used without changing existing application software, and that lets the user easily choose between voice input and key input.

[0009]

[Means for Solving the Problem] FIG. 1 is a schematic diagram of the concept of the present invention. Described with reference to FIG. 1, the present invention is a voice input device comprising:
(1) a keyboard 1;
(2) key code input means 2 for inputting key codes generated by the keyboard 1;
(3) voice input selection code detection means 3 for detecting a voice input selection code among the key codes input by the key code input means 2;
(4) voice input means 4;
(5) voice recognition means 5 that takes the output of the voice input means 4 as its input;
(6) key code generation means 6 that takes the result of the voice recognition means 5 as its input;
(7) input/output selection means 7 for selecting either the key code input means 2 or the key code generation means 6; and
(8) key code output means 8 for outputting the key code selected by the input/output selection means 7.

[0010]

[Embodiment] FIG. 1 is a schematic diagram of the concept of the present invention, and FIG. 2 outlines its processing. The present invention is described in detail below with reference to these two figures.

[0011] Many computers in use today have a keyboard that is separate from the main unit. The keyboard has its own CPU, independent of the main unit, which scans the keys, processes each key press under program control, and communicates the result to the computer main unit.

[0012] The present invention realizes a speech recognition device using hardware and software connected to the communication line between the keyboard and the main unit. This connection need not be physical; it may be a software connection through the keyboard device driver of the computer main unit.

[0013] The operation of the present invention is described with reference to FIGS. 1 and 2. First, a voice input selection character code is registered in advance. A non-printing character, such as a control character, could be used for this code. Thereafter, the code of each character entered on the keyboard 1 is sent through the key code input means 2 to the voice input selection code detection means 3. If the code is not the voice input selection code, it is sent through the input/output selection means 7 to the key code output means 8, and from there to the computer main unit 9. This is exactly the same behavior as in the ordinary case where the keyboard is connected directly to the computer main unit.

[0014] If the character code sent to the voice input selection means 3 is the voice input selection character code, the code itself is discarded and, as a side effect, the voice input means 4, the voice recognition means 5, and the key code generation means 6 are activated.
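The routing described in paragraphs [0013] and [0014] can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the selection code value `VOICE_SELECT`, the class name `VoiceInputDevice`, the stub recognizer, and the use of character code points as key codes are all assumptions made for the example, and recognition is simplified to a single synchronous call.

```python
from typing import Callable, Iterable, List

# Hypothetical choice: a control code registered as the voice input selection code.
VOICE_SELECT = 0x16


class VoiceInputDevice:
    """Sits between keyboard 1 and computer 9; numbers follow FIG. 1."""

    def __init__(self,
                 recognize: Callable[[], str],
                 generate: Callable[[str], Iterable[int]],
                 select_code: int = VOICE_SELECT) -> None:
        self.recognize = recognize    # stands in for voice input means 4 and recognition means 5
        self.generate = generate      # stands in for key code generation means 6
        self.select_code = select_code
        self.sent: List[int] = []     # codes delivered to computer 9 by output means 8

    def on_key_code(self, code: int) -> None:
        """Key code input means 2 passes each code to detection means 3."""
        if code != self.select_code:
            self._output(code)        # ordinary key: forwarded unchanged
            return
        # Selection code: discard it and activate the voice path instead.
        word = self.recognize()
        for generated in self.generate(word):
            self._output(generated)

    def _output(self, code: int) -> None:
        # Selection means 7 feeds whichever source is active into output means 8.
        self.sent.append(code)


device = VoiceInputDevice(
    recognize=lambda: "hi",                          # stub recognizer
    generate=lambda word: [ord(c) for c in word],    # stub: one code per character
)
for code in [ord("a"), VOICE_SELECT, ord("b")]:
    device.on_key_code(code)
print(device.sent)  # [97, 104, 105, 98]
```

The selection code never reaches `sent`, which mirrors the statement that the code itself is discarded.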

[0015] In this state, speech is input through the voice input means 4 and recognized by the voice recognition means 5. The key code generation means 6 then generates key codes corresponding to the recognition result. For example, suppose word processing software is running on the computer and the user needs to enter his or her address. If the user presses the key corresponding to the voice input selection code on the keyboard and says "address" into the voice input means, key codes are generated for the corresponding character string, for example "3-3-5 Owa, Suwa City, Nagano Prefecture". These key codes are sent through the input/output selection means 7 to the key code output means 8, and from there to the computer main unit 9. The word processing software running on the computer has no way of telling whether the key code sequence was typed on the keyboard or produced by voice input. That is precisely why the software running on the computer does not need to be modified.
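The "address" example can be illustrated with a small lookup table. This is only a sketch under assumptions: the table `VOCABULARY`, the function names, and the use of character code points in place of real keyboard key codes are hypothetical and are not specified in the publication.

```python
from typing import Dict, List

# Hypothetical user-registered vocabulary: recognized word -> text to be "typed".
VOCABULARY: Dict[str, str] = {
    "address": "3-3-5 Owa, Suwa City, Nagano Prefecture",
}


def generate_key_codes(recognized: str, vocabulary: Dict[str, str]) -> List[int]:
    """Stand-in for key code generation means 6.

    Looks up the recognized word and emits one code per character.
    Words that are not registered produce no key codes.
    """
    text = vocabulary.get(recognized, "")
    return [ord(ch) for ch in text]


def type_on_keyboard(text: str) -> List[int]:
    """Stand-in for keyboard 1: the codes a user would produce by typing."""
    return [ord(ch) for ch in text]


spoken = generate_key_codes("address", VOCABULARY)
typed = type_on_keyboard(VOCABULARY["address"])
print(spoken == typed)  # True
```

Because the two sequences are identical, software receiving them has nothing to distinguish voice input from typing, which is the property the paragraph relies on.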

[0016] As another example, much recent software is operated through menu selection. As software becomes more capable, however, the user often has to step through several menus before reaching the desired one. For a person accustomed to keyboard operation, using a mouse or similar device, and the large eye movements this entails, increases fatigue. To avoid this, a control character string corresponding to the operation is often defined. But as the number of functions grows, or when several software packages are used, the operations are not consistent across them, which often causes confusion.

[0017] The present invention can solve this problem. For example, suppose the control character string "ABC" corresponds to the operation of opening a new window, and the spoken word "window" is associated with generating that string. With this setting, if the user says "window" while holding down the voice input selection code key on the keyboard, the control character string "ABC" is generated, which is equivalent to typing it on the keyboard, or to selecting "A", "B", and "C" in turn with the mouse.
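One possible reading of "saying a word while holding down the selection key" is that the key press and release mark the start and end of the utterance, so only audio captured in between is passed to recognition. The sketch below illustrates that reading; the event format, the `COMMANDS` table, and the text-fragment "frames" standing in for audio are assumptions made for the example, not details given in the publication.

```python
from typing import Dict, List, Tuple

# From the example in the text: the spoken word "window" yields control string "ABC".
COMMANDS: Dict[str, str] = {"window": "ABC"}

# Hypothetical event stream: ("down", ""), ("up", "") for the selection key,
# and ("frame", fragment) for captured audio, modelled here as text fragments.
Event = Tuple[str, str]


def run(events: List[Event], commands: Dict[str, str]) -> List[str]:
    held = False
    utterance: List[str] = []
    output: List[str] = []
    for kind, payload in events:
        if kind == "down":
            held, utterance = True, []       # key press opens the utterance
        elif kind == "frame":
            if held:
                utterance.append(payload)    # audio outside the key hold is ignored
        elif kind == "up" and held:
            held = False                     # key release closes the utterance
            word = "".join(utterance)        # stub recognizer: join the fragments
            output.extend(commands.get(word, ""))
    return output


events = [
    ("frame", "noise"),
    ("down", ""),
    ("frame", "win"),
    ("frame", "dow"),
    ("up", ""),
    ("frame", "noise"),
]
print(run(events, COMMANDS))  # ['A', 'B', 'C']
```

Under this reading, background sound before and after the key hold never reaches the recognizer, and the recognizer is idle whenever the key is not held.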

[0018]

[Effects of the Invention] As described above, the present invention makes voice input possible simply by adding a minimum of voice input/output hardware to existing hardware. Existing application software does not need to be modified.

[0019] In addition, the user can easily switch between voice input and keyboard input, errors in extracting utterance segments do not occur, and the load on computing resources is very small.

[0020] The hardware and software added by the present invention need not be independent of the computer main unit. The invention can also be implemented as a software keyboard device driver program using hardware contained in the computer main unit.

[Brief explanation of drawings]

[FIG. 1] A schematic diagram of the concept of the configuration of the present invention.

[FIG. 2] A flowchart showing the outline algorithm of the processing of the present invention.

[Explanation of symbols]

1: Keyboard
2: Key code input means
3: Voice input selection means
4: Voice input means
5: Voice recognition means
6: Key code generation means
7: Input/output selection means
8: Key code output means
9: Computer

Claims

[Claims]

Claim 1: A voice input device comprising:
(1) a keyboard;
(2) key code input means for inputting key codes generated by the keyboard of (1);
(3) voice input selection code detection means for detecting a voice input selection code among the key codes input by (2);
(4) voice input means;
(5) voice recognition means that takes the output of the voice input means of (4) as its input;
(6) key code generation means that takes the result of the voice recognition means of (5) as its input;
(7) input/output selection means for selecting either the key code input means of (2) or the key code generation means of (6); and
(8) key code output means for outputting the key code selected by (7).