JP2000209663A

JP2000209663A - Method for transmitting non-voice information in voice channel

Info

Publication number: JP2000209663A
Application number: JP2766A
Authority: JP
Inventors: Stephen A. Benno; Michael Charles Ricchion
Original assignee: Lucent Technologies Inc
Current assignee: Nokia of America Corp
Priority date: 1999-01-11
Filing date: 2000-01-11
Publication date: 2000-07-28
Also published as: BR0000002A; CN1262577A; CA2293165A1; KR20000053407A; EP1020848A2; AU6533799A

Abstract

PROBLEM TO BE SOLVED: To improve the transmission of data, in particular non-voice signal data, over a wireless voice channel. SOLUTION: Information is transmitted in the bits assigned to the output of one or both codebooks by setting the gain of the corresponding codebook to zero. The receiving vocoder does not interpret the output of a codebook whose gain is set to zero. This scheme allows additional information to be sent in a manner that is completely transparent to the vocoder 10. One application of this technique for sending a "secret" message is the transmission of parameters from which a non-voice signal is generated. For example, information for generating a call-waiting tone, a DTMF tone, or a TTY/TDD character is clandestinely embedded in the compressed bitstream, and the non-voice tone is regenerated from it.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to communications and, more particularly, to the transmission of data on a wireless voice channel.

[0002]

2. Description of the Related Art

Speech encoders/decoders (vocoders) are used to compress speech signals, reducing the transmission bandwidth required on a communication channel. Reducing the bandwidth per call increases the number of calls that the same channel can carry. In one class of vocoder, known as the code-excited linear prediction (CELP) vocoder, speech is modeled by a series of filters. The parameters of these filters can be sent with far fewer bits than the original speech.

[0003] An input (excitation) must also be supplied to these filters to reconstruct the original speech. Because sending the excitation directly would require too much bandwidth, the excitation is crudely approximated by replacing it with a small number of non-zero pulses. The positions of these pulses can be sent with very few bits, and this crude approximation of the original excitation is adequate for reproducing high-quality speech. This excitation is represented by a fixed codebook contribution and a corresponding gain. The quasi-periodicity present in speech is represented by an adaptive codebook output and a corresponding gain. The fixed codebook output and its gain, the adaptive codebook output and its gain, and the filter parameters (also known as linear prediction coder parameters) are sent to represent the encoded speech signal.
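
The excitation model described in this paragraph can be written as the sum of two gain-scaled codebook contributions. The following is a minimal sketch only: the function names `pulses_to_vector` and `build_excitation`, the 8-sample subframe, and all numeric values are illustrative assumptions and do not come from the patent.

```python
def pulses_to_vector(length, pulses):
    """Build a sparse fixed-codebook vector from (position, sign) pulses."""
    vec = [0.0] * length
    for pos, sign in pulses:
        vec[pos] += float(sign)
    return vec


def build_excitation(adaptive_vec, pitch_gain, fixed_vec, fixed_gain):
    """Excitation = pitch_gain * adaptive vector + fixed_gain * fixed vector."""
    return [pitch_gain * a + fixed_gain * c
            for a, c in zip(adaptive_vec, fixed_vec)]


# Toy 8-sample subframe (illustrative values only, not from the patent).
fixed = pulses_to_vector(8, [(1, +1), (5, -1)])
adaptive = [0.5, 0.0, -0.5, 0.0, 0.5, 0.0, -0.5, 0.0]
excitation = build_excitation(adaptive, 0.8, fixed, 2.0)
```

Only the pulse positions and signs, the two gains, and an index into the adaptive codebook would need to be sent in such a model, which is why the representation is compact.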

[0004] Vocoders were originally designed to compress speech by modeling its characteristics and sending the model parameters with far fewer bits than the speech itself. As wireless telephones become more common, people increasingly expect to use them for the same range of non-voice applications as traditional landline telephones, such as accessing voicemail and receiving call-waiting tones. More recently, the United States Federal Communications Commission (FCC) has specified text telephone service for the hearing impaired (TTY/TDD) over digital cellular telephones.

[0005]

PROBLEM TO BE SOLVED BY THE INVENTION

A problem with non-voice applications is that their signals do not fit the vocoder's speech model. When a non-voice signal is processed by a vocoder, the decoded result is not always satisfactory. The problem is made worse by the error-prone environment in which wireless telephones operate: the vocoder relies on its speech model to recover from random transmission errors, and because non-voice signals do not fit this model, their reconstruction is inadequate.

[0006]

SUMMARY OF THE INVENTION

The present invention sends information in the bits assigned to one or both codebook outputs by setting the gain of the corresponding codebook to zero. With the gain set to zero, the codebook output is not interpreted by the receiving vocoder. This scheme makes it possible to send additional information in a manner that is completely transparent to the vocoder. One application of this technique for sending "secret" messages is the transmission of parameters from which a non-voice signal is generated. For example, information for generating a call-waiting tone, a DTMF tone, or a TTY/TDD character can be clandestinely embedded in the compressed bitstream, and the non-voice tone can then be regenerated from it.
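
As an illustration of regenerating a non-voice tone from a transmitted parameter, the sketch below synthesizes a DTMF tone locally from a key symbol rather than passing the tone through the speech model. The row and column frequencies are the standard DTMF values; the function names, the 8 kHz sampling rate, the 20 ms duration, and the amplitude are assumptions made for this example and are not specified by the patent.

```python
import math

# Standard DTMF row/column frequencies in Hz.
ROWS = (697, 770, 852, 941)
COLS = (1209, 1336, 1477, 1633)
KEYS = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key):
    """Return the (row, column) frequency pair for a single DTMF key."""
    if len(key) != 1:
        raise ValueError(f"not a DTMF key: {key!r}")
    for r, row_keys in enumerate(KEYS):
        c = row_keys.find(key)
        if c >= 0:
            return ROWS[r], COLS[c]
    raise ValueError(f"not a DTMF key: {key!r}")


def regenerate_tone(key, duration_s=0.02, sample_rate=8000):
    """Synthesize the two-frequency tone at the receiver (assumed parameters)."""
    f_row, f_col = dtmf_frequencies(key)
    n = round(duration_s * sample_rate)
    return [0.5 * math.sin(2 * math.pi * f_row * t / sample_rate)
            + 0.5 * math.sin(2 * math.pi * f_col * t / sample_rate)
            for t in range(n)]
```

Under these assumptions only the key identity (4 bits for 16 keys) has to travel in the bitstream; the waveform itself is produced at the receiving end.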

[0007]

EMBODIMENTS OF THE INVENTION

FIG. 1 shows a block diagram of a typical vocoder. Vocoder 10 receives digitized speech at input 12. The digitized speech is an analog speech signal that has passed through an A/D converter and is typically divided into frames on the order of 20 ms each. The signal at input 12 is passed to encoder section 14, which encodes the speech to reduce the bandwidth used for its transmission. The encoded speech is made available at output 16.
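
The framing step can be sketched as follows. The 20 ms frame length comes from the text; the 8 kHz sampling rate (giving 160 samples per frame), the zero-padding of a trailing partial frame, and the function name `split_into_frames` are assumptions for illustration.

```python
def split_into_frames(samples, sample_rate=8000, frame_ms=20):
    """Split a sample sequence into consecutive fixed-length frames.

    A trailing partial frame is zero-padded to full length.
    """
    frame_len = sample_rate * frame_ms // 1000
    frames = []
    for start in range(0, len(samples), frame_len):
        frame = list(samples[start:start + frame_len])
        frame += [0] * (frame_len - len(frame))
        frames.append(frame)
    return frames
```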

[0008] The encoded speech is received by the decoder section of a similar vocoder at the other end of the communication channel. The decoder at the other end of the communication channel is similar or identical to the decoder section of vocoder 10. Encoded speech is received by vocoder 10 via input 18 and passed to decoder section 20. Decoder section 20 uses the encoded signal received from the transmitting vocoder to produce digitized speech at output 22.

[0009] Vocoders are well known in the communications art. They are described, for example, in Speech and Audio Coding for Wireless and Network Applications, edited by Bishnu S. Atal, Vladimir Cuperman, and Allen Gersho, 1993, Kluwer Academic Publishers. Vocoders are widely available and are manufactured by companies such as Qualcomm Incorporated and Lucent Technologies Inc.

[0010] FIG. 2 shows the main functions of encoder 14 of vocoder 10. The digitized speech signal is received at input 12 and passed to linear prediction coder 40. The linear prediction coder performs a linear prediction analysis of the input speech once per frame. Linear prediction analysis is well known in the art and produces a linear predictive synthesis model of the vocal tract based on the input speech signal. The linear prediction parameters, or coefficients, that describe this model are sent via output 16 as part of the encoded speech signal.
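
The patent treats linear prediction analysis as well known and does not prescribe an algorithm. One common textbook approach, shown here purely as a sketch, is the autocorrelation method solved with the Levinson-Durbin recursion. The function names and the sign convention (prediction `s_hat[n] = sum_k a[k] * s[n - k]`) are assumptions of this example.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    return [sum(frame[n] * frame[n - k] for n in range(k, len(frame)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients a[1..order].

    Convention (assumed here): s_hat[n] = sum_k a[k] * s[n - k].
    Returns (coefficients, final prediction error energy).
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate frame (e.g. all zeros); keep what we have
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err
```

For a decaying exponential `0.9 ** n`, a second-order analysis with this sketch returns a first coefficient close to 0.9 and a second coefficient close to zero, as expected for a first-order process.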

[0011] Coder 40 uses this model to produce a residual speech signal, which represents the excitation the model would need in order to reproduce the input speech signal. This residual speech signal is made available at output 42. The residual speech from output 42 is supplied to input 48 of open-loop pitch search unit 50, to the input of adaptive codebook unit 72, and to fixed codebook unit 82. Impulse response unit 60 receives the linear prediction parameters from the linear prediction coder and generates the impulse response of the model produced by coder 40. This impulse response is used by the adaptive and fixed codebook units.
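
The relationship between the residual, the synthesis model, and its impulse response can be sketched as a pair of inverse filters. This is an illustration under assumptions: the function names are hypothetical, the predictor convention is `s_hat[n] = sum_k a[k] * s[n - k]`, and samples before the start of the sequence are taken as zero.

```python
def lp_residual(speech, coeffs):
    """Inverse-filter speech: e[n] = s[n] - sum_k a[k] * s[n - k]."""
    residual = []
    for n, s in enumerate(speech):
        pred = sum(a * speech[n - k]
                   for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        residual.append(s - pred)
    return residual


def synthesize(excitation, coeffs):
    """Synthesis filter: s[n] = e[n] + sum_k a[k] * s[n - k]."""
    out = []
    for n, e in enumerate(excitation):
        pred = sum(a * out[n - k]
                   for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(e + pred)
    return out


def impulse_response(coeffs, length):
    """Response of the synthesis filter to a unit impulse."""
    return synthesize([1.0] + [0.0] * (length - 1), coeffs)
```

Feeding the residual back through `synthesize` with the same coefficients reproduces the original samples, which is the sense in which the residual is the excitation the model needs.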

[0012] Open-loop pitch search unit 50 uses the residual speech signal from coder 40 to model its pitch and provides, at output 52, the pitch, that is, what is commonly called the pitch period or pitch delay signal. The pitch delay signal from output 52 and the impulse response signal from output 64 of impulse response unit 60 are received at input 70 of adaptive codebook unit 72. Adaptive codebook unit 72 produces a pitch gain output and a pitch index output, which become part of the encoded speech output 16 of vocoder 10. Output 74 of adaptive codebook unit 72 also supplies the pitch gain and pitch index signals to input 80 of fixed codebook unit 82. In addition, adaptive codebook unit 72 supplies an excitation signal and an adaptive codebook target signal to input 80.
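
The patent does not say how the open-loop pitch search is carried out. A common approach, sketched here as an assumption, picks the lag that maximizes a normalized correlation of the residual with a delayed copy of itself. The function name and the lag range of 20 to 143 samples (a typical choice for 8 kHz speech coders) are not taken from the patent.

```python
def open_loop_pitch(residual, min_lag=20, max_lag=143):
    """Return the lag with the highest normalized correlation score."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        num = sum(residual[n] * residual[n - lag]
                  for n in range(lag, len(residual)))
        den = sum(residual[n - lag] ** 2
                  for n in range(lag, len(residual)))
        if den > 0.0:
            # Only positive correlation counts as evidence of periodicity.
            score = num * num / den if num > 0.0 else 0.0
            if score > best_score:
                best_lag, best_score = lag, score
    return best_lag
```

For a pulse train with one pulse every 40 samples, this sketch returns a pitch delay of 40.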

[0013] Adaptive codebook unit 72 produces its outputs using the digitized speech signal from input 12 and the residual speech signal produced by linear prediction coder 40. It uses these two signals to form the adaptive codebook target signal. The adaptive codebook target signal is used as an input to fixed codebook unit 82 and as an input to the computation, within adaptive codebook unit 72, that produces the pitch gain, pitch index, and excitation outputs. The adaptive codebook target signal, the pitch delay signal from open-loop pitch search unit 50, and the impulse response from impulse response unit 60 are used to produce the pitch index, pitch gain, and excitation signals, which are passed to fixed codebook unit 82. The manner in which these signals are computed is well known in the vocoder art.

[0014] Fixed codebook unit 82 uses the inputs received at input 80 to produce a fixed gain output and a fixed index output, which are used as part of the encoded speech at output 16. The fixed codebook unit attempts to model the stochastic part of the residual speech signal from linear prediction coder 40. The target for the fixed codebook search is formed by determining the error between the current adaptive codebook target signal and the residual speech signal. The fixed codebook search produces the fixed gain and fixed index signals for the excitation pulses that minimize this error. The manner in which the fixed gain and fixed index signals are calculated using the outputs of adaptive codebook unit 72 is well known in the vocoder art.
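
The error-minimizing search can be illustrated with a deliberately simplified sketch: for each candidate code vector, the gain that minimizes the squared error against the target is the ratio of correlation to energy, and the best entry is the one that removes the most error. This omits the impulse-response filtering and perceptual weighting that practical CELP coders apply, and the function name and toy codebook layout are assumptions, not the patent's method.

```python
def search_fixed_codebook(target, codebook):
    """Return (index, gain) minimizing sum((target - gain * code) ** 2)."""
    best_index, best_gain, best_metric = 0, 0.0, -1.0
    for index, code in enumerate(codebook):
        corr = sum(t * c for t, c in zip(target, code))
        energy = sum(c * c for c in code)
        if energy == 0.0:
            continue  # an all-zero entry cannot reduce the error
        metric = corr * corr / energy  # error energy removed by this entry
        if metric > best_metric:
            best_index, best_gain, best_metric = index, corr / energy, metric
    return best_index, best_gain
```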

[0015] Switches 90 and 92 are used to send data in place of the bits normally used to send the fixed codebook output and the adaptive codebook output, respectively. When the contacts of switches 90 and 92 are in position A, the corresponding codebook output is replaced with data or other information, and the corresponding codebook gain is set to zero or nearly zero. As a result, the scaled codebook output, or excitation, produced at the receiver is zero or nearly zero, so the filter that the receiving vocoder uses to model the normally transmitted speech is not adversely affected.
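
The effect of switches 90 and 92 on one encoded frame can be sketched as follows. The `EncodedFrame` layout, the field widths (`fixed_bits`, `pitch_bits`), and the function name `embed_data` are hypothetical; the patent does not define a bit allocation.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EncodedFrame:
    """Hypothetical container for one frame of encoder output."""
    lpc_params: tuple
    pitch_index: int
    pitch_gain: float
    fixed_index: int
    fixed_gain: float


def embed_data(frame, fixed_payload=None, pitch_payload=None,
               fixed_bits=16, pitch_bits=8):
    """Overwrite index fields with payload bits and zero the matching gains."""
    if fixed_payload is not None:            # switch 90 in position A
        if not 0 <= fixed_payload < (1 << fixed_bits):
            raise ValueError("payload does not fit the fixed index field")
        frame = replace(frame, fixed_index=fixed_payload, fixed_gain=0.0)
    if pitch_payload is not None:            # switch 92 in position A
        if not 0 <= pitch_payload < (1 << pitch_bits):
            raise ValueError("payload does not fit the pitch index field")
        frame = replace(frame, pitch_index=pitch_payload, pitch_gain=0.0)
    return frame
```

Leaving a payload argument as `None` corresponds to the switch staying in its normal position, so that codebook's index and gain pass through unchanged.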

[0016] FIG. 3 shows a functional block diagram of decoder section 20 of vocoder 10. The encoded speech signal is received at input 18 of decoder section 20 and passed to decoder 100. Decoder 100 produces the fixed and adaptive code vectors corresponding to the fixed index and pitch index signals, respectively. These code vectors, together with the pitch gain and fixed gain signals, are passed to the excitation construction portion of unit 110. The pitch gain signal is used to scale the adaptive vector obtained from the pitch index signal, and the fixed gain signal is used to scale the fixed vector obtained from the fixed index signal.

[0017] Decoder 100 passes the linear prediction coder parameters to the filter, or model synthesis, portion of unit 110. Unit 110 then uses the scaled vectors to excite the filter synthesized from the linear prediction coefficients produced by the linear prediction coder, producing an output signal that represents the digitized speech originally received at input 12. Optionally, post filter 120 can be used to shape the spectrum of the digitized speech signal produced at output 22.

[0018] When data rather than speech information is transmitted, the pitch index (adaptive codebook output) and/or the fixed index (fixed codebook output) is used to receive the data. The data has no effect on the filter synthesis performed by unit 110, because the gain value corresponding to the pitch index or fixed index is zero.
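
The receiving side can be sketched in the same spirit. In this illustration a zero gain is taken to mean that the matching index field carries data; that signalling rule, the function name `decode_frame`, and the toy lookup function are assumptions, since the text only states that a zero gain removes the field's effect on synthesis.

```python
def decode_frame(pitch_index, pitch_gain, fixed_index, fixed_gain,
                 adaptive_lookup, fixed_lookup):
    """Return (excitation, data) for one frame.

    `data` holds the index fields whose gain is zero; those fields
    contribute nothing to the excitation, whatever bits they carry.
    """
    adaptive_vec = adaptive_lookup(pitch_index)
    fixed_vec = fixed_lookup(fixed_index)
    excitation = [pitch_gain * a + fixed_gain * c
                  for a, c in zip(adaptive_vec, fixed_vec)]
    data = {}
    if pitch_gain == 0.0:
        data["pitch_index"] = pitch_index
    if fixed_gain == 0.0:
        data["fixed_index"] = fixed_index
    return excitation, data


def toy_lookup(index):
    """Hypothetical 4-sample code vector derived from the index bits."""
    return [float((index >> b) & 1) for b in range(4)]


excitation, data = decode_frame(5, 0.5, 0b1010, 0.0, toy_lookup, toy_lookup)
```

Because the fixed gain is zero in this call, changing `fixed_index` to any other value leaves `excitation` unchanged, which is why a receiver unaware of the scheme still synthesizes the same output.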

[0019] Such functional block diagrams can be implemented in many forms. Each block may be implemented individually using a microprocessor or microcomputer, or all blocks may be implemented on a single microprocessor or microcomputer. Some or all of the functional blocks can also be implemented using programmable digital signal processing devices or special-purpose devices available from the aforementioned companies and other semiconductor companies.

[Brief description of the drawings]

FIG. 1 is a block diagram of a typical vocoder.

FIG. 2 is a diagram showing the main functions of encoder 14 of vocoder 10.

FIG. 3 is a functional block diagram of decoder section 20 of vocoder 10.

[Explanation of symbols]

10 vocoder; 12 speech signal; 14 encoder; 20 decoder section; 40 linear prediction coder; 50 open-loop pitch search unit; 60 impulse response unit; 72 adaptive codebook unit; 82 fixed codebook unit; 100 decoder; 110 unit; 120 post filter

Continuation of the front page
(71) Applicant: 596077259, 600 Mountain Avenue, Murray Hill, New Jersey 07974-0636, U.S.A.
(72) Inventor: Stephen A. Benno, 53 Princeton Avenue, Dover, New Jersey 07801, United States
(72) Inventor: Michael Charles Ricchion, 565 Passaic Avenue, Nutley, New Jersey 07110, United States

Claims

[Claims]

1. A method of transmitting non-voice information over a voice channel, comprising the steps of: (A) transmitting non-voice information in place of pitch index information; and (B) transmitting a pitch gain value of substantially zero.

2. The method according to claim 1, wherein the non-voice information is DTMF tone information.

3. The method according to claim 1, wherein the non-speech information is information of a text telephone for the hearing impaired.

4. A method of transmitting non-voice information over a voice channel, comprising the steps of: (A) transmitting first non-voice information in place of fixed index information; and (B) transmitting an index gain value of substantially zero.

5. The method according to claim 4, further comprising the steps of: (C) transmitting second non-voice information in place of pitch index information; and (D) transmitting a pitch gain value of substantially zero.