JPH0467200B2

Info

Publication number: JPH0467200B2
Application number: JP58010647A
Authority: JP
Inventors: James Loton Flanagan; James David Johnston
Original assignee: AT&T Technologies Inc
Current assignee: AT&T Corp
Priority date: 1983-01-27
Filing date: 1983-01-27
Publication date: 1992-10-27
Also published as: JPS58217998A

Description

【発明の詳細な説明】技術の分野本発明はスピーチの処理に関し、特にスピーチ
信号符号化装置に関する。発明の背景公知の様に、通常のスピーチには伝達される知
能の一部を構成するための無声期間を含んでい
る。無声期間は文章間、節間、語間、ならびに語
内で生ずるものである。元来、無声期間はスピー
チの自然的属性であるとして聴取者に受容れられ
るものである。しかしながら、無声はスピーチパ
ターンの重要な部分を表わすものである。通信チ
ヤネル上で伝送するため、あるいは記憶装置内へ
記憶するためにスピーチ信号が符号化されている
場合には、無声部分に相当する符号列は、他の目
的に使用する様に除去されることができるか、あ
るいは他の目的に使用することができる様な符号
化信号の断片を占めるものである。メツセージを
理解するためには、スピーチパターンにおける適
切な位置に無声信号を再生する必要があることは
勿論である。しかし、記憶されたデジタル信号、
あるいは伝達されたデジタル信号がはるかに簡便
に作れる様に、無声のための符号化を簡易化する
ことができる。この様にして、通信システムの効
率が向上している。デジタル的にスピーチ信号を符号化する場合に
は、多くの方法が可能である。標本化されたスピ
ーチを直接二進形に変換することはパルス符号変
調により実施できる。変換過程は各スピーチ標本
を個別の各レベルの組に量子化するための過程
と、選択されている量子化レベルを符号化するた
めの過程とを含むものである。適応形のパルス変
調においては、スピーチ信号標本の量子化は入力
信号レベルに適応しているが、この形のパルス変
調ではデジタル標本あたりのビツト数がきわめて
少なく、性能が改善されている。斯かる適応形の
装置においては、量子化装置におけるステツプサ
イズが入力信号の統計的特性に合致するように変
化させることができる。差動形パルス変調系は符
号化効率を改善するために入力信号それ自身を標
本化するよりも、入力信号標本間の差を符号化す
るものである。しかしながら、これらの符号化装
置では入力信号の活性化部分と、非活性化部分と
を正確に区別するものではない。結果的には、信
号の無声期間は知能の伝送効率に逆の影響を与え
る。米国特許第4280192号は、アナログ情報をデジ
タル的に記憶するための空間を最小化するための
構成を開示したものであり、スピーチのようなア
ナログ入力信号を連続的に傾斜率の変化する形式
のデルタ変調（ADM）符号の流れに変換するも
のである。肉声動作形スイツチは、アナログ信号
があらかじめ定められたレベル以下に下降した時
に休止期間の開始を検出するものである。スピー
チ信号があらかじめ定められたレベル以上に上昇
した時に動作を停止するためのカウンタにより休
止時間が決定される。タイミング情報を含む特殊
な休止符号は、そこでデジタル符号流に挿入され
る。肉声動作形スイツチとタイミング装置との組
合せは、アナログ信号の休止期間における繰返し
符号を除去するためのものである。公知のように肉声動作形スイツチは句節間でク
リツピングが起らないように低速で動作するよう
に設計され、短かい無声期間の検出を禁止するた
めの装置を含むものである。その結果、米国特許
第4280192号に記載された系では、データ流の適
切な位置に休止符号が挿入されるのを容易にする
ための補償形遅延を活用している。しかし、肉声
動作形スイツチには本質的に遅延が存在するた
め、言語の速度より高い速度で起る短期間の無声
期間を検出することは困難である。米国特許第4053712号は適応形のデジタル符号
化装置ならびに解読装置を開示したものであり、
全体のビツト速度を減ずるために、特殊な符号化
ビツトパターンをCVSD出力ビツト流のアイドル
パターンに置換している。CVSD出力をアナログ
信号に変換すると共にアナログ信号を固定振幅の
閾値と比較することにより、アイドルパターンの
検出が行われる。無声を検出するためのアナログ
スピーチ信号の再生成には、符号化の原価を増加
させることになる解読装置を追加しなければなら
ない。直接的に振幅閾値が存在すると、無声検出
における速度上の制限を比較的受け難い構成であ
る反面、ノイズに対する感度が増加する。結果的
には、無声状態と発音とを区別することは困難で
あり、装置のスピーチクリツピングに対する許容
性が大きくなつている。本発明の目的は、速度に制限を加えることなく
無声期間を経済的に除去するための改良形デジタ
ル符号化装置を提供することにある。発明の概要従来技術の無声除去方式の問題点は、無声期間
を検出するための適応形のパルス変調法において
すでに生成されているステツプサイズ関係信号を
使用することにより、本発明によつて解決するこ
とができる。これらのステツプサイズ関係信号は
印加されたスピーチ信号のエネルギ内容を表わす
ものであるが、肉声動作形スイツチに存在する遅
延装置により影響されることはない。その結果、
アナログ信号処理過程が除去されるが、無声期間
の検出があらかじめ定められた速度には制限され
ず、無声期間符号を適当に配置するための装置は
必要がなくなる。本発明はスピーチパターンを適応形のデジタル
符号列に変換するためのスピーチ処理装置を指向
したものである。これは、パターンの無声期間を
検出し、各無声期間を表わすデジタル符号を発生
させるものである。スピーチパターンを表わすデ
ジタル信号を形成するためには、適応形のデジタ
ル符号と無声期間符号とを組合せる。パターンを
適応形のデジタル符号へ変換する過程には、各デ
ジタル符号に対する適応ステツプサイズに相当し
た信号を形成する過程を含む。そこで、第１およ
び第２の閾値信号が生成される。無声期間の検出
には適応ステツプサイズ相当信号が第１の閾値よ
り小さく消滅した時に無声期間信号を発生させる
ため過程と、無声期間に第２の閾値より大きく適
応ステツプサイズ関係信号が増加した時に無声期
間信号を終端するための過程とを含む。詳細な説明第１図は本発明によるデジタル式音声通信装置
を示す図である。第１図を参照すれば、スピーチ
パターンをマイクロホン１０１に加え、マイクロホ
ン１０１から得られたスピーチ信号を低域濾波器
と標本化回路１０３に加えてある。回路１０３に
よれば、公知の方法により印加したスピーチ信号
を低域濾波すると共に、クロツク信号発生器１０
７によりあらかじめ与えられた速度で信号を標本
化することができる。低域濾波器のカツトオフ周
波数を例えば3.2kHzとすると、標本化の速度は、
例えば、8kHzとすることができる。回路１０３
からのスピーチ標本の例はスピーチ信号の波形を
表わすものである。アナログ・デジタル変換器１０５は回路１０３
から連続するスピーチ標本を受信し、各スピーチ
標本をスピーチ標本の振幅に相当する値を有する
デジタル式符号化信号に変換する。変換器１０５
からのデジタル符号化信号は適応形のエンコーダ
１１０の入力へ印加される。従来技術から公知の
ように、適応形のエンコーダは変換器１０５から
のデジタル信号をＳ／Ｎ比が良好な、従来よりも
効率よく符号化された信号に変換することができ
る。第１図の回路に使用することができる適応形の
エンコーダを、第２図の論理図に示す。第２図は
適応形の差動パルス符号変調（adaptive
differential pulse code modulation：ADPCM）
用符号化装置を示す図であるが、他の形式の適応
形の符号変調器も使用できるものと理解すべきで
ある。差動パルス符号変調においては、各標本
Xnと過去の標本に基づいた前記標本（Xn）の予測
値との間の相違は、伝送のために量子化し符号化
することにある。量子化装置のレベル数によれ
ば、スピーチ信号に対する段階波近似信号が生成
される。差動符号化においては、冗長度の信号に
対してとりすぎない様にして、従来のPCM符号
化よりも標本化あたり２ビツトだけビツト速度が
低くてすむ効果が得られるようにすることができ
る。一般に、差動符号化には固定されたステツプサ
イズを有する量子化装置を使用している。適応形
の装置には、符号化装置からのデジタル出力を監
視するための装置を具備している。出力における
差動信号振幅に応答できるように、量子化装置で
は実効的なステツプサイズを変えることができ
る。この方法においては、入力信号の量子化は最
適化されている。第２図を参照すれば、アナログからデジタルへ
の変換器回路１０５のデジタル信号Xnの列は加
算器２０１のひとつの入力端子に加えられてい
る。加算器２０１の他の入力では前の信号列
Xn_-1、Xn_-2、…に基づいて現在のデジタル信号
Xnの予測値を受信する。現在の信号Xnと予測信
号Xnとの間の相違は加算器２０１の出力に現れ、
これは量子化装置２０３に供給される。量子化装
置は、加算器２０１からの差動信号をあらかじめ
設定された量子化装置レベルの組と比較し、最近
接の量子化装置レベルに相当する信号を発生す
る。量子化装置２０３の出力に現れている量子化
差信号はエンコーダ２０７に加えられ、エンコー
ダ２０７は供給された量子化信号レベルに相当し
てデジタル符号Cnを形成するように適応した動
作をする。エンコーダ２０７の出力は、スピーチ信号標本
とその予測値との間の量子化された差を表わすデ
ジタル符号の列である。量子化装置２０３の出力
における量子化された差信号も加算器２０９のひ
とつの入力端子に加えられ、これは予測装置２０
５から得られた予測標本値を使つて加算器２０９
において加算される。加算器回路２０９からの和
信号は予測装置２０５の入力端子に供給される。
その結果、予測装置２０５の出力は次のスピーチ
標本信号Xn₊₁の予測値を表わすように更新され
る。ステツプサイズ発生器回路２１０はエンコーダ
出力符号Cnを受信する。また、Cnの相対振幅に
従つて量子化装置２０３におけるレベルを調整す
るために、上記符号Cnに対して応答可能なステ
ツプサイズ発生回路２１０はステツプサイズ信号
Δnを標本化する。Cnが大きい場合には、ステツ
プサイズ信号Δnは量子化装置２０３におけるス
テツプサイズを伸張するために効果的であり、こ
れによつて量子化装置は高振幅信号と適合性がよ
い。Cnが小さな値の場合には、量子化装置が低
振幅の信号に適合するようにステツプサイズを一
致させる。この方法においては、量子化装置は入
力信号を期待したレベルに適合せしめるためのも
のである。適応形のエンコーダと差動形のエンコ
ーダとを組立てて使用することは、エル・アー
ル・ラビナ（L.R.Rabiner）、アール・ダブリユ
ウ・シエフア（R.W.Schafer）共著の、「スピー
チ信号のデジタル処理（“Digital Processing of
Speech Signals”）」（1978年、プレンテイスホー
ル（Prentice Hall）社発刊、版権所有：ベル電
話研究所）において説明されている。次に、適応形のエンコーダ１１０はモートロー
ラ社から1980年に出された“MC68000の設計モ
ジユールユーザガイド”（MC68000 Design
Module User's Guide、Motorola Inc.、1980）
に記載されているモートローラ社の68000形マイ
クロプロセツサから成立つ。このマイクロプロセ
ツサは読取り専用記憶装置に記憶された、あらか
じめ定められた一連の命令に従つて動作するもの
である。付録Ａはフオートラン言語
（FORTRAN）においてＡ／Ｄ変換器１０５から
のADPCMエンコード信号に必要とされる永久記
憶形命令の一覧表を示したものである。第２図においてステツプサイズ発生器２１０は
$$d_n = \beta d_{n-1} + m(C_n) \qquad (1)$$
により動作するものであり、上式はエンコーダ２
０７の最後の出力（C_{n-1}）に対して応答可能なス
テツプサイズの対数に相当した信号d_nを形成する
ためのものである。βは誤差の消滅に関係した定
数であり、例えば
$$\beta = 1 - 2^{-6} \qquad (2)$$
である。d_{n-1}は前回の対数ステツプサイ
あり、ｍは信号のダイナミツクレンジの期待値と
量子化装置のレベルの数とに関係してステツプサ
イズの大きさを調整するための倍率である。第２図において、デジタル符号化信号Cnは倍
率発生器２１１のアドレス入力端子に加えられ、
倍率発生器２１１はそのアドレス入力に加えられ
たそれぞれのデジタル符号入力Cnに対して、あ
らかじめ割当てられた出力mCnが得られるよう
に従来の公知の方法により採用されているプログ
ラムマブル読取り専用記憶装置（PROM）から
成立つ。ｍは
$$m = \log_Q M \qquad (3)$$
である。ここで、Ｍは最低の４レベルに対しては0.85、第５および第６のレベルに対しては1.2、第７のレベルに対しては1.6、第８のレベルに対しては2.4であり、
$$Q = D^{1/S}$$
である。Ｄは入力信号Xnのダイナミツクレンジ
であり、Ｓはステツプサイズの数である。遅延レジスタ２１５は前回の対数ステツプサイ
ズ信号d_o-1を保持するものである。ｎ番目の入力
に対するクロツク信号発生器１０７からのクロツ
ク信号CLTに応答して、信号d_o-1はシフタ２１７
に供給されると共に、減算器２１９のひとつの入
力端子に供給される。シフタ２１７は符号化信号
d_o-1を右へ６桁だけシフトして信号2^-6d_o-1を形
成し、符号化信号ビツトを再割当てするための配
線構成をとることができる。減算器２１９は信号d_o-1をレジスタ２１５から
受取り、さらに信号2^-6d_o-1をシフタ２１７から
受取つて、差動信号（１−2^-6）d_o-1を生成する様
に動作するものである。減算器２１９の出力は加
算器２１２において信号mC_oに加算され、得られ
たd_o信号はクロツクパルスCLTによりレジスタ
２１５に置数される。レジスタ２１５における信
号d_oは、量子化装置２０３に供給された符号化装
置のステツプサイズΔnの対数を表わすものであ
る。Δn信号を形成するために、デジタル符号d_o
はステツプサイズ信号形成装置２２１のアドレス
入力端子に印加される。形成装置２２１はプログ
ラマブル読取り専用記憶装置（PROM）であり、
PROMにd_oとΔnとの関係の一覧表が記憶されて
いる。それぞれのd_oのアドレス入力に対して、相
当するΔnのステツプサイズ信号はPROMから出
力される。対数ステツプサイズ信号d_oはスピーチ標本Xn
の列のエネルギを表わすものであり、スピーチ信
号における無声期間を決定するために使用するこ
とができる。肉声動作スイツチと他のスピーチ現
存信号検出器とに対比して、ノイズまたはスピー
チクリツピングが得られた符号化スピーチ信号に
混入されることなく、句節の速度よりもはるかに
速い速度で対数ステツプサイズ信号が変化する。
その結果、対数ステツプサイズ信号の変化に応答
可能な無声期間の検出は、事実上速度上の制限を
受けることなく本発明により実施される。無声期間の検出は、第３図にさらに詳細に示す
無声検出器１１５において実施されている。第３
図を参照して、適応形のエンコーダ１１０からの
信号d_oは振幅比較器３０１のａ入力端子と振幅比
較器３０５のｃ入力端子とに加えられる。d_oの信
号が閾値レベル信号TH１より小さな値に消滅し
た時には比較器３０５がイネーブルされる。比較
器３０５からのイネーブルされた出力は、ORゲ
ート３１５を介してフリツプフロツプ３２０に加
えられ、このフリツプフロツプをセツトする。セ
ツトされると、フリツプフロツプ３２０は無声期
間が開始したことを示すためのイネーブル化SF
信号を出力する。比較器３０１の出力はANDゲ
ート３１０に加えられているが、信号SFはAND
ゲート３１０に警告を与える。比較器３０１のＢ
入力端子に印加された閾値信号TH２はスピーチ
のオンセツトレベルに相当するものである。信号
d_oが閾値信号TH２より大きい時には、比較器３
０１がイネーブルされる。無声期間においては、
対数のステツプサイズ信号d_oが閾値TH２よりも
大きく増加する時に限つて、フリツプフロツプ３
２０はANDゲート３１０とORゲート３１５とを
介してリセツトされる。入力スピーチ信号のダイナミツクレンジが第１
図におけるマイクロホン１０１とフイルタ付き標
本化回路１０３との間に接続された信号圧縮装置
によりプリセツトされている場合には、閾値信号
TH１，TH２は固定電圧レベルとすることがで
きる。しかしながら、斯かる圧縮装置はスピーチ
信号を警告として送出し、話し手の声の特性が不
自然になる。その結果、再生したスピーチパター
ンはその話し手のようにはきこえない可能性があ
る。本発明によれば、通常採用されているような
スピーチ信号圧縮装置は、第１図の適応形の閾値
発生器１１２の使用により除去されている。適応
形の閾値発生器は閾値信号TH１，TH２を変更
するものである。閾値信号TH１，TH２は対数
ステツプサイズ信号により変化することが可能で
あり、これにより話し手の声の特性は無声除去に
必要な装置により変化することはない。適応形の閾値発生器１１２を第４図にさらに詳
細に示す。第４図を参照して、レベル信号発生器
４０１は、第１図の回路に入力された能動スピー
チ信号と合致性のある最低対数ステツプサイズ信
号の期待値に相当するプリセツト制限信号Ｌと、
d_naxと無声閾値との間で通常期待されている差の
量に相当するプリセツトレベル信号HW１，HW
２とを生成するものである。一般にd_naxは適応形
のエンコーダ１１０からのd_o信号列の最大値であ
る。レジスタ４２７は最初Ｌにセツトされ、標本
X_o-1までの対数ステツプサイズ信号の最大値を
記憶する。レジスタ４２７からのd_nax信号は、振
幅比較器４０３において、現在の対数ステツプサ
イズ信号d_oと比較される。d_oがd_naxよりも大きい
場合には、比較器はイネーブルされる。比較器４
０３からのイネーブル信号は三値スイツチ４０９
を開放せしめると共に三値スイツチ４０５を閉塞
せしめ、これによつてd_naxよりも大きいd_oを表わ
す信号が引算器４１５のａ入力端子とシフタ４１
２とに供給される。信号d_naxが信号d_oよりも大き
い場合には、比較器４０３はデイスエーブルされ
たままで、レジスタ４２７からのd_nax信号は三値
スイツチ４０９を介して通過し、減算器４１５の
ａ入力とシフタ４１２の入力とに供給される。シフタ４１２は入力を10桁だけ右へシフトし、
減算器４１５は信号 d_nax（１−2^-10）を比較器４１８のｂ入力と三ウエイスイツチ４２
５の入力とに与える。比較器４１８に加えられた
Ｌ制限信号が減算器４１５の出力よりも大きい場
合には、比較器４１８がイネーブルされる。その
場合には、三値スイツチ４２０がターンオンし、
制限信号Ｌはレジスタ４２７に置数される。減算
器４１５の出力が制限信号Ｌより小さい場合に
は、比較器４１８はデイスエーブルされたままで
ある。三値スイツチ４２５がターンオンし、レジ
スタ４２７の内容は対数ステツプサイズ信号d_o以
下で、これを含んでいる最大値の対数ステツプサ
イズ信号を受信する。レジスタ４２７からの信号d_naxは減算器４３０
に供給され、減算器４３０は信号d_nax−HW１を
形成するために動作する。この閾値レベルは最大
対数ステツプサイズ信号d_naxによつて変化し、無
声閾値は適応形の変化をする。減算器４４０は信
号TH２＝d_nax−HW２を形成し、スピーチオン
セツト閾値は最大対数ステツプサイズ信号により
適応形の変化をすることができる。このようにし
て、スピーチ信号の無声期間は話し手の特性を変
えることなく検出される。無声検出器１１５のSF出力は、第５図にさら
に詳細に示すように無声カウンタ１２０に供給さ
れる。第５図を参照して、SF信号はANDゲート
５０５のひとつの入力端子に加えられると共に、
インバータ５０７の入力端子にも加えられてい
る。通常のスピーチの期間には信号SFがデイス
エーブルされ、インバータ５０７の出力はカウン
タ５１０をその零位置にプリセツトする。無声期
間の開始に際して信号SFがイネーブルされ、符
号クロツク信号CLTはANDゲート５０５を
通つてカウンタ５１０に入力される。無声期間の
終端が検出されるまではカウンタ５１０の内容が
増分する。ラツチ５１５は無声期間の終端におい
てイネーブルされ、これにより無声期間カウント
信号はカウンタ５１０からラツチ５１５へ転送さ
れる。ラツチ５１５からの無声期間カウント信号
SCTは符号処理装置１２５の入力に供給され、
符号処理装置１２５は適応形のエンコーダ１１０
において生成したC_o符号と検出器１１５からの
SF信号とを共に受信する。符号処理装置１２５は出力符号C_oと無声期間
符号との組合せから成立つメツセージを形成し、
必要に応じて通信網１４０にメツセージを供給す
る様に適応化されている。処理装置１２５は上記
1980年にモートローラ社から出版された
MC68000の設計モジユールユーザガイド
（MC68000 Design Module User's Guide、
Motorola、Inc.、1980）に記載されているモー
トローラ社の68000形マイクロプロセツサから成
立つ。処理装置１２５の符号の組合せ読取り専用
記憶装置（ROM）に記憶され、固定された一連
の命令により実行される。これらの命令は付録Ｂ
におけるフオートラン言語（FORTRAN）によ
り表わされている。スピーチ信号から除去された無声期間は、無声
カウンタ１２０からの無声カウント信号SCTの
先に現れる特殊な無声符号SCにより表わされて
いる。第１図の回路において、無声符号は適応形
のエンコーダの最大振幅出力として選択されてい
る。４ビツトのADPCM符号構成に対して、最大
振幅出力の組合せは87₁₆である。この符号はその
生起確率が低いゆえに選択されている。無声期間
の偽の検出を避けるため、C_oに生起している最
大振幅出力符号を置換しなければならない。第１
図の符号変更回路は、96₁₆の組合せにより87の符
号の組合せを置換するように動作し、信号の歪を
最低にする。符号変更回路を第６図にさらに詳細に示す。第
６図を参照して、適応形のエンコーダ１１０の出
力は各クロツクパルスCLTにおいてレジスタ６
１０に供給される。レジスタ６１０におけるC_o
符号は通常、順次三値スイツチ６１５を介してレ
ジスタ６２５に転送され、レジスタ６２５におけ
るC_o-1符号は三値スイツチ６０１の出力に現れ
る。現在の一対の符号化装置の信号を表わしてい
るレジスタ６１０，６２５の出力は、比較器６３
５において符号の組合せ87₁₆と比較される。これ
らの符号は従来技術において公知のように、信号
発生器６３２により供給される。レジスタ６１
０，６２５における87₁₆の列の検出に際して、比
較器６３５の出力がイネーブルされ、インバータ
６３７の出力がデイスエーブルされる。デイスエ
ーブルされた信号CCに応答して、三値スイツチ
６０５，６２０が信号CCによりイネーブルされ
ている期間には三値スイツチ６０１，６１５はデ
イスエーブルされている。発生器６３２からの
9₁₆信号はそれによりレジスタ６２５に挿入され、
発生器６３２からの6₁₆信号は三値スイツチ６０
５を介してデータ流に挿入されている。符号変更
器１３０の出力はそこで必要に応じて変えること
ができる。第７図の流れ図は第１図において実行された動
作の列を表わし、第８図に示す波形は第１図の回
路における種々の点で、信号と符号とを図示した
ものである。マイクロホン１０１におけるスピー
チ信号の受信に先立ち、無声カウンタ１２０は零
状態にリセツトされる。無声検出器１１５からの
信号SFはデイスエーブルされ、閾値発生器１１
２において記憶されている最大対数ステツプサイ
ズ信号d_naxは零に設定される。これらの動作は指
標過程７０１に指示されている。波形８０１上で
時刻t₀と時刻t₁との間に表示されているスピーチ
信号は無声期間には相当しない。その結果、スピ
ーチ波形（波形８０５上に示されている）から得
られる対数ステツプサイズ信号d_oは波形８０９の
閾値信号TH１よりも大きい。無声検出器１１５
からの無声フラグ信号SFがリセツトされ、出力
符号C_oは符号変更器回路１３０を介して符号処
理装置１２５に供給される。これらの符号C₁，
C₂，…C_oは通常のスピーチに相当し、波形８１
３に示されている。第７図を参照すると、各符号クロツクパルス
CLTの生成に際して待合せ過程７０８が励起さ
れる。符号クロツクパルスCLTに応答し、適応
形のエンコーダ１１０は過程７１０に示されてい
るように、次の適応形のエンコーダ出力信号C_o
を形成する。対数ステツプサイズ信号d_oとステツ
プサイズ信号Δnとは、過程７１２により適応形
のエンコーダにおいて形成され、d_o信号とd_nax信
号とは決定過程７１５により適応形の閾値発生器
１１２において比較される。対数のステツプサイ
ズ信号d_oがd_naxよりも大きい場合には、d_nax信号
は現在のd_o信号（過程７１８）に置換される。そこで、d_o信号を無声検出器１１５において試
験し、d_o信号が閾値発生器１１２からの低い方の
閾値信号より小さいか、あるいは両者が等しいか
を決定する。（決定過程７２０参照。）第８図にお
ける時刻t₀と時刻t₁との間で、対数ステツプサイ
ズ信号（波形８０５）が適応形の閾値信号TH１
（波形８０９）よりも大きく、もし無声フラグ信
号SFがセツトされていれば決定過程７２３は決
定を開始する。時刻t₀と時刻t₁との間でSF信号は
イネーブルされていないため、この期間には、各
CLTクロツクパルスに対して過程７２５が入れ
られる。決定過程７２５により、符号変更器１３
０において現在の符号化信号C_oは無声符号SCと
して試験される。符号化信号C_oが逆無声符号SC
に等しい場合には、C_oは回路１３０の変更器論
理により変化する。（過程７２９参照。）さもなけ
れば、変更されていないC_o符号は、伝送または
記憶に際して符号処理装置１２５のなかに置数さ
れる。時刻t₁に到達した時には、符号化装置の動作が
変化する。符号化信号C_o，d_o，Δ_oは箱７１０，
７１２に示したようにして発生し、適応形の閾値
回路１１２に記憶されたd_nax信号とd_o信号とが比
較される。（過程７１５参照。）しかしながら、時
刻t₁における信号d_oの値は適応形の閾値信号TH
１より小さいため、無声フラグ設定過程７３５は
決定過程７２０を介して入れられている。信号
SFは無声検出器１１５においてイネーブルされ
ていて、無声カウンタ１２０の内容は過程７３８
により増分する。待ちの過程７０８は、次のスピ
ーチ信号符号に対してクロツクパルス信号CLTを
検出するために入れられている。時刻t₁とt₂との間では、対数ステツプサイズ信
号d_oの値は閾値信号TH２の値よりも小さい。結
果的には、無声カウンタ増分過程７３８は無声フ
ラグ設定過程７３５、あるいは決定過程７２３を
介して入れられているので、無声期間は過程７３
８に指示されているようにして時間を刻み続けて
いる。時刻t₂に到着した時には、対数ステツプサ
イズ信号d_oの値は決定過程７４０におけるスピー
チオンセツト閾値TH２の値よりも大きくなる。
無声フラグリセツト過程７４２は決定過程７２
０，７２３，７４０を含む経路を介して動作して
いる。そこで検出器１１５における無声期間信号
SFはリセツトされ、無声期間が終端する。無声
先頭信号（SC＝87₁₆）と無声カウント信号
（SCT）とは、信号SFの再セツテイングに応答可
能な符号処理装置１２５において形成されてい
る。そこで、無声カウンタの内容は過程７４６に
よつて零にクリアされる。無声符号試験と符号変
更回路１３０における過程７２５，７２９による
順序変更との後で、現在のC_o符号を符号処理装
置１２５に置数する。無声符号SCと無声カウントSCTとは、時刻t₂
と時刻t₃との間の波形８１３に示されているよう
に、符号処理装置１２５に記憶されているデータ
流のなかに置かれる。その結果、無声期間がこれ
以上検出されないため、符号変更装置１３０から
のC_o符号はデータ流に加えられている。無声と
無声カウント符号とを含み、処理装置１２５にお
いて取扱われるデータ流の符号は波形８１５に図
示してある。第１図の回路は肉声記憶系を構成し、肉声記憶
系において、回路網１４０は符号処理装置１２５
から受信された無声編集符号を記憶するための構
成を有するデジタル式処理装置である。波形８０
１のスピーチ信号に応答して、８１５に示すデジ
タル符号列を回路網１４０の処理装置に置数す
る。波形８１５に示すように、時刻t₁と時刻t₂と
の間の無声期間は無声カウントSCTの前に置か
れた無声符号SCに置換される。この様にして、
回路網１４０に対する記憶上の要求は事実上減ぜ
られる。回路網１４０からのデジタル符号は解読器１５
０に加えられ、無声期間を含み、元来マイクロホ
ン１０１に供給されているスピーチパターンのレ
プリカを形成するように解読器１５０は動作す
る。解読器１５０においては、適応形のデジタル
符号化信号は無声符号検出器と、カウンタ１５２
と、選択器回路１６０とに対して、フアーストイ
ン・フアーストアウト形のシフトレジスタ１５１
を介して加えられている。検出器１５２と選択器
１６０とに対してデジタル符号化信号列を加える
ための発生器１５３からのクロツクパルスCLR
に対してシフトレジスタ１５１は応答可能であ
り、これにより動作することができる。スピーチ
標本への変換のための適応形の解読器１６５に対
して直接、符号C_oを通過させるように選択器１
６０は通常動作させることができる。ジエー・ア
ール・ボデイら（J.R.Boddie et al.）により、ベ
ル電話研究所技術雑誌（Bell System Technical
Journal）の1981年９月号（第60巻第７号）に掲載
された「適応形差動パルス符号変調の符号化
（“Adaptive Differential Pulse Code
Modulation Coding”）」と題する論文において
記載されている形のADPCM解読器により解読器
１６５が構成されている。代りに、読取り専用記
憶装置に記載されている命令に従つて動作する上
記モートローラ社（Motorola）の68000形マイク
ロプロセツサを使用して解読器を構成することも
できる。フオートラン言語（FORTRAN）にお
ける符号化したADPCM信号を解読するために必
要な、永久に記憶されている命令の一覧表を付録
Ｃに示す。検出回路１５２における無声符号の検
出に際して、選択器１６０は符号発生器１５５を
適応形の解読器１６５に接続する。無声カウント
符号SCTに設定されている時間間隔に対して、
発生器１５５は無声期間と等価なC_o符号を生成
する。無声期間は無声カウンタ１５２によつてク
ロツクされて決定されている。この方法において
は、無声期間は符号流に再挿入されている。適応
形の解読器１６５からの標本化信号列は、元来、
無声期間を含んだ原符号化スピーチ波形に相当す
るものである。標本化スピーチ信号はD/A変換
器１７０と低域濾波器１７５とを介してアナログ
形式の信号に変換され、スピーチパターンが変換
器１８０において生成される。無声符号検出器とカウンタとの回路１５２を、
第９図にさらに詳細に示す。第９図を参照して、
回路網１４０から送出された適応形の符号化信号
列は、多段シフトレジスタ９０５の入力端子に加
えられている。レジスタ９０５からの符号は比較
器９１５の入力端子に供給され、比較器９１５に
おいては、それらの符号は符号発生器９２０にお
いて発生した87₁₆の無声符号と比較されている。
シフトレジスタ９０５からの符号87₁₆の検出に際
しては、比較器９１５がイネーブルされる。比較
器９１５からのイネーブルされた信号SLは、
ANDゲート９３０に警告を発して線路９２７上
の無声カウント符号に対してカウンタ９４０をプ
リセツトしている。イネーブルされた信号SLは
フリツプフロツプ９２５をセツトせしめ、フリツ
プフロツプ９２５からの信号SL１は線路１８１
を適応形の解読器１６５の入力から分離し、無声
符号発生器１５５を適応形の解読器の入力端子に
接続している。信号SL１もANDゲート１５６か
らの出力を禁止することができ、この場合には無
声カウントダウント期間を通してFIFOシフトレ
ジスタ１５１は符号化信号を与えることはない。
零カウントに到達するまでは、後続するクロツク
パルスCLRがANDゲート９３５を介して加えら
れ、カウンタ９４０の内容のカウントを減分す
る。その時には、カウンタ９４０の借り出力は無
声期間フリツプフロツプ９２５をリセツトし、信
号SL１はデイスエーブルされる。選択器１６０
は線路１８１を適応形の解読器１６５に接続し、
FIFOレジスタ１５１から送出された適応形の符
号流を解読器１６５に供給している。解読回路１５０の動作シーケンスは第１０図の
流れ図に示してあり、この動作に関連した波形は
第１１図に示してある。第１１図の波形１１０１
は回路網１４０から送出されて受信された適応形
の符号化信号列を示す。時刻t₀と時刻t₁との間
で、適応形のデジタル符号C₁〜C_oは順次、解読
器１５０に加えられている。符号C_oの後で、無
声カウント符号SCTの前に置かれた無声符号SC
は波形１１０１のデータ流に現れる。これらの２
つの符号はスピーチ信号における無声期間を表わ
している。無声期間に続いて、符号C_o+1から始ま
る適応形のデジタル符号列が現れる。第１０図において、解読器１５０のレジスタ
と、フリツプフロツプと、ラツチとは動作過程１
００１によつて初期にリセツトされる。過程１０
０７はクロツク待ち過程１００５を介して、次の
クロツクパルスCLRが送出される時に入れられ
る。第１図における時刻t₀と時刻t₁との間で、ス
ピーチ符号C₁，C₂…C_oが回路網１４０から受信
される。各受信パルスに対して、出力標本は動作
過程１００７によつて解読装置１６５において形
成される。決定過程１００９においては、フリツ
プフロツプ９２５からのSL１信号は検査される。
（決定過程１００９参照。）時刻t₀から時刻t₁まで
のスピーチ期間においては、信号SL１はセツトさ
れず、FIFO１５１からの次の入力符号が受信さ
れる。（過程１０２０参照。）入力符号が無声符号
ではないため、過程１０２９は決定過程１０２５
を介して入れられ、入力標本を解読している。時刻t₁に到達した時には、入力符号が無声符号
SCである。SC符号は比較器９１５において検出
され、過程１０３４は決定過程１０２５を介して
入れられている。過程１０３４によつてシフトレ
ジスタ９０５からの無声カウントＭ（波形１１０
７）が送出されるごとに、カウンタ９４０はロー
ドされる。無声符号発生器１５５は、イネーブル
された信号SL１（波形１１０５）により解読器
１６５に接続されている。次のクロツクパルスCLRの発生に際して、過程
１０４０は決定過程１００９を介して入れられ、
無声カウンタの内容は過程１０４０において減分
されている。無声カウンタの内容は時刻t₂に到る
まで零より大きい。結果的には、過程１０４２を
介して過程１０３６を実行させることができ、発
生器１５５からの無声符号列は解読器１６５に加
えられている。時刻t₂においては、無声カウント
は零に減分し（波形１１０７参照）、フリツプフ
ロツプ９２５は過程１０４２を介して過程１０４
４によりリセツトされる。時刻t₂の後では、入力符
号上の通常の動作は過程１００５，１００７，１
００９，１０２０，１０２５，１０２９を含む通
路を介して回復される。波形１１０９に示したよ
うな解読器１６５に対する入力は、時刻t₁と時刻
t₂との間の無声期間を含むものであり、波形１１
０１におけるSCと無声カウント符号とに応答可
能なように再構成することができる。本明細書では特定実施例を参照して本発明を説
明したが、本発明の精神と範囲とを越えることな
く種々の変化をさせることができることは当業者
において明らかである。例えば本明細書において
説明したADPCM形のエンコーダと解読器とは、
他の形の適応形のPCMのような方法による適応
形のデジタルエンコーダ装置と解読器装置とによ
り置換することができる。【表】【表】【表】

DETAILED DESCRIPTION OF THE INVENTION

Field of the Invention

The present invention relates to speech processing and, more particularly, to speech signal coding devices.

Background of the Invention

As is known, normal speech includes silent periods that form part of the intelligence being conveyed. Silent periods occur between sentences, between clauses, between words, and within words, and listeners accept them as a natural attribute of speech. Silence nevertheless represents a significant part of the speech pattern. When a speech signal is coded for transmission over a communication channel or for storage in a memory, the code sequences corresponding to silent portions occupy a fraction of the coded signal that could be removed or put to other use. Of course, the silence must be reproduced at the proper positions in the speech pattern for the message to be understood. The coding of silence can, however, be simplified so that the stored or transmitted digital signal is formed much more simply. In this way, the efficiency of the communication system is improved.

Speech signals can be coded digitally in many ways. Sampled speech can be converted directly to binary form by pulse code modulation (PCM). The conversion quantizes each speech sample to one of a set of discrete levels and codes the selected quantization level. In adaptive pulse modulation, the quantization of the speech samples adapts to the input signal level, so that far fewer bits per digital sample are needed and performance improves. In such adaptive arrangements, the step size of the quantizer is varied to match the statistical characteristics of the input signal. Differential pulse modulation systems improve coding efficiency by coding the differences between input signal samples rather than the samples themselves. None of these coders, however, distinguishes accurately between active and inactive portions of the input signal. As a result, the silent periods of the signal reduce the efficiency with which intelligence is transmitted.

U.S. Pat. No. 4,280,192 discloses an arrangement for minimizing the space needed to store analog information digitally, in which an analog input signal such as speech is converted into a stream of continuously-variable-slope delta modulation (ADM) codes. A voice-operated switch detects the start of a pause when the analog signal falls below a predetermined level. The pause duration is determined by a counter that stops when the speech signal rises above a predetermined level. A special pause code containing the timing information is then inserted into the digital code stream.
The combination of voice-operated switch and timing device removes the repetitive codes that occur during pauses in the analog signal. As is known, voice-operated switches are designed to operate slowly so that clipping does not occur between phrases, and they include means for inhibiting the detection of short silent periods. The system described in U.S. Pat. No. 4,280,192 therefore uses compensating delays so that the pause codes can be inserted at the proper positions in the data stream. Because of the delay inherent in a voice-operated switch, however, it is difficult to detect brief silent periods that occur at rates higher than the rate of speech.

U.S. Pat. No. 4,053,712 discloses an adaptive digital coder and decoder in which a special coded bit pattern replaces the idle pattern of the CVSD output bit stream in order to reduce the overall bit rate. The idle pattern is detected by converting the CVSD output to an analog signal and comparing that analog signal with a fixed amplitude threshold. Regenerating the analog speech signal in order to detect silence requires an additional decoder, which increases the cost of coding. A direct amplitude threshold is comparatively free of rate limitations in detecting silence, but it is more sensitive to noise. As a result, silence is hard to distinguish from speech sounds, and the device is more prone to speech clipping.

It is an object of the present invention to provide an improved digital coding arrangement that removes silent periods economically and without a rate limitation.

Summary of the Invention

The problems of prior-art silence removal schemes are solved in accordance with the invention by detecting silent periods from the step-size-related signals that adaptive pulse modulation coding already generates. These step-size-related signals represent the energy content of the applied speech signal but are not affected by the delays present in voice-operated switches. As a result, the analog signal processing is eliminated, the detection of silent periods is not limited to a predetermined rate, and no apparatus is needed to position the silent period codes properly.

The invention is directed to a speech processor that converts a speech pattern into a sequence of adaptive digital codes. Silent periods of the pattern are detected, and a digital code representing each silent period is generated. The adaptive digital codes and the silent period codes are combined to form a digital signal representative of the speech pattern. Converting the pattern into adaptive digital codes includes forming a signal corresponding to the adaptive step size for each digital code. First and second threshold signals are generated. Silent period detection includes generating a silent period signal when the adaptive-step-size signal falls below the first threshold, and terminating the silent period signal when, during the silent period, the adaptive-step-size signal rises above the second threshold.

Detailed Description

FIG. 1 shows a digital speech communication arrangement in accordance with the invention. Referring to FIG. 1, a speech pattern is applied to microphone 101, and the speech signal from microphone 101 is applied to low-pass filter and sampling circuit 103. Circuit 103 low-pass filters the applied speech signal in a known manner and samples it at a rate set by clock signal generator 107. If the cutoff frequency of the low-pass filter is, for example, 3.2 kHz, the sampling rate may be, for example, 8 kHz. The sequence of speech samples from circuit 103 represents the waveform of the speech signal.

Analog-to-digital converter 105 receives the successive speech samples from circuit 103 and converts each speech sample into a digitally coded signal whose value corresponds to the amplitude of the sample. The digitally coded signal from converter 105 is applied to the input of adaptive encoder 110. As is known in the art, an adaptive encoder can convert the digital signal from converter 105 into a more efficiently coded signal with a good signal-to-noise ratio.

An adaptive encoder that can be used in the circuit of FIG. 1 is shown in the logic diagram of FIG. 2. FIG. 2 shows an adaptive differential pulse code modulation (ADPCM) coder, but it is to be understood that other types of adaptive coders may also be used. In differential pulse code modulation, the difference between each sample X_n and a predicted value of that sample based on past samples is quantized and coded for transmission. Depending on the number of quantizer levels, a staircase approximation to the speech signal is generated. By exploiting the redundancy in the signal, differential coding can operate at a bit rate 2 bits per sample lower than conventional PCM coding.

Differential coding generally uses a quantizer with a fixed step size. An adaptive arrangement includes means for monitoring the digital output of the coder, and the effective step size of the quantizer is varied in response to the differential signal amplitude at that output. In this way the quantization of the input signal is optimized.

Referring to FIG. 2, the sequence of digital signals X_n from analog-to-digital converter 105 is applied to one input of adder 201. The other input of adder 201 receives a predicted value of the current digital signal X_n based on the preceding signals X_{n-1}, X_{n-2}, .... The difference between the current signal X_n and its predicted value appears at the output of adder 201 and is supplied to quantizer 203. The quantizer compares the difference signal from adder 201 with a set of preset quantizer levels and generates a signal corresponding to the nearest quantizer level. The quantized difference signal at the output of quantizer 203 is applied to encoder 207, which forms a digital code C_n corresponding to the quantized signal level supplied to it.

The output of encoder 207 is a sequence of digital codes representing the quantized differences between the speech signal samples and their predicted values. The quantized difference signal at the output of quantizer 203 is also applied to one input of adder 209, where it is added to the predicted sample value obtained from predictor 205. The sum signal from adder 209 is supplied to the input of predictor 205.
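The loop traced so far (adder 201, quantizer 203, encoder 207, adder 209, predictor 205) can be sketched in Python for the fixed-step-size case. This is only an illustration under stated assumptions, not the coder of Appendix A: the predictor is assumed to be first order (the previous reconstructed sample), the code is taken to be the signed quantizer level index, and the names `dpcm_encode`, `dpcm_decode`, `step`, and `levels` are hypothetical.

```python
def dpcm_encode(samples, step=4, levels=8):
    """Fixed-step DPCM coder: returns one quantizer level index per sample."""
    lo, hi = -(levels // 2), levels // 2 - 1
    predicted = 0   # predictor 205; assumed to hold the previous reconstructed sample
    codes = []
    for x in samples:
        diff = x - predicted                   # adder 201: sample minus prediction
        level = round(diff / step)             # quantizer 203: nearest level
        level = max(lo, min(hi, level))        # clip to the available levels
        codes.append(level)                    # encoder 207: code (here the level index)
        predicted = predicted + level * step   # adder 209 updates predictor 205
    return codes


def dpcm_decode(codes, step=4):
    """Rebuild the staircase approximation by running the same predictor update."""
    predicted = 0
    out = []
    for level in codes:
        predicted += level * step
        out.append(predicted)
    return out
```

Because the encoder updates its prediction from the quantized difference rather than from the input itself, the decoder can track it exactly from the codes alone. For an input whose differences are all multiples of the step, such as `[0, 4, 8, 12, 8, 4]`, `dpcm_decode(dpcm_encode(...))` returns the input unchanged.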
As a result, the output of predictor 205 is updated to represent the predicted value of the next speech sample X_{n+1}.

Step size generator 210 receives the encoder output code C_n and, in response to that code, forms a step size signal Δ_n that adjusts the levels of quantizer 203 according to the relative magnitude of C_n. When C_n is large, step size signal Δ_n expands the step size of quantizer 203 so that the quantizer is better matched to high-amplitude signals. When C_n is small, the step size is reduced so that the quantizer is matched to low-amplitude signals. In this way the quantizer is matched to the expected level of the input signal. The combined use of adaptive and differential coding is described in L. R. Rabiner and R. W. Schafer, "Digital Processing of Speech Signals" (Prentice-Hall, 1978; copyright Bell Telephone Laboratories).

Adaptive encoder 110 may also be implemented with a Motorola 68000 microprocessor, described in the "MC68000 Design Module User's Guide" (Motorola Inc., 1980). The microprocessor operates according to a predetermined sequence of instructions stored in read-only memory. Appendix A lists, in FORTRAN, the permanently stored instructions required for ADPCM coding of the signals from A/D converter 105.

In FIG. 2, step size generator 210 operates according to

$$d_n = \beta d_{n-1} + m(C_n) \qquad (1)$$

to form a signal d_n corresponding to the logarithm of the step size in response to the last output (C_{n-1}) of encoder 207. β is a constant that governs the decay of errors, for example

$$\beta = 1 - 2^{-6} \qquad (2)$$

d_{n-1} is the preceding logarithmic step size signal, and m is a multiplier that scales the step size according to the expected dynamic range of the signal and the number of quantizer levels.

In FIG. 2, digital code C_n is applied to the address input of multiplier generator 211. Multiplier generator 211 is a programmable read-only memory (PROM) arranged, in a manner well known in the art, to supply a preassigned output m(C_n) for each digital code C_n applied to its address input. The multiplier m is

$$m = \log_Q M \qquad (3)$$

where M is 0.85 for the lowest four levels, 1.2 for the fifth and sixth levels, 1.6 for the seventh level, and 2.4 for the eighth level, and

$$Q = D^{1/S}$$

D is the dynamic range of input signal X_n, and S is the number of step sizes.

Delay register 215 holds the preceding logarithmic step size signal d_{n-1}. In response to clock signal CLT from clock signal generator 107 for the nth input, signal d_{n-1} is supplied to shifter 217 and to one input of subtractor 219. Shifter 217 shifts signal d_{n-1} six places to the right to form the signal 2^{-6} d_{n-1}; it may be a wiring arrangement that reassigns the bits of the coded signal.

Subtractor 219 receives signal d_{n-1} from register 215 and signal 2^{-6} d_{n-1} from shifter 217 and forms the difference signal (1 - 2^{-6}) d_{n-1}. The output of subtractor 219 is added to signal m(C_n) in adder 212, and the resulting signal d_n is loaded into register 215 by clock pulse CLT. Signal d_n in register 215 represents the logarithm of the coder step size Δ_n supplied to quantizer 203. To form signal Δ_n, digital code d_n is applied to the address input of step size signal former 221. Former 221 is a programmable read-only memory (PROM) in which a table relating d_n to Δ_n is stored. For each d_n address input, the corresponding step size signal Δ_n is output from the PROM.

Logarithmic step size signal d_n represents the energy of the sequence of speech samples X_n and can be used to determine the silent periods of the speech signal. In contrast to voice-operated switches and other speech presence detectors, the logarithmic step size signal changes at a rate much faster than the phrase rate, without introducing noise or speech clipping into the resulting coded speech signal.
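The register, shifter, subtractor, and adder path of the step size generator can be sketched in Python as an integer recursion in which the factor 1 - 2^-6 is applied by a six-place right shift. This is a rough sketch under several assumptions that the text does not state: the values listed with equation (3) are read as the multipliers M, the PROM contents are modeled as scaled integers, the code is reduced to a magnitude level index from 0 to 7, and d is clamped to a valid table address. The names `make_m_table`, `update_log_step`, `step_size`, `q`, `scale`, `d_limit`, and `delta_min` are hypothetical.

```python
import math

# Multipliers per magnitude level, read from the values listed with equation (3).
M_BY_LEVEL = [0.85, 0.85, 0.85, 0.85, 1.2, 1.2, 1.6, 2.4]


def make_m_table(q, scale):
    """Contents assumed for PROM 211: m = log_Q(M), as integers scaled by `scale`."""
    return [round(scale * math.log(m_val, q)) for m_val in M_BY_LEVEL]


def update_log_step(d_prev, level, m_table, d_limit):
    """One clock of equation (1) with beta = 1 - 2**-6 from equation (2)."""
    leaked = d_prev - (d_prev >> 6)       # shifter 217 and subtractor 219
    d_new = leaked + m_table[level]       # adder 212
    return max(0, min(d_limit, d_new))    # clamp: an assumption, keeps d a valid address


def step_size(d, q, scale, delta_min=1.0):
    """Lookup assumed for PROM 221: the step size grows as Q**d."""
    return delta_min * q ** (d / scale)
```

With, say, `q = 1024 ** (1 / 64)` and `scale = 16` (both illustrative choices), a run of low-level codes pulls d down to the bottom of its range within a few dozen samples, and a run of top-level codes drives it back up just as quickly. That fast response is what makes d usable as an energy indicator for silence detection.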
As a result, the detection of silent periods in response to changes in the logarithmic step size signal is carried out in accordance with the invention with virtually no rate limitation.

Silent periods are detected in silence detector 115, shown in greater detail in FIG. 3. Referring to FIG. 3, signal d_n from adaptive encoder 110 is applied to the a input of amplitude comparator 301 and to the c input of amplitude comparator 305. Comparator 305 is enabled when signal d_n decays to a value below threshold level signal TH1. The enabled output of comparator 305 is applied through OR gate 315 to flip-flop 320 and sets the flip-flop. When set, flip-flop 320 provides an enabled SF signal to indicate that a silent period has begun.
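Taken together with the reset path through comparator 301 that the description turns to next, flip-flop 320 behaves as a two-threshold (hysteresis) detector on the logarithmic step size. A minimal Python sketch, assuming TH1 lies below the speech onset threshold TH2 and treating both as fixed numbers for simplicity; the function name `detect_silence` and its parameters are hypothetical.

```python
def detect_silence(d_values, th1, th2):
    """Return the SF flag for each sample of the logarithmic step size signal."""
    sf = False            # state of flip-flop 320
    flags = []
    for d in d_values:
        if not sf and d < th1:    # comparator 305: d has decayed below TH1, set SF
            sf = True
        elif sf and d > th2:      # comparator 301, gated by SF: d exceeds TH2, reset SF
            sf = False
        flags.append(sf)
    return flags
```

The gap between the two thresholds keeps the flag from chattering when d hovers near a single level: once silence is declared, d must climb past TH2, not merely back above TH1, before speech is declared again.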
Output a signal. The output of comparator 301 is applied to AND gate 310, while signal SF is applied to AND gate 310.
Alert gate 310. B of comparator 301
The threshold signal TH2 applied to the input terminal corresponds to the onset level of speech. signal
When d _o is greater than the threshold signal TH2, comparator 3
01 is enabled. During the silent period,
Only when the logarithmic step size signal _do increases by more than the threshold TH2 does the flip-flop 3
20 is reset via AND gate 310 and OR gate 315. The dynamic range of the input speech signal is the first
If preset by a signal compression device connected between microphone 101 and filtered sampling circuit 103 in the figure, the threshold signal
TH1 and TH2 can be at fixed voltage levels. However, such compression devices send out speech signals as warnings and the characteristics of the speaker's voice become unnatural. As a result, the reproduced speech pattern may not sound like the speaker. In accordance with the present invention, speech signal compression devices, as commonly employed, are eliminated through the use of adaptive threshold generator 112 of FIG. The adaptive threshold generator changes the threshold signals TH1, TH2. The threshold signals TH1, TH2 can be varied by logarithmic step size signals, so that the characteristics of the speaker's voice are not changed by the equipment required for devoicing. Adaptive threshold generator 112 is shown in more detail in FIG. Referring to FIG. 4, level signal generator 401 generates a preset limit signal L corresponding to the expected value of the lowest logarithmic step size signal consistent with the active speech signal input to the circuit of FIG.
Generator 401 also generates preset level signals HW1 and HW2, which correspond to the difference normally expected between d_max and the silence thresholds. In general, d_max is the maximum value of the d_n signal sequence from adaptive encoder 110. Register 427 is initially set to L and stores the maximum value of the logarithmic step-size signal up to sample x_{n-1}. The d_max signal from register 427 is compared in amplitude comparator 403 with the current logarithmic step-size signal d_n. If d_n is greater than d_max, the comparator is enabled. The enable signal from comparator 403 opens tri-state switch 409 and closes tri-state switch 405, whereby signal d_n, which is greater than d_max, is supplied to the a input terminal of subtractor 415 and to the input of shifter 412. If signal d_max is greater than signal d_n, comparator 403 remains disabled and the d_max signal from register 427 is passed through tri-state switch 409 to the a input of subtractor 415 and to the input of shifter 412. Shifter 412 shifts its input 10 places to the right, and subtractor 415 supplies the signal d_max(1 - 2^-10) to the b input of comparator 418 and to the input of tri-state switch 425. If limit signal L applied to comparator 418 is greater than the output of subtractor 415, comparator 418 is enabled. In that case, tri-state switch 420 turns on and limit signal L is placed in register 427. If the output of subtractor 415 is not less than limit signal L, comparator 418 remains disabled. Tri-state switch 425 then turns on, and register 427 receives the output of subtractor 415, which represents the largest logarithmic step-size signal up to and including the current signal d_n.

The signal d_max from register 427 is supplied to subtractor 430, which forms the signal TH1 = d_max - HW1. This threshold level varies with the maximum logarithmic step-size signal d_max, so the silence threshold changes adaptively. Subtractor 440 forms the signal TH2 = d_max - HW2, so the speech onset threshold likewise varies adaptively with the maximum logarithmic step-size signal. In this way, silent periods of the speech signal are detected without changing the characteristics of the speaker's voice.

The SF output of silence detector 115 is supplied to silence counter 120, which is shown in more detail in FIG. 5. Referring to FIG. 5, signal SF is applied to one input terminal of AND gate 505.
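The update performed by adaptive threshold generator 112 (FIG. 4) can be summarized as: take the larger of d_n and the stored d_max, scale it by (1 - 2^-10), clamp it from below at L, and subtract HW1 and HW2 to obtain the two thresholds. The sketch below illustrates that sequence under stated assumptions. It uses floating-point arithmetic in place of the shifter, the class name and parameter names are hypothetical, and HW2 is taken to be smaller than HW1 so that TH2 exceeds TH1.

```python
class AdaptiveThresholdGenerator:
    """Leaky peak tracker with a floor, loosely following FIG. 4."""

    def __init__(self, limit_l, hw1, hw2, shift=10):
        self.limit_l = limit_l   # limit signal L from generator 401
        self.hw1 = hw1           # offset for silence threshold TH1
        self.hw2 = hw2           # offset for speech onset threshold TH2
        self.shift = shift       # right shift applied by shifter 412
        self.d_max = limit_l     # register 427 is initially set to L

    def update(self, d_n):
        """Process one log step-size value and return (TH1, TH2)."""
        # Comparator 403 with switches 405/409: pick the larger value.
        peak = max(d_n, self.d_max)
        # Shifter 412 and subtractor 415: peak * (1 - 2**-shift).
        decayed = peak - peak / (1 << self.shift)
        # Comparator 418 with switches 420/425: never drop below L.
        self.d_max = max(decayed, self.limit_l)
        # Subtractors 430 and 440.
        return self.d_max - self.hw1, self.d_max - self.hw2
```

The slow decay lets d_max drift downward when the talker lowers his or her voice, while the floor L prevents the thresholds from collapsing during long pauses.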
Signal SF is also applied to the input terminal of inverter 507. During normal speech, signal SF is disabled and the output of inverter 507 presets counter 510 to its zero state. At the beginning of a silent period, signal SF is enabled and code clock signal CLT is applied to counter 510 through AND gate 505. The contents of counter 510 are incremented until the end of the silent period is detected. Latch 515 is enabled at the end of the silent period, whereby the silent period count is transferred from counter 510 to latch 515. Silence count signal SCT from latch 515 is supplied to the input of code processor 125, which also receives the C_n codes generated in adaptive encoder 110 together with signal SF.

Code processor 125 forms a message consisting of the combination of output codes C_n and silent period codes, and is adapted to supply the message to communication network 140 as required. As described above, processor 125 may comprise the microprocessor arrangement described in the MC68000 Design Module User's Guide (Motorola, Inc., 1980). The code combining is performed under control of a fixed set of instructions stored in read-only memory (ROM) of processor 125. These instructions are listed in FORTRAN language in Appendix B.

Each silent period removed from the speech signal is represented by a special silence code SC placed ahead of the silence count signal SCT from silence counter 120. In the circuit of FIG. 1, the silence code is selected to be the maximum amplitude output of the adaptive encoder. For a 4-bit ADPCM code arrangement, the maximum amplitude output combination is 87₁₆. This code was chosen because its probability of occurrence is low. To avoid false detection of silent periods, any maximum amplitude output combination occurring in the C_n codes must be replaced.
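The counting and message forming just described can be illustrated by a short sketch: codes produced while SF is enabled are dropped and counted, and when the silent period ends the pair 8, 7 (silence code SC) followed by the count SCT is written ahead of the next speech code. This is only an illustration. The function name is hypothetical, and representing SCT as a single integer element of the stream is an assumption, since the text above does not give the field layout of the count.

```python
SC = [0x8, 0x7]  # silence code 87 (hex), as two 4-bit codes


def silence_edit(codes, flags):
    """Replace each run of silent codes by SC followed by the run length.

    codes: 4-bit encoder outputs C_n, one per CLT clock pulse.
    flags: the SF value for the same clock pulses.
    """
    out = []
    count = 0
    for code, silent in zip(codes, flags):
        if silent:
            count += 1                # counter 510 counts CLT pulses
            continue
        if count:
            out += SC + [count]       # SC placed ahead of SCT
            count = 0                 # counter cleared after the transfer
        out.append(code)
    if count:                         # silent period still open at end of input
        out += SC + [count]
    return out
```

For example, `silence_edit([3, 4, 0, 0, 0, 5], [False, False, True, True, True, False])` yields `[3, 4, 8, 7, 3, 5]`.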
Code changer 130 shown in FIG. 1 replaces each 87₁₆ code combination with the combination 96₁₆, which minimizes signal distortion. The code changer is shown in more detail in FIG. 6. Referring to FIG. 6, the output of adaptive encoder 110 is entered into register 610 at each clock pulse CLT. The C_n code in register 610 is normally transferred through tri-state switch 615 to register 625, and the C_{n-1} code in register 625 appears at the output of tri-state switch 601. The outputs of registers 610 and 625, which represent the current pair of encoder signals, are compared in comparator 635 with the code combination 87₁₆. These codes are supplied by signal generator 632, as is well known in the art. Upon detection of the 87₁₆ sequence in registers 610 and 625, the output of comparator 635 is enabled and the output of inverter 637 is disabled. In response to the disabled inverter output (the complement of signal CC), tri-state switches 601 and 615 are disabled, while tri-state switches 605 and 620 are enabled by signal CC. The 9₁₆ signal from generator 632 is thereby inserted into register 625, and the 6₁₆ signal from generator 632 is inserted into the data stream via tri-state switch 605. The output of code changer 130 is thus modified as required.

The flow diagram of FIG. 7 represents the sequence of operations performed in the circuit of FIG. 1, and the waveforms of FIG. 8 illustrate the signals and codes at various points in the circuit of FIG. 1. Prior to reception of a speech signal at microphone 101, silence counter 120 is reset to its zero state and signal SF from silence detector 115 is disabled.
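The pair substitution performed by code changer 130 (FIG. 6) can be illustrated with a short sketch. It works on a complete list of 4-bit codes rather than on a clocked register pair, so it shows the effect of the substitution and not the hardware timing. The function name is hypothetical.

```python
def change_codes(codes):
    """Replace every adjacent pair 8, 7 in the speech codes by 9, 6.

    After this pass the pair 8, 7 can only appear in the stream as the
    deliberately inserted silence code SC.
    """
    out = list(codes)
    i = 0
    while i + 1 < len(out):
        if out[i] == 0x8 and out[i + 1] == 0x7:
            out[i], out[i + 1] = 0x9, 0x6
            i += 2  # both members of the pair were rewritten
        else:
            i += 1
    return out
```

The substitution cannot create a new 8, 7 pair, because neither 9 nor 6 can be the first or second member of such a pair.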
The maximum logarithmic step-size signal d_max stored in threshold generator 112 is set to zero. These operations are indicated in initialization step 701. The speech signal shown between time t₀ and time t₁ in waveform 801 does not correspond to a silent period. Consequently, the logarithmic step-size signal d_n obtained from the speech waveform (waveform 805) is greater than threshold signal TH1 of waveform 809. Silence flag signal SF from silence detector 115 is reset, and output codes C_n are supplied to code processor 125 via code changer 130. These codes C₁, C₂, ..., C_n correspond to normal speech and are shown in waveform 813.

Referring to FIG. 7, wait step 708 detects each code clock pulse CLT. In response to code clock pulse CLT, adaptive encoder 110 forms the next adaptive encoder output signal C_n, as shown in step 710. Logarithmic step-size signal d_n and step-size signal Δ_n are formed in the adaptive encoder in step 712, and the d_n and d_max signals are compared in adaptive threshold generator 112 in decision step 715. If logarithmic step-size signal d_n is greater than d_max, the d_max signal is replaced by the current d_n signal (step 718). The d_n signal is then tested in silence detector 115 to determine whether it is less than or equal to the lower threshold signal TH1 from threshold generator 112 (decision step 720). Between time t₀ and time t₁ in FIG. 8, signal d_n is greater than threshold TH1 (waveform 809), and decision step 723 is entered to determine whether silence flag signal SF is set. Since signal SF is not enabled between time t₀ and time t₁, step 725 is entered for each CLT clock pulse. In decision step 725, the current coded signal C_n is tested in code changer 130 to determine whether it is the same as silence code SC. If coded signal C_n is the same as silence code SC, C_n is altered by the modifier logic of circuit 130 (step 729). Otherwise, the unmodified C_n code is placed in code processor 125 for transmission or storage.

When time t₁ is reached, the operation of the encoding arrangement changes. Signals C_n, d_n and Δ_n are generated as shown in steps 710 and 712, and the d_n signal is compared with the d_max signal stored in adaptive threshold circuit 112 (step 715). Since the value of signal d_n at time t₁ is less than adaptive threshold signal TH1, however, silence flag setting step 735 is entered via decision step 720. Signal SF is enabled in silence detector 115 and the contents of silence counter 120 are incremented (step 738). Wait step 708 is then entered to detect the clock pulse CLT for the next speech code. Between times t₁ and t₂, the value of logarithmic step-size signal d_n is smaller than the value of threshold signal TH2. Consequently, silence counter increment step 738 is entered via silence flag setting step 735 or decision step 723, and the silent period continues to be timed as indicated in step 738.

When time t₂ is reached, the value of logarithmic step-size signal d_n is found in decision step 740 to be greater than the value of speech onset threshold TH2. Silence flag reset step 742 is entered via decision steps 720, 723 and 740. Silent period signal SF in detector 115 is therefore reset and the silent period ends. The silence code (SC = 87₁₆) and the silence count signal (SCT) are formed in code processor 125 in response to the resetting of signal SF. The contents of the silence counter are then cleared to zero in step 746. After the silence code test and modification of steps 725 and 729 in code changer 130, the current C_n code is placed in code processor 125. Silence code SC and silence count SCT are inserted between time t₂ and time t₃ into the data stream stored in code processor 125, as shown in waveform 813. Thereafter, since no further silent period is detected, the C_n codes from code changer 130 are added to the data stream. The codes of the data stream handled by processor 125, including the silence code and the silence count code, are illustrated in waveform 815.

Network 140 of FIG. 1 may comprise a digital processing device, such as a computer, arranged to store the received silence-edited codes. The digital code sequence shown in waveform 815, which corresponds to the speech of waveform 801, is placed in the store of network 140. As shown in waveform 815, the silent period between time t₁ and time t₂ is replaced by silence code SC placed ahead of silence count SCT. In this way, the storage requirements of network 140 are effectively reduced. The digital codes from network 140 are supplied to decoder 150.
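The storage reduction can be made concrete with a small worked calculation. The sketch below compares the number of bits needed with and without silence editing for a single silent period. The 4-bit code width comes from the ADPCM arrangement described above, whereas the 8-bit width of the count field and both function names are assumptions made only for this example.

```python
def unedited_bits(n_speech, n_silent, code_bits=4):
    """Bits needed when every clock period, silent or not, stores one code."""
    return (n_speech + n_silent) * code_bits


def edited_bits(n_speech, n_silent, code_bits=4, count_bits=8):
    """Bits needed when one silent period is stored as SC plus a count.

    count_bits is a hypothetical field width for SCT.
    """
    if n_silent >= 2 ** count_bits:
        raise ValueError("silent period too long for the assumed count field")
    marker = 2 * code_bits + count_bits if n_silent else 0
    return n_speech * code_bits + marker
```

With 1000 speech codes and a 200-code silent period, the unedited stream occupies 4800 bits and the edited stream 4016 bits under these assumptions.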
Decoder 150 operates to form a replica of the speech pattern originally applied to microphone 101, including its silent periods. In decoder 150, the adaptively digitally encoded signals are applied to silence code detector and counter 152 and to selector circuit 160 via first-in, first-out shift register 151. Shift register 151 operates in response to clock pulses CLR from generator 153 to apply the digitally encoded signal sequence to detector 152 and selector 160. Selector 160 normally operates to pass codes C_n directly to adaptive decoder 165 for conversion into speech samples. Decoder 165 may be an ADPCM decoder of the type described in the article by J. R. Boddie et al. in the Bell System Technical Journal entitled "Adaptive Differential Pulse Code Modulation Coding". Alternatively, the decoder can be implemented with the Motorola MC68000 microprocessor described above, operating according to instructions stored in read-only memory. Appendix C lists, in FORTRAN language, the permanently stored instructions required to decode an ADPCM encoded signal.

Upon detection of a silence code in detector 152, selector 160 connects code generator 155 to adaptive decoder 165. For the time interval specified by silence count code SCT, generator 155 produces C_n codes equivalent to a silent period. The silent period is timed by the counter of circuit 152. In this manner, silent periods are reinserted into the code stream. The sampled signal sequence from adaptive decoder 165 corresponds to the originally encoded speech waveform, including its silent periods. The sampled speech signal is converted to analog form via D/A converter 170 and low-pass filter 175, and the speech pattern is produced by transducer 180.

Silence code detector and counter circuit 152 is shown in more detail in FIG. 9. Referring to FIG. 9, the adaptively encoded signal sequence from network 140 is applied to the input terminal of multistage shift register 905. The codes from register 905 are supplied to the input terminals of comparator 915, where they are compared with the 87₁₆ silence code generated in code generator 920. Upon detection of code 87₁₆ from shift register 905, comparator 915 is enabled. The enabled signal SL from comparator 915 alerts AND gate 930 to preset counter 940 to the silence count code on line 927. The enabled signal SL also sets flip-flop 925, and signal SL1 from flip-flop 925 causes line 181 to be disconnected from the input of adaptive decoder 165 and silence code generator 155 to be connected to the input terminal of the adaptive decoder. Signal SL1 also inhibits the output of AND gate 156, so that FIFO shift register 151 supplies no encoded signals during the silence countdown period.
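The detection and countdown carried out by circuit 152, selector 160 and generator 155 can be illustrated by a short sketch that expands a silence-edited stream back to one code per clock period. It is a simplified model, not the circuit itself. The function name is hypothetical, the count SCT is assumed to be the single stream element following the pair 8, 7, and the fill value 0 stands in for whatever "silent" code generator 155 actually supplies.

```python
SC = [0x8, 0x7]  # silence code 87 (hex), as two 4-bit codes


def expand_silence(stream, fill_code=0x0):
    """Reinsert silent periods into a silence-edited code stream.

    fill_code is a hypothetical stand-in for the output of generator 155.
    """
    stream = list(stream)
    out = []
    i = 0
    while i < len(stream):
        if stream[i:i + 2] == SC and i + 2 < len(stream):
            count = stream[i + 2]            # counter 940 preset to the count
            out.extend([fill_code] * count)  # one fill code per countdown step
            i += 3                           # skip SC and the count
        else:
            out.append(stream[i])            # normal path via line 181
            i += 1
    return out
```

For example, `expand_silence([3, 4, 8, 7, 3, 5])` yields `[3, 4, 0, 0, 0, 5]`, restoring the three clock periods that were removed by silence editing.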
Subsequent clock pulses CLR are applied via AND gate 935 to decrement counter 940 until a zero count is reached. At that time, the borrow output of counter 940 resets silent period flip-flop 925 and signal SL1 is disabled. Selector 160 then connects line 181 to adaptive decoder 165, and the adaptive code stream from FIFO register 151 is supplied to decoder 165.

The operating sequence of decoder circuit 150 is shown in the flow diagram of FIG. 10, and the waveforms associated with this operation are shown in FIG. 11. Waveform 1101 of FIG. 11 shows the adaptively encoded signal sequence received from network 140. Between time t₀ and time t₁, adaptive digital codes C₁ through C_n are sequentially applied to decoder 150. Silence code SC, placed after code C_n and ahead of silence count code SCT, appears in the data stream of waveform 1101. These two codes represent a silent period in the speech signal. Following the silent period, an adaptive digital code sequence appears starting with code C_{n+1}.

In FIG. 10, the registers, flip-flops and latches of decoder 150 are initially reset in step 1001. Step 1007 is entered via clock wait step 1005 upon the occurrence of the next clock pulse CLR. Between time t₀ and time t₁ in FIG. 11, speech codes C₁, C₂, ..., C_n are received from network 140. For each received clock pulse, an output sample is formed in decoder 165 in step 1007. In decision step 1009, signal SL1 from flip-flop 925 is examined. During the speech period from time t₀ to time t₁, signal SL1 is not set and the next input code is received from FIFO 151 (step 1020). Since the input code is not a silence code, step 1029 is entered via decision step 1025 and the input sample is decoded.

When time t₁ is reached, the input code is silence code SC. The SC code is detected in comparator 915 and step 1034 is entered via decision step 1025. In step 1034, silence count M (waveform 1107) is transferred from shift register 905 and loaded into counter 940. Silence code generator 155 is connected to decoder 165 by enabled signal SL1 (waveform 1105). Upon the occurrence of the next clock pulse CLR, step 1040 is entered via decision step 1009, and the contents of the silence counter are decremented in step 1040. The contents of the silence counter remain greater than zero until time t₂. Consequently, step 1036 is performed via step 1042, and the silence code sequence from generator 155 is applied to decoder 165. At time t₂, the silence count is decremented to zero (waveform 1107), and flip-flop 925 is reset in step 1044 via step 1042. After time t₂, normal operation on the input codes proceeds through steps 1005, 1007, 1009, 1020, 1025 and 1029. The input to decoder 165 shown in waveform 1109 includes the silent period between time t₁ and time t₂, which is reconstructed in response to the SC and silence count codes of waveform 1101.

Although the invention has been described herein with reference to specific embodiments, it will be apparent to those skilled in the art that various changes can be made without departing from the spirit and scope of the invention. For example, the ADPCM encoder and decoder described in this specification may be replaced by other forms of adaptive digital encoder and decoder, such as adaptive PCM.

[Table] [Table] [Table]

[Brief Description of the Drawings]

FIG. 1 is a block diagram of a digital speech communication circuit according to the present invention; FIG. 2 is a detailed block diagram of an adaptive encoder useful in the circuit of FIG. 1; FIG. 3 is a detailed block diagram of a silence detector useful in the circuit of FIG. 1; FIG. 4 is a detailed block diagram of an adaptive threshold generator useful in the circuit of FIG. 1; FIG. 5 is a detailed block diagram of a silence counter useful in the circuit of FIG. 1; FIG. 6 is a detailed block diagram of a code changer circuit useful in the circuit of FIG. 1; FIGS. 7 and 10 are flowcharts illustrating the operation of the circuit of FIG. 1; FIGS. 8 and 11 are waveform diagrams illustrating the operation of the circuit of FIG. 1; and FIG. 9 is a detailed block diagram of a silence code detector and counter useful in the circuit of FIG. 1.

[Description of symbols of main parts] 103 ... low-pass filter and sampler; 105, 170 ... A/D converter; 110 ... encoder; 112 ... threshold generator; 115 ... silence detector; 120 ... silence counter; 125 ... code processor; 130 ... code changer; 140 ... network; 150, 165 ... decoder; 151 ... shift register; 152 ... silence code detector and counter; 155 ... silence code generator; 160 ... selector; 107, 153 ... clock signal generator; 210 ... step-size generator.

Claims

[Scope of Claims]

1. In a speech processing system having means for converting a speech pattern into a sequence of adaptively digitally encoded signals, means for detecting silent periods in the speech pattern, means responsive to each detected silent period for generating a digitally encoded signal representative of the silent period, and means for combining the adaptively digitally encoded signals and the digitally encoded signals representative of silent periods to form a digital signal corresponding to the speech pattern, the converting means comprising means for forming, for each adaptively digitally encoded signal, a signal corresponding to the adaptive step size, which signal represents the logarithm of the adaptive step size, and the silent period detecting means comprising means for generating a first threshold signal and a second threshold signal greater than the first threshold signal, means responsive to the first threshold signal and the step size equivalent signal for generating a signal representative of a silent period, and means responsive to the silent period signal and to the adaptive step size equivalent signal increasing above the second threshold signal for terminating the silent period signal, a speech processing system characterized in that the silent period signal generating means comprises means for generating a third signal in response to the first threshold signal being greater than the step size equivalent signal; means for initiating the silent period signal in response to the third signal; means for generating a fourth signal in response to the step size equivalent signal being greater than the second threshold signal; and means for terminating the silent period signal in response to both the silent period signal and the fourth signal.

2. The speech processing system according to claim 1, wherein the threshold signal generating means includes means for generating signals at first and second predetermined levels; means responsive to the sequence of step size equivalent signals for generating a signal representing the maximum step size equivalent signal in the sequence; means jointly responsive to the maximum step size equivalent signal and the first predetermined level signal for generating an adaptive first threshold level signal; and means jointly responsive to the maximum step size equivalent signal and the second predetermined level signal for generating an adaptive second threshold level signal.

3. The speech processing system according to claim 1 or 2, further comprising means responsive to the adaptively digitally encoded signals and the silent period encoded signals for storing a digitally encoded signal sequence representing the speech.

4. The speech processing system according to claim 3, further comprising means responsive to the stored digital code sequence for forming a speech pattern corresponding to the speech.

5. In a method for processing speech comprising the steps of converting a speech pattern into a sequence of adaptively digitally encoded signals, detecting silent periods in the speech pattern, generating a digitally encoded signal representative of each detected silent period, and combining the adaptively digitally encoded signals and the encoded signals representative of silent periods to form a digital signal representative of the speech pattern, the converting step comprising the step of forming, for each adaptively digitally encoded signal, a signal corresponding to the adaptive step size, which signal represents the logarithm of the adaptive step size, and the silent period detecting step comprising the steps of generating a first threshold signal and a second threshold signal greater than the first threshold signal, generating a signal indicative of a silent period in response to the first threshold signal and the step size equivalent signal, and terminating the silent period signal in response to the silent period signal and to the adaptive step size equivalent signal increasing above the second threshold signal, a method characterized in that the step of generating the silent period signal comprises the steps of generating a third signal in response to the first threshold signal being greater than the step size equivalent signal; initiating the silent period signal in response to the third signal; generating a fourth signal in response to the step size equivalent signal being greater than the second threshold signal; and terminating the silent period signal in response to the silent period signal and the fourth signal.

6. The method according to claim 5, wherein the step of generating the threshold signals comprises the steps of generating signals at first and second predetermined levels; generating a signal representative of the maximum step size equivalent signal in the sequence of step size equivalent signals; generating an adaptive first level threshold signal jointly responsive to the maximum step size equivalent signal and the first predetermined level signal; and generating an adaptive second level threshold signal jointly responsive to the maximum step size equivalent signal and the second predetermined level signal.