EP0685836A1 - Method and device for preprocessing an acoustic signal before speech coding - Google Patents


Info

Publication number
EP0685836A1
EP0685836A1 EP95401261A EP95401261A EP0685836A1 EP 0685836 A1 EP0685836 A1 EP 0685836A1 EP 95401261 A EP95401261 A EP 95401261A EP 95401261 A EP95401261 A EP 95401261A EP 0685836 A1 EP0685836 A1 EP 0685836A1
Authority
EP
European Patent Office
Prior art keywords
signal
state
frame
energy
pass
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
EP95401261A
Other languages
English (en)
French (fr)
Other versions
EP0685836B1 (de)
Inventor
Sophie Scott
William Navarro
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nortel Networks France SAS
Original Assignee
Matra Communication SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matra Communication SA
Publication of EP0685836A1
Application granted
Publication of EP0685836B1
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06 Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients

Definitions

  • The present invention relates to a method and a device for preprocessing the acoustic signal supplied to a speech coder. It applies in particular, but not exclusively, to improving the performance of low-bit-rate speech coders.
  • In some cases the input signal of a speech coder has a "flatter" spectrum, for example when a hands-free system is used with a microphone that has a linear frequency response.
  • The usual vocoders are designed to be independent of the input with which they operate, and they are not informed of the characteristics of this input. If microphones with different characteristics are likely to be connected to the vocoder, or more generally if the vocoder is likely to receive acoustic signals with different spectral characteristics, then there are cases where the vocoder is used sub-optimally.
  • A main object of the present invention is to improve the performance of a vocoder by making it less dependent on the spectral characteristics of the signal intended for it.
  • The method according to the invention consists in subjecting the input acoustic signal to high-pass filtering; in comparing the energy of the high-pass filtered signal with that of the unfiltered signal to determine a state of the signal, chosen between a first state, in which the energy of the high-pass filtered signal is greater than a predetermined fraction of the energy of the unfiltered signal, and a second state, in which it is less than that fraction; and in sending the high-pass filtered signal, after high-frequency pre-emphasis, to the coder input when the signal is in its second state.
  • The high-pass filter used is typically a filter with a sharp cutoff at 400 Hz, and the predetermined energy fraction is typically 85 to 95%.
  • The first signal state corresponds to the IRS characteristics.
  • The second state corresponds to a flatter spectrum of the input acoustic signal, containing proportionally more energy at low frequencies.
  • Such a flat-spectrum signal is preprocessed (high-pass filtering and pre-emphasis) to bring its spectral characteristics closer to those of the IRS mask.
  • Using high-pass filtering to determine the state of the signal has the advantage, compared with low-pass filtering, that the filtered signal can itself be sent (after pre-emphasis) to the vocoder input.
  • The determined state of the signal can only be modified when the input acoustic signal, or the high-pass filtered signal, has an energy greater than a predetermined threshold. Below that threshold (for example in a zone of silence or of low ambient noise), the determined state is left unchanged.
  • When the acoustic signal is digitized in successive frames, it is detected whether the signal in each frame is in a first condition, corresponding to the first state, or in a second condition, corresponding to the second state. The state of the signal is then determined from these frame-by-frame conditions, and the determined state is modified only after several successive frames show a signal condition different from the one corresponding to the previously determined state.
  • This introduces a kind of hysteresis which makes it possible to take into account the rapid variations in the spectral envelope of the speech signal, due to the ambient noise or to the speech itself (the timbre of the voice is not constant). This reduces the risk of false determination of the state of the signal, which leads to a better quality of the coded signal and avoids introducing timbre discontinuities which could be due to untimely modifications of the determined state.
  • The preprocessing device comprises a high-pass filter receiving the acoustic input signal; means for calculating the energies contained respectively in said acoustic signal and in the output signal of the high-pass filter; means for comparing the calculated energies; and a high-frequency pre-emphasis filter whose input receives the output signal of the high-pass filter and whose output delivers the signal sent to the input of the coder when the comparison means reveal that the output signal of the high-pass filter contains less than a predetermined fraction of the energy of said acoustic signal.
  • The two solid lines correspond to the bounds of the IRS mask defined for microphones in CCITT Recommendation P.48.
  • An IRS-type microphone signal has a strong attenuation in the lower part of the spectrum (between 0 and 300 Hz) and a relative attenuation at high frequencies.
  • A signal of the linear type, supplied for example by the microphone of a hands-free installation, has a flatter spectrum and in particular lacks the strong attenuation at low frequencies (a typical example of such a linear signal is illustrated by a dashed line on the diagram in Figure 1).
  • The preprocessing device 10 processes the input signal supplied by an acoustic signal source and sends it to a speech coder 12.
  • The coder 12 is a low-bit-rate coder optimized for an IRS-type input signal. It can be, among other things, a linear prediction coder with excitation by regular pulse vectors (RP-CELP), as described in document EP-A-0 347 307.
  • The coder 12 has no a priori knowledge of the source of the acoustic signal sent to it.
  • The acoustic input signal S_I is the output signal from a microphone 13, amplified and digitized by an analog-to-digital converter 14.
  • The signal is typically sampled at 8 kHz and arranged in successive 30 ms frames, each containing 240 16-bit samples.
  • The preprocessing device 10 comprises a high-pass filter 16 receiving the input acoustic signal S_I and delivering a filtered signal S_I'.
  • The filter 16 is typically a digital filter of the bi-quad type with a sharp cutoff at 400 Hz.
  • The energies E1 and E2 contained in each frame of the acoustic input signal S_I and of the filtered signal S_I' are calculated by two units 17, 18, each of which sums the squares of the samples of each frame it receives.
  • The calculated energies E1 and E2 are supplied to a comparison unit 20, which determines the state of the signal in the form of a bit Y equal to 0 when it is determined that the signal is of the IRS type (state Y_A), and to 1 when it is determined that the signal is rather of the linear type (state Y_B).
  • The output of the preprocessing device 10, connected to the input of the coder 12, is formed by one terminal of a switch 21, whose other terminal is connected either to the input of the high-pass filter 16 or to the output of a pre-emphasis filter 22, according to the value of the bit Y delivered by the comparison unit 20.
  • The pre-emphasis filter 22 has a transfer function of the form H(z) = 1 - λ/z, where λ denotes a pre-emphasis coefficient which is typically of the order of 0.4.
  • The comparison unit 20 is, for example, in accordance with the diagram illustrated in FIG. 3.
  • The energy E1 of each frame of the input signal S_I is sent to the input of a threshold comparator 25, which delivers a bit Z of value 0 when the energy E1 is less than a predetermined energy threshold, and of value 1 when the energy E1 is greater than the threshold.
  • The energy threshold is typically of the order of -38 dB relative to the signal saturation energy.
  • The comparator 25 serves to inhibit the determination of the state of the signal when the latter contains too little energy to be representative of the characteristics of the source. In this case, the determined state of the signal remains unchanged.
  • The energies E1 and E2 are sent to a digital divider 26, which calculates the ratio E2 / E1 for each frame.
  • This E2 / E1 ratio is sent to another threshold comparator 27 which delivers a bit X of value 0 when the E2 / E1 ratio is greater than a predetermined threshold, and of value 1 when the E2 / E1 ratio is less than the threshold.
  • This threshold on the E2 / E1 ratio is typically of the order of 0.93.
  • Bit X is representative of a signal condition on each frame.
  • The status bit Y is not simply set equal to the condition bit X; it results from the processing of the successive condition bits X by a state determination circuit 29.
  • The operation of the state determination circuit 29 is illustrated in FIG. 4, where the upper timing diagram shows an example of the evolution of the bit X provided by the comparator 27.
  • The status bit Y (lower timing diagram) is initialized to 0, because IRS characteristics are most frequently encountered.
  • As soon as the counting variable V reaches a predetermined threshold (8 in the example considered), it is reset to 0 and the value of the bit Y is changed, which records that the signal has changed state.
  • The signal is in state Y_A up to frame M, in state Y_B between frames M and N (change of signal source), then again in state Y_A from frame N.
  • Other modes of incrementation and decrementation and other threshold values could be used.
  • The above counting mode can, for example, be obtained by the circuit 29 shown in FIG. 3.
  • This circuit includes a four-bit counter 32, whose most significant bit corresponds to the status bit Y and whose three least significant bits represent the counting variable V.
  • Bits X and Y are supplied to the input of an EXCLUSIVE OR gate 33, the output of which is addressed to the increment input of counter 32 via an AND gate 34 whose other input receives the Z bit supplied by the threshold comparator 25.
  • The inverted output of the gate 33 is supplied to a decrementing input of the counter 32 via another AND gate 35, whose two other inputs respectively receive the bit Z supplied by the comparator 25 and the output of a three-input OR gate 36 receiving the three least significant bits of the counter 32.
  • The counter 32 is arranged to double the pulses received on its decrementing input when its least significant bit is 0 or when at least one of the following two bits is 1, as shown diagrammatically by the OR gate 37 in FIG. 3.
  • When bit Z is 0, the determination circuit 29 is not activated, because the AND gates 34 and 35 prevent the value of the counter 32 from being changed.
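
The chain described above (high-pass filter 16, energy units 17 and 18, comparison unit 20 with its hysteresis counter, pre-emphasis filter 22 and switch 21) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation of the patent. The names `HighPassBiquad`, `Preprocessor` and `sine_frame` are hypothetical. The text gives no filter coefficients, so the sketch assumes a single Butterworth-style second-order section. The "saturation energy" is assumed to be the energy of a frame of full-scale 16-bit samples. Decrementing V by two when it is at least 2 is one reading of the pulse handling described for the counter 32.

```python
import math

FS = 8000                 # sampling rate in Hz (from the text)
FRAME_LEN = 240           # samples per 30 ms frame (from the text)
RATIO_THRESHOLD = 0.93    # threshold on E2 / E1 (from the text)
PREEMPH = 0.4             # pre-emphasis coefficient (from the text)
COUNT_LIMIT = 8           # counter threshold (from the text)
# Assumption: saturation energy = one frame of full-scale 16-bit samples.
ENERGY_THRESHOLD = FRAME_LEN * 32768.0 ** 2 * 10 ** (-38 / 10)


class HighPassBiquad:
    """Second-order high-pass section (assumed design, Q = 1/sqrt(2))."""

    def __init__(self, cutoff_hz=400.0, fs=FS, q=math.sqrt(0.5)):
        w0 = 2 * math.pi * cutoff_hz / fs
        alpha = math.sin(w0) / (2 * q)
        c = math.cos(w0)
        a0 = 1 + alpha
        self.b = [(1 + c) / 2 / a0, -(1 + c) / a0, (1 + c) / 2 / a0]
        self.a = [-2 * c / a0, (1 - alpha) / a0]
        self.x1 = self.x2 = self.y1 = self.y2 = 0.0

    def process(self, frame):
        out = []
        for x in frame:
            # Direct form I difference equation; state is kept across frames.
            y = (self.b[0] * x + self.b[1] * self.x1 + self.b[2] * self.x2
                 - self.a[0] * self.y1 - self.a[1] * self.y2)
            self.x2, self.x1 = self.x1, x
            self.y2, self.y1 = self.y1, y
            out.append(y)
        return out


class Preprocessor:
    def __init__(self):
        self.hpf = HighPassBiquad()
        self.prev = 0.0   # previous high-pass sample, for pre-emphasis
        self.y = 0        # status bit Y, initialized to 0 (IRS type)
        self.v = 0        # counting variable V

    def _update_state(self, x_bit, z_bit):
        if not z_bit:                 # too little energy: state is frozen
            return
        if x_bit != self.y:           # frame condition disagrees with state
            self.v += 1
            if self.v >= COUNT_LIMIT:
                self.v = 0
                self.y ^= 1           # the signal has changed state
        else:
            # Assumed rule: minus 2 when V >= 2, minus 1 when V == 1.
            self.v = max(0, self.v - 2)

    def process_frame(self, frame):
        filtered = self.hpf.process(frame)
        e1 = sum(s * s for s in frame)       # energy E1 of the input frame
        e2 = sum(s * s for s in filtered)    # energy E2 of the filtered frame
        z_bit = 1 if e1 > ENERGY_THRESHOLD else 0
        x_bit = 1 if e2 < RATIO_THRESHOLD * e1 else 0   # avoids dividing by E1
        self._update_state(x_bit, z_bit)
        emphasized = []
        for s in filtered:                   # y[n] = x[n] - 0.4 * x[n-1]
            emphasized.append(s - PREEMPH * self.prev)
            self.prev = s
        # Switch: unmodified input in state 0, preprocessed signal in state 1.
        return list(frame) if self.y == 0 else emphasized


def sine_frame(freq_hz, amplitude, start=0):
    """Test helper: one frame of a sine wave starting at sample index start."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * (start + n) / FS)
            for n in range(FRAME_LEN)]
```

Feeding consecutive frames of a loud 100 Hz tone, whose energy lies almost entirely below the 400 Hz cutoff, should leave Y at 0 for the first seven frames and switch it to 1 on the eighth. A silent frame leaves both Y and V untouched because the energy gate blocks the counter.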

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
EP95401261A 1994-06-03 1995-05-31 Method and device for preprocessing an acoustic signal before speech coding Expired - Lifetime EP0685836B1 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR9406824A FR2720849B1 (fr) 1994-06-03 1994-06-03 Method and device for preprocessing an acoustic signal upstream of a speech coder.
FR9406824 1994-06-03

Publications (2)

Publication Number Publication Date
EP0685836A1 (de) 1995-12-06
EP0685836B1 (de) 1999-07-21

Family

ID=9463860

Family Applications (1)

Application Number Title Priority Date Filing Date
EP95401261A Expired - Lifetime EP0685836B1 (de) 1994-06-03 1995-05-31 Method and device for preprocessing an acoustic signal before speech coding

Country Status (4)

Country Link
US (1) US5644679A (de)
EP (1) EP0685836B1 (de)
DE (1) DE69510865T2 (de)
FR (1) FR2720849B1 (de)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2729247A1 (fr) * 1995-01-06 1996-07-12 Matra Communication Analysis-by-synthesis speech coding method
US6799159B2 (en) * 1998-02-02 2004-09-28 Motorola, Inc. Method and apparatus employing a vocoder for speech processing

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0243562A1 (de) * 1986-04-30 1987-11-04 International Business Machines Corporation Speech coding method and device for carrying out this method
EP0347307A2 (de) * 1988-06-13 1989-12-20 Matra Communication Coding method and linear prediction speech coder
EP0477960A2 (de) * 1990-09-26 1992-04-01 Nec Corporation Linear prediction speech coding with high-frequency pre-emphasis

Also Published As

Publication number Publication date
US5644679A (en) 1997-07-01
DE69510865T2 (de) 2000-07-13
DE69510865D1 (de) 1999-08-26
FR2720849B1 (fr) 1996-08-14
FR2720849A1 (fr) 1995-12-08
EP0685836B1 (de) 1999-07-21

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): DE ES GB IT NL SE

17P Request for examination filed

Effective date: 19951118

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: MATRA NORTEL COMMUNICATIONS

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

17Q First examination report despatched

Effective date: 19981112

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): DE ES GB IT NL SE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SE

Free format text: THE PATENT HAS BEEN ANNULLED BY A DECISION OF A NATIONAL AUTHORITY

Effective date: 19990721

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 19990721

Ref country code: ES

Free format text: THE PATENT HAS BEEN ANNULLED BY A DECISION OF A NATIONAL AUTHORITY

Effective date: 19990721

GBT Gb: translation of ep patent filed (gb section 77(6)(a)/1977)

Effective date: 19990804

REF Corresponds to:

Ref document number: 69510865

Country of ref document: DE

Date of ref document: 19990826

ITF It: translation for a ep patent filed
NLV1 Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act
PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed
REG Reference to a national code

Ref country code: GB

Ref legal event code: IF02

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20040528

Year of fee payment: 10

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20050414

Year of fee payment: 11

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IT

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES;WARNING: LAPSES OF ITALIAN PATENTS WITH EFFECTIVE DATE BEFORE 2007 MAY HAVE OCCURRED AT ANY TIME BEFORE 2007. THE CORRECT EFFECTIVE DATE MAY BE DIFFERENT FROM THE ONE RECORDED.

Effective date: 20050531

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20051201

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20060531

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20060531