EP0685836A1 - Method and device for preprocessing an acoustic signal before speech coding - Google Patents
- Publication number
- EP0685836A1, EP95401261A
- Authority
- EP
- European Patent Office
- Prior art keywords
- signal
- state
- frame
- energy
- pass
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/06—Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
Definitions
- The present invention relates to a method and a device for preprocessing the acoustic signal supplied to a speech coder. It applies in particular, but not exclusively, to improving the performance of low bit rate speech coders.
- The input signal of a speech coder may have a "flatter" spectrum, for example when a hands-free system with a microphone of linear frequency response is used.
- The usual vocoders are designed to be independent of the input with which they operate, and they are not informed of the characteristics of this input. If microphones with different characteristics are likely to be connected to the vocoder, or more generally if the vocoder is likely to receive acoustic signals with different spectral characteristics, there are cases where the vocoder is used sub-optimally.
- A main object of the present invention is to improve the performance of a vocoder by making it less dependent on the spectral characteristics of the signal supplied to it.
- The method according to the invention consists in subjecting the input acoustic signal to high-pass filtering; in comparing the energy of the high-pass filtered signal with that of the unfiltered signal to determine a state of the signal from among a first state, for which the energy of the high-pass filtered signal is greater than a predetermined fraction of the energy of the unfiltered signal, and a second state, for which the energy of the high-pass filtered signal is less than that predetermined fraction; and in sending to the coder input the high-pass filtered signal subjected to a high-frequency pre-emphasis when the signal is in its second state.
- The high-pass filter used is typically a filter with an abrupt cutoff at 400 Hz, and the predetermined energy fraction is typically 85 to 95%.
- The first signal state corresponds to the IRS characteristics.
- The second state corresponds to a flatter spectrum of the input acoustic signal, containing proportionally more energy at low frequencies.
- Such a flat-spectrum signal is preprocessed (high-pass filtering and pre-emphasis) to bring its spectral characteristics closer to those of the IRS mask.
- Using high-pass filtering to determine the state of the signal has the advantage, compared with low-pass filtering, that the filtered signal can itself be sent (after pre-emphasis) to the vocoder input.
- The determined state of the signal can be modified only when the input acoustic signal, or the high-pass filtered signal, has an energy greater than a predetermined threshold.
- The signal energy falls below this predetermined threshold, for example, in a zone of silence or of low ambient noise.
- When the acoustic signal is digitized in successive frames, it is detected whether the signal in each frame is in a first condition corresponding to the first state or in a second condition corresponding to the second state, and the state of the signal is determined from the frame-by-frame conditions, the determined state being modified only after several successive frames show a signal condition different from that corresponding to the previously determined state.
- This introduces a kind of hysteresis that makes it possible to take account of the rapid variations in the spectral envelope of the speech signal, due to ambient noise or to the speech itself (the timbre of the voice is not constant). This reduces the risk of falsely determining the state of the signal, which gives a better quality of the coded signal and avoids timbre discontinuities that untimely modifications of the determined state could cause.
- The preprocessing device comprises a high-pass filter receiving the acoustic input signal; means for calculating the energies contained respectively in said acoustic signal and in the output signal of the high-pass filter; means for comparing the calculated energies; and a high-frequency pre-emphasis filter, the input of which receives the output signal of the high-pass filter and the output of which delivers the signal sent to the input of the coder when the comparison means reveal that the output signal of the high-pass filter contains less than a predetermined fraction of the energy of said acoustic signal.
- The two solid lines correspond to the limits of the IRS mask defined for microphones in CCITT Recommendation P.48.
- An IRS-type microphone signal has a strong attenuation in the lower part of the spectrum (between 0 and 300 Hz) and a relative attenuation at high frequencies.
- A signal of the linear type, supplied for example by the microphone of a hands-free installation, has a flatter spectrum, in particular without the strong attenuation at low frequencies (a typical example of such a linear signal is illustrated by a dashed line on the diagram in Figure 1).
- The preprocessing device 10 processes the input signal supplied by an acoustic signal source and sends it to a speech coder 12.
- The coder 12 is a low bit rate coder optimized for an IRS-type input signal. It can be, among other things, a linear prediction coder with excitation by regular pulse vectors (RP-CELP), as described in document EP-A-0 347 307.
- The coder 12 has no a priori knowledge of the source of the acoustic signal sent to it.
- The acoustic input signal S_I is the output signal of a microphone 13, amplified and digitized by an analog-to-digital converter 14.
- The signal is typically digitized at a sampling rate of 8 kHz and arranged in successive 30 ms frames, each containing 240 16-bit samples.
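As a minimal sketch of the framing arithmetic just described: at 8 kHz, a 30 ms frame holds 8000 × 0.030 = 240 samples. The helper name `split_into_frames` and the choice to drop a trailing partial frame are assumptions made for illustration, not details from the patent.

```python
SAMPLE_RATE_HZ = 8000   # sampling rate given in the text
FRAME_MS = 30           # frame duration given in the text
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000  # 240 samples per frame


def split_into_frames(samples, frame_len=FRAME_LEN):
    """Cut a sample sequence into consecutive full frames.

    Assumption: a trailing partial frame is simply dropped.
    """
    n_full = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_full)]
```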
- The preprocessing device 10 comprises a high-pass filter 16 receiving the input acoustic signal S_I and delivering a filtered signal S_I′.
- The filter 16 is typically a digital filter of the biquad type with an abrupt cutoff at 400 Hz.
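The text gives only the filter type (biquad) and the cutoff (400 Hz), not the coefficients. The sketch below is one possible realization, assuming a second-order Butterworth-style high-pass designed with the well-known Audio EQ Cookbook formulas (Q = 1/√2). The function names, the Q value, and the direct-form-I structure are all assumptions. A single biquad of this kind rolls off at roughly 12 dB per octave, so a design aiming for a truly abrupt cutoff would probably use different coefficients.

```python
import math


def highpass_biquad_coeffs(cutoff_hz=400.0, sample_rate_hz=8000.0, q=1 / math.sqrt(2)):
    """Return normalized (b0, b1, b2, a1, a2) for a high-pass biquad."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b0 = (1.0 + cos_w0) / 2.0 / a0
    b1 = -(1.0 + cos_w0) / a0
    b2 = b0
    a1 = -2.0 * cos_w0 / a0
    a2 = (1.0 - alpha) / a0
    return b0, b1, b2, a1, a2


def highpass_filter(samples, coeffs=None):
    """Filter a sequence with a direct-form-I biquad, starting from zero state."""
    b0, b1, b2, a1, a2 = coeffs or highpass_biquad_coeffs()
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out
```

A constant (DC) input decays to zero at the output, while a tone well above 400 Hz passes almost unchanged.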
- The energies E1 and E2 contained in each frame of the acoustic input signal S_I and of the filtered signal S_I′ are calculated by two units 17, 18, each computing the sum of the squares of the samples of each frame it receives.
- The calculated energies E1 and E2 are supplied to a comparison unit 20, which determines the state of the signal in the form of a bit Y equal to 0 when it determines that the signal is of the IRS type (state Y_A) and to 1 when it determines that the signal is rather of the linear type (state Y_B).
- The output of the preprocessing device 10, connected to the input of the coder 12, is formed by one terminal of a switch 21, the other terminal of which is connected either to the input of the high-pass filter 16 or to the output of a pre-emphasis filter 22, according to the value of the bit Y delivered by the comparison unit 20.
- The pre-emphasis filter 22 has a transfer function of the form $H(z) = 1 - \lambda / z$.
- Here $\lambda$ denotes a pre-emphasis coefficient, typically of the order of 0.4.
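A first-order pre-emphasis with a transfer function of the form 1 − c/z corresponds to the difference equation y[n] = x[n] − c·x[n−1]. The following is a minimal sketch, assuming the coefficient value 0.4 quoted above; the function name `pre_emphasize` and the `prev` carry-over argument are illustrative assumptions.

```python
def pre_emphasize(samples, coeff=0.4, prev=0.0):
    """Apply y[n] = x[n] - coeff * x[n-1], i.e. H(z) = 1 - coeff / z.

    `prev` is the last input sample of the previous frame (assumed 0.0 at
    start-up), so consecutive frames can be filtered without a discontinuity.
    """
    out = []
    for x in samples:
        out.append(x - coeff * prev)
        prev = x
    return out
```

With a coefficient of 0.4, the gain is 0.6 at DC and 1.4 at half the sampling rate, which is the intended lift of the high frequencies relative to the low ones.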
- The comparison unit 20 is, for example, in accordance with the diagram illustrated in FIG. 3.
- The energy E1 of each frame of the input signal S_I is sent to the input of a threshold comparator 25, which delivers a bit Z of value 0 when the energy E1 is less than a predetermined energy threshold and of value 1 when the energy E1 is greater than the threshold.
- The energy threshold is typically of the order of -38 dB relative to the signal saturation energy.
- The comparator 25 serves to inhibit the determination of the state of the signal when the signal contains too little energy to be representative of the characteristics of the source. In this case, the determined state of the signal remains unchanged.
- The energies E1 and E2 are sent to a digital divider 26, which calculates the ratio E2/E1 for each frame.
- This ratio E2/E1 is sent to another threshold comparator 27, which delivers a bit X of value 0 when the ratio E2/E1 is greater than a predetermined threshold and of value 1 when the ratio is less than the threshold.
- This threshold on the ratio E2/E1 is typically of the order of 0.93.
- The bit X is representative of a signal condition on each frame.
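The two threshold comparators can be sketched in a few lines, with E1 the energy of the unfiltered frame and E2 that of the high-pass filtered frame. Several points here are assumptions rather than details from the text: the "signal saturation energy" is taken to be the energy of a 240-sample frame at the 16-bit full-scale amplitude 32767, the ratio test is written as E2 < threshold × E1 to avoid dividing by zero, and the function names are hypothetical.

```python
FRAME_LEN = 240
FULL_SCALE = 32767  # assumption: 16-bit full-scale amplitude
SATURATION_ENERGY = FRAME_LEN * FULL_SCALE ** 2  # assumption, see lead-in
ENERGY_THRESHOLD = SATURATION_ENERGY * 10 ** (-38 / 10)  # -38 dB below saturation
RATIO_THRESHOLD = 0.93


def frame_energy(frame):
    """Sum of the squares of the samples of one frame (units 17 and 18)."""
    return sum(s * s for s in frame)


def condition_bits(e1, e2):
    """Return (x, z) for one frame.

    z = 1 when E1 exceeds the energy threshold (comparator 25).
    x = 1 when E2/E1 is below the ratio threshold (comparator 27).
    """
    z = 1 if e1 > ENERGY_THRESHOLD else 0
    # E2 < t * E1 is equivalent to E2 / E1 < t for E1 > 0.
    x = 1 if e2 < RATIO_THRESHOLD * e1 else 0
    return x, z
```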
- The state bit Y is not taken directly equal to the condition bit X; it results from processing of the successive condition bits X by a state determination circuit 29.
- The operation of the state determination circuit 29 is illustrated in FIG. 4, where the upper timing diagram shows an example of the evolution of the bit X supplied by the comparator 27.
- The state bit Y (lower timing diagram) is initialized to 0, because IRS characteristics are the most frequently encountered.
- As soon as the counting variable V reaches a predetermined threshold (8 in the example considered), it is reset to 0 and the value of the bit Y is changed, so that the signal is determined to have changed state.
- The signal is in state Y_A up to frame M, in state Y_B between frames M and N (change of signal source), then again in state Y_A from frame N.
- Other modes of incrementation and decrementation and other threshold values could be used.
- The above counting mode can, for example, be obtained by the circuit 29 shown in FIG. 3.
- This circuit includes a four-bit counter 32, the most significant bit of which corresponds to the state bit Y and the three least significant bits of which represent the counting variable V.
- The bits X and Y are supplied to the inputs of an EXCLUSIVE OR gate 33, the output of which is sent to the incrementing input of the counter 32 via an AND gate 34 whose other input receives the bit Z supplied by the threshold comparator 25.
- The inverted output of the gate 33 is supplied to a decrementing input of the counter 32 via another AND gate 35, whose two other inputs respectively receive the bit Z supplied by the comparator 25 and the output of a three-input OR gate 36 receiving the three least significant bits of the counter 32.
- The counter 32 is arranged to split in two the pulses received on its decrementing input when its least significant bit is 0 or when at least one of the following two bits is 1, as shown diagrammatically by the OR gate 37 in FIG. 3.
- When the bit Z is 0, the determination circuit 29 is not activated, because the AND gates 34, 35 prevent the value of the counter 32 from being changed.
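The gate-level description of circuit 29 can be mimicked in software. This is a hedged sketch only: the class name `StateCounter` and its method are hypothetical. It assumes that V increases by one for each frame whose condition bit X differs from Y, and it reads the "split" decrement pulses as counting twice, so that V drops by 2 when the condition of gate 37 holds (V different from 1) and by 1 otherwise. Neither step size is stated explicitly in the text above.

```python
class StateCounter:
    """Software model of the four-bit counter 32: MSB = Y, low three bits = V."""

    def __init__(self):
        self.value = 0  # Y = 0 (IRS-type state), V = 0

    @property
    def y(self):
        return (self.value >> 3) & 1

    @property
    def v(self):
        return self.value & 0b111

    def update(self, x, z):
        """Process one frame's bits X and Z and return the state bit Y."""
        differs = x ^ self.y                        # EXCLUSIVE OR gate 33
        if differs and z:                           # AND gate 34: increment
            # Going from V = 7 to V = 0 carries into the MSB, which toggles Y.
            self.value = (self.value + 1) & 0b1111
        elif not differs and z and self.v != 0:     # AND gate 35 with OR gate 36
            lsb = self.v & 1
            upper = self.v >> 1
            doubled = (lsb == 0) or (upper != 0)    # OR gate 37 (assumed meaning)
            self.value -= 2 if doubled else 1
        return self.y
```

Under these assumptions, eight consecutive frames that disagree with the current state flip Y, isolated disagreeing frames are absorbed, and frames with Z = 0 leave the counter untouched.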
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| FR9406824A FR2720849B1 (fr) | 1994-06-03 | 1994-06-03 | Procédé et dispositif de prétraitement d'un signal acoustique en amont d'un codeur de parole. |
| FR9406824 | 1994-06-03 | | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| EP0685836A1 (de) | 1995-12-06 |
| EP0685836B1 (de) | 1999-07-21 |
Family
ID=9463860
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP95401261A Expired - Lifetime EP0685836B1 (de) | 1994-06-03 | 1995-05-31 | Verfahren und Gerät zur Vorverarbeitung eines akustischen Signals vor der Sprachcodierung |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US5644679A (de) |
| EP (1) | EP0685836B1 (de) |
| DE (1) | DE69510865T2 (de) |
| FR (1) | FR2720849B1 (de) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| FR2729247A1 (fr) * | 1995-01-06 | 1996-07-12 | Matra Communication | Procede de codage de parole a analyse par synthese |
| US6799159B2 (en) * | 1998-02-02 | 2004-09-28 | Motorola, Inc. | Method and apparatus employing a vocoder for speech processing |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP0243562A1 (de) * | 1986-04-30 | 1987-11-04 | International Business Machines Corporation | Sprachkodierungsverfahren und Einrichtung zur Ausführung dieses Verfahrens |
| EP0347307A2 (de) * | 1988-06-13 | 1989-12-20 | Matra Communication | Kodierungsverfahren und linearer Prädiktionssprachkodierer |
| EP0477960A2 (de) * | 1990-09-26 | 1992-04-01 | Nec Corporation | Sprachcodierung durch lineare Prädiktion mit Anhebung der Hochfrequenzen |
1994
- 1994-06-03 FR FR9406824A patent/FR2720849B1/fr not_active Expired - Fee Related
1995
- 1995-05-31 DE DE69510865T patent/DE69510865T2/de not_active Expired - Fee Related
- 1995-05-31 EP EP95401261A patent/EP0685836B1/de not_active Expired - Lifetime
- 1995-06-05 US US08/462,209 patent/US5644679A/en not_active Expired - Lifetime
Also Published As
| Publication number | Publication date |
|---|---|
| US5644679A (en) | 1997-07-01 |
| DE69510865T2 (de) | 2000-07-13 |
| DE69510865D1 (de) | 1999-08-26 |
| FR2720849B1 (fr) | 1996-08-14 |
| FR2720849A1 (fr) | 1995-12-08 |
| EP0685836B1 (de) | 1999-07-21 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| EP0127718B1 (de) | Verfahren zur Aktivitätsdetektion in einem Sprachübertragungssystem | |
| EP0867856B1 (de) | Verfahren und Vorrichtung zur Sprachdetektion | |
| EP0768770B1 (de) | Verfahren und Vorrichtung zur Erzeugung von Hintergrundrauschen in einem digitalen Übertragungssystem | |
| CN1064772C (zh) | 语音活动性检测器 | |
| EP0932964B1 (de) | Verfahren und vorrichtung zum blinden ausgleich von übertragungskanaleffekten auf ein digitales sprachsignal | |
| EP1008140B1 (de) | Wellenform-basierter periodizitätsdetektor | |
| US20020120440A1 (en) | Method and apparatus for improved voice activity detection in a packet voice network | |
| FR2520539A1 (fr) | Procede et systeme de traitement des silences dans un signal de parole | |
| CA2259641A1 (en) | Microphone noise rejection system | |
| EP0685833B1 (de) | Verfahren zur Sprachkodierung mittels linearer Prädiktion | |
| EP0428445B1 (de) | Verfahren und Einrichtung zur Codierung von Prädiktionsfiltern in Vocodern mit sehr niedriger Datenrate | |
| EP0043056B1 (de) | Verfahren zum Erkennen der Sprache in einem Signal eines telephonischen Sprechkreises und Sprachdetektor dafür | |
| NZ261180A (en) | Coder/decoder for background sounds in digital telephony | |
| EP0692883A1 (de) | Verfahren zur blinden Entzerrung, und dessen Anwendung zur Spracherkennung | |
| EP0685836B1 (de) | Verfahren und Gerät zur Vorverarbeitung eines akustischen Signals vor der Sprachcodierung | |
| EP0714088B1 (de) | Sprachaktivitätsdetektion | |
| EP1039736A1 (de) | Verfahren und Vorrichtung zur adaptiven Identifikation und entsprechender adaptiver Echokompensator | |
| EP1229517B1 (de) | Verfahren zur Spracherkennung mit geräuschabhängiger Normalisierung der Varianz | |
| WO1990009656A1 (fr) | Appareil de traitement de la parole | |
| CA1165917A (fr) | Dispositif de mesure de l'attenuation d'un trajet de transmission | |
| EP0776114A2 (de) | Telefonapparat mit in Abhängigkeit des Umgebungsgeräusches regelbarer Lautstärke | |
| US6633847B1 (en) | Voice activated circuit and radio using same | |
| EP0015363B1 (de) | Sprachdetektor mit einem variablen Schwellwert | |
| EP1729287A1 (de) | Verfahren und Vorrichtung für adaptive Rauschunterdrückung | |
| EP0470548B1 (de) | Verfahren zur Kontrolle der Dämpfung bei einem digitalen Freisprechtelefonapparat | |
Legal Events
| Code | Title | Description |
|---|---|---|
| PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
| AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): DE ES GB IT NL SE |
| 17P | Request for examination filed | Effective date: 19951118 |
| RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: MATRA NORTEL COMMUNICATIONS |
| GRAG | Despatch of communication of intention to grant | Free format text: ORIGINAL CODE: EPIDOS AGRA |
| 17Q | First examination report despatched | Effective date: 19981112 |
| GRAG | Despatch of communication of intention to grant | Free format text: ORIGINAL CODE: EPIDOS AGRA |
| GRAH | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOS IGRA |
| GRAH | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOS IGRA |
| GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
| AK | Designated contracting states | Kind code of ref document: B1; Designated state(s): DE ES GB IT NL SE |
| PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: SE; Free format text: THE PATENT HAS BEEN ANNULLED BY A DECISION OF A NATIONAL AUTHORITY; Effective date: 19990721. Ref country code: NL; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 19990721. Ref country code: ES; Free format text: THE PATENT HAS BEEN ANNULLED BY A DECISION OF A NATIONAL AUTHORITY; Effective date: 19990721 |
| GBT | Gb: translation of ep patent filed (gb section 77(6)(a)/1977) | Effective date: 19990804 |
| REF | Corresponds to: | Ref document number: 69510865; Country of ref document: DE; Date of ref document: 19990826 |
| ITF | It: translation for a ep patent filed | |
| NLV1 | Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act | |
| PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
| 26N | No opposition filed | |
| REG | Reference to a national code | Ref country code: GB; Ref legal event code: IF02 |
| PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: DE; Payment date: 20040528; Year of fee payment: 10 |
| PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: GB; Payment date: 20050414; Year of fee payment: 11 |
| PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: IT; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; WARNING: LAPSES OF ITALIAN PATENTS WITH EFFECTIVE DATE BEFORE 2007 MAY HAVE OCCURRED AT ANY TIME BEFORE 2007. THE CORRECT EFFECTIVE DATE MAY BE DIFFERENT FROM THE ONE RECORDED.; Effective date: 20050531 |
| PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: DE; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20051201 |
| PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: GB; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20060531 |
| GBPC | Gb: european patent ceased through non-payment of renewal fee | Effective date: 20060531 |