EP1926082A1 - Process for scalable encoding of stereo signals - Google Patents

Process for scalable encoding of stereo signals

Info

Publication number
EP1926082A1
EP1926082A1 EP07022523A EP07022523A EP1926082A1 EP 1926082 A1 EP1926082 A1 EP 1926082A1 EP 07022523 A EP07022523 A EP 07022523A EP 07022523 A EP07022523 A EP 07022523A EP 1926082 A1 EP1926082 A1 EP 1926082A1
Authority
EP
European Patent Office
Prior art keywords
signals
center
quantization
quantized
channels
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP07022523A
Other languages
German (de)
English (en)
Inventor
Bernhard Dr. Feiten
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Deutsche Telekom AG
Original Assignee
Deutsche Telekom AG
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Deutsche Telekom AG

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008: Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032: Quantisation or dequantisation of spectral components

Definitions

  • the present invention relates to the encoding of stereo signals, and more particularly to the application of scalable encoding techniques.
  • Scalable encoding methods for data compression of audio signals have the advantage that the transmission rate can be adapted dynamically to the properties of the networks and terminals.
  • a gradation of the bit rate by the coding method in small steps is particularly advantageous.
  • a stereo signal includes at least two channels, a left channel and a right channel.
  • the similarity between the two channels is exploited.
  • One known method of transmitting stereo signals is the mid / side method [ Michael Dickreiter, Handbuch der Tonstudiotechnik, Saur Verlag, 1997 ].
  • the left and right channels are combined with each other to produce a center channel and a side channel.
  • the center channel is made up of the sum of the right and left channels, while the side channel is made up of the difference between the left and right channels.
  • $M = 0.5 \cdot (R + L)$
  • $S = 0.5 \cdot (R - L)$
  • the factor 0.5 is a common choice in practice; a different factor can also be chosen.
  • center / side processing results in a significant saving in the number of bits needed for encoding, since the side channel then has considerably less energy than the left or right channel, so far fewer bits are needed to code the side channel.
  • if the left and right channels are identical, the center channel is equal to the left channel and to the right channel, while the side channel is 0. The more similar the left and right channels are, the less energy the side channel has, and the fewer bits are needed to encode it. If the right and left channels are less similar, the bit efficiency of center / side encoding decreases accordingly.
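The matrixing described above can be sketched in a few lines of Python. This is only an illustration: the function names `ms_encode`, `ms_decode` and `energy` and the sample values are assumptions, not part of the patent text. It uses the factor 0.5 from the equations and shows that the side channel carries very little energy when the two channels are nearly identical.

```python
def ms_encode(left, right):
    """Matrix left/right samples into center (M) and side (S) samples."""
    mid = [0.5 * (r + l) for l, r in zip(left, right)]
    side = [0.5 * (r - l) for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Invert the matrixing: R = M + S, L = M - S."""
    right = [m + s for m, s in zip(mid, side)]
    left = [m - s for m, s in zip(mid, side)]
    return left, right


def energy(samples):
    """Sum of squared sample values."""
    return sum(v * v for v in samples)


# Hypothetical sample values: the right channel is nearly identical to the left.
left = [1.0, -2.0, 3.0, -4.0]
right = [1.0, -2.0, 3.5, -4.0]

mid, side = ms_encode(left, right)
print("side energy:", energy(side), "left energy:", energy(left))
```

Without quantization the matrixing is exactly invertible, so any loss in a real codec comes from the quantization step discussed further below.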
  • the coding of the stereo signals is usually carried out with methods that process the audio signals in the spectral range.
  • the left and right channels of the audio signal which are usually in the form of PCM (Pulse Code Modulation) samples, are converted from the time domain to the frequency domain.
  • modern coding methods use the so-called modified discrete cosine transformation (MDCT) in order to obtain a block-wise frequency representation of an audio signal.
  • the stream of discrete-time audio samples is windowed to obtain a windowed block of audio samples, which are then transformed into a spectral representation by a transform. For each time window, one obtains a corresponding number of spectral coefficients.
  • the transformation divides the frequency spectrum into a certain number of frequency bands (subbands) of equal width.
  • the number of transformation points and the sampling rate determine the bandwidth of the subbands. These subbands are grouped according to the characteristics of human hearing: at low frequencies only a few subbands fall into one group, at high frequencies many.
  • For each group a scaling factor is determined.
  • the quantization of the spectral coefficients then takes place relative to these scaling factors.
  • bits are assigned to the scaling factors and the transform coefficient according to the target bit rate. The bit allocation takes place in such a way that the resulting error can be perceived as little as possible.
  • the scaling factors are also transmitted and are required so that the decoder is able to reconstruct the original signal from the transmitted bits.
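A minimal sketch of quantization relative to per-group scaling factors follows. It rests on several assumptions that are not in the source: the step size is derived from the peak value of each group (a real encoder would use a psychoacoustic model and the target bit rate), `levels` is a hypothetical parameter, and the names `encode_bands` and `decode_bands` are chosen for illustration only.

```python
import math


def encode_bands(spectrum, band_edges, levels=15):
    """Quantize each group of subbands relative to its own scaling factor.

    band_edges lists the group boundaries; groups are narrow at low
    frequencies and wide at high frequencies.
    """
    encoded = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        band = spectrum[lo:hi]
        peak = max(abs(c) for c in band)
        # Assumed rule: the step (scaling factor) follows the group peak.
        step = peak / levels if peak > 0 else 1.0
        indices = [math.floor(c / step + 0.5) for c in band]
        encoded.append((step, indices))
    return encoded


def decode_bands(encoded):
    """Reconstruct spectral values; the decoder needs the scaling factors."""
    return [i * step for step, indices in encoded for i in indices]


# Hypothetical 16-point spectrum and grouping (2, 2, 4 and 8 subbands).
spectrum = [0.37 * (16 - k) * (-1) ** k for k in range(16)]
band_edges = [0, 2, 4, 8, 16]

encoded = encode_bands(spectrum, band_edges)
restored = decode_bands(encoded)
```

The scaling factors are stored alongside the quantized indices because, as the text notes, the decoder cannot reconstruct the signal without them.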
  • in middle / side coding, the signals of the left and right channels are matrixed, i.e. summed and subtracted, after the MDCT transformation into the frequency domain.
  • the center and side signals thus formed are then quantized.
  • the quantization is a lossy coding, since process-related quantization errors occur.
  • the quantization errors mean that the signals can not be accurately reconstructed after transmission and an unnatural stereo image is formed.
  • in addition to reducing the amount of data, center / side encoding has the effect that, when the left and right channels are very similar, the quantization error in the left channel is correlated with the quantization error in the right channel, so that the quantization error is located in the middle of the stereo image, where it is masked by the useful signal somewhat or much better than in the uncorrelated case.
  • the useful signal will be either left or right, while the quantization error is correlated and more in the middle.
  • the quantized middle / side signals are subsequently entropy-coded in the sense of lossless coding, for example by means of Huffman coding.
  • a bit stream multiplexer forms a transmittable bit stream from the quantized and entropy-coded center / side signals.
  • Scalable coding methods are particularly advantageous for stereo signals [ J. Li; Embedded Audio Coding (EAC) With Implicit Auditory Masking; ACM Multimedia 2002 ].
  • Scalable coding methods are designed such that the output-side bit stream has at least a first and a second scaling layer.
  • the first scaling layer may differ from the second scaling layer or from any number of further scaling layers in the audio coding method itself, in the audio bandwidth, in the audio quality with respect to mono / stereo or in a combination of the quality criteria mentioned.
  • Scalable audio encoders for multichannel stereo transmission are often designed so that the mono signal, i.e. the center signal, is used as the first scaling layer, while the side channel is embedded in the other scaling layers.
  • a decoder that is simply designed will take only the first scaling layer from the scaled bitstream and provide a mono signal.
  • a stereo decoder also uses the side layer in addition to the center layer to provide a full bandwidth stereo signal.
  • a scalable stereo encoder that uses the center signal as the first scaling layer and the side signal in the other scaling layers has its best overall efficiency when the left channel is highly similar to the right channel. For stereo channels that are not correlated, or when the characteristics of the two channels change suddenly, the efficiency of mid / side encoding is reduced.
  • the process of decoding a mid / side transmission is such that the received bit stream is divided by a demultiplexer into encoded quantized center / side signals and additional information.
  • the entropy-coded quantized center / side signals are first entropy-decoded to obtain the quantized center / side signals, which are then inversely quantized.
  • the decoded center / side signals contain the quantization errors introduced during encoding. As a result, the signals for the left and right channels, which are converted back into the time domain after dematrixing by means of a synthesis filter bank, cannot be reconstructed in their original proportions.
  • the object of the present invention is to ensure, when scalable coding according to the middle / side method is applied, that quantization errors are better masked in the spatial reproduction and that stereo imaging errors are minimized.
  • the object is achieved in that in the process of encoding the left channel and right channel are transformed and quantized by themselves and the middle / side processing only after the quantization takes place. The sum and subtraction is thus carried out with the already quantized signals of the left and right channels.
  • the invention is based on the finding that the effect of the quantization error in the center / side matrixing can be reduced if the matrixing is carried out after the quantization. This can be shown using the transfer equations.
  • the center signal is formed by the addition of the left and right channel, the side signal is formed by the difference.
  • $M = 0.5 \cdot R + 0.5 \cdot L$
  • $S = 0.5 \cdot R - 0.5 \cdot L$
  • $R' = Q(0.5 \cdot R + 0.5 \cdot L) + Q(0.5 \cdot R - 0.5 \cdot L)$
  • $L' = Q(0.5 \cdot R + 0.5 \cdot L) - Q(0.5 \cdot R - 0.5 \cdot L)$
  • $R' = Q(0.5 \cdot R + 0.5 \cdot L)$
  • $L' = Q(0.5 \cdot R + 0.5 \cdot L)$
  • the inventive optimization of the center / side stereophony using the quantization for the signals of the right and the left channel is as follows.
  • the quantization error is denoted by d and can assume values in the range $-D/2 \le d \le D/2$.
  • the quantization error of the center signal is dm, that of the side signal ds. There is a random relationship between dm and ds.
  • in M / S quantization, the quantization errors add up when the channels are reconstructed, so the total error can take values between $-D$ and $+D$.
  • dr is the quantization error for the right channel
  • dl the quantization error for the left channel.
  • the quantization error d can assume the values $-D/2 \le d \le D/2$, as already shown.
  • with R / L quantization, the quantization errors do not add up. The error thus remains in the range $-D/2 \le d \le D/2$.
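The difference between the two orderings can be checked with a small numeric sketch. The assumptions here are mine, not the patent's: `Q` is a plain uniform rounding quantizer with step `D = 1.0`, and the sample values `R = 0.9`, `L = 0.0` are picked so that the errors dm and ds have the same sign.

```python
import math

D = 1.0  # assumed quantizer step size


def Q(x, step=D):
    """Uniform quantizer: round to the nearest multiple of the step."""
    return math.floor(x / step + 0.5) * step


R, L = 0.9, 0.0  # hypothetical spectral values of the right and left channel

# Conventional order: matrix first, then quantize M and S.
M_q = Q(0.5 * R + 0.5 * L)
S_q = Q(0.5 * R - 0.5 * L)
R_ms, L_ms = M_q + S_q, M_q - S_q

# Order described here: quantize R and L first, then matrix.
R_q, L_q = Q(R), Q(L)
M2 = 0.5 * R_q + 0.5 * L_q
S2 = 0.5 * R_q - 0.5 * L_q
R_lr, L_lr = M2 + S2, M2 - S2

print("error with M/S quantization:", abs(R_ms - R))
print("error with R/L quantization:", abs(R_lr - R))
```

In this example the error of the right channel exceeds D/2 when M and S are quantized, because dm and ds add up, whereas quantizing R and L first keeps the error within D/2 and the subsequent matrixing itself is lossless.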
  • an encoder and a decoder are shown as an example of applying the inventive principle of forming the center / side signals after quantizing the signals of the left and right channels.
  • the description is limited to a two-channel transmission and encoding. However, the same principles can also be applied to multi-channel transmission and coding.
  • the left (10) and right channel (20) of an audio signal are first transformed from the time domain into the frequency domain.
  • the known principle of the sliding modified cosine transform (200) is used for both audio channels.
  • the spectral values of the left (11) and right (21) channels are quantized in the next step.
  • the quantizer (300) is controlled by a quantization controller (500).
  • the quantization can, as is known from other methods, be supported by a division into frequency bands. This division has the advantage that the quantization error is adapted to the spectral properties of the useful signal and is therefore less readily audible.
  • the quantization is adapted to the signal level in the respective frequency band by determining a scaling factor for each band.
  • the quantization controller uses the left (10) and right (20) input channels to determine the scaling factors.
  • a special feature of the quantization control in the new coding method is that the same scaling factor must be used for the left and right channel in order to enable the sum and difference formation in a linear number space.
  • various known methods can be used to determine the optimal scaling factors [ Marina Bosi and Karlheinz Brandenburg; Introduction to Digital Audio Coding and Standards; Springer Verlag 2002 ].
  • the quantization fulfills the function of a lossy reduction of the bits required for the coding.
  • the spectrally decomposed and quantized left (12) and right (22) channels are now fed to a center / side transformation stage (100) for converting left / right signals into center / side signals.
  • A further data reduction takes place in a subsequent stage for lossless coding (400).
  • The center (40) and side signals (50) and the scaling factors (60) are supplied to this stage, which can be realized, for example, with Huffman coding, as is usual in other coding methods.
  • the result is the coded signal (80).
  • the decoding of the encoded signal (80) is done by performing the steps in reverse order.
  • the lossless decoding reconstructs the center (41) and side signals (51) as well as the scaling factors (61).
  • the center and side signals are transformed back to left (13) and right (23) quantized signals.
  • inverse quantization (301) is carried out to produce the original values of the spectral coefficients.
  • the spectrally decomposed left (14) and right (24) signals are transformed back with the inverse modified discrete cosine transform (201) into the reconstructed signals for the left (15) and right (25) channels.
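The encoder and decoder chain described above can be sketched as follows, leaving out the MDCT (200/201) and the lossless coding stage (400). Everything named here is an assumption for illustration: the functions `encode` and `decode`, the peak-based choice of the shared scaling factor, the `levels` parameter, and the use of a matrixing factor of 1 instead of 0.5 so that the center and side values stay integers (the text notes that the factor can be chosen differently).

```python
import math


def encode(left_spec, right_spec, band_edges, levels=15):
    """Quantize L and R per band with a shared step, then form M/S."""
    frame = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        l_band, r_band = left_spec[lo:hi], right_spec[lo:hi]
        peak = max(abs(c) for c in l_band + r_band)
        # One scaling factor for both channels, so that sums and
        # differences of the indices live on the same linear scale.
        step = peak / levels if peak > 0 else 1.0
        lq = [math.floor(c / step + 0.5) for c in l_band]
        rq = [math.floor(c / step + 0.5) for c in r_band]
        mid = [r + l for l, r in zip(lq, rq)]   # assumed factor 1, not 0.5
        side = [r - l for l, r in zip(lq, rq)]
        frame.append({"step": step, "mid": mid, "side": side})
    return frame


def decode(frame):
    """Undo the matrixing on the indices, then apply inverse quantization."""
    left, right = [], []
    for band in frame:
        for m, s in zip(band["mid"], band["side"]):
            r_idx = (m + s) // 2  # exact, since m + s == 2 * rq
            l_idx = (m - s) // 2  # exact, since m - s == 2 * lq
            right.append(r_idx * band["step"])
            left.append(l_idx * band["step"])
    return left, right


# Hypothetical spectral values for one block.
left_spec = [0.9, -0.4, 0.25, 0.1, -0.05, 0.02, 0.01, 0.0]
right_spec = [0.85, -0.45, 0.2, 0.12, -0.02, 0.03, 0.0, 0.01]
band_edges = [0, 2, 4, 8]

frame = encode(left_spec, right_spec, band_edges)
left_out, right_out = decode(frame)
```

Because the matrixing operates on already quantized integers, it is exactly reversible, and the only error left in each channel is that channel's own quantization error of at most half a step.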
  • the present invention for minimizing the quantization errors also makes it possible in practice to make the generation of the bit stream more flexible.
  • the encoded signal (80) can be scaled in size (bit rate).
  • the bitstream contains the scaling factors, the center signal and the side signal.
  • the bitrate can now be reduced in several ways. First, high-frequency components of the side signal can be omitted. Then, for example, the high frequency components of the center signal can be omitted. The unused scaling factors then do not need to be transmitted. In the next step, the low-frequency components of the side signal could then be reduced until, for example, the side signal no longer occurs in the bit stream. The quality of the stereo transmission can thus be transferred step by step into a mono transmission with decreasing spectral bandwidth.
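The stepwise bit rate reduction can be illustrated with a small sketch. The frame layout (a list of bands ordered from low to high frequency, each with `step`, `mid` and `side` entries), the function name `scale_frame` and the sample values are hypothetical and only serve to show the order in which components are dropped.

```python
def scale_frame(frame, keep_mid_bands, keep_side_bands):
    """Return a reduced copy of a frame whose bands run from low to high frequency.

    Only the lowest keep_mid_bands center bands and the lowest
    keep_side_bands side bands are kept.
    """
    scaled = []
    for i, band in enumerate(frame):
        reduced = {}
        if i < keep_mid_bands:
            reduced["mid"] = band["mid"]
        if i < keep_side_bands:
            reduced["side"] = band["side"]
        if reduced:
            # A scaling factor is only transmitted if the band is still used.
            reduced["step"] = band["step"]
            scaled.append(reduced)
    return scaled


# Hypothetical frame with three bands.
frame = [
    {"step": 0.06, "mid": [29, -14], "side": [-1, -1]},
    {"step": 0.02, "mid": [23, 11], "side": [-3, 1]},
    {"step": 0.01, "mid": [-7, 5, 1, 1], "side": [3, 1, -1, 1]},
]

full_stereo = scale_frame(frame, keep_mid_bands=3, keep_side_bands=3)
less_side = scale_frame(frame, keep_mid_bands=3, keep_side_bands=1)
narrow_mono = scale_frame(frame, keep_mid_bands=2, keep_side_bands=0)
```

Lowering `keep_side_bands` first, then `keep_mid_bands`, and finally setting `keep_side_bands` to zero follows the sequence in the text, ending in a mono signal with reduced bandwidth.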

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
EP07022523A 2006-11-25 2007-11-20 Procédé de codage échelonnable de signaux stéréo Withdrawn EP1926082A1 (fr)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
DE102006055737A DE102006055737A1 (de) 2006-11-25 2006-11-25 Verfahren zur skalierbaren Codierung von Stereo-Signalen

Publications (1)

Publication Number Publication Date
EP1926082A1 true EP1926082A1 (fr) 2008-05-28

Family

ID=39106071

Family Applications (1)

Application Number Title Priority Date Filing Date
EP07022523A Withdrawn EP1926082A1 (fr) 2006-11-25 2007-11-20 Procédé de codage échelonnable de signaux stéréo

Country Status (3)

Country Link
US (1) US20080136686A1 (fr)
EP (1) EP1926082A1 (fr)
DE (1) DE102006055737A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2285025A1 (fr) * 2009-07-16 2011-02-16 Alcatel Lucent Procédé et appareil pour le codage/décodage d'un signal audio stéréo en un signal audio mono

Families Citing this family (13)

Publication number Priority date Publication date Assignee Title
EP2209114B1 (fr) * 2007-10-31 2014-05-14 Panasonic Corporation Appareil/procédé pour le codage/décodage de la parole
CN101751928B (zh) * 2008-12-08 2012-06-13 扬智科技股份有限公司 应用音频帧频谱平坦度简化声学模型分析的方法及其装置
EP2645367B1 (fr) * 2009-02-16 2019-11-20 Electronics and Telecommunications Research Institute Procédé de codage/décodage de signaux audio par sinusoidal codage adaptatif et dispositif correspondant
US20100331048A1 (en) * 2009-06-25 2010-12-30 Qualcomm Incorporated M-s stereo reproduction at a device
KR101698439B1 (ko) 2010-04-09 2017-01-20 돌비 인터네셔널 에이비 Mdct-기반의 복소수 예측 스테레오 코딩
AU2016222372B2 (en) * 2010-04-09 2018-06-28 Dolby International Ab Mdct-based complex prediction stereo coding
US11361776B2 (en) * 2019-06-24 2022-06-14 Qualcomm Incorporated Coding scaled spatial components
US11538489B2 (en) 2019-06-24 2022-12-27 Qualcomm Incorporated Correlating scene-based audio data for psychoacoustic audio coding
US12142285B2 (en) 2019-06-24 2024-11-12 Qualcomm Incorporated Quantizing spatial components based on bit allocations determined for psychoacoustic audio coding
US12308034B2 (en) 2019-06-24 2025-05-20 Qualcomm Incorporated Performing psychoacoustic audio coding based on operating conditions
DE102019219922B4 (de) 2019-12-17 2023-07-20 Volkswagen Aktiengesellschaft Verfahren zur Übertragung einer Mehrzahl an Signalen sowie Verfahren zum Empfang einer Mehrzahl an Signalen
WO2022097234A1 (fr) * 2020-11-05 2022-05-12 日本電信電話株式会社 Procédé de raffinage du signal sonore, procédé de décodage du signal sonore, dispositifs associés, programme et support d'enregistrement
CN118072721B (zh) * 2024-04-22 2024-07-26 深圳市友杰智新科技有限公司 加速解码方法、装置、设备和介质

Citations (3)

Publication number Priority date Publication date Assignee Title
US20030014136A1 (en) * 2001-05-11 2003-01-16 Nokia Corporation Method and system for inter-channel signal redundancy removal in perceptual audio coding
EP1400955A2 (fr) * 2002-09-04 2004-03-24 Microsoft Corporation Quantisation et quantisation inverse pour signaux audio
CN1787078A (zh) * 2005-10-25 2006-06-14 芯晟(北京)科技有限公司 一种基于量化信号域的立体声及多声道编解码方法与系统

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US5812971A (en) * 1996-03-22 1998-09-22 Lucent Technologies Inc. Enhanced joint stereo coding method using temporal envelope shaping
SG120118A1 (en) * 2003-09-15 2006-03-28 St Microelectronics Asia A device and process for encoding audio data

Non-Patent Citations (1)

Title
MARINA BOSI; KARLHEINZ BRANDENBURG: "Introduction to Digital Audio Coding and Standards", 2002, SPRINGER VERLAG

Also Published As

Publication number Publication date
DE102006055737A1 (de) 2008-05-29
US20080136686A1 (en) 2008-06-12

Similar Documents

Publication Publication Date Title
EP1926082A1 (fr) Procédé de codage échelonnable de signaux stéréo
DE19628292B4 (de) Verfahren zum Codieren und Decodieren von Stereoaudiospektralwerten
EP0910928B1 (fr) Codage et decodage de signaux audio au moyen d'un procede stereo en intensite et de prediction
EP0931386B1 (fr) Procede de signalisation d'une substitution de bruit lors du codage d'un signal audio
EP1864279B1 (fr) Dispositif et procede pour produire un flux de donnees et pour produire une representation multicanaux
DE4320990B4 (de) Verfahren zur Redundanzreduktion
DE69432012T2 (de) Wahrnehmungsgebundene Kodierung von Audiosignalen
DE60225276T2 (de) Codierungsvorrichtung und -verfahren, decodierungsvorrichtung und -verfahren und programm
DE69310990T2 (de) Verfahren zum Einfügen digitaler Daten in ein Audiosignal vor der Kanalkodierung
DE60206390T2 (de) Effiziente und skalierbare parametrische stereocodierung für anwendungen mit niedriger bitrate
DE102006022346B4 (de) Informationssignalcodierung
DE69826529T2 (de) Schnelle datenrahmen-optimierung in einem audio-kodierer
DE19742655C2 (de) Verfahren und Vorrichtung zum Codieren eines zeitdiskreten Stereosignals
DE10200653B4 (de) Skalierbarer Codierer, Verfahren zum Codieren, Decodierer und Verfahren zum Decodieren für einen skalierten Datenstrom
EP0642719B1 (fr) Procede visant a reduire les donnees lors de la transmission et/ou de la memorisation de signaux numeriques provenant de plusieurs canaux interdependants
EP0611516B1 (fr) Procede de reduction de donnees dans la transmission et/ou la mise en memoire de signaux numeriques de plusieurs canaux dependants
WO1988001811A1 (fr) Procede de codage numerique
DE69425768T2 (de) Kodierverfahren, Kodierer und Dekodierer für ein Digitalsignal
DE60217612T2 (de) Verfahren und Vorrichtung zur Kodierung und Dekodierung von Sprachsignalen
DE69611987T2 (de) Übertragungssystem mit zeitabhängigen filterbänken
EP0905918A2 (fr) Procédé et dispositif de codage de signaux audio
DE19735675C2 (de) Verfahren zum Verschleiern von Fehlern in einem Audiodatenstrom
DE69734613T2 (de) Kodiertes Informationssignal
DE4239506A1 (de) Verfahren zur bitratenreduzierenden Quellcodierung für die Übertragung und Speicherung von digitalen Tonsignalen
DE102021203087A1 (de) Kompression von Audiodaten im Fahrzeug

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC MT NL PL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL BA HR MK RS

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN

18W Application withdrawn

Effective date: 20081125

RTI1 Title (correction)

Free format text: PROCESS FOR SCALABLE ENCODING OF STEREO SIGNALS