WO2025201625A1 - Encoder and decoder - Google Patents
Encoder and decoder
Info
- Publication number
- WO2025201625A1 (application PCT/EP2024/057979)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- band
- encoder
- signal
- limited
- decoder
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/038—Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
- G10L19/24—Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/06—Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
Definitions
- the baseband encoder may, according to embodiments, comprise a neural coder and/or an auto-encoder architecture and/or a Vector Quantized Variational Auto-Encoder (VQ-VAE).
- the baseband encoder may perform encoding based on a quantization of a latent representation of the band-limited signal portion.
- the baseband encoder is trained or trainable using adversarial losses.
- the baseband encoder is configured to obtain a latent, the latent being quantized and encoded by the baseband encoder.
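The quantization of a latent representation mentioned above can be illustrated with a minimal nearest-neighbour vector quantizer, in the spirit of a VQ-VAE bottleneck. This is only a sketch: the function names, the codebook size and the latent dimension are assumptions made for illustration, and the text does not specify how the baseband encoder actually quantizes its latent.

```python
import numpy as np

def vq_encode(latent, codebook):
    """Map each latent frame to the index of its nearest codebook vector."""
    # latent: (frames, dim); codebook: (entries, dim)
    diff = latent[:, None, :] - codebook[None, :, :]
    distances = np.sum(diff ** 2, axis=-1)   # squared Euclidean distances
    return np.argmin(distances, axis=1)      # the indices are what would be transmitted

def vq_decode(indices, codebook):
    """Look the received indices up in the same codebook."""
    return codebook[indices]

# Hypothetical toy setup: 16 codebook entries of dimension 4.
rng = np.random.default_rng(0)
codebook = rng.standard_normal((16, 4))
latent = codebook[[3, 7, 7, 0]] + 0.01 * rng.standard_normal((4, 4))
indices = vq_encode(latent, codebook)
quantized = vq_decode(indices, codebook)
```

Only the indices need to be written to the bitstream, provided the decoder holds the same codebook.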
- the LPC analysis entity is configured to obtain a filter AHB(z).
- the filter AHB(z) may be used to whiten the extended-band signal portion and to obtain a residual eHB(n) based on the formula
- MHB is the LPC order
- n is a time-domain sample index of the audio signal
- LPC coefficients which could be obtained after quantization and interpolation in the LSF domain.
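The formula itself is not reproduced in this extract. As a hedged illustration only, under the common convention in which the analysis filter is written with a leading one and positive coefficient signs, the whitening step would take the following form. The symbols a_HB(k) for the LPC coefficients and s_HB(n) for the extended-band signal portion, as well as the sign convention, are assumptions and are not confirmed by the source.

```latex
A_{HB}(z) = 1 + \sum_{k=1}^{M_{HB}} a_{HB}(k)\, z^{-k},
\qquad
e_{HB}(n) = s_{HB}(n) + \sum_{k=1}^{M_{HB}} a_{HB}(k)\, s_{HB}(n-k)
```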
- the bandwidth extension encoder (or its quantization entity) is configured to code and quantize the energy after the residual signal eHB(n) has been obtained and/or after the prediction.
- the bandwidth extension entity is configured to code and quantize an energy of the residual by exploiting information derived from the band-limited residual: the energy of the residual in the extended band is predicted from the energy of the linear prediction residual signal computed on the band-limited signal, and the resulting energy prediction residual is then quantized and/or output as information to the decoder.
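The energy prediction described above can be sketched as follows. This is a minimal illustration under stated assumptions: energies are compared in dB, the predictor is simply the energy of the band-limited residual, and the prediction error is coded with a uniform scalar quantizer. The step size `STEP_DB` and all function names are hypothetical; the text does not specify the predictor or the quantizer.

```python
import numpy as np

STEP_DB = 1.5  # hypothetical quantizer step size in dB

def energy_db(x):
    """Mean energy of a subframe in dB (a small floor avoids log of zero)."""
    return 10.0 * np.log10(np.mean(np.square(x)) + 1e-12)

def encode_energy(e_hb, e_lb):
    """Quantize what is left after predicting the HB energy from the LB residual."""
    prediction = energy_db(e_lb)
    prediction_error = energy_db(e_hb) - prediction
    return int(np.round(prediction_error / STEP_DB))

def decode_energy(index, e_lb):
    """Rebuild the HB energy from the same prediction plus the decoded error."""
    return energy_db(e_lb) + index * STEP_DB

rng = np.random.default_rng(1)
e_lb = rng.standard_normal(160)   # toy band-limited residual
e_hb = 0.25 * e_lb                # toy extended-band residual, about 12 dB lower
index = encode_energy(e_hb, e_lb)
decoded_db = decode_energy(index, e_lb)
```

Because only the prediction error is transmitted, the index stays small whenever the two bands carry correlated energy. In a real system the decoder would have to form its prediction from the decoded band-limited signal rather than from the original one.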
- the latent representation is generally learned and difficult to interpret, which makes it difficult to extract relevant information that can steer and guide a classical speech bandwidth encoder.
- the baseband encoder comprises a format definer configured to define a first multi-dimensional band-limited signal representation of the band-limited signal, the first multi-dimensional band-limited representation of the band-limited signal including at least
- the at least one learnable layer is configured to process the first multi-dimensional band-limited signal representation of the band-limited signal, or a processed version of the first multi-dimensional band-limited signal representation.
- Another aspect of embodiments provides a decoder for decoding an audio signal comprising a band-limited decoded signal portion and an extended-band signal portion.
- the decoder comprises the two central entities baseband decoder and bandwidth extension decoder.
- the decoder comprises an LPC estimation entity configured to perform LPC estimation based on the band-limited decoded signal portion to obtain LPC coefficients and a band-limited excitation.
- the bandwidth extension decoder is then configured to generate the extended-band excitation based on the band-limited excitation.
- Embodiments of this aspect are based on the finding that an audio signal which is encoded, e.g. using the above-defined encoder, so that a band-limited signal portion and an extended-band signal portion are present, can be decoded by use of a decoder comprising the baseband decoder and the bandwidth extension decoder.
- the baseband decoder is preferably implemented as a neural decoder, i.e., it comprises at least one learnable layer. As discussed in the context of the encoder, this embodiment likewise makes it possible to combine neural speech coding and classical bandwidth extension techniques so as to increase the efficiency.
- the excitation generation may be based on a technique called Waveform Envelope Synchronized Pulse Excitation (WESPE).
- the extended-band LPC coefficients are estimated by an entity comprising at least one learnable layer.
- the bandwidth extension decoder is not solely based on a classical approach, but is also a neural decoder, i.e., a decoder comprising a learnable layer.
- the band-limited LPC estimation entity may, according to further embodiments, be configured to generate the excitation by means of an analysis of the band-limited decoded signal portion that yields an estimate of harmonicity and/or a voicing factor. In this way, the relevant information may be advantageously extracted from the band-limited decoded signal portion from the baseband decoder and exploited for the decoding of the extended-band excitation.
- the band-limited LPC estimation entity comprises an input for receiving a decoded band-limited signal portion of the encoded audio signal.
- the baseband decoder comprises a learnable convolution layer and/or a learnable affine transform and/or a learnable recurrent layer and/or a weighting layer in a residual block of a neural network and/or a learnable element-wise modulation.
- the decoder may comprise a combination entity for combining the band-limited decoded signal portion and the extended-band signal portion to obtain a reconstructed signal. For example, this combination entity may be connected to the two decoders or the output of the two decoders.
- the combination entity may comprise a filterbank and/or a block transform and/or a time-domain upscale entity and/or a complex-valued low-delay filterbank (CLDFB), configured to perform additional post-processing in a filterbank domain before combining and/or before transforming the reconstructed signal to the time domain and/or to a desired sampling rate.
- an embodiment provides a method for coding an audio signal comprising a band-limited signal portion and an extended-band signal portion.
- the method comprises the two central steps
- Another embodiment provides a method for decoding an audio signal comprising a band-limited decoded signal portion and an extended-band signal portion. This method may comprise the two central steps:
- embodiments of the present invention may be computer implemented.
- another embodiment provides a computer program for performing, when running on a processor, the steps of the two methods described above.
- Another embodiment provides a method for training the neural encoder and/or decoder. For example, the training may be performed on the decoder side, on the encoder side, or on both sides.
- Fig. 1 shows schematically a level zero of the split band encoder, involving the baseband encoder and the BWE encoder according to embodiments;
- Fig. 2 illustrates schematically a two-band system realized with a block transform, e.g. a DFT, according to embodiments;
- Fig. 3 shows a high-level architecture of an example of a neural baseband encoder and decoder to discuss embodiments;
- Fig. 4 shows a basic implementation of the baseband encoder and the baseband decoder according to embodiments;
- Fig. 5 shows a schematic block diagram of a BWE encoder according to embodiments;
- Fig. 7 shows schematically an overall block diagram of the encoding and decoding system combining a time-domain classical speech BWE with a neural coder.
- Fig. 1 shows a split band encoder 100 comprising the entities baseband encoder 110, BWE encoder 120, the optional entity for pre-processing (cf. reference numeral 130) and the multiplexer 140.
- the pre-processing entity 130 receives the audio signal s(n) and splits this audio signal s(n) into a band-limited portion and an extended-band portion, e.g. the two portions slb(n) and shb(n).
- the band-limited signal portion slb(n), e.g. a low-band portion, is provided to the baseband encoder 110.
- the extended-band signal portion shb(n), e.g. a high-band portion, is provided to the BWE encoder 120.
- Both encoders 110 and 120 perform an encoding, as will be discussed below, so as to output the two encoded signals for the baseband and the extended-bandwidth portion to the multiplexer 140.
- the multiplexer uses the two signals so as to generate a bitstream b.
- the input signal is first conveyed to a pre-processing block, which is in charge of performing several analyses, such as pitch estimation and voice activity detection, and of conveying signals at a proper sampling rate to the subsequent coding modules, which in our case are the baseband coder 110 and the bandwidth extension (BWE) encoder 120.
- a filterbank such as Quadrature Mirror Filters (QMF), pseudo-QMF, modulated lapped or block transforms, or simply downsampling filters in the time domain can be used.
- the low-band signal is conveyed to the baseband coder 110, which in our preferred case is a neural coder, similar to the Neural End-to-End Speech Coder (NESC).
- the slb(n) signal preferably contains a wideband or broadband signal sampled at 16 kHz.
- Fig. 2 shows a possible implementation for the pre-processing 130.
- the truncation and normalization 133a and 133b of the DFT spectrum serve as lowpass and highpass filtering, respectively, and the inverse DFT 135a operates at a size corresponding to the target sampling rate for the low-band signal.
- For the high band, only the high frequencies are retained, then copied and flipped to the baseband (also known as demodulation) by the demodulation and truncation module 133b, before being decimated by the Inverse DFT 135a with a size corresponding to the sampling rate of the high-band signal.
- the sub-band decomposition can be achieved by time-domain decimation, like with a polyphase filterbank, or a pseudo-QMF.
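The DFT-based two-band split described above can be sketched for a single block as follows. It is a simplified illustration under assumptions: one rectangular block without overlap or windowing, a split exactly at half the bandwidth, and a hypothetical function name `split_bands_dft`. The conjugation applied when flipping the high-band bins keeps the output equivalent to modulating the input by (-1)^n before decimation; whether the described system does exactly this is not stated in the text.

```python
import numpy as np

def split_bands_dft(frame):
    """Split one block into a decimated low band and a demodulated high band."""
    n = len(frame)                      # assumed to be divisible by 4
    half, quarter = n // 2, n // 4
    spectrum = np.fft.rfft(frame)       # bins 0 .. n/2

    # Low band: truncate the spectrum, then invert at half the size.
    # The factor 0.5 normalizes for the shorter inverse transform.
    low_bins = spectrum[:quarter + 1]
    s_lb = 0.5 * np.fft.irfft(low_bins, n=half)

    # High band: keep the upper bins and flip them so that the original
    # Nyquist bin lands on DC (demodulation), then invert at half the size.
    high_bins = np.conj(spectrum[quarter:][::-1])
    s_hb = 0.5 * np.fft.irfft(high_bins, n=half)
    return s_lb, s_hb

n = 640
t = np.arange(n)
m = np.arange(n // 2)
lb_low, hb_low = split_bands_dft(np.cos(2 * np.pi * 20 * t / n))     # tone in the low band
lb_high, hb_high = split_bands_dft(np.cos(2 * np.pi * 310 * t / n))  # tone in the high band
```

In this toy example a tone at bin 310 of the 640-point input reappears at bin 10 of the 320-point high-band output, which shows the spectral flip.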
- the neural baseband coder to be used on the encoder side (cf. 110) and on the decoder side (cf. 210) will be discussed.
- the coder architecture comprising the encoder 110 and the decoder 210 is configured to perform the following processing: the encoder 110 receives a speech signal and codes it so as to output a bitstream B.
- the decoder 210 uses the bitstream B to decode it and to output the decoded speech signal.
- the entities 111, 112, 113, 211, 212 and 213 belong to the baseband encoder 110 and the baseband decoder 210, respectively. Consequently, no bandwidth extension is used in the current example.
- the decoder 200 uses the baseband decoder 210 and the bandwidth extension decoder 220. Both decode the bitstream B and output the respective signals ylb(n) and yhb(n) to the post-processor 230.
- the post-processor is configured to obtain, based on these two signals ylb(n) and yhb(n), the decoded audio signal y(n).
- the exact decoding will be discussed in context of Fig. 6 with focus on the bandwidth extension decoding 220. Before discussing the decoder-side, the bandwidth extension encoding 120 will be discussed with respect to Fig. 5.
- Fig. 5 shows a bandwidth extension encoder 120. It comprises an LPC analysis 142, an LPC-to-LSF transformation 144 and an LSF quantization 146, which make it possible to output the LSF parameters.
- energy parameters are determined using the entities 150, 152 (subframe windowing), 154 (energy computation) and 156 (energy quantization).
- the energy quantization 156 is based on the energy computation 154 and the energy prediction 160 which gets the signal from the entity 150 and from a baseband preprocessor 110.
- the entity 150 is connected with the input for the signal and the LSF quantization 146 via the entity 147.
- An LPC analysis, also known as short-term linear analysis, is performed on shb(n) to obtain a set of LPC coefficients. Since speech, and audio in general, shows less structure or formant structure in the high frequencies, fewer parameters are required than for the low-band signal. In our preferred mode, an order of 8 or 10 is used for a 16 kHz sampled shb(n) signal.
- the LPC analysis is performed as it can be done in the baseband encoder, that is, by windowing the signal and computing the autocorrelation function up to a maximum lag corresponding to the order, before finding the optimal prediction coefficients with a recursive algorithm such as Levinson-Durbin. It is worth noting that the LPC analysis windows of both the low and the high band can be the same and are preferably time aligned, which is an advantage in the subsequent processing steps, but also for exploiting the same lookahead. The so-obtained LPC coefficients are then quantized and coded. Once again, since the spectral envelope of the high band is usually less structured and also perceptually less relevant, the quantization resolution can be lowered for the BWE coding compared to the baseband coding.
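The analysis steps listed above (windowing, autocorrelation up to the LPC order, Levinson-Durbin recursion) can be sketched as follows. The Hann window, the function names and the sign convention A(z) = 1 + sum_k a[k] z^-k are assumptions made for illustration; the text does not specify the window shape or any lag windowing.

```python
import numpy as np

def levinson_durbin(r, order):
    """Solve the normal equations for A(z) = 1 + sum_k a[k] z^-k."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k                  # remaining prediction error energy
    return a, err

def lpc_analysis(frame, order):
    """Window the frame, compute autocorrelation lags 0..order, run the recursion."""
    windowed = frame * np.hanning(len(frame))
    r = np.array([np.dot(windowed[:len(windowed) - lag], windowed[lag:])
                  for lag in range(order + 1)])
    return levinson_durbin(r, order)

# Toy check on a known autocorrelation sequence of a first-order process.
a_demo, err_demo = levinson_durbin(np.array([1.0, 0.5, 0.25]), 2)
```

For a high-band frame, a call such as `lpc_analysis(frame, 10)` would correspond to the order of 8 or 10 mentioned above.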
- the method for generating the HB excitation uses another analysis of the LB decoded signal to obtain an estimate of the harmonicity, also known as the voicing factor.
- zero crossings or other methods may be used for estimating the harmonicity. Such a zero crossing can, therefore, be interpreted as voicing factor.
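A zero-crossing measure of this kind can be sketched as follows. The mapping from zero-crossing rate to a voicing value between 0 and 1 is a hypothetical choice made for illustration (a noise-like signal crosses zero on roughly every second sample, a strongly harmonic low-pitched signal far less often); the text does not define the actual mapping.

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(frame)
    return np.count_nonzero(signs[1:] != signs[:-1]) / (len(frame) - 1)

def voicing_from_zcr(frame):
    """Hypothetical mapping: close to 1 for few crossings, 0 for noise-like frames."""
    return float(np.clip(1.0 - 2.0 * zero_crossing_rate(frame), 0.0, 1.0))

fs = 16000                                            # sampling rate in Hz
t = np.arange(320) / fs                               # one 20 ms frame
tone = np.sin(2 * np.pi * 100.0 * t + 0.3)            # harmonic, low-pitched toy signal
noise = np.random.default_rng(2).standard_normal(16000)
```

With these toy inputs, `voicing_from_zcr(tone)` comes out close to 1 and `voicing_from_zcr(noise)` close to 0.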
- Fig. 7 shows the combination of the encoder 100’ and the decoder 200’.
- the baseband encoder 110 and the base band decoder 212 are based on DNNs.
- the latent of the encoder 110 is output to the decoder 210.
- a filterbank 130 is arranged which pre-processes the input signal for the encoder 110 and the BWE encoder 120’. It performs a perceived parameter estimation 142 and LPC encoding 143.
- the parameter encoder 143 uses LPC coefficients out of the input signal for the baseband encoder 110.
- the encoder comprising:
  - a baseband encoder configured to encode the band-limited signal, including at least one learnable layer;
  - a bandwidth extension encoder which comprises a linear prediction of the extended-band signal.
- the decoder comprising:
  - a baseband decoder configured to decode the band-limited signal, including at least one learnable layer;
  - an LPC estimation based on a band-limited signal from a decoder;
  - a bandwidth extension decoder which comprises the generation of an excitation that is the input of a linear predictive synthesis filter.
- although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus.
- Some or all of the method steps may be executed by (or using) a hardware apparatus, such as a microprocessor, a programmable computer or an electronic circuit. In some embodiments, one or more of the most important method steps may be executed by such an apparatus.
- the inventive encoded audio signal can be stored on a digital storage medium or can be transmitted on a transmission medium such as a wireless transmission medium or a wired transmission medium such as the Internet.
- embodiments of the invention can be implemented in hardware or in software.
- the implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a Blu-Ray, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer readable.
- embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer.
- the program code may for example be stored on a machine readable carrier.
- other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier.
- an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
- a further embodiment of the inventive methods is, therefore, a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein.
- the data carrier, the digital storage medium or the recorded medium are typically tangible and/or non-transitory.
- a further embodiment of the inventive method is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein.
- the data stream or the sequence of signals may for example be configured to be transferred via a data communication connection, for example via the Internet.
- a further embodiment comprises a processing means, for example a computer, or a programmable logic device, configured to or adapted to perform one of the methods described herein.
- a further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.
Landscapes
- Engineering & Computer Science (AREA)
- Quality & Reliability (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
An encoder for encoding an audio signal comprising a band-limited signal portion and an extended-band signal portion, the encoder comprising: a baseband encoder configured to encode the band-limited signal portion, the baseband encoder comprising at least one learnable layer; and a bandwidth extension encoder comprising a linear prediction entity for performing a linear prediction on the extended-band signal portion.
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2024/057979 WO2025201625A1 (fr) | 2024-03-25 | 2024-03-25 | Encoder and decoder |
| PCT/EP2025/058177 WO2025202226A1 (fr) | 2024-03-25 | 2025-03-25 | Encoder and decoder |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2024/057979 WO2025201625A1 (fr) | 2024-03-25 | 2024-03-25 | Encoder and decoder |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025201625A1 (fr) | 2025-10-02 |
Family
ID=90482357
Family Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2024/057979 Pending WO2025201625A1 (fr) | 2024-03-25 | 2024-03-25 | Encoder and decoder |
| PCT/EP2025/058177 Pending WO2025202226A1 (fr) | 2024-03-25 | 2025-03-25 | Encoder and decoder |
Family Applications After (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2025/058177 Pending WO2025202226A1 (fr) | 2024-03-25 | 2025-03-25 | Encoder and decoder |
Country Status (1)
| Country | Link |
|---|---|
| WO (2) | WO2025201625A1 (fr) |
Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20020007280A1 (en) * | 2000-05-22 | 2002-01-17 | Mccree Alan V. | Wideband speech coding system and method |
| US20090319277A1 (en) * | 2005-03-30 | 2009-12-24 | Nokia Corporation | Source Coding and/or Decoding |
| US20130051571A1 (en) * | 2010-03-09 | 2013-02-28 | Frederik Nagel | Apparatus and method for processing an audio signal using patch border alignment |
| ES2627775T3 (es) * | 2009-02-18 | 2017-07-31 | Dolby International Ab | Low delay modulated filter bank |
| US20190385626A1 (en) * | 2013-07-12 | 2019-12-19 | Koninklijke Philips N.V. | Optimized scale factor for frequency band extension in an audio frequency signal decoder |
| US20200176004A1 (en) * | 2018-11-30 | 2020-06-04 | Google Llc | Speech coding using auto-regressive generative neural networks |
| US20210287687A1 (en) | 2018-12-21 | 2021-09-16 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio processor and method for generating a frequency enhanced audio signal using pulse processing |
| WO2023175197A1 (fr) * | 2022-03-18 | 2023-09-21 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Vocoder techniques |
-
2024
- 2024-03-25 WO PCT/EP2024/057979 patent/WO2025201625A1/fr active Pending
-
2025
- 2025-03-25 WO PCT/EP2025/058177 patent/WO2025202226A1/fr active Pending
Patent Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20020007280A1 (en) * | 2000-05-22 | 2002-01-17 | Mccree Alan V. | Wideband speech coding system and method |
| US20090319277A1 (en) * | 2005-03-30 | 2009-12-24 | Nokia Corporation | Source Coding and/or Decoding |
| ES2627775T3 (es) * | 2009-02-18 | 2017-07-31 | Dolby International Ab | Low delay modulated filter bank |
| US20130051571A1 (en) * | 2010-03-09 | 2013-02-28 | Frederik Nagel | Apparatus and method for processing an audio signal using patch border alignment |
| US20190385626A1 (en) * | 2013-07-12 | 2019-12-19 | Koninklijke Philips N.V. | Optimized scale factor for frequency band extension in an audio frequency signal decoder |
| US20200176004A1 (en) * | 2018-11-30 | 2020-06-04 | Google Llc | Speech coding using auto-regressive generative neural networks |
| US20210287687A1 (en) | 2018-12-21 | 2021-09-16 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio processor and method for generating a frequency enhanced audio signal using pulse processing |
| WO2023175197A1 (fr) * | 2022-03-18 | 2023-09-21 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Vocoder techniques |
Non-Patent Citations (4)
| Title |
|---|
| BRUHN, STEFAN; POBLOTH, HARALD; SCHNELL, MARKUS; GRILL, BERNHARD; GIBBS, JON; MIAO, LEI; JARVINEN, KARI; LAAKSONEN, LASSE; HARADA, NOBORU; NAKA, NOB: "Standardization of the new 3GPP EVS codec", 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2015, SOUTH BRISBANE, QUEENSLAND, AUSTRALIA, 19 April 2015 (2015-04-19) |
| DOUGLAS O'SHAUGHNESSY: "Review of methods for coding of speech signals", EURASIP JOURNAL ON AUDIO, SPEECH, AND MUSIC PROCESSING, BIOMED CENTRAL LTD, LONDON, UK, vol. 2023, no. 1, 7 February 2023 (2023-02-07), pages 1 - 25, XP021314321, DOI: 10.1186/S13636-023-00274-X * |
| MAKINEN, JARI; BESSETTE, BRUNO; BRUHN, STEFAN; OJALA, PASI; SALAMI, REDWAN; TALEB, ANISSE: "AMR-WB+: a new audio coding standard for 3rd generation mobile audio services", 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, ICASSP '05, PHILADELPHIA, PENNSYLVANIA, USA, 18 March 2005 (2005-03-18) |
| PIA, NICOLA; GUPTA, KISHAN; KORSE, SRIKANTH; MULTRUS, MARKUS; FUCHS, GUILLAUME: "NESC: Robust Neural End-2-End Speech Coding with GANs", July 2022 (2022-07-01) |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2025202226A1 (fr) | 2025-10-02 |
Similar Documents
| Publication | Title |
|---|---|
| AU2008316860B2 (en) | Scalable speech and audio encoding using combinatorial encoding of MDCT spectrum |
| EP0981816B1 (fr) | Audio coding methods and systems |
| RU2389085C2 (ru) | Methods and devices for introducing low-frequency emphasis during audio compression based on ACELP/TCX |
| EP3039676B1 (fr) | Adaptive bandwidth extension and apparatus for the same |
| EP3239979B1 (fr) | Coding generic audio signals at low bitrates and low delay |
| US20060271356A1 (en) | Systems, methods, and apparatus for quantization of spectral envelope representation |
| CN101371296B (zh) | Apparatus and method for encoding and decoding a signal |
| CN103262161A (zh) | Apparatus and method for determining a low-complexity weighting function for linear predictive coding (LPC) coefficient quantization |
| US20050065788A1 (en) | Hybrid speech coding and system |
| EP4275204B1 (fr) | Method and device for unified time-domain/frequency-domain coding of a sound signal |
| CN102460574A (zh) | Method and apparatus for encoding and decoding an audio signal using hierarchical sinusoidal pulse coding |
| JPWO2009125588A1 (ja) | Encoding device and encoding method |
| KR20140088879A (ko) | Method and apparatus for band-selective quantization of a speech signal |
| Cho et al. | A spectrally mixed excitation (SMX) vocoder with robust parameter determination |
| WO2025201625A1 (fr) | Encoder and decoder |
| RU2414009C2 (ru) | Device and method for encoding and decoding a signal |
| EP1155405A1 (fr) | Enhanced waveform interpolative coder |
| EP4553833A1 (fr) | Decoder and encoder for bandwidth extension |
| US20050065787A1 (en) | Hybrid speech coding and system |
| EP4553832A1 (fr) | Audio processor with guided audio bandwidth extension |
| EP4553830A1 (fr) | Audio processor for audio bandwidth extension of a band-limited audio signal |
| Gupta et al. | UBGAN: Enhancing Coded Speech with Blind and Guided Bandwidth Extension |
| Kim et al. | A 4 kbps adaptive fixed code-excited linear prediction speech coder |
| HK40107881A (en) | Coding generic audio signals at low bitrates and low delay |
| JP2004252477A (ja) | Wideband speech restoration device |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 24714479; Country of ref document: EP; Kind code of ref document: A1 |