EP1334485B1 - Speech codec and method for generating a vector codebook and encoding/decoding speech signals
- Publication number
- EP1334485B1 (application EP01993000A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- speech
- vector
- codebook
- embedded
- speech signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/06—Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
- G10L19/07—Line spectrum pair [LSP] vocoders
- G10L2019/0001—Codebooks
- G10L2019/0004—Design or structure of the codebook
- G10L2019/0005—Multi-stage vector quantisation
- G10L2019/0007—Codebook element generation
Definitions
- This invention relates to speech coding and methods of optimising the performance of speech codecs in communications systems.
- The invention is applicable to, but not limited to, speech codecs that accommodate wideband and narrowband speech signals without compromising the overall performance of the speech codec quantiser.
- GSM: Global System for Mobile communications
- TETRA: TErrestrial Trunked RAdio
- A primary objective in the use of speech coding techniques is to reduce the occupied capacity of the speech patterns as much as possible, by use of compression techniques, without losing fidelity.
- VQ: vector quantisation
- Vector quantisation represents an input vector as a member of a set of fixed vectors.
- This set of fixed vectors is known as the VQ codebook.
- The fixed vector in the VQ codebook that best represents the input vector is found by exhaustively searching all members of the VQ codebook and selecting the fixed vector that gives the minimum distance measure (such as the Euclidean distance) between it and the input vector.
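The exhaustive search described here can be sketched in a few lines. This is a minimal illustration, assuming a squared Euclidean distance; the function names and the toy codebook values are hypothetical and do not come from the patent.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def vq_search(x, codebook):
    """Exhaustively search the codebook and return (index, codevector)
    for the entry closest to the input vector x."""
    best_index = min(range(len(codebook)),
                     key=lambda i: squared_distance(x, codebook[i]))
    return best_index, codebook[best_index]


# Toy 2-bit codebook of 3-dimensional vectors (illustrative values only).
toy_codebook = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
]
index, codevector = vq_search([0.9, 0.1, 0.2], toy_codebook)
```

Only the index needs to be transmitted; the decoder holds the same codebook and looks the vector up.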
- Although VQ has been shown to be very attractive and efficient in many areas of speech coding, it is not without its drawbacks.
- Wideband speech codecs are likely to find application in telephone conferencing.
- Wideband speech codecs have an input speech bandwidth covering the 50 Hz to 7 kHz range, compared to narrowband or telephone-band codecs that have an input speech bandwidth of 250 Hz to 3.3 kHz.
- Tandemming is a term which is used to describe the situation where speech previously processed by one speech encoder/decoder is processed by a second speech encoder/decoder pair.
- The speech quality requirement of such tandemmed codecs is to achieve equivalence to the best narrowband codecs, i.e. the GSM Enhanced Full-Rate (EFR) codec. It is therefore appropriate to consider the performance of any wideband line spectral frequency (LSF) VQ scheme in the presence of narrowband speech.
- One option may be to use a classified VQ scheme with two sets of codebooks: one to represent the wideband speech and one to represent the narrowband speech.
- A respective codebook would be selected by a special "mode" bit, where the mode bit indicates whether the subsequent data bits represent a wideband or narrowband speech signal.
- The representative codecs have been simulated with each predictor of the speech codec arranged to be an 18th-order split vector quantiser, with the eighteen associated line spectral frequencies split into six groups of three LSFs each.
- Table 1: 7 kHz and 3 kHz spectral distortion results for the first-order MA-PVQ 40-bit quantisers.
- Table 1 details the wideband and narrowband spectral distortion figures for a 40-bit first order moving average quantiser trained on 50:50 wideband:narrowband speech.
- The configuration column denotes the number of bits allocated to each of six moving-average predictive split-vector quantisers, applied to LSFs 1-3, 4-6, 7-9, 10-12, 13-15 and 16-18 respectively.
- A wideband speech codec would typically be represented by an even distribution of bits allocated to each of the six split-vector quantisers, to provide an approximately even frequency response across the full range of the line spectral frequencies.
- A narrowband speech codec would have an uneven distribution of bits associated with each quantiser, with more bits allocated to the lower frequencies of the LSFs.
- A compromise quantiser such as 8,9,8,7,6,2 provides substantially inferior performance to both of these.
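The split-vector layout and the bit allocation can be checked with a short sketch. It only illustrates the grouping of eighteen LSFs into six groups of three and the codebook size implied by each bit count; the helper names are hypothetical, and the moving-average prediction step is deliberately left out.

```python
def split_lsfs(lsfs, group_size=3):
    """Split an LSF vector into consecutive groups (LSFs 1-3, 4-6, ...)."""
    if len(lsfs) % group_size:
        raise ValueError("LSF count must be a multiple of the group size")
    return [lsfs[i:i + group_size] for i in range(0, len(lsfs), group_size)]


def codebook_sizes(bit_allocation):
    """Number of entries in each split codebook: 2 ** bits."""
    return [2 ** bits for bits in bit_allocation]


# The compromise allocation quoted in the text.
compromise = (8, 9, 8, 7, 6, 2)

groups = split_lsfs(list(range(1, 19)))   # LSF numbers 1..18 as placeholders
total_bits = sum(compromise)              # 40 bits per frame
sizes = codebook_sizes(compromise)        # entries per split codebook
```

With only 2 bits, the last split (LSFs 16-18) has just four entries, which shows how little resolution the compromise allocation leaves for the highest frequencies.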
- LSFs: line spectral frequencies
- The present invention aims to provide a speech codec and a method of optimising the performance of the speech codec that at least alleviate some of the aforementioned disadvantages.
- A speech coder for a speech communications unit in accordance with claim 1 is provided.
- A speech communications unit adapted to include the speech coder of any one of claims 1 to 10 is provided.
- A method of generating a speech vector codebook in a speech communications unit in accordance with claim 12 is provided.
- A speech communications unit adapted to include a speech vector codebook generated in accordance with any one of claims 12 to 20 is provided.
- A method of encoding a speech signal in accordance with claim 22 is provided.
- A speech communications unit adapted to employ a speech encoding method in accordance with any one of claims 22 to 26 is provided.
- A method of decoding a speech signal in accordance with claim 28 is provided.
- A speech communications unit adapted to employ a speech decoding method in accordance with any one of claims 28 to 31 is provided.
- FIG. 1 shows a block diagram of a code excited linear predictive speech encoder 100, according to a preferred embodiment of the present invention.
- An acoustic input signal to be analysed is applied to speech coder 100 at microphone 102.
- The input signal is then applied to filter 104.
- Filter 104 will generally exhibit band-pass filter characteristics. However, if the speech bandwidth is already adequate, filter 104 may comprise a direct wire connection.
- The analog speech signal from filter 104 is then converted into a sequence of N pulse samples, and the amplitude of each pulse sample is then represented by a digital code in analog-to-digital (A/D) converter 108, as known in the art.
- The sampling rate is determined by the sample clock (SC).
- SC: sample clock
- SC is generated along with the frame clock (FC) via clock 112.
- The digital output of A/D 108, which may be represented as input speech vector s(n), is then applied to coefficient analyser 110.
- This input speech vector s(n) is repetitively obtained in separate frames, i.e., blocks of time, the length of which is determined by the frame clock (FC), as is known in the art.
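A minimal sketch of the framing step, assuming non-overlapping frames of a fixed number of samples. The frame length used here is an arbitrary illustrative value; the text does not state one.

```python
def frames(samples, frame_length):
    """Yield consecutive non-overlapping frames of `frame_length` samples.
    A trailing partial frame is dropped in this sketch."""
    for start in range(0, len(samples) - frame_length + 1, frame_length):
        yield samples[start:start + frame_length]


# Ten placeholder samples cut into frames of four samples each.
example_frames = list(frames(list(range(10)), 4))
```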
- LPC: linear predictive coding
- The generated speech coder parameters may include the following: LPC parameters, long-term predictor (LTP) parameters, and the excitation gain factor (γ), along with the best excitation codeword I.
- LPC parameters are applied to multiplexer 150 and sent over the channel 152 for use by the speech synthesizer at the decoder.
- The input speech vector s(n) is also applied to subtractor 130, the function of which is described later.
- Coefficient analyser 110 has been adapted to incorporate the specially constructed family of embedded codebooks.
- The codebook search controller 140 selects the best indices and gains from the adaptive codebook within block 116 and the stochastic codebook within block 114 in order to produce a minimum weighted error in the summed chosen excitation vector used to represent the input speech sample.
- The outputs of the stochastic codebook 114 and the adaptive codebook 116 are input into respective gain functions 122 and 118.
- The gain-adjusted outputs are then summed in summer 120 and input into the LPC filter 124, as is known in the art.
- For each individual excitation vector u_i(n), a reconstructed speech vector s'_i(n) is generated for comparison to the input speech vector s(n).
- Gain block 122 applies the excitation gain factor γ. Such gain may be pre-computed by coefficient analyser 110 and used to analyse all excitation vectors, or may be optimised jointly with the search for the best excitation codeword I, generated by codebook search controller 140.
- The scaled excitation signal γu_i(n) is then filtered by the linear predictive coding filter 124, which preferably includes a long-term predictor (LTP) filter and a short-term predictor (STP) filter, to generate the reconstructed speech vector s'_i(n).
- The reconstructed speech vector s'_i(n) for the i-th excitation code vector is compared to the same block of input speech vector s(n) by subtracting these two signals in subtractor 130.
- The difference vector e_i(n) represents the difference between the original and the reconstructed blocks of speech.
- The difference vector is perceptually weighted by weighting filter 132, utilising the weighting filter parameters (WTP) generated by coefficient analyser 110. Perceptual weighting accentuates those frequencies where the error is perceptually more important to the human ear, and attenuates other frequencies.
- An energy calculator function inside the codebook search controller 140 computes the energy of the weighted difference vector e'_i(n).
- The codebook search controller compares the i-th error signal for the present excitation vector u_i(n) against previous error signals to determine the excitation vector producing the minimum error.
- The code of the i-th excitation vector having a minimum error is then output over the channel as the best excitation code I.
- Alternatively, codebook search controller 140 may determine a particular codeword that provides an error signal meeting some predetermined criterion, such as a predefined error threshold.
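The analysis-by-synthesis loop described above (scale, filter, subtract, weight, measure energy, keep the minimum) can be sketched as follows. This is a simplified illustration under stated assumptions: a single excitation codebook, a short-term all-pole synthesis filter only (no long-term predictor), the sign convention A(z) = 1 + sum a_k z^-k, and an identity weighting filter unless one is supplied. All function and parameter names are hypothetical.

```python
def synthesise(excitation, lpc):
    """All-pole synthesis with zero initial state:
    s'(n) = u(n) - sum_k lpc[k-1] * s'(n-k)  (assumed sign convention)."""
    out = []
    for n, u in enumerate(excitation):
        acc = u
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def search_excitation(target, codebook, gain, lpc, weight=None, threshold=None):
    """Return (best_index, best_energy) over all excitation code vectors.
    If `threshold` is given, stop as soon as the best error energy meets it."""
    best_index, best_energy = None, float("inf")
    for index, codevector in enumerate(codebook):
        scaled = [gain * u for u in codevector]             # gain block
        reconstructed = synthesise(scaled, lpc)              # LPC filter
        error = [s - r for s, r in zip(target, reconstructed)]  # subtractor
        if weight is not None:
            error = weight(error)                            # weighting filter
        energy = sum(e * e for e in error)                   # energy calculator
        if energy < best_energy:
            best_index, best_energy = index, energy
        if threshold is not None and best_energy <= threshold:
            break
    return best_index, best_energy


toy_excitations = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
toy_lpc = [-0.5]
toy_target = synthesise([0.0, 2.0, 0.0, 0.0], toy_lpc)
best = search_excitation(toy_target, toy_excitations, 2.0, toy_lpc)
```

The optional `threshold` argument corresponds to the alternative of accepting the first codeword whose error meets a predefined criterion instead of completing the full search.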
- The coefficient analyser 110 has also been adapted to employ at least some of the inventive concepts of the present invention. To accommodate vectors in either the wideband or narrowband vector space, the coefficient analyser 110 is used to train the quantisers and to determine whether the input speech comprises wideband or narrowband speech.
- The inventors of the present invention have recognised the opportunity to use the same training data, or at least very similar data, to train each of the quantisers.
- The different sized quantisers cover much the same signal vector space, and hence a smaller quantiser is embedded within the larger quantiser, leading to a more compact representation.
- It is desirable for the coefficient analyser 110 to send an additional mode bit to indicate which quantiser set (i.e. which of the specially constructed family of embedded codebooks) is being used.
- The quantiser set will preferably refer to a wideband or narrowband arrangement.
- The codebook index transmission is structured in order to minimise the effect of errors in this mode bit, as described later with respect to FIG. 4. A consequence of this careful structuring of the codebook index transmission is that any bit error in the mode bit has much less impact than in a prior art approach using two or more independent codebooks.
- FIG. 2 shows a block diagram of a code excited linear predictive speech decoder 200, according to a preferred embodiment of the present invention.
- The decoder functionality is substantially the reverse of that of the encoder.
- The received multiplexed signal is input into demultiplexer 202, which separates the excitation parameters 204 from the LPC parameters 206.
- The LPC parameters are input into an LPC de-quantiser, stability check and correction block 210 to obtain a locally stable version of the synthesis filter even in the presence of channel bit errors.
- The LPC de-quantiser, stability check and correction block 210 has been adapted to encompass the inventive concepts contained herein.
- The LPC de-quantiser, stability check and correction block 210 receives the LPC parameters and mode bit sent from the corresponding encoder function.
- The LPC de-quantiser function of block 210 includes the corresponding embedded codebook arrangement of the encoder, such that the determination of the at least one mode bit can select the embedded codebook arrangement that best describes the encoded and transmitted speech signal.
- The LPC de-quantiser, stability check and correction block 210 also controls the filter coefficients of the LPC synthesis filter 222 in order to reconstruct the transmitted speech vector s'_i(n).
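The text does not say how block 210 checks or corrects stability. One common approach for LSF-based coders, assumed here rather than taken from the patent, relies on the fact that an LPC synthesis filter is stable when its line spectral frequencies are strictly increasing and lie between zero and the Nyquist frequency. The sketch below sorts the decoded LSFs and enforces a minimum spacing; the gap and Nyquist values are hypothetical.

```python
def stabilise_lsfs(lsfs, min_gap=50.0, nyquist=8000.0):
    """Return LSFs (in Hz) sorted into increasing order with at least
    `min_gap` between neighbours and from the band edges.
    Assumed correction rule; not the patent's stated method."""
    fixed = []
    lower = min_gap
    # Forward pass: push each value up until it clears its lower neighbour.
    for f in sorted(lsfs):
        f = max(f, lower)
        fixed.append(f)
        lower = f + min_gap
    # Backward pass: pull values down so the top stays below Nyquist.
    upper = nyquist - min_gap
    for i in range(len(fixed) - 1, -1, -1):
        if fixed[i] > upper:
            fixed[i] = upper
        upper = fixed[i] - min_gap
    return fixed


# Decoded LSFs that a channel error has left out of order and too close.
corrected = stabilise_lsfs([300.0, 200.0, 210.0])
```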
- The output from the LPC synthesis filter 222 is input to a post filter process 224, which subsequently outputs the reconstructed speech 226.
- The excitation parameters 204, which may include the excitation gain factor γ together with the best excitation codeword I, are input into an adaptive non-linear smoothing function 208.
- The output from the adaptive non-linear smoothing function 208 provides the precise adaptive and stochastic codebook indices and gains that form the excitation for the synthesis filter. As such, the outputs from the adaptive non-linear smoothing function 208 are input to stochastic codebook 218 and adaptive codebook 212.
- The gain controls are input to adaptive codebook gain block 214, which receives an output from the adaptive codebook 212, and stochastic codebook gain block 220, which receives an output from the stochastic codebook 218.
- The outputs from the respective gain blocks 214, 220 are input to summing junction 216, whose output is fed into the LPC synthesis filter 222 and fed back to the adaptive codebook 212, as known in the art.
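The excitation path of the decoder can be sketched as below. The text leaves the adaptive codebook details to the known art, so this sketch assumes the conventional arrangement in which the adaptive codebook is a buffer of past excitation addressed by a lag; the function names and the toy values are hypothetical.

```python
def adaptive_vector(history, lag, length):
    """Read `length` samples starting `lag` samples back in the excitation
    history, repeating the segment when the lag is shorter than the frame."""
    start = len(history) - lag
    return [history[start + (n % lag)] for n in range(length)]


def build_excitation(adaptive_vec, adaptive_gain, stochastic_vec, stochastic_gain):
    """Summing junction: gain-scaled adaptive plus stochastic contributions."""
    return [adaptive_gain * a + stochastic_gain * c
            for a, c in zip(adaptive_vec, stochastic_vec)]


def update_history(history, excitation):
    """Feed the new excitation back into the adaptive codebook buffer,
    keeping the buffer length constant."""
    return (history + excitation)[-len(history):]


history = [0.0, 0.0, 1.0, 2.0]
pitch_part = adaptive_vector(history, lag=2, length=4)
excitation = build_excitation(pitch_part, 0.5, [1.0, 0.0, 0.0, -1.0], 2.0)
history = update_history(history, excitation)
```

The resulting `excitation` would then drive the LPC synthesis filter, whose coefficients come from the de-quantised LPC parameters.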
- FIG. 3 shows a 2-way split VQ codebook applied to eighteen wideband line spectral frequencies (LSFs).
- The input LSFs (L1-L18) 250 are quantised by first quantiser 254 and second quantiser 268 to derive estimates and the two binary indices "I1" 270 and "I2" 272, using a respective first embedded codebook 256 and second embedded codebook 262.
- The LSFs (L1-L18) 250 are fed into a mode-bit detector 252, which selects the embedded codebook most appropriate for the speech signal presented.
- The first embedded codebook 256 contains a first set of core entries, in this case appropriate for wideband speech 260, and additional entries appropriate for narrowband speech 258.
- The second embedded codebook 262 contains a second set of core entries, this time appropriate for narrowband speech 266, and additional wideband entries 264.
- The core entries are always searched in each quantiser and are indexed by a set of core bits. Additional entries are searched, depending upon the mode, and a set of "extra" bits is formed.
- The codebook is structured such that when the full codebook is searched, the "extra" bits are effectively zero for the core entries. This is depicted in FIG. 4.
- This arrangement of core bits provides for a constant sum of the bit allocations for each of the two modes, wideband or narrowband.
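A minimal sketch of searching one embedded codebook, assuming the "extra" bits are the least significant bits of the full index so that core entries sit at indices whose extra bits are zero. The function names and the one-dimensional toy codebook are hypothetical.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def search_embedded(x, codebook, extra_bits, use_extra):
    """Search either the core entries only or the full embedded codebook.
    Returns (core_index, extra_index); extra_index is 0 for a core entry."""
    step = 1 << extra_bits
    if use_extra:
        candidates = range(len(codebook))            # core plus additional entries
    else:
        candidates = range(0, len(codebook), step)   # core entries only
    best = min(candidates, key=lambda i: squared_distance(x, codebook[i]))
    return best >> extra_bits, best & (step - 1)


# Four entries, one extra bit: indices 0 and 2 are core entries,
# indices 1 and 3 are the additional entries interlaced between them.
toy_embedded = [[0.0], [0.3], [1.0], [1.4]]
core_only = search_embedded([0.35], toy_embedded, extra_bits=1, use_extra=False)
full = search_embedded([0.35], toy_embedded, extra_bits=1, use_extra=True)
```

In both cases the core index is the same, so a decoder that ignores the extra bit still lands on a nearby entry.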
- FIG. 4 shows the preferred packing of bits for the 2-way split VQ of FIG. 3.
- The bit stream 320 comprises the mode bit 322 (indicating a wideband or narrowband input signal), followed by the "I1" core bits 324 (either wideband or narrowband) and the "I2" core bits. Finally, the "I1" narrowband extra bits or the "I2" wideband extra bits complete the preferred packing configuration.
- The mode bit and core bits are beneficially always in the same locations. Hence, the impact of a mode bit error can be arranged to result in much smaller errors in the two quantisers than in prior art arrangements.
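The packing order can be sketched as follows. The field widths (4 core bits per index, 3 extra bits) and the choice of 0 for wideband and 1 for narrowband are assumptions made for illustration; only the field order follows the description.

```python
def pack(mode, i1_core, i2_core, extra, core_bits=(4, 4), extra_bits=3):
    """Pack fields MSB first: mode bit, I1 core bits, I2 core bits, extra bits."""
    word = mode & 1
    word = (word << core_bits[0]) | i1_core
    word = (word << core_bits[1]) | i2_core
    word = (word << extra_bits) | extra
    return word


def unpack(word, core_bits=(4, 4), extra_bits=3):
    """Inverse of pack(): returns (mode, i1_core, i2_core, extra)."""
    extra = word & ((1 << extra_bits) - 1)
    word >>= extra_bits
    i2_core = word & ((1 << core_bits[1]) - 1)
    word >>= core_bits[1]
    i1_core = word & ((1 << core_bits[0]) - 1)
    mode = (word >> core_bits[0]) & 1
    return mode, i1_core, i2_core, extra


def full_indices(mode, i1_core, i2_core, extra, extra_bits=3):
    """Attach the extra bits to I2 in wideband mode (mode 0, assumed) or to
    I1 in narrowband mode (mode 1, assumed); the other index gets zeros."""
    if mode == 0:
        return i1_core << extra_bits, (i2_core << extra_bits) | extra
    return (i1_core << extra_bits) | extra, i2_core << extra_bits


word = pack(1, 0b1010, 0b0011, 0b101)
fields = unpack(word)
narrowband_indices = full_indices(*fields)
# Same word decoded with the mode bit in error:
wideband_indices = full_indices(0, *fields[1:])
```

Because the core bits occupy fixed positions, a mode bit error only moves the extra bits from one index to the other, so each full index changes in its least significant bits only.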
- A series of "test" input speech signals may be used to obtain the optimum set of vectors to represent all input speech signals.
- FIG. 5 shows two octagons 350 and 354, partitioned to reflect eight separate locations identified by a respective 3-bit address.
- The two octagons 350, 354 each represent one of the two split VQ codebooks of FIG. 3.
- The example shows the case where three "extra" LSBs are used.
- The patterns xxxx and yyyy represent core entry bit patterns for each of the respective embedded codebooks.
- A potentially "non-zero extra" position will be appended instead of all zeros, and the "extra" LSBs of the larger codebooks will be set to zero.
- The maximum error for a wideband/narrowband (WB/NB) mode bit error is equivalent to that of several LSB errors in each codebook.
- The codebook entries of the embedded codebook, trained using relevant and appropriately varied speech patterns, must be interlaced regularly within the large codebook.
- Index reassignment of the combined codebook must be performed such that LSB errors in the indices result in small perceptual distances. This may be arranged using a simulated annealing method, as is well known to those skilled in the art.
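A toy sketch of index reassignment by simulated annealing. The cost function (total squared Euclidean distance between entries whose indices differ in one low-order bit), the linear cooling schedule and every name here are assumptions chosen for illustration; the text only states that simulated annealing may be used and that LSB errors should map to small perceptual distances.

```python
import math
import random


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def lsb_error_cost(order, vectors, lsb_count):
    """Sum of distances between entries whose indices differ in a single
    one of the `lsb_count` lowest bits. len(vectors) must be a power of two
    and at least 2 ** lsb_count."""
    cost = 0.0
    for index in range(len(order)):
        for bit in range(lsb_count):
            neighbour = index ^ (1 << bit)
            cost += squared_distance(vectors[order[index]],
                                     vectors[order[neighbour]])
    return cost


def anneal_indices(vectors, lsb_count, steps=2000, start_temp=1.0, seed=0):
    """Return (order, cost), where order[i] is the vector assigned to index i."""
    rng = random.Random(seed)
    order = list(range(len(vectors)))
    cost = lsb_error_cost(order, vectors, lsb_count)
    best_order, best_cost = order[:], cost
    for step in range(steps):
        temp = start_temp * (1.0 - step / steps) + 1e-9   # linear cooling
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]           # propose a swap
        new_cost = lsb_error_cost(order, vectors, lsb_count)
        accept = (new_cost <= cost
                  or rng.random() < math.exp((cost - new_cost) / temp))
        if accept:
            cost = new_cost
            if cost < best_cost:
                best_order, best_cost = order[:], cost
        else:
            order[i], order[j] = order[j], order[i]       # undo the swap
    return best_order, best_cost


toy_vectors = [[0.0], [3.0], [0.1], [3.1]]
initial_cost = lsb_error_cost([0, 1, 2, 3], toy_vectors, 1)
new_order, new_cost = anneal_indices(toy_vectors, lsb_count=1)
```

A practical design would use a perceptually motivated distance and a far larger codebook; recomputing the full cost at every step, as done here for clarity, would also be replaced by an incremental update.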
- A set of codebooks was derived and selectively searched. In order to determine which codebook configuration to search during each frame, an appropriate narrowband speech indicator was employed.
- The graph 400 shown in FIG. 6 demonstrates the error resilience of the preferred embodiment of the invention in the presence of 10% bit errors, applied on a per-bit basis and measured using an objective distortion measure, such as the perceptual speech quality measure (PSQM value), as defined by ITU-T Recommendation P.861.
- PSQM: perceptual speech quality measure
- Graph 400 shows the bit error sensitivity profiles, this time for two 43-bit quantisers according to the embodiment.
- The distortion 402 (PSQM value) is shown plotted against bit number 404 on a bit-by-bit basis.
- The two quantisers shown are the hybrid 8,8,8,7,6,5 and 8,9,9,8,6,2 wideband/narrowband scheme 406, according to the preferred embodiment of the invention, and an 8,8,8,7,7,5 wideband-only scheme 408.
- The mode bit is the first bit, and then the other core quantiser bits (8,8,8,7,6,2) are presented MSB first for each quantiser in turn, followed finally by the three extra bits, as depicted in FIG. 4.
- The bits are presented in MSB-first natural order.
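The bit budget of the hybrid scheme can be verified with a short calculation. Taking the core allocation as the element-wise minimum of the two configurations is an inference from the figures quoted in the text (it reproduces the stated 8,8,8,7,6,2); the function name is hypothetical.

```python
def embedded_allocation(wideband, narrowband):
    """Split two per-quantiser bit allocations into shared core bits and
    the mode-specific extra bits."""
    core = tuple(min(w, n) for w, n in zip(wideband, narrowband))
    wb_extra = tuple(w - c for w, c in zip(wideband, core))
    nb_extra = tuple(n - c for n, c in zip(narrowband, core))
    return core, wb_extra, nb_extra


core, wb_extra, nb_extra = embedded_allocation((8, 8, 8, 7, 6, 5),
                                               (8, 9, 9, 8, 6, 2))
# One mode bit, the core bits, then the extra bits for the active mode.
total_wideband = 1 + sum(core) + sum(wb_extra)
total_narrowband = 1 + sum(core) + sum(nb_extra)
```

Both modes use 39 core bits and 3 extra bits, which with the mode bit gives the 43-bit frame; in wideband mode the extra bits all go to the sixth quantiser, while in narrowband mode they are spread over the second, third and fourth.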
- The overall performance of the new quantiser in the presence of bit errors can be seen to be only very slightly worse than that of the wideband-only scheme (see the rank-ordered sensitivities).
- The graph highlights that the sensitivity of the mode bit is 33rd out of 43, i.e. near, but not quite at, the bottom of the rank-ordered results.
- The mode bit is not the least sensitive bit (which would be the optimal case, at the bottom of the rank ordering) because, when a mode error occurs, several LSB changes (in the three-quantiser tables) occur, which together are more significant than a single LSB change. This clearly shows that the embedded structuring of the LSF VQ and bit stream has beneficially rendered the LSF VQ relatively immune to bit errors.
- The graph 450 shown in FIG. 7 demonstrates the error resilience of the preferred embodiment of the invention, with the error sensitivity profile shown rank-ordered rather than on a bit-by-bit basis as in FIG. 6.
- Graph 450 shows the bit error sensitivity profiles of the same two 43-bit quantisers, 456 and 458.
- The distortion 452 is shown plotted against re-ordered bit position 454 on a rank-ordered bit basis.
- The graph highlights that the two quantisers have broadly similar error sensitivity profiles and that the addition of the mode bit has not increased error sensitivity.
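Rank-ordering a per-bit sensitivity profile, as done between FIG. 6 and FIG. 7, amounts to sorting the bit positions by their measured distortion. The sketch below uses made-up distortion values purely to show the mechanics; it does not reproduce any measured data.

```python
def rank_order(sensitivities):
    """Bit positions sorted from most to least sensitive."""
    return sorted(range(len(sensitivities)),
                  key=lambda bit: sensitivities[bit],
                  reverse=True)


def rank_of(bit, sensitivities):
    """1-based rank of a bit position (1 = most sensitive)."""
    return rank_order(sensitivities).index(bit) + 1


# Hypothetical distortion per bit position; bit 0 plays the mode bit.
toy_sensitivities = [0.2, 0.9, 0.5, 0.1]
ordering = rank_order(toy_sensitivities)
mode_bit_rank = rank_of(0, toy_sensitivities)
```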
- The bit-error robust embedded split vector quantiser for wideband line spectral frequencies (LSFs) in narrowband tandemming provides at least the following advantages.
- The invention provides for a single speech codec codebook that can quantise both wideband and narrowband signals in a manner that is near-optimal compared with two independently optimised speech codec codebooks. This provides for a reduced memory requirement of the codebook in a memory-constrained speech unit.
- The inventive concepts described herein find particular use in speech processing units that are flexible enough to cope with a variety of bandwidth-constrained speech input signals, such as future third generation cellular telecommunications systems.
- Any number of line spectral frequencies can be accommodated in an LSF codebook arrangement.
- The present invention may be implemented outside of the line spectral frequency area, such as in video encoding and decoding.
- inventive concepts can benefit from such inventive concepts.
- The inventive concepts can be applied to any LPC order, with any bit-division relationship.
- The inventive concepts contained herein can equally be employed in any classified overlapping codebook arrangement, not necessarily limited to the overlapping arrangement between wideband and narrowband speech signals.
- The bit-error robust speech codec is complementary to popular, proven speech codecs such as embedded split VQ codecs.
- The bit-error robust speech codec accommodates wideband line spectral frequencies in narrowband tandemming, and alleviates at least some of the aforementioned disadvantages.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
- Machine Translation (AREA)
Claims (32)
- A speech coder (100, 200) for a speech communications unit, the speech coder comprising a set of at least two embedded vector codebooks capable of representing an input speech signal by a series of vectors, the speech coder (100, 200) being characterised in that the two or more embedded vector codebooks (256, 262) share at least some common vectors of the series of vectors, and by means for classifying (252) the input speech signal by selecting one of the two or more embedded vector codebooks (256, 262) in conjunction with at least a portion of the other embedded vector codebook or codebooks of the two or more embedded vector codebooks (256, 262) to represent the input speech signal.
- The speech coder according to claim 1, wherein indices that address individual vector entry points within the two or more embedded vector codebooks (256, 262) are assigned such that the distortion resulting from an incorrect classification is minimised.
- The speech coder according to claim 1 or 2, further characterised in that the speech codebook is a vector quantisation codebook, wherein the two or more embedded vector codebooks (256, 262) are classified as embedded codebooks that are predominantly wideband or narrowband and that have vectors shared in common.
- The speech coder according to any one of claims 1 to 3, the speech coder being further characterised in that the selected one of the embedded vector codebooks (256, 262) provides a coarse resolution and said portion or portions of the other embedded vector codebook or codebooks provide a fine resolution, so as to represent the speech signal input to the speech coder.
- The speech coder according to any one of the preceding claims, further characterised by quantisation means (110, 210), operably coupled to the classifying means (252), for quantising a line spectral frequency input speech signal.
- The speech coder according to any one of the preceding claims, further characterised by analysis means, operably coupled to the speech vector codebook, for analysing an incoming speech signal to determine a particular characteristic of the speech signal.
- The speech coder according to claim 6, wherein the classifying means includes means for generating or retrieving at least one mode bit to identify a characteristic of the speech signal associated with a wideband or narrowband speech signal.
- The speech coder according to claim 7, wherein the input speech signal comprises at least one core bit, the speech coder being further characterised by positioning means for positioning, in the same vector space, said mode bit and said core bit or bits relating to both the first and second embedded vector codebooks (256, 262).
- The speech coder according to any one of the preceding claims, further comprising appending means for appending at least one zero (304, 306) to a vector representation (300) in order to represent vector positions for either of said first and second embedded speech vector codebooks (256, 262) in a combined embedded speech vector codebook.
- The speech coder according to any one of the preceding claims, further comprising index reassignment means (350) for rearranging the vector positions of the vectors of the first and second codebooks in the combined embedded speech vector codebook in order to minimise the perceptual distance between said vectors of the embedded codebooks.
- A speech communications unit adapted to include the speech coder defined in any one of claims 1 to 10.
- A method of generating a speech vector codebook in a speech communications unit, the method being characterised by the steps of: representing speech signals by a series of vectors; and generating at least two embedded vector codebooks that share common vectors of the series of vectors, in order to generate said speech vector codebook, such that, in use, at least a portion of each of the two or more embedded vector codebooks is used to represent an input speech signal.
- The method of generating a speech codebook according to claim 12, wherein the step of generating at least two embedded vector codebooks comprises the steps of: generating a first embedded vector codebook of the two or more embedded vector codebooks to substantially represent narrowband speech signals; and generating a second embedded vector codebook of the two or more embedded vector codebooks to substantially represent wideband speech signals.
- The method of generating an embedded vector codebook according to claim 12 or 13, the method being further characterised by the step of: quantising, with quantisation means, a line spectral frequency input speech signal.
- The method of generating an embedded vector codebook according to any one of claims 12 to 14, wherein the step of generating at least two embedded vector codebooks comprises generating one of the embedded vector codebooks to provide a coarse resolution and generating at least one other embedded vector codebook to provide a fine resolution, in order to represent the speech signal input to the speech coder.
- The method of generating a speech vector codebook according to any one of claims 12 to 15, the method being further characterised by the steps of: analysing an incoming speech signal with analysis means of the speech coder to determine a particular characteristic of the speech signal associated with wideband or narrowband signals; and generating a mode bit (322) to represent the particular characteristic of the speech signal.
- The method of generating a speech vector codebook according to claim 16, wherein the speech signals include core speech bits (324, 326), the method being further characterised by the step of: positioning the mode bit or bits and the core speech bits, for both embedded vector codebooks, in substantially the same location of the speech vector codebook.
- The method of generating a speech vector codebook according to any one of claims 12 to 17, the method being further characterised by the step of: appending (304, 316) zeros to vector entry points in the first or second embedded vector codebook.
- The method of generating a speech vector codebook according to any one of claims 12 to 18, the method being further characterised by the step of: performing index reassignment of the speech vector codebook in order to maintain a relatively small perceptual distance between respective vector positions, thereby minimising distortion errors that result from an incorrect classification in a speech signal encoding or decoding process.
- The method of generating a speech vector codebook according to any one of claims 12 to 19, the method being further characterised by the step of: interlacing said vector entry points at positions that are substantially uniformly distributed within the combined speech vector codebook.
- A speech communications unit adapted to include a speech vector codebook generated according to any one of method claims 12 to 20.
- A method of encoding a speech signal, the method being characterised by the steps of: receiving a speech signal; identifying a characteristic of the speech signal; selecting one of at least two embedded vector codebooks in conjunction with at least a portion of the other embedded vector codebook or codebooks of said two or more embedded vector codebooks, in order to represent the input speech signal based on the identified characteristic, the embedded vector codebooks sharing at least some common vectors; and transmitting information identifying said selected embedded vector codebook as a representation of said received speech signal.
- A method of encoding a speech signal according to claim 22, wherein the step of identifying a characteristic of the received speech signal comprises identifying whether the speech signal is associated with a wideband speech signal or a narrowband speech signal.
- A method of encoding a speech signal according to claim 22 or 23, wherein the step of selecting comprises selecting one of the embedded vector codebooks to provide a coarse resolution, and the portion or portions of said other embedded vector codebook or codebooks to provide a fine resolution, thereby representing the speech signal input to the speech coder.
- A method of encoding a speech signal according to any one of claims 22 to 24, the method being further characterised by the step of: generating at least one mode bit (322) to identify the characteristic of the speech signal.
- A method of encoding a speech signal according to any one of claims 22 to 25, the method being further characterised by the step of: appending (304, 316) at least one zero to a vector representation of the input speech signal in order to represent said vector position relative to said first or second embedded speech vector codebooks located within said speech vector codebook.
- A speech communications unit adapted to employ a speech encoding method according to any one of claims 22 to 26.
- A method of decoding a speech signal, the method being characterised by the steps of: receiving a speech signal; identifying a characteristic of the received speech signal; and selecting one of at least two embedded vector codebooks in conjunction with at least a portion of at least one other embedded vector codebook of said two or more embedded vector codebooks, in order to represent the input speech signal based on the identified characteristic, the embedded vector codebooks sharing at least some common vectors.
- A method of decoding a speech signal according to claim 27, wherein the step of identifying a characteristic of the received speech signal comprises identifying whether the speech signal is associated with a wideband or a narrowband speech signal.
- A method of decoding a speech signal according to claim 28 or 29, the method being further characterised by the step of: recovering (322) at least one mode bit to identify said characteristic of the speech signal.
- A method of decoding a speech signal according to any one of claims 28 to 30, the method being further characterised by the step of: removing at least one zero appended to a vector representation of the input speech signal in order to represent the vector position in said first or said second embedded speech vector codebook.
- A speech communications unit adapted to employ a speech decoding method as defined in any one of method claims 28 to 31.
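As an informal illustration only (not part of the claims), the core idea of embedded codebooks that share common vectors and are selected by a mode bit could be sketched as follows. The vector values, the two index ranges, the names `COMBINED`, `EMBEDDED`, `encode` and `decode`, and the plain Euclidean nearest-neighbour search are all assumptions made for this sketch; the patent does not specify them here.

```python
from math import dist

# Hypothetical combined codebook of 2-D vectors (illustrative values only).
COMBINED = [
    (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
    (2.0, 2.0), (3.0, 2.0), (2.0, 3.0), (3.0, 3.0),
]

# Two embedded codebooks expressed as index ranges into COMBINED.
# They overlap on indices 2..5, so they share common vectors.
EMBEDDED = {
    0: range(0, 6),  # mode bit 0: assumed narrowband codebook
    1: range(2, 8),  # mode bit 1: assumed wideband codebook
}


def encode(vector, is_wideband):
    """Return (mode_bit, offset) for the nearest vector in the selected codebook."""
    mode_bit = 1 if is_wideband else 0
    candidates = EMBEDDED[mode_bit]
    # Exhaustive nearest-neighbour search restricted to the selected codebook.
    index = min(candidates, key=lambda i: dist(COMBINED[i], vector))
    # Transmit the position relative to the start of the selected codebook.
    return mode_bit, index - candidates.start


def decode(mode_bit, offset):
    """Recover the codebook vector from the mode bit and the relative offset."""
    return COMBINED[EMBEDDED[mode_bit].start + offset]


if __name__ == "__main__":
    for wideband in (False, True):
        mode_bit, offset = encode((2.9, 2.1), wideband)
        print(mode_bit, offset, decode(mode_bit, offset))
```

In this toy layout the same input maps to the same shared vector under either mode, but with a different offset, because each embedded codebook starts at a different position in the combined codebook.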
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| GB0026463A GB2368761B (en) | 2000-10-30 | 2000-10-30 | Speech codec and methods for generating a vector codebook and encoding/decoding speech signals |
| GB0026463 | 2000-10-30 | ||
| PCT/EP2001/012403 WO2002037477A1 (fr) | 2000-10-30 | 2001-10-22 | Speech codec and methods for generating a vector codebook and encoding/decoding speech signals |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| EP1334485A1 (fr) | 2003-08-13 |
| EP1334485B1 (fr) | 2005-08-31 |
Family
ID=9902177
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP01993000A Expired - Lifetime EP1334485B1 (fr) | 2000-10-30 | 2001-10-22 | Speech codec and methods for generating a vector codebook and encoding/decoding speech signals |
Country Status (6)
| Country | Link |
|---|---|
| EP (1) | EP1334485B1 (fr) |
| AT (1) | ATE303647T1 (fr) |
| AU (1) | AU2002215972A1 (fr) |
| DE (1) | DE60113144T2 (fr) |
| GB (1) | GB2368761B (fr) |
| WO (1) | WO2002037477A1 (fr) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| GB0703795D0 (en) | 2007-02-27 | 2007-04-04 | Sepura Ltd | Speech encoding and decoding in communications systems |
| CN110428847B (zh) * | 2019-08-28 | 2021-08-24 | 南京梧桐微电子科技有限公司 | Method and system for bit allocation in line spectral frequency parameter quantization |
Family Cites Families (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH0365822A (ja) * | 1989-08-04 | 1991-03-20 | Fujitsu Ltd | Vector quantization encoder and vector quantization decoder |
| AU7960994A (en) * | 1993-10-08 | 1995-05-04 | Comsat Corporation | Improved low bit rate vocoders and methods of operation therefor |
| US5621852A (en) * | 1993-12-14 | 1997-04-15 | Interdigital Technology Corporation | Efficient codebook structure for code excited linear prediction coding |
| GB2300548B (en) * | 1995-05-02 | 2000-01-12 | Motorola Ltd | Method for a communications system |
| WO1997027578A1 (fr) * | 1996-01-26 | 1997-07-31 | Motorola Inc. | Very low bit rate time domain speech analyzer for voice messaging |
| JP4132154B2 (ja) * | 1997-10-23 | 2008-08-13 | ソニー株式会社 | Speech synthesis method and apparatus, and bandwidth extension method and apparatus |
| US5966688A (en) * | 1997-10-28 | 1999-10-12 | Hughes Electronics Corporation | Speech mode based multi-stage vector quantizer |
| US7110943B1 (en) * | 1998-06-09 | 2006-09-19 | Matsushita Electric Industrial Co., Ltd. | Speech coding apparatus and speech decoding apparatus |
| SE519976C2 (sv) * | 2000-09-15 | 2003-05-06 | Ericsson Telefon Ab L M | Encoding and decoding of signals from multiple channels |
- 2000
  - 2000-10-30: GB application GB0026463A, patent GB2368761B (en), not active (Expired - Fee Related)
- 2001
  - 2001-10-22: AT application AT01993000T, patent ATE303647T1 (de), not active (IP Right Cessation)
  - 2001-10-22: EP application EP01993000A, patent EP1334485B1 (fr), not active (Expired - Lifetime)
  - 2001-10-22: AU application AU2002215972A, patent AU2002215972A1 (en), not active (Abandoned)
  - 2001-10-22: WO application PCT/EP2001/012403, patent WO2002037477A1 (fr), not active (Ceased)
  - 2001-10-22: DE application DE60113144T, patent DE60113144T2 (de), not active (Expired - Lifetime)
Also Published As
| Publication number | Publication date |
|---|---|
| WO2002037477A1 (fr) | 2002-05-10 |
| EP1334485A1 (fr) | 2003-08-13 |
| ATE303647T1 (de) | 2005-09-15 |
| AU2002215972A1 (en) | 2002-05-15 |
| DE60113144T2 (de) | 2006-06-14 |
| GB2368761A (en) | 2002-05-08 |
| GB0026463D0 (en) | 2000-12-13 |
| GB2368761B (en) | 2003-07-16 |
| DE60113144D1 (de) | 2005-10-06 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| JP5343098B2 (ja) | LPC harmonic vocoder with superframe structure | |
| US5966688A (en) | Speech mode based multi-stage vector quantizer | |
| EP0573398B1 (fr) | C.E.L.P. vocoder | |
| US7016831B2 (en) | Voice code conversion apparatus | |
| US5778335A (en) | Method and apparatus for efficient multiband celp wideband speech and music coding and decoding | |
| KR100713677B1 (ko) | Speech decoding apparatus, speech decoding method, and transmission system including the speech decoding apparatus | |
| JP4390803B2 (ja) | Method and apparatus for gain quantization in variable bit rate wideband speech coding | |
| EP0704088B1 (fr) | Method of coding speech signals | |
| CA2443443C (fr) | Method and system for line spectral frequency vector quantization in a speech codec | |
| KR100487943B1 (ko) | Speech coding | |
| US7031912B2 (en) | Speech coding apparatus capable of implementing acceptable in-channel transmission of non-speech signals | |
| JP2006525533A5 (fr) | | |
| JPH08263099A (ja) | Encoding device | |
| JP2005202262A (ja) | Audio signal encoding method, audio signal decoding method, transmitter, receiver, and wireless microphone system | |
| US6205423B1 (en) | Method for coding speech containing noise-like speech periods and/or having background noise | |
| US5987406A (en) | Instability eradication for analysis-by-synthesis speech codecs | |
| US5893060A (en) | Method and device for eradicating instability due to periodic signals in analysis-by-synthesis speech codecs | |
| US6397178B1 (en) | Data organizational scheme for enhanced selection of gain parameters for speech coding | |
| EP1334485B1 (fr) | Speech codec and methods for generating a vector codebook and encoding/decoding speech signals | |
| US20050278174A1 (en) | Audio coder | |
| EP3610481B1 (fr) | Audio coding | |
| JP3475772B2 (ja) | Speech encoding device and speech decoding device | |
| EP0723257B1 (fr) | Speech signal transmission system using spectral parameters and associated device for coding and decoding speech parameters | |
| Drygajilo | Speech Coding Techniques and Standards | |
| Oshima et al. | Variable-length coding of ACELP gain using Entropy-Constrained VQ | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
| | 17P | Request for examination filed | Effective date: 20030530 |
| | AK | Designated contracting states | Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR |
| | AX | Request for extension of the european patent | Extension state: AL LT LV MK RO SI |
| | GRAP | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
| | GRAS | Grant fee paid | Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
| | GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
| | AK | Designated contracting states | Kind code of ref document: B1; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR |
| | PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: NL; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20050831; Ref country code: LI; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20050831; Ref country code: TR; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20050831; Ref country code: ES; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20050831; Ref country code: AT; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20050831; Ref country code: CH; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20050831; Ref country code: BE; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20050831 |
| | REG | Reference to a national code | Ref country code: CH; Ref legal event code: EP; Ref country code: GB; Ref legal event code: FG4D |
| | REG | Reference to a national code | Ref country code: IE; Ref legal event code: FG4D |
| | REF | Corresponds to: | Ref document number: 60113144; Country of ref document: DE; Date of ref document: 20051006; Kind code of ref document: P |
| | PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: CY; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20051022 |
| | PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: IE; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20051024 |
| | PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: LU; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20051031; Ref country code: MC; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20051031 |
| | REG | Reference to a national code | Ref country code: SE; Ref legal event code: TRGR |
| | PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: DK; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20051130; Ref country code: GR; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20051130 |
| | PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: PT; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20060223 |
| | NLV1 | Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act | |
| | REG | Reference to a national code | Ref country code: CH; Ref legal event code: PL |
| | ET | Fr: translation filed | |
| | PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
| | 26N | No opposition filed | Effective date: 20060601 |
| | REG | Reference to a national code | Ref country code: IE; Ref legal event code: MM4A |
| | PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: DE; Payment date: 20101029; Year of fee payment: 10 |
| | PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: IT; Payment date: 20101021; Year of fee payment: 10 |
| | REG | Reference to a national code | Ref country code: FR; Ref legal event code: CD |
| | PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: GB; Payment date: 20110930; Year of fee payment: 11; Ref country code: FR; Payment date: 20111005; Year of fee payment: 11 |
| | REG | Reference to a national code | Ref country code: DE; Ref legal event code: R082; Ref document number: 60113144; Country of ref document: DE; Representative's name: SCHUMACHER & WILLSAU PATENTANWALTSGESELLSCHAFT, DE |
| | PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: SE; Payment date: 20111006; Year of fee payment: 11; Ref country code: FI; Payment date: 20111007; Year of fee payment: 11 |
| | REG | Reference to a national code | Ref country code: DE; Ref legal event code: R082; Ref document number: 60113144; Country of ref document: DE; Representative's name: SCHUMACHER & WILLSAU PATENTANWALTSGESELLSCHAFT, DE; Effective date: 20120113; Ref country code: DE; Ref legal event code: R081; Ref document number: 60113144; Country of ref document: DE; Owner name: MOTOROLA SOLUTIONS, INC., US; Free format text: FORMER OWNER: MOTOROLA, INC., SCHAUMBURG, US; Effective date: 20120113 |
| | GBPC | Gb: european patent ceased through non-payment of renewal fee | Effective date: 20121022 |
| | REG | Reference to a national code | Ref country code: FR; Ref legal event code: ST; Effective date: 20130628 |
| | PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: SE; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20121023; Ref country code: GB; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20121022; Ref country code: DE; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20130501 |
| | REG | Reference to a national code | Ref country code: DE; Ref legal event code: R119; Ref document number: 60113144; Country of ref document: DE; Effective date: 20130501 |
| | PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: FI; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20121022; Ref country code: IT; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20121022; Ref country code: FR; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20121031 |