EP1014340A2 - Method and device for processing audio signals containing noise - Google Patents


Info

Publication number
EP1014340A2
EP1014340A2 EP99125575A EP99125575A EP1014340A2 EP 1014340 A2 EP1014340 A2 EP 1014340A2 EP 99125575 A EP99125575 A EP 99125575A EP 99125575 A EP99125575 A EP 99125575A EP 1014340 A2 EP1014340 A2 EP 1014340A2
Authority
EP
European Patent Office
Prior art keywords
signal
noise
time offset
vectors
noise reduction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP99125575A
Other languages
German (de)
English (en)
Other versions
EP1014340A3 (fr)
Inventor
Rainer Dr. Hegger
Holger Dr. Kantz
Lorenzo Matassini
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Max Planck Gesellschaft zur Foerderung der Wissenschaften eV
Original Assignee
Max Planck Gesellschaft zur Foerderung der Wissenschaften eV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Max Planck Gesellschaft zur Foerderung der Wissenschaften eV
Publication of EP1014340A2
Publication of EP1014340A3
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02163 Only one microphone
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
    • H04R2225/00 Details of deaf aids covered by H04R25/00, not provided for in any of its subgroups
    • H04R2225/43 Signal processing in hearing aids to enhance the speech intelligibility

Definitions

  • The invention relates to methods for processing noisy sound signals, in particular for non-linear noise reduction in speech signals, for the non-linear separation of power and noise signals, and for the use of non-linear time series analysis based on the concept of low-dimensional deterministic chaos.
  • The invention also relates to a device for implementing the methods and to their use.
  • Noise reduction during the recording, storage, transmission or reproduction of human speech is of high technical relevance.
  • Noise can occur as a pure measurement inaccuracy, e.g. in the form of the digital error when sound amplitudes are output, as noise in the transmission channel, or as dynamic noise due to the coupling of the system under consideration with the outside world.
  • Examples of noise reduction in human speech are generally known from telecommunications, automatic speech recognition and the use of electronic hearing aids.
  • The problem of noise reduction occurs not only with human speech but also with other types of sound signals, and not only with stochastic noise but also with all forms of superposition of extraneous noise on a relevant sound signal. There is an interest in a signal processing technique with which strongly aperiodic and non-stationary sound signals can be analyzed, manipulated or separated with respect to their power and noise components.
  • A typical approach to noise reduction, i.e. to breaking down a signal into certain power and noise components, is based on signal filtering in the frequency domain.
  • Filtering is done with bandpass filters, but this creates the following problem.
  • Stochastic noise is broadband (often so-called "white noise").
  • If, however, the power signal itself is strongly aperiodic and thus broadband, the frequency filter also destroys a portion of the power signal, which leads to inadequate results. For example, if a low-pass filter is used to remove high-frequency noise from human speech during speech transmission, the speech signal is distorted.
  • A further approach is noise compensation for sound recordings.
  • Here, a first microphone records human speech overlaid by the noise level in a room, and a second microphone records a sound signal that essentially represents the noise level.
  • A compensation signal is derived from the microphone signals and superimposed with the measurement signal of the first microphone, so that the noise from the surrounding space is compensated.
  • This technique is disadvantageous because of the relatively high equipment cost (use of special microphones with directional characteristics) and because of its limited applicability, e.g. in voice recording.
  • Time series analysis represents a fundamental approach to learning as much as possible about the properties or state of a system from observed data.
  • Known analytical methods for understanding aperiodic signals are described, for example, by H. Kantz et al. in "Nonlinear Time Series Analysis", Cambridge University Press, Cambridge, 1997, or by H. D. I. Abarbanel in "Analysis of Observed Chaotic Data", Springer, New York, 1996.
  • Deterministic chaos means that, although the system state at a particular point in time uniquely defines the system state at any later point in time, the system is nevertheless unpredictable over a long period of time. This is because the current system state is recorded with an inevitable error, whose effect grows exponentially according to the equation of motion of the system, so that after a relatively short time a simulated model state no longer bears any resemblance to the real state of the system.
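The exponential growth of a small initial error can be illustrated with a toy system. The sketch below uses the logistic map with parameter r = 4, which is an assumption chosen purely for illustration and is not taken from the patent text; the function name `logistic_orbit` is likewise hypothetical.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x), an illustrative chaotic system."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two orbits whose initial states differ by a tiny "measurement error".
a = logistic_orbit(0.3, 50)
b = logistic_orbit(0.3 + 1e-10, 50)

# Distance between the two orbits at every step.
gaps = [abs(p - q) for p, q in zip(a, b)]
```

Both orbits follow the same deterministic rule, yet the entries of `gaps` grow from the size of the initial error to the size of the signal itself within a few dozen steps.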
  • Fig. 10 shows schematically the dependence of successive time series values for noise-free and noisy systems (using the example of a one-dimensional relationship).
  • The noiseless data from a deterministic system yield the image shown in Fig. 10a.
  • The time offset vectors, which are explained in detail further below, lie on a low-dimensional manifold in the embedding space.
  • With noise, the deterministic relationship is replaced by an approximate relationship.
  • The data no longer lie on the sub-manifold but in its vicinity (Fig. 10b).
  • Power and noise are distinguished by dimensionality: everything that leads out of the sub-manifold is attributed to the influence of noise.
  • The noise suppression for deterministic chaotic signals proceeds in three steps.
  • First, the dimension m of the embedding space and the dimension of the manifold on which the noiseless data would lie are estimated.
  • For the actual correction, the manifold is then identified in the vicinity of each individual point, and finally the point under consideration is projected onto the manifold to reduce the noise (Fig. 10c).
  • The object of the invention is also to specify devices for implementing such a method.
  • A first important aspect of the invention consists in particular in detecting non-stationary sound signals, made up of power and noise components, at a sampling rate high enough that predetermined signal profiles within the sound signal under consideration contain sufficient redundancy for noise reduction.
  • Phonemes consist of a sequence of periodic or approximately periodic repetitions. The concepts of periodic and approximately periodic repetitions are discussed separately below. In the following, the term "approximately periodic signal profiles" is used uniformly.
  • The time series of sound signals yield waveforms that repeat at least over certain signal sections of the sound signal and allow a temporary application of the per se known concept of non-linear noise reduction described above.
  • Another important aspect of the invention is the idea of replacing temporal correlations with geometric correlations in the time offset embedding space, which are expressed by environments in this space. The points in these environments provide the information needed for the non-linear noise reduction of the point for which the environment is constructed.
  • The invention also provides a device for processing sound signals, comprising in particular a sampling circuit for signal value detection, an arithmetic circuit for signal value processing, and an output unit for outputting noise-free time series.
  • The invention has the following advantages. For the first time, a noise reduction method for sound signals is created that works essentially without distortion and can be implemented with little equipment expenditure.
  • The invention can be implemented in real time or almost in real time. Certain parts of the signal processing according to the invention are compatible with conventional noise reduction methods, so that additional correction methods known per se or fast data processing algorithms can easily be transferred to the invention.
  • The invention allows the effective separation of power and noise components regardless of the frequency spectrum of the noise. In particular, so-called colored noise or isospectral noise can thus be separated.
  • The invention is applicable not only to stationary noise but also to non-stationary noise, provided the time scale on which the noise process changes its properties is longer than typically 100 ms (this is an example value that relates in particular to the processing of speech signals and can be shorter in other applications).
  • The invention is not limited to human speech but is also applicable to other sound sources of natural or synthetic origin.
  • With the invention, human speech signals can be separated from background noise.
  • Separating individual speech signals from one another would assume that, e.g., one voice is regarded as the power component and another voice as the noise component.
  • A voice representing the noise component would, however, constitute non-stationary noise on the same time scale, which cannot be treated.
  • The invention is explained below using the example of noise reduction in speech signals by utilizing intra-phoneme redundancy.
  • The power component of the sound signal is formed by a speech component x, which is superimposed by a noise component r.
  • The sound signal is divided into signal sections, which in the speech example are formed by spoken syllables or phonemes.
  • The invention is not limited to speech processing. With other sound signals, the assignment of the signal sections is chosen differently depending on the application.
  • The signal processing according to the invention is applicable to any sound signal that is non-stationary in itself but has approximately periodically repeating signal profiles within predetermined signal sections.
  • $s_n^2 = \sum_{k:\, x_k \in U_n} \left( A_n x_k + b_n - x_{k+1} \right)^2$
  • The quantity $s_n^2$ represents a prediction error with respect to the factors $A_n$ and $b_n$.
  • The implicit expression $A_n x_k + b_n - x_{k+1} = 0$ illustrates that the values which satisfy the above-mentioned equation of motion are confined to a hyperplane within the state space under consideration.
  • With noise, the points belonging to the environment $U_n$ are no longer confined to the hyperplane formed by $A_n$ and $b_n$ but are scattered in a region around the hyperplane.
  • Non-linear noise reduction now means projecting the noisy vectors $y_n$ onto this hyperplane.
  • The projection of the vectors onto the hyperplane is carried out using known methods of linear algebra.
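The prediction error can be made concrete for the one-dimensional case, in which $A_n$ and $b_n$ reduce to two scalars. The following is a minimal sketch under that assumption; the function names and the sample values are hypothetical and serve only to show that noise-free pairs lie on the fitted line while noisy pairs scatter around it.

```python
def fit_local_linear(pairs):
    """Least-squares fit of x_next ~ a * x + b over (x, x_next) pairs of one environment."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def prediction_error(pairs, a, b):
    """Sum of squared one-step prediction errors, the scalar analogue of s_n^2."""
    return sum((a * x + b - y) ** 2 for x, y in pairs)

# Noise-free neighbors that obey x_next = 0.5 * x + 1 exactly (illustrative values).
clean = [(x, 0.5 * x + 1.0) for x in (0.0, 0.1, 0.2, 0.3, 0.4)]
a, b = fit_local_linear(clean)
err_clean = prediction_error(clean, a, b)

# The same neighbors with small perturbations of the successor values.
offsets = (0.05, -0.04, 0.03, -0.05, 0.02)
noisy = [(x, y + d) for (x, y), d in zip(clean, offsets)]
a_noisy, b_noisy = fit_local_linear(noisy)
err_noisy = prediction_error(noisy, a_noisy, b_noisy)
```

For the noise-free pairs the error vanishes up to rounding, whereas for the perturbed pairs a positive residual remains, which is the scatter around the hyperplane described above.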
  • The parameter m is the embedding dimension of the time offset vectors.
  • The embedding dimension is chosen depending on the application and is greater than twice the value of the fractal dimension of the attractor of the dynamic system under consideration.
  • The parameter τ is a sampling interval (or "time lag"), which represents the time interval between successive elements of the time series.
  • The time offset vector is thus an m-dimensional vector, the components of which comprise a specific time series value and the (m-1) previous time series values.
  • The sampling interval τ is in turn a variable selected depending on the application. If the system changes little, the sampling interval can be chosen larger to avoid processing redundant data. If the system changes rapidly, the sampling interval must be chosen smaller, since otherwise the correlations that occur between neighboring values would introduce errors into the further processing. The choice of the sampling interval τ is therefore a compromise between the redundancy and the correlation between successive states.
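The construction of the time offset vectors can be sketched as follows. This is a minimal illustration that assumes the time lag is expressed as a whole number of samples (`lag`) rather than as a time; the function name `delay_vectors` and the ordering of the components (oldest value first) are choices made here, not taken from the patent text.

```python
def delay_vectors(series, m, lag):
    """Build m-dimensional time offset vectors from a scalar time series.

    The vector for sample n holds series[n] and the (m - 1) previous values
    spaced `lag` samples apart, ordered from oldest to newest.
    """
    start = (m - 1) * lag  # first sample that has a complete history
    return [
        [series[n - (m - 1 - j) * lag] for j in range(m)]
        for n in range(start, len(series))
    ]

# Ten samples, embedding dimension 3, lag of 2 samples.
series = list(range(10))
vectors = delay_vectors(series, m=3, lag=2)
first_vector = vectors[0]   # built from samples 0, 2 and 4
last_vector = vectors[-1]   # built from samples 5, 7 and 9
```

A series of N samples yields N - (m - 1) * lag vectors, so larger embedding windows leave slightly fewer vectors available at the start of the signal.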
  • The singular values or eigenvalues are determined for the covariance matrix $C_{ij}$.
  • The vectors corresponding to the largest singular values represent the directions spanning the hyperplane defined above by $A_n$ and $b_n$.
  • The associated time offset vectors are projected onto the dominant directions that span the hyperplane. For each element of the scalar time series, this results in m different corrections, which are combined in a suitable manner. The described process can be repeated for the new projection with the noise-reduced values.
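The projection onto the dominant directions can be sketched with NumPy. The exact definition of the covariance matrix is not reproduced in this text, so the sketch assumes a covariance of the deviations from the mean of the environment; the function name `project_onto_dominant` and the two-dimensional sample data are hypothetical.

```python
import numpy as np

def project_onto_dominant(neighbors, point, q):
    """Project `point` onto the q dominant directions of its environment.

    Assumption: the covariance is computed from deviations about the
    environment mean, and the projection is taken relative to that mean.
    """
    nb = np.asarray(neighbors, dtype=float)
    center = nb.mean(axis=0)
    dev = nb - center
    cov = dev.T @ dev / len(nb)
    # eigh returns eigenvalues in ascending order for a symmetric matrix,
    # so the last q columns belong to the largest eigenvalues.
    _, eigvecs = np.linalg.eigh(cov)
    basis = eigvecs[:, -q:]
    offset = np.asarray(point, dtype=float) - center
    return center + basis @ (basis.T @ offset)

# Environment whose points lie exactly on the line y = 2x (illustrative data).
neighbors = [(t, 2.0 * t) for t in (-2.0, -1.0, 0.0, 1.0, 2.0)]
# A noisy point near (1, 2) is pulled back onto the line.
projected = project_onto_dominant(neighbors, (1.2, 1.9), q=1)
```

Because the covariance matrix is symmetric and positive semi-definite, its eigenvectors coincide with its singular vectors, which is why `numpy.linalg.eigh` is sufficient here.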
  • In the signal processing according to the invention, the correlation between neighboring states in the non-deterministic system is determined on the basis of the following additional information.
  • The invention is based on the use of redundancy in the signal. Because of the non-stationarity, a distinction must be made between real redundancy and random similarities of signal parts which are in fact uncorrelated. This is achieved by using a higher embedding dimension and a larger embedding window than would be necessary to resolve the instantaneous dynamics.
  • A voice signal is a concatenation of phonemes. Each individual phoneme is characterized by a characteristic waveform that is repeated several times almost unchanged. A time offset embedding vector that completely covers such a wave can thus be unambiguously assigned to a given phoneme, without being mistaken for another phoneme with a different characteristic waveform. Within a phoneme, these waveforms change in a certain way, so that no absolutely exact repetitions occur. Because of this latter property, the term "almost periodic repetitions" is used.
  • Human speech is a series of phonemes or syllables that have characteristic patterns with respect to amplitudes and frequencies. These patterns can be detected, for example, by observing the electrical signals of a sound transducer (e.g. a microphone).
  • On medium time scales (e.g. in the context of a word), speech is not stationary, and on long time scales (e.g. in the context of a sentence) it is highly complex, with many active degrees of freedom and possibly long-range correlations.
  • On short time scales, i.e. in time ranges that essentially correspond to the length of a phoneme or a syllable, repeating patterns or signal profiles occur, which are explained below. Details of the concrete calculations are implemented in the same way as in conventional noise reduction and can be taken from the publications cited above.
  • In time offset embedding (with suitably chosen parameters m and τ, see above), the repetitions shown form neighboring points in the state space (or vectors directed at these points). If the variability in these points due to superimposed noise is greater than the natural variability due to non-stationarity, then an approximate identification of the manifold and the projection onto it will reduce the noise more than it affects the actual signal. This is the basic approach of the method according to the invention, which is explained below with reference to the flowchart of Fig. 3.
  • Fig. 3 is an overview diagram that schematically shows the basic steps of the method according to the invention.
  • The invention is not limited to this process.
  • The parameter determination, the actual calculation for noise reduction, the separation of power and noise components, and the output of the result can be provided.
  • After the start 100, data acquisition 101 and parameter determination 102 take place.
  • Data acquisition 101 comprises recording a sound signal by converting the sound into an electrical quantity. Data acquisition can be based on analog or digital sound recording. Depending on the application, the sound signal is stored in a data memory or, with real-time processing, in a buffer memory (see Fig. 9).
  • Parameter determination 102 includes the selection of parameters that are suitable for the later search for correlations between neighboring states in the sound signal. These parameters include in particular the embedding dimension m, the sampling interval τ, the diameter ε of the environments U in the time offset embedding space for identifying neighbors, and the number Q of time offset vectors onto which the state is to be projected.
  • The embedding dimension m is, for example, in the range of 10 to 50, preferably 20 to 30, and the sampling interval τ is in the range of 0.1 to 0.3 ms, so that the embedding window m · τ preferably covers approx. 3 to 8 ms.
  • These values relate to a phoneme duration of approx. 50 to 200 ms and to the complexity of the human voice.
  • Owing to the pitch of the human voice of approximately 100 Hz, typical signal profiles last between 3 and 15 ms.
  • Fig. 2 shows, for example, repetitions of the signal profile every 7 ms.
  • Parameter determination 102 (Fig. 3) can be carried out in interaction with data acquisition 101 or as part of a preliminary analysis.
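The relationship between these parameters is simple arithmetic and can be checked with a short sketch. The concrete values m = 25 and τ = 0.2 ms are picked here from within the stated preferred ranges purely as an example; the function names are hypothetical.

```python
def embedding_window_ms(m, tau_ms):
    """Length of the embedding window m * tau in milliseconds."""
    return m * tau_ms

def profile_period_ms(pitch_hz):
    """Duration of one repetition of the signal profile for a given pitch."""
    return 1000.0 / pitch_hz

def sampling_rate_hz(tau_ms):
    """Sampling rate that corresponds to a sampling interval tau."""
    return 1000.0 / tau_ms

# Example values chosen from the preferred ranges (assumption for illustration).
window = embedding_window_ms(m=25, tau_ms=0.2)
period = profile_period_ms(pitch_hz=100.0)
rate = sampling_rate_hz(tau_ms=0.2)

# The window should fall into the preferred 3 to 8 ms range.
window_in_preferred_range = 3.0 <= window <= 8.0
```

With these example values the embedding window is about 5 ms, the profile period at a pitch of 100 Hz is 10 ms, and the sampling rate is about 5 kHz.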
  • Signal sampling 103 follows on the basis of the recorded measured values and the specified parameters.
  • Signal sampling 103 serves to determine the values of the time series $y_n$ from the data in accordance with the previously determined sampling parameters.
  • The following steps 104 to 109 represent the actual calculation of the projections of the real sound signals onto noiseless sound signals or states.
  • Step 104 comprises the formation of the first time offset vector at the beginning of the time series (for example according to Fig. 2).
  • This first time offset vector does not necessarily have to refer to the signal profile that appears first in time. However, this is particularly preferred for real-time or quasi-real-time processing.
  • The first time offset vector comprises m signal values $y_n$ as m components which follow one another with the time offset τ.
  • Next, adjacent time offset vectors are formed and recorded (step 105).
  • The neighboring vectors refer to signal profiles that are very similar to the signal profile represented by the first vector. They form the first environment U. If the first vector represents a profile that is part of a phoneme, the neighboring vectors essentially correspond to the approximately repeating signal profiles within the same phoneme. In speech processing, around 15 signal profiles are repeated within a phoneme.
  • The number of neighboring vectors determined is less than or equal to the number of repeating signal profiles and is, for example, around 5 to 15.
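The search for neighboring time offset vectors can be sketched as a brute-force comparison. The distance measure is not specified in this text, so the maximum norm is assumed here; the function name `find_neighbors` and the synthetic period-4 signal are hypothetical.

```python
def find_neighbors(vectors, index, eps):
    """Indices of all vectors within eps of vectors[index], the vector itself included.

    Assumption: distance is measured in the maximum norm.
    """
    ref = vectors[index]
    return [
        j
        for j, v in enumerate(vectors)
        if max(abs(a - b) for a, b in zip(ref, v)) <= eps
    ]

# Synthetic signal that repeats its profile every 4 samples.
series = [0.0, 1.0, 0.0, -1.0] * 5
# Time offset vectors covering one full profile each (dimension 4, lag 1).
vectors = [series[i:i + 4] for i in range(len(series) - 3)]

# Environment of the first vector: the later repetitions of the same profile.
environment = find_neighbors(vectors, index=0, eps=0.1)
```

In this toy signal the environment of the first vector consists of the vectors starting at samples 0, 4, 8, 12 and 16, i.e. exactly the repetitions of the same profile. A brute-force search costs one comparison per pair of vectors; faster search structures are possible but are not shown here.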
  • The covariance matrix is then calculated (step 106) according to equation (2) given above.
  • The vectors entering this matrix are the vectors from the environment U determined in step 105.
  • Step 106 then includes the determination of the Q largest singular values of the covariance matrix and the associated singular vectors in the m-dimensional space.
  • The value Q is in the range of around 2 to 10, preferably 4 to 6. In a modified procedure, Q can be zero (see below).
  • The relatively small number Q, which is the dimension of the subspace onto which the states or signals are projected, represents a particular advantage of the invention. It was found that the dynamics of the waves within a given phoneme has only a few degrees of freedom once it has been identified within a high-dimensional space. Therefore, only relatively few neighboring states are required for the projection calculation. To capture the correlation between the signal profiles, only the largest singular values and the corresponding singular vectors of the covariance matrix are relevant. This result is surprising, since non-linear noise reduction was in itself developed for deterministic systems with extensive time series. A further particular advantage is the relatively short time required for the calculation.
  • After the projection (step 107), the next time offset vector is selected in step 108 and the sequence 105-107 is repeated, with new environments and new covariance matrices being formed. This is repeated until all time offset vectors that can be constructed from the time series have been processed.
  • Incidentally, the formation or acquisition of the neighboring vectors (step 105) takes place in a higher dimension than the projection 107.
  • The high dimension in the search for neighbors guarantees the selection of the right neighbors, i.e. those representing profiles that stem from the same phoneme.
  • The invention thus implicitly selects phonemes without any language model.
  • As explained above, however, the dynamics within a phoneme has significantly fewer degrees of freedom, so that work within the subspace spanned by the singular vectors is low-dimensional and fast.
  • The sound signal processing takes place essentially phoneme by phoneme in succession, so that one phoneme after another is completely processed and a noise-free output signal is thus produced. This output signal is delayed by around 100 to 200 ms relative to the recorded sound signal (input signal) (real-time or quasi-real-time application).
  • Steps 109 and 110 relate to the formation of the actual output signal.
  • Step 109 is directed to the separation of power and noise signals.
  • A noise-free time series element $s_k$ is formed by averaging over the corresponding elements from all time offset vectors which contain this element.
  • A weighted averaging can be introduced instead of a simple averaging.
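The averaging step can be sketched as follows for the simple (unweighted) case. The sketch assumes that the corrected time offset vectors are stored oldest component first, that the lag is a whole number of samples, and that the first vector belongs to sample (m - 1) * lag; the function name `average_corrections` and the sample values are hypothetical.

```python
def average_corrections(corrected_vectors, m, lag, length):
    """Average, per time series element, all corrected components that refer to it.

    Assumption: corrected_vectors[i] belongs to sample n = (m - 1) * lag + i and
    its component j refers to sample n - (m - 1 - j) * lag.
    """
    sums = [0.0] * length
    counts = [0] * length
    start = (m - 1) * lag
    for offset, vec in enumerate(corrected_vectors):
        n = start + offset
        for j, value in enumerate(vec):
            k = n - (m - 1 - j) * lag
            sums[k] += value
            counts[k] += 1
    # Elements not covered by any vector are reported as None.
    return [s / c if c else None for s, c in zip(sums, counts)]

# Four samples, embedding dimension 2, lag 1: three overlapping vectors.
corrected = [[1.0, 2.0], [4.0, 3.0], [5.0, 6.0]]
cleaned = average_corrections(corrected, m=2, lag=1, length=4)
```

Interior elements appear in several vectors and therefore receive the mean of several corrections, while elements at the edges of the series are covered by fewer vectors. A weighted variant would replace the plain sum by a weighted sum; the choice of weights is not specified here.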
  • After step 109, a return to before step 104 can be provided.
  • The noise-free time series elements then form the input variables for the renewed formation of time offset vectors and their projection onto the subspace in accordance with the singular vectors. This process repetition is not necessary, but can be provided, for example, 2 or 3 times to improve the noise reduction.
  • A return to parameter determination 102 can also be provided if the power component present after step 109 differs less than expected (for example by less than a predetermined threshold value) from the unprocessed sound signals.
  • For this purpose, decision mechanisms (not shown) can be built in.
  • Data output follows. With noise reduction, the noise-reduced voice signal is output as the power component. Alternatively, the output or storage of the noise component can also be provided depending on the application.
  • First, the dimension of the manifold (corresponding to the parameter Q) on which the noise-free data would lie can vary in the course of a signal.
  • The dimension Q can vary from phoneme to phoneme.
  • The dimension can, for example, also be zero during a pause between two spoken words or any other resting phase.
  • Second, a selection of the relevant vectors onto which the state is to be projected should be ruled out if the noise is relatively high (about 50%). In this case, all eigenvalues of the correlation matrix are approximately the same.
  • The projection dimension Q is therefore adapted, i.e. determined individually for each covariance matrix.
  • This modification drastically increases the efficiency of the process, especially at high noise levels.
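How Q is determined for each covariance matrix is not spelled out in this text. The sketch below therefore shows one possible rule, introduced here purely as an assumption: count the eigenvalues that clearly exceed the median eigenvalue, which serves as a crude noise floor. The function name `choose_q` and the parameters `factor` and `q_max` are hypothetical.

```python
from statistics import median

def choose_q(eigenvalues, factor=3.0, q_max=10):
    """Pick a projection dimension from the eigenvalue spectrum of one covariance matrix.

    Hypothetical rule: an eigenvalue counts as dominant if it exceeds
    `factor` times the median eigenvalue. The result is capped at q_max.
    """
    floor = median(eigenvalues)
    dominant = sum(1 for ev in eigenvalues if ev > factor * floor)
    return min(dominant, q_max)

# Two directions clearly dominate the remaining noise directions.
q_structured = choose_q([9.0, 5.0, 0.1, 0.1, 0.1, 0.1])

# All eigenvalues are about equal, as in a pause or at very high noise.
q_flat = choose_q([1.0, 1.1, 0.9, 1.0, 1.05])
```

With this rule a flat spectrum yields Q = 0, which is consistent with the statement that the dimension can be zero during a resting phase; other selection rules are equally conceivable.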
  • The signal processing according to the invention is illustrated below with two examples.
  • In the first example, the processed sound signal is a human whistle (see Fig. 4).
  • The second example concerns the above-mentioned words "Buon giorno" (see Figs. 5 to 8).
  • Fig. 4 shows the power spectrum for a human whistle lasting 3 s.
  • A whistle is an essentially periodic signal with characteristic harmonics and only minor non-stationarities.
  • Fig. 4a shows the amplitude profile of the original recording.
  • After noise is added, Fig. 4b results.
  • This provides the input data for step 101 of the process sequence (Fig. 3).
  • After the noise reduction according to the invention, the image shown in Fig. 4c results.
  • Figures 4a to 4c show a particular advantage of the invention over a conventional filter in the frequency domain.
  • A filter in the frequency domain would cut off all power components with amplitudes below $10^{-6}$, so that the noisy spectrum would only contain the peak at 0 and the peak around the fundamental frequency. Accordingly, the time series obtained from the back transformation would be completely harmonic, which would sound very synthetic.
  • Fig. 5 shows corresponding results using the example of curve representations for processing voice signals.
  • Fig. 5a shows a section of the noiseless wave train of the words "Buon giorno", based on the signal curve according to Fig. 1 and presented analogously to Fig. 2. The time-limited repetition of signal profiles, which contains the redundancy required for noise reduction, is recognizable.
  • Fig. 5b shows the wave train after adding synthetic noise. After noise reduction according to the invention, the image of Fig. 5c results. It turns out that the original signal could be reconstructed for the most part.
  • The functionality of the noise reduction according to the invention was tested with different noise types and amplitudes.
  • The attenuation D (in dB) according to equation (3) can be considered as a measure of the performance of the noise reduction.
  • $D = 10 \log \left( \frac{\sum_k (\hat{y}_k - x_k)^2}{\sum_k (y_k - x_k)^2} \right)$ (3)
  • $x_k$ stands for the noiseless signal (power component),
  • $y_k$ for the noisy signal (input sound signal), and
  • $\hat{y}_k$ for the signal after the noise reduction according to the invention.
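The attenuation measure can be computed directly from the three signals. The sketch below assumes that "log" denotes the base-10 logarithm, as is usual for values in dB; the function name `attenuation_db` and the sample values are hypothetical.

```python
import math

def attenuation_db(clean, noisy, denoised):
    """Attenuation D in dB: residual error energy relative to the original noise energy.

    Assumption: the logarithm in the formula is taken to base 10.
    """
    residual = sum((d - c) ** 2 for d, c in zip(denoised, clean))
    original = sum((y - c) ** 2 for y, c in zip(noisy, clean))
    return 10.0 * math.log10(residual / original)

# Toy signals: the noise reduction shrinks each error from 1.0 to 0.1.
clean = [0.0, 0.0]
noisy = [1.0, 1.0]
denoised = [0.1, 0.1]
d = attenuation_db(clean, noisy, denoised)
```

With this definition D is negative whenever the processed signal is closer to the noiseless signal than the noisy input was; reducing the error amplitude by a factor of 10, as in the toy example, corresponds to about -20 dB. The measure requires the noiseless signal and is therefore only available in tests with synthetically added noise.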
  • Fig. 6 illustrates the dependence of the attenuation D of the non-linear noise reduction on the relative noise amplitude (variance of the noise component : variance of the power component). It turns out that the attenuation remains effective even at relatively high noise amplitudes (in the range of more than 100%).
  • Figures 7 and 8 show further details of the speech noise reduction.
  • Fig. 7 illustrates the occurrence of repetitive signal profiles within the phoneme train shown in the upper part of the figure.
  • A graph is printed in the lower part of the figure, which consists of points formed under the following conditions.
  • For each time $i$, the associated time offset vector $\vec{s}_i$ and the set of all other time offset vectors $\vec{s}_j$ are considered. Whenever the magnitude of the difference vector between $\vec{s}_i$ and a vector $\vec{s}_j$ is less than a predetermined limit, a dot is printed.
  • The points form more or less extended lines.
  • The line structures show that the periodicities of the signal profiles explained above occur within the phonemes.
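The construction of such a point graph can be sketched as follows. The plotting coordinates are not given in this text, so the sketch simply collects index pairs (i, j); the Euclidean norm, the function name `recurrence_points` and the synthetic period-4 signal are assumptions made for illustration.

```python
import math

def recurrence_points(vectors, limit):
    """Collect all index pairs (i, j), i != j, whose vectors are closer than `limit`.

    Assumption: the magnitude of the difference vector is the Euclidean norm.
    """
    points = []
    for i, vi in enumerate(vectors):
        for j, vj in enumerate(vectors):
            if i != j and math.dist(vi, vj) < limit:
                points.append((i, j))
    return points

# Synthetic signal that repeats its profile every 4 samples.
series = [0.0, 1.0, 0.0, -1.0] * 3
vectors = [series[i:i + 4] for i in range(len(series) - 3)]

points = recurrence_points(vectors, limit=0.5)
# Every dot connects two vectors that are a whole number of periods apart.
all_multiples_of_period = all((i - j) % 4 == 0 for i, j in points)
```

Plotted in the (i, j) plane, the dots of a periodic signal line up parallel to the diagonal at a spacing equal to the period, which is the kind of line structure referred to above.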
  • Fig. 8 shows, again for the example of the words "Buon giorno", the noiseless signal in the upper part of the figure, the synthetically added noise in the middle part, and the noise remaining after the noise reduction in the lower part.
  • The ordinate scaling is identical in all three cases.
  • The residual noise (bottom part of the figure) shows a systematic variation, indicating that the success of the noise reduction according to the invention depends on the sound signal itself, i.e. on the specific phoneme.
  • The invention also relates to a device for implementing the method according to the invention.
  • According to Fig. 9, a noise reduction arrangement includes a transducer 91, a data memory 92 and/or a buffer memory 93, a sampling circuit 94, an arithmetic circuit 95 and an output unit 96.
  • The components of the device according to the invention presented here are preferably manufactured as a permanently wired circuit arrangement or as an integrated chip.
  • The invention is also applicable to noise reduction in hearing aids and to improving computer-based automatic speech recognition.
  • For speech recognition, provision can be made in particular to compare the noise-free time series values or vectors with table values.
  • The table values represent corresponding values or vectors of predetermined phonemes.
  • Automatic speech recognition can thus be integrated with the noise reduction process.

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)
  • Stereophonic System (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
EP99125575A 1998-12-21 1999-12-21 Procédé et dispositif de traitement des signaux audio contenant du bruit Withdrawn EP1014340A3 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE19859174A DE19859174C1 (de) 1998-12-21 1998-12-21 Verfahren und Vorrichtung zur Verarbeitung rauschbehafteter Schallsignale
DE19859174 1998-12-21

Publications (2)

Publication Number Publication Date
EP1014340A2 (fr) 2000-06-28
EP1014340A3 EP1014340A3 (fr) 2001-07-18

Family

ID=7892062

Family Applications (1)

Application Number Title Priority Date Filing Date
EP99125575A Withdrawn EP1014340A3 (fr) 1998-12-21 1999-12-21 Procédé et dispositif de traitement des signaux audio contenant du bruit

Country Status (4)

Country Link
US (1) US6502067B1 (fr)
EP (1) EP1014340A3 (fr)
JP (1) JP2000194400A (fr)
DE (1) DE19859174C1 (fr)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7124075B2 (en) * 2001-10-26 2006-10-17 Dmitry Edward Terez Methods and apparatus for pitch determination
EP1585112A1 (fr) * 2004-03-30 2005-10-12 Dialog Semiconductor GmbH Suppression de bruit sans retard
JP4931927B2 (ja) * 2005-09-07 2012-05-16 バイループ テクノロジック,エス.エル. マイクロコントローラーを利用した信号認識法
US20070076001A1 (en) * 2005-09-30 2007-04-05 Brand Matthew E Method for selecting a low dimensional model from a set of low dimensional models representing high dimensional data based on the high dimensional data
JP2009529699A (ja) 2006-03-01 2009-08-20 ソフトマックス,インコーポレイテッド 分離信号を生成するシステムおよび方法
US8175291B2 (en) * 2007-12-19 2012-05-08 Qualcomm Incorporated Systems, methods, and apparatus for multi-microphone based speech enhancement
US8321214B2 (en) * 2008-06-02 2012-11-27 Qualcomm Incorporated Systems, methods, and apparatus for multichannel signal amplitude balancing
US8515097B2 (en) * 2008-07-25 2013-08-20 Broadcom Corporation Single microphone wind noise suppression
US9253568B2 (en) * 2008-07-25 2016-02-02 Broadcom Corporation Single-microphone wind noise suppression
US9228785B2 (en) 2010-05-04 2016-01-05 Alexander Poltorak Fractal heat transfer device
TWI412019B (zh) 2010-12-03 2013-10-11 Ind Tech Res Inst 聲音事件偵測模組及其方法
JP2014085609A (ja) * 2012-10-26 2014-05-12 Sony Corp 信号処理装置および方法、並びに、プログラム
CN103811017B (zh) * 2014-01-16 2016-05-18 浙江工业大学 一种基于Welch法的冲床噪声功率谱估计改进方法
US9530408B2 (en) * 2014-10-31 2016-12-27 At&T Intellectual Property I, L.P. Acoustic environment recognizer for optimal speech processing
WO2017033430A1 (fr) 2015-08-26 2017-03-02 パナソニックIpマネジメント株式会社 Dispositif de détection de signal et procédé de détection de signal
US10830545B2 (en) 2016-07-12 2020-11-10 Fractal Heatsink Technologies, LLC System and method for maintaining efficiency of a heat sink
US11217254B2 (en) * 2018-12-24 2022-01-04 Google Llc Targeted voice separation by speaker conditioned on spectrogram masking
CN110349592B (zh) * 2019-07-17 2021-09-28 百度在线网络技术(北京)有限公司 用于输出信息的方法和装置
JP7271360B2 (ja) * 2019-07-31 2023-05-11 株式会社Nttドコモ 状態判定システム
WO2021071489A1 (fr) 2019-10-10 2021-04-15 Google Llc Séparation vocale ciblée par un locuteur à des fins de reconnaissance vocale
CN121415799B (zh) * 2025-12-30 2026-03-31 北京冠宇信息科技股份有限公司 声波信号自监督学习增强的鲁棒降噪处理方法及系统

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA1293693C (fr) * 1985-10-30 1991-12-31 Tetsu Taguchi Appareil reducteur de bruit
KR950013124B1 (ko) * 1993-06-19 1995-10-25 엘지전자주식회사 케이오스(chaos) 피이드백 시스템
US6000833A (en) * 1997-01-17 1999-12-14 Massachusetts Institute Of Technology Efficient synthesis of complex, driven systems
US6208951B1 (en) * 1998-05-15 2001-03-27 Council Of Scientific & Industrial Research Method and an apparatus for the identification and/or separation of complex composite signals into its deterministic and noisy components

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
E.J.KOSTELICH, T.SCHREIBER: "Noise reduction in chaotic time-series data: A survey of common methods." PHYSICAL REVIEW E. STATISTICAL PHYSICS, PLASMAS, FLUIDS, AND RELATED INTERDISCIPLINARY TOPICS., Vol. 48, No. 3, September 1993 (1993-09), pages 1752-1763, XP000992597 AMERICAN INSTITUTE OF PHYSICS, NEW YORK, NY., US ISSN: 1063-651X *
MATASSINI L ET AL: "Filtering of speech signals by over-embedding" STOCHASTIC AND CHAOTIC DYNAMICS IN THE LAKES, AMBLESIDE, UK, AUG. 1999, No. 502, pages 642-648, XP000997106 AIP Conference Proceedings, 2000, AIP, USA ISSN: 0094-243X *
P.GRASSBERGER ET AL.: "On noise reduction methods for chaotic data" CHAOS., Vol. 3, No. 2, 1993, pages 127-141, XP000997215 AMERICAN INSTITUTE OF PHYSICS, WOODBURY, NY., US ISSN: 1054-1500 *
R.HEGGER ET AL.: "denoising human speech signals using chaoslike features" PHYSICAL REVIEW LETTERS, Vol. 84, No. 14, 3 April 2001, pages 3197-3200, XP000997103 NEW YORK,NY, US ISSN: 0031-9007 *

Also Published As

Publication number Publication date
JP2000194400A (ja) 2000-07-14
DE19859174C1 (de) 2000-05-04
EP1014340A3 (fr) 2001-07-18
US6502067B1 (en) 2002-12-31

Similar Documents

Publication Publication Date Title
DE19859174C1 (de) Verfahren und Vorrichtung zur Verarbeitung rauschbehafteter Schallsignale
DE102007001255B4 (de) Tonsignalverarbeitungsverfahren und -vorrichtung und Computerprogramm
DE69030561T2 (de) Spracherkennungseinrichtung
DE69619284T3 (de) Vorrichtung zur Erweiterung der Sprachbandbreite
DE3687815T2 (de) Verfahren und vorrichtung zur sprachanalyse.
DE69432943T2 (de) Verfahren und Vorrichtung zur Sprachdetektion
DE60033549T2 (de) Verfahren und vorrichtung zur signalanalyse
DE60018886T2 (de) Adaptive Wavelet-Extraktion für die Spracherkennung
DE69417445T2 (de) Verfahren und system zur detektion und erzeugung von übergangsbedingungen in tonsignalen
DE69127961T2 (de) Verfahren zur Spracherkennung
DE60316517T2 (de) Verfahren und Vorrichtung zur Aufnahme von Störsignalen
DE3884880T2 (de) Billige Spracherkennungseinrichtung und Verfahren.
DE69519453T2 (de) Spracherkennung mit Sprecheradaptierung mittels Berechnung von Mittelwerten akustischer Kategorien
WO2003009273A1 (fr) Procede et dispositif pour caracteriser un signal et pour produire un signal indexe
DE69720134T2 (de) Spracherkenner unter Verwendung von Grundfrequenzintensitätsdaten
EP1193688A2 (fr) Procédé pour déterminer un espace propre pour la présentation d'une pluralité de locuteurs d'apprentissage
DE2326517A1 (de) Verfahren und schaltungsanordnung zum erkennen von gesprochenen woertern
DE602005000896T2 (de) Sprachsegmentierung
DE2020753A1 (de) Einrichtung zum Erkennen vorgegebener Sprachlaute
DE69317802T2 (de) Verfahren und Vorrichtung für Tonverbesserung unter Verwendung von Hüllung von multibandpassfiltrierten Signalen in Kammfiltern
EP1193689A2 (fr) Procédé pour le calcul d'un espace de vecteurs propres pour la représentation d'une pluralité de locuteurs pendant la phase d'entraínement
DE69020736T2 (de) Wellenanalyse.
DE3878895T2 (de) Verfahren und einrichtung zur spracherkennung.
DE10047718A1 (de) Verfahren zur Spracherkennung
DE102014207437A1 (de) Spracherkennung mit einer Mehrzahl an Mikrofonen

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

AX Request for extension of the european patent

Free format text: AL;LT;LV;MK;RO;SI

PUAL Search report despatched

Free format text: ORIGINAL CODE: 0009013

AK Designated contracting states

Kind code of ref document: A3

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

AX Request for extension of the european patent

Free format text: AL;LT;LV;MK;RO;SI

17P Request for examination filed

Effective date: 20011115

AKX Designation fees paid

Free format text: AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

RAP3 Party data changed (applicant data changed or rights of an application transferred)

Owner name: MAX-PLANCK-GESELLSCHAFT ZUR FOERDERUNG DER WISSENS

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20080701