TWI500024B - Sound wave identification system and its method - Google Patents

Sound wave identification system and its method

Info

Publication number
TWI500024B
Authority
TW
Taiwan
Prior art keywords
sound wave
loudness
wave signal
sound
frequency bands
Prior art date
Application number
TW099115647A
Other languages
Chinese (zh)
Other versions
TW201142820A (en)
Original Assignee
Univ Feng Chia
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
2010-05-17
Publication date
2015-09-11
Application filed by Univ Feng Chia filed Critical Univ Feng Chia
Priority to TW099115647A priority Critical patent/TWI500024B/en
Publication of TW201142820A publication Critical patent/TW201142820A/en
Application granted granted Critical
Publication of TWI500024B publication Critical patent/TWI500024B/en

Landscapes

  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
  • Circuit For Audible Band Transducer (AREA)

Description

Sound wave identification system and method thereof

The present invention relates to an identification system and method, and more particularly to a sound wave identification system and method.

In physical terms, sound is produced by vibrations propagating through a solid, liquid, or gas, and the term refers in particular to the frequency bands of vibration that the human ear can perceive. Human hearing is limited to a range of about 20 hertz (Hz) to 20 kilohertz (kHz), and the upper limit generally falls with age. Other species have different hearing ranges; for example, some dog breeds can perceive vibrations up to 60,000 Hz. Many species use sound as a signal for detecting danger, navigating, hunting, and communicating. Bats, whales, and dolphins, for example, navigate by sound, that is, by sonar, a principle now also used for submarine navigation. Sound must travel through a medium and therefore cannot propagate in a vacuum. Psychoacoustics is the study of human psychological responses to hearing, that is, the responses that arise when an ordinary human ear hears a sound within the audible range of 20 Hz to 20 kHz. A young person who can hear up to about 18 kHz is already considered to have "golden ears". As people age, however, the ear's sensitivity to high-frequency sound declines, and the sensitivity of people who are frequently exposed to noise or accustomed to loud music on earphones also declines over time. At present, sound recognition is applied mostly to speech recognition, which is used to enter spoken content into a computer system in place of conventional input devices (for example, a mouse or keyboard) or to record telephone answering messages. Because of individual speaking habits, each person pronounces each word with slightly different loudness and frequency, and some of these differences are not obvious, so the accuracy of speech recognition still requires further development.

In addition, sound wave identification technology can also be applied to identifying objects, as disclosed in the "coin identification device" of Taiwan Patent No. M373528, which identifies the authenticity of a coin by combining magnetic coil sensing with the sound of the coin's impact. The sound recognition technique is shown in the first figure, which is a flow chart of conventional sound recognition. As shown in the figure, in step S10 the conventional coin identification device uses a sound pickup device to pick up the sound made when a coin passing through the coin channel strikes, by inertia, the impact zone of an impact bar. In step S20, the sound is converted into an electrical signal. In step S30, the waveform of the electrical signal is adjusted by an amplifier, a filter, and a shaper on the circuit board. In step S40, authenticity is judged from the frequency, amplitude, and wavelength of the sound emitted by the coin: a microprocessor compares the parameter values of the coin's impact sound with pre-stored parameter values of the impact sound of a genuine coin and, within a certain error tolerance, judges whether the coin is genuine. However, when a counterfeit coin closely resembles a genuine coin, the impact sounds do not necessarily differ noticeably, and the waveforms of the sound wave signals in particular are extremely similar. A conventional sound recognition device cannot identify such a counterfeit from frequency, amplitude, and wavelength alone, so magnetic coil sensing must also be used; otherwise counterfeit coins will inevitably be accepted by the coin-operated device.

In view of this, a sound wave identification system and method are proposed. Besides avoiding misjudgments caused by sound masking, the present invention provides more accurate sound wave identification than conventional speech recognition and object identification, thereby lowering the misjudgment rate. The present invention further uses psychoacoustics as the core of the sound wave analysis in order to provide accurate sound wave identification and to distinguish sound wave signals that differ only slightly.

An object of the present invention is to provide a sound wave identification system and method that use psychoacoustics as the basis for analyzing a sound wave signal, so that the signal is divided precisely for identification.

The present invention provides a sound wave identification system and method. The system uses a sound wave capture unit to capture a sound wave signal comprising a plurality of frequency bands and transmits it to a filter module, which filters the signal according to those frequency bands to produce the sound wave signal corresponding to each band. An analysis unit analyzes the sound wave signal corresponding to the frequency bands according to psychoacoustic principles to generate sound wave analysis data. Because the analysis is based on psychoacoustics, the analyzed range of the sound wave analysis data covers 20 Hz to 20 kHz, which is the range used for accurate sound wave analysis at present, so an identification unit can identify the source of the sound wave signal from sound wave analysis data that has been analyzed over a precise range. The present invention can therefore be used to accurately identify objects, the operating state of machinery, and speech.

To give the examiners a further understanding of the structure and features of the present invention and of the effects achieved, preferred embodiments are described in detail below with reference to the drawings:

80‧‧‧Sound wave identification system

82‧‧‧Sound wave capture unit

84‧‧‧Conversion module

842‧‧‧Conversion unit

86‧‧‧Filter module

862‧‧‧Filter

88‧‧‧Analysis unit

90‧‧‧Identification unit

92‧‧‧Database

The first figure is a flow chart of conventional sound recognition; the second figure is a block diagram of a preferred embodiment of the present invention; the third figure is a flow chart of a preferred embodiment of the present invention; the fourth figure is a schematic diagram of the conversion and filtering step of a preferred embodiment of the present invention; and the fifth figure is a flow chart of the analysis step of a preferred embodiment of the present invention.

Please refer to the second figure, which is a block diagram of a preferred embodiment of the present invention. As shown in the figure, the sound wave identification system 80 of the present invention comprises a sound wave capture unit 82, a conversion module 84, a filter module 86, an analysis unit 88, and an identification unit 90. The sound wave capture unit 82 captures a sound wave signal having a plurality of frequency bands from a sound source, for example the sound of cloth being rubbed. The conversion module 84 receives the sound wave signal captured by the sound wave capture unit 82 and converts it from the time domain to the frequency domain. The filter module 86 filters the frequency-domain sound wave signal according to the frequency bands. Because the filter module 86 follows psychoacoustics, in this embodiment it comprises 24 filters 862 corresponding to the 24 frequency bands analyzed by the human ear. These bands span 20 Hz to 20 kHz but may also be narrowed to 600 Hz to 16 kHz, which is the effective hearing range of most people. The analysis unit 88 analyzes the sound wave signal according to psychoacoustics: it analyzes the signal separately for each of the 24 frequency bands, that is, it analyzes one by one the filtering results produced by the filter module 86 over 20 Hz to 20 kHz, and generates sound wave analysis data.

The identification unit 90 then receives the sound wave analysis data obtained by the analysis unit 88 and compares it against sound wave reference data in order to identify the sound wave signal. The identification unit 90 performs the comparison with a neural network; because data comparison by neural network is a mature technique, it is not described further here. The sound wave identification system 80 further comprises a database 92 that stores the sound wave reference data so that the identification unit 90 can read it and compare it against the sound wave analysis data generated by the analysis unit 88. The sound wave reference data is produced before any object is identified: the sound wave capture unit 82 captures the sound or speech emitted by a reference object, the conversion module 84 and the filter module 86 convert and filter it, and the analysis unit 88 analyzes it to produce the sound wave reference data, which is stored in the database 92 and used to identify the sound source, for example to identify cloth and confirm its authenticity from the cloth's sound wave analysis data. The identification unit 90 can also file sound wave analysis data in the database 92 for more accurate identification later. In addition, the sound wave identification system 80 can be applied in a computer system or implemented on a single chip, for example a field-programmable gate array (FPGA).

Please refer to the third figure, which is a flow chart of a preferred embodiment of the present invention. As shown in the figure, the sound wave identification method of the present invention comprises: step S100: capturing a sound wave signal; step S200: converting and filtering the sound wave signal; step S300: analyzing the sound wave signal corresponding to the frequency bands to generate sound wave analysis data; and step S400: comparing the sound wave analysis data against sound wave reference data to identify the sound wave signal.

In step S100, the sound wave signal of the sound source, denoted X(n), is captured, for example the sound of cloth being rubbed, of an object, of machinery in operation, or of speech. In step S200, the sound wave signal captured in step S100 is converted, and the converted signal is filtered so that it corresponds to the 24 frequency bands of psychoacoustics. As shown in the third figure, the conversion changes the format of the signal captured in step S100 from the time domain to the frequency domain. The conversion algorithm of this embodiment is the fast Fourier transform (FFT), performed by the conversion module 84 shown in the fourth figure, which applies the FFT for each of the 24 frequency bands. Applying the discrete Fourier transform to a discrete time-domain signal X(n) gives Equation 1 below:
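The equation image is not reproduced in this text. Given the definitions that follow (ω as the angular frequency and $X(e^{j\omega})$ as the frequency-domain function), the standard discrete-time Fourier transform has the form below; this is a reconstruction offered under that assumption, not the patent's own typesetting:

$$X(e^{j\omega}) = \sum_{n=-\infty}^{\infty} X(n)\, e^{-j\omega n} \qquad (1)$$

As a minimal sketch of the same idea for a finite signal, the naive transform below evaluates that sum at the N frequencies ω = 2πk/N. It is a plain O(N^2) discrete Fourier transform rather than the FFT used in the embodiment, and the function name `dft` is hypothetical:

```python
import cmath


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a finite sequence x."""
    n_samples = len(x)
    return [
        sum(
            x[n] * cmath.exp(-2j * cmath.pi * k * n / n_samples)
            for n in range(n_samples)
        )
        for k in range(n_samples)
    ]


# A constant signal has all of its energy in bin 0 (zero frequency).
spectrum = dft([1.0, 1.0, 1.0, 1.0])
```

An FFT computes the same values more efficiently; the naive form is shown only because it mirrors the summation directly.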

Here ω is the angular frequency and $X(e^{j\omega})$ is the frequency-domain function after conversion. The present invention is not limited to this, and other time-domain-to-frequency-domain algorithms, such as the Laplace transform or the Z-transform, may also be used. In step S300, the sound wave signal converted in step S200 is filtered according to the 24 frequency bands of psychoacoustics, by the filter module 86 and the filters 862 shown in the fourth figure. The conversion formula for the 24 frequency bands is given by Equation 2 below:
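Equation 2 is likewise not reproduced in this text. A widely used approximation for mapping frequency to the critical band (Bark) scale is the Zwicker and Terhardt formula, sketched below. Whether the patent's Equation 2 is this exact formula is an assumption, and the function name `hz_to_bark` is hypothetical.

```python
import math


def hz_to_bark(f_hz):
    """Approximate critical band rate (Bark) for a frequency in hertz.

    Uses the Zwicker-Terhardt approximation; assumed here, not taken
    from the patent text.
    """
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


# Band index (0..23) that a frequency would fall into on a 24-band scale.
band_of_1khz = min(int(hz_to_bark(1000.0)), 23)
```

Under this approximation the range 20 Hz to 20 kHz maps to a little over 24 Bark, which is consistent with the 24 bands used by the filter module.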

The frequency bands B(f) are bands distinguished mainly by the auditory perception of the human ear and are used in calculating values such as specific loudness, sharpness, and roughness. When filtering the sound wave signal, the bandwidth of each band B(f) can be obtained by building band-pass filters on bands of 1/3-octave bandwidth. The transfer function H(s) of the band-pass filter can be expressed as Equation 3 below:

Here s is the Laplace operator and Q is the band factor. The band corresponding to each 1/3 octave is given by Equation 4 below:

$f_{li}$ to $f_{ui}$ denote the frequencies covered by each of the above filters, which are used to filter the 24 unequal frequency bands respectively.
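Equation 4 is not reproduced in this text. By the usual definition of a 1/3-octave band, the lower and upper edges lie a factor of 2^(1/6) below and above the center frequency, so the upper edge is 2^(1/3) times the lower edge. The sketch below assumes that definition; `third_octave_edges` and its parameter are hypothetical names.

```python
def third_octave_edges(f_center):
    """Return (f_lower, f_upper) of the 1/3-octave band centered at f_center."""
    factor = 2.0 ** (1.0 / 6.0)  # half of one third of an octave
    return f_center / factor, f_center * factor


f_lower, f_upper = third_octave_edges(1000.0)
```

For a 1 kHz center this gives edges of roughly 891 Hz and 1122 Hz.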

In step S300, the signal values X(f) of the sound wave signal corresponding to the frequency bands, obtained in step S200, are analyzed. As shown in the fifth figure, step S300 further comprises: step S310: extracting loudness; step S320: performing a critical band operation and a masking effect evaluation operation; and step S330: generating sound wave analysis data.

In step S310, the spectral energy value is calculated and extracted from the spectrum obtained by the Fourier transform in step S200, according to Equation 5 below:

$$Y(f) = 10 \log_{10}\left\{ |X(f)|^{2} \right\} \qquad (5)$$
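Equation 5 can be evaluated for each spectral value as in the following minimal sketch. The function name and the `floor` guard against taking the logarithm of zero are assumptions added for illustration and are not part of the source.

```python
import math


def spectral_energy_db(x_f, floor=1e-12):
    """Return Y(f) = 10*log10(|X(f)|^2) for one complex spectral value X(f).

    `floor` is an assumed guard that keeps log10 defined when a bin is
    exactly zero; it is not part of Equation 5.
    """
    power = max(abs(x_f) ** 2, floor)
    return 10.0 * math.log10(power)


# Energy values for a small, made-up spectrum.
energies = [spectral_energy_db(v) for v in (10 + 0j, 3 + 4j, 0j)]
```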

The sound pressure conversion of the loudness energy value is obtained from Equation 6 below:

Equation 6 yields the loudness energy value L(f). Substituting L(f) into Equation 7 below gives, for each frequency band, the summed energy of the loudness energy values L(f) in that band. Equation 7 is as follows:

Here the energy level L(i) is obtained from the energy values Y(f) at the frequencies ($f_{li}$ to $f_{ui}$) of the corresponding band B(f); that is, it is the sum of the energy contained at every frequency within each unequal band B(f). A corresponding energy level L(i) is therefore calculated for each unequal band B(f), so that the energy level L(i) of each band represents the sound pressure that the human ear can perceive in that band. In psychoacoustics the energy level L(i) is called the excitation level.
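Equation 7 is not reproduced in this text. One common way to sum the energy of several spectral lines given as levels in decibels is to convert each level to linear power, add the powers, and convert the total back to decibels. The sketch below assumes that approach and that a band contains at least one spectral line; `band_level_db` is a hypothetical name.

```python
import math


def band_level_db(levels_db):
    """Combine the dB levels of all spectral lines in one band into one level.

    Assumes power summation; the patent's Equation 7 may differ in detail.
    """
    total_power = sum(10.0 ** (level / 10.0) for level in levels_db)
    return 10.0 * math.log10(total_power)


# Two equal lines of 60 dB combine to about 63 dB.
combined = band_level_db([60.0, 60.0])
```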

In step S320, the loudness energy levels obtained in step S310 are used to calculate the specific loudness value of each frequency band.

First, the minimum volume threshold information (LTq) is obtained from Equation 8 below:
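Equation 8 is not reproduced in this text. A commonly used approximation of the threshold of hearing in quiet is Terhardt's formula, which gives the threshold in dB SPL as a function of frequency. The sketch below assumes that this formula can stand in for LTq; the function name is hypothetical.

```python
import math


def threshold_in_quiet_db(f_hz):
    """Approximate hearing threshold in quiet (dB SPL) for f_hz > 0.

    Terhardt's approximation, assumed here in place of Equation 8.
    """
    khz = f_hz / 1000.0
    return (
        3.64 * khz ** -0.8
        - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
        + 1e-3 * khz ** 4
    )


ltq_1khz = threshold_in_quiet_db(1000.0)
```

The approximation reflects that the ear is most sensitive around 3 to 4 kHz and much less sensitive at very low and very high frequencies.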

The specific loudness is calculated by further computing the critical band masking value corresponding to each energy level L(i) to obtain the critical band masking equation $L_E$. This equation is a conventionally known masking effect evaluation equation and is not described further here. In Equation 9, loudness is measured in sones and the critical band is measured in Bark. After comparison with the minimum volume threshold information (LTq) audible to the human ear, substituting the LTq obtained from Equation 8 and the critical band summation equation $L_E$ into Equation 9 yields the specific loudness operator N'. Equation 9 is as follows:
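Equation 9 is not reproduced in this text. Zwicker's classical expression for specific loudness, in sones per Bark, combines the excitation level with the threshold in quiet. The sketch below assumes that expression, with both inputs given in decibels; it is not confirmed to be the patent's Equation 9, and the function and parameter names are hypothetical.

```python
def specific_loudness(excitation_db, threshold_db):
    """Zwicker-style specific loudness N' in sone/Bark (assumed form).

    excitation_db stands in for L_E and threshold_db for LTq.
    """
    threshold_ratio = 10.0 ** (threshold_db / 10.0)  # E_TQ / E_0
    excitation_ratio = 10.0 ** ((excitation_db - threshold_db) / 10.0)  # E / E_TQ
    n_prime = 0.08 * threshold_ratio ** 0.23 * (
        (0.5 + 0.5 * excitation_ratio) ** 0.23 - 1.0
    )
    # A band whose excitation is below threshold contributes no loudness.
    return max(n_prime, 0.0)


n_prime_40 = specific_loudness(40.0, 3.0)
```

With this form, a band whose excitation equals the threshold in quiet has zero specific loudness, and the value grows as the excitation rises above the threshold.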

In step S330, the loudness operators N' corresponding to the 24 frequency bands obtained in step S320 are used to generate the sound wave analysis data, which in this embodiment comprises a specific loudness value L, a sharpness value S, and a roughness value R. The total specific loudness L is obtained from Equation 10 below by integrating the loudness operators N' obtained in step S320:

Here L is the total specific loudness, which serves as the result of loudness identification.
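Equation 10 is not reproduced in this text. In Zwicker's model the total loudness is the integral of the specific loudness over the critical band scale from 0 to 24 Bark. The sketch below assumes that form and approximates the integral by a sum over bands of equal width; the function name and the `band_width_bark` parameter are hypothetical.

```python
def total_loudness(specific_loudness_values, band_width_bark=1.0):
    """Approximate L as the sum of N'(z) * dz over all bands."""
    return sum(n_prime * band_width_bark for n_prime in specific_loudness_values)


# 24 bands, each 1 Bark wide, with a made-up constant specific loudness.
loudness = total_loudness([0.5] * 24)
```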

Once the loudness operator N' of each band and the specific loudness value L have been obtained, the weighting function g(z) corresponding to each loudness operator N' is found from a lookup table and applied as a weight; the weighted loudness is integrated and divided to obtain the sharpness S according to Equation 12. The lookup table for the weighting function g(z) is conventional and is not described further here. Equation 12 is as follows:

Here S is the sharpness, which serves as the identification value for the sharpness of the sound wave.
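Equation 12 is not reproduced in this text. A common definition of sharpness (after Zwicker) divides the integral of N'(z) g(z) z by the integral of N'(z) and scales the result by 0.11. The sketch below assumes that definition, bands 1 Bark wide centered at z = i + 0.5, and an analytic weighting in place of the lookup table mentioned above; all names and the weighting itself are assumptions.

```python
import math


def assumed_weight(z):
    """Assumed g(z): flat up to 16 Bark, rising above (not the patent's table)."""
    return 1.0 if z <= 16.0 else 0.066 * math.exp(0.171 * z)


def sharpness(specific_loudness_values, weight=assumed_weight):
    """Approximate S = 0.11 * sum(N' * g(z) * z) / sum(N')."""
    total = sum(specific_loudness_values)
    if total == 0:
        return 0.0  # a silent signal has no sharpness
    weighted = sum(
        n_prime * weight(i + 0.5) * (i + 0.5)
        for i, n_prime in enumerate(specific_loudness_values)
    )
    return 0.11 * weighted / total


# All loudness concentrated in the band centered at 8.5 Bark.
bands = [0.0] * 24
bands[8] = 2.0
s_value = sharpness(bands)
```

Because the weighting grows above 16 Bark, the same loudness placed in a higher band yields a larger sharpness value.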

In addition, in this embodiment the roughness R can be obtained by substituting the loudness operators N' into Equation 13. Equation 13 is as follows, where the amount of energy change is given by Equation 14:

As Equation 14 shows, Equation 13 calculates the roughness R from the energy change between the loudness operators N', where $\Delta f_B$ is the critical band interval of each corresponding frequency band. The value of R is used to identify the roughness of the sound wave.

In step S400, the sound wave analysis data obtained in step S330 is used for identification. The sound wave reference data stored in the database 92 is compared with the specific loudness value L, the sharpness value S, and the roughness value R contained in the sound wave analysis data, and a neural network identifies the sound source of the sound wave signal captured in step S100 from the reference values that most closely approximate L, S, and R. The state of the sound source is thus learned from the comparison of L, S, and R, so the present invention can be used for identifying objects, the operating state of machinery, and speech. Object identification can be used, for example, to identify cloth or the authenticity of coins; machine operating state can be applied, for example, to identifying engine speed; and speech can be used, for example, for access control or in place of a key for switch control.
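The comparison in step S400 can be illustrated with a much simpler stand-in for the neural network: picking the reference entry whose (L, S, R) triple is closest to the measured one. The sketch below uses a nearest-neighbor search by Euclidean distance, which is not the neural network of the embodiment; the labels and feature values are made up, and the function name is hypothetical.

```python
import math


def identify(features, references):
    """Return the label whose reference (L, S, R) triple is closest to features."""
    return min(references, key=lambda label: math.dist(features, references[label]))


# Hypothetical reference data as it might be stored in the database.
references = {
    "cotton": (12.0, 1.1, 0.20),
    "silk": (9.5, 1.6, 0.12),
}
result = identify((9.8, 1.5, 0.13), references)
```

A practical system would normalize the three features before comparing them, since loudness, sharpness, and roughness have different scales.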

As described above, the embodiment performs identification with the psychoacoustic specific loudness value L, sharpness value S, and roughness value R. Psychoacoustics also includes time-varying loudness, tonality, and fluctuation strength, so the present invention can also use these as a basis for identification. Time-varying loudness is loudness that changes over time, that is, loudness under transient conditions, and it is obtained in the same way as the specific loudness above. Tonality is the difference in tonal color that the human ear resolves in different frequency bands. Fluctuation strength is the amount by which the audio varies over time. In addition, the sound wave reference data is obtained before the method is carried out: steps S100 to S300 are applied to a reference sound wave signal captured from a reference object or speech, and the resulting sound wave reference data is stored in the database 92.

In summary, the present invention is a sound wave identification system and method. A sound wave capture unit captures a sound wave signal and transmits it to a filter module, which filters it into sound wave signals corresponding to different frequency bands. These are analyzed to obtain sound wave analysis data, which is compared against sound wave reference data to identify objects such as cloth, wooden boards, or metal, or to identify the sound of machinery in operation or speech.

Although the present invention has been disclosed above by way of preferred embodiments, these are not intended to limit the invention. Anyone skilled in the art may make minor changes and refinements without departing from the spirit and scope of the invention, so the scope of protection is that defined by the appended claims.

Claims (7)

1. A sound wave identification system, comprising: a sound wave capture unit that captures a sound wave signal; an analysis unit that analyzes the sound wave signal according to psychoacoustics to generate sound wave analysis data; and an identification unit that compares the sound wave analysis data against sound wave reference data to identify the sound wave signal, the sound wave analysis data comprising a specific loudness, a sharpness, and a roughness; wherein the analysis unit obtains a plurality of loudness operators from the sound wave signal and obtains the specific loudness from the loudness operators, and the analysis unit obtains the sharpness from the specific loudness and obtains the roughness from the loudness operators and a plurality of frequency bands of the sound wave signal.

2. The sound wave identification system of claim 1, further comprising: a database that stores the sound wave reference data.

3. The sound wave identification system of claim 1, further comprising: a conversion module that receives the sound wave signal captured by the sound wave capture unit and converts the sound wave signal from the time domain to the frequency domain according to the plurality of frequency bands of the sound wave signal; and a filter module that filters the frequency bands of the converted sound wave signal for transmission to the analysis unit.

4. The sound wave identification system of claim 1, wherein the sound wave signal has a plurality of frequency bands, and the analysis unit extracts from the sound wave signal a plurality of energy values corresponding to the frequency bands, performs a loudness operation and a critical band operation on the energy values to obtain the plurality of loudness operators, and generates the sound wave analysis data from the loudness operators.

5. The sound wave identification system of claim 1, wherein the identification unit uses a neural network to compare the sound wave analysis data against the sound wave reference data.

6. A sound wave identification method, comprising: capturing a sound wave signal; analyzing the sound wave signal according to psychoacoustics by first extracting a plurality of energy values from the sound wave signal, the energy values corresponding to a plurality of frequency bands of the sound wave signal, performing a loudness operation on the energy values corresponding to the frequency bands, performing a critical band operation on the result of the loudness operation, obtaining a plurality of loudness operators from the result of the critical band operation, and generating sound wave analysis data from the loudness operators; wherein generating the sound wave analysis data further comprises obtaining a specific loudness from the loudness operators, obtaining a sharpness from the specific loudness, and obtaining a roughness from the loudness operators and the plurality of frequency bands of the sound wave signal; and comparing the sound wave analysis data against sound wave reference data to identify the sound wave signal.

7. The sound wave identification method of claim 6, further comprising: converting the sound wave signal from the time domain to the frequency domain; and filtering the sound wave signal according to the plurality of frequency bands of the sound wave signal for analysis.
TW099115647A 2010-05-17 2010-05-17 Sound wave identification system and its method TWI500024B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW099115647A TWI500024B (en) 2010-05-17 2010-05-17 Sound wave identification system and its method

Publications (2)

Publication Number Publication Date
TW201142820A TW201142820A (en) 2011-12-01
TWI500024B TWI500024B (en) 2015-09-11

Family

ID=46765163

Family Applications (1)

Application Number Title Priority Date Filing Date
TW099115647A TWI500024B (en) 2010-05-17 2010-05-17 Sound wave identification system and its method

Country Status (1)

Country Link
TW (1) TWI500024B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201508303A (en) * 2013-08-16 2015-03-01 yong-yu Xu Sonar type object-seeking system and implementation method thereof
CN104422930A (en) * 2013-08-30 2015-03-18 许永裕 Sonar type object searching system and implementation method thereof

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW442773B (en) * 1998-08-04 2001-06-23 Sony Electronics Inc System and method for implementing a refined psycho-acoustic modeler
JP2008191659A (en) * 2007-01-12 2008-08-21 Sony Corp Speech enhancement method and speech reproduction system
CN101321387A (en) * 2008-07-10 2008-12-10 中国移动通信集团广东有限公司 Voiceprint recognition method and system based on communication system

Also Published As

Publication number Publication date
TW201142820A (en) 2011-12-01
