TWI500024B - Sound wave identification system and its method - Google Patents
- Publication number
- TWI500024B
- Authority
- TW
- Taiwan
- Prior art keywords
- sound wave
- loudness
- wave signal
- sound
- frequency bands
Landscapes
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
- Circuit For Audible Band Transducer (AREA)
Description
The present invention relates to an identification system and method, and more particularly to a sound wave identification system and method.
Physically, sound is a vibration that propagates through a solid, liquid, or gas; in particular, it refers to the frequency bands of vibration that the human ear can perceive. The human hearing range is limited to roughly 20 hertz (Hz) to 20 kilohertz (kHz), and the upper limit generally declines with age. Other species have different hearing ranges; some dog breeds, for example, can sense vibrations up to 60,000 Hz. Many species use sound as a signal for detecting danger, navigating, hunting, and communicating. Bats, whales, and dolphins, for example, navigate by sound, that is, by sonar, a technique now also used for submarine navigation. Sound needs a medium to propagate and therefore cannot travel through a vacuum. Psychoacoustics studies people's psychological response to what they hear, that is, the response of the ordinary human ear to sounds within the hearing range of 20 Hz to 20 kHz. A young person who can hear up to about 18 kHz is already considered to have "golden ears". With age, the ear's sensitivity to high-frequency sound declines, and the same happens over time to people who are frequently exposed to noise or who habitually listen to loud music through earphones.
Today, sound recognition is mostly applied to speech recognition, which is used to enter spoken content into a computer system in place of conventional input devices (for example, a mouse or keyboard) or to record telephone answering messages. Because of individual speaking habits, the loudness and frequency with which each person articulates each word differ, sometimes slightly and sometimes almost imperceptibly, so the accuracy of speech recognition still needs further development.
In addition, sound wave identification technology can also be applied to identifying objects. For example, the "Coin Identification Device" of Taiwan Patent No. M373528 uses magnetic coil sensing together with the sound of a coin impact to determine whether a coin is genuine. Its sound recognition technique is shown in the first figure, which is a flow chart of conventional sound recognition. In step S10, the conventional coin identification device uses a sound pickup device to capture the sound made when a coin passing through the coin channel strikes, by inertia, the impact zone of a striking rod. In step S20, the sound is converted into an electrical signal. In step S30, the waveform of the electrical signal is adjusted by an amplifier, a filter, and a shaper on the circuit board. In step S40, the coin is judged genuine or counterfeit from the frequency, amplitude, and wavelength of its sound: a microprocessor compares the parameter values of the coin's impact sound with pre-stored parameter values of a genuine coin's impact sound and, within a certain error tolerance, decides whether the coin is genuine.
When a counterfeit coin closely resembles a genuine one, however, the impact sounds do not necessarily differ noticeably, and the waveforms of the sound wave signals in particular can be nearly identical. A conventional sound recognition device cannot distinguish such a close counterfeit by frequency, amplitude, and wavelength alone, so magnetic coil sensing is also required; otherwise counterfeit coins would inevitably be accepted by the coin-operated device.
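The comparison in step S40 can be pictured as a simple tolerance check. The sketch below is illustrative only: the parameter names, the stored values, and the 2% tolerance are assumptions and do not come from the cited patent. It shows why a close counterfeit can slip through when only a few waveform parameters are compared.

```python
# Hypothetical sketch of the conventional check in step S40.
# All names and numbers below are illustrative assumptions.
GENUINE = {"frequency_hz": 9200.0, "amplitude": 0.80, "wavelength_m": 0.037}

def within_tolerance(measured, reference, rel_tol=0.02):
    """Return True if every parameter is within rel_tol of its reference value."""
    return all(
        abs(measured[key] - ref) <= rel_tol * abs(ref)
        for key, ref in reference.items()
    )

# A counterfeit whose impact sound is very close to the genuine coin's.
close_fake = {"frequency_hz": 9250.0, "amplitude": 0.805, "wavelength_m": 0.0368}
print(within_tolerance(close_fake, GENUINE))  # the close counterfeit also passes
```

Tightening `rel_tol` would reject the counterfeit but would also start rejecting genuine coins whose sound varies slightly, which is the limitation the description points out.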
In view of this, a sound wave identification system and method are proposed. Besides avoiding misjudgment caused by sound masking, the present invention provides more accurate sound wave identification than conventional speech recognition and object identification, and thus lowers the rate of misjudgment. The present invention further uses psychoacoustics as the core of its sound wave analysis, so that it can identify sound wave signals that differ only slightly.
One object of the present invention is to provide a sound wave identification system and method that use psychoacoustics as the basis for analyzing a sound wave signal, so that the signal is accurately divided for identification.
The present invention provides a sound wave identification system and method. In the system, a sound wave capture unit captures a sound wave signal containing a plurality of frequency bands and passes it to a filter module, which filters the signal according to those bands to produce the sound wave signal corresponding to each band. An analysis unit analyzes the band-wise signal according to psychoacoustic principles to produce sound wave analysis data. Because the analysis is based on psychoacoustics, the analyzed range of the sound wave analysis data covers 20 Hz to 20 kHz, the range currently used for accurate sound wave analysis, so an identification unit can use this data to identify the source that produced the sound wave signal. The present invention can therefore be used to accurately identify objects, the operating state of machinery, and speech.
To give the examiners a further understanding of the structural features of the present invention and the effects it achieves, preferred embodiments are described in detail below with reference to the drawings:
80: Sound wave identification system
82: Sound wave capture unit
84: Conversion module
842: Conversion unit
86: Filter module
862: Filter
88: Analysis unit
90: Identification unit
92: Database
The first figure is a flow chart of conventional sound recognition; the second figure is a block diagram of a preferred embodiment of the present invention; the third figure is a flow chart of a preferred embodiment of the present invention; the fourth figure is a schematic diagram of the conversion and filtering step of a preferred embodiment of the present invention; and the fifth figure is a flow chart of the analysis step of a preferred embodiment of the present invention.
Please refer to the second figure, a block diagram of a preferred embodiment of the present invention. As shown, the sound wave identification system 80 comprises a sound wave capture unit 82, a conversion module 84, a filter module 86, an analysis unit 88, and an identification unit 90. The sound wave capture unit 82 captures a sound wave signal having a plurality of frequency bands from a sound source, for example the sound of fabric being rubbed. The conversion module 84 receives the captured signal and converts it from the time domain to the frequency domain. The filter module 86 filters the frequency-domain signal according to the frequency bands. Because the filter module 86 follows psychoacoustics, in this embodiment it contains 24 filters 862 corresponding to the 24 bands analyzed by the human ear. The bands span 20 Hz to 20 kHz but may be narrowed to 600 Hz to 16 kHz, the effective hearing range of most people. The analysis unit 88 analyzes the sound wave signal according to psychoacoustics: it analyzes the signal in each of the 24 bands separately, that is, it analyzes one by one the filtering results of the filter module 86 over 20 Hz to 20 kHz, and produces sound wave analysis data.
The identification unit 90 receives the sound wave analysis data produced by the analysis unit 88 and compares it against sound wave reference data to identify the sound wave signal. The identification unit 90 performs the comparison with an artificial neural network; because comparing data with a neural network is a mature technique, it is not described further here. The sound wave identification system 80 further comprises a database 92 that stores the sound wave reference data, which the identification unit 90 reads and compares with the analysis data produced by the analysis unit 88. The reference data is prepared before any object is identified: the sound wave capture unit 82 captures the sound or speech emitted by a reference object, the conversion module 84 and the filter module 86 convert and filter it, and the analysis unit 88 derives the reference data, which is stored in the database 92. It is then used to identify a sound source, for example to identify a fabric and confirm its authenticity from its sound wave analysis data. The identification unit 90 can also file sound wave analysis data in the database 92 for more accurate identification later. The sound wave identification system 80 can be used in a computer system or implemented on a single chip such as a field-programmable gate array (FPGA).
Please refer to the third figure, a flow chart of a preferred embodiment of the present invention. As shown, the sound wave identification method comprises: step S100: capture a sound wave signal; step S200: convert and filter the sound wave signal; step S300: analyze the sound wave signal corresponding to the frequency bands to produce sound wave analysis data; and step S400: compare the sound wave analysis data against sound wave reference data to identify the sound wave signal.
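In step S100, the sound wave signal of the sound source, denoted X(n), is captured, for example the sound of fabric being rubbed, of an object or machine in operation, or of speech. In step S200, the signal captured in step S100 is converted, and the converted signal is filtered into the 24 psychoacoustic frequency bands. As shown in the third figure, the conversion changes the signal format from the time domain to the frequency domain. This embodiment uses the fast Fourier transform (FFT) for the conversion, performed by the conversion module 84 shown in the fourth figure, which applies the FFT for each of the 24 bands. Applying the discrete Fourier transform to a discrete time-domain signal X(n) gives Equation 1:
$$X(e^{j\omega}) = \sum_{n=-\infty}^{\infty} X(n)\, e^{-j\omega n} \tag{1}$$
where ω is the angular frequency and X(e^{jω}) is the transformed frequency-domain function. The present invention is not limited to this; other time-domain to frequency-domain algorithms, such as the Laplace transform or the Z-transform, may also be used. The filtering part of step S200 then filters the converted sound wave signal according to the 24 psychoacoustic bands, using the filter module 86 and its filters 862 shown in the fourth figure. The 24 bands are obtained from the conversion formula of Equation 2:
[Equation 2 is not reproduced in this text.]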
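For a finite block of samples, the transform of Equation 1 reduces to the discrete Fourier transform, which an FFT computes efficiently. The sketch below uses a naive DFT purely for readability. Because Equation 2 is not available in this text, the frequency-to-band mapping shown is the widely used Zwicker-Terhardt approximation of the critical-band rate, taken here as an assumed stand-in rather than the patent's own formula; the function names are hypothetical.

```python
import cmath
import math

def dft(samples):
    """Naive discrete Fourier transform of a finite signal (O(N^2), for illustration)."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        for k in range(n)
    ]

def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of the critical-band rate (assumed stand-in for Equation 2)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)

# A 1 kHz tone sampled at 8 kHz: bin spacing is 8000 / 64 = 125 Hz, so the peak is bin 8.
fs, n = 8000, 64
tone = [math.sin(2 * math.pi * 1000 * i / fs) for i in range(n)]
spectrum = dft(tone)
peak_bin = max(range(n // 2), key=lambda k: abs(spectrum[k]))
print(peak_bin, peak_bin * fs / n)   # bin index and its frequency in Hz
print(round(hz_to_bark(1000.0), 2))  # critical-band rate of 1 kHz, in Bark
```

In practice an FFT routine would replace `dft`; the result is the same, only faster.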
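The bands B(f) are the bands distinguished by the auditory perception of the human ear and are used in computing values such as specific loudness, sharpness, and roughness. When filtering the sound wave signal, the bandwidth of each band B(f) can be set by building band-pass filters on bands one-third octave wide. The transfer function H(s) of such a band-pass filter can be expressed as Equation 3:
[Equation 3 is not reproduced in this text.]
where s is the Laplace operator and Q∞ is the band factor. The band corresponding to each one-third octave is given by Equation 4:
[Equation 4 is not reproduced in this text.]
where f_li to f_ui denotes the range of frequencies covered by each of the above filters, which are used to filter the 24 unequal bands.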
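Equations 3 and 4 are not available in this text, so the following is only a sketch of the usual textbook forms: one-third-octave band edges placed a sixth of an octave either side of a centre frequency, and a second-order band-pass section whose quality factor is set by that bandwidth. The function names, the centre frequency, and the choice of a second-order section are assumptions.

```python
import math

def third_octave_edges(f_center_hz):
    """Lower and upper edge (f_l, f_u) of a one-third-octave band around f_center_hz."""
    half_step = 2.0 ** (1.0 / 6.0)   # one sixth of an octave on each side
    return f_center_hz / half_step, f_center_hz * half_step

def bandpass_gain(f_hz, f_center_hz, q):
    """|H(j*2*pi*f)| for H(s) = (w0/Q) s / (s^2 + (w0/Q) s + w0^2)."""
    w, w0 = 2 * math.pi * f_hz, 2 * math.pi * f_center_hz
    s = 1j * w
    return abs((w0 / q) * s / (s * s + (w0 / q) * s + w0 * w0))

f_c = 1000.0
f_l, f_u = third_octave_edges(f_c)
q = f_c / (f_u - f_l)                        # quality factor that matches the band width
print(round(f_l, 1), round(f_u, 1))          # band edges in Hz
print(round(bandpass_gain(f_c, f_c, q), 3))  # gain at the centre frequency
print(round(bandpass_gain(f_u, f_c, q), 3))  # gain at the upper edge (about -3 dB)
```

With this choice of `q`, the band edges coincide with the filter's half-power points, which is one common way to tie a filter to a band.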
In step S300, the signal values X(f) of the sound wave signal corresponding to the frequency bands, obtained in step S200, are analyzed. As shown in the fifth figure, step S300 further comprises: step S310: extract the loudness; step S320: perform the critical-band computation and the masking-effect evaluation; and step S330: produce the sound wave analysis data.
In step S310, the spectral energy values are computed and extracted from the spectrum obtained by the Fourier transform in step S200, using Equation 5:
$$Y(f) = 10 \log_{10}\left\{ |X(f)|^{2} \right\} \tag{5}$$
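Equation 5 can be applied directly to each complex spectral value. In the sketch below, the function name and the small floor that guards against taking the logarithm of zero are assumptions added for robustness; they are not part of the description.

```python
import math

def spectral_energy_db(x_f, floor=1e-20):
    """Equation 5: Y(f) = 10 * log10(|X(f)|^2) for one spectral value X(f).

    The floor is an assumed guard so that an exactly zero bin does not raise an error.
    """
    return 10.0 * math.log10(max(abs(x_f) ** 2, floor))

print(round(spectral_energy_db(3 + 4j), 4))  # |X| = 5, so this is 10 * log10(25)
```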
The sound-pressure conversion that gives the loudness energy value is obtained from Equation 6:
[Equation 6 is not reproduced in this text.]
Equation 6 yields the loudness energy value L(f). Substituting L(f) into Equation 7 gives, for each band, the sum of the loudness energy values L(f) within that band:
[Equation 7 is not reproduced in this text.]
Here the energy level L(i) is formed from the energy values Y(f) at the frequencies (f_li to f_ui) of the corresponding band B(f); that is, it is the sum of the energy contained at every frequency within that unequal band B(f). An energy level L(i) is thus computed for each unequal band B(f), and it represents the sound pressure that the human ear perceives in that band. In psychoacoustics, L(i) is called the excitation level.
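The band-wise summation described above can be sketched as follows. Equation 7 itself is not available in this text, so the sketch assumes the common practice of converting decibel values to linear power, summing the bins that fall inside the band, and converting back to decibels. The function name and sample values are hypothetical.

```python
import math

def band_level_db(freqs_hz, levels_db, f_low_hz, f_high_hz):
    """Sum the energy of all bins inside [f_low_hz, f_high_hz] and return it in dB."""
    power = sum(
        10.0 ** (level / 10.0)
        for f, level in zip(freqs_hz, levels_db)
        if f_low_hz <= f <= f_high_hz
    )
    return 10.0 * math.log10(power) if power > 0 else float("-inf")

freqs = [900.0, 1000.0, 1100.0, 2000.0]
levels = [60.0, 60.0, 60.0, 80.0]
# Three equal 60 dB bins fall inside the band, so the level is 60 + 10 * log10(3).
print(round(band_level_db(freqs, levels, 890.9, 1122.5), 2))
```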
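In step S320, the loudness energy levels obtained in step S310 are used to compute the specific loudness of each band.
First, the threshold in quiet (the minimum audible level, LTq) is obtained from Equation 8:
[Equation 8 is not reproduced in this text.]
Specific loudness is computed by taking the critical-band masking value corresponding to each energy level L(i) and from it deriving the critical-band masking equation L_E. Because L_E is a conventional masking-effect evaluation equation, it is not described further here. In Equation 9, the sone is the unit of the loudness scale and the Bark is the unit of the critical band. After comparison with LTq, the lowest level the human ear can hear, substituting LTq from Equation 8 and the critical-band excitation L_E into Equation 9 gives the specific loudness operator N':
[Equation 9 is not reproduced in this text.]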
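Since Equations 8 and 9 are not available in this text, the sketch below substitutes two commonly quoted approximations: Terhardt's formula for the threshold in quiet and a Zwicker-style expression for specific loudness in sone/Bark. Both are assumptions that may differ from the patent's own equations, and the masking computation behind L_E is not modelled; the excitation level is simply passed in.

```python
import math

def threshold_in_quiet_db(f_hz):
    """Terhardt's approximation of the threshold in quiet, in dB SPL (assumed form)."""
    khz = f_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)

def specific_loudness(excitation_db, ltq_db):
    """Zwicker-style specific loudness N' in sone/Bark; zero at or below threshold."""
    e_tq = 10.0 ** (ltq_db / 10.0)       # threshold excitation, linear
    e = 10.0 ** (excitation_db / 10.0)   # band excitation, linear
    n_prime = 0.08 * e_tq ** 0.23 * ((0.5 + 0.5 * e / e_tq) ** 0.23 - 1.0)
    return max(n_prime, 0.0)

ltq = threshold_in_quiet_db(1000.0)
print(round(ltq, 2))                           # threshold near 1 kHz, in dB SPL
print(specific_loudness(60.0, ltq) > specific_loudness(40.0, ltq))
```

The clamp to zero reflects the idea that a band whose excitation does not exceed the threshold in quiet contributes no loudness.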
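In step S330, the specific loudness operators N' for the 24 bands obtained in step S320 are used to produce the sound wave analysis data, which in this embodiment comprises the specific loudness value L, the sharpness value S, and the roughness value R. The total specific loudness L is obtained from Equation 10 by integrating the operator N' obtained in step S320:
[Equation 10 is not reproduced in this text.]
where L is the total specific loudness, used as the result of loudness identification.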
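With one N' value per band, the integration over the bands can be approximated by a sum. The sketch assumes 24 bands of equal width 1 Bark and hypothetical N' values; Equation 10 itself is not available in this text.

```python
def total_loudness(n_prime_per_band, band_width_bark=1.0):
    """Approximate the integral of N' over the critical-band rate by a Riemann sum."""
    return sum(n * band_width_bark for n in n_prime_per_band)

n_prime = [0.5] * 24             # hypothetical flat specific loudness, sone/Bark
print(total_loudness(n_prime))   # 24 bands * 0.5 sone/Bark * 1 Bark
```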
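Once the specific loudness operator N' of each band and the specific loudness value L have been obtained, a lookup table gives the weighting function g(z) that corresponds to each N'. The weighted loudness is integrated and divided by the loudness to give the sharpness S according to Equation 12. The lookup table for g(z) is conventional and is not described further here.
Equation 12 is as follows:
[Equation 12 is not reproduced in this text.]
where S is the sharpness, used as the identification value for the sharpness of the sound wave.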
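The sharpness calculation can be sketched as a weighted first moment of N' divided by the total loudness. Equation 12 and the g(z) lookup table are not available in this text, so the weighting function and the 0.11 scale factor below follow a commonly quoted Zwicker-style approximation and should be read as assumptions; the band layout (24 bands, 1 Bark each) is also assumed.

```python
import math

def weight(z_bark):
    """Assumed weighting g(z): flat up to 16 Bark, rising above (illustrative only)."""
    return 1.0 if z_bark <= 16.0 else 0.066 * math.exp(0.171 * z_bark)

def sharpness(n_prime_per_band, band_width_bark=1.0, scale=0.11):
    """Weighted first moment of N' over z, divided by the total loudness."""
    centers = [(i + 0.5) * band_width_bark for i in range(len(n_prime_per_band))]
    total = sum(n * band_width_bark for n in n_prime_per_band)
    if total == 0:
        return 0.0
    moment = sum(n * weight(z) * z * band_width_bark
                 for n, z in zip(n_prime_per_band, centers))
    return scale * moment / total

low = [1.0] * 8 + [0.0] * 16     # loudness concentrated in the low bands
high = [0.0] * 16 + [1.0] * 8    # loudness concentrated in the high bands
print(sharpness(low) < sharpness(high))  # high-band loudness gives a larger S
```

Dividing by the total loudness makes S depend on where the loudness sits along the band axis rather than on how loud the sound is overall.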
In addition, this embodiment can obtain the roughness R by substituting the specific loudness operator N' into Equation 13:
[Equation 13 is not reproduced in this text.]
As Equation 14 shows, Equation 13 computes the roughness R from the change in energy between the specific loudness operators N'; Δf_B is the critical-band interval of each corresponding band, and the value of R is used to identify the roughness of the sound wave.
In step S400, the sound wave analysis data obtained in step S330 is used for identification. The sound wave reference data stored in the database 92 is compared against the specific loudness value L, sharpness value S, and roughness value R contained in the analysis data. A neural network identifies the sound source of the signal captured in step S100 by finding the reference values that most closely approximate L, S, and R, and the comparison result reveals the state of the sound source. The present invention can therefore be used to identify objects, machine operating states, and speech. Object identification includes, for example, identifying fabric or checking the authenticity of coins; machine operating state identification includes, for example, identifying engine speed; and speech applications include voice-based access control and using voice in place of a key for switch control.
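The comparison in step S400 can be pictured with a much simpler stand-in than a neural network. The sketch below uses nearest-neighbour matching on the (L, S, R) triple; this is a deliberately swapped-in technique for illustration, not the neural network the description calls for, and the reference table with its labels and values is entirely hypothetical.

```python
import math

# Hypothetical reference data: (loudness L, sharpness S, roughness R) per sound source.
REFERENCE = {
    "cotton": (12.0, 1.1, 0.20),
    "silk": (9.5, 1.6, 0.12),
    "denim": (15.0, 0.9, 0.35),
}

def identify(features, reference=REFERENCE):
    """Return the label whose reference (L, S, R) is closest in Euclidean distance."""
    return min(reference, key=lambda label: math.dist(features, reference[label]))

print(identify((9.8, 1.5, 0.10)))  # closest to the "silk" entry
```

Because L, S, and R have different scales, a practical matcher would normalize each feature first; a trained classifier such as the neural network mentioned above learns that weighting from the reference data instead.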
As described above, the embodiment performs identification using the psychoacoustic specific loudness value L, sharpness value S, and roughness value R. Psychoacoustics also includes time-varying loudness, tonality, and fluctuation strength, so the present invention may also use these as the basis for identification. Time-varying loudness is loudness that changes over time, that is, transient loudness, and it is obtained in the same way as the specific loudness above. Tonality is the tonal difference that the human ear resolves in different frequency bands and that reflects the sensation of pitch. Fluctuation strength is the amount by which the audio fluctuates over time. In addition, the sound wave reference data is obtained before the method is executed, by applying steps S100 to S300 to a reference sound wave signal captured from a reference object or speech; the resulting reference data is stored in the database 92.
In summary, the present invention is a sound wave identification system and method in which a sound wave capture unit captures a sound wave signal and passes it to a filter module, which filters out the sound wave signals corresponding to the different frequency bands. These band-wise signals are analyzed to obtain sound wave analysis data, which is compared against sound wave reference data to identify objects such as fabric, wooden boards, or metal, or to identify machine operating sounds or speech.
Although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit it. Anyone skilled in the art may make minor changes and refinements without departing from the spirit and scope of the invention, so the scope of protection is defined by the appended claims.
Claims (7)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW099115647A TWI500024B (en) | 2010-05-17 | 2010-05-17 | Sound wave identification system and its method |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW099115647A TWI500024B (en) | 2010-05-17 | 2010-05-17 | Sound wave identification system and its method |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| TW201142820A (en) | 2011-12-01 |
| TWI500024B (en) | 2015-09-11 |
Family
ID=46765163
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW099115647A TWI500024B (en) | 2010-05-17 | 2010-05-17 | Sound wave identification system and its method |
Country Status (1)
| Country | Link |
|---|---|
| TW (1) | TWI500024B (en) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| TW201508303A (en) * | 2013-08-16 | 2015-03-01 | yong-yu Xu | Sonar type object-seeking system and implementation method thereof |
| CN104422930A (en) * | 2013-08-30 | 2015-03-18 | 许永裕 | Sonar type object searching system and implementation method thereof |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| TW442773B (en) * | 1998-08-04 | 2001-06-23 | Sony Electronics Inc | System and method for implementing a refined psycho-acoustic modeler |
| JP2008191659A (en) * | 2007-01-12 | 2008-08-21 | Sony Corp | Speech enhancement method and speech reproduction system |
| CN101321387A (en) * | 2008-07-10 | 2008-12-10 | 中国移动通信集团广东有限公司 | Voiceprint recognition method and system based on communication system |
Also Published As
| Publication number | Publication date |
|---|---|
| TW201142820A (en) | 2011-12-01 |