ATE341808T1 - Method and apparatus for robust speech classification - Google Patents

Method and apparatus for robust speech classification

Info

Publication number
ATE341808T1
Authority
AT
Austria
Prior art keywords
speech
classification
parameters
classifier
bit rate
Application number
AT01984988T
Inventor
Pengjun Huang
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2000-12-08
Filing date
2001-12-04
Publication date
2006-10-15
Application filed by Qualcomm Inc
Application granted
Publication of ATE341808T1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/93 Discriminating between voiced and unvoiced parts of speech signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/22 Mode decision, i.e. based on audio signal content versus external parameters
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10 Digital recording or reproducing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/022 Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
    • G10L19/025 Detection of transients or attacks for time/frequency resolution switching
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Exchange Systems With Centralized Control (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
  • Machine Translation (AREA)
AT01984988T 2000-12-08 2001-12-04 Method and apparatus for robust speech classification ATE341808T1 (de)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US09/733,740 US7472059B2 (en) 2000-12-08 2000-12-08 Method and apparatus for robust speech classification

Publications (1)

Publication Number Publication Date
ATE341808T1 (de) 2006-10-15

Family

ID=24948935

Family Applications (1)

Application Number Priority Date Filing Date Title
AT01984988T ATE341808T1 (de) 2000-12-08 2001-12-04 Method and apparatus for robust speech classification

Country Status (12)

Country Link
US (1) US7472059B2 (de)
EP (1) EP1340223B1 (de)
JP (2) JP4550360B2 (de)
KR (2) KR100908219B1 (de)
CN (2) CN101131817B (de)
AT (1) ATE341808T1 (de)
AU (1) AU2002233983A1 (de)
BR (2) BRPI0116002B1 (de)
DE (1) DE60123651T2 (de)
ES (1) ES2276845T3 (de)
TW (1) TW535141B (de)
WO (1) WO2002047068A2 (de)

Families Citing this family (76)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6691084B2 (en) * 1998-12-21 2004-02-10 Qualcomm Incorporated Multiple mode variable rate speech coding
GB0003903D0 (en) * 2000-02-18 2000-04-05 Canon Kk Improved speech recognition accuracy in a multimodal input system
US8090577B2 (en) 2002-08-08 2012-01-03 Qualcomm Incorporated Bandwidth-adaptive quantization
US7657427B2 (en) * 2002-10-11 2010-02-02 Nokia Corporation Methods and devices for source controlled variable bit-rate wideband speech coding
US7023880B2 (en) 2002-10-28 2006-04-04 Qualcomm Incorporated Re-formatting variable-rate vocoder frames for inter-system transmissions
US7698132B2 (en) * 2002-12-17 2010-04-13 Qualcomm Incorporated Sub-sampled excitation waveform codebooks
US7613606B2 (en) * 2003-10-02 2009-11-03 Nokia Corporation Speech codecs
US7472057B2 (en) * 2003-10-17 2008-12-30 Broadcom Corporation Detector for use in voice communications systems
KR20050045764A (ko) * 2003-11-12 2005-05-17 Samsung Electronics Co., Ltd. Apparatus and method for storing/reproducing voice in a wireless terminal
US7630902B2 (en) * 2004-09-17 2009-12-08 Digital Rise Technology Co., Ltd. Apparatus and methods for digital audio coding using codebook application ranges
WO2006104576A2 (en) * 2005-03-24 2006-10-05 Mindspeed Technologies, Inc. Adaptive voice mode extension for a voice activity detector
US20060262851A1 (en) 2005-05-19 2006-11-23 Celtro Ltd. Method and system for efficient transmission of communication traffic
KR100744352B1 (ko) * 2005-08-01 2007-07-30 Samsung Electronics Co., Ltd. Method and apparatus for extracting voiced/unvoiced separation information using harmonic components of a speech signal
US20070033042A1 (en) * 2005-08-03 2007-02-08 International Business Machines Corporation Speech detection fusing multi-class acoustic-phonetic, and energy features
US7962340B2 (en) * 2005-08-22 2011-06-14 Nuance Communications, Inc. Methods and apparatus for buffering data for use in accordance with a speech recognition system
KR100735343B1 (ko) * 2006-04-11 2007-07-04 Samsung Electronics Co., Ltd. Apparatus and method for extracting pitch information from a speech signal
US8917876B2 (en) 2006-06-14 2014-12-23 Personics Holdings, LLC. Earguard monitoring system
US20080031475A1 (en) 2006-07-08 2008-02-07 Personics Holdings Inc. Personal audio assistant device and method
US8239190B2 (en) * 2006-08-22 2012-08-07 Qualcomm Incorporated Time-warping frames of wideband vocoder
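JP5096474B2 (ja) * 2006-10-10 2012-12-12 Qualcomm Incorporated Method and apparatus for encoding and decoding audio signals
KR101016224B1 (ko) * 2006-12-12 2011-02-25 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Encoder, decoder, and methods for encoding and decoding data segments representing a time-domain data stream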
WO2008095167A2 (en) 2007-02-01 2008-08-07 Personics Holdings Inc. Method and device for audio recording
US11750965B2 (en) 2007-03-07 2023-09-05 Staton Techiya, Llc Acoustic dampening compensation system
US8478587B2 (en) * 2007-03-16 2013-07-02 Panasonic Corporation Voice analysis device, voice analysis method, voice analysis program, and system integration circuit
WO2008124786A2 (en) 2007-04-09 2008-10-16 Personics Holdings Inc. Always on headwear recording system
US11217237B2 (en) 2008-04-14 2022-01-04 Staton Techiya, Llc Method and device for voice operated control
US11317202B2 (en) 2007-04-13 2022-04-26 Staton Techiya, Llc Method and device for voice operated control
US11683643B2 (en) 2007-05-04 2023-06-20 Staton Techiya Llc Method and device for in ear canal echo suppression
US11856375B2 (en) 2007-05-04 2023-12-26 Staton Techiya Llc Method and device for in-ear echo suppression
US10009677B2 (en) 2007-07-09 2018-06-26 Staton Techiya, Llc Methods and mechanisms for inflation
US8502648B2 (en) 2007-08-16 2013-08-06 Broadcom Corporation Remote-control device with directional audio system
CA2697920C (en) 2007-08-27 2018-01-02 Telefonaktiebolaget L M Ericsson (Publ) Transient detector and method for supporting encoding of an audio signal
US20090319263A1 (en) * 2008-06-20 2009-12-24 Qualcomm Incorporated Coding of transitional speech frames for low-bit-rate applications
US20090319261A1 (en) * 2008-06-20 2009-12-24 Qualcomm Incorporated Coding of transitional speech frames for low-bit-rate applications
US8768690B2 (en) 2008-06-20 2014-07-01 Qualcomm Incorporated Coding scheme selection for low-bit-rate applications
KR20100006492A (ko) * 2008-07-09 2010-01-19 Samsung Electronics Co., Ltd. Method and apparatus for determining a coding scheme
US8380498B2 (en) * 2008-09-06 2013-02-19 GH Innovation, Inc. Temporal envelope coding of energy attack signal by using attack point location
US8600067B2 (en) 2008-09-19 2013-12-03 Personics Holdings Inc. Acoustic sealing analysis system
US9129291B2 (en) 2008-09-22 2015-09-08 Personics Holdings, Llc Personalized sound management and method
FR2944640A1 (fr) * 2009-04-17 2010-10-22 France Telecom Method and device for objective evaluation of the voice quality of a speech signal, taking into account the classification of the background noise contained in the signal
US9838784B2 (en) 2009-12-02 2017-12-05 Knowles Electronics, Llc Directional audio capture
US8538035B2 (en) 2010-04-29 2013-09-17 Audience, Inc. Multi-microphone robust noise suppression
US8473287B2 (en) 2010-04-19 2013-06-25 Audience, Inc. Method for jointly optimizing noise reduction and voice quality in a mono or multi-microphone system
US8781137B1 (en) 2010-04-27 2014-07-15 Audience, Inc. Wind noise detection and suppression
WO2011145249A1 (ja) * 2010-05-17 2011-11-24 Panasonic Corporation Audio classification device, method, program, and integrated circuit
US8447596B2 (en) 2010-07-12 2013-05-21 Audience, Inc. Monaural noise suppression based on computational auditory scene analysis
US8311817B2 (en) * 2010-11-04 2012-11-13 Audience, Inc. Systems and methods for enhancing voice quality in mobile device
US12349097B2 (en) 2010-12-30 2025-07-01 St Famtech, Llc Information processing using a population of data acquisition devices
JP2012203351A (ja) * 2011-03-28 2012-10-22 Yamaha Corp Consonant identification device and program
US8990074B2 (en) * 2011-05-24 2015-03-24 Qualcomm Incorporated Noise-robust speech coding mode classification
EP2721610A1 (de) * 2011-11-25 2014-04-23 Huawei Technologies Co., Ltd. Apparatus and method for encoding an input signal
US8731911B2 (en) * 2011-12-09 2014-05-20 Microsoft Corporation Harmonicity-based single-channel speech quality estimation
WO2013136742A1 (ja) * 2012-03-14 2013-09-19 Panasonic Corporation In-vehicle communication device
CN103903633B (zh) * 2012-12-27 2017-04-12 Huawei Technologies Co., Ltd. Method and apparatus for detecting a speech signal
US9536540B2 (en) 2013-07-19 2017-01-03 Knowles Electronics, Llc Speech signal separation and synthesis based on auditory scene analysis and speech modeling
US9167082B2 (en) 2013-09-22 2015-10-20 Steven Wayne Goldstein Methods and systems for voice augmented caller ID / ring tone alias
US10043534B2 (en) 2013-12-23 2018-08-07 Staton Techiya, Llc Method and device for spectral expansion for an audio signal
EP2922056A1 (de) * 2014-03-19 2015-09-23 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus, method, and corresponding computer program for generating an error concealment signal using power compensation
CN105374367B (zh) 2014-07-29 2019-04-05 Huawei Technologies Co., Ltd. Abnormal frame detection method and apparatus
WO2016040885A1 (en) 2014-09-12 2016-03-17 Audience, Inc. Systems and methods for restoration of speech components
US10163453B2 (en) 2014-10-24 2018-12-25 Staton Techiya, Llc Robust voice activity detector system for use with an earphone
US9886963B2 (en) 2015-04-05 2018-02-06 Qualcomm Incorporated Encoder selection
US12268523B2 (en) 2015-05-08 2025-04-08 ST R&DTech LLC Biometric, physiological or environmental monitoring using a closed chamber
KR102446392B1 (ko) * 2015-09-23 2022-09-23 Samsung Electronics Co., Ltd. Electronic device and method capable of voice recognition
US10616693B2 (en) 2016-01-22 2020-04-07 Staton Techiya Llc System and method for efficiency among devices
US9820042B1 (en) 2016-05-02 2017-11-14 Knowles Electronics, Llc Stereo separation and directional suppression with omni-directional microphones
EP3324406A1 (de) 2016-11-17 2018-05-23 Fraunhofer Gesellschaft zur Förderung der Angewand Apparatus and method for decomposing an audio signal using a variable threshold
EP3324407A1 (de) * 2016-11-17 2018-05-23 Fraunhofer Gesellschaft zur Förderung der Angewand Apparatus and method for decomposing an audio signal using a ratio as a characteristic
US20180174574A1 (en) * 2016-12-19 2018-06-21 Knowles Electronics, Llc Methods and systems for reducing false alarms in keyword detection
KR20180111271A (ko) * 2017-03-31 2018-10-11 Samsung Electronics Co., Ltd. Method and apparatus for removing noise using a neural network model
CN110506276B (zh) * 2017-05-19 2021-10-15 Google LLC Efficient image analysis using environment sensor data
US10817252B2 (en) 2018-03-10 2020-10-27 Staton Techiya, Llc Earphone software and hardware
US10951994B2 (en) 2018-04-04 2021-03-16 Staton Techiya, Llc Method to acquire preferred dynamic range function for speech enhancement
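CN109545192B (zh) * 2018-12-18 2022-03-08 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus for generating a model
JP7608362B2 (ja) * 2019-05-07 2025-01-06 VoiceAge Corporation Method and device for detecting an attack in a speech signal to be coded and for coding the detected attack
CN110310668A (zh) * 2019-05-21 2019-10-08 Shenzhen OneConnect Smart Technology Co., Ltd. Silence detection method, system, device, and computer-readable storage medium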

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US574906A (en) * 1897-01-12 Chain
US4281218A (en) * 1979-10-26 1981-07-28 Bell Telephone Laboratories, Incorporated Speech-nonspeech detector-classifier
JPS58143394A (ja) * 1982-02-19 1983-08-25 Hitachi, Ltd. Speech interval detection and classification system
CA2040025A1 (en) 1990-04-09 1991-10-10 Hideki Satoh Speech detection apparatus with influence of input level and noise reduced
US5680508A (en) * 1991-05-03 1997-10-21 Itt Corporation Enhancement of speech coding in background noise for low-rate speech coder
CA2483324C (en) * 1991-06-11 2008-05-06 Qualcomm Incorporated Estimation of background noise in a variable rate vocoder
FR2684226B1 (fr) * 1991-11-22 1993-12-24 Thomson Csf Voicing decision method and device for a very low bit rate vocoder
JP3277398B2 (ja) 1992-04-15 2002-04-22 Sony Corporation Voiced sound discrimination method
US5734789A (en) * 1992-06-01 1998-03-31 Hughes Electronics Voiced, unvoiced or noise modes in a CELP vocoder
IN184794B (de) * 1993-09-14 2000-09-30 British Telecomm
US5784532A (en) 1994-02-16 1998-07-21 Qualcomm Incorporated Application specific integrated circuit (ASIC) for performing rapid speech compression in a mobile telephone system
TW271524B (de) * 1994-08-05 1996-03-01 Qualcomm Inc
GB2317084B (en) 1995-04-28 2000-01-19 Northern Telecom Ltd Methods and apparatus for distinguishing speech intervals from noise intervals in audio signals
JPH09152894A (ja) 1995-11-30 1997-06-10 Denso Corp Sound/silence discriminator
EP0867856B1 (de) 1997-03-25 2005-10-26 Koninklijke Philips Electronics N.V. Method and apparatus for speech detection
JP2000010577A (ja) 1998-06-19 2000-01-14 Sony Corp Voiced/unvoiced sound determination device
JP3273599B2 (ja) 1998-06-19 2002-04-08 Oki Electric Industry Co., Ltd. Speech coding rate selector and speech coding apparatus
US6640208B1 (en) * 2000-09-12 2003-10-28 Motorola, Inc. Voiced/unvoiced speech classifier

Also Published As

Publication number Publication date
AU2002233983A1 (en) 2002-06-18
EP1340223B1 (de) 2006-10-04
WO2002047068A3 (en) 2002-08-22
ES2276845T3 (es) 2007-07-01
WO2002047068A2 (en) 2002-06-13
BR0116002A (pt) 2006-05-09
JP2004515809A (ja) 2004-05-27
CN101131817A (zh) 2008-02-27
KR100895589B1 (ko) 2009-05-06
JP2010176145A (ja) 2010-08-12
US20020111798A1 (en) 2002-08-15
CN101131817B (zh) 2013-11-06
TW535141B (en) 2003-06-01
JP4550360B2 (ja) 2010-09-22
CN100350453C (zh) 2007-11-21
DE60123651T2 (de) 2007-10-04
CN1543639A (zh) 2004-11-03
DE60123651D1 (de) 2006-11-16
HK1067444A1 (zh) 2005-04-08
JP5425682B2 (ja) 2014-02-26
KR20030061839A (ko) 2003-07-22
BRPI0116002B1 (pt) 2018-04-03
KR100908219B1 (ko) 2009-07-20
EP1340223A2 (de) 2003-09-03
KR20090026805A (ko) 2009-03-13
US7472059B2 (en) 2008-12-30

Similar Documents

Publication Publication Date Title
ATE341808T1 (de) Method and apparatus for robust speech classification
CN1815558B (zh) Low bit-rate coding of non-voiced portions of speech
Bachu et al. Separation of voiced and unvoiced using zero crossing rate and energy of the speech signal
US7266494B2 (en) Method and apparatus for identifying noise environments from noisy signals
CN101763856B (zh) Signal classification processing method, classification processing device, and encoding system
CN105122351B (zh) Speech synthesis device and speech synthesis method
CY1107233T1 (el) Apparatus and method for robust classification of audio signals, method for creating and operating an audio signal database, and computer program
US6983242B1 (en) Method for robust classification in speech coding
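DE60221645D1 (de) Method and apparatus for reducing unwanted packet generation
KR101116363B1 (ko) Method and apparatus for classifying speech signals, and method and apparatus for encoding speech signals using the same
CN1046366C (zh) Discrimination between stationary and non-stationary signals
CN1920947A (zh) Speech/music detector for low bit-rate audio coding
DE60200519D1 (de) Method and apparatus for distributed speech recognition
CN110910902A (zh) Mixed-model speech emotion recognition method and system based on ensemble learning
CN103198834B (zh) Audio signal processing method, device, and terminal
KR100291584B1 (ko) Speech waveform compression method based on similarity of the fo/f1 ratio per pitch interval
CN110580920A (zh) Method and system for sub-band voiced/unvoiced decision in a vocoder
CN118016081A (zh) Variable-rate speech coding method and system based on a speech quality grading model
CN104318931A (zh) Method for obtaining the emotional activity level of an audio file, and classification method and device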
Patil et al. Goal-Oriented Auditory Scene Recognition.
Sharifzadeh et al. Regeneration of speech in voice-loss patients
US20070192097A1 (en) Method and apparatus for detecting affects in speech
Bezerra et al. Voice production model based on phonation biophysics
KR20150131588A (ko) Electronic device and pitch generation method
ATE450942T1 (de) Method and apparatus for determining digital frames

Legal Events

Date Code Title Description
RER Ceased as to paragraph 5 lit. 3 law introducing patent treaties