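WO2009145508A3 - System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands - Google Patents
System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands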
- Publication number
- WO2009145508A3 (PCT/KR2009/002118)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- speech
- noisy environment
- call
- recognition
- real
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Telephonic Communication Services (AREA)
- Sub-Exchange Stations And Push- Button Telephones (AREA)
Abstract
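The present invention relates to a continuous speech recognition system that is highly robust in noisy environments. To perform continuous speech recognition smoothly in a noisy environment, the system selects a call command, builds a minimal recognition network out of tokens, consisting of a silence interval that includes noise and the call command, continuously performs real-time speech recognition on the input speech, continuously analyzes the confidence of the result, and then recognizes the speech that the speaker utters next. In the system for detecting speech intervals and recognizing continuous speech through real-time call-command recognition according to the invention, when the speaker utters the call command, the system recognizes the call command and measures its confidence; at the moment the call command is recognized, the speech interval uttered immediately after the call command is passed to a continuous speech recognition engine, which recognizes the speaker's speech.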
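The gating flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `CallCommandGate`, `toy_confidence` and `gate_stream`, the sliding window of 3 frames and the 0.8 threshold are all assumptions made for illustration, and the toy scorer stands in for the token-based call-command recognizer and its confidence measure. The sketch also never closes the gate again, since the abstract does not say how the end of the utterance is detected.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional, Sequence

Frame = Sequence[float]


def toy_confidence(frames: Sequence[Frame]) -> float:
    """Hypothetical stand-in scorer: mean absolute amplitude, clipped to [0, 1]."""
    samples = [abs(s) for f in frames for s in f]
    return min(1.0, sum(samples) / len(samples)) if samples else 0.0


@dataclass
class CallCommandGate:
    """Keep scoring recent frames; open the gate once the call command is confident."""
    score: Callable[[Sequence[Frame]], float]  # assumed call-command confidence in [0, 1]
    threshold: float = 0.8                     # assumed confidence threshold
    window: int = 3                            # assumed number of recent frames to score
    recent: List[Frame] = field(default_factory=list, init=False)
    triggered: bool = field(default=False, init=False)

    def push(self, frame: Frame) -> Optional[Frame]:
        """Return the frame if it follows a detected call command, else None."""
        if self.triggered:
            return frame  # speech after the call command goes to the recognizer
        self.recent.append(frame)
        if len(self.recent) > self.window:
            self.recent.pop(0)  # slide the scoring window forward
        if len(self.recent) == self.window and self.score(self.recent) >= self.threshold:
            self.triggered = True  # call command accepted; later frames are forwarded
        return None


def gate_stream(frames: Iterable[Frame], gate: CallCommandGate) -> List[Frame]:
    """Collect the frames that would be handed to a continuous speech recognizer."""
    forwarded: List[Frame] = []
    for frame in frames:
        out = gate.push(frame)
        if out is not None:
            forwarded.append(out)
    return forwarded
```

Calling `gate_stream(frames, CallCommandGate(score=toy_confidence))` returns only the frames that arrive after the scorer first reaches the threshold; in a real system the returned frames would be streamed to the continuous speech recognition engine rather than collected in a list.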
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12/863,437 US8275616B2 (en) | 2008-05-28 | 2009-04-22 | System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands |
| US13/591,479 US8930196B2 (en) | 2008-05-28 | 2012-08-22 | System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR1020080049455A KR101056511B1 (ko) | 2008-05-28 | 2008-05-28 | System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands |
| KR10-2008-0049455 | 2008-05-28 |
Related Child Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12/863,437 A-371-Of-International US8275616B2 (en) | 2008-05-28 | 2009-04-22 | System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands |
| US13/591,479 Continuation US8930196B2 (en) | 2008-05-28 | 2012-08-22 | System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| WO2009145508A2 (ko) | 2009-12-03 |
| WO2009145508A3 (ko) | 2010-01-21 |
Family
ID=41377742
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/KR2009/002118 Ceased WO2009145508A2 (ko) | 2008-05-28 | 2009-04-22 | System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands |
Country Status (3)
| Country | Link |
|---|---|
| US (2) | US8275616B2 (ko) |
| KR (1) | KR101056511B1 (ko) |
| WO (1) | WO2009145508A2 (ko) |
Families Citing this family (49)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2010019831A1 (en) * | 2008-08-14 | 2010-02-18 | 21Ct, Inc. | Hidden markov model for speech processing with training method |
| US8725506B2 (en) * | 2010-06-30 | 2014-05-13 | Intel Corporation | Speech audio processing |
| US9536523B2 (en) | 2011-06-22 | 2017-01-03 | Vocalzoom Systems Ltd. | Method and system for identification of speech segments |
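| KR20130133629A (ko) | 2012-05-29 | 2013-12-09 | 삼성전자주식회사 | Apparatus and method for executing voice commands in an electronic device |
| CN102999161B (zh) * | 2012-11-13 | 2016-03-02 | 科大讯飞股份有限公司 | Implementation method and application of a voice wake-up module |
| TWI557722B (zh) * | 2012-11-15 | 2016-11-11 | 緯創資通股份有限公司 | Method and system for filtering out voice interference, and computer-readable recording medium |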
| US9110889B2 (en) * | 2013-04-23 | 2015-08-18 | Facebook, Inc. | Methods and systems for generation of flexible sentences in a social networking system |
| US9606987B2 (en) | 2013-05-06 | 2017-03-28 | Facebook, Inc. | Methods and systems for generation of a translatable sentence syntax in a social networking system |
| US9390708B1 (en) * | 2013-05-28 | 2016-07-12 | Amazon Technologies, Inc. | Low latency and memory efficient keywork spotting |
| US9508345B1 (en) | 2013-09-24 | 2016-11-29 | Knowles Electronics, Llc | Continuous voice sensing |
| US9953634B1 (en) | 2013-12-17 | 2018-04-24 | Knowles Electronics, Llc | Passive training for automatic speech recognition |
| US9589564B2 (en) | 2014-02-05 | 2017-03-07 | Google Inc. | Multiple speech locale-specific hotword classifiers for selection of a speech locale |
| US9437188B1 (en) | 2014-03-28 | 2016-09-06 | Knowles Electronics, Llc | Buffered reprocessing for multi-microphone automatic speech recognition assist |
| KR102216048B1 (ko) | 2014-05-20 | 2021-02-15 | 삼성전자주식회사 | Apparatus and method for recognizing voice commands |
| US9697828B1 (en) * | 2014-06-20 | 2017-07-04 | Amazon Technologies, Inc. | Keyword detection modeling using contextual and environmental information |
| US11676608B2 (en) | 2021-04-02 | 2023-06-13 | Google Llc | Speaker verification using co-location information |
| US11942095B2 (en) | 2014-07-18 | 2024-03-26 | Google Llc | Speaker verification using co-location information |
| US9257120B1 (en) | 2014-07-18 | 2016-02-09 | Google Inc. | Speaker verification using co-location information |
| US9318107B1 (en) * | 2014-10-09 | 2016-04-19 | Google Inc. | Hotword detection on multiple devices |
| US9812128B2 (en) | 2014-10-09 | 2017-11-07 | Google Inc. | Device leadership negotiation among voice interface devices |
| RU2606566C2 (ru) * | 2014-12-29 | 2017-01-10 | Федеральное государственное казенное военное образовательное учреждение высшего образования "Академия Федеральной службы охраны Российской Федерации" (Академия ФСО России) | Method and device for classifying noisy speech segments using polyspectral analysis |
| KR102323393B1 (ko) | 2015-01-12 | 2021-11-09 | 삼성전자주식회사 | Device and method of controlling the device |
| CN105869640B (zh) * | 2015-01-21 | 2019-12-31 | 上海墨百意信息科技有限公司 | Method and device for recognizing voice control instructions for entities in the current page |
| KR102371697B1 (ko) | 2015-02-11 | 2022-03-08 | 삼성전자주식회사 | Method for operating a voice function and electronic device supporting the same |
| KR101988222B1 (ko) | 2015-02-12 | 2019-06-13 | 한국전자통신연구원 | Apparatus and method for large-vocabulary continuous speech recognition |
| CN105741838B (zh) * | 2016-01-20 | 2019-10-15 | 百度在线网络技术(北京)有限公司 | Voice wake-up method and device |
| US9779735B2 (en) | 2016-02-24 | 2017-10-03 | Google Inc. | Methods and systems for detecting and processing speech signals |
| US9972320B2 (en) | 2016-08-24 | 2018-05-15 | Google Llc | Hotword detection on multiple devices |
| CN106448663B (zh) * | 2016-10-17 | 2020-10-23 | 海信集团有限公司 | Voice wake-up method and voice interaction device |
| JP6616048B1 (ja) | 2016-11-07 | 2019-12-04 | グーグル エルエルシー | Recorded media hotword trigger suppression |
| KR20180062127A (ko) | 2016-11-30 | 2018-06-08 | 영남대학교 산학협력단 | Multi-party wireless communication apparatus using speech recognition and method thereof |
| US10559309B2 (en) | 2016-12-22 | 2020-02-11 | Google Llc | Collaborative voice controlled devices |
| US10522137B2 (en) | 2017-04-20 | 2019-12-31 | Google Llc | Multi-user authentication on a device |
| US10395650B2 (en) | 2017-06-05 | 2019-08-27 | Google Llc | Recorded media hotword trigger suppression |
| US10311874B2 (en) | 2017-09-01 | 2019-06-04 | 4Q Catalyst, LLC | Methods and systems for voice-based programming of a voice-controlled device |
| US10692496B2 (en) | 2018-05-22 | 2020-06-23 | Google Llc | Hotword suppression |
| CN110738990B (zh) * | 2018-07-19 | 2022-03-25 | 南京地平线机器人技术有限公司 | Method and device for recognizing speech |
| KR102628211B1 (ko) | 2018-08-29 | 2024-01-23 | 삼성전자주식회사 | Electronic device and control method thereof |
| KR102208496B1 (ko) * | 2018-10-25 | 2021-01-27 | 현대오토에버 주식회사 | Artificial-intelligence voice terminal device and voice service system for providing services based on continuous voice commands |
| KR102224994B1 (ko) | 2019-05-21 | 2021-03-08 | 엘지전자 주식회사 | Speech recognition method and speech recognition apparatus |
| KR102225001B1 (ko) | 2019-05-21 | 2021-03-08 | 엘지전자 주식회사 | Speech recognition method and speech recognition apparatus |
| IT201900015506A1 (it) | 2019-09-03 | 2021-03-03 | St Microelectronics Srl | Method of processing an electrical signal transduced from a voice signal, and corresponding electronic device, connected network of electronic devices, and computer program product |
| KR102685533B1 (ko) | 2019-11-18 | 2024-07-17 | 삼성전자주식회사 | Electronic device and method for determining abnormal noise |
| KR20210141115A (ko) | 2020-05-15 | 2021-11-23 | 삼성전자주식회사 | Method and apparatus for estimating utterance time |
| US11741964B2 (en) * | 2020-05-27 | 2023-08-29 | Sorenson Ip Holdings, Llc | Transcription generation technique selection |
| CN113516967B (zh) * | 2021-08-04 | 2024-06-25 | 青岛信芯微电子科技股份有限公司 | Speech recognition method and device |
| CN113707135B (zh) * | 2021-10-27 | 2021-12-31 | 成都启英泰伦科技有限公司 | Acoustic model training method for high-precision continuous speech recognition |
| US11782877B1 (en) | 2022-05-17 | 2023-10-10 | Bank Of America Corporation | Search technique for noisy logs and resulting user interfaces displaying log entries in ranked order of importance |
| KR102723874B1 (ko) | 2024-05-07 | 2024-10-30 | 주식회사 리턴제로 | Electronic device and method for determining utterance intervals based on quantized VAD scores |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20020161581A1 (en) * | 2001-03-28 | 2002-10-31 | Morin Philippe R. | Robust word-spotting system using an intelligibility criterion for reliable keyword detection under adverse and unknown noisy environments |
| US20060074651A1 (en) * | 2004-09-22 | 2006-04-06 | General Motors Corporation | Adaptive confidence thresholds in telematics system speech recognition |
| JP2006184589A (ja) * | 2004-12-28 | 2006-07-13 | Casio Comput Co Ltd | Camera device and photographing method |
| KR20060097895A (ko) * | 2005-03-07 | 2006-09-18 | 삼성전자주식회사 | Method and apparatus for user-adaptive speech recognition |
| WO2007045723A1 (en) * | 2005-10-17 | 2007-04-26 | Nokia Corporation | A method and a device for speech recognition |
Family Cites Families (15)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5832430A (en) * | 1994-12-29 | 1998-11-03 | Lucent Technologies, Inc. | Devices and methods for speech recognition of vocabulary words with simultaneous detection and verification |
| JP3697748B2 (ja) * | 1995-08-21 | 2005-09-21 | セイコーエプソン株式会社 | Terminal and speech recognition device |
| WO2000040377A1 (fr) * | 1999-01-07 | 2000-07-13 | Sony Corporation | Machine apparatus, method of operating the same, and recorded medium |
| US6463415B2 (en) * | 1999-08-31 | 2002-10-08 | Accenture Llp | 69voice authentication system and method for regulating border crossing |
| US20030023437A1 (en) * | 2001-01-27 | 2003-01-30 | Pascale Fung | System and method for context-based spontaneous speech recognition |
| US7016315B2 (en) * | 2001-03-26 | 2006-03-21 | Motorola, Inc. | Token passing arrangement for a conference call bridge arrangement |
| US7203652B1 (en) * | 2002-02-21 | 2007-04-10 | Nuance Communications | Method and system for improving robustness in a speech system |
| GB2409750B (en) * | 2004-01-05 | 2006-03-15 | Toshiba Res Europ Ltd | Speech recognition system and technique |
| US7756709B2 (en) * | 2004-02-02 | 2010-07-13 | Applied Voice & Speech Technologies, Inc. | Detection of voice inactivity within a sound stream |
| US20070179784A1 (en) * | 2006-02-02 | 2007-08-02 | Queensland University Of Technology | Dynamic match lattice spotting for indexing speech content |
| US7966183B1 (en) * | 2006-05-04 | 2011-06-21 | Texas Instruments Incorporated | Multiplying confidence scores for utterance verification in a mobile telephone |
| KR101450188B1 (ko) * | 2006-08-09 | 2014-10-14 | 삼성전자주식회사 | Apparatus and method for voice control in a portable terminal |
| US20080154870A1 (en) * | 2006-12-26 | 2008-06-26 | Voice Signal Technologies, Inc. | Collection and use of side information in voice-mediated mobile search |
| KR101393023B1 (ko) * | 2007-03-29 | 2014-05-12 | 엘지전자 주식회사 | Mobile communication terminal and speech recognition user interface method thereof |
| US8620658B2 (en) * | 2007-04-16 | 2013-12-31 | Sony Corporation | Voice chat system, information processing apparatus, speech recognition method, keyword data electrode detection method, and program for speech recognition |
- 2008-05-28 KR KR1020080049455A patent/KR101056511B1/ko (Active)
- 2009-04-22 WO PCT/KR2009/002118 patent/WO2009145508A2/ko (Ceased)
- 2009-04-22 US US12/863,437 patent/US8275616B2/en (Active)
- 2012-08-22 US US13/591,479 patent/US8930196B2/en (Active)
Also Published As
| Publication number | Publication date |
|---|---|
| WO2009145508A2 (ko) | 2009-12-03 |
| KR101056511B1 (ko) | 2011-08-11 |
| US8930196B2 (en) | 2015-01-06 |
| KR20090123396A (ko) | 2009-12-02 |
| US20110054892A1 (en) | 2011-03-03 |
| US8275616B2 (en) | 2012-09-25 |
| US20120316879A1 (en) | 2012-12-13 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2009145508A3 (ko) | | System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands |
| US10019992B2 (en) | | Speech-controlled actions based on keywords and context thereof |
| US11651780B2 (en) | | Direction based end-pointing for speech recognition |
| JP2020009459A5 (ko) | | |
| DE602007004733D1 (de) | | Speaker recognition |
| WO2020256257A3 (ko) | | Joint training method and apparatus using deep-neural-network-based feature enhancement and a modified loss function for speaker recognition robust to noisy environments |
| JP5797009B2 (ja) | | Speech recognition device, robot, and speech recognition method |
| EP4236281A3 (en) | | Event-triggered hands-free multitasking for media playback |
| SG11201808360SA (en) | | Acoustic model training method, speech recognition method, apparatus, device and medium |
| WO2010117712A3 (en) | | Systems and methods for measuring speech intelligibility |
| WO2009004750A1 (ja) | | Speech recognition device |
| WO2014063104A3 (en) | | Keyword voice activation in vehicles |
| EP3874490A4 (en) | | SYSTEMS AND METHODS FOR TWO PASS SEGMENTATION AND GROUPING, AUTOMATIC SPEECH RECOGNITION AND TRANSCRIPTION GENERATION |
| WO2008144638A3 (en) | | Systems and methods of a structured grammar for a speech recognition command system |
| WO2009114069A3 (en) | | Method and system for providing interactivity based on sensor measurements |
| WO2010063660A3 (en) | | Wind noise detection method and system |
| JP2005022065A5 (ko) | | |
| IN2013DE00063A (ko) | | |
| WO2009158581A3 (en) | | System and method for spoken topic or criterion recognition in digital media and contextual advertising |
| WO2014210392A3 (en) | | Detecting self-generated wake expressions |
| WO2012036424A3 (en) | | Method and apparatus for performing microphone beamforming |
| WO2008002365A3 (en) | | Speech recognition system and method with biometric user identification |
| WO2008084476A3 (en) | | Vowel recognition system and method in speech to text applications |
| WO2012003269A3 (en) | | Speech audio processing |
| WO2010088575A3 (en) | | Patient-lifting-device controls |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 09754973; Country of ref document: EP; Kind code of ref document: A2 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: PCT application non-entry in European phase | Ref document number: 09754973; Country of ref document: EP; Kind code of ref document: A2 |