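WO2009145508A3 - System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands - Google Patents

System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands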

Info

Publication number
WO2009145508A3
WO2009145508A3 PCT/KR2009/002118 KR2009002118W WO2009145508A3 WO 2009145508 A3 WO2009145508 A3 WO 2009145508A3 KR 2009002118 W KR2009002118 W KR 2009002118W WO 2009145508 A3 WO2009145508 A3 WO 2009145508A3
Authority
WO
WIPO (PCT)
Prior art keywords
speech
noisy environment
call
recognition
real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/KR2009/002118
Other languages
English (en)
French (fr)
Other versions
WO2009145508A2 (ko)
Inventor
정희석
진세훈
노태영
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
KoreaPowerVoice Co Ltd
Original Assignee
KoreaPowerVoice Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Family has litigation
First worldwide family litigation filed: https://patents.darts-ip.com/?family=41377742&utm_source=google_patent&utm_medium=platform_link&utm_campaign=public_patent_search&patent=WO2009145508(A3)
"Global patent litigation dataset" by Darts-ip is licensed under a Creative Commons Attribution 4.0 International License.
Application filed by KoreaPowerVoice Co Ltd
Priority to US12/863,437 (US8275616B2)
Publication of WO2009145508A2
Publication of WO2009145508A3
Anticipated expiration
Priority to US13/591,479 (US8930196B2)
Ceased (current legal status)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Telephonic Communication Services (AREA)
  • Sub-Exchange Stations And Push-Button Telephones (AREA)

Abstract

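The present invention relates to a continuous speech recognition system that is highly robust to noisy environments. To perform continuous speech recognition smoothly in a noisy environment, the system selects a call command, builds a minimal token-based recognition network consisting of the call command and silence intervals that include noise, continuously performs real-time speech recognition on the input speech, continuously analyzes the confidence of the recognition result, and then recognizes the speaker's speech that follows. The system for speech interval detection and continuous speech recognition using real-time call command recognition according to the invention is characterized in that, when the speaker utters the call command, the system recognizes the call command and measures its confidence, and at the moment the call command is recognized, it passes the speech interval uttered immediately after the call command to a continuous speech recognition engine, thereby recognizing the speaker's speech.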
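The flow described in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `SpotResult` type, the `gate_by_call_command` function, the `"call"`/`"silence"` labels, the 0.7 confidence threshold, and the `max_follow` cutoff are all hypothetical names and values chosen for illustration, and the toy spotter stands in for a real recognizer running the minimal call-command/silence network.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class SpotResult:
    label: str         # hypothetical labels: "call" or "silence"
    confidence: float  # hypothetical score in [0.0, 1.0]


def gate_by_call_command(
    frames: Iterable[str],
    spot: Callable[[str], SpotResult],
    threshold: float = 0.7,   # assumed value, not from the source
    max_follow: int = 50,     # assumed cutoff, not from the source
) -> List[str]:
    """Return the frames that follow the first confidently recognized call command.

    Before the trigger, every frame is scored by the small spotter only.
    After the trigger, frames are collected for the continuous recognizer.
    """
    forwarded: List[str] = []
    triggered = False
    for frame in frames:
        if not triggered:
            result = spot(frame)
            # Accept the call command only if its confidence is high enough.
            if result.label == "call" and result.confidence >= threshold:
                triggered = True
            continue  # the call command itself is not forwarded
        if len(forwarded) >= max_follow:
            break
        forwarded.append(frame)
    return forwarded


def toy_spot(frame: str) -> SpotResult:
    """Toy stand-in for a keyword spotter; real systems score audio frames."""
    if frame == "hey":
        return SpotResult("call", 0.9)
    if frame == "hay":  # sounds similar, so low confidence
        return SpotResult("call", 0.4)
    return SpotResult("silence", 0.8)


stream = ["noise", "hay", "noise", "hey", "turn", "on", "light"]
print(gate_by_call_command(stream, toy_spot))  # prints ['turn', 'on', 'light']
```

In this sketch the low-confidence "hay" is rejected, so only the frames after the confident "hey" would reach the continuous speech recognition engine. How the end of the forwarded interval is decided is not stated in the abstract; the fixed `max_follow` limit is only a placeholder.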
PCT/KR2009/002118 2008-05-28 2009-04-22 System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands Ceased WO2009145508A2 (ko)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US12/863,437 US8275616B2 (en) 2008-05-28 2009-04-22 System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands
US13/591,479 US8930196B2 (en) 2008-05-28 2012-08-22 System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020080049455A KR101056511B1 (ko) 2008-05-28 2008-05-28 System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands
KR10-2008-0049455 2008-05-28

Related Child Applications (2)

Application Number Title Priority Date Filing Date
US12/863,437 A-371-Of-International US8275616B2 (en) 2008-05-28 2009-04-22 System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands
US13/591,479 Continuation US8930196B2 (en) 2008-05-28 2012-08-22 System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands

Publications (2)

Publication Number Publication Date
WO2009145508A2 (ko) 2009-12-03
WO2009145508A3 (ko) 2010-01-21

Family

ID=41377742

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2009/002118 Ceased WO2009145508A2 (ko) 2008-05-28 2009-04-22 System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands

Country Status (3)

Country Link
US (2) US8275616B2 (ko)
KR (1) KR101056511B1 (ko)
WO (1) WO2009145508A2 (ko)

Families Citing this family (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010019831A1 (en) * 2008-08-14 2010-02-18 21Ct, Inc. Hidden markov model for speech processing with training method
US8725506B2 (en) * 2010-06-30 2014-05-13 Intel Corporation Speech audio processing
US9536523B2 (en) 2011-06-22 2017-01-03 Vocalzoom Systems Ltd. Method and system for identification of speech segments
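KR20130133629A (ko) 2012-05-29 2013-12-09 삼성전자주식회사 Apparatus and method for executing voice commands in an electronic device
CN102999161B (zh) * 2012-11-13 2016-03-02 科大讯飞股份有限公司 Implementation method and application of a voice wake-up module
TWI557722B (zh) * 2012-11-15 2016-11-11 緯創資通股份有限公司 Method and system for filtering out voice interference, and computer-readable recording medium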
US9110889B2 (en) * 2013-04-23 2015-08-18 Facebook, Inc. Methods and systems for generation of flexible sentences in a social networking system
US9606987B2 (en) 2013-05-06 2017-03-28 Facebook, Inc. Methods and systems for generation of a translatable sentence syntax in a social networking system
US9390708B1 (en) * 2013-05-28 2016-07-12 Amazon Technologies, Inc. Low latency and memory efficient keywork spotting
US9508345B1 (en) 2013-09-24 2016-11-29 Knowles Electronics, Llc Continuous voice sensing
US9953634B1 (en) 2013-12-17 2018-04-24 Knowles Electronics, Llc Passive training for automatic speech recognition
US9589564B2 (en) 2014-02-05 2017-03-07 Google Inc. Multiple speech locale-specific hotword classifiers for selection of a speech locale
US9437188B1 (en) 2014-03-28 2016-09-06 Knowles Electronics, Llc Buffered reprocessing for multi-microphone automatic speech recognition assist
KR102216048B1 (ko) 2014-05-20 2021-02-15 삼성전자주식회사 Apparatus and method for recognizing voice commands
US9697828B1 (en) * 2014-06-20 2017-07-04 Amazon Technologies, Inc. Keyword detection modeling using contextual and environmental information
US11676608B2 (en) 2021-04-02 2023-06-13 Google Llc Speaker verification using co-location information
US11942095B2 (en) 2014-07-18 2024-03-26 Google Llc Speaker verification using co-location information
US9257120B1 (en) 2014-07-18 2016-02-09 Google Inc. Speaker verification using co-location information
US9318107B1 (en) * 2014-10-09 2016-04-19 Google Inc. Hotword detection on multiple devices
US9812128B2 (en) 2014-10-09 2017-11-07 Google Inc. Device leadership negotiation among voice interface devices
RU2606566C2 (ru) * 2014-12-29 2017-01-10 Федеральное государственное казенное военное образовательное учреждение высшего образования "Академия Федеральной службы охраны Российской Федерации" (Академия ФСО России) Method and device for classifying noisy speech segments using polyspectral analysis
KR102323393B1 (ko) 2015-01-12 2021-11-09 삼성전자주식회사 Device and method of controlling the device
CN105869640B (zh) * 2015-01-21 2019-12-31 上海墨百意信息科技有限公司 Method and device for recognizing voice control instructions for entities in the current page
KR102371697B1 (ko) 2015-02-11 2022-03-08 삼성전자주식회사 Method for operating a voice function and electronic device supporting the same
KR101988222B1 (ko) 2015-02-12 2019-06-13 한국전자통신연구원 Apparatus and method for large-vocabulary continuous speech recognition
CN105741838B (zh) * 2016-01-20 2019-10-15 百度在线网络技术(北京)有限公司 Voice wake-up method and device
US9779735B2 (en) 2016-02-24 2017-10-03 Google Inc. Methods and systems for detecting and processing speech signals
US9972320B2 (en) 2016-08-24 2018-05-15 Google Llc Hotword detection on multiple devices
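CN106448663B (zh) * 2016-10-17 2020-10-23 海信集团有限公司 Voice wake-up method and voice interaction device
JP6616048B1 (ja) 2016-11-07 2019-12-04 グーグル エルエルシー Recorded media hotword trigger suppression
KR20180062127A (ko) 2016-11-30 2018-06-08 영남대학교 산학협력단 Multi-party wireless communication apparatus using speech recognition and method thereof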
US10559309B2 (en) 2016-12-22 2020-02-11 Google Llc Collaborative voice controlled devices
US10522137B2 (en) 2017-04-20 2019-12-31 Google Llc Multi-user authentication on a device
US10395650B2 (en) 2017-06-05 2019-08-27 Google Llc Recorded media hotword trigger suppression
US10311874B2 (en) 2017-09-01 2019-06-04 4Q Catalyst, LLC Methods and systems for voice-based programming of a voice-controlled device
US10692496B2 (en) 2018-05-22 2020-06-23 Google Llc Hotword suppression
CN110738990B (zh) * 2018-07-19 2022-03-25 南京地平线机器人技术有限公司 Method and device for recognizing speech
KR102628211B1 (ko) 2018-08-29 2024-01-23 삼성전자주식회사 Electronic apparatus and control method thereof
KR102208496B1 (ko) * 2018-10-25 2021-01-27 현대오토에버 주식회사 Artificial intelligence voice terminal device and voice service system providing services based on continuous voice commands
KR102224994B1 (ko) 2019-05-21 2021-03-08 엘지전자 주식회사 Speech recognition method and speech recognition device
KR102225001B1 (ko) 2019-05-21 2021-03-08 엘지전자 주식회사 Speech recognition method and speech recognition device
IT201900015506A1 (it) 2019-09-03 2021-03-03 St Microelectronics Srl Method of processing an electrical signal transduced from a voice signal, and corresponding electronic device, connected network of electronic devices, and computer program product
KR102685533B1 (ko) 2019-11-18 2024-07-17 삼성전자주식회사 Electronic device and method for determining abnormal noise
KR20210141115A (ko) 2020-05-15 2021-11-23 삼성전자주식회사 Method and apparatus for estimating utterance time
US11741964B2 (en) * 2020-05-27 2023-08-29 Sorenson Ip Holdings, Llc Transcription generation technique selection
CN113516967B (zh) * 2021-08-04 2024-06-25 青岛信芯微电子科技股份有限公司 Speech recognition method and device
CN113707135B (zh) * 2021-10-27 2021-12-31 成都启英泰伦科技有限公司 Acoustic model training method for high-precision continuous speech recognition
US11782877B1 (en) 2022-05-17 2023-10-10 Bank Of America Corporation Search technique for noisy logs and resulting user interfaces displaying log entries in ranked order of importance
KR102723874B1 (ko) 2024-05-07 2024-10-30 주식회사 리턴제로 Electronic device and method for determining utterance intervals based on quantized VAD scores

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020161581A1 (en) * 2001-03-28 2002-10-31 Morin Philippe R. Robust word-spotting system using an intelligibility criterion for reliable keyword detection under adverse and unknown noisy environments
US20060074651A1 (en) * 2004-09-22 2006-04-06 General Motors Corporation Adaptive confidence thresholds in telematics system speech recognition
JP2006184589A (ja) * 2004-12-28 2006-07-13 Casio Comput Co Ltd Camera device and photographing method
KR20060097895A (ko) * 2005-03-07 2006-09-18 삼성전자주식회사 User-adaptive speech recognition method and apparatus
WO2007045723A1 (en) * 2005-10-17 2007-04-26 Nokia Corporation A method and a device for speech recognition

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5832430A (en) * 1994-12-29 1998-11-03 Lucent Technologies, Inc. Devices and methods for speech recognition of vocabulary words with simultaneous detection and verification
JP3697748B2 (ja) * 1995-08-21 2005-09-21 セイコーエプソン株式会社 Terminal and speech recognition device
WO2000040377A1 (fr) * 1999-01-07 2000-07-13 Sony Corporation Machine-type apparatus, method of operating the same, and recording medium
US6463415B2 (en) * 1999-08-31 2002-10-08 Accenture Llp 69voice authentication system and method for regulating border crossing
US20030023437A1 (en) * 2001-01-27 2003-01-30 Pascale Fung System and method for context-based spontaneous speech recognition
US7016315B2 (en) * 2001-03-26 2006-03-21 Motorola, Inc. Token passing arrangement for a conference call bridge arrangement
US7203652B1 (en) * 2002-02-21 2007-04-10 Nuance Communications Method and system for improving robustness in a speech system
GB2409750B (en) * 2004-01-05 2006-03-15 Toshiba Res Europ Ltd Speech recognition system and technique
US7756709B2 (en) * 2004-02-02 2010-07-13 Applied Voice & Speech Technologies, Inc. Detection of voice inactivity within a sound stream
US20070179784A1 (en) * 2006-02-02 2007-08-02 Queensland University Of Technology Dynamic match lattice spotting for indexing speech content
US7966183B1 (en) * 2006-05-04 2011-06-21 Texas Instruments Incorporated Multiplying confidence scores for utterance verification in a mobile telephone
KR101450188B1 (ko) * 2006-08-09 2014-10-14 삼성전자주식회사 Apparatus and method for voice control of a portable terminal
US20080154870A1 (en) * 2006-12-26 2008-06-26 Voice Signal Technologies, Inc. Collection and use of side information in voice-mediated mobile search
KR101393023B1 (ko) * 2007-03-29 2014-05-12 엘지전자 주식회사 Mobile communication terminal and speech recognition user interface method thereof
US8620658B2 (en) * 2007-04-16 2013-12-31 Sony Corporation Voice chat system, information processing apparatus, speech recognition method, keyword data electrode detection method, and program for speech recognition

Also Published As

Publication number Publication date
WO2009145508A2 (ko) 2009-12-03
KR101056511B1 (ko) 2011-08-11
US8930196B2 (en) 2015-01-06
KR20090123396A (ko) 2009-12-02
US20110054892A1 (en) 2011-03-03
US8275616B2 (en) 2012-09-25
US20120316879A1 (en) 2012-12-13

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
    Ref document number: 09754973; Country of ref document: EP; Kind code of ref document: A2
NENP Non-entry into the national phase
    Ref country code: DE
122 EP: PCT application non-entry in European phase
    Ref document number: 09754973; Country of ref document: EP; Kind code of ref document: A2