ATE536611T1 - Communication device having speaker independent speech recognition - Google Patents
Communication device having speaker independent speech recognition
- Publication number
- ATE536611T1 (application AT07750697T)
- Authority
- AT
- Austria
- Prior art keywords
- communication device
- speaker
- voice recognition
- likelihood
- calculated
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/12—Speech classification or search using dynamic programming techniques, e.g. dynamic time warping [DTW]
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/26—Devices for calling a subscriber
- H04M1/27—Devices whereby a plurality of signals may be stored simultaneously
- H04M1/271—Devices whereby a plurality of signals may be stored simultaneously controlled by voice recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L2015/025—Phonemes, fenemes or fenones being the recognition units
Landscapes
- Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Signal Processing (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Telephonic Communication Services (AREA)
- Telephone Function (AREA)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US77357706P | 2006-02-14 | 2006-02-14 | |
| PCT/US2007/003876 WO2007095277A2 (en) | 2006-02-14 | 2007-02-13 | Communication device having speaker independent speech recognition |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| ATE536611T1 (de) | 2011-12-15 |
Family
ID=38328169
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AT07750697T ATE536611T1 (de) | 2006-02-14 | 2007-02-13 | Communication device having speaker independent speech recognition |
Country Status (7)
| Country | Link |
|---|---|
| US (1) | US20070203701A1 (de) |
| EP (1) | EP1994529B1 (de) |
| JP (1) | JP2009527024A (de) |
| KR (1) | KR20080107376A (de) |
| CN (1) | CN101385073A (de) |
| AT (1) | ATE536611T1 (de) |
| WO (1) | WO2007095277A2 (de) |
Families Citing this family (17)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20070225049A1 (en) * | 2006-03-23 | 2007-09-27 | Andrada Mauricio P | Voice controlled push to talk system |
| US8521235B2 (en) * | 2008-03-27 | 2013-08-27 | General Motors Llc | Address book sharing system and method for non-verbally adding address book contents using the same |
| US8515749B2 (en) * | 2009-05-20 | 2013-08-20 | Raytheon Bbn Technologies Corp. | Speech-to-speech translation |
| US8626511B2 (en) * | 2010-01-22 | 2014-01-07 | Google Inc. | Multi-dimensional disambiguation of voice commands |
| WO2013167934A1 (en) | 2012-05-07 | 2013-11-14 | Mls Multimedia S.A. | Methods and system implementing intelligent vocal name-selection from directory lists composed in non-latin alphabet languages |
| CN107620340B (zh) | 2012-07-19 | 2020-12-11 | 住友建机株式会社 | Excavator |
| US9401140B1 (en) * | 2012-08-22 | 2016-07-26 | Amazon Technologies, Inc. | Unsupervised acoustic model training |
| EP3010017A1 (de) * | 2014-10-14 | 2016-04-20 | Thomson Licensing | Method and apparatus for separating speech data from background data in audio communication |
| EP3257043B1 (de) * | 2015-02-11 | 2018-12-12 | Bang & Olufsen A/S | Speaker recognition in a multimedia system |
| KR101684554B1 (ko) * | 2015-08-20 | 2016-12-08 | 현대자동차 주식회사 | Voice dialing system and method thereof |
| EP3496090A1 (de) * | 2017-12-07 | 2019-06-12 | Thomson Licensing | Device and method for privacy-preserving voice interaction |
| JP7173049B2 (ja) * | 2018-01-10 | 2022-11-16 | ソニーグループ株式会社 | Information processing device, information processing system, information processing method, and program |
| US11410642B2 (en) * | 2019-08-16 | 2022-08-09 | Soundhound, Inc. | Method and system using phoneme embedding |
| US20220067304A1 (en) * | 2020-08-27 | 2022-03-03 | Google Llc | Energy-Based Language Models |
| JP7528277B2 (ja) * | 2020-09-02 | 2024-08-05 | グーグル エルエルシー | Conditional output generation through data density gradient estimation |
| WO2022051552A1 (en) | 2020-09-02 | 2022-03-10 | Google Llc | End-to-end speech waveform generation through data density gradient estimation |
| CN116320130A (zh) * | 2022-12-15 | 2023-06-23 | 深圳市创智成科技股份有限公司 | Speech recognition dialing method and system suitable for fixed-line telephones |
Family Cites Families (28)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4908865A (en) * | 1984-12-27 | 1990-03-13 | Texas Instruments Incorporated | Speaker independent speech recognition method and system |
| US6236964B1 (en) * | 1990-02-01 | 2001-05-22 | Canon Kabushiki Kaisha | Speech recognition apparatus and method for matching inputted speech and a word generated from stored referenced phoneme data |
| US5390278A (en) * | 1991-10-08 | 1995-02-14 | Bell Canada | Phoneme based speech recognition |
| US5353376A (en) * | 1992-03-20 | 1994-10-04 | Texas Instruments Incorporated | System and method for improved speech acquisition for hands-free voice telecommunication in a noisy environment |
| FI97919C (fi) * | 1992-06-05 | 1997-03-10 | Nokia Mobile Phones Ltd | Speech recognition method and system for a voice-controlled telephone |
| US5758021A (en) * | 1992-06-12 | 1998-05-26 | Alcatel N.V. | Speech recognition combining dynamic programming and neural network techniques |
| JP3008799B2 (ja) * | 1995-01-26 | 2000-02-14 | 日本電気株式会社 | Speech adaptation device, word speech recognition device, continuous speech recognition device, and word spotting device |
| US5675706A (en) * | 1995-03-31 | 1997-10-07 | Lucent Technologies Inc. | Vocabulary independent discriminative utterance verification for non-keyword rejection in subword based speech recognition |
| US5799276A (en) * | 1995-11-07 | 1998-08-25 | Accent Incorporated | Knowledge-based speech recognition system and methods having frame length computed based upon estimated pitch period of vocalic intervals |
| US5963903A (en) * | 1996-06-28 | 1999-10-05 | Microsoft Corporation | Method and system for dynamically adjusted training for speech recognition |
| US5930751A (en) * | 1997-05-30 | 1999-07-27 | Lucent Technologies Inc. | Method of implicit confirmation for automatic speech recognition |
| FI972723A0 (fi) * | 1997-06-24 | 1997-06-24 | Nokia Mobile Phones Ltd | Mobile communication devices |
| JP3447521B2 (ja) * | 1997-08-25 | 2003-09-16 | Necエレクトロニクス株式会社 | Speech recognition dialing device |
| KR100277105B1 (ko) * | 1998-02-27 | 2001-01-15 | 윤종용 | Apparatus and method for determining speech recognition data |
| US6321195B1 (en) * | 1998-04-28 | 2001-11-20 | Lg Electronics Inc. | Speech recognition method |
| US6389393B1 (en) * | 1998-04-28 | 2002-05-14 | Texas Instruments Incorporated | Method of adapting speech recognition models for speaker, microphone, and noisy environment |
| US6289309B1 (en) * | 1998-12-16 | 2001-09-11 | Sarnoff Corporation | Noise spectrum tracking for speech enhancement |
| US6418411B1 (en) * | 1999-03-12 | 2002-07-09 | Texas Instruments Incorporated | Method and system for adaptive speech recognition in a noisy environment |
| US6487530B1 (en) * | 1999-03-30 | 2002-11-26 | Nortel Networks Limited | Method for recognizing non-standard and standard speech by speaker independent and speaker dependent word models |
| DE10043064B4 (de) * | 2000-09-01 | 2004-07-08 | Dietmar Dr. Ruwisch | Method and device for eliminating loudspeaker interference from microphone signals |
| US7457750B2 (en) * | 2000-10-13 | 2008-11-25 | At&T Corp. | Systems and methods for dynamic re-configurable speech recognition |
| GB0028277D0 (en) * | 2000-11-20 | 2001-01-03 | Canon Kk | Speech processing system |
| FI114051B (fi) * | 2001-11-12 | 2004-07-30 | Nokia Corp | Method for compressing dictionary data |
| DE60106781T2 (de) * | 2001-12-21 | 2005-12-15 | Dr. Dietmar Ruwisch | Method and device for recognizing noisy speech signals |
| EP1369847B1 (de) | 2002-06-04 | 2008-03-12 | Intellectual Ventures Fund 21 LLC | Method and device for speech recognition |
| JP4109063B2 (ja) * | 2002-09-18 | 2008-06-25 | パイオニア株式会社 | Speech recognition device and speech recognition method |
| US20050197837A1 (en) * | 2004-03-08 | 2005-09-08 | Janne Suontausta | Enhanced multilingual speech recognition system |
| JP4551915B2 (ja) | 2007-07-03 | 2010-09-29 | ホシデン株式会社 | Compound operation type input device |
2007
- 2007-02-13: KR application KR1020087020244A, publication KR20080107376A (ko); status: not active (ceased)
- 2007-02-13: AT application AT07750697T, publication ATE536611T1 (de); status: active
- 2007-02-13: US application US11/674,424, publication US20070203701A1 (en); status: not active (abandoned)
- 2007-02-13: EP application EP07750697A, publication EP1994529B1 (de); status: not active (not in force)
- 2007-02-13: WO application PCT/US2007/003876, publication WO2007095277A2 (en); status: not active (ceased)
- 2007-02-13: JP application JP2008555320A, publication JP2009527024A (ja); status: active (pending)
- 2007-02-13: CN application CNA2007800054635A, publication CN101385073A (zh); status: active (pending)
Also Published As
| Publication number | Publication date |
|---|---|
| KR20080107376A (ko) | 2008-12-10 |
| CN101385073A (zh) | 2009-03-11 |
| JP2009527024A (ja) | 2009-07-23 |
| EP1994529B1 (de) | 2011-12-07 |
| US20070203701A1 (en) | 2007-08-30 |
| EP1994529A2 (de) | 2008-11-26 |
| WO2007095277A3 (en) | 2007-10-11 |
| WO2007095277A2 (en) | 2007-08-23 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| ATE536611T1 (de) | | Communication device having speaker independent speech recognition |
| Matassoni et al. | | Non-native children speech recognition through transfer learning |
| KR101237799B1 (ko) | | Method for improving the robustness of a context-dependent speech recognizer to environmental changes |
| TW200601263A (en) | | Apparatus and method for synthesized audible response to an utterance in speaker-independent voice recognition |
| CN109155132A (zh) | | Speaker verification method and system |
| Cohen | | Embedded speech recognition applications in mobile phones: Status, trends, and challenges |
| WO2008142836A1 (ja) | | Voice quality conversion device and voice quality conversion method |
| JPH10507536A5 (de) | | |
| KR20180084392A (ko) | | Electronic device and operating method thereof |
| NO20083580L (no) | | Speaker authentication |
| CN103095911A (zh) | | Method and system for finding a mobile phone through voice wake-up |
| US9747897B2 (en) | | Identifying substitute pronunciations |
| TW200638337A (en) | | Using a spoken utterance for disambiguation of spelling inputs into a speech recognition system |
| WO2007117814A3 (en) | | Voice signal perturbation for speech recognition |
| WO2008087934A1 (ja) | | Extended recognition dictionary learning device and speech recognition system |
| WO2006033044A3 (en) | | Method of training a robust speaker-dependent speech recognition system with speaker-dependent expressions and robust speaker-dependent speech recognition system |
| US9378735B1 (en) | | Estimating speaker-specific affine transforms for neural network based speech recognition systems |
| Kuamr et al. | | Continuous Hindi speech recognition using Gaussian mixture HMM |
| WO2007129156A3 (en) | | Soft alignment in gaussian mixture model based transformation |
| Scheffer et al. | | Content matching for short duration speaker recognition. |
| CN102651218A (zh) | | Method and device for creating voice tags |
| Doddipatla et al. | | Speaker dependent bottleneck layer training for speaker adaptation in automatic speech recognition. |
| Pramanik et al. | | Automatic speech recognition using correlation analysis |
| WO2007005098A2 (en) | | Method and apparatus for generating and updating a voice tag |
| TW200627376A (en) | | Method and apparatus for constructing Chinese new words by the input voice |