JP7407580B2 - System and method - Google Patents
System and method
- Publication number
- JP7407580B2 (application JP2019220476A)
- Authority
- JP
- Japan
- Prior art keywords
- target
- stream
- operable
- enhancement
- engine
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0316—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
- G10L21/0364—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/80—Responding to QoS
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L25/84—Detection of presence or absence of voice signals for discriminating voice from noise
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/65—Network streaming protocols, e.g. real-time transport protocol [RTP] or real-time control protocol [RTCP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers
- H04R3/005—Circuits for transducers for combining the signals of two or more microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/008—Systems employing more than two channels, e.g. quadraphonic in which the audio signals are in digital form, i.e. employing more than two discrete digital channels
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L2021/02087—Noise filtering the noise being separate speech, e.g. cocktail party
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/60—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/20—Arrangements for obtaining desired frequency or directional characteristics
- H04R1/32—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
- H04R1/40—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
- H04R1/406—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R2430/00—Signal processing covered by H04R, not provided for in its groups
- H04R2430/20—Processing of the output signals of the acoustic transducers of an array for obtaining a desired directivity characteristic
- H04R2430/23—Direction finding using a sum-delay beam-former
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/01—Multi-channel, i.e. more than two input channels, sound reproduction with two speakers wherein the multi-channel information is substantially preserved
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Quality & Reliability (AREA)
- Otolaryngology (AREA)
- General Health & Medical Sciences (AREA)
- Computer Networks & Wireless Communication (AREA)
- Circuit For Audible Band Transducer (AREA)
Claims (10)
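- A system comprising:
a target speech enhancement engine operable to analyze a multichannel audio input signal and generate a plurality of enhanced target streams;
a multi-stream target speech detection generator comprising a plurality of target speech detection engines, each operable to determine a quality and/or a confidence of the presence of specific target speech in the streams, the multi-stream target speech detection generator being operable to determine a plurality of weights for the plurality of enhanced target streams; and
a fusion subsystem operable to apply the plurality of weights to the enhanced target streams to generate a combined enhanced output signal.
- The system of claim 1, further comprising an audio sensor array operable to sense human speech and environmental noise and to generate the corresponding multichannel audio input signal.
- The system of claim 1, wherein the target speech enhancement engine comprises a plurality of speech enhancement modules, each operable to analyze the multichannel audio input signal and output one of the plurality of enhanced target streams.
- The system of claim 3, wherein the plurality of speech enhancement modules comprise an adaptive spatial filtering algorithm, a beamforming algorithm, a blind source separation algorithm, a single-channel enhancement algorithm, and/or a neural network.
- The system of claim 1, wherein the target speech detection engines comprise a Gaussian mixture model, a hidden Markov model, and/or a neural network.
- The system of claim 1, wherein each target speech detection engine is operable to produce a posterior weight correlated with the confidence that an input audio stream contains the specific target speech.
- The system of claim 6, wherein each target speech detection engine is operable to produce a higher posterior for clean speech.
- The system of claim 1, wherein the enhanced output signal is a weighted sum of the enhanced target streams.
- The system of claim 1, wherein the multi-stream target speech detection generator is further operable to determine a combined probability that specific target speech is detected in the streams, and wherein the target speech is detected when the combined probability exceeds a detection threshold.
- A method comprising:
analyzing a multichannel audio input signal using a target speech enhancement engine to generate a plurality of enhanced target streams;
determining, using a multi-stream target speech detection generator, a probability of detecting target speech in the streams;
calculating a weight for each of the plurality of enhanced target streams; and
applying the calculated weights to the plurality of enhanced target streams to generate an enhanced output signal.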
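The fusion recited in the claims (per-stream detector posteriors turned into weights, a weighted sum of the enhanced streams, and a threshold on a combined probability) can be illustrated with a minimal Python sketch. The claims do not state how the weights are normalized, how the combined probability is formed, or what the detection threshold is. The sum-to-one normalization, the weighted-average combination, the 0.5 default threshold, and every function name below are assumptions made only for illustration.

```python
from typing import List, Sequence


def normalize_weights(posteriors: Sequence[float]) -> List[float]:
    """Turn per-stream detector posteriors into weights that sum to 1.

    Assumption: weights are proportional to the posteriors.
    """
    total = sum(posteriors)
    if total <= 0.0:
        # No detector reports any confidence: fall back to equal weights.
        return [1.0 / len(posteriors)] * len(posteriors)
    return [p / total for p in posteriors]


def fuse_streams(streams: Sequence[Sequence[float]],
                 weights: Sequence[float]) -> List[float]:
    """Sample-by-sample weighted sum of equal-length enhanced streams."""
    if len(streams) != len(weights):
        raise ValueError("one weight per stream is required")
    length = len(streams[0])
    if any(len(s) != length for s in streams):
        raise ValueError("streams must have equal length")
    return [sum(w * s[n] for w, s in zip(weights, streams))
            for n in range(length)]


def combined_probability(posteriors: Sequence[float],
                         weights: Sequence[float]) -> float:
    """Assumption: the combined probability is a weighted average."""
    return sum(w * p for w, p in zip(weights, posteriors))


def is_detected(posteriors: Sequence[float],
                weights: Sequence[float],
                threshold: float = 0.5) -> bool:
    """Target speech counts as detected above the (hypothetical) threshold."""
    return combined_probability(posteriors, weights) > threshold


# Two enhanced streams; the first detector is far more confident.
posteriors = [0.9, 0.1]
weights = normalize_weights(posteriors)
fused = fuse_streams([[1.0, 2.0], [3.0, 4.0]], weights)
detected = is_detected(posteriors, weights)
```

In this sketch a stream whose detector reports a low posterior contributes little to the fused output, which is in line with the claim that cleaner speech yields a higher posterior.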
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201862776422P | 2018-12-06 | 2018-12-06 | |
| US62/776,422 | 2018-12-06 | | |
Publications (3)
| Publication Number | Publication Date |
|---|---|
| JP2020109498A (ja) | 2020-07-16 |
| JP2020109498A5 (ja) | 2022-12-08 |
| JP7407580B2 (ja) | 2024-01-04 |
Family
ID=70970205
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2019220476A (Active, published as JP7407580B2 (ja)) | 2018-12-06 | 2019-12-05 | System and method |
Country Status (3)
| Country | Link |
|---|---|
| US (2) | US11158333B2 (ja) |
| JP (1) | JP7407580B2 (ja) |
| CN (1) | CN111370014B (ja) |
Families Citing this family (23)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP7407580B2 (ja) | 2018-12-06 | 2024-01-04 | Synaptics Incorporated | System and method |
| US11048472B2 (en) | 2019-01-27 | 2021-06-29 | Listen AS | Dynamically adjustable sound parameters |
| US11126398B2 (en) * | 2019-03-13 | 2021-09-21 | Listen AS | Smart speaker |
| WO2020231151A1 (en) * | 2019-05-16 | 2020-11-19 | Samsung Electronics Co., Ltd. | Electronic device and method of controlling thereof |
| US11557307B2 (en) | 2019-10-20 | 2023-01-17 | Listen AS | User voice control system |
| US20210201928A1 (en) * | 2019-12-31 | 2021-07-01 | Knowles Electronics, Llc | Integrated speech enhancement for voice trigger application |
| US11064294B1 (en) | 2020-01-10 | 2021-07-13 | Synaptics Incorporated | Multiple-source tracking and voice activity detections for planar microphone arrays |
| EP4147459A4 (en) | 2020-05-08 | 2024-06-26 | Microsoft Technology Licensing, LLC | SYSTEM AND METHOD FOR DATA AMPLIFICATION FOR MULTI-MICROPHONE SIGNAL PROCESSING |
| US11875797B2 (en) * | 2020-07-23 | 2024-01-16 | Pozotron Inc. | Systems and methods for scripted audio production |
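| CN111916106B (zh) * | 2020-08-17 | 2021-06-15 | 牡丹江医学院 | Method for improving pronunciation quality in English teaching |
| CN112017686B (zh) * | 2020-09-18 | 2022-03-01 | 中科极限元(杭州)智能科技股份有限公司 | Multi-channel speech separation system based on gated recurrent fusion of deep embedded features |
| CN112786069B (zh) * | 2020-12-24 | 2023-03-21 | 北京有竹居网络技术有限公司 | Speech extraction method, apparatus, and electronic device |
| TWI761018B (zh) * | 2021-01-05 | 2022-04-11 | 瑞昱半導體股份有限公司 | Voice capturing method and voice capturing system |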
| EP4416924A4 (en) * | 2021-10-12 | 2025-07-16 | Qsc Llc | MULTI-SOURCE AUDIO PROCESSING SYSTEMS AND METHODS |
| US11823707B2 (en) | 2022-01-10 | 2023-11-21 | Synaptics Incorporated | Sensitivity mode for an audio spotting system |
| US12057138B2 (en) | 2022-01-10 | 2024-08-06 | Synaptics Incorporated | Cascade audio spotting system |
| CN114582323B (zh) * | 2022-03-21 | 2025-05-27 | 联想(北京)有限公司 | Speech recognition and model training method and apparatus |
| CN114724553B (zh) * | 2022-03-24 | 2025-06-10 | 中山大学 | Keyword recognition method, system, device, and storage medium |
| US20240371386A1 (en) * | 2023-05-02 | 2024-11-07 | Synaptics Incorporated | Audio source separation for multi-channel beamforming based on personal voice activity detection (vad) |
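| CN116631432B (zh) * | 2023-06-21 | 2026-03-31 | 中信银行股份有限公司 | Audio separation and script-violation reminder method, apparatus, and computer device |
| KR102795375B1 (ko) * | 2023-08-11 | 2025-04-11 | 코클 아이엔씨 | Method, apparatus, and computer program for providing sound recognition results with improved reliability |
| CN119227752A (zh) * | 2024-08-28 | 2024-12-31 | 中国科学院自动化研究所 | Multimodal spike signal recognition method and apparatus embedding biological network meta-structures |
| CN120015050B (zh) * | 2025-02-12 | 2025-11-07 | 北京科技大学 | Fusion and separation method and system based on multi-channel acoustic signals |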
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2011248025A (ja) | 2010-05-25 | 2011-12-08 | Nippon Telegr & Teleph Corp <Ntt> | Channel integration method, channel integration device, and program |
| JP2016517023A (ja) | 2013-07-18 | 2016-06-09 | 三菱電機株式会社 | Method for processing acoustic signals |
| JP2016524193A (ja) | 2013-06-27 | 2016-08-12 | Rawles LLC | Detection of self-generated wake expressions |
Family Cites Families (80)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP3484112B2 (ja) | 1999-09-27 | 2004-01-06 | 株式会社東芝 | Noise component suppression processing device and noise component suppression processing method |
| US6370500B1 (en) | 1999-09-30 | 2002-04-09 | Motorola, Inc. | Method and apparatus for non-speech activity reduction of a low bit rate digital voice message |
| AUPS270902A0 (en) | 2002-05-31 | 2002-06-20 | Canon Kabushiki Kaisha | Robust detection and classification of objects in audio using limited training data |
| CN1303582C (zh) | 2003-09-09 | 2007-03-07 | 摩托罗拉公司 | Automatic speech classification method |
| KR100754385B1 (ko) * | 2004-09-30 | 2007-08-31 | 삼성전자주식회사 | Apparatus and method for localization, tracking, and separation using audio/video sensors |
| US7464029B2 (en) | 2005-07-22 | 2008-12-09 | Qualcomm Incorporated | Robust separation of speech signals in a noisy environment |
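| JP2007047427A (ja) | 2005-08-10 | 2007-02-22 | Hitachi Ltd | Speech processing device |
| KR100821177B1 (ko) | 2006-09-29 | 2008-04-14 | 한국전자통신연구원 | Method for estimating a priori speech absence probability based on a statistical model |
| KR100964402B1 (ko) | 2006-12-14 | 2010-06-17 | 삼성전자주식회사 | Method and apparatus for determining an encoding mode of an audio signal, and method and apparatus for encoding/decoding an audio signal using the same |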
| US8005237B2 (en) | 2007-05-17 | 2011-08-23 | Microsoft Corp. | Sensor array beamformer post-processor |
| DE602008002695D1 (de) | 2008-01-17 | 2010-11-04 | Harman Becker Automotive Sys | Post-filter for a beamformer in speech processing |
| US9113240B2 (en) * | 2008-03-18 | 2015-08-18 | Qualcomm Incorporated | Speech enhancement using multiple microphones on multiple devices |
| KR20100006492A (ko) | 2008-07-09 | 2010-01-19 | 삼성전자주식회사 | Method and apparatus for determining an encoding scheme |
| EP2146519B1 (en) | 2008-07-16 | 2012-06-06 | Nuance Communications, Inc. | Beamforming pre-processing for speaker localization |
| JP2010085733A (ja) * | 2008-09-30 | 2010-04-15 | Equos Research Co Ltd | Speech enhancement system |
| US9202456B2 (en) | 2009-04-23 | 2015-12-01 | Qualcomm Incorporated | Systems, methods, apparatus, and computer-readable media for automatic control of active noise cancellation |
| US20110010172A1 (en) | 2009-07-10 | 2011-01-13 | Alon Konchitsky | Noise reduction system using a sensor based speech detector |
| US9037458B2 (en) * | 2011-02-23 | 2015-05-19 | Qualcomm Incorporated | Systems, methods, apparatus, and computer-readable media for spatially selective audio augmentation |
| CN102956230B (zh) | 2011-08-19 | 2017-03-01 | 杜比实验室特许公司 | Method and device for song detection in audio signals |
| EP2791935B1 (en) | 2011-12-12 | 2016-03-09 | Dolby Laboratories Licensing Corporation | Low complexity repetition detection in media data |
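| CN103325386B (zh) | 2012-03-23 | 2016-12-21 | 杜比实验室特许公司 | Method and system for signal transmission control |
| KR101318328B1 (ko) * | 2012-04-12 | 2013-10-15 | 경북대학교 산학협력단 | Speech enhancement method and apparatus using blind signal cancellation through minimization of sparseness characteristics |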
| US9768829B2 (en) | 2012-05-11 | 2017-09-19 | Intel Deutschland Gmbh | Methods for processing audio signals and circuit arrangements therefor |
| TWI474317B (zh) | 2012-07-06 | 2015-02-21 | Realtek Semiconductor Corp | Signal processing device and signal processing method |
| US10142007B2 (en) | 2012-07-19 | 2018-11-27 | Intel Deutschland Gmbh | Radio communication devices and methods for controlling a radio communication device |
| DK2701145T3 (en) | 2012-08-24 | 2017-01-16 | Retune DSP ApS | Noise cancellation for use with noise reduction and echo cancellation in personal communication |
| US9183849B2 (en) | 2012-12-21 | 2015-11-10 | The Nielsen Company (Us), Llc | Audio matching with semantic audio recognition and report generation |
| US9158760B2 (en) | 2012-12-21 | 2015-10-13 | The Nielsen Company (Us), Llc | Audio decoding with supplemental semantic audio recognition and report generation |
| EP2747451A1 (en) | 2012-12-21 | 2014-06-25 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Filter and method for informed spatial filtering using multiple instantaneous direction-of-arrivial estimates |
| CN104078050A (zh) | 2013-03-26 | 2014-10-01 | 杜比实验室特许公司 | Device and method for audio classification and audio processing |
| US9769576B2 (en) | 2013-04-09 | 2017-09-19 | Sonova Ag | Method and system for providing hearing assistance to a user |
| CN104217729A (zh) | 2013-05-31 | 2014-12-17 | 杜比实验室特许公司 | Audio processing method, audio processing device, and training method |
| US9240182B2 (en) | 2013-09-17 | 2016-01-19 | Qualcomm Incorporated | Method and apparatus for adjusting detection threshold for activating voice assistant function |
| GB2518663A (en) | 2013-09-27 | 2015-04-01 | Nokia Corp | Audio analysis apparatus |
| US9654894B2 (en) * | 2013-10-31 | 2017-05-16 | Conexant Systems, Inc. | Selective audio source enhancement |
| US9589560B1 (en) | 2013-12-19 | 2017-03-07 | Amazon Technologies, Inc. | Estimating false rejection rate in a detection system |
| EP2916321B1 (en) | 2014-03-07 | 2017-10-25 | Oticon A/s | Processing of a noisy audio signal to estimate target and noise spectral variances |
| US9548065B2 (en) | 2014-05-05 | 2017-01-17 | Sensory, Incorporated | Energy post qualification for phrase spotting |
| US9484022B2 (en) | 2014-05-23 | 2016-11-01 | Google Inc. | Training multiple neural networks with different accuracy |
| US9369113B2 (en) | 2014-06-20 | 2016-06-14 | Steve Yang | Impedance adjusting device |
| WO2016007528A1 (en) | 2014-07-10 | 2016-01-14 | Analog Devices Global | Low-complexity voice activity detection |
| US9432769B1 (en) | 2014-07-30 | 2016-08-30 | Amazon Technologies, Inc. | Method and system for beam selection in microphone array beamformers |
| US9953661B2 (en) | 2014-09-26 | 2018-04-24 | Cirrus Logic Inc. | Neural network voice activity detection employing running range normalization |
| US9530400B2 (en) | 2014-09-29 | 2016-12-27 | Nuance Communications, Inc. | System and method for compressed domain language identification |
| JP6450139B2 (ja) | 2014-10-10 | 2019-01-09 | 株式会社Nttドコモ | Speech recognition device, speech recognition method, and speech recognition program |
| US20160275961A1 (en) * | 2015-03-18 | 2016-09-22 | Qualcomm Technologies International, Ltd. | Structure for multi-microphone speech enhancement system |
| US9734822B1 (en) * | 2015-06-01 | 2017-08-15 | Amazon Technologies, Inc. | Feedback based beamformed signal selection |
| US10229700B2 (en) | 2015-09-24 | 2019-03-12 | Google Llc | Voice activity detection |
| US9668073B2 (en) | 2015-10-07 | 2017-05-30 | Robert Bosch Gmbh | System and method for audio scene understanding of physical object sound sources |
| US10347271B2 (en) * | 2015-12-04 | 2019-07-09 | Synaptics Incorporated | Semi-supervised system for multichannel source enhancement through configurable unsupervised adaptive transformations and supervised deep neural network |
| US9978397B2 (en) | 2015-12-22 | 2018-05-22 | Intel Corporation | Wearer voice activity detection |
| US10090005B2 (en) | 2016-03-10 | 2018-10-02 | Aspinity, Inc. | Analog voice activity detection |
| JP6480644B1 (ja) * | 2016-03-23 | 2019-03-13 | Google LLC | Adaptive audio enhancement for multichannel speech recognition |
| US9947323B2 (en) | 2016-04-01 | 2018-04-17 | Intel Corporation | Synthetic oversampling to enhance speaker identification or verification |
| US11107461B2 (en) | 2016-06-01 | 2021-08-31 | Massachusetts Institute Of Technology | Low-power automatic speech recognition device |
| US20180039478A1 (en) | 2016-08-02 | 2018-02-08 | Google Inc. | Voice interaction services |
| CN109791760A (zh) | 2016-09-30 | 2019-05-21 | 索尼公司 | Signal processing device, signal processing method, and program |
| US9741360B1 (en) * | 2016-10-09 | 2017-08-22 | Spectimbre Inc. | Speech enhancement for target speakers |
| US9881634B1 (en) * | 2016-12-01 | 2018-01-30 | Arm Limited | Multi-microphone speech processing system |
| WO2018106971A1 (en) | 2016-12-07 | 2018-06-14 | Interactive Intelligence Group, Inc. | System and method for neural network based speaker classification |
| US10546575B2 (en) | 2016-12-14 | 2020-01-28 | International Business Machines Corporation | Using recurrent neural network for partitioning of audio data into segments that each correspond to a speech feature cluster identifier |
| US10083689B2 (en) | 2016-12-23 | 2018-09-25 | Intel Corporation | Linear scoring for low power wake on voice |
| US10170134B2 (en) | 2017-02-21 | 2019-01-01 | Intel IP Corporation | Method and system of acoustic dereverberation factoring the actual non-ideal acoustic environment |
| JP6652519B2 (ja) | 2017-02-28 | 2020-02-26 | 日本電信電話株式会社 | Steering vector estimation device, steering vector estimation method, and steering vector estimation program |
| US10224053B2 (en) * | 2017-03-24 | 2019-03-05 | Hyundai Motor Company | Audio signal quality enhancement based on quantitative SNR analysis and adaptive Wiener filtering |
| US10269369B2 (en) | 2017-05-31 | 2019-04-23 | Apple Inc. | System and method of noise reduction for a mobile device |
| US10403299B2 (en) * | 2017-06-02 | 2019-09-03 | Apple Inc. | Multi-channel speech signal enhancement for robust voice trigger detection and automatic speech recognition |
| US10096328B1 (en) | 2017-10-06 | 2018-10-09 | Intel Corporation | Beamformer system for tracking of speech and noise in a dynamic environment |
| US10090000B1 (en) | 2017-11-01 | 2018-10-02 | GM Global Technology Operations LLC | Efficient echo cancellation using transfer function estimation |
| US10504539B2 (en) | 2017-12-05 | 2019-12-10 | Synaptics Incorporated | Voice activity detection systems and methods |
| US10777189B1 (en) | 2017-12-05 | 2020-09-15 | Amazon Technologies, Inc. | Dynamic wakeword detection |
| US10679617B2 (en) | 2017-12-06 | 2020-06-09 | Synaptics Incorporated | Voice enhancement in audio signals through modified generalized eigenvalue beamformer |
| CN111465981B (zh) | 2017-12-21 | 2024-05-24 | 辛纳普蒂克斯公司 | Analog voice activity detector system and method |
| US11062727B2 (en) | 2018-06-13 | 2021-07-13 | Ceva D.S.P Ltd. | System and method for voice activity detection |
| JP7407580B2 (ja) | 2018-12-06 | 2024-01-04 | Synaptics Incorporated | System and method |
| US11232788B2 (en) | 2018-12-10 | 2022-01-25 | Amazon Technologies, Inc. | Wakeword detection |
| US11069353B1 (en) | 2019-05-06 | 2021-07-20 | Amazon Technologies, Inc. | Multilingual wakeword detection |
| US11064294B1 (en) | 2020-01-10 | 2021-07-13 | Synaptics Incorporated | Multiple-source tracking and voice activity detections for planar microphone arrays |
| US11308959B2 (en) | 2020-02-11 | 2022-04-19 | Spotify Ab | Dynamic adjustment of wake word acceptance tolerance thresholds in voice-controlled devices |
| US11769520B2 (en) | 2020-08-17 | 2023-09-26 | EMC IP Holding Company LLC | Communication issue detection using evaluation of multiple machine learning models |
2019
- 2019-12-05: JP application JP2019220476A, patent JP7407580B2 (ja), status Active
- 2019-12-06: US application US16/706,519, patent US11158333B2 (en), status Active
- 2019-12-06: CN application CN201911241535.7A, patent CN111370014B (zh), status Active
2021
- 2021-09-24: US application US17/484,208, patent US11694710B2 (en), status Active
Also Published As
| Publication number | Publication date |
|---|---|
| CN111370014B (zh) | 2024-05-28 |
| US11158333B2 (en) | 2021-10-26 |
| CN111370014A (zh) | 2020-07-03 |
| JP2020109498A (ja) | 2020-07-16 |
| US11694710B2 (en) | 2023-07-04 |
| US20200184985A1 (en) | 2020-06-11 |
| US20220013134A1 (en) | 2022-01-13 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP7407580B2 (ja) | | System and method |
| JP7498560B2 (ja) | | System and method |
| CN113810825B (zh) | | Robust loudspeaker localization system and method in the presence of strong noise interference |
| CN109599124B (zh) | | Audio data processing method, apparatus, and storage medium |
| JP2021505933A (ja) | | Voice enhancement in audio signals through modified generalized eigenvalue beamformer |
| US12148441B2 (en) | | Source separation for automatic speech recognition (ASR) |
| JP7690138B2 (ja) | | Microphone-array-configuration-invariant, streaming, multichannel neural enhancement front end for automatic speech recognition |
| US12175965B2 (en) | | Method and apparatus for normalizing features extracted from audio data for signal recognition or modification |
| EP4515536A1 (en) | | Audio source feature separation and target audio source generation |
| US20210201928A1 (en) | | Integrated speech enhancement for voice trigger application |
| US20170206898A1 (en) | | Systems and methods for assisting automatic speech recognition |
| TWI920096B (zh) | | Robust loudspeaker localization system and method in the presence of strong noise interference |
| Giacobello | | An online expectation-maximization algorithm for tracking acoustic sources in multi-microphone devices during music playback |
| WO2025183943A1 (en) | | Streaming, array-agnostic, full- and sub-band modeling front-end for robust automatic speech recognition |
| HK40073191A (en) | | Normalizing features extracted from audio data for signal recognition or modification |
| HK40073191B (en) | | Normalizing features extracted from audio data for signal recognition or modification |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20221130 |
| | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20221130 |
| | A977 | Report on retrieval | Free format text: JAPANESE INTERMEDIATE CODE: A971007; Effective date: 20231024 |
| | TRDD | Decision of grant or rejection written | |
| | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20231206 |
| | A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20231219 |
| | R150 | Certificate of patent or registration of utility model | Ref document number: 7407580; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150 |









