WO2026072097A1 - Speech enhancement techniques - Google Patents
Speech enhancement techniques
Info
- Publication number
- WO2026072097A1 (PCT/US2025/024937)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- audio
- speech
- signal
- playback device
- channel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0316—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
- G10L21/0364—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R5/00—Stereophonic arrangements
- H04R5/04—Circuit arrangements, e.g. for selective connection of amplifier inputs/outputs to loudspeakers, for loudspeaker detection, or for adaptation of settings to personal preferences or hearing impairments
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/05—Generation or adaptation of centre channel in multi-channel audio systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/008—Systems employing more than two channels, e.g. quadraphonic in which the audio signals are in digital form, i.e. employing more than two discrete digital channels
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
An example method includes detecting, using a playback device, an audio signal; applying a parametric machine learning model to dynamically detect speech in the audio signal; based on detecting the speech, separating the audio signal into speech audio and non-speech audio; applying first audio processing to the speech audio to produce processed speech audio; applying second audio processing to the non-speech audio to produce processed non-speech audio, the second audio processing being different from the first audio processing; combining the processed speech audio and the processed non-speech audio to produce an audio output signal; and playing back the audio output signal via the playback device.
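The sequence of steps in the abstract (detect speech, separate, process each part differently, recombine) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the application: the frame length, the function names, the gain values, and in particular the detector, which is a toy logistic model on frame log-energy with made-up parameters standing in for whatever parametric machine learning model an implementation would actually use. The separation is likewise a simple per-frame soft mask rather than a real source-separation method.

```python
import math
from typing import List, Tuple

FRAME = 160  # hypothetical frame length in samples (10 ms at 16 kHz)


def speech_probability(frame: List[float],
                       weight: float = 1.5,
                       bias: float = 6.0) -> float:
    """Toy parametric detector: logistic function of the frame log-energy.

    The weight and bias are placeholder values, not trained parameters.
    """
    energy = sum(s * s for s in frame) / len(frame)
    log_energy = math.log(energy + 1e-12)  # small offset avoids log(0)
    return 1.0 / (1.0 + math.exp(-(weight * log_energy + bias)))


def separate(frame: List[float], p: float) -> Tuple[List[float], List[float]]:
    """Split a frame into speech and non-speech parts with a soft mask p."""
    speech = [p * s for s in frame]
    non_speech = [(1.0 - p) * s for s in frame]
    return speech, non_speech


def enhance(signal: List[float],
            speech_gain: float = 2.0,
            background_gain: float = 0.5) -> List[float]:
    """Detect, separate, process each part differently, and recombine."""
    out: List[float] = []
    for start in range(0, len(signal), FRAME):
        frame = signal[start:start + FRAME]
        p = speech_probability(frame)
        speech, non_speech = separate(frame, p)
        # First processing: boost the speech estimate.
        processed_speech = [speech_gain * s for s in speech]
        # Second, different processing: attenuate the non-speech estimate.
        processed_background = [background_gain * s for s in non_speech]
        # Combine both processed parts into the output signal.
        out.extend(a + b for a, b in zip(processed_speech, processed_background))
    return out
```

Calling `enhance(samples)` on a list of floating-point samples returns a list of the same length. Frames the toy detector scores as likely speech are scaled toward `speech_gain`, and the remaining frames toward `background_gain`; playback of the result is left out of the sketch.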
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202463700280P | 2024-09-27 | 2024-09-27 | |
| US63/700,280 | 2024-09-27 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2026072097A1 (fr) | 2026-04-02 |
Family
ID=95783837
Family Applications (1)
| Application Number | Status | Publication Number | Priority Date | Filing Date | Title |
|---|---|---|---|---|---|
| PCT/US2025/024937 | Pending | WO2026072097A1 (fr) | 2024-09-27 | 2025-04-16 | Speech enhancement techniques |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2026072097A1 (fr) |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8234395B2 (en) | 2003-07-28 | 2012-07-31 | Sonos, Inc. | System and method for synchronizing operations among a plurality of independently clocked digital data processing devices |
| US8483853B1 (en) | 2006-09-12 | 2013-07-09 | Sonos, Inc. | Controlling and manipulating groupings in a multi-zone media system |
| US10499146B2 (en) | 2016-02-22 | 2019-12-03 | Sonos, Inc. | Voice control of a media playback system |
| US10712997B2 (en) | 2016-10-17 | 2020-07-14 | Sonos, Inc. | Room association based on name |
| US20230087486A1 (en) * | 2020-05-29 | 2023-03-23 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Method and apparatus for processing an initial audio signal |
| US20240111484A1 (en) * | 2022-09-30 | 2024-04-04 | Sonos, Inc. | Techniques for Intelligent Home Theater Configuration |
Non-Patent Citations (3)
| Title |
|---|
| LEE GEON WOO ET AL: "Multi-Task Learning U-Net for Single-Channel Speech Enhancement and Mask-Based Voice Activity Detection", APPLIED SCIENCES, vol. 10, no. 9, 6 May 2020 (2020-05-06), pages 3230, XP093293003, ISSN: 2076-3417, Retrieved from the Internet <URL:https://www.mdpi.com/2076-3417/10/9/3230/pdf> [retrieved on 20250704], DOI: 10.3390/app10093230 * |
| LEE YOUNGLO ET AL: "Spectro-Temporal Attention-Based Voice Activity Detection", IEEE SIGNAL PROCESSING LETTERS, IEEE, USA, vol. 27, 13 December 2019 (2019-12-13), pages 131 - 135, XP011768664, ISSN: 1070-9908, [retrieved on 20200123], DOI: 10.1109/LSP.2019.2959917 * |
| SALEEM NASIR ET AL: "On Learning Spectral Masking for Single Channel Speech Enhancement Using Feedforward and Recurrent Neural Networks", IEEE ACCESS, IEEE, USA, vol. 8, 31 August 2020 (2020-08-31), pages 160581 - 160595, XP011808493, [retrieved on 20200910], DOI: 10.1109/ACCESS.2020.3021061 * |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12360736B2 (en) | | Audio conflict resolution |
| US12149897B2 (en) | | Audio playback settings for voice interaction |
| US12288558B2 (en) | | Systems and methods of operating media playback systems having multiple voice assistant services |
| US11778404B2 (en) | | Systems and methods for authenticating and calibrating passive speakers with a graphical user interface |
| US11900014B2 (en) | | Systems and methods for podcast playback |
| US10891105B1 (en) | | Systems and methods for displaying a transitional graphical user interface while loading media information for a networked media playback system |
| WO2019222667A1 (fr) | | Linear filtering for noise-suppressed speech detection |
| US12417071B2 (en) | | Techniques for intelligent home theater configuration |
| US20230195783A1 (en) | | Speech Enhancement Based on Metadata Associated with Audio Content |
| AU2021382800A1 (en) | | Playback of generative media content |
| WO2026072097A1 (fr) | | Speech enhancement techniques |
| WO2026085066A1 (fr) | | Enhanced dialogue rendering |
| WO2025029609A1 (fr) | | Personalization techniques for media playback systems |