EP4521777A1 - Operating a hearing device to optimize sound output from a localized media source - Google Patents

Info

Publication number: EP4521777A1
Authority: EP; European Patent Office
Prior art keywords: audio signal; sound; user; media source; hearing device
Prior art date: 2023-09-07
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.): Withdrawn

Application number

EP23196022.0A

Other languages

English (en)

French (fr)

Inventor

Stephan Müller

Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)

Sonova Holding AG

Original Assignee

Sonova AG

Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)

2023-09-07

Filing date

2023-09-07

Publication date

2025-03-12

2023-09-07 Application filed by Sonova AG

2023-09-07 Priority to EP23196022.0A

2025-03-12 Publication of EP4521777A1

Status: Withdrawn

Classifications

- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R25/00—Electric hearing aids
- H04R25/40—Arrangements for obtaining a desired directivity characteristic
- H04R25/407—Circuits for combining signals of a plurality of transducers
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R25/00—Electric hearing aids
- H04R25/40—Arrangements for obtaining a desired directivity characteristic
- H04R25/405—Arrangements for obtaining a desired directivity characteristic by combining a plurality of transducers
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R25/00—Electric hearing aids
- H04R25/43—Electronic input selection or mixing based on input signal analysis, e.g. mixing or selection between microphone and telecoil or between microphones with different directivity characteristics
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R25/00—Electric hearing aids
- H04R25/50—Customised settings for obtaining desired overall acoustical characteristics

Definitions

The disclosure relates to a method of operating a hearing device configured to be worn at an ear of a user, according to the preamble of claim 1.
The disclosure further relates to a hearing device, according to the preamble of claim 15.
Hearing devices may be used to improve the hearing capability or communication capability of a user, for instance by compensating a hearing loss of a hearing-impaired user, in which case the hearing device is commonly referred to as a hearing instrument such as a hearing aid, or hearing prosthesis.
A hearing device may also be used to output sound based on an audio signal which may be communicated by a wire or wirelessly to the hearing device.
A hearing device may also be used to reproduce a sound in a user's ear canal detected by an input transducer such as a microphone or a microphone array.
The reproduced sound may be amplified to account for a hearing loss, such as in a hearing instrument, or may be output without accounting for a hearing loss, for instance to provide for a faithful reproduction of detected ambient sound and/or to add audio features of an augmented reality in the reproduced ambient sound, such as in a hearable.
A hearing device may also provide for a situational enhancement of an acoustic scene, e.g. beamforming and/or active noise cancelling (ANC), with or without amplification of the reproduced sound.
A hearing device may also be implemented as a hearing protection device, such as an earplug, configured to protect the user's hearing.
Types of hearing devices include earbuds, earphones, hearables, and hearing instruments such as receiver-in-the-canal (RIC) hearing aids, behind-the-ear (BTE) hearing aids, in-the-ear (ITE) hearing aids, invisible-in-the-canal (IIC) hearing aids, completely-in-the-canal (CIC) hearing aids, cochlear implant systems configured to provide electrical stimulation representative of audio content to a user, a bimodal hearing system configured to provide both amplification and electrical stimulation representative of audio content to a user, or any other suitable hearing prostheses.
A hearing system comprising two hearing devices configured to be worn at different ears of the user is sometimes also referred to as a binaural hearing device.
A hearing system may also comprise a hearing device, e.g., a single monaural hearing device or a binaural hearing device, and a user device, e.g., a smartphone and/or a smartwatch, communicatively coupled to the hearing device.
Hearing devices are often employed in conjunction with communication devices, such as smartphones or tablets, for instance when listening to sound data processed by the communication device and/or during a phone conversation operated by the communication device. More recently, communication devices have been integrated with hearing devices such that the hearing devices at least partially comprise the functionality of those communication devices.
A hearing system may comprise, for instance, a hearing device and a communication device.
A hearing device typically detects sound with a sound detector such as a microphone or a microphone array.
An amplified and/or signal processed version of the detected sound may then be outputted to the user by an output transducer, e.g., a receiver, loudspeaker, or electrodes to provide electrical stimulation representative of the outputted signal.
Various other sensor types are progressively implemented, in particular sensors which are not directly related to the sound reproduction and/or amplification function of the hearing device.
Those sensors include inertial sensors, such as accelerometers, which allow the user's movements to be monitored.
Physiological sensors, such as optical sensors and bioelectric sensors, are mostly employed for monitoring the user's health.
Hearing devices have been equipped with a sound classifier to classify an ambient sound.
An input transducer can provide an audio signal representative of the ambient sound.
The sound classifier can classify the audio signal, which allows different listening situations to be identified, by determining a characteristic from the audio signal and assigning the audio signal to at least one relevant class from a plurality of predetermined classes depending on the characteristic.
The sound classification does not directly modify a sound output of the hearing device.
Different audio processing instructions are stored in a memory of the hearing device specifying different audio processing parameters for a processing of the audio signal, wherein the different classes are each associated with one of the different audio processing instructions. After assigning the audio signal to one or more classes, the one or more associated audio processing instructions are executed.
The audio processing parameters specified by the audio processing instructions can then provide a processing of the audio signal customized for the particular listening situation corresponding to the at least one class identified by the classifier.
The different listening situations may comprise, for instance, different classes of listening conditions and/or different classes of sounds.
The different classes may comprise speech and/or nonspeech and/or music and/or traffic noise and/or other ambient noise.
The different audio processing instructions may be provided as sub-functions, which can be included into a transfer function used by the signal processing circuit according to the desired mixing of the audio processing instructions.
Audio processing instructions, e.g., in the form of base parameter sets, related to a beamformer and/or a gain model (i.e., an amplification characteristic) may be mixed depending on whether, or to which degree, the audio signal is attributed, e.g., by class similarity factors, to one or more of the classes music and/or speech in noise and/or speech.
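The mixing of base parameter sets by class similarity factors can be pictured as a weighted average. The following is only a minimal sketch under stated assumptions: the parameter names (`gain_db`, `beamformer_strength`), their values, and the function `mix_parameter_sets` are hypothetical and are not taken from the disclosure or from EP 2 201 793 B1.

```python
from typing import Dict

# Hypothetical base parameter sets per class; all values are illustrative only.
BASE_PARAMETER_SETS: Dict[str, Dict[str, float]] = {
    "speech": {"gain_db": 10.0, "beamformer_strength": 0.2},
    "speech_in_noise": {"gain_db": 12.0, "beamformer_strength": 0.8},
    "music": {"gain_db": 6.0, "beamformer_strength": 0.0},
}


def mix_parameter_sets(similarity: Dict[str, float]) -> Dict[str, float]:
    """Mix base parameter sets as an average weighted by class similarity factors."""
    total = sum(similarity.values())
    if total <= 0:
        raise ValueError("at least one class similarity factor must be positive")
    mixed: Dict[str, float] = {}
    for cls, weight in similarity.items():
        for name, value in BASE_PARAMETER_SETS[cls].items():
            # Each class contributes in proportion to its normalized similarity.
            mixed[name] = mixed.get(name, 0.0) + value * weight / total
    return mixed


if __name__ == "__main__":
    # A signal judged equally similar to "speech" and "music".
    print(mix_parameter_sets({"speech": 1.0, "music": 1.0}))
```

Normalizing by the sum of the factors keeps the mixed parameters within the range spanned by the base sets; other mixing rules are equally conceivable.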
EP 2 201 793 B1 discloses a classifier configured for an automatic adaption of the audio processing instructions associated with the different classes depending on adjustments performed by the user. Adjustment data indicative of the user adjustments can be logged, e.g., stored in a storage unit, and evaluated to learn correction data for correcting the audio processing instructions. In a mixed mode classifier, for a current sound environment and depending on the adjustment data, an offset can be learned for the mixed base parameter sets representing the audio processing instructions associated with the different classes. For the purpose of learning, correction data may be separately provided for different classes.
A rather specific use case of operating a hearing device concerns a faithful reproduction of sound which is emitted from a localized media source in the user's environment.
The above-described acoustic environment classification could also be employed to determine whether an audio signal representative of the ambient sound includes such media content, e.g., by attributing the audio signal to a dedicated class characteristic of the media sound. Subsequently, when the audio signal is attributed to such a class, at least one audio processing instruction associated with the class, which is optimized for perceiving sound from the localized media source, could be applied to the audio signal.
A difficulty of such an approach is that media content, which may be presented to the user by various media sources in his environment, can, in general, vary greatly, which makes a reliable classification of the environmental sound with regard to such media content rather complex and/or challenging.
Some media sound, such as a TV program and/or a movie presented at a movie theater, may comprise sound features that typically also occur in other daily situations of the user, e.g., a single speech, conversations of other people, traffic sound and/or sound emitted from other noise sources. It may therefore be hard for a classifier to distinguish whether such sound stems from a localized media source or not.
Another problem arising from such an approach is that, even if such media content is present in the user's environment, it remains questionable whether the user would be interested in following and/or consuming such content.
Initiating an operation of the hearing device which would be optimized for perceiving the sound from the localized media source would be mostly desirable when the user is also interested in the media content.
At least one of these objects can be achieved by a method of operating a hearing device configured to be worn at an ear of a user comprising the features of claim 1 and/or a hearing device comprising the features of claim 15.
The present disclosure proposes a method of operating a hearing device configured to be worn at an ear of a user, the method comprising
The hearing device operation optimized for perceiving sound from the localized media source can be evoked in a more reliable way and/or can be better attuned to the user's individual needs and/or sound properties of the media content.
An interest of the user in the media content may thus be presumed not solely based on determining a presence of such a media source in the user's environment, e.g., based on the audio signal, but rather based on indications, contained in the audio signal and/or displacement data, of the user's intention and/or preference to engage with such content.
The operation optimized for perceiving sound from the localized media source may thus be automatically activated depending on the determined user interest, facilitating an interaction of the user with the hearing device.
The present disclosure also proposes a non-transitory computer-readable medium storing instructions that, when executed by a processor, cause a hearing device to perform operations of the method.
The present disclosure also proposes a hearing device configured to be worn at an ear of a user, the hearing device comprising
the method further comprises
The first parameter is indicative of a variability of the sound detected in the environment, wherein said determining whether the user is interested in perceiving sound from the localized media source includes a condition that the first parameter exceeds a threshold.
The variability of the sound may be determined with respect to a variability of at least one sound content, e.g., sound type, encoded in the audio signal and/or a level and/or a frequency and/or a number of onsets and/or a direction of arrival (DOA) of the audio signal.
The sound content may be characteristic for sound which is typical for one or more acoustic objects emitting the sound. The variability of the sound content may then be characteristic for an amount by which sound typical for one or more acoustic objects varies.
The second parameter is indicative of an amplitude and/or an amount and/or a variability of the movements performed by the user, wherein said determining whether the user is interested in perceiving sound from the localized media source includes a condition that the second parameter falls below a threshold.
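The two conditions described here (sound variability above a threshold, movement below a threshold) can be illustrated with a small sketch. It assumes, purely for illustration, that the first parameter is the standard deviation of short-term sound levels and the second parameter is the mean absolute movement magnitude; the function name and both threshold values are hypothetical.

```python
from statistics import pstdev
from typing import Sequence


def is_interested(
    levels_db: Sequence[float],
    movement_magnitudes: Sequence[float],
    sound_variability_threshold: float = 3.0,  # hypothetical value, in dB
    movement_threshold: float = 0.5,           # hypothetical value, arbitrary units
) -> bool:
    """Return True when the sound varies strongly while the user moves little."""
    if not levels_db or not movement_magnitudes:
        raise ValueError("both input sequences must be non-empty")
    # First parameter: variability of the detected sound level over time.
    first_parameter = pstdev(levels_db)
    # Second parameter: average amplitude of the user's movements.
    second_parameter = sum(abs(m) for m in movement_magnitudes) / len(movement_magnitudes)
    return (
        first_parameter > sound_variability_threshold
        and second_parameter < movement_threshold
    )


if __name__ == "__main__":
    print(is_interested([50, 60, 45, 65, 55], [0.1, 0.05, 0.2]))
```

Level is only one of the listed quantities; the same structure would apply to a variability measure built on frequency content, onsets, or DOA.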
the first parameter is indicative of a sound content encoded in the audio signal
said determining whether the user is interested in perceiving sound from the localized media source includes a condition that the first parameter is characteristic of a predetermined media sound.
the second parameter is indicative of a movement behavior of the user over time, e.g., a type and/or sequence and/or lack of movements performed by the user over time, and said determining whether the user is interested in perceiving sound from the localized media source includes a condition that the second parameter is characteristic of a predetermined movement pattern.
the method further comprises classifying the audio signal by attributing at least one class from a plurality of predetermined classes to the audio signal, wherein said determining whether the user is interested in perceiving sound from the localized media source is based on the at least one class attributed to the audio signal.
the first parameter is indicative of a variability, e.g., alteration over time, of the at least one class attributed to the audio signal and/or whether the at least one class attributed to the audio signal is characteristic of a predetermined media sound.
the method further comprises classifying the displacement data by attributing at least one class from a plurality of predetermined classes to the displacement data, wherein said determining whether the user is interested in perceiving sound from the localized media source is based on the at least one class attributed to the displacement data.
the second parameter is indicative of a variability of the at least one class attributed to the displacement data and/or whether the at least one class attributed to the displacement data is characteristic of a predetermined movement pattern of the user when focusing his attention to the localized media source.
the method further comprises
the method further comprises
the operation of the hearing device optimizing the processing of the audio signal comprises
the audio processing instruction optimized for perceiving sound from the localized media source comprises
the method further comprises
the method further comprises
The localized media source is configured to provide for a visual media content.
The localized media source comprises a screen for displaying the visual content.
The media source may comprise a television and/or a screen in a movie theater.
The operation of the hearing device is optimized for perceiving the sound of a television program and/or a movie shown in a movie theater.
the method further comprises
the method further comprises
The audio signal and/or the first parameter and the displacement data and/or the second parameter are input into a machine learning (ML) algorithm, which outputs, e.g., a probability and/or likelihood, whether the user is interested in perceiving sound from the localized media source, wherein the ML algorithm has been trained with previous audio signals and/or first parameters and associated displacement data and/or second parameters.
The operation of the hearing device optimizing said processing of the audio signal for perceiving sound from the localized media source may then be initiated when the probability and/or likelihood exceeds a threshold.
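As a rough illustration of thresholding an ML output, the sketch below uses a logistic model as a stand-in for the trained ML algorithm. The disclosure does not specify the model type; the weights, bias, feature names, and threshold are hypothetical placeholders for values that would come from training.

```python
import math

# Hypothetical coefficients standing in for a trained model.
WEIGHTS = {"sound_variability": 0.6, "movement_amount": -2.0}
BIAS = -1.0


def interest_probability(sound_variability: float, movement_amount: float) -> float:
    """Map the first and second parameter to a probability of user interest."""
    z = (
        BIAS
        + WEIGHTS["sound_variability"] * sound_variability
        + WEIGHTS["movement_amount"] * movement_amount
    )
    return 1.0 / (1.0 + math.exp(-z))


def should_activate(probability: float, threshold: float = 0.5) -> bool:
    """Initiate the media-optimized operation when the probability exceeds the threshold."""
    return probability > threshold


if __name__ == "__main__":
    p = interest_probability(sound_variability=5.0, movement_amount=0.1)
    print(round(p, 3), should_activate(p))
```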
The variability of the sound and/or movements and/or at least one class attributed to the audio signal may be defined as a temporal variability, e.g., alteration over time, of the sound and/or movements and/or class, in particular an amount by which the sound and/or movements and/or class varies over time.
FIG. 1 illustrates an exemplary hearing device 110 configured to be worn at an ear of a user.
Hearing device 110 may be implemented by any type of hearing device configured to enable or enhance hearing or a listening experience of a user wearing hearing device 110.
Hearing device 110 may be implemented by a hearing aid configured to provide an amplified version of audio content to a user, a sound processor included in a cochlear implant system configured to provide electrical stimulation representative of audio content to a user, a sound processor included in a bimodal hearing system configured to provide both amplification and electrical stimulation representative of audio content to a user, or any other suitable hearing prosthesis, or an earbud or an earphone or a hearable.
Hearing devices can also be distinguished by the position at which they are worn at the ear.
Some hearing devices, such as behind-the-ear (BTE) hearing aids and receiver-in-the-canal (RIC) hearing aids, typically comprise an earpiece configured to be at least partially inserted into an ear canal of the ear, and an additional housing configured to be worn at a wearing position outside the ear canal, in particular behind the ear of the user.
Other hearing devices, such as earbuds, earphones, hearables, in-the-ear (ITE) hearing aids, invisible-in-the-canal (IIC) hearing aids, and completely-in-the-canal (CIC) hearing aids, commonly comprise such an earpiece to be worn at least partially inside the ear canal without an additional housing to be worn at a different position at the ear.
The algorithm comprises an audio signal analyzing module 318, a sensor data analyzing module 319, a media interest determination module 315, a processing instruction selection module 317, and an audio processing module 313.
Audio signal 302 can be received by audio processing module 313.
Audio processing module 313 is configured to process audio signal 302, e.g., based on one or more audio processing instructions provided by processing instruction selection module 317.
A level and/or frequency content and/or a number of onsets and/or, when sound is emitted by the media source from a plurality of sound sources 411, 412, a DOA of the sound at hearing device 110, 210 may change rather frequently.
The parameter may be indicative of a level and/or a content and/or an SNR of the sound detected in the environment.
Other examples may include, but are not limited to, a mean-squared signal power, a standard deviation of a signal envelope, a mel-frequency cepstrum (MFC), a mel-frequency cepstrum coefficient (MFCC), a delta mel-frequency cepstrum coefficient (delta MFCC), a spectral centroid such as a power spectrum centroid, a standard deviation of the centroid, a spectral entropy such as a power spectrum entropy, a zero crossing rate (ZCR), a standard deviation of the ZCR, a broadband envelope correlation lag and/or peak, and a four-band envelope correlation lag and/or peak.
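Two of the listed features, the mean-squared signal power and the zero crossing rate, are simple enough to sketch directly. The following pure-Python functions are a minimal illustration operating on one frame of samples; the function names are chosen here for illustration, and framing, windowing, and the spectral features are left out.

```python
from typing import Sequence


def mean_squared_power(samples: Sequence[float]) -> float:
    """Mean of the squared sample values of one frame."""
    if not samples:
        raise ValueError("frame must not be empty")
    return sum(s * s for s in samples) / len(samples)


def zero_crossing_rate(samples: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ (zero counts as non-negative)."""
    if len(samples) < 2:
        raise ValueError("frame must contain at least two samples")
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


if __name__ == "__main__":
    frame = [1.0, -1.0, 1.0, -1.0]
    print(mean_squared_power(frame), zero_crossing_rate(frame))
```

The standard deviation of the ZCR mentioned in the list would then follow from evaluating `zero_crossing_rate` over successive frames.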
Audio signal analyzing module 318 comprises an audio signal classifier.
The audio signal classifier can be configured to classify audio signal 302 by attributing at least one class from a plurality of predetermined classes to audio signal 302.
The first parameter may be indicative of a temporal variability of the one or more classes attributed to the audio signal and/or contain information about the one or more classes attributed to audio signal 302, e.g., whether a class unrelated to the media sound and/or a class associated with the media sound has been attributed.
The one or more classes attributed to the audio signal may change rather often.
A variability of the one or more classes attributed to audio signal 302 over time can indicate a sound stemming from a media source, and thus a presence of the media source, localized in the environment of the user.
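One conceivable way to quantify the temporal variability of the attributed classes is the rate at which the class changes between consecutive analysis frames. This is a sketch under that assumption only; the disclosure does not prescribe a specific measure, and `class_change_rate` is a hypothetical name.

```python
from typing import Sequence


def class_change_rate(class_history: Sequence[str]) -> float:
    """Fraction of consecutive frames in which the attributed class changed."""
    if len(class_history) < 2:
        return 0.0
    changes = sum(
        1 for previous, current in zip(class_history, class_history[1:])
        if previous != current
    )
    return changes / (len(class_history) - 1)


if __name__ == "__main__":
    # Frequent class changes, as might occur with a TV program.
    print(class_change_rate(["speech", "music", "traffic noise", "speech"]))
    # A stable acoustic scene.
    print(class_change_rate(["speech", "speech", "speech", "speech"]))
```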
The audio signal classifier may be implemented as a sound classification module configured for a statistical evaluation of audio signal 302 as disclosed, e.g., in EP 3 036 915 B1, and/or a mixed mode classifier as disclosed, e.g., in EP 1 858 292 B1, and/or a sound source separator configured to separate sound generated by different sound sources in the environment, as disclosed, e.g., in PCT/EP 2020/051 734, PCT/EP 2020/051 735 and DE 2019 206 743.3, which may comprise one or more neural networks (NNs).
The classes may represent a specific sound content and/or sound type encoded in audio signal 302.
Exemplary classes include, but are not limited to, low ambient noise, high ambient noise, traffic noise, machine noise, babble noise, public area noise, background noise, speech, nonspeech, speech in quiet, speech in babble, speech in noise, speech from the user, own voice of the user, speech from a significant other, background speech, speech from multiple sources, quiet indoor, quiet outdoor, speech in a car, speech in traffic, speech in a reverberating environment, speech in wind noise, speech in a lounge, car noise, applause, music, e.g. classical music, and/or the like.
Information about the classes may be stored in a database, e.g., in memory 113, and accessed by audio signal analyzer 318.
The information may comprise different patterns associated with each class, wherein it is determined whether audio signal 302, in particular characteristics and/or features determined from audio signal 302, matches, at least to a certain extent, the respective pattern such that the respective class can be attributed to audio signal 302.
A probability may be determined whether the respective pattern associated with the respective class matches the characteristics and/or features determined from audio signal 302, wherein the respective class may be attributed to audio signal 302 when the probability exceeds a threshold.
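The threshold-based attribution can be sketched as follows, assuming the per-class match probabilities have already been computed by some pattern-matching stage that is not shown. The function name, the class labels in the example, and the threshold value are illustrative assumptions.

```python
from typing import Dict, List


def attribute_classes(
    match_probabilities: Dict[str, float],
    threshold: float = 0.6,  # hypothetical value
) -> List[str]:
    """Return every class whose match probability exceeds the threshold."""
    return [
        cls for cls, probability in match_probabilities.items()
        if probability > threshold
    ]


if __name__ == "__main__":
    probabilities = {"speech": 0.82, "music": 0.65, "traffic noise": 0.10}
    print(attribute_classes(probabilities))
```

Because each class is tested independently, more than one class can be attributed to the same audio signal, which matches the "at least one class" wording.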
At least one of the classes may indicate whether audio signal 302 contains sound from a localized media source, e.g., as a precondition for the user being interested in perceiving the media sound, and/or at least one of the classes may indicate whether audio signal 302 does not contain such sound.
The parameter determined by audio signal analyzing module 318 can be received by processing instruction selection module 317.
Processing instruction selector 317 may select one or more audio processing instructions depending on the parameter which can then be applied to audio signal 302 by audio processing module 313.
One or more of the audio processing instructions may be associated with at least one respective class, or a plurality of respective classes.
The audio processing instructions may be stored in a database, e.g., in memory 113, and accessed by processing instruction selector 317 and/or audio signal processor 313.
The audio processing instructions may be implemented as different audio processing programs which can be executed by audio signal processing module 313.
The audio processing instructions may include, e.g., instructions executable by processor 310 providing for at least one of a gain model (GM), noise cancelling (NC), wind noise cancelling (WNC), reverberation cancelling (RevC), narrowband coupling, feedback cancelling (FC), speech enhancement (SE), noise cleaning, beamforming (BF), in particular static and/or adaptive beamforming, and/or the like.
The audio processing instructions may also include one or more instructions optimized for perceiving sound from a localized media source by the user.
The at least one audio processing instruction may provide for a separation of sound from the localized media source from other sound features contained in audio signal 302 such that the separated sound from the localized media source can be presented to user 450 via output transducer 117.
The audio processing instruction may provide for enhancing of the media sound encoded in audio signal 302 relative to other environmental sound encoded in audio signal 302. E.g., the audio processing instruction may provide for noise reduction of the other environmental sound.
The audio processing instruction may provide for enhancing an intelligibility of speech encoded in audio signal 302.
The audio processing instructions may provide for enhancing a quality of sound encoded in audio signal 302.
The quality of sound may be improved with regard to a clarity of the sound, e.g., by increasing a sharpness of the sound, and/or with regard to a listening comfort of the sound, e.g., by modifying the sound to be more pleasing and/or less aggressive.
Displacement data 303 can be received by sensor data analyzing module 319.
Sensor data analyzer 319 is configured to determine, based on displacement data 303, a second parameter indicative of a property of movements performed by the user.
The parameter, e.g., an amplitude and/or an amount and/or a temporal variability of movements performed by the user, may be determined as another indicator of whether user 450 could be interested in perceiving sound from a media source localized in the environment.
The parameter may be indicative of a viewing direction of the user relative to the location of the media source.
The location of the media source may be determined based on a DOA of the sound of the media source contained in audio signal 302.
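Comparing a viewing direction with a DOA amounts to an angular difference that has to wrap correctly around 360 degrees. The sketch below assumes, as a simplification not stated in the disclosure, that head yaw and DOA are both available as azimuth angles in degrees in the same reference frame; the function names and the tolerance are hypothetical.

```python
def angular_difference_deg(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two azimuth angles, in degrees."""
    diff = (a_deg - b_deg + 180.0) % 360.0 - 180.0
    return abs(diff)


def is_facing_source(
    head_yaw_deg: float,
    doa_deg: float,
    tolerance_deg: float = 20.0,  # hypothetical value
) -> bool:
    """True when the viewing direction lies within the tolerance of the source direction."""
    return angular_difference_deg(head_yaw_deg, doa_deg) <= tolerance_deg


if __name__ == "__main__":
    print(angular_difference_deg(350.0, 10.0))
    print(is_facing_source(90.0, 0.0))
```

The modulo step makes 350 degrees and 10 degrees count as 20 degrees apart rather than 340.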
Sensor data analyzing module 319 comprises a displacement data classifier.
The displacement data classifier can be configured to classify displacement data 303 by attributing at least one class from a plurality of predetermined classes to displacement data 303.
the classes may represent a specific movement pattern performed by the user. Exemplary classes include, but are not limited to, the user sitting, lying, walking, running, turning his head, shaking his head, orienting his head in a specific direction, moving in a specific direction, moving steady, moving irregularly, being in a sedentary position, being restless, and/or the like.
Information about the classes may be stored in a database, e.g., in memory 113, and accessed by sensor data analyzer 319.
A probability may be determined whether the respective pattern associated with the respective class matches a characteristic and/or feature determined from displacement data 303, wherein the respective class may be attributed to displacement data 303 when the probability exceeds a threshold.
At least one of the classes may indicate whether displacement data 303 is typical for the user being interested in perceiving sound from a localized media source and/or at least one of the classes may indicate whether the displacement data is typical for the user not being interested.
A rather motionless behavior of the user and/or movements of small amplitude and/or a head orientation toward a media source may be attributed to the class of the user being interested.
Rather frequent movements and/or movements of large amplitude and/or a large amount of movements may be attributed to the class of the user not being interested.
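The attribution of movement data to an "interested" or "not interested" class can be sketched with two simple statistics over accelerometer samples. This is a minimal illustration assuming gravity-compensated three-axis samples; the class labels, the function name, and all limits are hypothetical, and the head-orientation criterion is not covered.

```python
import math
from typing import Sequence, Tuple


def classify_movement(
    accel_samples: Sequence[Tuple[float, float, float]],
    mean_amplitude_limit: float = 0.2,  # hypothetical value
    motion_threshold: float = 0.5,      # hypothetical value
    movement_rate_limit: float = 0.1,   # hypothetical value
) -> str:
    """Attribute a window of accelerometer samples to one of two illustrative classes."""
    if not accel_samples:
        raise ValueError("at least one sample is required")
    magnitudes = [math.sqrt(x * x + y * y + z * z) for x, y, z in accel_samples]
    # Amplitude of the movements, averaged over the window.
    mean_amplitude = sum(magnitudes) / len(magnitudes)
    # Amount of movement: share of samples that count as a noticeable movement.
    movement_rate = sum(1 for m in magnitudes if m > motion_threshold) / len(magnitudes)
    if mean_amplitude < mean_amplitude_limit and movement_rate < movement_rate_limit:
        return "user interested"
    return "user not interested"


if __name__ == "__main__":
    still = [(0.01, 0.0, 0.02)] * 10
    restless = [(1.0, 0.5, 0.2)] * 10
    print(classify_movement(still), "/", classify_movement(restless))
```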
Sensor data analyzing module 319 can be configured to determine, based on displacement data 303, a movement behavior of the user over time.
The movement behavior may include a type and/or sequence and/or rate and/or amplitude and/or duration and/or lack of movements performed by the user.
The second parameter may then be indicative of the user's movement behavior.
Sensor data analyzing module 319 may be configured to log displacement data 303 over time and to extract a type and/or sequence and/or lack of movements performed by the user over time from the logged displacement data 303.
Sensor data analyzing module 319 may be configured to determine a type and/or sequence and/or lack of movements performed by the user from currently received displacement data 303 and to log the determined movement characteristics over time.
The displacement data 303 and/or movement characteristics may be stored and/or accessed in memory 113.
Sensor data 304, which may be provided, e.g., by any of sensors 115, 131 - 135, 137 - 139, may also be received by sensor data analyzing module 319.
Sensor data analyzer 319 may then be configured to determine, based on sensor data 304, a third parameter.
Sensor data 304 may be provided by any of environmental sensors 115, 131, 132.
The parameter may then be indicative of whether the environment of the user is suitable and/or typical for perceiving sound from a media source localized in the environment, e.g., for a media source being localized in the environment, or not.
E.g., when audio signal 302 provided by input transducer 115 includes traffic sound and/or when barometric data provided by a barometric sensor indicates a rather high altitude and/or when ambient temperature sensor 132 indicates a rather hot environment, it may be concluded that the user's interest in perceiving sound from a localized media source is rather small.
sensor data 304 may be provided by any of physiological sensors 133 - 135.
The parameter may then be indicative of whether a physiological condition of the user is suitable and/or typical for perceiving sound from a localized media source, or not.
Physiological data may be provided by optical sensor 133, e.g., a PPG sensor, and/or a bioelectric sensor, e.g., an ECG sensor.
Sensor data 304 may also be provided by location sensor 138 and/or clock 139. E.g., a current location and/or time may be typical or rather unusual for the user being interested in perceiving sound from a localized media source.
Sensor data 304 may also be provided by user interface 137. E.g., some adjustments of hearing device 110, 210 performed by the user on the user interface may be typical or rather unusual for the user being interested in perceiving sound from a localized media source.
Media interest determination module 315 is configured to receive the first parameter determined by audio signal analyzer 318 indicative of a property of sound detected in the environment, and the second parameter determined by sensor data analyzer 319 indicative of a property of movements performed by the user. In some instances, when sensor data analyzer 319 is configured to determine, based on sensor data 304, a third parameter, media interest determinator 315 may also be configured to receive the third parameter. Media interest determinator 315 is configured to determine, based on the first and second parameter, and optionally also based on the third parameter, whether the user is interested in perceiving sound from a media source localized in the environment.
Media interest determinator 315 may be configured to determine whether the first parameter and the second parameter fulfill a condition as a requirement for concluding and/or predicting that the user could be interested in perceiving sound from the localized media source. In some instances, a further requirement may be that the third parameter fulfills such a condition.
The condition may be determined relative to a threshold for at least one of the parameters, e.g., a first threshold for the first parameter and/or a second threshold for the second parameter and/or a third threshold for the third parameter.
The condition may be determined to be fulfilled when the first parameter exceeds a first threshold and the second parameter falls below a second threshold.
The first parameter exceeding the first threshold may indicate a rather large variability of the sound, suggesting that the sound may originate from such a localized media source.
When the second parameter is indicative of an amplitude and/or an amount and/or a temporal variability of the movements performed by the user, the second parameter falling below the second threshold may further indicate a rather small amplitude and/or amount and/or temporal variability of the movements, suggesting that the user intends to dedicate his attention to the sound and/or other media content, e.g., visual content, originating from the media source.
Evaluating the first parameter relative to the first threshold and the second parameter relative to the second threshold can allow concluding and/or predicting an interest of the user in perceiving sound from a localized media source with a higher certainty as compared to only evaluating one of the parameters.
The certainty of such a prediction may be further enhanced by also evaluating the third parameter, e.g., relative to a third threshold.
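The threshold condition described above can be sketched in a few lines of Python. The function name `user_interested`, the parameter names, and in particular the direction in which the optional third parameter is compared to its threshold are assumptions of this sketch; the text only states that the third parameter may be evaluated relative to a third threshold.

```python
from typing import Optional


def user_interested(first: float, second: float,
                    first_threshold: float, second_threshold: float,
                    third: Optional[float] = None,
                    third_threshold: Optional[float] = None) -> bool:
    """Hypothetical check of the combined condition."""
    # Sound variability (first parameter) must exceed its threshold ...
    sound_ok = first > first_threshold
    # ... while movement (second parameter) must fall below its threshold.
    movement_ok = second < second_threshold
    if third is None or third_threshold is None:
        return sound_ok and movement_ok
    # Optional further requirement; "exceeds" is an assumed direction.
    return sound_ok and movement_ok and third > third_threshold
```

For example, `user_interested(0.8, 0.1, 0.5, 0.3)` holds because the sound varies strongly while the user hardly moves, whereas raising the second parameter to 0.4 makes the condition fail.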
The condition may be determined with respect to a content of the sound detected in the environment, e.g., whether the content is characteristic of a predetermined media sound.
The first parameter may be indicative of a content of the sound detected in the environment.
Media interest determination module 315 may then be configured to determine whether the content of the environmental sound matches the predetermined media sound.
The predetermined media sound may be provided as a sound pattern which can be compared to the environmental sound content. When the environmental sound content matches the sound pattern, at least part of the condition for the user being interested in perceiving the media sound may be concluded and/or predicted to be fulfilled.
Media interest determination module 315 can be configured to execute a machine learning (ML) algorithm configured to predict and/or indicate and/or output a likelihood of whether the first parameter indicative of a content of the sound detected in the environment matches the predetermined media sound, e.g., in the form of a sound pattern indicative of a localized media source.
The ML algorithm may be trained with previously recorded media sound of such a localized media source.
One or more NNs may be employed which are configured to separate sound emanating from a localized media source from other content and/or sound components contained in audio signal 302. The first parameter may then be indicative of the separated sound received from the localized media source and/or of whether such media sound is present in audio signal 302.
NNs, which may be implemented as one or more deep neural networks (DNNs), configured to separate content and/or sound components stemming from different acoustic objects from audio signal 302 are disclosed in international patent application Nos. PCT/EP 2020/051 734 and PCT/EP 2020/051 735, and in German patent application No. DE 2019 206 743.3.
Media interest determination module 315 can be configured to determine whether the first parameter fulfills a condition of the one or more classes attributed to audio signal 302 indicating that audio signal 302 contains sound and/or sound components from a localized media source in which the user could be interested.
The condition may be evaluated relative to a threshold.
An indicator for sound and/or sound components from a localized media source contained in audio signal 302 can be that the variability of the one or more classes attributed to audio signal 302 exceeds the threshold.
The threshold may be exceeded when the one or more classes attributed to audio signal 302 change rather often. In this way, the circumstance may be exploited that other sound in the user's environment, in particular sound unrelated to a localized media source, may typically result in a more steady attribution of the one or more classes to audio signal 302 and/or a variability of the one or more classes attributed to audio signal 302 falling below the threshold.
The first parameter exceeding the threshold may be taken as a condition for sound and/or sound components from a localized media source contained in audio signal 302 in which the user could be interested.
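One simple way to quantify the variability of the attributed classes is the fraction of consecutive classification results that differ. The Python sketch below illustrates this under stated assumptions: the function names, the string class labels, and the default threshold of 0.3 are hypothetical and serve only to make the idea concrete.

```python
from typing import Sequence


def class_change_rate(labels: Sequence[str]) -> float:
    """Fraction of consecutive label pairs that differ (0.0 to 1.0)."""
    if len(labels) < 2:
        return 0.0
    changes = sum(a != b for a, b in zip(labels, labels[1:]))
    return changes / (len(labels) - 1)


def suggests_media_source(labels: Sequence[str],
                          threshold: float = 0.3) -> bool:
    # Frequent class changes exceed the (assumed) threshold, whereas a
    # steady attribution of classes stays below it.
    return class_change_rate(labels) > threshold
```

A steady sequence such as five times "speech" yields a rate of 0.0, while a sequence alternating between speech, music and noise yields a rate close to 1.0.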
The condition may also be evaluated relative to whether the information is characteristic of a predetermined media sound which may be characteristic of sound and/or sound components from a localized media source.
The information may comprise a label and/or identifier and/or other characteristic of the one or more classes attributed to audio signal 302.
The condition may be determined to be fulfilled when it is determined that the information is characteristic of the predetermined media sound.
Some classes which may be attributed to audio signal 302 may be less characteristic of sound and/or sound components from a localized media source as compared to other classes, e.g., public area noise, speech, nonspeech, speech in quiet, applause, music and/or the like. Accordingly, the condition may be deemed to be fulfilled when the information yields that at least one of the classes which are more characteristic of sound and/or sound components from the localized media source has been attributed to audio signal 302.
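The label-based condition amounts to a membership test on the attributed classes. In the following Python sketch, the set `MEDIA_CHARACTERISTIC` and its contents are purely hypothetical: the text does not state which classes count as more characteristic of a localized media source, so the chosen labels are placeholders.

```python
# Hypothetical class labels; which classes count as "more characteristic"
# of a localized media source is an assumption of this sketch.
MEDIA_CHARACTERISTIC = frozenset({"music", "applause"})


def label_condition_fulfilled(attributed_classes) -> bool:
    """True if at least one attributed class is in the assumed set."""
    return not MEDIA_CHARACTERISTIC.isdisjoint(attributed_classes)
```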
The condition may be determined with respect to whether the movement behavior is characteristic of the user being interested in perceiving sound from a localized media source.
Media interest determination module 315 may be configured to determine whether the user's movement behavior matches a predetermined movement pattern.
The movement pattern may be indicative of a movement behavior of the user, e.g., a type and/or sequence and/or rate and/or amplitude and/or duration and/or lack of movements performed by the user, which is typical for the user being interested in perceiving sound from the localized media source.
Media interest determination module 315 can be configured to execute an ML algorithm configured to predict and/or indicate and/or output a likelihood of whether the second parameter matches the movement pattern.
The ML algorithm may be trained with previously recorded movement behaviors of the user and/or other users over time. The training data may be labelled with regard to whether the user has been interested or uninterested in perceiving sound from a localized media source when executing the respective movement behavior.
The condition may also be determined to be fulfilled by the third parameter.
The third parameter may be evaluated with respect to an environmental condition to be fulfilled by environmental sensor data provided by any of environmental sensors 115, 131, 132 and/or a physiological condition to be fulfilled by physiological sensor data provided by any of physiological sensors 133 - 135 and/or with respect to a location and/or time and/or adjustments via user interface 137 indicating the user's interest in perceiving sound from a localized media source.
When it is determined, by media interest determination module 315, that the user is interested in perceiving sound from a localized media source, media interest determination module 315 can be configured to initiate an operation of hearing device 110, 210 optimizing the processing of audio signal 302, as performed by audio processing module 313, for perceiving sound from the localized media source. In particular, media interest determination module 315 may be configured to provide one or more instructions for the optimizing of the processing of audio signal 302 to processing instruction selection module 317.
In some implementations, the operation for optimizing the processing of audio signal 302 comprises reducing a rate at which different audio processing instructions are applied to audio signal 302.
Some media sources may emit sound with a large variability.
The emitted sound may change rather frequently between different sound types and/or sound contents, e.g., speech and/or music and/or environmental sound and/or background noise and/or sound special effects and/or silence.
Further, a level and/or a frequency content and/or a number of onsets of the sound may change rather frequently, as may a DOA of the sound at hearing device 110, 210 when sound is emitted by the media source from a plurality of sound sources 411, 412.
The large variability of the detected sound may lead to a frequent change of the audio processing instructions applied to audio signal 302.
E.g., some audio processing instructions may be optimized for a reproduction of speech encoded in audio signal 302, other audio processing instructions may be optimized for a reproduction of music encoded in audio signal 302, and still other audio processing instructions may be optimized for noise reduction.
The applied audio processing instructions optimized for the respective sound type and/or sound content may change accordingly.
Such a frequent switching between different audio processing instructions applied to audio signal 302 can be rather disturbing for the user, e.g., due to a varying sound reproduction and/or processing delays and/or sound artefacts caused by the switching.
Reducing the rate at which different audio processing instructions are applied to audio signal 302 can therefore optimize the processing of audio signal 302 for perceiving sound from a localized media source.
Audio signal 302 may be processed by audio signal processor 318 by applying one or more audio processing instructions associated with the one or more classes attributed to audio signal 302 by the audio signal classifier.
Processing instruction selection module 317 may be configured to select the audio processing instructions applied by audio signal processor 318 depending on the classification performed by the audio signal classifier. In a case in which sound encoded in audio signal 302 has a rather large variability, however, the one or more classes attributed to audio signal 302 may change rather frequently, leading to a frequent change of the applied audio processing instructions.
Media interest determination module 315 can provide instructions to processing instruction selection module 317 to reduce the rate at which different audio processing instructions are applied to audio signal 302.
The instructions may include, e.g., to apply currently applied audio processing instructions for a minimum time even if a different class has been attributed to audio signal 302 by the audio signal classifier.
The instructions may also include, e.g., to select, for application to audio signal 302, only those audio processing instructions which are associated with one of the classes attributed to audio signal 302 and which are most appropriate for reproducing sound from the localized media source.
One or more audio processing instructions for enhancing an intelligibility of speech encoded in audio signal 302 may then be selected to be applied to audio signal 302, e.g., under the presumption that the user is mostly interested in comprehending a speech content presented by the media source, even if both speech content and music content are reproduced by the media source.
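The first of these measures, keeping the currently applied audio processing instructions for a minimum time, can be sketched as a small state machine. The class name `HoldingSelector`, the use of one class label per update, and timestamps in seconds are assumptions of this Python sketch rather than details taken from the text.

```python
class HoldingSelector:
    """Hypothetical selector that holds the current choice for a minimum time."""

    def __init__(self, min_hold_s: float):
        self.min_hold_s = min_hold_s
        self.current = None   # class whose instructions are applied
        self.since = None     # time at which `current` was selected

    def update(self, attributed_class: str, now_s: float) -> str:
        if self.current is None:
            # First classification result: select it immediately.
            self.current, self.since = attributed_class, now_s
        elif (attributed_class != self.current
              and now_s - self.since >= self.min_hold_s):
            # A different class is only followed once the hold time elapsed.
            self.current, self.since = attributed_class, now_s
        return self.current
```

With a hold time of 5 seconds, a change from "speech" to "music" reported after 2 seconds is ignored, while the same change reported after 6 seconds is followed.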
In some implementations, the operation for optimizing the processing of audio signal 302 comprises disabling the application, to audio signal 302, of at least one audio processing instruction which is unsuitable for perceiving sound from the localized media source.
Some audio processing instructions which may be applied to audio signal 302 in a standard operation of hearing device 110, 210 may be unsuitable for reproducing sound from a localized media source and/or may adversely affect a desired perception of such sound by the user.
Some media sources, e.g., a television program and/or a movie shown in a movie theater, may reproduce sound features such as traffic noise, machine noise, speech in babble, and/or the like as part of the media content, e.g., to provide for a desired sound ambience and/or for entertainment purposes.
Media interest determination module 315 can provide instructions to processing instruction selection module 317 to disable a selection of at least one audio processing instruction which is unsuitable for perceiving sound from the localized media source when applied to audio signal 302.
Audio processing instructions usually employed for a noise reduction of environmental sound may then be disabled, e.g., to avoid an undesired influence on and/or distortion of the media content.
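Disabling unsuitable audio processing instructions can be pictured as a filter applied to the candidate instructions before selection. In this Python sketch, the function name `select_instructions` and the instruction identifiers such as "noise_reduction" are hypothetical; they merely stand in for whatever instructions a given hearing device provides.

```python
def select_instructions(candidates, media_interest,
                        unsuitable=frozenset({"noise_reduction"})):
    """Hypothetical selection step that skips disabled instructions."""
    if not media_interest:
        # Standard operation: all candidate instructions stay selectable.
        return list(candidates)
    # Media interest detected: drop instructions marked as unsuitable,
    # e.g. noise reduction that would distort intended media ambience.
    return [c for c in candidates if c not in unsuitable]
```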
In some implementations, the operation for optimizing the processing of audio signal 302 comprises applying at least one audio processing instruction optimized for perceiving sound from the localized media source to audio signal 302.
At least one audio processing instruction may be provided which is uniquely applicable to audio signal 302 when it is determined that the user is interested in perceiving sound from a localized media source.
The at least one audio processing instruction may not be associated with one or more classes attributed to audio signal 302 by an audio signal classifier included in audio signal analyzing module 318. Accordingly, when it is determined that the user is interested in perceiving sound from the localized media source, media interest determination module 315 can provide instructions to processing instruction selection module 317 to select the at least one audio processing instruction to be applied to audio signal 302.
The at least one audio processing instruction optimized for perceiving sound from the localized media source may provide for at least one of: enhancing an intelligibility of speech encoded in the audio signal, in particular speech presented from the localized media source; enhancing a quality of sound encoded in the audio signal; enhancing sound from the media source encoded in the audio signal relative to other environmental sound encoded in the audio signal; and separating sound from the media source encoded in the audio signal from other sound encoded in the audio signal.
The at least one audio processing instruction may provide for noise reduction of the other environmental sound and/or improve the quality of sound with regard to a clarity of the reproduced sound and/or with regard to a listening comfort for the user when perceiving the sound.
FIG. 6 illustrates a block flow diagram for an exemplary method of processing input audio signal 302.
The method may be executed by processor 112, 310 of hearing device 110, 210 and/or another processor communicatively coupled to processor 112, 310.
At S12, audio signal 302 is processed by applying one or more audio processing instructions to audio signal 302.
An output audio signal 305 is provided which can be output by output transducer 117 so as to stimulate the user's hearing.
At operation S13, after receiving audio signal 302 and displacement data 303, which may be provided by displacement sensor 125, it is determined whether the user is interested in perceiving sound from a media source localized in the environment. Operation S13 may be performed independently of and/or in parallel to the audio processing performed at S12. In some implementations, further sensor data 304 may be received at S13 to determine whether the user is interested. In a case in which it is determined that the user is interested, operation S14 is executed. At S14, an operation optimizing the processing of audio signal 302 for perceiving sound from the localized media source is initiated, which can then be applied in the audio processing at S12.
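The interplay of operations S12, S13 and S14 can be sketched as follows. The sketch runs the steps sequentially per audio frame, which is a simplification, since the text allows S13 to run independently of and in parallel to S12. The function name `process_frame` and the callables `detect_interest` and `optimize` are hypothetical placeholders.

```python
def process_frame(frame, displacement, instructions,
                  detect_interest, optimize):
    """Hypothetical per-frame pass through S13, S14 and S12."""
    # S13: decide from audio and displacement data whether the user is
    # interested in sound from a localized media source.
    if detect_interest(frame, displacement):
        # S14: initiate the optimization of the audio processing.
        instructions = optimize(instructions)
    # S12: apply the (possibly optimized) instructions in sequence.
    out = frame
    for instruction in instructions:
        out = instruction(out)
    return out
```

Here each audio processing instruction is modelled as a function mapping a frame to a frame, and `optimize` returns the modified list of instructions.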
FIG. 7 illustrates a block flow diagram of an exemplary implementation of the method illustrated in FIG. 6 .
Operation S13 is replaced by operations S22 and S23.
At S22, audio signal 302 is classified by attributing at least one class from a plurality of classes to audio signal 302, wherein different audio processing instructions are associated with different classes.
The at least one audio processing instruction associated with the at least one class attributed to audio signal 302 can then be applied in the audio processing at S12.
At S23, the determining whether the user is interested can thus be based on displacement data 303 and the at least one class which has been attributed to audio signal 302 at S22.
Whether the user is interested may be determined depending on a temporal variability of the attribution of the at least one class to audio signal 302 and/or on whether the at least one class attributed to audio signal 302 is characteristic of a predetermined media sound.
Audio signal 302 may be further employed at S23 for the determining whether the user is interested.
The determining whether the user is interested may then be based on another characteristic of audio signal 302, e.g., a level and/or an SNR and/or a frequency content of audio signal 302.
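A minimal Python sketch of operations S22 and S23 is given below, assuming that the classifier returns one class label per frame and that the displacement data has already been reduced to a single movement amplitude. The function names, the thresholds, and the change-rate measure of temporal variability are assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def s22_classify(frame: Sequence[float],
                 classifier: Callable[[Sequence[float]], str],
                 history: List[str]) -> str:
    """S22: attribute a class to the frame and record it in the history."""
    label = classifier(frame)
    history.append(label)
    return label


def s23_interested(history: Sequence[str], movement_amplitude: float,
                   change_threshold: float = 0.3,
                   movement_threshold: float = 0.1) -> bool:
    """S23: combine class variability with the displacement data."""
    if len(history) < 2:
        return False
    # Temporal variability: fraction of consecutive labels that differ.
    changes = sum(a != b for a, b in zip(history, history[1:]))
    rate = changes / (len(history) - 1)
    # Interest is assumed when classes change often and the user is still.
    return rate > change_threshold and movement_amplitude < movement_threshold
```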

Landscapes

Health & Medical Sciences (AREA)
General Health & Medical Sciences (AREA)
Neurosurgery (AREA)
Otolaryngology (AREA)
Physics & Mathematics (AREA)
Engineering & Computer Science (AREA)
Acoustics & Sound (AREA)
Signal Processing (AREA)
Stereophonic System (AREA)

EP23196022.0A 2023-09-07 2023-09-07 Betrieb eines hörgeräts zur optimierung der tonausgabe aus einer lokalisierten medienquelle Withdrawn EP4521777A1 (de)

Priority Applications (1)

Application Number	Priority Date	Filing Date	Title
EP23196022.0A EP4521777A1 (de)	2023-09-07	2023-09-07	Betrieb eines hörgeräts zur optimierung der tonausgabe aus einer lokalisierten medienquelle

Publications (1)

Publication Number	Publication Date
EP4521777A1 true EP4521777A1 (de)	2025-03-12

Family

ID=87971821

Family Applications (1)

Application Number	Title	Priority Date	Filing Date
EP23196022.0A Withdrawn EP4521777A1 (de)	2023-09-07	2023-09-07	Betrieb eines hörgeräts zur optimierung der tonausgabe aus einer lokalisierten medienquelle

Country Status (1)

Country	Link
EP (1)	EP4521777A1 (de)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number	Priority date	Publication date	Assignee	Title
EP2201793B1 (de)	2007-10-16	2011-03-09	Phonak AG	Hörsystem und verfahren zum betrieb eines hörsystems
US20110091056A1 (en) *	2009-06-24	2011-04-21	Makoto Nishizaki	Hearing aid
US20130223660A1 (en) *	2012-02-24	2013-08-29	Sverrir Olafsson	Selective acoustic enhancement of ambient sound
EP1858292B1 (de)	2006-05-16	2014-06-18	Phonak AG	Hörgerät sowie Verfahren zum Betrieb eines Hörgerätes
EP3036915B1 (de)	2013-08-20	2018-10-10	Widex A/S	Hörgerät mit einem adaptiven klassifikator
DE102019206743A1 (de)	2019-05-09	2020-11-12	Sonova Ag	Hörgeräte-System und Verfahren zur Verarbeitung von Audiosignalen
EP3884849A1 (de) *	2020-03-25	2021-09-29	Sonova AG	Selektives erfassen und speichern von sensordaten eines hörsystems
EP4057644A1 (de) *	2021-03-11	2022-09-14	Oticon A/s	Hörgerät zur bestimmung von sprechern von interesse

Legal Events

Date	Code	Title	Description
2025-02-07	PUAI	Public reference made under article 153(3) epc to a published international application that has entered the european phase	Free format text: ORIGINAL CODE: 0009012
2025-02-07	STAA	Information on the status of an ep patent application or granted ep patent	Free format text: STATUS: THE APPLICATION HAS BEEN PUBLISHED
2025-03-12	AK	Designated contracting states	Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC ME MK MT NL NO PL PT RO RS SE SI SK SM TR
2026-01-01	STAA	Information on the status of an ep patent application or granted ep patent	Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN
2026-02-04	18D	Application deemed to be withdrawn	Effective date: 20250913

Publication	Publication Date	Title
CN113812173B (zh)	2024-07-02	处理音频信号的听力装置系统及方法
US12047750B2 (en)	2024-07-23	Hearing device with user driven settings adjustment
US9307332B2 (en)	2016-04-05	Method for dynamic suppression of surrounding acoustic noise when listening to electrical inputs
US11457318B2 (en)	2022-09-27	Hearing device configured for audio classification comprising an active vent, and method of its operation
CN113891225A (zh)	2022-01-04	听力装置的算法参数的个人化
US12300248B2 (en)	2025-05-13	Audio signal processing for automatic transcription using ear-wearable device
KR20130133790A (ko)	2013-12-09	보청기를 가진 개인 통신 장치 및 이를 제공하기 위한 방법
EP3361753A1 (de)	2018-08-15	Hörgerät mit dynamischer mikrofondämpfung beim streaming
US11627398B2 (en)	2023-04-11	Hearing device for identifying a sequence of movement features, and method of its operation
CN108696813A (zh)	2018-10-23	用于运行听力设备的方法和听力设备
EP4521777A1 (de)	2025-03-12	Betrieb eines hörgeräts zur optimierung der tonausgabe aus einer lokalisierten medienquelle
EP4422212A1 (de)	2024-08-28	Auswahl des verarbeitungsmodus eines hörgeräts
EP4507327A1 (de)	2025-02-12	Betrieb eines hörgeräts zur klassifizierung eines audiosignals
EP4489440A1 (de)	2025-01-08	Betrieb eines hörgeräts zur darstellung einer anpassungsoption zur modifizierung eines audiosignals an einen benutzer
EP4496350A1 (de)	2025-01-22	Verfahren zur verarbeitung eines audiosignals in einem hörgerät
EP4415390A1 (de)	2024-08-14	Betrieb eines hörgeräts zur klassifizierung eines audiosignals zur berücksichtigung der benutzersicherheit
US12587796B2 (en)	2026-03-24	Method of optimizing audio processing in a hearing device
EP4178228B1 (de)	2025-08-27	Verfahren und computerprogramm zum betrieb eines hörsystems, hörsystem und computerlesbares medium
US11758341B2 (en)	2023-09-12	Coached fitting in the field
US20250310704A1 (en)	2025-10-02	Systems and Methods for Inducing Modulation of User’s Voice Level by a Hearing Device
EP4290886A1 (de)	2023-12-13	Erfassung von kontextstatistiken in hörgeräten
WO2024204100A1 (en)	2024-10-03	Information processing system, information processing method, and audio reproduction device