EP2321981A1 - Automatic performance optimization for perceptual devices - Google Patents

Automatic performance optimization for perceptual devices

Info

Publication number: EP2321981A1
Authority: EP; European Patent Office
Prior art keywords: perceptual; stimulus; user; parameter; signal
Prior art date: 2008-08-04
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.): Withdrawn

Application number

EP09791124A

Other languages

English (en)

French (fr)

Inventor

Bonny Banerjee

Lee Krause

Mark D. Skowronski

Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)

Cochlear Ltd

Original Assignee

AUDIGENCE Inc

Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)

2008-08-04

Filing date

2009-08-04

Publication date

2011-05-18

2008-08-04 Priority claimed from US12/185,394 (US8755533B2)

2009-08-04 Application filed by AUDIGENCE Inc

2011-05-18 Publication of EP2321981A1

Status: Withdrawn

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/16—Devices for psychotechnics; Testing reaction times ; Devices for evaluating the psychological state
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R25/00—Electric hearing aids
- H04R25/70—Adaptation of deaf aid to hearing loss, e.g. initial electronic fitting
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/12—Audiometering
- A61B5/121—Audiometering evaluating hearing capacity
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L2021/065—Aids for the handicapped in understanding
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0264—Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques

Definitions

This invention relates to systems and methods for optimizing performance of perceptual devices to adjust to a user's needs and, more particularly, to systems and methods for adjusting the parameters of digital hearing devices to customize the output from the hearing device to a user.
Perception is integral to intelligence. Perceptual ability is a prerequisite for any intelligent agent, living or artificial, to function satisfactorily in the real world. For an agent to experience an external environment with its perceptual organs (or sensors, in the case of artificial agents), it sometimes becomes necessary to augment the perceptual organs, the environment, or both.
human eyes are often augmented with a pair of prescription glasses.
the environment is augmented with devices, such as speakers and sub-woofers, placed in certain positions with respect to the agent.
the agent often has to wear specially designed eyeglasses, such as polarized glasses.
Perceptual devices include, without limitation, audio headphones, hearing aids, cochlear implants, low-light or "night-vision" goggles, tactile feedback devices, etc.
Due to personal preference, taste, and the raw perceptual ability of the organs, the quality of experience achieved by augmenting the agent's perceptual organs or environment with devices is often user-specific. As a result, the devices should be tuned to provide the optimum experience to each user. In the case of hearing devices (e.g., hearing aids, cochlear implants), such devices are endowed with parameters that tailor the device's performance to an individual's hearing needs.
Agents with simple perceptual systems e.g., robotic vacuum cleaners
agents with complex perceptual systems e.g., humans
a sophisticated perceptual device should also allow the user to tune the device to meet that user's particular perceptual needs.
Such complex devices often have a large set of parameters that can be tuned to a specific user's needs. Each parameter can be assigned one of many values, and determining the values of parameters for a particular user's optimum performance is difficult.
A user is required to be thoroughly tested with the device in order to be assigned the optimum parameter values. The number of tests required increases exponentially with the number of device parameters. Dedicating a significant amount of time to testing often is not a feasible option; accordingly, it may be advantageous to reduce the complexity of the problem.
the invention relates to a method for modifying a controllable stimulus generated by a perceptual device in communication with a human user, the method including: generating an input signal to the perceptual device, the perceptual device sending a stimulus to the human user, the stimulus defined at least in part by a parameter, the parameter having a value; receiving an output signal from the human user, the output signal based at least in part on a perception of the stimulus by the human user; determining a difference between the input signal and the output signal; constructing a perceptual model based at least in part on the difference; and suggesting a value for the parameter based at least in part on the perceptual model.
suggesting a value further includes utilizing a knowledge base.
the knowledge base includes at least one of declarative knowledge and procedural knowledge.
the method further includes generating a second input signal to the perceptual device based at least in part on the perceptual model.
the input signal is an audio signal, and/or the perceptual device is a digital audio device.
In another aspect, the invention relates to a system for modifying a controllable stimulus generated by a perceptual device in communication with a human user, the system including: a test set generator for generating a test set to the perceptual device, the perceptual device sending a stimulus to the human user, the stimulus defined at least in part by a parameter, the parameter including a value; a signal receiver for receiving an output signal from the human user, the output signal based at least in part on a perception of the stimulus by the human user; a perceptual model module for constructing a perceptual model based at least in part on the difference; and a parameter generator for suggesting a value for the parameter based at least in part on the perceptual model.
the system further includes a second signal generator for generating a second input signal to the perceptual device based at least in part on the perceptual model.
the system further includes a storage module for storing information used in the construction of the perceptual model.
the information stored in the storage module includes a knowledge base.
the system includes a rule extraction module for formulating a rule based at least in part on the perceptual model.
the parameter generator suggests a value for the parameter based at least in part on at least one of information obtained from the storage module and information obtained from the perceptual model module.
the signal generator includes the second signal generator.
the input signal is an audio signal.
the invention relates to an article of manufacture having computer-readable portions embodied thereon for modifying a controllable stimulus generated by a perceptual device in communication with a user, the article including: computer readable instructions for providing an input signal to the perceptual device, the perceptual device sending a stimulus to the human user, the stimulus defined at least in part by a parameter, the parameter having a value; computer readable instructions for receiving an output signal from the agent, the output signal based at least in part on a perception of the stimulus by the human user; computer readable instructions for determining a difference between the input signal and the output signal; computer readable instructions for constructing a perceptual model based at least in part on the difference; and computer readable instructions for suggesting a value for the parameter based at least in part on the perceptual model.
the article of manufacture further includes computer readable instructions for providing a second input signal to the perceptual device based at least in part on the perceptual model.
In another aspect, the invention relates to a method of tuning a perceptual device from a speech waveform, the method including the steps of: inputting a speech waveform from a user response to a stimulus; extracting at least one first acoustic feature from the waveform; segmenting at least one phoneme from the at least one first acoustic feature; extracting at least one second acoustic feature from the at least one phoneme; comparing the speech waveform to a stimulus; and determining at least one parameter value for the perceptual device.
Embodiments of the above aspect include the steps of: transmitting a stimulus to a user; and receiving a user response based at least in part on the stimulus.
the at least one first acoustic feature is extracted utilizing a frame-based procedure.
the at least one second acoustic feature is extracted utilizing a segment-based procedure.
the method includes the step of determining an error that is a difference between the speech waveform and the stimulus. In still other embodiments, the error is equal to
$$E = \frac{1}{N}\sum_{i=1}^{N} w_i \,\lVert f_s^i - f_r^i \rVert$$
where $w_i$ is the weight of the $i$-th feature, $f_s^i$ and $f_r^i$ are the $i$-th features of the stimulus and response respectively, $N$ is the number of features, and $\lVert\cdot\rVert$ denotes a distance measure.
the distance measure is a Mahalanobis distance.
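As a rough illustration of the weighted feature error with a Mahalanobis distance, the sketch below averages the weighted per-feature distances between stimulus and response. The function names, the diagonal-covariance simplification, and the averaging over features (taken from the "mean of the weighted distance" wording used later in the description) are assumptions made for this example, not details confirmed by the source.

```python
import math


def mahalanobis_diag(x, mean, var):
    """Mahalanobis distance between x and mean.

    Assumes a diagonal covariance matrix given as per-dimension variances,
    which is a simplification chosen for this sketch.
    """
    return math.sqrt(sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var)))


def perceptual_error(stimulus_feats, response_feats, variances, weights):
    """Mean of the weighted distances between stimulus and response features.

    Each entry of stimulus_feats / response_feats is one feature vector;
    weights[i] is the weight of the i-th feature.
    """
    total = 0.0
    for w, fs, fr, var in zip(weights, stimulus_feats, response_feats, variances):
        total += w * mahalanobis_diag(fr, fs, var)
    return total / len(weights)


if __name__ == "__main__":
    stimulus = [[0.0, 0.0], [1.0, 1.0]]
    response = [[3.0, 4.0], [1.0, 1.0]]
    variances = [[1.0, 1.0], [1.0, 1.0]]
    print(perceptual_error(stimulus, response, variances, weights=[1.0, 2.0]))
```

A full covariance matrix would require a matrix inverse; the diagonal form keeps the example dependency-free.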
In another aspect, the invention relates to an article of manufacture having computer-readable program portions embedded thereon for tuning a perceptual device from a speech waveform, the program portions including: instructions for inputting a speech waveform from a user response to a stimulus; instructions for extracting at least one first acoustic feature from the waveform; instructions for segmenting at least one phoneme from the at least one first acoustic feature; instructions for extracting at least one second acoustic feature from the at least one phoneme; instructions for comparing the speech waveform to a stimulus; and instructions for determining at least one parameter value for the perceptual device.
In another aspect, the invention relates to a system for tuning a perceptual device from a speech waveform, the system including: a receiver for receiving a speech waveform from a user response to a stimulus; a first extractor for extracting at least one first acoustic feature from the waveform; a first processor for segmenting at least one phoneme from the at least one first acoustic feature; a second extractor for extracting at least one second acoustic feature from the at least one phoneme; a second processor for comparing the speech waveform to a stimulus; and a third processor for determining at least one parameter value for the perceptual device.
Embodiments of the above aspect include a transmitter for transmitting a stimulus to a user.
the system includes a system processor that includes the first extractor, the first processor, the second extractor, the second processor, the third processor, and the fourth processor.
the invention comprises an article of manufacture having a computer-readable medium with computer-readable instructions embodied thereon for performing the methods described in the preceding paragraphs.
the functionality of a method of the present invention may be embedded on a computer-readable medium, such as, but not limited to, a floppy disk, a hard disk, an optical disk, a magnetic tape, a PROM, an EPROM, CD-ROM, DVD-ROM or downloaded from a server.
the functionality of the techniques may be embedded on the computer-readable medium in any number of computer-readable instructions, or languages such as, for example, FORTRAN, PASCAL, C, C++, Java, PERL, LISP, JavaScript, C#, Tcl, BASIC and assembly language.
the computer-readable instructions may, for example, be written in a script, macro, or functionally embedded in commercially available software (such as EXCEL or VISUAL BASIC).
FIG. 1 is a schematic diagram of a method for automatic hearing device parameter tuning from a speech waveform
FIG. 2 is a schematic diagram depicting the relationship between a perceptual device and an agent in accordance with one embodiment of the present invention
FIG. 3 is a schematic diagram of an apparatus in accordance with one embodiment of the present invention
FIG. 4 is the schematic diagram of FIG. 3 incorporating a knowledge base in accordance with one embodiment of the present invention
FIG. 5 is a flowchart of a testing procedure in accordance with one embodiment of the present invention.
FIG. 6 is a schematic diagram of a testing system in accordance with one embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION
FIG. 1 depicts one embodiment of a method 10 for automatic tuning of hearing device parameters directly from acoustic features of speech.
the speech waveform is spoken by a hearing device user (i.e., a patient or agent) 12 after being presented with an oral speech stimulus in a stimulus/response test paradigm, such as that depicted in U.S. Patent No.
the speech waveform may include, generally, one or more multi-phoneme sounds spoken by a user. These multi-phoneme sounds may form parts, or entire portions of, words, phrases, sentences, or other constructs of spoken language, and may be in any language or in a plurality of languages.
the speech waveform is input into an acoustic feature extraction process 14.
the acoustic features are input into a segmentation routine 16 which delimits phoneme boundaries in the speech waveform. Segmentation may be performed using a hidden Markov model (HMM), described in Rabiner, L., "A tutorial on Hidden Markov Models and Selected Applications in Speech Recognition," Proc. IEEE, vol. 77, no. 2, pp. 257-286, Feb. 1989, the disclosure of which is hereby incorporated by reference herein in its entirety.
ASR: automatic speech recognition
the HMM may be trained as phoneme models, bi-phone models, N-phone models, syllable models or word models.
a Viterbi path of the speech waveform through the HMM may be used for segmentation, so the phonemic representation of each state in the HMM is required.
Phonemic representation of each state may utilize hand-labeling phoneme boundaries for the HMM training data.
Specific states are assigned to specific phonemes (more than one state may be used to represent each phoneme for all types of HMMs).
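To make the state-to-phoneme idea concrete, the following sketch turns a per-frame state path (such as a Viterbi path) into phoneme segments with time boundaries. The function name, the state-to-phoneme mapping, and the 10 ms frame hop are hypothetical values chosen for illustration; the decoding that would produce the state path is not shown.

```python
from itertools import groupby


def segments_from_state_path(state_path, state_to_phoneme, hop_ms=10.0):
    """Collapse a per-frame HMM state path into (phoneme, start_ms, end_ms) segments.

    state_path: one HMM state index per analysis frame.
    state_to_phoneme: maps each state to the phoneme it represents
        (several states may map to the same phoneme).
    hop_ms: time between the starts of adjacent frames (assumed value).
    """
    labels = [state_to_phoneme[state] for state in state_path]
    segments = []
    frame = 0
    for phoneme, run in groupby(labels):
        length = len(list(run))
        segments.append((phoneme, frame * hop_ms, (frame + length) * hop_ms))
        frame += length
    return segments


if __name__ == "__main__":
    path = [0, 0, 1, 2, 2, 3]
    mapping = {0: "k", 1: "k", 2: "ae", 3: "t"}
    print(segments_from_state_path(path, mapping))
```

Note that this simple grouping merges two adjacent occurrences of the same phoneme into one segment; distinguishing them would need per-instance state labels.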
the frame-based acoustic feature extraction process may be a conventional ASR front end.
Human factor cepstral coefficients (HFCCs), a spectral flatness measure, a voice bar measure, and delta and delta-delta coefficients may be utilized as acoustic features.
HFCCs and spectral flatness measure are described in Skowronski, M. D. and J. G. Harris, "Exploiting independent filter bandwidth of human factor cepstral coefficients in automatic speech recognition," J. Acoustical Society of America, vol. 116, no. 3, pp. 1774-1780, Sept. 2004, and Skowronski, M. D. and J. G. Harris, "Applied principles of clear and Lombard speech for intelligibility enhancement in noisy environments," Speech Communication, vol. 48, no. 5, pp. 549-558, May 2006, the disclosures of which are hereby incorporated by reference herein in their entireties. Acoustic features may be measured for each analysis frame at predetermined durations.
Frame durations of 1 ms to about 50 ms, from about 10 ms to about 40 ms, and from about 15 ms to about 30 ms are acceptable. In certain embodiments, durations of about 20 ms are desirable. Uniform overlap between adjacent frames is also desirable. The overlap duration may be in the range of about 0 ms to a predetermined overlap duration. The predetermined overlap duration may be quantified as the frame duration minus ε, where ε is a small positive value greater than zero. A smaller ε yields more overlap between frames, and more frames per second. As ε goes to zero, frames per second goes to infinity. Overlap durations of about 10 ms between adjacent frames may be desirable in certain embodiments. Analysis frames and overlaps having other durations and times are also contemplated.
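The relationship between frame duration, overlap, and frame count can be sketched as follows. The function name and the 16 kHz sample rate in the usage example are assumptions; the 20 ms frame and 10 ms overlap defaults are the values the text calls desirable. The hop between frames is the frame duration minus the overlap, so the overlap must stay strictly below the frame duration.

```python
def frame_bounds(n_samples, sample_rate, frame_ms=20.0, overlap_ms=10.0):
    """Return (start, end) sample indices of uniformly overlapping analysis frames.

    Only complete frames are returned; trailing samples that do not fill
    a whole frame are ignored in this sketch.
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    overlap = int(round(sample_rate * overlap_ms / 1000.0))
    hop = frame_len - overlap  # corresponds to the small positive offset in the text
    if hop <= 0:
        raise ValueError("overlap must be shorter than the frame duration")
    return [(start, start + frame_len)
            for start in range(0, n_samples - frame_len + 1, hop)]


if __name__ == "__main__":
    frames = frame_bounds(n_samples=16000, sample_rate=16000)
    print(len(frames), frames[0], frames[1])
```

Shrinking the hop toward a single sample raises the frame rate toward the sample rate, which is the discrete counterpart of the limit described above.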
Segment-based acoustic features for each phoneme of the speech waveform are measured from segmented regions 18.
the features include HFCC calculated over a single window spanning the entire region of a phoneme (which may vary from about 5 ms to tens of seconds, depending on the agent's unconscious or purposeful exhalation length while forming a sound or sounds), a single voice bar measure, and/or a single spectral flatness measure, augmented with several other acoustic features.
Various other acoustic features may be appended to the set of segment-based features listed above that provide additional information targeting specific distinctive features of phonemes as described in Jakobson, R., C. G. M. Fant, and M. Halle, "Preliminaries to Speech Analysis: The Distinctive Features and Their
the difference between the two constitutes the error in perceiving that stimulus.
the distribution of acoustic features for each stimulus phoneme is then calculated.
This distribution may be represented by any distribution model, estimated from the same hand-labeled data used to train the segmentation HMM.
a Gaussian mixture model may be utilized to represent the stimulus features.
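An error $E$ is calculated as the mean of the weighted distance between the distribution of features for each stimulus and the features extracted from the corresponding response. That is,
$$E = \frac{1}{N}\sum_{i=1}^{N} w_i \,\lVert f_s^i - f_r^i \rVert$$
where $w_i$ is the weight of the $i$-th feature, $f_s^i$ and $f_r^i$ are the $i$-th features of the stimulus and response respectively, and $\lVert\cdot\rVert$ denotes a distance measure.
In the optimization process 20, information corresponding to a test stimulus and/or a previous device parameter set 22 is utilized to optimize or tune a perceptual device.
The optimization process 20, in turn, generates a new device parameter set 24 that is used to improve the performance of the perceptual device.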
Various embodiments of the methods and systems disclosed herein are used to "tune" a perceptual device.
the term "optimization" is sometimes used to describe the process of tuning, which typically includes modifying parameters of a perceptual device.
Certain embodiments of the disclosed methods and systems automatically tune at least one device parameter based on a user's raw perceptual ability to improve the user's perception utilizing different tuning algorithms operating separately or in tandem to allow the device to be tuned quickly.
the device parameters can be user-specific or user-independent.
a model is created to describe a user's perception (i.e., the perceptual model). This model is incremental and is specific to a user and his device.
one or more algorithms are applied to the model, resulting in predictions (along with confidence and explanation) of the optimum parameter values for the user.
the user is iteratively tested with the values having the highest confidence, and the model is further updated.
a set of rules capturing user-independent information is used to tune certain parameters.
the number of parameters governing the operation of a given perceptual device may be large.
the amount of data required to faithfully model a user's perceptual strengths and weaknesses using that device increases exponentially with the number of device parameters; this limits the ability to reach optimal settings for the device in a reasonable time.
a number of algorithms are used with simple independent assumptions regarding the model. Using these assumptions, each algorithm studies the model and makes predictions with a confidence. The most confident prediction is chosen at any point of time. This architecture helps reduce the complexity of the solution that otherwise would have been enormous.
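The "several simple algorithms, pick the most confident prediction" architecture might be sketched as below. The model representation (a list of parameter-value and error pairs), the two toy algorithms, and their confidence formulas are all hypothetical; the source does not specify how confidence is computed.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

# One observation: (parameter value, measured perceptual error in [0, 1]).
Observation = Tuple[float, float]


@dataclass
class Prediction:
    value: float
    confidence: float
    explanation: str


def lowest_error(model: Sequence[Observation]) -> Prediction:
    """Assumes the best tested value is the best value overall."""
    value, error = min(model, key=lambda obs: obs[1])
    return Prediction(value, 1.0 - error, "lowest error observed so far")


def midpoint_of_best_two(model: Sequence[Observation]) -> Prediction:
    """Assumes the optimum lies between the two best tested values."""
    best = sorted(model, key=lambda obs: obs[1])[:2]
    value = sum(v for v, _ in best) / len(best)
    return Prediction(value, 0.5, "midpoint of the two best observations")


def most_confident(model: Sequence[Observation],
                   algorithms: Sequence[Callable[[Sequence[Observation]], Prediction]]
                   ) -> Prediction:
    """Run every algorithm on the model and keep the most confident prediction."""
    predictions = [algorithm(model) for algorithm in algorithms]
    return max(predictions, key=lambda p: p.confidence)


if __name__ == "__main__":
    model = [(10.0, 0.4), (20.0, 0.1), (30.0, 0.3)]
    print(most_confident(model, [lowest_error, midpoint_of_best_two]))
```

Each algorithm only needs to agree on the prediction interface, so new assumptions about the model can be added without changing the selection step.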
lookup tables or other procedures may be utilized to perform the optimization, in much the same way as the algorithms described above.
a user may be considered a black box with perceptual organs that can accept a signal as input and produce a signal as output in accordance with certain instructions.
This method is useful for applications where the black box is too complex to be modeled non-stochastically, such as the human brain.
the instructions can be conveyed by different means. For example, a human might be told instructions in a natural language; an artificial agent might be programmed with the instructions.
Raw perception of a user is judged by some criteria that measure the actual output signal against the output signal expected from the application of the given set of instructions to the input signal. For example, if the input signals are spoken phonemes, the black box is a human brain with ears as the perceptual organs, and the instruction is to reproduce the input phonemes (as speech or in writing), the perception might be measured by computing the difference between the input and output phonemes. In another example, if the input signal is a set of letters written on a piece of paper, the black box is a human brain with eyes as the perceptual organs, and the instruction is to reproduce the letters (as speech or in writing), the perception might be measured by computing the difference between the input and output letters.
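One possible way to compute a "difference between the input and output phonemes" is an edit distance over phoneme sequences. The source does not name a specific metric, so the Levenshtein distance used here, and the function name, are assumptions chosen only to illustrate the idea.

```python
def phoneme_edit_distance(stimulus, response):
    """Levenshtein distance between two phoneme sequences.

    Counts the minimum number of substitutions, insertions, and deletions
    needed to turn the stimulus sequence into the response sequence.
    """
    prev = list(range(len(response) + 1))
    for i, s in enumerate(stimulus, start=1):
        curr = [i]
        for j, r in enumerate(response, start=1):
            cost = 0 if s == r else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(phoneme_edit_distance(["k", "ae", "t"], ["k", "ae", "p"]))
```

The same function works for the written-letter example, since it only compares sequence elements for equality.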
FIG. 2 depicts an exemplary relationship between the perceptual device D and the agent A. Given a user or an agent A, one or more devices D, an input signal $S_{inp}$, and a corresponding output signal $S_{out}$ that the agent has produced obeying certain instructions
$S_{int}$ is the intermediate signal or stimulus emanated from the device(s) and perceived by the agent. In the case of a digital audio device, the stimulus is the sound actually heard by the user.
the intermediate signal cannot be measured in the same way that $S_{inp}$ and $S_{out}$ are
the function A is characterized by the device parameters.
Embodiments of the present invention (1) statistically model the perceptual errors (i.e., some metric applied to $S_{inp} - S_{out}$) for an agent with respect to the device parameters, and (2) study this perceptual
a method for automatically tuning the parameters of at least one perceptual device in a user-specific way.
the agent or its environment is fitted with a device(s) whose parameters are preset, for example, to factory default values.
the proposed method may be implemented as a computer program that tests the raw perception of the agent.
FIG. 3 depicts one such implementation of the program 100. Based on the results of the test, the program 100 may suggest new parameter values along with an explanation of why such values are chosen and the confidence of the suggested set of values 102.
the devices 104 are reset with the parameter values with the highest confidence or best explanation.
a human tester for example, an audiologist fine tuning a digital hearing aid or cochlear implant (CI)
the purpose of testing is to determine the raw perceptual ability, independent of context and background knowledge, of the agent 108.
a series of input signals is presented to the agent 108 whose environment is fitted with at least one perceptual device 104 set to certain parameter values.
the input signal may be of sufficient length, duration, complexity, etc. to exceed a single phoneme sound. Such a complex signal will ultimately elicit a responsive speech waveform from the agent.
After each signal is presented, the agent 108 is given enough time to output a signal in response to its perceived signal, in accordance with instructions that the agent 108 has previously received.
the output signal 110 corresponding to each input signal is recorded along with the time required for response.
an ASR 150 may be used to process the speech waveform.
the speech waveform is input into an acoustic feature extractor 152.
the acoustic features are input into a processor for segmentation 154 which delimits phoneme boundaries in the speech waveform.
a second extractor 156 then measures segment-based acoustic features for each phoneme of the speech waveform. The resulting features are then used in the following processes to compare the stimulus to the speech waveform and to optimize the perceptual device.
a metric captures the difference between the input signal and the agent's response in a meaningful way such that a model 112 of the agent's perceptual ability can be incrementally constructed using that metric and the device parameters.
the test set creator or generator 114 modifies the parameters based on information received during the test.
the next set of input signals are chosen on which the agent 108 should be tested, based on its strengths and weaknesses as evident from the model 112.
a new test starts with the perceptual devices 104 set to new parameter values, again, based on the application of the algorithm to the information.
An increase in response time indicates that either the agent 108 is having difficulty in perception or the agent 108 is getting fatigued. In the latter case, the agent 108, tester, or program 100 may opt to rest before further testing.
the model 112 describes the perceptual ability of the agent 108 with respect to the perceptual devices 104. Given an accurate model, one can predict the parameter values best suited for an agent 108. However, the model 112 is never complete until the agent 108 has been tested with all combinations of values for the parameters.
FIG. 4 presents another embodiment of the present invention incorporating a knowledge base into the computer program 100 of FIG. 3.
the knowledge base (KB) of the computer program 100 stores knowledge in two forms - declarative 120 and procedural 122.
Declarative knowledge 120 is stored as a set of statements useful for predicting a new set of parameter values 132 based on the model of the agent's perceptual ability.
declarative knowledge would include a situation where the agent 108 is a human with hearing loss, the device 104 is a CI, and his model 112 shows that he is weak in hearing the middle range of the frequency spectrum.
the declarative knowledge 120 would include a statement that more CI channels should be associated with frequencies in that middle range than the higher or lower frequency ranges.
Declarative knowledge can be readily applied, wherever appropriate, to make an inference. Often a user's previously tested parameters and device parameters 134 may be utilized with the declarative knowledge.
Procedural knowledge 122 is stored as procedures or algorithms that study the perceptual model 112 in order to make predictions for new parameter values.
Each item of procedural knowledge is an independent algorithm 124 that studies the model 112 in a way which might involve certain assumptions about the model 112. These items of procedural knowledge may also utilize declarative knowledge 120 to study the model 112. Upon studying a model 112 and comparing it with the stored models of previously tested similar agents using similar devices, the algorithms may derive new rules 126 for storage as items of declarative knowledge 128.
An example of procedural knowledge would include a situation where the agent is a human with hearing loss and the device is a CI. In this case, his model might be studied by an algorithm assuming that there exists a region in the model that represents the perceptual error minima of the agent. Hence, the algorithm will study the model hoping to find that minimum region and will predict appropriate parameter values for that minimum.
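A minimal version of such a minimum-seeking algorithm could look like the sketch below. The model representation (a mapping from a tested parameter value to the errors observed at that value) and the function name are hypothetical; a practical algorithm would likely interpolate between tested values rather than choose only among them.

```python
def predict_minimum(observations):
    """Return (best_value, mean_error) for the tested value with the lowest mean error.

    observations: dict mapping a tested parameter value to a non-empty
    list of perceptual errors measured at that value.
    """
    means = {value: sum(errors) / len(errors)
             for value, errors in observations.items()}
    best = min(means, key=means.get)
    return best, means[best]


if __name__ == "__main__":
    tested = {10: [0.4, 0.6], 20: [0.2, 0.2], 30: [0.5]}
    print(predict_minimum(tested))
```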
the number of adjustable parameters can be large.
the number of tests required to tune these parameters may even increase exponentially with the number of device parameters.
One of the challenges faced by the proposed method is to reduce the number of tests so that the time required for tuning the parameter values can be reduced to a practical time period.
One way to make the process more efficient is to utilize procedural knowledge 122.
a number of procedures, lookup tables, or algorithms 124 with very different assumptions are contemporaneously applied to the model 112. After application, each procedure provides its prediction of the parameters along with a confidence value for the prediction and an explanation of how the prediction was reached. These explanations are evaluated, either by a supervisory program or a tester, and that prediction that provides the best explanation is selected 130.
FIG. 5 depicts an exemplary testing procedure 200 in accordance with one embodiment of the present invention.
a user fitted with a CI is tested in the presence of an audiologist, who is monitoring the test.
the program begins by generating an input signal 202.
This input signal directs the CI to deliver a stimulus (e.g., a phoneme sound) to the user.
the stimulus parameter value is accessed 204 by the program. This value may be either a factory default setting (usually when the device is first implanted), a previously stored suggested value, or a previously stored override value. The latter two values are described in more detail below.
a stimulus based on the parameter is then delivered to the user 206.
the program waits for an output signal from the user 208.
This received output signal may take any form that is usable by the program. For example, the user may repeat the sound into a microphone, spell the sound in a keyboard, or press a button or select an icon that corresponds to their perception of the sound.
the program notes the time T when the output signal is received.
the elapsed time is compared to a predetermined value 210. If the time exceeds this value, the program determines that the user is fatigued 212, and the program ends 214. If the elapsed time does not exceed the threshold, however, the output signal and stimulus are compared 216 to begin analysis of the results. The difference between the output signal from the user and the stimulus sent from the CI to the user are used to construct the perceptual model 218. Next, the program suggests a value for the next parameter to be tested 220.
The audiologist may optionally decide whether or not to utilize the suggested value 222 for the next test procedure, based on his or her knowledge base or other factors that may not be considered by the program. If the audiologist overrides the suggested value with a different value, this override value is stored 224 to be used for the next test. The program then determines if the test is complete 226, and may terminate the test 228 if required or desired by the user.
The test may be determined to be complete for a number of reasons. For example, the user or audiologist may be given the option at this point (or at any point during the test) to terminate testing.
The program may determine that during one or more iterations of the test, the user's response time, as measured in step 210, increased such that fatigue may be a factor, warranting termination of the testing. Additionally, the program may determine that, based on information regarding the tested device or the program itself, all iterations or options have been tested. In such a case, the program may determine that no further parameter adjustment would materially improve the operation of the device or the program. Also, the program may interpret inconsistent information at this point as indicative of an error condition that requires termination. Other procedures for terminating testing are known to the art.
Returning to step 222, if the suggested value is accepted, this value is then stored for later use in a subsequent test 230.
The program may also be operated without the assistance of an audiologist. In this case, acceptance of the suggested value would be the default response.
With few modifications, the program could allow the user to self-tune the device remotely, potentially over an internet connection or with a stand-alone tuning device.
A determination to continue the test 232 (with considerations similar to those described for step 226) may be made prior to ending the test 234.
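The flow of FIG. 5 can be sketched as a loop. This is a minimal illustration under stated assumptions, not the implementation described in the application: `run_test`, `deliver`, `suggest`, `override`, the injectable `clock`, and the 10-second fatigue threshold are hypothetical. The step numbers in the comments refer to the reference numerals in the text.

```python
import time
from typing import Callable, List, Optional, Tuple

Record = Tuple[str, str, float]  # (stimulus, user response, parameter value)


def run_test(
    stimuli: List[str],
    initial_value: float,
    deliver: Callable[[str, float], str],
    suggest: Callable[[List[Record]], float],
    override: Callable[[float], Optional[float]] = lambda suggested: None,
    fatigue_seconds: float = 10.0,          # hypothetical threshold
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[Record], float, str]:
    """Run one test and return (model records, next value, status)."""
    value = initial_value                    # step 204: stored parameter value
    model: List[Record] = []                 # step 218: data behind the model
    for stimulus in stimuli:                 # step 202: generate input signal
        start = clock()
        response = deliver(stimulus, value)  # steps 206-208: stimulus, then wait
        elapsed = clock() - start
        if elapsed > fatigue_seconds:        # steps 210-214: fatigue check
            return model, value, "fatigued"
        model.append((stimulus, response, value))   # steps 216-218
        suggested = suggest(model)           # step 220: suggest next value
        chosen = override(suggested)         # step 222: optional tester decision
        # step 224 stores an override; step 230 stores the accepted suggestion
        value = suggested if chosen is None else chosen
    return model, value, "complete"
```

With the default `override`, the suggestion is always accepted, which corresponds to operation without an audiologist. Passing a function that returns a number models a tester who replaces the suggestion.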
The optimization methods of the current invention may be utilized with virtually any metric that may be used to test people who use digital hearing devices.
One such metric is disclosed in, for example, U.S. Patent No. 7,206,416 to Krause et al., the entire disclosure of which is hereby incorporated by reference herein, and will be discussed herein as one exemplary application of the optimization methods.
A typical testing system 300 is depicted in FIG. 6.
The testing procedure tests the raw hearing ability, independent of context and background knowledge, of a hearing-impaired person.
An input signal 302 is generated and sent to a digital audio device, which, in this example, is a CI 304.
The CI will deliver an intermediate signal or stimulus 306, associated with one or more parameters, to a user 308.
The parameters may be factory-default settings.
Alternatively, the parameters may be otherwise defined, as described below. In either case, the test procedure utilizes the stored parameter values to define the stimulus (i.e., the sound).
After a signal is presented, the user is given enough time to make a sound signal (or speak a string of sounds sufficient to form a speech waveform) representing what he heard.
The output signal corresponding to each input signal is recorded along with the response time. If the response time exceeds a predetermined setting, the system determines that the person may be getting fatigued and will stop the test.
The output signal 310 may be a sound repeated by the user 308 into a microphone 312.
The resulting analog signal 314 is converted by an analog/digital converter 316 into a digital signal 318 delivered to the processor 320.
Alternatively, the user 308 may type a textual representation of the sound heard into a keyboard 322.
The output signal 310 is stored and compared to the immediately preceding stimulus.
An algorithm, lookup table, or other procedure determines the user's strengths and weaknesses and stores this information in an internal perceptual model. Additionally, the algorithm suggests a value for the next test parameter, effectively choosing the next input sound signal to be presented. This new value is delivered via the output module 324. If an audiologist is administering the test, the audiologist may choose to ignore the suggested value in favor of their own value. In such a case, the tester's value would be entered into the override module 326.
Whether the suggested value or the tester's override value is utilized, this value is stored in a memory for later use (likely in the next test). These tests may be repeated with different sounds, words, sentences, or other stimuli until the CI performance is optimized or otherwise modified, the user fatigues, etc. In one embodiment, the test terminates when the user's strengths and weaknesses with respect to the current CI device parameters are comprehensively determined. A new test starts with the CI device set to new parameter values.
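How the stored value might carry over to the next test can be sketched as below. The `ParameterStore` class and its method names are hypothetical, and the precedence used here (override first, then suggestion, then factory default) is an assumption; the text says only that one of these values is stored and used later.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParameterStore:
    """Holds the value sources named in the text for one parameter."""
    factory_default: float
    suggested: Optional[float] = None   # value proposed by the program
    override: Optional[float] = None    # value entered by the tester, if any

    def record(self, suggested: float, override: Optional[float] = None) -> None:
        """Store the outcome of one test for use in the next one."""
        self.suggested = suggested
        self.override = override

    def next_value(self) -> float:
        """Assumed precedence: override, then suggestion, then default."""
        if self.override is not None:
            return self.override
        if self.suggested is not None:
            return self.suggested
        return self.factory_default


# Illustrative values only.
store = ParameterStore(factory_default=900.0)
first = store.next_value()            # no tests yet: factory default
store.record(suggested=1200.0)
second = store.next_value()           # suggestion accepted
store.record(suggested=1500.0, override=1000.0)
third = store.next_value()            # tester overrode the suggestion
```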
The disclosed system utilizes any number of algorithms that may operate substantially or completely in parallel to suggest parameter values in real time.
Exemplary algorithms include (1) computing a reduced set of phonemes (input sound signals) for testing a person based on his strengths and weaknesses from past tests and using the features of the phonemes, thereby reducing testing time considerably; (2) computing a measure of performance for a person from his tests involving features of phonemes and their weights; (3) classifying a person based on their strengths and weaknesses as obtained from previous tests; and (4) predicting the parameter setting of a CI device to achieve optimum hearing for a person using his perceptual model and similar people's optimal device settings.
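Running several suggestion algorithms side by side can be sketched with Python's standard `concurrent.futures` module. The two algorithms below (`mean_of_best`, `lowest_error`), the history record layout, and all numeric values are hypothetical stand-ins; they are not the four exemplary algorithms listed above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

History = List[Dict[str, float]]  # each record: {"value": ..., "error": ...}


def mean_of_best(history: History) -> float:
    """Hypothetical rule: average the values of the two lowest-error tests."""
    best = sorted(history, key=lambda r: r["error"])[:2]
    return sum(r["value"] for r in best) / len(best)


def lowest_error(history: History) -> float:
    """Hypothetical rule: reuse the value that gave the lowest error."""
    return min(history, key=lambda r: r["error"])["value"]


def suggest_in_parallel(
    algorithms: Dict[str, Callable[[History], float]],
    history: History,
) -> Dict[str, float]:
    """Submit every algorithm to a thread pool and collect its suggestion."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, history)
                   for name, fn in algorithms.items()}
        return {name: future.result() for name, future in futures.items()}


# Illustrative history of past tests.
history = [
    {"value": 900.0, "error": 0.40},
    {"value": 1200.0, "error": 0.10},
    {"value": 1800.0, "error": 0.20},
]
suggestions = suggest_in_parallel(
    {"mean_of_best": mean_of_best, "lowest_error": lowest_error},
    history,
)
```

Threads are used here only for brevity. A process pool or separate machines would serve the same purpose if the algorithms were computationally heavy.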
Predetermined parameter values may be selected from a lookup table containing parameter value combinations keyed to a person's known or predicted strengths and weaknesses, as determined from test results.
In human language, a phoneme is the smallest unit of distinguishable speech. Phonemes may be utilized in testing. For example, the input signal may be chosen from a set of phonemes from the Iowa Medial Consonant Recognition Test. Both consonant phonemes and vowel phonemes may be used during testing, though vowel phonemes may have certain disadvantages in testing: they are too easy to perceive and typically do not reveal much about the nature of hearing loss. It is known that each phoneme is characterized by the presence, absence, or irrelevance of a set of nine features: Vocalic, Consonantal, Compact, Grave, Flat, Nasal, Tense, Continuant, and Strident.
A person's performance in a test can be measured by the number of input sound signals (i.e., phonemes, although actual words, phrases, sentences, or other language constructs in any language may also be used) he fails to perceive.
This type of basic testing may fail to capture the person's strengths and weaknesses because many phonemes share similar features. For example, the phonemes /f/ and /p/ differ in only one of the nine features, namely Continuant.
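The feature comparison can be illustrated with a small sketch. The encoding (+1 present, -1 absent, 0 irrelevant) and the function name `differing_features` are assumptions, and the feature values below are placeholders chosen only to reproduce the statement that /f/ and /p/ differ in Continuant alone; they are not taken from the source or from a phonological reference.

```python
from typing import Dict, List

FEATURES = ["Vocalic", "Consonantal", "Compact", "Grave", "Flat",
            "Nasal", "Tense", "Continuant", "Strident"]

# +1 = present, -1 = absent, 0 = irrelevant (assumed encoding).
# Placeholder values for illustration, not authoritative feature data.
PHONEMES: Dict[str, List[int]] = {
    "f": [-1, +1, -1, +1, 0, -1, +1, +1, -1],
    "p": [-1, +1, -1, +1, 0, -1, +1, -1, -1],
}


def differing_features(a: str, b: str) -> List[str]:
    """List the features on which two phonemes disagree."""
    return [name
            for name, x, y in zip(FEATURES, PHONEMES[a], PHONEMES[b])
            if x != y]


difference = differing_features("f", "p")
```

A listener who confuses the two phonemes would therefore be charged an error in a single feature, which is more informative than counting one missed phoneme.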
A person's performance in a test is measured by the weighted mean of the feature errors, given by:
$$\xi = \frac{\sum_{i=1}^{9} w_i \, n_i}{\sum_{i=1}^{9} w_i} \qquad \text{(i)}$$
where $w_i$ is the weight and $n_i$ is the number of errors in the $i$th feature of the hierarchy.
The weights of the features are experimentally ascertained to be {0.151785714, 0.151785714, 0.142857143, 0.098214286, 0, 0.142857143, 0.125, 0.125, 0.0625}.
Other weights may be utilized as the testing procedures evolve for a given user or group of users.
The actual weights utilized in experimentation may take other values and may depend on testing, the language being used, and other variables. Acceptable results may be obtained utilizing other weightings.
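As a worked example, the weighted error can be computed with the weights quoted above, reading "weighted mean" in the usual sense of a weight-normalized sum. Pairing the weights with the nine features in their listed order is an assumption, as are the function name `weighted_error` and the per-feature error counts.

```python
from typing import List

FEATURES = ["Vocalic", "Consonantal", "Compact", "Grave", "Flat",
            "Nasal", "Tense", "Continuant", "Strident"]

# Weights as quoted in the text; pairing them with FEATURES in order
# is an assumption.
WEIGHTS = [0.151785714, 0.151785714, 0.142857143, 0.098214286, 0.0,
           0.142857143, 0.125, 0.125, 0.0625]


def weighted_error(errors: List[int]) -> float:
    """Weighted mean of per-feature error counts."""
    if len(errors) != len(WEIGHTS):
        raise ValueError("expected one error count per feature")
    total = sum(w * n for w, n in zip(WEIGHTS, errors))
    return total / sum(WEIGHTS)


# Hypothetical error counts: 1 Compact, 3 Flat, 2 Continuant.
errors = [0, 0, 1, 0, 3, 0, 0, 2, 0]
score = weighted_error(errors)
```

The quoted weights sum to 1 to within rounding, so the normalization changes the result very little. The three Flat errors contribute nothing because that feature has weight 0.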
This manner of testing provides a weighted error representing the user's performance with a set of parameter values. If a person is tested with all possible combinations of parameter values, the result can be represented as a weighted error surface in a high-dimensional space, where the dimension is one more than the number of parameters being considered. In this error surface, there exists a global minimum and one or more local minima. In general, while the person's performance is good at each of these local minima, his performance is the best at the global minimum.
One task of the computer program is to predict the location of the global minimum, or at least a good local minimum, within a short period of testing.
The perceptual model may be represented in a number of ways, such as a surface model, a set of rules, or a set of mathematical/logical equations and inequalities.
In a surface model, the presence of many parameters may produce a very high-dimensional error surface.
The minimum amount of data required to model such a surface increases exponentially with the number of dimensions, leading to the so-called “curse of dimensionality.” There is therefore an advantage to reducing the number of parameters.
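A short calculation shows the scale of the problem. Assuming, purely for illustration, that each parameter is tested at 5 levels, a full grid needs 5 raised to the number of parameters test points. The function name `grid_points` and the parameter counts are hypothetical.

```python
def grid_points(levels_per_parameter: int, parameters: int) -> int:
    """Number of test points in a full grid over all parameters."""
    return levels_per_parameter ** parameters


# Hypothetical comparison: 3 parameters versus 10, at 5 levels each.
counts = {d: grid_points(5, d) for d in (3, 10)}
# counts == {3: 125, 10: 9765625}
```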
The large number of parameters is reduced to three: “stimulation rate,” “Q-value,” and “map number.” The stimulation rate and Q-value can dramatically change a person's hearing ability.
The map number is an integer that labels the map, which includes virtually all device parameters along with a frequency allocation table. Changing any parameter value or the frequency allocation to the different channels would constitute a new map with a new map number.
The error surface is thereby reduced to a four-dimensional space (three parameters plus the weighted error), considerably reducing the minimum amount of data required to model the surface.
Each set of three parameter values constitutes a point. Only points at which a person has been tested, called sampled points, have a corresponding weighted error.
The error surface is made up of sampled points. Adjusting parameters to reduce errors in one feature may lead to an increase in error in another feature. In order to adjust parameters such that the overall performance is enhanced, one should strive to reduce the total weighted error as described by equation (i).
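The sampled points can be held in a simple mapping from (stimulation rate, Q-value, map number) to weighted error. The sketch below, with a hypothetical `best_sampled_point` function and made-up values, only picks the best point tested so far. Predicting the error at untested points, which the text identifies as the program's task, would need a model on top of this.

```python
from typing import Dict, Tuple

Point = Tuple[float, float, int]  # (stimulation rate, Q-value, map number)


def best_sampled_point(samples: Dict[Point, float]) -> Tuple[Point, float]:
    """Return the sampled point with the lowest weighted error."""
    if not samples:
        raise ValueError("no sampled points available")
    point = min(samples, key=samples.get)
    return point, samples[point]


# Illustrative sampled points and weighted errors.
samples = {
    (900.0, 20.0, 1): 0.42,
    (1200.0, 20.0, 1): 0.18,
    (1200.0, 30.0, 2): 0.25,
}
point, error = best_sampled_point(samples)
```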
The software may be configured to run on any computer or workstation, such as a PC or PC-compatible machine, an Apple Macintosh, a Sun workstation, etc.
Any device can be used as long as it is able to perform all of the functions and capabilities described herein.
The particular type of computer or workstation is not central to the invention, nor is the configuration, location, or design of the database, which may be flat-file, relational, or object-oriented, and may include one or more physical and/or logical components.
The servers may include a network interface continuously connected to the network, and thus support numerous geographically dispersed users and applications.
The network interface and the other internal components of the servers intercommunicate over a main bi-directional bus.
The main sequence of instructions effectuating the functions of the invention and facilitating interaction among clients, servers, and a network can reside on a mass-storage device (such as a hard disk or optical storage unit) as well as in a main system memory during operation. Execution of these instructions and effectuation of the functions of the invention are accomplished by a central-processing unit (“CPU”).
A group of functional modules that control the operation of the CPU and effectuate the operations of the invention as described above can be located in system memory (on the server or on a separate machine, as desired).
An operating system directs the execution of low-level, basic system functions such as memory allocation, file management, and operation of mass storage devices.
A control block, implemented as a series of stored instructions, responds to client-originated access requests by retrieving the user-specific profile and applying the one or more rules as described above.


EP09791124A 2008-08-04 2009-08-04 Automatic performance optimization for perceptual devices Withdrawn EP2321981A1 (de)

Applications Claiming Priority (3)

Application Number	Priority Date	Filing Date	Title
US12/185,394 US8755533B2 (en)	2008-08-04	2008-08-04	Automatic performance optimization for perceptual devices
US16445309P	2009-03-29	2009-03-29
PCT/US2009/052633 WO2010017156A1 (en)	2008-08-04	2009-08-04	Automatic performance optimization for perceptual devices

Publications (1)

Publication Number	Publication Date
EP2321981A1 (de)	2011-05-18

Family

ID=41401791

Family Applications (1)

Application Number	Title	Priority Date	Filing Date
EP09791124A Withdrawn EP2321981A1 (de)	2008-08-04	2009-08-04	Automatic performance optimization for perceptual devices

Country Status (3)

Country	Link
EP (1)	EP2321981A1 (de)
AU (1)	AU2009279764A1 (de)
WO (1)	WO2010017156A1 (de)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number	Priority date	Publication date	Assignee	Title
CN104956689B (zh)	2012-11-30	2017-07-04	Dts（英属维尔京群岛）有限公司	Method and apparatus for personalized audio virtualization
WO2014164361A1 (en)	2013-03-13	2014-10-09	Dts Llc	System and methods for processing stereo audio content
EP2924676A1 (de)	2014-03-25	2015-09-30	Oticon A/s	Hearing-based adaptive training systems
US10198964B2 (en)	2016-07-11	2019-02-05	Cochlear Limited	Individualized rehabilitation training of a hearing prosthesis recipient
EP3721429A2 (de) *	2017-12-07	2020-10-14	HED Technologies Sarl	Voice-aware audio system and method

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number	Priority date	Publication date	Assignee	Title
DE2349626C2 (de) *	1973-10-03	1984-06-07	Robert Bosch Gmbh, 7000 Stuttgart	Speech audiometer with a sound reproduction device
US5729658A (en) *	1994-06-17	1998-03-17	Massachusetts Eye And Ear Infirmary	Evaluating intelligibility of speech reproduction and transmission across multiple listening conditions
US6004015A (en) *	1994-11-24	1999-12-21	Matsushita Electric Industrial Co., Ltd.	Optimization adjusting method and optimization adjusting apparatus
AU1630799A (en) *	1997-12-12	1999-07-05	Knowles Electronics, Inc.	Automatic system for optimizing hearing aid adjustments
AU2004300976B2 (en) *	2003-08-01	2009-02-19	Audigence, Inc.	Speech-based optimization of digital hearing devices
US20060045281A1 (en) *	2004-08-27	2006-03-02	Motorola, Inc.	Parameter adjustment in audio devices

2009
- 2009-08-04 EP: application EP09791124A, publication EP2321981A1 (de), withdrawn
- 2009-08-04 AU: application AU2009279764A, publication AU2009279764A1 (en), abandoned
- 2009-08-04 WO: application PCT/US2009/052633, publication WO2010017156A1 (en), ceased

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO2010017156A1 *

Also Published As

Publication number	Publication date
WO2010017156A1 (en)	2010-02-11
AU2009279764A1 (en)	2010-02-11

Legal Events

Date	Code	Title	Description
2011-04-15	PUAI	Public reference made under article 153(3) epc to a published international application that has entered the european phase	Free format text: ORIGINAL CODE: 0009012
2011-05-18	17P	Request for examination filed	Effective date: 20110222
2011-05-18	AK	Designated contracting states	Kind code of ref document: A1 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO SE SI SK SM TR
2011-05-18	AX	Request for extension of the european patent	Extension state: AL BA RS
2011-11-09	DAX	Request for extension of the european patent (deleted)
2012-02-22	17Q	First examination report despatched	Effective date: 20120118
2012-10-17	RAP1	Party data changed (applicant data changed or rights of an application transferred)	Owner name: COCHLEAR LIMITED
2015-07-03	STAA	Information on the status of an ep patent application or granted ep patent	Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN
2015-08-05	18D	Application deemed to be withdrawn	Effective date: 20150303
