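WO2023071519A1 - Audio information processing method, electronic device, system, product, and medium - Google Patents
Audio information processing method, electronic device, system, product, and medium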
- Publication number
- WO2023071519A1 (PCT/CN2022/116528)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sound
- audio information
- alarm
- alarm sound
- position information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/165—Management of the audio stream, e.g. setting of volume, audio stream path
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K11/00—Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
- G10K11/18—Methods or devices for transmitting, conducting or directing sound
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/60—Substation equipment, e.g. for use by subscribers including speech amplifiers
- H04M1/6033—Substation equipment, e.g. for use by subscribers including speech amplifiers for providing handsfree use or a loudspeaker mode in telephone sets
- H04M1/6041—Portable telephones adapted for handsfree use
- H04M1/6058—Portable telephones adapted for handsfree use involving the use of a headset accessory device connected to the portable telephone
- H04M1/6066—Portable telephones adapted for handsfree use involving the use of a headset accessory device connected to the portable telephone including a wireless connection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72403—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
- H04M1/72409—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality by interfacing with external accessories
- H04M1/72412—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality by interfacing with external accessories using two-way short-range wireless interfaces
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72448—User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions
- H04M1/72454—User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions according to context-related or environment-related conditions
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/10—Earpieces; Attachments therefor ; Earphones; Monophonic headphones
- H04R1/1083—Reduction of ambient noise
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers
- H04R3/005—Circuits for transducers for combining the signals of two or more microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R5/00—Stereophonic arrangements
- H04R5/033—Headphones for stereophonic communication
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R5/00—Stereophonic arrangements
- H04R5/04—Circuit arrangements, e.g. for selective connection of amplifier inputs/outputs to loudspeakers, for loudspeaker detection, or for adaptation of settings to personal preferences or hearing impairments
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/302—Electronic adaptation of stereophonic sound system to listener position or orientation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/302—Electronic adaptation of stereophonic sound system to listener position or orientation
- H04S7/303—Tracking of listener position or orientation
- H04S7/304—For headphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/60—Substation equipment, e.g. for use by subscribers including speech amplifiers
- H04M1/6016—Substation equipment, e.g. for use by subscribers including speech amplifiers in the receiver circuit
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R2205/00—Details of stereophonic arrangements covered by H04R5/00 but not provided for in any of its subgroups
- H04R2205/024—Positioning of loudspeaker enclosures for spatial sound reproduction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/11—Positioning of individual sound objects, e.g. moving airplane, within a sound field
Definitions
- the present application relates to the technical field of audio processing, and in particular to an audio information processing method, electronic equipment, system, computer program product, and computer-readable storage medium.
- even when noise-cancelling earphones offer a transparency mode, in which they do not completely block the surrounding sound, a relatively noisy environment can still prevent them from protecting the safety of the user.
- the present application provides an audio information processing method, an electronic device, a computer program product, and a computer-readable storage medium.
- the purpose is to ensure that a user wearing noise-cancelling earphones can be reminded of an alarm sound present nearby.
- the present application provides an audio information processing method applied to an electronic device. The method includes: acquiring audio information, where the audio information is obtained by collecting sound from the environment in which the electronic device is located; determining that the audio information includes an alarm sound; determining first position information of the alarm sound based on the audio information; determining a first sound, where the first sound includes second position information, both the first position information and the second position information identify the sound source direction of the alarm sound, and the second position information is the same as or different from the first position information; and playing the first sound.
- the first position information and the second position information may refer to the position of the alarm sound relative to the user, or to the absolute position of the alarm sound.
- when the first position information and the second position information are the same, the two values are equal; when they differ, the two are approximately the same, that is, their difference falls within a certain range. For example, if both are angle values, their difference is within a certain angular range, such as 1°.
- by acquiring audio information collected from the environment in which the electronic device is located and, when the audio information includes an alarm sound, playing the first sound that identifies the sound source direction of the alarm sound, the method ensures that when an alarm sound occurs around the user, the user is reminded of it by the played first sound even while wearing earphones.
- before determining the first sound that includes the second position information, the method further includes: determining that the audio information and the previous audio information containing an alarm sound were not both acquired within a preset time period.
- the method further includes: determining that the audio information and the previous audio information containing an alarm sound were acquired within the preset time period; determining that the difference between the first position information of the alarm sound in the audio information and the first position information of the alarm sound in the previous audio information is within a preset range, and detecting that the alarm sound in the audio information and the alarm sound in the previous audio information belong to the same sound; generating a distance coefficient, where the distance coefficient characterizes the energy gain of the audio information relative to the previous audio information containing the alarm sound; determining a second sound, where the second sound includes the second position information and the energy gain; and playing the second sound.
- when the difference between the first position information of the alarm sound in the audio information and the first position information of the alarm sound in the previous audio information containing the alarm sound is within the preset range, and the two alarm sounds belong to the same sound, two consecutive alarm sounds have occurred around the user. The second sound, which identifies the sound source direction of the alarm sound and carries the energy gain, is therefore played to remind the user.
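The choice between the first sound and the second sound described above can be sketched as a small decision function. This is a minimal illustration, not the claimed implementation: the function name, the default values for the preset time period and preset range, and the fallback to the first sound when the two alarm sounds are close in time but do not match are all assumptions, since the text does not specify them.

```python
def choose_reminder(seconds_since_previous, angle_diff_deg, same_sound,
                    preset_period_s=5.0, preset_range_deg=10.0):
    """Return "first" or "second" to indicate which reminder sound to play.

    seconds_since_previous: time since the previous audio information that
        contained an alarm sound, or None if there was none.
    preset_period_s, preset_range_deg: hypothetical preset values.
    """
    # No earlier alarm inside the preset time period: play the first sound.
    if seconds_since_previous is None or seconds_since_previous > preset_period_s:
        return "first"
    # Same direction (within the preset range) and same sound: play the
    # second sound, which additionally carries the energy gain.
    if abs(angle_diff_deg) <= preset_range_deg and same_sound:
        return "second"
    # Assumption: otherwise treat the alarm as a new one.
    return "first"
```

For example, an alarm detected 2 seconds after a matching alarm from nearly the same direction would select the second sound.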
- playing the first sound includes: sending the first sound to an earphone, and playing the first sound by the earphone.
- playing the second sound includes: sending the second sound to an earphone, and playing the second sound by the earphone.
- determining the first position information of the alarm sound based on the audio information includes: using a microphone-array-based sound source localization algorithm to localize the alarm sound from the audio information, thereby obtaining the first position information of the alarm sound.
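As an illustration of microphone-array localization, the sketch below estimates the time delay between two microphones with a plain (unweighted) cross-correlation and converts it to a far-field arrival angle. It is a simplified stand-in, not the generalized cross-correlation algorithm of FIG. 3c; the two-microphone geometry, the speed-of-sound constant, and all function names are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def estimate_delay(sig_a, sig_b, max_lag):
    """Return the lag in samples that maximizes the cross-correlation.

    A positive lag means the sound reaches microphone B later than
    microphone A.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, a in enumerate(sig_a):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += a * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_azimuth(lag_samples, sample_rate, mic_spacing_m):
    """Convert a delay to a far-field arrival angle in degrees."""
    tau = lag_samples / sample_rate
    # Clamp so that measurement noise cannot push asin out of its domain.
    s = max(-1.0, min(1.0, SPEED_OF_SOUND * tau / mic_spacing_m))
    return math.degrees(math.asin(s))
```

With two microphones the angle is only resolved relative to the axis joining them; a larger array would be needed to remove the front/back ambiguity.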
- determining the first position information of the alarm sound based on the audio information includes: determining third position information of the alarm sound based on the audio information, where the third position information identifies the sound source direction of the alarm sound relative to the electronic device; and performing coordinate transformation on the third position information of the alarm sound to obtain the first position information of the alarm sound.
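A minimal sketch of such a coordinate transformation, assuming the position information is a horizontal azimuth in degrees and the source is far enough away that only rotation matters. The parameter `device_yaw_rel_user_deg` (the rotation of the device's forward axis from the user's forward axis) is a hypothetical input; the text does not say how the relative orientation is obtained.

```python
def wrap_degrees(angle):
    """Wrap an angle to the half-open interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def device_to_user_azimuth(azimuth_device_deg, device_yaw_rel_user_deg):
    """Rotate a device-relative azimuth into the user's frame.

    Assumption: far-field source, so the offset between the device and the
    user's head is ignored and only the rotation is applied.
    """
    return wrap_degrees(azimuth_device_deg + device_yaw_rel_user_deg)
```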
- determining the first sound that includes the second position information includes: acquiring a standard sound; and processing the standard sound based on the first position information of the alarm sound to obtain the first sound, where the first sound includes the second position information.
- processing the standard sound to obtain the first sound includes: obtaining the head-related impulse response (HRIR) values corresponding to the first position information of the alarm sound; and convolving the standard sound with each of the HRIR values to obtain the first sound.
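The HRIR step can be sketched as a direct time-domain convolution of the standard sound with one impulse response per ear. The left/right pairing and the function names are assumptions, and real HRIRs would come from a measured data set indexed by direction, which is not shown here.

```python
def convolve(signal, kernel):
    """Full linear convolution of two sample sequences."""
    if not signal or not kernel:
        return []
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def spatialize(standard_sound, hrir_left, hrir_right):
    """Return (left, right) channels for the given direction's HRIR pair."""
    return convolve(standard_sound, hrir_left), convolve(standard_sound, hrir_right)
```

Because the two impulse responses differ in delay and level for a given direction, the resulting stereo pair carries the direction cue when played over earphones.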
- processing the standard sound to obtain the first sound includes: obtaining the head-related transfer function (HRTF) value corresponding to the first position information of the alarm sound; and performing a Fourier transform on the standard sound and multiplying the result by the HRTF value to obtain the first sound.
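In symbols, the frequency-domain variant can be written as follows, where x(n) is the standard sound and H_L, H_R are the HRTF values for the left and right ears at the alarm direction. The symbol names, the per-ear split, and the final inverse transform back to the time domain are assumptions; the text states only the Fourier transform and the multiplication.

```latex
X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \qquad
Y_L(k) = X(k)\, H_L(k), \qquad
Y_R(k) = X(k)\, H_R(k)

y_L(n) = \frac{1}{N} \sum_{k=0}^{N-1} Y_L(k)\, e^{\,j 2\pi k n / N}, \qquad
y_R(n) = \frac{1}{N} \sum_{k=0}^{N-1} Y_R(k)\, e^{\,j 2\pi k n / N}
```

By the convolution theorem, this multiplication corresponds to (circular) convolution of x(n) with the impulse response whose transform is the HRTF, which is why the HRIR and HRTF variants produce equivalent results when the signals are suitably zero-padded.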
- detecting that the alarm sound in the audio information and the alarm sound in the previous audio information containing an alarm sound belong to the same sound includes: converting the audio information and the previous audio information from the time domain to the frequency domain to obtain the amplitude spectrum of each; and using the two amplitude spectra to perform a similarity calculation on the audio information and the previous audio information to obtain a calculation result, where the calculation result indicates whether the audio information and the previous audio information belong to the same sound.
- performing the similarity calculation using the amplitude spectra of the audio information and the previous audio information containing the alarm sound includes: using a Pearson correlation function to calculate the similarity between the audio information and the previous audio information to obtain a similarity value; if the similarity value is greater than a threshold, the audio information and the previous audio information belong to the same sound; if the similarity value is not greater than the threshold, they do not belong to the same sound.
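The amplitude-spectrum comparison can be sketched as follows: each frame is transformed with a naive discrete Fourier transform, and the Pearson correlation coefficient of the two magnitude spectra is compared with a threshold. The threshold value 0.8, the equal-length frames, and the function names are assumptions made for illustration.

```python
import cmath
import math


def amplitude_spectrum(frame):
    """Magnitudes of the DFT bins 0..N/2 of a real-valued frame (naive DFT)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a flat spectrum carries no shape to correlate
    return cov / math.sqrt(vx * vy)


def same_sound(frame_a, frame_b, threshold=0.8):
    """True if the similarity value is greater than the (hypothetical) threshold."""
    return pearson(amplitude_spectrum(frame_a), amplitude_spectrum(frame_b)) > threshold
```

Because the Pearson coefficient is invariant to scaling, a quieter copy of the same alarm still matches; the loudness difference is handled separately by the distance coefficient.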
- alternatively, performing the similarity calculation using the amplitude spectra of the audio information and the previous audio information containing the alarm sound includes: using a classification model to predict whether the audio information and the previous audio information belong to the same sound.
- detecting that the alarm sound in the audio information and the alarm sound in the previous audio information containing an alarm sound belong to the same sound includes: extracting the alarm sound from the audio information and from the previous audio information respectively; and judging whether the two extracted alarm sounds belong to the same alarm sound.
- judging whether the two extracted alarm sounds belong to the same alarm sound includes: converting each of the two extracted alarm sounds from the time domain to the frequency domain to obtain their amplitude spectra; and using the amplitude spectra to perform a similarity calculation on the two extracted alarm sounds to obtain a calculation result, where the calculation result indicates whether the two extracted alarm sounds belong to the same alarm sound.
- performing the similarity calculation on the two extracted alarm sounds using their amplitude spectra includes: using a Pearson correlation function to calculate the similarity of the two extracted alarm sounds to obtain a similarity value; if the similarity value is greater than a threshold, the two extracted alarm sounds belong to the same alarm sound; if the similarity value is not greater than the threshold, they do not belong to the same alarm sound.
- alternatively, performing the similarity calculation on the two extracted alarm sounds using their amplitude spectra includes: using a classification model to predict whether the two extracted alarm sounds belong to the same alarm sound.
- the method further includes: determining that the distance coefficient is within a range of the distance coefficient.
- the method further includes: determining that the distance coefficient is outside the range of the distance coefficient; determining a third sound, where the third sound includes the second position information and the energy gain represented by the endpoint value of the range of the distance coefficient; and playing the third sound.
- using the endpoint value of the range of the distance coefficient as the distance coefficient to determine and play the third sound prevents an excessively large or small distance coefficient from making the sound played with the energy gain too loud or too quiet.
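The range check on the distance coefficient amounts to clamping it to the endpoints of the allowed range before applying it as a gain. In the sketch below, computing the coefficient as a ratio of frame energies, the range [0.5, 2.0], and applying the energy gain as a square-root amplitude factor are all assumptions; the text only says that the coefficient characterizes the energy gain relative to the previous audio information.

```python
import math


def distance_coefficient(current_frame, previous_frame):
    """Assumed definition: energy of the current frame over the previous one."""
    e_cur = sum(x * x for x in current_frame)
    e_prev = sum(x * x for x in previous_frame)
    if e_prev == 0:
        return 1.0  # no reference energy, so apply no gain change
    return e_cur / e_prev


def clamp_coefficient(coeff, lower=0.5, upper=2.0):
    """Replace an out-of-range coefficient with the nearest endpoint value."""
    return max(lower, min(upper, coeff))


def apply_gain(sound, coeff):
    """Scale samples so that the signal energy changes by the factor coeff."""
    g = math.sqrt(coeff)
    return [g * x for x in sound]
```

A coefficient inside the range is used unchanged (the second sound), while one outside the range is replaced by the endpoint value (the third sound).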
- the manner of determining whether the audio information includes an alarm sound includes: calling an alarm sound detection model to detect whether the audio information includes an alarm sound, and obtaining a detection result, which is used to represent whether the audio information includes an alarm sound.
- the present application provides an electronic device, including: one or more processors, a memory, and a wireless communication module. The memory and the wireless communication module are coupled with the one or more processors. The memory is used to store computer program code, the computer program code includes computer instructions, and when the one or more processors execute the computer instructions, the electronic device executes the audio information processing method according to any one of the first aspect.
- the present application provides a computer storage medium for storing a computer program, and when the computer program is executed, it is specifically used to implement the audio information processing method according to any one of the first aspect.
- the present application provides a computer program product which, when run on a computer, enables the computer to execute the audio information processing method according to any one of the first aspect.
- the present application provides an audio information processing system, including an electronic device and an earphone, where the electronic device is used to execute the audio information processing method according to any one of the first aspect, and the earphone is used to interact with the electronic device and, in response to the electronic device, to play the first sound, the second sound, or the third sound.
- FIG. 1 is a diagram of an application scenario provided by an embodiment of the present application.
- FIG. 2a is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
- FIG. 2b is a software architecture diagram of the electronic device provided by an embodiment of the present application.
- FIG. 3a is a display diagram of the noise reduction earphone provided by an embodiment of the present application.
- FIG. 3b is an interface display diagram provided by an embodiment of the present application.
- FIG. 3c is a schematic diagram of the generalized cross-correlation delay estimation algorithm provided by the embodiment of the present application.
- FIG. 4 is a sequence diagram of a method for processing audio information provided in Embodiment 1 of the present application.
- FIG. 5 is a display diagram of the alarm sound relative to the position information of the user provided by the embodiment of the present application.
- FIG. 6 is another application scenario diagram provided by the embodiment of the present application.
- FIG. 7 is a sequence diagram of a method for processing audio information provided in Embodiment 2 of the present application.
- "one or more" refers to one, two, or more than two; "and/or" describes the association relationship of associated objects and indicates that three relationships may exist. For example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone, where A and B may be singular or plural.
- the character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
- references to "one embodiment" or "some embodiments" or the like in this specification mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application.
- appearances of the phrases "in one embodiment", "in some embodiments", "in other embodiments", etc. in various places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments" unless specifically stated otherwise.
- the terms "including", "comprising", "having", and variations thereof mean "including but not limited to", unless specifically stated otherwise.
- "a plurality" in the embodiments of the present application means two or more. It should be noted that in the description of the embodiments of the present application, words such as "first" and "second" are used only to distinguish between descriptions, and cannot be understood as indicating or implying relative importance or order.
- even when noise-cancelling earphones offer a transparency mode, in which they do not completely block the surrounding sound, a relatively noisy environment can still prevent them from protecting the safety of the user.
- the embodiment of the present application proposes a method for processing audio information.
- the audio information processing method provided in the embodiment of the present application can be applied to the application scenario shown in FIG. 1 .
- the user can be reminded when there is a dangerous alarm sound around the user through the interaction between the mobile phone and the noise-canceling headset.
- Fig. 2a shows a composition example of an electronic device provided by an embodiment of the present application.
- the composition structure of the mobile phone proposed in this application scenario is also shown in FIG. 2a.
- other electronic devices may also interact with the noise-cancelling headset, such as tablet computers, desktop computers, laptop computers, notebook computers, ultra-mobile personal computers (UMPC), handheld computers, netbooks, personal digital assistants (PDA), and wearable electronic devices.
- the electronic device 200 may include a processor 210, an external memory interface 220, an internal memory 221, a display screen 230, an antenna 1, an antenna 2, a mobile communication module 240, a wireless communication module 250, an audio module 260 and the like.
- the structure shown in this embodiment does not constitute a specific limitation on the electronic device.
- the electronic device may include more or fewer components than shown, or combine some components, or separate some components, or arrange different components.
- the illustrated components can be realized in hardware, software or a combination of software and hardware.
- the processor 210 may include one or more processing units. For example, the processor 210 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated in one or more processors.
- a memory may also be provided in the processor 210 for storing instructions and data.
- the memory in processor 210 is a cache memory.
- the memory may hold instructions or data that the processor 210 has just used or uses cyclically. If the processor 210 needs the instructions or data again, it can fetch them directly from this memory, which avoids repeated access, reduces the waiting time of the processor 210, and thereby improves system efficiency.
- the external memory interface 220 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device.
- the external memory card communicates with the processor 210 through the external memory interface 220 to implement a data storage function, for example saving music, video, and other files on the external memory card.
- the internal memory 221 may be used to store computer-executable program codes including instructions.
- the processor 210 executes various functional applications and data processing of the electronic device 200 by executing instructions stored in the internal memory 221 .
- the internal memory 221 may include a program storage area and a data storage area.
- the program storage area can store an operating system and at least one application program required by a function (such as a sound playing function or an image playing function).
- the data storage area can store data (such as audio data and a phone book) created during the use of the electronic device.
- the internal memory 221 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (universal flash storage, UFS) and the like.
- the processor 210 executes various functional applications and data processing of the electronic device by executing instructions stored in the internal memory 221 and/or instructions stored in a memory provided in the processor.
- the electronic device realizes the display function through the GPU, the display screen 230 , and the application processor.
- the GPU is a microprocessor for image processing, and is connected to the display screen 230 and the application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering.
- Processor 210 may include one or more GPUs that execute program instructions to generate or change display information.
- the electronic device can realize the shooting function through the ISP, the camera, the video codec, the GPU, the display screen 230 and the application processor.
- the wireless communication function of the electronic device can be realized by the antenna 1, the antenna 2, the mobile communication module 240, the wireless communication module 250, the modem processor and the baseband processor.
- the mobile communication module 240 can provide wireless communication solutions including 2G/3G/4G/5G applied to electronic devices.
- the mobile communication module 240 can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and send them to the modem processor for demodulation.
- the mobile communication module 240 can also amplify the signal modulated by the modem processor, and convert it into electromagnetic waves through the antenna 1 for radiation.
- the wireless communication module 250 can provide wireless communication solutions including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technology.
- the wireless communication module 250 may be one or more devices integrating at least one communication processing module.
- the wireless communication module 250 receives electromagnetic waves via the antenna 2 , frequency-modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 210 .
- the wireless communication module 250 can also receive the signal to be sent from the processor 210 , frequency-modulate it, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.
- the Bluetooth module in the wireless communication module 250 is used to implement short-distance communication between the electronic device 200 and other electronic devices, for example, the electronic device 200 interacts with the noise-canceling headset through the Bluetooth module.
- the bluetooth module can be an integrated circuit or a bluetooth chip or the like.
- the electronic device 200 can implement audio functions, such as music playback and recording, through the audio module 260, the speaker 270A, the receiver 270B, the microphone 270C, the earphone interface 270D, and the application processor.
- the audio module 260 is used for converting digital audio information into an output analog audio signal, and is also used for converting an analog audio input into a digital audio signal.
- the audio module 260 may also be used to encode and decode audio signals.
- the audio module 260 may be set in the processor 210 , or some functional modules of the audio module 260 may be set in the processor 210 .
- the speaker 270A, also called a "horn", converts audio electrical signals into sound signals.
- the electronic device 200 can play music or hands-free calls through the speaker 270A.
- the speaker 270A can be used to play the three-dimensional reminder sound mentioned in the embodiments of this application.
- the receiver 270B, also called an "earpiece", converts audio electrical signals into sound signals.
- the receiver 270B can be placed close to the human ear to listen to voice.
- the microphone 270C converts sound signals into electrical signals.
- the user can speak with the mouth close to the microphone 270C to input a sound signal into the microphone 270C.
- the electronic device 200 may be provided with at least one microphone 270C.
- the electronic device 200 may be provided with two microphones 270C, which may also implement a noise reduction function in addition to collecting sound signals.
- the electronic device 200 can also be provided with three, four or more microphones 270C to form a microphone array to collect sound signals, reduce noise, identify sound sources, and realize directional recording functions.
- the microphone 270C is used to collect the sound of the external environment where the electronic device is located.
- an operating system runs on top of the above components.
- An application program can be installed and run on the operating system.
- Fig. 2b is a block diagram of the software structure of the electronic device according to the embodiment of the present application.
- the layered architecture divides the software into several layers, and each layer has a clear role and division of labor. Layers communicate through software interfaces.
- the Android system is divided into four layers, which are respectively the application program layer, the application program framework layer, the Android runtime (Android runtime) and the system library, and the kernel layer from top to bottom.
- the application layer can consist of a series of application packages. As shown in FIG. 2b, the application package may include applications such as camera, gallery, calendar, call, map, navigation, WLAN, and Bluetooth.
- the application framework layer provides an application programming interface (application programming interface, API) and a programming framework for applications in the application layer.
- the application framework layer includes some predefined functions. As shown in Figure 2b, the application framework layer can include window manager, content provider, phone manager, resource manager, notification manager, view system, etc.
- a window manager is used to manage window programs.
- the window manager can get the size of the display screen, determine whether there is a status bar, lock the screen, capture the screen, etc.
- Content providers are used to store and retrieve data and make it accessible to applications.
- Said data may include video, images, audio, calls made and received, browsing history and bookmarks, phonebook, etc.
- the phone manager is used to provide communication functions of electronic devices. For example, the management of call status (including connected, hung up, etc.).
- the resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and so on.
- the notification manager enables the application to display notification information in the status bar, which can be used to convey notification-type messages, and can automatically disappear after a short stay without user interaction.
- the notification manager is used to notify the download completion, message reminder, etc.
- the notification manager can also present a notification in the status bar at the top of the system in the form of a chart or scroll-bar text, such as a notification of an application running in the background, or a notification that appears on the screen in the form of a dialog window.
- for example, text information is prompted in the status bar, a prompt sound is issued, the electronic device vibrates, or the indicator light flashes.
- the view system includes visual controls, such as controls for displaying text, controls for displaying pictures, and so on.
- the view system can be used to build applications.
- a display interface can consist of one or more views.
- a display interface including a text message notification icon may include a view for displaying text and a view for displaying pictures.
- the Android runtime includes a core library and a virtual machine.
- the Android runtime is responsible for the scheduling and management of the Android system.
- the cold start of an application runs in the Android runtime: the Android runtime obtains the optimized-file status parameters of the application, uses these parameters to judge whether the optimized file is outdated due to a system upgrade, and returns the judgment result to the application control module.
- the core library consists of two parts: one part is the functions that the Java language needs to call, and the other part is the core library of Android.
- the application layer and the application framework layer run in virtual machines.
- the virtual machine executes the Java files of the application layer and the application framework layer as binary files.
- the virtual machine is used to perform functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.
- a system library can include multiple function modules. For example: surface manager (surface manager), media library (Media Libraries), 3D graphics processing library (eg: OpenGL ES), 2D graphics engine (eg: SGL), etc.
- the surface manager is used to manage the display subsystem and provides the fusion of 2D and 3D layers for multiple applications.
- the media library supports playback and recording of various commonly used audio and video formats, as well as still image files, etc.
- the media library can support a variety of audio and video encoding formats, such as: MPEG2, H.262, MP3, AAC, AMR, JPG, PNG, etc.
- the 3D graphics processing library is used to realize 3D graphics drawing, image rendering, compositing and layer processing, etc.
- the 2D graphics engine is a drawing engine for 2D drawing.
- the kernel layer is the layer between hardware and software.
- the kernel layer includes at least a display driver, a camera driver, an audio driver, and a sensor driver.
- the noise-canceling headset can generally be a Bluetooth headset.
- the bluetooth earphone is an earphone that supports the bluetooth communication protocol.
- the Bluetooth communication protocol may be an ER classic Bluetooth protocol, a BDR classic Bluetooth protocol, or the BLE (Bluetooth Low Energy) protocol. It may also be another type of Bluetooth protocol released in the future.
- the version of the Bluetooth communication protocol can be any of the following: 1.0 series version, 2.0 series version, 3.0 series version, 4.0 series version, and other series of versions based on future releases.
- the bluetooth earphone of the embodiment of the present application generally refers to a double bluetooth earphone composed of a left earphone and a right earphone, which can provide stereo sound effect for the user.
- Common dual Bluetooth headsets include traditional in-ear Bluetooth headsets and true wireless stereo (TWS) Bluetooth headsets.
- Although the traditional in-ear Bluetooth headset eliminates the connecting wire between the two earphones and the audio source, the left earphone and the right earphone still need to be connected by a wire to synchronize the audio signals.
- the TWS Bluetooth headset not only saves the connection cable between the two headphones and the audio source, but also saves the connection cable between the left earphone and the right earphone.
- Both the left earphone and the right earphone are provided with a bluetooth module, and the left earphone and the right earphone can transmit data through the bluetooth protocol.
- Both the left earphone and the right earphone include microphones; that is, in addition to audio playback, both earphones also have the function of audio collection.
- the Bluetooth headset in the embodiment of the present application can support one or more of the following applications: HSP (Headset Profile) application, HFP (Hands-free Profile) application, A2DP (Advanced Audio Distribution Profile) application, AVRCP (Audio/Video Remote Control Profile) application.
- the HSP application represents a headset application, and provides basic functions required for communication between the electronic device and the headset.
- Bluetooth earphones can be used as audio input and output interfaces of electronic equipment.
- the HFP application stands for the hands-free application.
- the HFP application adds some extended functions on the basis of the HSP application.
- the Bluetooth headset can control the call process of the terminal, such as answering, hanging up, rejecting, voice dialing, etc.
- the A2DP application is an advanced audio transmission application.
- A2DP can use the chip in the earphone to stack data to achieve high-definition sound.
- the AVRCP application is an audio and video remote control application.
- the AVRCP application defines how to control the characteristics of streaming media, including: pause, stop, start playback, volume control and other types of remote control operations.
- the noise-cancelling earphones can be provided with an activation button for the earphone intelligent reminder alarm sound function.
- an activation button 101 of the earphone smart reminder alarm sound function is set on the right earphone, and the activation button 101 may include a first position 11 and a second position 22 .
- when the activation button 101 is at the first position 11, the earphone intelligent reminder alarm sound function is activated; when the activation button 101 is at the second position 22, the earphone intelligent reminder alarm sound function is turned off.
- the activation button of the earphone intelligent reminder alarm sound function may be the same button as the button for other functions of the noise-canceling earphone, or it may be a separate button.
- after the noise-canceling headset activates the earphone intelligent reminder alarm sound function, when the noise-canceling headset determines that there is an alarm sound around the user, the noise-canceling headset can play an alarm sound to the user.
- the type of alarm sound played by the noise-canceling headset to the user can be set. Also referring to the example in FIG. 3 a , an alarm sound selection button 102 is set on the left earphone. The alarm sound is selected by triggering the alarm sound selection button 102 .
- the user clicks the alarm sound selection button 102, and the noise-canceling earphone responds to the user's click operation to make a voice broadcast and select the alarm sound.
- the alarm sound can be divided into three modes: default alarm sound, intelligently recommended alarm sound and manual selection of alarm sound.
- the default alarm sound is the alarm sound set by the system, and the intelligent recommended alarm sound can provide different alarm sounds in combination with the operating status of the noise-canceling earphones.
- with manual selection of the alarm sound, the user can click the alarm sound selection button 102 to choose among different alarm sounds, such as the horns of different types of vehicles.
- FIG. 3a uses a head-mounted Bluetooth headset as an example for illustration, but this does not constitute a limitation to the Bluetooth headset involved in the embodiment of the present application.
- the activation button 101 and the alarm sound selection button 102 shown in FIG. 3a are physical keys; in some embodiments, the activation button 101 and the alarm sound selection button 102 may also be virtual keys.
- the left earphone or the right earphone of the Bluetooth headset can be set with a virtual button, and the earphone intelligent reminder alarm function can be activated by triggering the virtual button.
- the triggering of the virtual button can also be set in various forms.
- in some embodiments, the earphone intelligent reminder alarm sound function can be activated or turned off by touches of different durations; in other embodiments, it can be activated or turned off by touching a different number of times; in still other embodiments, it can be activated or turned off by triggering different positions.
- the left earphone or the right earphone of the bluetooth earphone can also be set with a virtual button, and different alarm sounds can be selected by triggering the virtual button.
- the triggering of the virtual key can also take various forms. In some embodiments, different alarm sounds can be selected by touches of different durations; in other embodiments, different alarm sounds can be selected by touching a different number of times; in still other embodiments, different alarm sounds can be selected by triggering different positions.
- starting and stopping the earphone intelligent reminder alarm sound function, and selecting different alarm sounds, can also be controlled through the electronic device.
- the setting interface of the Bluetooth headset on the electronic device presents four items: earphone intelligent reminder alarm sound, active noise reduction, gesture, and alarm sound selection. The user can operate the activation button of each item to enable the function corresponding to that item.
- In the example shown in FIG. 3b, the earphone intelligent reminder alarm sound is activated, and the functions of the other three items are disabled.
- when the earphone intelligent reminder alarm sound is activated, the noise-canceling earphone connected to the mobile phone via Bluetooth can interact with the mobile phone to provide an alarm sound reminder when a dangerous alarm sound occurs around the user.
- when the selection of the alarm sound is activated, the user can choose, through a manual input operation, the alarm sound to be played when a dangerous alarm sound occurs around the user.
- the selection of alarm sound is an item with a sub-interface: when the user slides or clicks the activation button of the alarm sound selection item, the function of selecting the alarm sound is activated and the sub-interface for selecting the alarm sound is displayed.
- the sub-interface for selecting the alarm sound shows four modes, which are default alarm sound, intelligently recommended alarm sound, user-defined and manually selected alarm sound.
- the default alarm sound, the intelligently recommended alarm sound and the manual selection of the alarm sound can be as described above.
- Customization can be understood as an alarm sound that can be edited and customized by the user. In the example shown in Fig. 3b, the default alarm sound is enabled, and the other three modes are disabled.
- a sub-interface for manually selecting the alarm sound is displayed, as shown in the example in FIG. 3 b .
- the sub-interface for manually selecting the warning sound includes four kinds of vehicle warning sounds, and the user can select the warning sound through the start buttons of different vehicles.
- vehicle 1 is in the activated state, and the other three vehicles are in the closed state.
- the aforementioned electronic devices such as mobile phones and noise-cancelling earphones can also be equipped with an alarm sound detection model, which has the function of predicting whether the audio information input to the alarm sound detection model contains alarm sounds.
- the alarm sound detection model can use basic network models such as a convolutional neural network (Convolutional Neural Network, CNN) or a long short-term memory network (Long Short-Term Memory, LSTM).
- Convolutional neural networks usually include: input layer, convolution layer (Convolution Layer), pooling layer (Pooling layer), fully connected layer (Fully Connected Layer, FC) and output layer.
- the first layer of a convolutional neural network is the input layer, and the last layer is the output layer.
- Convolution Layer refers to the neuron layer that performs convolution processing on the input signal in the convolutional neural network.
- in a convolutional layer, a neuron may be connected to only some of the neurons in the adjacent layer.
- a convolutional layer usually contains several feature planes, and each feature plane can be composed of some rectangularly arranged neural units. Neural units of the same feature plane share weights, and the shared weights here are convolution kernels.
- Pooling layer: the convolutional layer usually yields features of large dimension; the pooling layer divides these features into several regions and takes the maximum or average value of each region to obtain new features of smaller dimension.
- Fully connected layer: combines all local features into global features and is used to calculate the final score of each category.
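The three layer types described above can be illustrated with a minimal one-dimensional sketch. It is not the model of this application: the function names `conv1d`, `pool1d` and `fully_connected`, as well as the toy input, kernel and weights, are assumptions chosen only to show weight sharing, region-wise pooling and category scoring.

```python
def conv1d(signal, kernel):
    """Slide one shared kernel over the input (the shared weights of a feature plane)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def pool1d(features, size, mode="max"):
    """Cut the features into non-overlapping regions and keep the max or mean of each."""
    regions = [features[i:i + size]
               for i in range(0, len(features) - size + 1, size)]
    if mode == "max":
        return [max(region) for region in regions]
    return [sum(region) / len(region) for region in regions]


def fully_connected(features, weights, biases):
    """Combine all local features into one score per category."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, biases)]


# Toy data (assumed values, for illustration only).
signal = [0, 1, 3, 2, 5, 4, 1, 0]
conv_out = conv1d(signal, kernel=[1, -1])
pooled = pool1d(conv_out, size=2, mode="max")
scores = fully_connected(pooled, weights=[[1, 1, 1], [0, 0, 1]], biases=[0, -1])
```

A real alarm sound detection model would stack many such layers over audio features and learn the kernel and weight values during training instead of fixing them by hand.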
- a long short-term memory network usually includes an input layer, a hidden layer, and an output layer.
- the input layer is composed of at least one input node; when the LSTM network is a unidirectional network, the hidden layer includes only a forward hidden layer; when the LSTM network is a bidirectional network, the hidden layer includes a forward hidden layer and a backward hidden layer.
- the hidden nodes are respectively connected to the output nodes and output their own calculation results to the output nodes; the output nodes perform calculations according to the outputs of the hidden layer and output data.
- the alarm sound detection model can be trained in the following ways:
- the original model of alarm sound detection can choose basic network models such as CNN and LSTM.
- the training samples include: samples containing alarm sounds and samples not containing alarm sounds, and the training samples are marked to indicate whether the samples contain alarm sounds.
- the alarm sound in the training samples can be, for example, the whistle of a vehicle.
- the training samples can include whistle sounds of different types of motor vehicles such as automobiles and motorcycles, as well as other alarm sounds such as alarm bells.
- the alarm bell can be understood as the alarm sound when special vehicles such as ambulances, police cars, and fire trucks are running.
- the training samples are input into the original alarm sound detection model, and the original alarm sound detection model detects whether the training samples contain the alarm sound, and obtains the detection result.
- a loss function is used to calculate the loss value between the detection result and the labeling result of each training sample, to obtain the loss value of the model.
- loss functions such as cross-entropy loss function and weighted loss function can be used to calculate the loss value, or a combination of multiple loss functions can be used to calculate multiple loss values.
- the model convergence condition may be that the loss value of the model is less than or equal to a preset loss threshold. That is to say, the loss value of the model can be compared with the loss threshold. If the loss value of the model is greater than the loss threshold, it can be judged that the loss value of the model does not meet the model convergence conditions. Conversely, if the loss value of the model is less than or equal to the loss threshold, it can be judged that the model loss value meets the model convergence condition.
- the loss value of the model can also be calculated separately for each training sample. In this case, training is considered complete only when the model loss value of every training sample meets the model convergence condition; conversely, as long as the model loss value of any training sample does not meet the model convergence condition, the subsequent parameter update step is executed.
- the trained model can be used in the audio information processing method proposed in the following embodiments to detect whether the audio information input to the model contains an alarm sound.
- the parameter update value of the model is calculated according to the loss value of the model, and the original alarm sound detection model is updated with this parameter update value. The updated model then continues to process the training samples to obtain detection results, and the subsequent process is repeated until the loss value of the model meets the model convergence condition.
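The training procedure above (detect, compute the loss, compare with the loss threshold, update the parameters, repeat) can be sketched as follows. This is only an illustration under stated assumptions: a tiny logistic-regression classifier stands in for the CNN or LSTM, the two-element feature vectors and all numeric values are invented, and the convergence check uses the mean cross-entropy loss over all samples.

```python
import math


def predict(weights, bias, features):
    """Probability that the sample contains an alarm sound (stand-in model)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(prob, label, eps=1e-12):
    """Cross-entropy loss between a detection result and its label (0 or 1)."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def train(samples, labels, loss_threshold=0.1, lr=0.5, max_epochs=10000):
    weights, bias = [0.0] * len(samples[0]), 0.0
    loss = float("inf")
    for _ in range(max_epochs):
        probs = [predict(weights, bias, x) for x in samples]
        loss = sum(cross_entropy(p, y) for p, y in zip(probs, labels)) / len(samples)
        if loss <= loss_threshold:          # model convergence condition met
            return weights, bias, loss, True
        # Otherwise compute the parameter update from the loss gradient.
        errors = [p - y for p, y in zip(probs, labels)]
        for j in range(len(weights)):
            grad = sum(e * x[j] for e, x in zip(errors, samples)) / len(samples)
            weights[j] -= lr * grad
        bias -= lr * sum(errors) / len(samples)
    return weights, bias, loss, False


# Hypothetical features: [energy in a horn-like band, energy elsewhere].
samples = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]]
labels = [1, 1, 0, 0]                       # 1 = contains alarm sound
weights, bias, final_loss, converged = train(samples, labels)
```

The `max_epochs` guard is an addition of this sketch so that the loop terminates even if the loss threshold is never reached.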
- the sound can be localized based on the sound source localization algorithm of the microphone array and the like.
- the sound source localization algorithm uses a microphone array for sound localization.
- Commonly used sound source localization algorithms mainly fall into three categories: localization technology based on high-resolution spectrum estimation, localization technology based on steerable beamforming (Beamforming), and localization technology based on TDOA.
- the implementation principle of the TDOA-based sound localization algorithm is simple. It is generally divided into two parts: delay estimation and sound source localization.
- Delay estimation calculates the arrival time difference of the same signal at two different microphones, and sound source localization then calculates, from this time difference, the angle from which the sound arrives.
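As an illustration of the second part (time difference to angle), the following sketch uses the common far-field approximation for two microphones, in which the angle measured from the broadside direction satisfies sin(angle) = c * delay / d. The function name, the speed of sound of 343 m/s and the 0.18 m spacing in the usage lines are assumptions, not values from this application.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees Celsius (assumed)


def tdoa_to_angle(tdoa_seconds, mic_distance_m, speed=SPEED_OF_SOUND):
    """Far-field arrival angle in degrees, measured from the broadside direction.

    0 means the source is on the perpendicular bisector of the two microphones;
    +90 or -90 means it lies on the line through them.
    """
    ratio = speed * tdoa_seconds / mic_distance_m
    ratio = max(-1.0, min(1.0, ratio))      # guard against measurement noise
    return math.degrees(math.asin(ratio))


# Usage with an assumed microphone spacing of 0.18 m.
spacing = 0.18
angle_front = tdoa_to_angle(0.0, spacing)
angle_side = tdoa_to_angle(spacing / SPEED_OF_SOUND, spacing)
angle_oblique = tdoa_to_angle(0.5 * spacing / SPEED_OF_SOUND, spacing)
```

With only two microphones this angle cannot distinguish front from back, which is one reason arrays with more microphones are used.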
- the time delay estimation algorithm mainly includes the time delay estimation method based on correlation analysis, the time delay estimation method based on phase spectrum estimation, the time delay estimation method based on parameter estimation, etc.
- the most widely used method is time delay estimation based on correlation analysis.
- the generalized cross-correlation function method (GCC), one of the time delay estimation methods based on correlation analysis, introduces a weighting function to adjust the cross power spectral density, so as to optimize the performance of time delay estimation.
- the generalized cross-correlation function has many variants, one of which is the generalized cross-correlation phase transform method (Generalized Cross Correlation PHAse Transformation, GCC-PHAT).
- the generalized cross-correlation function delay estimation algorithm estimates the delay value according to the peak value of the cross-correlation function of two microphone signals.
- the target signal received by each element of the microphone array comes from the same sound source. Therefore, there is a strong correlation between the signals of each channel.
- by calculating the correlation function between every two signals, the time delay between the two microphone observation signals can be determined.
- the received signals $x_1(t)$ and $x_2(t)$ of two microphones in the array are given by Equation 1:
- $x_1(t) = a_1 s(t - \tau_1) + n_1(t), \quad x_2(t) = a_2 s(t - \tau_2) + n_2(t)$ (Equation 1)
- where $t$ is the time, $s(t)$ is the sound source signal, $n_1(t)$ and $n_2(t)$ are the environmental noise, and $\tau_1$ and $\tau_2$ are the propagation times of the signal from the sound source to the two microphone array elements.
- $X_1(\omega)$ is the result of performing the Fourier transform (FFT) on $x_1(t)$, $X_2(\omega)$ is the result of performing the Fourier transform (FFT) on $x_2(t)$, $\omega$ is the angular frequency of the signal received by the microphone, $(\cdot)^*$ denotes the conjugate, applied to $X_2(\omega)$, and $\Phi(\omega)$ is the phase transform weighting function, which is used to weight $X_1(\omega) X_2^*(\omega)$ by phase transform to obtain the calculation result.
- the calculation result is subjected to the inverse Fourier transform (IFFT) followed by peak detection, and the peak detection result is output as $\tau_{12}$.
- $\tau_{12} = \tau_1 - \tau_2$ is the time difference between the two microphone signals.
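The GCC-PHAT steps above (transform both signals, multiply one spectrum by the conjugate of the other, apply the phase transform weighting, inverse transform, detect the peak) can be sketched as follows. The sketch assumes the commonly used PHAT weighting, in which each frequency bin of the cross spectrum is divided by its own magnitude, and it uses a naive DFT only to stay free of external libraries; a practical implementation would use an FFT routine. The function names and the synthetic test signals are assumptions.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def gcc_phat_delay(x1, x2, eps=1e-12):
    """Estimate tau_12 = tau_1 - tau_2 in samples for two equal-length signals."""
    n = len(x1) + len(x2)                       # zero-pad to avoid wrap-around
    X1 = dft(list(x1) + [0.0] * (n - len(x1)))
    X2 = dft(list(x2) + [0.0] * (n - len(x2)))
    cross = [a * b.conjugate() for a, b in zip(X1, X2)]
    weighted = [c / max(abs(c), eps) for c in cross]   # PHAT: keep phase only
    corr = [v.real for v in dft(weighted, inverse=True)]
    peak = max(range(n), key=lambda k: corr[k])        # peak detection
    return peak if peak < n // 2 else peak - n         # map to a signed lag


# Synthetic check: the same source reaches microphone 1 after 5 samples
# and microphone 2 after 2 samples, so tau_12 should be 3 samples.
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(32)]
x1 = [0.0] * 5 + source + [0.0] * 11
x2 = [0.0] * 2 + source + [0.0] * 14
delay = gcc_phat_delay(x1, x2)
```

Dividing the delay in samples by the sampling rate gives the time difference in seconds.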
- GCC-PHAT only uses the signals of two microphones. If the number of microphones is more than two, other methods can be used for delay estimation, such as the sound source localization algorithm based on phase transformation weighted controllable response power (Steered Response Power- Phase Transform, SRP-PHAT).
- the basic principle of the SRP-PHAT algorithm is to calculate, at each hypothesized sound source position, the sum of the phase-transform-weighted generalized cross-correlation (GCC-PHAT) functions of the signals received by all microphone pairs; the point in the sound source space at which this SRP value is largest is the estimated sound source position.
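A simplified sketch of this search is given below. It makes several assumptions that are not in the application: the source is in the far field, so only candidate directions in a plane are searched instead of positions; pair delays are rounded to whole samples; a naive DFT replaces an FFT; and the three-microphone linear array, sampling rate and test signal are invented for the example.

```python
import cmath
import math
import random
from itertools import combinations


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def gcc_phat(spec_i, spec_j, eps=1e-12):
    """PHAT-weighted cross-correlation; index k holds lag k modulo n."""
    cross = [a * b.conjugate() for a, b in zip(spec_i, spec_j)]
    return [v.real for v in dft([c / max(abs(c), eps) for c in cross], inverse=True)]


def srp_phat(signals, mic_xy, fs, candidates_deg, speed=343.0):
    """Return the candidate direction whose summed GCC-PHAT value is largest."""
    n = 2 * len(signals[0])
    spectra = [dft(list(s) + [0.0] * (n - len(s))) for s in signals]
    pairs = list(combinations(range(len(signals)), 2))
    gcc = {pair: gcc_phat(spectra[pair[0]], spectra[pair[1]]) for pair in pairs}
    best_angle, best_power = None, -math.inf
    for angle in candidates_deg:
        ux, uy = math.cos(math.radians(angle)), math.sin(math.radians(angle))
        power = 0.0
        for i, j in pairs:
            dx = mic_xy[j][0] - mic_xy[i][0]
            dy = mic_xy[j][1] - mic_xy[i][1]
            # Expected lag tau_i - tau_j (in samples) for a source in this direction.
            lag = round(fs * (dx * ux + dy * uy) / speed)
            power += gcc[(i, j)][lag % n]
        if power > best_power:
            best_angle, best_power = angle, power
    return best_angle


# Assumed setup: three microphones on the x axis, source at 60 degrees.
fs = 16000
spacing = 4 * 343.0 / fs            # 4 samples between neighbours at 0 degrees
mics = [(0.0, 0.0), (spacing, 0.0), (2 * spacing, 0.0)]
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(32)]
arrival = [6, 4, 2]                 # arrival delays in samples for 60 degrees
signals = [[0.0] * d + source + [0.0] * (16 - d) for d in arrival]
estimate = srp_phat(signals, mics, fs, range(0, 181, 10))
```

Because the example array is linear, directions mirrored about the array axis give the same delays, so the candidate set is limited to 0 to 180 degrees.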
- this embodiment provides a method for processing audio information, and the method for processing audio information provided by this embodiment can be used in the application scenario in FIG. 1 .
- the processing method of the audio information includes steps:
- the mobile phone acquires audio information.
- the audio information is obtained by collecting the sound of the external environment, and the sound of the external environment may be collected through a microphone.
- both the mobile phone and the noise-canceling headset are equipped with microphones. Therefore, the microphone in the mobile phone or the microphone in the noise-canceling headset can collect the sound of the external environment to obtain audio information.
- the mobile phone or the noise-canceling earphone can collect the sound of the external environment periodically or in real time to obtain audio information when the noise-canceling earphone is in the running state.
- the noise-canceling earphone collects the sound of the external environment, obtains audio information, and transmits the audio information to the mobile phone through a channel connected to the mobile phone such as a Bluetooth channel.
- the following description takes as an example the case in which the noise-canceling headset collects the sound of the external environment to obtain audio information, and the mobile phone obtains the audio information from the noise-canceling headset and executes the following steps.
- the mobile phone calls the alarm sound detection model to detect whether the audio information contains the alarm sound, and obtains a detection result, and the detection result is used to indicate whether the audio information contains the alarm sound.
- the alarm sound detection model has the function of predicting whether the audio information input to the alarm sound detection model contains the alarm sound. Therefore, after acquiring the audio information of the external environment, the alarm sound detection model can be used to detect whether the audio information contains the alarm sound, and obtain the detection result.
- after the mobile phone obtains the audio information, it calls the alarm sound detection model to detect whether the audio information contains the alarm sound, and obtains the detection result.
- the noise-canceling earphone can also call the alarm sound detection model to detect whether the audio information contains the alarm sound, obtain the detection result, and then transmit the detection result to the mobile phone. In this way, the mobile phone does not need to execute step S402.
- if the detection result indicates that the audio information contains an alarm sound, steps S403 and S404 are executed; if the detection result indicates that the audio information does not contain an alarm sound, the process returns to step S401.
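The branching between steps S401 to S404 can be summarized with the following control-flow sketch. All four callables are hypothetical placeholders: `acquire_audio` stands for S401, `detect_alarm` for the alarm sound detection model in S402, `locate_alarm` for the localization in S403, and `handle_alarm` for S404, whose content is not described in this part of the text.

```python
def process_audio_once(acquire_audio, detect_alarm, locate_alarm, handle_alarm):
    """Run one pass of the flow; return the position if an alarm was found, else None."""
    audio = acquire_audio()                 # S401: acquire audio information
    if not detect_alarm(audio):             # S402: does it contain an alarm sound?
        return None                         # no: the caller goes back to S401
    position = locate_alarm(audio)          # S403: position relative to the user
    handle_alarm(position)                  # S404: placeholder for the next step
    return position


# Usage with stub implementations (assumed, for illustration only).
frames = iter(["street noise", "car horn"])
handled = []


def next_frame():
    return next(frames)


def looks_like_alarm(audio):
    return "horn" in audio


def fixed_direction(audio):
    return 45.0


results = []
for _ in range(2):
    results.append(
        process_audio_once(next_frame, looks_like_alarm, fixed_direction, handled.append)
    )
```

In a real device the loop would run periodically or continuously while the headset is in the running state, as described for step S401.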
- alarm sounds mentioned in the embodiments of the present application can all be understood as the alarm sounds mentioned in the foregoing content, such as whistles or alarm bells of various types of motor vehicles.
- the mobile phone uses the audio information to locate the alarm sound, and obtains position information of the alarm sound relative to the user.
- the mobile phone may use the sound source localization algorithm based on the microphone array proposed in the foregoing content, and use audio information to perform sound source localization for the warning sound.
- the mobile phone uses the audio information collected by the microphone of the left earphone of the noise-canceling earphone and the audio information collected by the microphone of the right earphone to locate the sound source of the alarm sound, and obtain the location information of the alarm sound relative to the user.
- the location information generally includes the horizontal direction angle θ of the alarm sound relative to the user.
- FIG. 5 shows an example of the horizontal direction angle θ of the alarm sound relative to the user (referring to the center point of the user's head).
- the position information of the alarm sound relative to the user obtained in this step refers to the position information of the alarm sound relative to the earphone.
- the position information of the alarm sound relative to the user mentioned in the following content refers to the position information of the alarm sound relative to the earphone.
- the obtained position information of the warning sound relative to the user can be understood as the relative position information of the warning sound.
- the mobile phone uses the audio information to locate the alarm sound to obtain the location information of the alarm sound relative to the user, which may also refer to obtaining the absolute position of the alarm sound.
- the noise-canceling earphone itself can also use the audio information collected by the microphone of the left earphone and the microphone of the right earphone, together with the microphone-array-based sound source localization algorithm described above, to localize the sound source of the alarm sound and obtain the position information of the alarm sound relative to the user.
- the noise-canceling headset can then transmit the obtained direction angle of the alarm sound relative to the user to the mobile phone. In this way, the mobile phone does not need to execute step S403.
- the mobile phone can also obtain the audio information collected by its built-in microphone array, and use this audio information to localize the sound source of the alarm sound and obtain the position information of the alarm sound relative to the mobile phone.
- because the mobile phone and the user may be at an angle relative to each other, after the mobile phone uses the audio information collected by its own microphone array to obtain the position information of the alarm sound relative to the mobile phone, it needs to perform coordinate transformation on this position information to obtain the position information of the alarm sound relative to the user.
- the coordinate conversion of the position information of the warning sound relative to the mobile phone can be performed to obtain the position information of the warning sound relative to the user.
- the position information of the alarm sound relative to the mobile phone can be transformed based on the earth coordinate system to obtain the position information of the alarm sound relative to the user.
- coordinate conversion can also be performed based on another coordinate system shared by the mobile phone and the noise-canceling headset.
- to support the coordinate transformation, the noise-canceling headset needs to calculate its attitude angle relative to the earth coordinate system. Therefore, the noise-canceling headset needs to be equipped with an acceleration sensor and an angular velocity sensor, usually of the same type as those in the mobile phone.
- the mobile phone uses the detection data of its own acceleration sensor and angular velocity sensor to calculate the attitude angle of the mobile phone.
- the noise-canceling earphone uses the detection data of its own acceleration sensor and angular velocity sensor to calculate the attitude angle of the earphone.
- the mobile phone acquires the attitude angle of the earphone, uses the attitude angles of the mobile phone and the earphone to determine the conversion relationship between the coordinate systems of the earphone and the mobile phone, and uses the conversion relationship to process the position information of the alarm sound relative to the mobile phone to obtain the position information of the alarm sound relative to the user.
- the specific method for calculating the attitude angle by using the detection data of the acceleration sensor and the angular velocity sensor of the mobile phone and the noise-cancelling earphone can refer to the conventional method, and will not be described here.
- the mobile phone uses the attitude angle of the mobile phone and the attitude angle of the earphone to determine the conversion relationship between the coordinate system of the earphone and the mobile phone, and uses the conversion relationship to process the position information of the alarm sound relative to the mobile phone to obtain the position information of the alarm sound relative to the user. It can also refer to the conventional method, which will not be described here.
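The coordinate conversion described above can be illustrated with a deliberately simplified sketch. It assumes, hypothetically, that only the horizontal direction angle matters, that both devices are held level, and that each attitude angle reduces to a yaw (heading) measured counterclockwise in the shared earth coordinate system. The function name and parameters are illustrative and do not come from the source; a full implementation would use complete three-axis rotations.

```python
def phone_to_user_angle(angle_rel_phone_deg, phone_yaw_deg, headset_yaw_deg):
    """Convert a horizontal direction angle from the phone frame to the user frame.

    Assumptions (illustrative only): angles are in degrees, measured
    counterclockwise; each yaw is the device heading in a shared earth
    coordinate system; pitch and roll are ignored.
    """
    # Direction of the alarm sound expressed in the earth frame.
    angle_earth = angle_rel_phone_deg + phone_yaw_deg
    # Re-express that direction relative to the headset (user) heading.
    return (angle_earth - headset_yaw_deg) % 360.0
```

For example, a sound 30° from the phone's forward axis, with the phone heading 90° and the headset heading 45°, lies 75° from the user's forward axis under these assumptions.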
- the mobile phone acquires audio information collected by a built-in microphone array of the mobile phone, and the mobile phone transmits the acquired audio information to the noise reduction earphone.
- the noise-canceling earphone uses audio information to locate the alarm sound and obtain the position information of the alarm sound relative to the user.
- the microphone array of the mobile phone collects the audio information of the external environment. Therefore, when the noise-canceling earphone applies the microphone-array sound source localization algorithm described above to this audio information to locate the sound source of the alarm sound, the result is the position information of the alarm sound relative to the mobile phone.
- the noise-canceling earphone performs the coordinate transformation described above on the position information of the alarm sound relative to the mobile phone to obtain the position information of the alarm sound relative to the user.
- the mobile phone detects whether the audio information and the previous audio information containing the alarm sound were obtained within a preset time period.
- if the mobile phone detects that the audio information and the previous audio information containing the alarm sound were not obtained within the preset time period, steps S405 to S407 are performed; if the mobile phone detects that they were obtained within the preset time period, steps S408 to S413 are performed.
- the previous audio information containing the alarm sound refers to the most recent audio information in which the mobile phone detected an alarm sound before the current audio information was determined to contain the alarm sound.
- the audio information detected by the mobile phone and the previous audio information containing the alarm sound are obtained within a preset time period, which means that the mobile phone has detected two alarm sounds consecutively within a time period, so it may be necessary to focus on reminding the user of the alarm sound.
- the preset time period can be set according to actual needs. Because step S404 is meant to identify two alarm sounds that occur in close succession, the preset time period should not be set too long.
- the preset time period can be set to 30 seconds.
- based on the position information of the alarm sound relative to the user, the mobile phone processes the standard reminder sound to obtain a three-dimensional reminder sound.
- the three-dimensional reminder sound can be understood as a directional warning sound.
- the standard reminder sound can be processed with three-dimensional sound technology to obtain a reminder sound that carries direction. When the direction-carrying reminder sound is output to the user, the user can perceive the direction of the alarm sound.
- the mobile phone pre-stores a plurality of standard reminder sounds, and the user can pre-set the standard reminder sound for the alarm reminder by adopting the alarm sound selection method proposed in the foregoing content.
- the user can be reminded to set the standard alarm sound by displaying the alarm sound selection interface shown in FIG. 3b through the mobile phone.
- the standard reminder sound can be understood as an alarm sound that does not contain noise; it can usually be a vehicle horn.
- the mobile phone also pre-stores a plurality of Head-Related Transfer Function (HRTF) values, which are usually set in pairs for the left and right earphones. That is, the HRTF values are divided into multiple HRTF values of the left earphone and, for each of them, a corresponding HRTF value of the right earphone.
- each pair of left-earphone and right-earphone HRTF values corresponds to one angle value of the alarm sound relative to the user.
- the human head can be used as the center point, and the 360° circle at a certain distance from the center point can be divided into multiple angle values.
- the 360° around the center point can be divided equally into multiple angle values.
- the number of division angles can be set according to actual conditions.
- the Head-Related Transfer Function (HRTF) value is calculated as shown in Formula 2:
- P_L and P_R are the frequency-domain complex sound pressures produced by the sound source at the left and right ears, respectively;
- P_0 is the frequency-domain sound pressure produced by the sound source at the position of the head center with the head removed; P_0 is defined in Formula 3:
- ρ_0 is the density of the medium (air)
- c is the speed of sound (344 m/s in air at normal temperature)
- Q_0 is the intensity of the sound source
- r represents the distance from the sound source to the center of the head
- f represents the frequency of the sound.
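The two formulas referred to above are not reproduced in this text. As a hedged reconstruction, assuming the patent follows the common textbook free-field definition of the HRTF with a point source, they would read as below. The symbols H_L and H_R (the left-ear and right-ear HRTFs) and the wave number k = 2πf/c are introduced here for illustration and are not part of the symbol list above.

```latex
% Formula 2 (assumed form): HRTF as a ratio of sound pressures
H_L = \frac{P_L}{P_0}, \qquad H_R = \frac{P_R}{P_0}

% Formula 3 (assumed form): free-field pressure of a point source of
% intensity Q_0 at distance r, with wave number k = 2\pi f / c
P_0(r, f) = j \, \frac{k \rho_0 c \, Q_0}{4 \pi r} \, e^{-jkr}
```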
- this step includes:
- the HRTF value of the left earphone and the HRTF value of the right earphone corresponding to the position information are obtained.
- the position information of the alarm sound relative to the user includes the horizontal direction angle of the alarm sound relative to the user. Using this horizontal direction angle as the screening factor, the multiple HRTF values stored in the mobile phone are filtered to obtain the HRTF value of the left earphone and the HRTF value of the right earphone that match the horizontal direction angle of the alarm sound relative to the user.
- the standard reminder sound is Fourier-transformed and multiplied by the HRTF value of the left earphone and the HRTF value of the right earphone corresponding to the position information, respectively, to obtain the binaural output signals, that is, the three-dimensional reminder sound of the left earphone and the three-dimensional reminder sound of the right earphone.
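The angle-matching part of this step can be sketched as a nearest-neighbor lookup. The sketch assumes, hypothetically, that the 360° around the head are divided equally and that the stored left/right pairs are kept in a list ordered by angle starting at 0°; the function names and the list layout are illustrative, not taken from the source.

```python
def nearest_angle_index(angle_deg, num_divisions):
    """Index of the equally spaced grid angle closest to angle_deg.

    Assumes the grid starts at 0 degrees and has num_divisions entries.
    """
    step = 360.0 / num_divisions
    # The final modulo wraps angles just below 360 degrees back to index 0.
    return round((angle_deg % 360.0) / step) % num_divisions


def select_filter_pair(angle_deg, table):
    """Return the stored (left, right) pair for the closest grid angle.

    table is a list of (left_value, right_value) pairs, one per grid angle.
    """
    return table[nearest_angle_index(angle_deg, len(table))]
```

The same lookup would apply whether the stored pairs are HRTF values or HRIR values, since both are indexed by the direction angle.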
- the mobile phone can also pre-store multiple Head-Related Impulse Response (HRIR) values, which are usually set in pairs for the left and right earphones. That is, the HRIR values are divided into multiple HRIR values of the left earphone and, for each of them, a corresponding HRIR value of the right earphone.
- each pair of left-earphone and right-earphone HRIR values corresponds to one angle value of the alarm sound relative to the user.
- the HRIR value of the left earphone and the HRIR value of the right earphone corresponding to the position information are obtained.
- the position information of the alarm sound relative to the user includes the horizontal direction angle of the alarm sound relative to the user. Using this horizontal direction angle as the screening factor, the multiple HRIR values stored in the mobile phone are filtered to obtain the HRIR value of the left earphone and the HRIR value of the right earphone that match the horizontal direction angle of the alarm sound relative to the user.
- the standard reminder sound is convolved with the HRIR value of the left earphone and the HRIR value of the right earphone corresponding to the position information, and the binaural output signal is obtained, that is, the three-dimensional reminder sound of the left earphone and the three-dimensional reminder sound of the right earphone.
- the position information of the alarm sound relative to the user used in the foregoing processing may be exactly the same as the position information obtained in step S403, approximately the same, or differ from it within a certain range.
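The HRIR-based implementation described above amounts to two time-domain convolutions. A minimal pure-Python sketch follows; the function names are illustrative, the signals are plain lists of samples, and a practical implementation would more likely use an FFT-based convolution for speed.

```python
def convolve(signal, impulse_response):
    """Full linear convolution of two sample sequences (O(n*m))."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(reminder, hrir_left, hrir_right):
    """Convolve the standard reminder sound with each ear's HRIR.

    Returns the (left, right) three-dimensional reminder sounds.
    """
    return convolve(reminder, hrir_left), convolve(reminder, hrir_right)
```

In this toy setting, a right-ear HRIR that is a one-sample delay of the left-ear HRIR makes the right channel lag the left channel, which is one of the cues that gives the reminder sound its perceived direction.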
- the mobile phone sends a three-dimensional reminder sound to the noise reduction headset.
- the mobile phone can send the three-dimensional reminder sound of the left earphone and the three-dimensional reminder sound of the right earphone to the noise-canceling earphone through a connection channel such as Bluetooth.
- the noise reduction earphone plays a three-dimensional reminder sound.
- the left earphone of the noise-canceling earphone outputs the three-dimensional reminder sound of the left earphone, and the right earphone outputs the three-dimensional reminder sound of the right earphone.
- when the alarm sound is detected in the audio information of the external environment, the mobile phone uses the audio information to locate the alarm sound, obtains the position information of the alarm sound relative to the user, and processes the standard reminder sound based on that position information to obtain a three-dimensional reminder sound. The three-dimensional reminder sound is then played by the noise-canceling earphone, which reminds the user that there is an alarm sound nearby and a possible safety problem.
- the mobile phone determines whether the difference between the location information of the alarm sound relative to the user in the audio information and the location information of the alarm sound relative to the user in the previous audio information containing the alarm sound is within a preset range.
- if the difference is not within the preset range, step S405 is executed.
- if the difference is within the preset range, step S409 is executed.
- if the difference between the position information of the alarm sound relative to the user in the audio information and the position information of the alarm sound relative to the user in the previous audio information containing the alarm sound is within the preset range, two alarm sounds have appeared in succession within the same range, so it is necessary to focus on reminding the user of the alarm sound.
- the preset range can be set according to the actual situation. Generally, it can be set so that the difference between the two horizontal direction angles of the alarm sound relative to the user is smaller than a first threshold.
- the first threshold may be set according to actual conditions, and in one example, the first threshold may be 5°.
- the mobile phone detects whether the audio information and the previous audio information including the alarm sound belong to the same sound.
- detecting whether the audio information and the previous audio information containing the alarm sound belong to the same sound means detecting whether the alarm sound in the audio information detected by the mobile phone and the alarm sound in the previous audio information containing the alarm sound are the same alarm sound.
- the method for detecting whether the alarm sound in the audio information and the alarm sound in the previous audio information containing the alarm sound belong to the same alarm sound may include the following two methods.
- the first method: detect whether the audio information and the previous audio information containing the alarm sound belong to the same sound.
- the second method: extract the alarm sound from the audio information and from the previous audio information containing the alarm sound, and judge whether the two extracted alarm sounds belong to the same alarm sound.
- when the alarm sound detection model detects that the audio information contains the alarm sound, it can also obtain the position of the alarm sound within the audio information; that position can therefore be used to extract the alarm sound from the audio information and from the previous audio information containing the alarm sound.
- the following takes the first method as an example to describe the process of detecting whether the alarm sound in the audio information and the alarm sound in the previous audio information containing the alarm sound belong to the same alarm sound.
- the method of judging whether two alarm sounds belong to the same alarm sound can also refer to the following content.
- the intensity of the two successive alarm sounds may differ, but if the two alarm sounds come from the same sound source, their frequency should be the same. Therefore, in a possible implementation, the amplitude spectrum can be used to judge whether the two pieces of audio information containing the alarm sound belong to the same sound.
- the specific implementation is as follows:
- Each audio information containing the warning sound is converted from the time domain to the frequency domain to obtain an amplitude spectrum of each audio information containing the warning sound.
- the amplitude spectrum of the audio information can be obtained by performing Fourier transform on the audio information including the warning sound.
- the x-axis of the amplitude spectrum is the frequency, and the y-axis is the amplitude of the audio information.
- the similarity calculation is performed on the two audio information, and the calculation result is obtained, which is used to represent whether the two audio information belong to the same sound.
- a Pearson correlation function may be used to calculate the similarity of the two successive pieces of audio information containing the alarm sound and obtain a similarity value.
- n sampling points are taken from each of the two successive pieces of audio information containing the alarm sound.
- the paired sampling points of the two pieces of audio information are denoted (X_i, Y_i), i = 1, ..., n; substituting them into Formula 4 gives the Pearson correlation coefficient r:
  $$r = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^{2}}\,\sqrt{\sum_{i=1}^{n}(Y_i-\bar{Y})^{2}}} \quad \text{(Formula 4)}$$
  where $\bar{X}$ and $\bar{Y}$ are the means of the X_i and the Y_i, respectively.
- the correlation strength of the two audio information containing the warning sound can be judged by Table 1.
- a threshold can be set according to the relationship between the Pearson correlation coefficient r and the correlation strength given in Table 1; for example, the threshold is set to 0.8. If the similarity value of the two pieces of audio information containing the alarm sound is greater than the threshold, they belong to the same sound; if the similarity value is not greater than the threshold, they do not belong to the same sound.
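The amplitude-spectrum comparison above can be sketched in pure Python using the standard definition of the Pearson correlation coefficient. The function names, the naive discrete Fourier transform, and the choice to correlate the spectra bin by bin are illustrative assumptions; only the 0.8 threshold comes from the text.

```python
import cmath
import math


def amplitude_spectrum(samples):
    """Magnitudes of DFT bins 0..n/2, computed with a naive O(n^2) transform."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient r of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant sequence
    return cov / math.sqrt(var_x * var_y)


def same_sound(audio_a, audio_b, threshold=0.8):
    """True if the amplitude spectra correlate above the threshold."""
    r = pearson(amplitude_spectrum(audio_a), amplitude_spectrum(audio_b))
    return r > threshold
```

Because the Pearson coefficient is unchanged by positive scaling, a quieter copy of the same tone still correlates almost perfectly with the original, which matches the observation that intensity may differ while frequency stays the same.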
- whether two pieces of audio information belong to the same sound can also be predicted by a classification model, such as a binary classification model, an SOM model, or an SVM model.
- the trained classification model takes two input signals, here the two successive pieces of audio information containing the alarm sound, predicts whether they belong to the same class, and outputs the prediction result.
- if the prediction result is 1, the two pieces of audio information containing the alarm sound belong to the same sound; if the prediction result is 0, they do not belong to the same sound.
- if the detected audio information belongs to the same sound as the previous audio information containing the alarm sound, step S410 is performed. If it does not belong to the same sound, step S405 is performed.
- if the audio information detected by the mobile phone belongs to the same sound as the previous audio information containing the alarm sound, an alarm sound from the same sound source has appeared twice in a row in the same direction relative to the user. Therefore, it is necessary to focus on reminding the user of the alarm sound.
- step S403 , step S404 , step S408 and step S409 are not limited to the execution order shown in FIG. 4 , and may be executed in parallel.
- step S404, step S408, and step S409 are not limited to the execution sequence shown in FIG. 4, and may be executed in parallel or in other execution sequences.
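The three checks in steps S404, S408 and S409 jointly decide which branch is taken, and since they may run in any order their results can simply be combined. A minimal sketch follows, using the example values from the text (30 seconds, 5°) as defaults. The function names are hypothetical, and handling the wrap-around at 360° is an added assumption.

```python
def angle_difference(a_deg, b_deg):
    """Smallest absolute difference between two directions, in degrees."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def needs_emphasised_reminder(dt_seconds, angle_now, angle_prev, same_sound,
                              window_seconds=30.0, angle_threshold=5.0):
    """Combine the checks of steps S404, S408 and S409.

    True  -> continue with steps S410 to S413 (reminder with energy gain).
    False -> fall back to steps S405 to S407 (plain three-dimensional reminder).
    """
    return (dt_seconds <= window_seconds
            and angle_difference(angle_now, angle_prev) < angle_threshold
            and same_sound)
```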
- the mobile phone generates a distance coefficient, where the distance coefficient is used to represent the energy gain of the audio information relative to the previous audio information including the warning sound.
- if the energy of the audio information is greater than the energy of the previous audio information containing the alarm sound, the energy gain is positive, that is, the distance coefficient is a value greater than 1; if the energy of the audio information is smaller than the energy of the previous audio information containing the alarm sound, the energy gain is negative, that is, the distance coefficient is a value less than 1; if the two energies are the same, the energy gain is 0, that is, the distance coefficient is 1.
- the distance coefficient gain can be calculated using Formula 5.
- k is a constant.
- the range of the distance coefficient can be set in advance, for example 0.1 to 10. After the distance coefficient is calculated in step S410, it is checked against this range. If the distance coefficient is within the range, the following steps are performed with it. If the distance coefficient is outside the range, the endpoint of the range closest to the calculated value (that is, the maximum or minimum of the range) is used as the distance coefficient for the following steps.
- using the endpoint value when the distance coefficient generated in step S410 is outside the range prevents the distance coefficient from being too large or too small, which would make the volume of the three-dimensional reminder sound with energy gain generated in the following steps too high or too low.
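The range check described above is a clamp. A minimal sketch, with the example range of 0.1 to 10 as defaults; the function name is illustrative, and how the raw coefficient is computed (Formula 5) is outside this sketch.

```python
def clamp_distance_coefficient(gain, lower=0.1, upper=10.0):
    """Limit the distance coefficient to the preset range.

    Values outside [lower, upper] are replaced by the nearest endpoint.
    """
    return max(lower, min(upper, gain))
```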
- based on the position information of the alarm sound relative to the user and the distance coefficient, the mobile phone processes the standard reminder sound to obtain a three-dimensional reminder sound with energy gain.
- the method of obtaining the standard reminder sound and determining the HRTF value and the HRIR value is the same as that of the aforementioned step S405, and will not be described here.
- the standard reminder sound is Fourier-transformed and then multiplied by the HRTF value of the left earphone and the HRTF value of the right earphone corresponding to the position information, respectively, to obtain the binaural output signals, that is, the three-dimensional reminder sound of the left earphone and the three-dimensional reminder sound of the right earphone. These are then each multiplied by the distance coefficient gain to obtain the three-dimensional reminder sounds of the left and right earphones with energy gain.
- the standard reminder sound is convolved with the HRIR value of the left earphone and the HRIR value of the right earphone corresponding to the position information, respectively, to obtain the binaural output signals, that is, the three-dimensional reminder sound of the left earphone and the three-dimensional reminder sound of the right earphone. These are then each multiplied by the distance coefficient gain to obtain the three-dimensional reminder sounds of the left and right earphones with energy gain.
- the mobile phone processes the standard reminder sound to obtain a three-dimensional reminder sound with energy gain. If the sound source of the alarm sound keeps approaching the user, the energy of the audio information acquired by the mobile phone at a later time is greater than the energy of the audio information acquired at the earlier time. The energy gain is therefore positive, the distance coefficient is greater than 1, and the three-dimensional reminder sound with energy gain carries more energy than the previous three-dimensional reminder sound, which ensures that the user is reminded with emphasis.
- the mobile phone sends a three-dimensional reminder sound with energy gain to the noise reduction headset.
- the mobile phone can send the three-dimensional reminder sound of the left earphone with energy gain and the three-dimensional reminder sound of the right earphone with energy gain to the noise-canceling earphone through a connection channel such as Bluetooth.
- the noise reduction earphone plays a three-dimensional reminder sound with energy gain.
- the left earphone of the noise-canceling earphone outputs the three-dimensional reminder sound of the left earphone with energy gain, and the right earphone outputs the three-dimensional reminder sound of the right earphone with energy gain.
- step S404, step S408 to step S413 are optional steps. In some embodiments, if there is an alarm sound in the user's environment and the user is reminded of the alarm sound through the noise-canceling earphones, step S404, step S408 to step S413 may not be performed. Step S405 to step S407 are directly executed after step S403 is executed.
- Embodiment 1 may also be performed by noise-canceling headphones.
- the noise-canceling earphone completely replaces the mobile phone, and the audio information processing method shown in FIG. 4 is completely implemented. That is, after the earphone intelligent reminder alarm sound function is activated, during the operation of the noise-canceling earphone, use its own microphone to collect the sound of the external environment, obtain audio information, and use the audio information to perform steps S402 to S405, step S407 to step S411, and Step S413.
- the earphone smart reminder alarm sound function is activated, and the microphone array of the mobile phone collects the sound of the external environment to obtain audio information.
- the noise-canceling earphone uses the audio information to execute steps S402 to S405, steps S407 to S411, and step S413.
- the user wears noise-canceling earphones and wears a smart watch on the wrist, and the mobile phone establishes Bluetooth connections with the smart watch and the noise-canceling earphones respectively.
- the noise-canceling earphones and the smart watch can also exchange information through connection channels such as Bluetooth, so as to remind the user when there is a dangerous alarm sound around the user.
- for descriptions of the noise-canceling earphones and the smart watch, refer to the foregoing content; they are not repeated here.
- a method for processing audio information provided in this embodiment includes:
- the smart watch acquires the audio information obtained by the noise reduction earphone.
- the microphone of the noise-canceling headset collects the sound of the external environment to obtain audio information
- the smart watch can obtain audio information through the Bluetooth channel.
- the noise-canceling earphones can transmit the audio information to the mobile phone through a Bluetooth channel, and the mobile phone can then transmit the audio information to the smart watch through a Bluetooth channel.
- the noise-canceling earphones can transmit audio information to the smart watch through a bluetooth channel or the like.
- the smart watch calls the alarm sound detection model to detect whether the audio information contains the alarm sound, and obtains a detection result, which is used to represent whether the audio information contains the alarm sound.
- the alarm sound detection model has the function of predicting whether the audio information input to the alarm sound detection model contains the alarm sound. Therefore, after acquiring the audio information of the external environment, the smart watch can use the alarm sound detection model to detect whether the audio information contains an alarm sound.
- the smart watch pre-stores a trained alarm sound detection model. After the smart watch acquires the audio information, it invokes the alarm sound detection model to detect whether the audio information contains an alarm sound and obtains the detection result.
- the noise-canceling earphone can also call the alarm sound detection model to detect whether the audio information contains the alarm sound, obtain the detection result, and then transmit the detection result to the smart watch. In this way, the smart watch may not perform step S702.
- if the detection result indicates that the audio information contains an alarm sound, steps S703 and S704 are executed; if the detection result indicates that the audio information does not contain an alarm sound, the process returns to step S701.
- the smart watch uses the audio information to locate the alarm sound, and obtains position information of the alarm sound relative to the user.
- the smart watch can use the sound source localization algorithm based on the microphone array proposed in the foregoing content, and use audio information to perform sound source localization for the warning sound.
- the smart watch uses the audio information collected by the microphone of the left earphone of the noise-canceling earphone and the audio information collected by the microphone of the right earphone to locate the sound source of the alarm sound, and obtain the position information of the alarm sound relative to the user.
- the position information generally includes the horizontal direction angle θ of the alarm sound relative to the user.
- alternatively, the noise-canceling earphone itself can use the audio information collected by the microphone of the left earphone and the audio information collected by the microphone of the right earphone, together with the microphone-array sound source localization algorithm described above, to locate the sound source of the alarm sound and obtain the position information of the alarm sound relative to the user.
- the noise-canceling earphones can then transmit the obtained direction angle of the alarm sound relative to the user to the smart watch. In this case, the smart watch may not perform step S703.
- the smart watch can obtain the audio information collected by the built-in microphone array of the mobile phone, for example through a Bluetooth channel.
- the smart watch uses the audio information collected by the microphone array to locate the sound source of the alarm sound and obtain the position information of the alarm sound relative to the user.
- for the way the smart watch uses the audio information collected by the microphone array of the mobile phone to locate the sound source of the alarm sound and obtain the position information of the alarm sound relative to the user, refer to the description of step S403 in Embodiment 1, which is not repeated here.
- the smart watch detects whether the audio information and the previous audio information including the alarm sound are obtained within a preset time period.
- if they were not obtained within the preset time period, steps S705 to S707 are performed; if they were obtained within the preset time period, steps S708 to S713 are performed.
- based on the position information of the alarm sound relative to the user, the smart watch processes the standard reminder sound to obtain a three-dimensional reminder sound.
- the user can use the alarm sound selection method proposed in the foregoing content to pre-set the standard alarm sound for alarm reminder.
- the user can be reminded to set the standard alarm sound by displaying the alarm sound selection interface shown in FIG. 3b through the mobile phone.
- the smart watch also pre-stores a plurality of Head-Related Transfer Function (HRTF) values, which are usually set in pairs for the left and right earphones. That is, the HRTF values are divided into multiple HRTF values of the left earphone and, for each of them, a corresponding HRTF value of the right earphone.
- each pair of left-earphone and right-earphone HRTF values corresponds to one angle value of the alarm sound relative to the user.
- the smart watch can also pre-store multiple Head-Related Impulse Response (HRIR) values, which are usually set in pairs for the left and right earphones. That is, the HRIR values are divided into multiple HRIR values of the left earphone and, for each of them, a corresponding HRIR value of the right earphone.
- each pair of left-earphone and right-earphone HRIR values corresponds to one angle value of the alarm sound relative to the user.
- the implementation of step S705, in which the standard reminder sound is processed to obtain the three-dimensional reminder sound, can be the same as the two possible implementations of step S405 in Embodiment 1 above, and is not repeated here.
- the smart watch sends a three-dimensional reminder sound to the noise reduction earphone.
- the smart watch sends the three-dimensional reminder sound of the left earphone and the three-dimensional reminder sound of the right earphone to the noise reduction earphone.
- the smart watch sends the three-dimensional reminder sound of the left earphone and the three-dimensional reminder sound of the right earphone to the noise reduction earphone through the mobile phone.
- the noise-canceling earphone plays a three-dimensional reminder sound.
- the left earphone of the noise-canceling earphone outputs the three-dimensional reminder sound of the left earphone
- the right earphone outputs the three-dimensional reminder sound of the right earphone
- the alarm sound is detected in the audio of the external environment
- the smart watch uses the audio information to locate the alarm sound, obtains the position information of the alarm sound relative to the user, and based on the position information of the alarm sound relative to the user, the processing standard
- the reminder sound gets a three-dimensional reminder sound, and then the three-dimensional reminder sound is played by the noise-canceling earphones, which can remind the user that there is an alarm sound around and there is a safety problem.
- The smart watch judges whether the difference between the position information of the alarm sound relative to the user in the audio information and the position information of the alarm sound relative to the user in the previous audio information containing the alarm sound is within a preset range.
- If the smart watch determines that this difference is not within the preset range, step S705 is executed.
- If the difference is within the preset range, step S709 is executed.
- For the implementation of step S708 by the smart watch, refer to the content of step S408 in the first embodiment above, which is not repeated here.
- The smart watch detects whether the audio information is the same sound as the previous audio information containing the alarm sound.
- For an implementation in which the smart watch detects whether the audio information is the same sound as the previous audio information containing the alarm sound, refer to the content of step S409 in the first embodiment above, which is not repeated here.
- If the audio information is detected to be the same sound as the previous audio information containing the alarm sound, step S710 is executed. If it is not the same sound, step S705 is executed.
- The smart watch generates a distance coefficient, where the distance coefficient represents the energy gain of the audio information relative to the previous audio information containing the alarm sound.
- For the implementation in which the smart watch generates the distance coefficient, refer to the content of step S410 in the first embodiment above, which is not repeated here.
- Based on the position information of the alarm sound relative to the user and the distance coefficient, the smart watch processes the standard reminder sound to obtain a three-dimensional reminder sound with energy gain.
- For the implementation of this processing, refer to the content of step S411 in the first embodiment above, which is not repeated here.
- The smart watch sends the three-dimensional reminder sound with energy gain to the noise-canceling earphone.
- The smart watch sends the three-dimensional reminder sound of the left earphone with energy gain and the three-dimensional reminder sound of the right earphone with energy gain to the noise-canceling earphone.
- Alternatively, the smart watch sends the three-dimensional reminder sound of the left earphone with energy gain and the three-dimensional reminder sound of the right earphone with energy gain to the noise-canceling earphone through the mobile phone.
- The noise-canceling earphone plays the three-dimensional reminder sound with energy gain.
- The left earphone of the noise-canceling earphone outputs the three-dimensional reminder sound of the left earphone with energy gain.
- The right earphone outputs the three-dimensional reminder sound of the right earphone with energy gain.
- Another embodiment of the present application also provides a computer-readable storage medium in which instructions are stored. When the instructions are run on a computer or a processor, the computer or the processor performs one or more steps of any one of the above methods.
- Another embodiment of the present application also provides a computer program product including instructions.
- When the computer program product is run on a computer or a processor, the computer or the processor is caused to perform one or more steps of any one of the above methods.
- Another embodiment of the present application also provides an audio processing system. The system includes an electronic device and an earphone. The electronic device is, for example, a mobile phone or a smart watch, and the earphone can be a noise-canceling earphone. For the working process of the electronic device and the earphone, refer to the content of the first and second embodiments above, which is not repeated here.
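The HRIR-based processing described above (a stored pair of left and right HRIR values per angle, convolved with the standard reminder sound) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the application: the table `HRIR_TABLE`, its toy filter values, the nearest-angle lookup, the convention that 90 degrees means the user's right side, and the `gain` parameter, which is treated as a plain amplitude multiplier standing in for the energy gain.

```python
# Hypothetical table: angle in degrees -> (left-ear HRIR, right-ear HRIR).
# Real HRIRs are measured filters; these toy values only delay and
# attenuate the far ear so the example stays self-contained.
HRIR_TABLE = {
    0: ([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]),
    90: ([0.0, 0.0, 0.5], [1.0, 0.0, 0.0]),   # source on the right (assumed)
    180: ([0.8, 0.0, 0.0], [0.8, 0.0, 0.0]),
    270: ([1.0, 0.0, 0.0], [0.0, 0.0, 0.5]),  # source on the left (assumed)
}


def convolve(signal, kernel):
    """Full discrete convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def nearest_angle(angle_deg):
    """Return the stored angle closest to angle_deg on the circle."""
    wrapped = angle_deg % 360

    def circular_distance(stored):
        diff = abs(stored - wrapped)
        return min(diff, 360 - diff)

    return min(HRIR_TABLE, key=circular_distance)


def render_3d_reminder(standard_sound, angle_deg, gain=1.0):
    """Convolve the standard sound with the HRIR pair for the given angle.

    Returns (left_channel, right_channel). `gain` is an assumed amplitude
    multiplier applied to both channels.
    """
    left_hrir, right_hrir = HRIR_TABLE[nearest_angle(angle_deg)]
    left = [gain * v for v in convolve(standard_sound, left_hrir)]
    right = [gain * v for v in convolve(standard_sound, right_hrir)]
    return left, right


left, right = render_3d_reminder([1.0, 2.0], angle_deg=100)
print(left, right)
```

With the toy table, a source at 100 degrees is mapped to the 90-degree pair, so the left channel arrives later and quieter than the right. The HRTF variant would instead transform the standard sound to the frequency domain and multiply it by the stored HRTF values.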
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Computer Networks & Wireless Communication (AREA)
- Human Computer Interaction (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Environmental & Geological Engineering (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Otolaryngology (AREA)
- Telephone Function (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
Description
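| r | Correlation strength |
|---|---|
| 0.8-1.0 | Very strong correlation |
| 0.6-0.8 | Strong correlation |
| 0.4-0.6 | Moderate correlation |
| 0.2-0.4 | Weak correlation |
| 0.0-0.2 | Very weak correlation |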
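The table above relates the Pearson coefficient r to a correlation strength. As a minimal sketch of how two audio frames might be compared on that basis, the following Python code computes magnitude spectra with a naive DFT and correlates them. The function names, the 0.8 default threshold, and the choice to assign a boundary value to the stronger band are assumptions made for illustration, not values stated in the application.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short frames)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant sequence has no defined correlation
    return cov / math.sqrt(var_x * var_y)


def correlation_strength(r):
    """Map |r| to the bands of the table (boundaries go to the upper band)."""
    r = abs(r)
    if r >= 0.8:
        return "very strong"
    if r >= 0.6:
        return "strong"
    if r >= 0.4:
        return "moderate"
    if r >= 0.2:
        return "weak"
    return "very weak"


def same_sound(frame_a, frame_b, threshold=0.8):
    """Assumed rule: same sound if the spectral correlation exceeds threshold."""
    r = pearson(magnitude_spectrum(frame_a), magnitude_spectrum(frame_b))
    return r > threshold


n = 32
tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
louder = [2.0 * s for s in tone]                      # same tone, closer source
other = [math.sin(2 * math.pi * 10 * t / n) for t in range(n)]
print(same_sound(tone, louder), same_sound(tone, other))
```

Because the comparison uses magnitude spectra, a louder copy of the same tone still correlates almost perfectly, which is the property needed when the same alarm sound approaches the user.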
Claims (41)
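- An audio information processing method, characterized in that it is applied to an electronic device and comprises: acquiring audio information, the audio information being obtained by collecting sound in the environment in which the electronic device is located; determining that the audio information includes an alarm sound; determining first position information of the alarm sound based on the audio information, the first position information identifying the sound source direction of the alarm sound; determining a first sound, the first sound including second position information, the second position information identifying the sound source direction of the alarm sound, and the second position information being the same as or different from the first position information; and playing the first sound.
- The audio information processing method according to claim 1, characterized in that, before determining the first sound including the second position information, the method further comprises: determining that the audio information and the previous audio information containing an alarm sound were not acquired within a preset time period.
- The audio information processing method according to claim 2, characterized in that it further comprises: determining that the audio information and the previous audio information containing an alarm sound were acquired within the preset time period; upon judging that the difference between the first position information of the alarm sound in the audio information and the first position information of the alarm sound in the previous audio information containing an alarm sound is within a preset range, and detecting that the alarm sound in the audio information and the alarm sound in the previous audio information containing an alarm sound belong to the same sound, generating a distance coefficient, the distance coefficient representing the energy gain of the audio information relative to the previous audio information containing an alarm sound; determining a second sound, the second sound including the second position information and the energy gain; and playing the second sound.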
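- The audio information processing method according to any one of claims 1 to 3, characterized in that playing the first sound comprises: sending the first sound to an earphone, the first sound being played by the earphone.
- The audio information processing method according to claim 3, characterized in that playing the second sound comprises: sending the second sound to an earphone, the second sound being played by the earphone.
- The audio information processing method according to any one of claims 1 to 5, characterized in that determining the first position information of the alarm sound based on the audio information comprises: based on a microphone-array sound source localization algorithm, performing sound source localization on the alarm sound using the audio information to obtain the first position information of the alarm sound.
- The audio information processing method according to any one of claims 1 to 5, characterized in that determining the first position information of the alarm sound based on the audio information comprises: determining third position information of the alarm sound based on the audio information, the third position information identifying the sound source direction of the alarm sound relative to the electronic device; and performing coordinate conversion on the third position information of the alarm sound to obtain the first position information of the alarm sound.
- The audio information processing method according to any one of claims 1 to 5, characterized in that determining the first sound including the second position information comprises: acquiring a standard sound; and processing the standard sound based on the first position information of the alarm sound to obtain the first sound, the first sound including the second position information.
- The audio information processing method according to claim 8, characterized in that processing the standard sound based on the first position information of the alarm sound to obtain the first sound comprises: acquiring the head-related impulse response (HRIR) values corresponding to the first position information of the alarm sound; and convolving the standard sound with each of the HRIR values to obtain the first sound.
- The audio information processing method according to claim 8, characterized in that processing the standard sound based on the first position information of the alarm sound to obtain the first sound comprises: acquiring the head-related transfer function (HRTF) values corresponding to the first position information of the alarm sound; and applying a Fourier transform to the standard sound and multiplying the result by the HRTF values to obtain the first sound.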
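- The audio information processing method according to claim 3, characterized in that detecting that the alarm sound in the audio information and the alarm sound in the previous audio information containing an alarm sound belong to the same sound comprises: converting the audio information and the previous audio information containing an alarm sound from the time domain to the frequency domain to obtain their respective magnitude spectra; and performing a similarity calculation on the audio information and the previous audio information containing an alarm sound using the magnitude spectra to obtain a calculation result, the calculation result indicating whether the audio information and the previous audio information containing an alarm sound belong to the same sound.
- The audio information processing method according to claim 11, characterized in that performing the similarity calculation using the magnitude spectra to obtain the calculation result comprises: performing the similarity calculation on the audio information and the previous audio information containing an alarm sound using the Pearson correlation function to obtain a similarity value; wherein, if the similarity value is greater than a threshold, the audio information and the previous audio information containing an alarm sound belong to the same sound, and if the similarity value is not greater than the threshold, they do not belong to the same sound.
- The audio information processing method according to claim 11, characterized in that performing the similarity calculation using the magnitude spectra to obtain the calculation result comprises: using a classification model to predict whether the audio information and the previous audio information containing an alarm sound belong to the same sound.
- The audio information processing method according to claim 3, characterized in that detecting that the alarm sound in the audio information and the alarm sound in the previous audio information containing an alarm sound belong to the same sound comprises: extracting the alarm sound from the audio information and from the previous audio information containing an alarm sound, respectively; and judging whether the two extracted alarm sounds are the same alarm sound.
- The audio information processing method according to claim 14, characterized in that judging whether the two extracted alarm sounds are the same alarm sound comprises: converting each of the two extracted alarm sounds from the time domain to the frequency domain to obtain their magnitude spectra; and performing a similarity calculation on the two extracted alarm sounds using the magnitude spectra to obtain a calculation result, the calculation result indicating whether the two extracted alarm sounds are the same alarm sound.
- The audio information processing method according to claim 15, characterized in that performing the similarity calculation on the two extracted alarm sounds using the magnitude spectra to obtain the calculation result comprises: performing the similarity calculation on the two extracted alarm sounds using the Pearson correlation function to obtain a similarity value; wherein, if the similarity value is greater than a threshold, the two extracted alarm sounds are the same alarm sound, and if the similarity value is not greater than the threshold, they are not the same alarm sound.
- The audio information processing method according to claim 15, characterized in that performing the similarity calculation on the two extracted alarm sounds using the magnitude spectra to obtain the calculation result comprises: using a classification model to predict whether the two extracted alarm sounds are the same alarm sound.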
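- The audio information processing method according to claim 3, characterized in that, after generating the distance coefficient, the method further comprises: determining that the distance coefficient is within the range of the distance coefficient.
- The audio information processing method according to claim 18, characterized in that it further comprises: determining that the distance coefficient exceeds the range of the distance coefficient; determining a third sound, the third sound including the second position information and the energy gain represented by the endpoint value of the range of the distance coefficient; and playing the third sound.
- The audio information processing method according to any one of claims 1 to 19, characterized in that determining that the audio information includes an alarm sound comprises: invoking an alarm sound detection model to detect whether the audio information contains an alarm sound and obtain a detection result, the detection result indicating whether the audio information contains an alarm sound.
- An electronic device, characterized in that the electronic device comprises: one or more processors, a memory, and a wireless communication module; the memory and the wireless communication module are coupled to the one or more processors, the memory is configured to store computer program code, the computer program code includes computer instructions, and when the one or more processors execute the computer instructions, the electronic device performs the audio information processing method according to any one of claims 1 to 20.
- A computer storage medium, characterized in that it is configured to store a computer program which, when executed, implements the audio information processing method according to any one of claims 1 to 20.
- A computer program product, characterized in that, when the computer program product is run on a computer, the computer is caused to perform the audio information processing method according to any one of claims 1 to 20.
- An audio information processing system, characterized in that it comprises an electronic device and an earphone, wherein the electronic device is configured to perform the audio information processing method according to any one of claims 1 to 20, and the earphone interacts with the electronic device and is configured to play the first sound, the second sound, or the third sound in response to the electronic device.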
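- An audio information processing method, characterized in that it is applied to an electronic device and comprises: acquiring audio information, the audio information being obtained by collecting sound in the environment in which the electronic device is located; determining that the audio information includes an alarm sound; determining first position information of the alarm sound based on the audio information, the first position information identifying the sound source direction of the alarm sound; determining that the audio information and the previous audio information containing an alarm sound were not acquired within a preset time period; determining a first sound, the first sound including second position information, the second position information identifying the sound source direction of the alarm sound, the second position information being the same as or different from the first position information, and the first sound being a three-dimensional reminder sound, that is, an alarm sound that carries a direction; sending the first sound to an earphone, the first sound being played by the earphone; determining that the audio information and the previous audio information containing an alarm sound were acquired within the preset time period; upon judging that the difference between the first position information of the alarm sound in the audio information and the first position information of the alarm sound in the previous audio information containing an alarm sound is within a preset range, and detecting that the alarm sound in the audio information and the alarm sound in the previous audio information containing an alarm sound belong to the same sound, generating a distance coefficient, the distance coefficient representing the energy gain of the audio information relative to the previous audio information containing an alarm sound; determining a second sound, the second sound including the second position information and the energy gain; and playing the second sound.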
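- The audio information processing method according to claim 25, characterized in that playing the second sound comprises: sending the second sound to an earphone, the second sound being played by the earphone.
- The audio information processing method according to claim 25 or 26, characterized in that determining the first position information of the alarm sound based on the audio information comprises: based on a microphone-array sound source localization algorithm, performing sound source localization on the alarm sound using the audio information to obtain the first position information of the alarm sound.
- The audio information processing method according to claim 25 or 26, characterized in that determining the first position information of the alarm sound based on the audio information comprises: determining third position information of the alarm sound based on the audio information, the third position information identifying the sound source direction of the alarm sound relative to the electronic device; and performing coordinate conversion on the third position information of the alarm sound to obtain the first position information of the alarm sound.
- The audio information processing method according to claim 25 or 26, characterized in that determining the first sound including the second position information comprises: acquiring a standard sound; and processing the standard sound based on the first position information of the alarm sound to obtain the first sound, the first sound including the second position information.
- The audio information processing method according to claim 29, characterized in that processing the standard sound based on the first position information of the alarm sound to obtain the first sound comprises: acquiring the head-related impulse response (HRIR) values corresponding to the first position information of the alarm sound; and convolving the standard sound with each of the HRIR values to obtain the first sound.
- The audio information processing method according to claim 29, characterized in that processing the standard sound based on the first position information of the alarm sound to obtain the first sound comprises: acquiring the head-related transfer function (HRTF) values corresponding to the first position information of the alarm sound; and applying a Fourier transform to the standard sound and multiplying the result by the HRTF values to obtain the first sound.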
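- The audio information processing method according to claim 25, characterized in that detecting that the alarm sound in the audio information and the alarm sound in the previous audio information containing an alarm sound belong to the same sound comprises: converting the audio information and the previous audio information containing an alarm sound from the time domain to the frequency domain to obtain their respective magnitude spectra; and performing a similarity calculation on the audio information and the previous audio information containing an alarm sound using the magnitude spectra to obtain a calculation result, the calculation result indicating whether the audio information and the previous audio information containing an alarm sound belong to the same sound.
- The audio information processing method according to claim 32, characterized in that performing the similarity calculation using the magnitude spectra to obtain the calculation result comprises: performing the similarity calculation on the audio information and the previous audio information containing an alarm sound using the Pearson correlation function to obtain a similarity value; wherein, if the similarity value is greater than a threshold, the audio information and the previous audio information containing an alarm sound belong to the same sound, and if the similarity value is not greater than the threshold, they do not belong to the same sound.
- The audio information processing method according to claim 32, characterized in that performing the similarity calculation using the magnitude spectra to obtain the calculation result comprises: using a classification model to predict whether the audio information and the previous audio information containing an alarm sound belong to the same sound.
- The audio information processing method according to claim 25, characterized in that detecting that the alarm sound in the audio information and the alarm sound in the previous audio information containing an alarm sound belong to the same sound comprises: extracting the alarm sound from the audio information and from the previous audio information containing an alarm sound, respectively; and judging whether the two extracted alarm sounds are the same alarm sound.
- The audio information processing method according to claim 35, characterized in that judging whether the two extracted alarm sounds are the same alarm sound comprises: converting each of the two extracted alarm sounds from the time domain to the frequency domain to obtain their magnitude spectra; and performing a similarity calculation on the two extracted alarm sounds using the magnitude spectra to obtain a calculation result, the calculation result indicating whether the two extracted alarm sounds are the same alarm sound.
- The audio information processing method according to claim 36, characterized in that performing the similarity calculation on the two extracted alarm sounds using the magnitude spectra to obtain the calculation result comprises: performing the similarity calculation on the two extracted alarm sounds using the Pearson correlation function to obtain a similarity value; wherein, if the similarity value is greater than a threshold, the two extracted alarm sounds are the same alarm sound, and if the similarity value is not greater than the threshold, they are not the same alarm sound.
- The audio information processing method according to claim 36, characterized in that performing the similarity calculation on the two extracted alarm sounds using the magnitude spectra to obtain the calculation result comprises: using a classification model to predict whether the two extracted alarm sounds are the same alarm sound.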
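- The audio information processing method according to claim 25, characterized in that, after generating the distance coefficient, the method further comprises: determining that the distance coefficient is within the range of the distance coefficient.
- The audio information processing method according to claim 39, characterized in that it further comprises: determining that the distance coefficient exceeds the range of the distance coefficient; determining a third sound, the third sound including the second position information and the energy gain represented by the endpoint value of the range of the distance coefficient; and playing the third sound.
- The audio information processing method according to claim 25 or 26, or any one of claims 30 to 40, characterized in that determining that the audio information includes an alarm sound comprises: invoking an alarm sound detection model to detect whether the audio information contains an alarm sound and obtain a detection result, the detection result indicating whether the audio information contains an alarm sound.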
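The branching among the first, second, and third sounds in the claims can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: the distance coefficient is assumed to be the ratio of mean frame energies, and the time window, angle tolerance, and coefficient range defaults are hypothetical values chosen only to make the example concrete.

```python
def mean_energy(samples):
    """Mean squared amplitude of a frame."""
    return sum(s * s for s in samples) / len(samples)


def distance_coefficient(current, previous):
    """Assumed definition: energy of the current frame over the previous one."""
    prev_energy = mean_energy(previous)
    if prev_energy == 0:
        return 1.0  # no reference energy, so apply no gain
    return mean_energy(current) / prev_energy


def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def choose_reminder(elapsed_s, angle_diff_deg, is_same_sound, coeff, *,
                    window_s=5.0, max_angle_diff=30.0, coeff_range=(0.5, 2.0)):
    """Return (which_sound, gain) following the branching in the claims.

    - Outside the time window, direction changed too much, or a different
      sound: plain directional reminder ("first", no gain change).
    - Coefficient inside its range: "second" sound with that gain.
    - Coefficient outside its range: "third" sound with the range endpoint.
    All keyword defaults are hypothetical.
    """
    if (elapsed_s > window_s or angle_diff_deg > max_angle_diff
            or not is_same_sound):
        return ("first", 1.0)
    low, high = coeff_range
    if low <= coeff <= high:
        return ("second", coeff)
    return ("third", clamp(coeff, low, high))


coeff = distance_coefficient([2.0, 2.0], [1.0, 1.0])
print(coeff, choose_reminder(1.0, 5.0, True, coeff))
```

Clamping the coefficient to the endpoint of its range keeps a rapidly approaching alarm from producing an uncomfortably loud reminder while still signalling that the source is getting closer.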
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US18/291,854 US20240411507A1 (en) | 2021-10-26 | 2022-09-01 | Audio information processing method, electronic device, system, product, and medium |
| EP22885410.5A EP4354900B1 (en) | 2021-10-26 | 2022-09-01 | Audio information processing method, corresponding electronic device and computer storage medium |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202111248720.6A CN114189790B (zh) | 2021-10-26 | 2021-10-26 | Audio information processing method, electronic device, system, product, and medium |
| CN202111248720.6 | 2021-10-26 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023071519A1 (zh) | 2023-05-04 |
Family
ID=80540443
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2022/116528 Ceased WO2023071519A1 (zh) | 2021-10-26 | 2022-09-01 | Audio information processing method, electronic device, system, product, and medium |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US20240411507A1 (zh) |
| EP (1) | EP4354900B1 (zh) |
| CN (1) | CN114189790B (zh) |
| WO (1) | WO2023071519A1 (zh) |
Families Citing this family (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114189790B (zh) * | 2021-10-26 | 2022-11-29 | 北京荣耀终端有限公司 | Audio information processing method, electronic device, system, product, and medium |
| CN114760560B (zh) * | 2022-03-23 | 2025-07-22 | 歌尔股份有限公司 | Sound signal processing method and apparatus, earphone device, and storage medium |
| CN115278468A (zh) * | 2022-05-27 | 2022-11-01 | 歌尔股份有限公司 | Sound output method and apparatus, electronic device, and computer-readable storage medium |
| CN115623156B (zh) * | 2022-08-30 | 2024-04-02 | 荣耀终端有限公司 | Audio processing method and related apparatus |
| CN115273431B (zh) * | 2022-09-26 | 2023-03-07 | 荣耀终端有限公司 | Device retrieval method and apparatus, storage medium, and electronic device |
| US20250037550A1 (en) * | 2023-07-24 | 2025-01-30 | Samsung Electronics Co., Ltd. | Method and electronic device for providing environmental audio alert on personal audio device |
| US20260089439A1 (en) * | 2024-09-26 | 2026-03-26 | Intel Corporation | Apparatus, system, and method of direction-based sound event indication |
| CN121037738B (zh) * | 2025-10-30 | 2026-02-06 | 江苏物润船联网络股份有限公司 | Method and system for implementing an interruption function of a smart earphone based on positioning technology |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20140301556A1 (en) * | 2012-04-09 | 2014-10-09 | Dts, Inc. | Directional based audio response to an external environment emergency signal |
| CN107767697A (zh) * | 2016-08-19 | 2018-03-06 | 索尼公司 | System and method for processing traffic sound data to provide driver assistance |
| US20180206038A1 (en) * | 2017-01-13 | 2018-07-19 | Bose Corporation | Real-time processing of audio data captured using a microphone array |
| CN108600885A (zh) * | 2018-03-30 | 2018-09-28 | 广东欧珀移动通信有限公司 | Sound signal processing method and related products |
| CN110001512A (zh) * | 2018-01-02 | 2019-07-12 | 福特全球技术公司 | Motor vehicle with sound capture device |
| CN114189790A (zh) * | 2021-10-26 | 2022-03-15 | 荣耀终端有限公司 | Audio information processing method, electronic device, system, product, and medium |
Family Cites Families (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20110026745A1 (en) * | 2009-07-31 | 2011-02-03 | Amir Said | Distributed signal processing of immersive three-dimensional sound for audio conferences |
| US8797386B2 (en) * | 2011-04-22 | 2014-08-05 | Microsoft Corporation | Augmented auditory perception for the visually impaired |
| CN102624980A (zh) * | 2012-03-06 | 2012-08-01 | 惠州Tcl移动通信有限公司 | Mobile-phone-based method for detecting sudden environmental events and prompting through an earphone, and mobile phone |
| US10425717B2 (en) * | 2014-02-06 | 2019-09-24 | Sr Homedics, Llc | Awareness intelligence headphone |
| US9788101B2 (en) * | 2014-07-10 | 2017-10-10 | Deutsche Telekom Ag | Method for increasing the awareness of headphone users, using selective audio |
| US9800990B1 (en) * | 2016-06-10 | 2017-10-24 | C Matter Limited | Selecting a location to localize binaural sound |
| KR101892028B1 (ko) * | 2016-10-26 | 2018-08-27 | 현대자동차주식회사 | Method for providing sound tracking information, sound tracking apparatus for a vehicle, and vehicle including the same |
| US10067737B1 (en) * | 2017-08-30 | 2018-09-04 | Daqri, Llc | Smart audio augmented reality system |
| US11625222B2 (en) * | 2019-05-07 | 2023-04-11 | Apple Inc. | Augmenting control sound with spatial audio cues |
| CN111432305A (zh) * | 2020-03-27 | 2020-07-17 | 歌尔科技有限公司 | Earphone alarm method and apparatus, and wireless earphone |
| CN111398965A (zh) * | 2020-04-09 | 2020-07-10 | 电子科技大学 | Danger signal monitoring method and system based on a smart wearable device, and wearable device |
| CN111818441B (zh) * | 2020-07-07 | 2022-01-11 | Oppo(重庆)智能科技有限公司 | Sound effect implementation method and apparatus, storage medium, and electronic device |
| US11467666B2 (en) * | 2020-09-22 | 2022-10-11 | Bose Corporation | Hearing augmentation and wearable system with localized feedback |
2021
- 2021-10-26 CN CN202111248720.6A patent/CN114189790B/zh active Active
2022
- 2022-09-01 EP EP22885410.5A patent/EP4354900B1/en active Active
- 2022-09-01 WO PCT/CN2022/116528 patent/WO2023071519A1/zh not_active Ceased
- 2022-09-01 US US18/291,854 patent/US20240411507A1/en active Pending
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4354900A4 |
Also Published As
| Publication number | Publication date |
|---|---|
| CN114189790A (zh) | 2022-03-15 |
| US20240411507A1 (en) | 2024-12-12 |
| EP4354900A4 (en) | 2024-10-30 |
| EP4354900A1 (en) | 2024-04-17 |
| EP4354900B1 (en) | 2025-08-06 |
| CN114189790B (zh) | 2022-11-29 |
Similar Documents
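| Publication | Publication Date | Title |
|---|---|---|
| CN114189790B (zh) | | Audio information processing method, electronic device, system, product, and medium |
| EP3547712B1 (en) | | Method for processing signals, terminal device, and non-transitory readable storage medium |
| US10817251B2 (en) | | Dynamic capability demonstration in wearable audio device |
| CN108538320B (zh) | | Recording control method and apparatus, readable storage medium, and terminal |
| CN108600885B (zh) | | Sound signal processing method and related products |
| CN108521621B (zh) | | Signal processing method and apparatus, terminal, earphone, and readable storage medium |
| CN111696570B (zh) | | Voice signal processing method, apparatus, device, and storage medium |
| WO2014161309A1 (zh) | | Method and apparatus for implementing sound source localization on a mobile terminal |
| WO2018045536A1 (zh) | | Sound signal processing method, terminal, and earphone |
| WO2020020375A1 (zh) | | Voice processing method and apparatus, electronic device, and readable storage medium |
| CN115775564B (zh) | | Audio processing method and apparatus, storage medium, and smart glasses |
| CN111341307A (zh) | | Speech recognition method and apparatus, electronic device, and storage medium |
| CN107863110A (zh) | | Safety reminder method based on a smart earphone, smart earphone, and storage medium |
| US12411653B2 (en) | | Method and electronic device for detecting ambient audio signal |
| WO2021244056A1 (zh) | | Data processing method and apparatus, and readable medium |
| US20260089447A1 (en) | | Smart glasses for hearing assistance, hearing assistance method, and auxiliary system |
| CN108962241A (zh) | | Position prompting method and apparatus, storage medium, and electronic device |
| CN117133282B (zh) | | Voice interaction method and electronic device |
| CN114360546B (zh) | | Electronic device and wake-up method thereof |
| CN115379433B (zh) | | Bluetooth device pairing method and apparatus |
| HK40072239A (zh) | | Audio information processing method, electronic device, system, product, and medium |
| HK40072239B (zh) | | Audio information processing method, electronic device, system, product, and medium |
| WO2019183904A1 (zh) | | Method for automatically identifying different human voices in audio |
| CN115166633B (zh) | | Sound source direction determination method and apparatus, terminal, and storage medium |
| CN116405593B (zh) | | Audio processing method and related apparatus |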
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22885410; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | WIPO information: entry into national phase | Ref document number: 2022885410; Country of ref document: EP. Ref document number: 22885410; Country of ref document: EP |
| | WWE | WIPO information: entry into national phase | Ref document number: 18291854; Country of ref document: US |
| | ENP | Entry into the national phase | Ref document number: 2022885410; Country of ref document: EP; Effective date: 20240108 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWG | WIPO information: grant in national office | Ref document number: 2022885410; Country of ref document: EP |
