WO2015107382A1 - Method and device for packet acoustic echo cancellation
- Publication number
- WO2015107382A1 (PCT/IB2014/003164)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- packet
- echo
- stream
- packet stream
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M9/00—Arrangements for interconnection not involving centralised switching
- H04M9/08—Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
- H04M9/082—Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic using echo cancellers
Definitions
- the present invention relates to the field of communication, and in particular to the technology of packet acoustic echo cancellation.
- media servers provide a variety of basic and enhanced services, such as conferencing, audio and video interactive voice response (IVR), transcoding, audio and video announcements, and other advanced speech services.
- Conference facility of the media server allows a number of participants to join a conference.
- the media server providing conferencing service is supporting a 3-party audio conference.
- A, B, C are the 3 participants.
- Transcoding may be applied to each conference participant by the media server to guarantee that the packet voice from each participant is transcoded to a single common codec type, so that the "mixing" function of the conference facility can later work correctly.
- the "mixing" function is used to form a single broadcasted stream from all the target streams (incoming streams, from the media server's perspective).
- the broadcasted stream is then fed into "broadcasting" function to form all the reference streams (outgoing streams, from media server perspective).
- An AEC allocated at party C can only cancel the echo in the target stream from C; it cannot cancel the echo in the reference stream to C (mate echo). Even a "Bidirectional PAEC" allocated at C's device cannot cancel the mate echo to C, because "Bidirectional PAEC" relies on streams in two directions, while both the original voice from A to C and the mate echo from B to C are present in the same reference flow to C.
- the second method relies on PAEC channels allocated on the conference facility. As shown in Figure 6, each target stream (incoming stream, from the server's perspective) is compared with its historical reference stream (outgoing stream, from the server's perspective) to detect acoustic echo. Target packets containing acoustic echo are cancelled or suppressed, so no acoustic echo is passed to the mixing function and then to the broadcasting function.
- the broadcasting function needs some intelligence to prevent backward broadcasting, so that voice packets from one party are not sent back to that party and local echo is avoided.
- An object of the invention is to provide a method and device for packet acoustic echo cancellation.
- a method for packet acoustic echo cancellation comprises the following steps:
- an echo cancelling device for packet acoustic echo cancellation comprising:
- obtaining apparatus for obtaining one or more source voice packet streams in a multi-party call, wherein acoustic echoes in the one or more source voice packet streams are to be cancelled and each source voice packet stream is from one participant in the multi-party call;
- selecting apparatus for selecting packets from the one or more source voice packet streams so as to obtain a broadcasted stream corresponding to the multi-party call;
- cancelling apparatus for cancelling echo in a target packet stream corresponding to the broadcasted stream based on a reference packet stream corresponding to the broadcasted stream, so as to obtain the echo-cancelled packet stream corresponding to the target packet stream;
- broadcasting apparatus for broadcasting the echo-cancelled packet stream to participants in the multi-party call.
- the present invention realizes packet acoustic echo cancellation for a multi-party call, by obtaining one or more source voice packet streams in a multi-party call, selecting packets from the one or more source voice packet streams so as to obtain a broadcasted stream corresponding to the multi-party call, cancelling echo in a target packet stream corresponding to the broadcasted stream based on a reference packet stream corresponding to the broadcasted stream, so as to obtain the echo-cancelled packet stream corresponding to the target packet stream, and finally broadcasting the echo-cancelled packet stream to participants in the multi-party call.
- the present invention reduces the buffer consumption, channel allocation and corresponding overheads of maintenance and signaling for packet acoustic echo cancellation, enhances the performance of packet acoustic echo cancellation, and makes the performance of packet acoustic echo cancellation independent of the number of participants.
- the present invention could determine parameter information of the storage space based on parameters of the multi-party call, in conjunction with packet voice lengths of the target packet stream and the reference packet stream, wherein the storage space is used for storing the target packet stream and the reference packet stream. Thereby, it avoids traversing throughout the whole storage space by utilizing the parameter information, and enhances the echo detection efficiency.
- the present invention could cancel echo in the target packet stream based on the reference packet stream corresponding to the broadcasted stream, in conjunction with source address information corresponding to each packet in a corresponding target packet stream of the broadcasted stream, so as to obtain the echo-cancelled packet stream corresponding to the target packet stream.
- it detects the state of the participant corresponding to the source address information based on the source address information, and if the terminal is provided with a packet acoustic echo cancellation device, a comparison computation for new packets from the participant may be saved, which enhances the performance of packet acoustic echo cancellation.
- the present invention could determine one or more target participants in the multi-party call based on source address information corresponding to each packet in the target packet stream, wherein the target participant is corresponding to the echo-cancelled packet stream, and broadcast the echo-cancelled packet stream to the one or more target participants. Thereby, it prevents backward broadcasting by determining the target participants before broadcasting, which may avoid local echo.
- Fig. 2 shows a diagram of forming a local echo according to one aspect of the present invention.
- Fig. 3 shows a diagram of forming a far echo according to one aspect of the present invention.
- Fig. 4 shows a diagram of forming a mate echo according to one aspect of the present invention.
- Fig. 5 shows a diagram of a terminal-based multi-party packet acoustic echo cancellation according to the prior art.
- Fig. 6 shows a diagram of a server-based multi-party packet acoustic echo cancellation according to the prior art.
- Fig. 7 shows a diagram of a server-based multi-party packet acoustic echo cancellation independent of the number of conference participants according to one aspect of the present invention.
- Fig. 8 shows a schematic diagram of an echo cancelling device for packet acoustic echo cancellation according to one aspect of the present invention.
- Fig. 9 shows a schematic diagram of an echo cancelling device for packet acoustic echo cancellation according to a preferred embodiment of the present invention.
- Fig. 10 shows a flow diagram of a method for packet acoustic echo cancellation according to another aspect of the present invention.
- Fig. 11 shows a flow diagram of a method for packet acoustic echo cancellation according to a preferred embodiment of the present invention.
- Fig. 12 shows a diagram of comparing packets for performing packet acoustic echo cancellation according to one aspect of the present invention.
- Fig. 8 shows a schematic diagram of an echo cancelling device for packet acoustic echo cancellation according to one aspect of the present invention; wherein the echo cancelling device comprises: an obtaining apparatus 1, a selecting apparatus 2, a cancelling apparatus 3, and a broadcasting apparatus 4.
- the obtaining apparatus 1 obtains one or more source voice packet streams in a multi-party call, wherein acoustic echoes in the one or more source voice packet streams are to be cancelled and each source voice packet stream is from one participant in the multi-party call;
- the selecting apparatus 2 selects packets from the one or more source voice packet streams so as to obtain a broadcasted stream corresponding to the multi-party call;
- the cancelling apparatus 3 cancels echo in a target packet stream corresponding to the broadcasted stream based on a reference packet stream corresponding to the broadcasted stream, so as to obtain the echo-cancelled packet stream corresponding to the target packet stream;
- the broadcasting apparatus 4 broadcasts the echo-cancelled packet stream to participants in the multi-party call.
- the echo cancelling device includes, but is not limited to, an electronic hardware device or software device that automatically performs numerical computation and information processing according to pre-set or pre-stored instructions, wherein the hardware device includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, etc.
- the echo cancelling device may be applied in any multi-party conference.
- the above apparatuses work continuously with one another: obtaining one or more source voice packet streams in a multi-party call, obtaining a broadcasted stream, obtaining the echo-cancelled packet stream, broadcasting the echo-cancelled packet stream, and the like are performed in real time or according to preset or real-time-adjusted working pattern requirements, until the echo cancelling device stops obtaining source voice packet streams in the multi-party call.
- the obtaining apparatus 1 obtains one or more source voice packet streams in a multi-party call, wherein acoustic echoes in the one or more source voice packet streams are to be cancelled and each source voice packet stream is from one participant in the multi-party call.
- Specifically, the obtaining apparatus 1 interacts with one or more participants in a multi-party call (e.g., in one conference) to obtain the source voice packet streams, in which packet acoustic echo is to be cancelled, sent from each participant to the broadcast end; each source voice packet stream comprises one or more packets, some of which may be echo packets.
- the obtaining apparatus 1 may obtain one or more source voice packet streams in the multi-party call; for example, if the multi-party call includes participant A, participant B, and participant C, then the obtaining apparatus 1 may obtain a source voice packet stream from the participant A to the broadcast end, a source voice packet stream from the participant B to the broadcast end, and a source voice packet stream from the participant C to the broadcast end.
- the selecting apparatus 2 selects packets from the one or more source voice packet streams so as to obtain a broadcasted stream corresponding to the multi-party call.
- the selecting apparatus 2 selects packets from the one or more source voice packet streams, based on voice intensity information of the packets in the source voice packet streams or continuity information (e.g., the person talking previously has a higher priority to be selected continuously) of the packets in the source voice packet streams, and generates a broadcasted stream corresponding to the multi-party call based on the information of the selected packets, so as to guarantee that no mixing will occur.
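The selection step can be sketched as follows. This is a minimal illustration, not the patented implementation: the `VoicePacket` fields, the additive `continuity_bonus`, and its default value are assumptions introduced here to show how voice intensity and talker continuity might be combined so that exactly one packet is chosen per time slot and no mixing is needed.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class VoicePacket:
    source: str       # participant identifier (source address)
    seq: int          # position in time sequence
    intensity: float  # hypothetical voice-intensity measure


def select_packet(candidates: Sequence[VoicePacket],
                  previous_talker: Optional[str],
                  continuity_bonus: float = 3.0) -> Optional[VoicePacket]:
    """Pick one packet for a time slot; the previous talker gets a bonus."""
    if not candidates:
        return None

    def score(packet: VoicePacket) -> float:
        bonus = continuity_bonus if packet.source == previous_talker else 0.0
        return packet.intensity + bonus

    return max(candidates, key=score)


def build_broadcasted_stream(
        slots: Sequence[Sequence[VoicePacket]]) -> List[VoicePacket]:
    """Form a single broadcasted stream by selecting one packet per slot."""
    stream: List[VoicePacket] = []
    previous: Optional[str] = None
    for candidates in slots:
        chosen = select_packet(candidates, previous)
        if chosen is not None:
            stream.append(chosen)
            previous = chosen.source
    return stream
```

With this scoring, a participant who is already talking keeps the floor until another participant is clearly louder, which is one possible reading of the continuity priority described above.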
- the cancelling apparatus 3 cancels echo in a target packet stream corresponding to the broadcasted stream based on a reference packet stream corresponding to the broadcasted stream, so as to obtain the echo-cancelled packet stream corresponding to the target packet stream.
- the cancelling apparatus 3 regards, based on the broadcasted stream, one or more new packets in the broadcasted stream as the target packet stream corresponding to the broadcasted stream, wherein the new packet might contain an echo, and regards one or more historical packets in the same broadcasted stream as the reference packet stream corresponding to the broadcasted stream, wherein the historical packets are earlier than the packets of the new packets in time sequence, and in special cases, the historical packets may contain echo.
- the cancelling apparatus 3 compares, in a buffer or other means capable of high-speed read/write, one or more packets in the reference packet stream with one or more packets in the target packet stream; for example, it determines whether they are similar based on a packet acoustic echo cancellation algorithm (PAEC algorithm), thereby detecting whether the target packet stream includes echo packets. If one or more similar packets exist, then the target packet stream includes echo packet(s). The means capable of high-speed read/write may maintain a longer period of historical voice.
- the cancelling apparatus 3 cancels echo in the target packet stream by deleting the echo packet, by substituting the detected echo packet with a replacement packet, or in other manners. Specifically, the detected echo packet is substituted with a replacement packet to obtain an echo-cancelled packet stream corresponding to the target packet stream, wherein the replacement packet includes, but is not limited to, a noise packet (for example, a packet including a certain type of noise, such as white noise or comfort noise), a silent packet (for example, an empty packet), a 1/8 rate packet as last buffered in the target packet stream, and the like, or a combination thereof.
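The two cancellation manners named above (deleting the echo packet, or substituting it with a replacement packet) can be sketched as below. The function name, the `mode` strings, and the use of an empty payload as the silent packet are assumptions for illustration; the echo flags are taken as given, since detection is a separate step.

```python
from typing import List, Sequence

# Assumption: an empty payload stands in for a silent (empty) packet.
SILENT_PACKET = b""


def apply_cancellation(packets: Sequence[bytes],
                       echo_flags: Sequence[bool],
                       mode: str = "substitute",
                       replacement: bytes = SILENT_PACKET) -> List[bytes]:
    """Return the echo-cancelled stream for already-flagged echo packets."""
    if len(packets) != len(echo_flags):
        raise ValueError("packets and echo_flags must have the same length")
    if mode not in ("delete", "substitute"):
        raise ValueError("mode must be 'delete' or 'substitute'")

    cleaned: List[bytes] = []
    for payload, is_echo in zip(packets, echo_flags):
        if not is_echo:
            cleaned.append(payload)      # keep non-echo packets untouched
        elif mode == "substitute":
            cleaned.append(replacement)  # e.g. silent or noise packet
        # mode == "delete": the echo packet is simply dropped
    return cleaned
```

Substitution keeps the packet count, and therefore the timing of the stream, unchanged, whereas deletion shortens the stream; which is preferable depends on how the receiving side handles gaps.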
- the cancelling apparatus 3 may also regard the echo-cancelled packet stream as its corresponding reference packet stream of the same broadcasted stream.
- the reference packet stream does not include any echo.
- the above method may identify whether the new packet is similar to the historical packet, thereby the far echo and the mate echo may be cancelled.
- the cancelling apparatus 3 may cancel echo in the target packet stream based on the reference packet stream corresponding to the broadcasted stream, in conjunction with source address information corresponding to each packet in a corresponding target packet stream of the broadcasted stream, so as to obtain the echo-cancelled packet stream corresponding to the target packet stream.
- the cancelling apparatus 3 regards, based on the broadcasted stream, one or more new packets in the broadcasted stream as the target packet stream corresponding to the broadcasted stream, wherein the new packet might contain an echo, and regards one or more historical packets in the same broadcasted stream as the reference packet stream corresponding to the broadcasted stream, wherein the historical packets are earlier than the packets of the new packets in time sequence, and in special cases, the historical packets may contain echo.
- the cancelling apparatus 3 determines, based on the source address information in header information of each packet in the target packet stream, the participant corresponding to the packet.
- the cancelling apparatus 3 interacts with the participant or with another third-party device to determine whether the participant is provided with a packet acoustic echo cancellation (PAEC) apparatus. If the participant is provided with the PAEC apparatus, the echo in the packets from the participant has already been cancelled before the packets are sent to the echo cancelling device, so the cancelling apparatus 3 does not perform echo detection on the packets corresponding to that participant, thereby saving the comparison computation for new packets from the participant, reducing the computational complexity, and enhancing the performance of packet acoustic echo cancellation.
- the cancelling apparatus 3 need not determine the source address information of the packets in the reference packet stream, because the far echo or mate echo comes from another communication terminal, not from the communication terminal corresponding to the original packet.
- the cancelling apparatus 3 compares the packets in the target packet stream with the ones in the corresponding reference packet stream to detect whether the target packet stream contains any echo packet; if it does, the cancelling apparatus 3 cancels echo in the target packet stream by deleting the echo packet or by substituting the detected echo packet with a replacement packet.
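The source-address check described above can be sketched as follows. This is only an illustration under assumptions: packets are modelled as `(source, payload)` tuples, the set `paec_equipped` is presumed to have been obtained beforehand, and `is_echo` is a caller-supplied placeholder for the PAEC comparison.

```python
from typing import Callable, List, Sequence, Set, Tuple

Packet = Tuple[str, bytes]  # (source address, payload); hypothetical layout


def cancel_with_source_check(
        target: Sequence[Packet],
        reference_payloads: Sequence[bytes],
        paec_equipped: Set[str],
        is_echo: Callable[[bytes, Sequence[bytes]], bool],
        replacement: bytes = b"") -> Tuple[List[Packet], int]:
    """Cancel echo, skipping detection for PAEC-equipped participants.

    Returns the echo-cancelled packets and the number of echo
    detections that were actually performed.
    """
    cleaned: List[Packet] = []
    detections = 0
    for source, payload in target:
        if source in paec_equipped:
            # Echo was already cancelled at the terminal: pass through.
            cleaned.append((source, payload))
            continue
        detections += 1
        if is_echo(payload, reference_payloads):
            cleaned.append((source, replacement))
        else:
            cleaned.append((source, payload))
    return cleaned, detections
```

The returned detection count makes the saving visible: every packet from a PAEC-equipped participant is one comparison pass that is not performed.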
- the echo cancelling device further comprises a parameter determining apparatus (not shown), wherein the parameter determining apparatus determines parameter information of the storage space for storing the target packet stream and the reference packet stream based on parameters of the multi-party call, in conjunction with packet voice lengths of the target packet stream and the reference packet stream.
- the storage space includes, but is not limited to, a buffer or other means capable of high-speed read/write, and the parameter information of the storage space may be determined based on information such as the call parameters of the multi-party call (e.g., echo delay information of each participant of the conference) and the voice packet lengths (e.g., each packet has a duration of 20 ms) of the target packet stream and the reference packet stream.
- the parameter information of the storage space includes, but is not limited to, the size of the storage space (e.g., buffer size) and/or the window size of the reference packet stream.
- the design of the storage space for the broadcasted stream (taking the buffer size of the reference packet stream as an example) mainly involves three parameters:
- the first is delay_th (in milliseconds), i.e., the maximum echo delay the conference facility can cancel; this is usually an engineering decision, but it should be greater than most echo delays.
- the second is delay_min (in milliseconds), i.e., the minimum echo delay of all the participants of one audio conference.
- the third is delay_max (in milliseconds), i.e., the maximum echo delay of all the participants of one audio conference.
- delay_min and delay_max are dynamic per-conference-call parameters, which can be derived from statistical data; delay_min can be used to estimate the interval between the target packet window and the reference packet window, i.e., the offset of the reference packet window.
- delay_max and delay_min, i.e., the maximum delay and the minimum delay of the multi-party call as computed by the parameter determining apparatus based on the echo delay information of each participant of the conference, may be used to determine the size of the reference packet window, so as to adjust the size of the reference stream, avoid unnecessary comparisons, and enhance the echo detection efficiency.
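One way the three delay parameters above might be turned into buffer and window sizes is sketched below. The text does not give exact formulas, so the rounding rules and the names `delay_th_ms`, `delay_min_ms`, `delay_max_ms`, and `WindowParams` are assumptions made for illustration; the 20 ms packet duration is the example value mentioned in the description.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowParams:
    buffer_packets: int  # history needed to cover the maximum cancellable delay
    offset_packets: int  # gap between target window and reference window
    window_packets: int  # number of reference packets worth comparing


def window_params(delay_th_ms: int, delay_min_ms: int, delay_max_ms: int,
                  packet_ms: int = 20) -> WindowParams:
    """Derive storage parameters from echo delays (assumed formulas)."""
    if packet_ms <= 0:
        raise ValueError("packet_ms must be positive")
    if not 0 <= delay_min_ms <= delay_max_ms:
        raise ValueError("need 0 <= delay_min_ms <= delay_max_ms")

    # Delays beyond the engineering threshold cannot be cancelled anyway.
    capped_max = min(delay_max_ms, delay_th_ms)
    capped_min = min(delay_min_ms, capped_max)

    buffer_packets = math.ceil(delay_th_ms / packet_ms)
    offset_packets = capped_min // packet_ms
    window_packets = math.ceil(capped_max / packet_ms) - offset_packets + 1
    return WindowParams(buffer_packets, offset_packets, window_packets)
```

For example, with a 500 ms threshold and measured echo delays between 100 ms and 300 ms, only the 11 packets that lie 5 to 15 packets back need to be compared, rather than all 25 buffered packets.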
- the broadcasting apparatus 4 broadcasts the echo-cancelled packet stream to participants in the multi-party call.
- the broadcasting apparatus 4 sends the echo-cancelled packet stream obtained by the cancelling apparatus 3 to one or more participants in the multi-party call; the echo-cancelled packet stream may be broadcasted to all participants in the multi-party call or to only some of them.
- the broadcasting apparatus 4 determines one or more target participants in the multi-party call corresponding to the echo-cancelled packet stream based on source address information corresponding to each packet in the target packet stream; and broadcasts the echo-cancelled packet stream to the one or more target participants.
- the broadcasting apparatus 4 determines, based on source address information in the header information of each packet in the echo-cancelled packet stream, one or more participants except the participant corresponding to the source address information in the multi-party call as the target participant; finally, the broadcasting apparatus 4 broadcasts the echo-cancelled packet stream to the one or more target participants.
- the broadcasting apparatus 4 determines the participant(s) corresponding to the packet based on the source address information of the packet so as to avoid sending the packet to its corresponding participant, i.e., if the source address information of the packet corresponds to participant A, the packet will not be sent to A anymore when broadcasting, thereby realizing intelligent broadcast and preventing backward broadcasting to avoid local echo.
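The backward-broadcast prevention described above can be sketched as follows. The `(source, payload)` packet layout and the `send` callback are assumptions for illustration; a real server would send over its transport layer instead.

```python
from typing import Callable, List, Sequence, Tuple

Packet = Tuple[str, bytes]  # (source address, payload); hypothetical layout


def target_participants(participants: Sequence[str], source: str) -> List[str]:
    """Everyone in the call except the participant the packet came from."""
    return [p for p in participants if p != source]


def broadcast(packets: Sequence[Packet],
              participants: Sequence[str],
              send: Callable[[str, bytes], None]) -> None:
    """Send each echo-cancelled packet to all participants but its source."""
    for source, payload in packets:
        for participant in target_participants(participants, source):
            send(participant, payload)
```

Because a packet is never returned to its own sender, a participant cannot hear their own voice coming back from the server, which is the local-echo case.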
- Fig. 7 shows a diagram of a server-based multi-party packet acoustic echo cancellation independent of the number of conference participants according to one aspect of the present invention.
- the target stream sent from each participant forms a single broadcasted stream by selection; the single broadcasted stream simultaneously includes target packets (selected new packets from each participant) and reference packets (according to Fig. 7, it is historical packets without any echo packet, because the echo packet has been cancelled during the previous AEC process).
- the target packets and the reference packets are compared according to the PAEC (Packet Acoustic Echo Cancellation) algorithm so as to remove the echo packets; the echo-cancelled packets are then broadcasted to the conference participants. The general problem of detecting an echo in the packet domain is thereby transformed into comparing new packets with historical packets in the broadcasted stream. This effectively cancels the packet echo in a packet-based audio conference independently of the number of conference participants. For a conference with N participants, only one PAEC resource (e.g., channel) is needed, which saves N-1 channels.
- Fig. 9 shows a schematic diagram of an echo cancelling device for packet acoustic echo cancellation according to a preferred embodiment of the present invention; wherein the echo cancelling device comprises: an obtaining apparatus 1', a selecting apparatus 2', a cancelling apparatus 3', and a broadcasting apparatus 4'; the cancelling apparatus 3' comprises an updating unit 31' and a cancelling unit 32'.
- the obtaining apparatus 1' obtains one or more source voice packet streams in a multi-party call, wherein acoustic echoes in the one or more source voice packet streams are to be cancelled and each source voice packet stream is from one participant in the multi-party call; the selecting apparatus 2' selects packets from the one or more source voice packet streams so as to obtain a broadcasted stream corresponding to the multi-party call; the updating unit 31' updates the target packet stream and the reference packet stream corresponding to the multi-party call based on the broadcasted stream; the cancelling unit 32' cancels echo in the target packet stream based on the reference packet stream, so as to obtain the echo-cancelled packet stream corresponding to the target packet stream; the broadcasting apparatus 4' broadcasts the echo-cancelled packet stream to participants in the multi-party call.
- the obtaining apparatus 1', the selecting apparatus 2', and the broadcasting apparatus 4' comprised in the echo cancelling device are identical or substantially identical to the corresponding apparatuses shown in Fig. 8; they are thus not detailed here, but are incorporated here by reference.
- the above apparatuses work continuously with one another: obtaining one or more source voice packet streams in a multi-party call, obtaining a broadcasted stream, updating the target packet stream and the reference packet stream, obtaining the echo-cancelled packet stream, broadcasting the echo-cancelled packet stream, and the like are performed in real time or according to preset or real-time-adjusted working pattern requirements, until the echo cancelling device stops obtaining source voice packet streams in the multi-party call.
- the updating unit 31' updates the target packet stream and the reference packet stream corresponding to the multi-party call based on the broadcasted stream.
- the updating unit 31' regards, based on the broadcasted stream, one or more new packets in the broadcasted stream as the target packet stream corresponding to the broadcasted stream, wherein the new packets might contain an echo, and regards one or more historical packets in the same broadcasted stream as the reference packet stream corresponding to the broadcasted stream, wherein the historical packets are earlier than the new packets in time sequence, and the historical packets may contain echo.
- the updating unit 31' may also regard, based on the broadcasted stream, the part of the same broadcasted stream which has already been echo-cancelled (i.e., the echo-cancelled packet stream earlier than the current new packets in time sequence) as the reference packet stream corresponding to the target packet stream.
- the reference packet stream does not include any echo.
- the updating unit 31' could update the target packet stream corresponding to the multi-party call based on the broadcasted stream, and determine the reference packet stream corresponding to the target packet stream based on the broadcasted stream and the target packet stream.
- the updating unit 31' regards, based on the broadcasted stream, one or more new packets in the broadcasted stream as the target packet stream corresponding to the broadcasted stream, wherein the new packets might contain an echo, and regards one or more historical packets in the same broadcasted stream corresponding to the target packet stream as the reference packet stream corresponding to the broadcasted stream, wherein the historical packets are earlier than the new packets in time sequence, and the historical packets may contain echo.
- the echo cancelling device further comprises a reference updating apparatus (not shown), wherein, the reference updating apparatus updates the reference packet stream based on the echo-cancelled packet stream.
- the reference updating apparatus obtains the echo-cancelled packet stream by interacting with the cancelling apparatus 3', and then updates the reference packet stream based on the echo-cancelled packet stream.
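The reference update can be sketched as a bounded history buffer that is fed with echo-cancelled packets. The class and method names are assumptions for illustration; `collections.deque` with `maxlen` is used here only as a convenient way to drop the oldest packets once the buffer is full.

```python
from collections import deque
from typing import Deque, Iterable, List


class ReferenceBuffer:
    """Bounded history of echo-cancelled packets used as the reference."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._packets: Deque[bytes] = deque(maxlen=capacity)

    def update(self, echo_cancelled: Iterable[bytes]) -> None:
        """Append newly echo-cancelled packets; oldest ones fall out."""
        self._packets.extend(echo_cancelled)

    def snapshot(self) -> List[bytes]:
        """Current reference packets, oldest first."""
        return list(self._packets)
```

Feeding only echo-cancelled packets back into the buffer is what keeps the reference free of echo, so a later packet is compared against original speech rather than against an earlier echo.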
- the cancelling unit 32' cancels echo in the target packet stream based on the reference packet stream, so as to obtain the echo-cancelled packet stream corresponding to the target packet stream.
- the cancelling unit 32' cancels echo in the target packet stream by comparing the reference packet stream with the target packet stream, for example by computing according to the PAEC algorithm, so as to obtain the echo-cancelled packet stream corresponding to the target packet stream.
- Fig. 12 shows a diagram of comparing packets for performing packet acoustic echo cancellation according to one aspect of the present invention.
- when a new packet (which may contain an acoustic echo) is added to the broadcasted stream, the new packet may act as a target packet to be compared with historical packets, which act as the reference packet stream, so as to detect the acoustic echo. As shown in the figure, a packet set of the target packet stream (packet j to packet j+M, i.e., new packets) is compared with set 1 (packet q to packet q+M, i.e., historical packets), set 2 (packet q+1 to packet q+M+1, i.e., historical packets), ..., and set Q (packet q+Q-1 to packet q+Q+M-1, i.e., historical packets), respectively, to determine whether the target packet stream includes any echo packet.
- Equations 1 to 3 show the PAEC algorithm for cancelling packet acoustic echo.
- equation 1 is an equation for computing a distance vector
- equation 2 is an equation for computing a distance value between each new packet and each historical packet.
- M+1 is the new packet window size.
- M+Q is the corresponding history packet window size, g j q represents the comparison result of
- P denotes the value of LSP (Line Spectral Pair).
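The sliding comparison of a new packet window against the historical packet window can be sketched as below. Equations 1 to 3 are not reproduced in this text, so the Euclidean distance between LSP vectors, the averaging over the window, and the threshold test are stand-in assumptions rather than the patent's formulas; only the window geometry (a target set of M+1 packets compared with Q historical sets drawn from M+Q packets) follows the description.

```python
from typing import List, Sequence, Tuple

LspVector = Sequence[float]  # LSP values of one packet (illustrative)


def packet_distance(a: LspVector, b: LspVector) -> float:
    """Assumed distance between two packets: Euclidean over LSP values."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def window_distances(target: Sequence[LspVector],
                     history: Sequence[LspVector]) -> List[float]:
    """Mean packet distance between the target set and each history set.

    len(target) is M+1 and len(history) is M+Q, giving Q history sets.
    """
    width = len(target)
    n_sets = len(history) - width + 1
    if width == 0 or n_sets < 1:
        raise ValueError("history must be at least as long as a non-empty target")

    distances: List[float] = []
    for q in range(n_sets):
        window = history[q:q + width]
        total = sum(packet_distance(t, h) for t, h in zip(target, window))
        distances.append(total / width)
    return distances


def detect_echo(target: Sequence[LspVector],
                history: Sequence[LspVector],
                threshold: float) -> Tuple[bool, int]:
    """Return (echo found, index of the closest history set)."""
    distances = window_distances(target, history)
    best = min(range(len(distances)), key=distances.__getitem__)
    return distances[best] < threshold, best
```

The index of the closest history set also indicates how far back the matching speech lies, which could serve as a rough estimate of the echo delay.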
- the present invention realizes a method for packet acoustic echo cancellation, which has the following advantages over the prior art:
- the prior art needs to buffer a reference stream for each participant, so an N-party audio conference needs N sets of reference buffers, such that when a new packet (which might contain an echo) occurs in a target packet stream, it can be compared with the buffered reference packets to detect acoustic echo; with the present invention, by contrast, only one (broadcasted) stream is buffered for PAEC comparison, no matter how large N is.
- the present invention can greatly reduce the buffer consumption as well as the corresponding maintenance cost for PAEC.
- Fig. 10 shows a flow diagram of a method for packet acoustic echo cancellation according to another aspect of the present invention.
- in the step s1, the echo cancelling device obtains one or more source voice packet streams in a multi-party call, wherein acoustic echoes in the one or more source voice packet streams are to be cancelled and each source voice packet stream is from one participant in the multi-party call; in the step s2, the echo cancelling device selects packets from the one or more source voice packet streams so as to obtain a broadcasted stream corresponding to the multi-party call; in the step s3, the echo cancelling device cancels echo in a target packet stream corresponding to the broadcasted stream based on a reference packet stream corresponding to the broadcasted stream, so as to obtain the echo-cancelled packet stream corresponding to the target packet stream; in the step s4, the echo cancelling device broadcasts the echo-cancelled packet stream to participants in the multi-party call.
- the above steps are carried out continuously: obtaining one or more source voice packet streams in a multi-party call, obtaining a broadcasted stream, obtaining the echo-cancelled packet stream, broadcasting the echo-cancelled packet stream, and the like are performed in real time or according to preset or real-time-adjusted working pattern requirements, until the echo cancelling device stops obtaining source voice packet streams in the multi-party call.
- the echo cancelling device obtains one or more source voice packet streams in a multi-party call, wherein acoustic echoes in the one or more source voice packet streams are to be cancelled and each source voice packet stream is from one participant in the multi-party call.
- the echo cancelling device interacts with one or more participants in a multi-party call (e.g., in one conference) to obtain the source voice packet streams, sent from the participants to the broadcast end, whose packet acoustic echo is to be cancelled, wherein each source voice packet stream comprises one or more packets, and any packet of a source voice packet stream may be an echo packet.
- the echo cancelling device may obtain one or more source voice packet streams in the multi-party call; for example, if the multi-party call includes participant A, participant B, and participant C, then in the step s1, the echo cancelling device may obtain a source voice packet stream from the participant A to the broadcast end, a source voice packet stream from the participant B to the broadcast end, and a source voice packet stream from the participant C to the broadcast end.
- the echo cancelling device selects packets from the one or more source voice packet streams so as to obtain a broadcasted stream corresponding to the multi-party call.
- the echo cancelling device selects packets from the one or more source voice packet streams, based on voice intensity information of the packets in the source voice packet streams or continuity information (e.g., the person talking previously has a higher priority to be selected continuously) of the packets in the source voice packet streams, and generates a broadcasted stream corresponding to the multi-party call based on the information of the selected packets, so as to guarantee that no mixing will occur.
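The selection step can be illustrated with a minimal Python sketch. It is not the patent's implementation: the `Packet` fields, the `continuity_bonus` weighting, and the per-frame candidate lists are all assumptions made for illustration. It only shows how voice intensity and continuity could be combined so that exactly one packet per frame is forwarded and no mixing occurs.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Packet:
    source: str       # hypothetical participant id (stands in for the source address)
    intensity: float  # hypothetical per-packet voice intensity measure
    payload: bytes = b""


def select_packet(candidates: Sequence[Packet],
                  previous_source: Optional[str],
                  continuity_bonus: float = 3.0) -> Optional[Packet]:
    """Pick one packet per frame: loudest wins, with a bonus for the previous talker."""
    if not candidates:
        return None

    def score(p: Packet) -> float:
        # Assumption: continuity is modelled as a fixed additive bonus.
        return p.intensity + (continuity_bonus if p.source == previous_source else 0.0)

    return max(candidates, key=score)


def build_broadcast_stream(frames: Sequence[Sequence[Packet]]) -> List[Packet]:
    """Build a single broadcasted stream by selecting (never mixing) packets."""
    stream: List[Packet] = []
    previous: Optional[str] = None
    for candidates in frames:
        chosen = select_packet(candidates, previous)
        if chosen is not None:
            stream.append(chosen)
            previous = chosen.source
    return stream
```

With these assumed weights, a talker who is slightly quieter than a competitor keeps the floor, while a clearly louder competitor takes over.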
- the echo cancelling device cancels echo in a target packet stream corresponding to the broadcasted stream based on a reference packet stream corresponding to the broadcasted stream, so as to obtain the echo-cancelled packet stream corresponding to the target packet stream.
- the echo cancelling device regards, based on the broadcasted stream, one or more new packets in the broadcasted stream as the target packet stream corresponding to the broadcasted stream, wherein a new packet might contain an echo, and regards one or more historical packets in the same broadcasted stream as the reference packet stream corresponding to the broadcasted stream, wherein the historical packets precede the new packets in time sequence, and in special cases, the historical packets may contain echo.
- the echo cancelling device compares, in the buffer or in other means capable of high-speed read/write, one or more packets in the reference packet stream with one or more packets in the target packet stream; for example, it determines whether they are similar based on a packet acoustic echo cancellation algorithm (PAEC algorithm), thereby detecting whether the target packet stream includes echo packets. If one or more similar packets exist, the target packet stream includes echo packet(s). The means capable of high-speed read/write may maintain a longer period of historical voice.
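As a rough sketch of the comparison described above, the following Python snippet flags a new packet as an echo candidate when it is close to any historical packet. The Euclidean distance over LSP vectors and the `threshold` value are assumptions chosen for illustration; they stand in for the PAEC algorithm's own distance measure, which is not reproduced here.

```python
import math
from typing import List, Sequence


def lsp_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Hypothetical distance: Euclidean distance between two LSP vectors."""
    if len(a) != len(b):
        raise ValueError("LSP vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_echo_indices(target: Sequence[Sequence[float]],
                      reference: Sequence[Sequence[float]],
                      threshold: float = 0.05) -> List[int]:
    """Return indices of target packets that are similar to any reference packet.

    `threshold` is an assumed tuning value, not one taken from the source.
    """
    echoes: List[int] = []
    for j, new_lsp in enumerate(target):
        if any(lsp_distance(new_lsp, old_lsp) < threshold for old_lsp in reference):
            echoes.append(j)
    return echoes
```

Because both the target and the reference come from the same broadcasted stream, this comparison needs only one buffered stream regardless of the number of participants.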
- the echo cancelling device cancels echo in the target packet stream by deleting the echo packet, by substituting the detected echo packet with a replacement packet, or in other manners. Specifically, the detected echo packet is substituted with a replacement packet to obtain an echo-cancelled packet stream corresponding to the target packet stream, wherein the replacement packet includes, but is not limited to, a noise packet (for example, a packet including a certain type of noise, such as white noise or comfort noise), a silent packet (for example, an empty packet), a 1/8 rate packet as last buffered in the target packet stream, and the like, or a combination thereof.
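The two cancellation options (deletion and substitution) might look like the sketch below. Packets are modelled as raw `bytes`, and `SILENT_PACKET` is an assumed stand-in for a silent packet; a real system would use codec-specific noise or 1/8 rate frames instead.

```python
from typing import Iterable, List, Optional

SILENT_PACKET = b""  # assumption: an empty payload stands in for a silent packet


def cancel_echo(target: List[bytes],
                echo_indices: Iterable[int],
                replacement: Optional[bytes] = SILENT_PACKET) -> List[bytes]:
    """Return the echo-cancelled stream.

    If `replacement` is None the detected echo packets are deleted;
    otherwise each one is substituted with `replacement`.
    """
    echo = set(echo_indices)
    if replacement is None:
        return [p for i, p in enumerate(target) if i not in echo]
    return [replacement if i in echo else p for i, p in enumerate(target)]
```

Substitution keeps the packet count, and therefore the timing, of the stream unchanged, which is one plausible reason to prefer it over deletion.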
- the echo cancelling device may also regard the echo-cancelled packet stream as its corresponding reference packet stream of the same broadcasted stream.
- the reference packet stream does not include any echo.
- the above method may identify whether the new packet is similar to the historical packet, thereby the far echo and the mate echo may be cancelled.
- the echo cancelling device may cancel echo in the target packet stream based on the reference packet stream corresponding to the broadcasted stream, in conjunction with source address information corresponding to each packet in a corresponding target packet stream of the broadcasted stream, so as to obtain the echo-cancelled packet stream corresponding to the target packet stream.
- the echo cancelling device regards, based on the broadcasted stream, one or more new packets in the broadcasted stream as the target packet stream corresponding to the broadcasted stream, wherein a new packet might contain an echo, and regards one or more historical packets in the same broadcasted stream as the reference packet stream corresponding to the broadcasted stream, wherein the historical packets precede the new packets in time sequence, and in special cases, the historical packets may contain echo.
- the echo cancelling device determines, based on the source address information in header information of each packet in the target packet stream, the participant corresponding to the packet.
- the echo cancelling device interacts with the participant or with other third-party device to determine whether the participant is provided with a packet acoustic echo cancellation (PAEC) apparatus.
- the echo cancelling device need not determine the source address information of the packets in the reference packet stream, because the far echo or mate echo comes from another communication terminal, not from the communication terminal corresponding to the original packet.
- the echo cancelling device compares the packets in the target packet stream with the ones in the corresponding reference packet stream to detect whether the target packet stream contains any echo packet; if it does, then in the step s3, the echo cancelling device cancels echo in the target packet stream by deleting the echo packet or by substituting the detected echo packet with a substitution packet.
- the method further comprises a step s5 (not shown), wherein, in the step s5, the echo cancelling device determines parameter information of the storage space for storing the target packet stream and the reference packet stream based on parameters of the multi-party call, in conjunction with packet voice lengths of the target packet stream and the reference packet stream.
- the storage space includes, but is not limited to, a buffer or other means capable of high-speed read/write, and the parameter information of the storage space may be determined based on information such as the call parameters of the multi-party call (e.g., echo delay information of each participant of the conference) and the voice packet lengths (e.g., each packet duration is 20 ms) of the target packet stream and the reference packet stream.
- the parameter information of the storage space includes, but is not limited to, the size of the storage space (e.g., buffer size) and/or the window size of the reference packet stream.
- the design for the storage space of the broadcasted stream (taking the buffer size of the reference packet stream as an example) mainly comprises three parameters:
- the first is delay_th (in milliseconds), i.e., the maximum echo delay the conference facility can cancel; this is usually an engineering decision, but it should be greater than most echo delays.
- the second is delay_min (in milliseconds), i.e., the minimum echo delay of all the participants of one audio conference.
- the third is delay_max (in milliseconds), i.e., the maximum echo delay of all the participants of one audio conference.
- delay_max and delay_min, i.e., the maximum delay and the minimum delay of the multi-party call as computed by the echo cancelling device in the step s5 based on the echo delay information of each participant of the conference, may be used to determine the size of the reference packet window, so as to adjust the size of the reference stream, avoid unnecessary comparisons, and enhance the echo detection efficiency.
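To make the sizing concrete, here is a small Python sketch that turns the three delay parameters into a reference window expressed in packets. The clamping of the maximum delay to the cancellable limit and the rounding rules are assumptions; the source only states that the minimum and maximum delays may be used to determine the window size, with 20 ms given as an example packet duration.

```python
import math
from typing import Sequence, Tuple


def reference_window(echo_delays_ms: Sequence[float],
                     delay_th_ms: float,
                     packet_ms: float = 20.0) -> Tuple[int, int, int]:
    """Return (first_offset, last_offset, window_size) in packets.

    Offsets count how many packets back in history a comparison should
    start and stop. Assumption: delays above `delay_th_ms` are clamped,
    since the facility cannot cancel echoes beyond that limit.
    """
    delay_max = min(max(echo_delays_ms), delay_th_ms)
    delay_min = min(min(echo_delays_ms), delay_max)
    first = math.floor(delay_min / packet_ms)  # nearest history packet worth checking
    last = math.ceil(delay_max / packet_ms)    # farthest history packet worth checking
    return first, last, last - first + 1
```

For example, participant echo delays of 120, 300 and 450 ms with a 400 ms limit and 20 ms packets give a window covering history offsets 6 through 20, i.e. 15 packets instead of the 21 that a fixed 0 to 400 ms window would require.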
- the echo cancelling device broadcasts the echo-cancelled packet stream to participants in the multi -party call.
- the echo cancelling device sends the echo-cancelled packet stream obtained in the step s3 to one or more participants in the multi-party call; the echo-cancelled packet stream may be broadcast to all participants in the multi-party call, or to only a subset of the participants in the multi-party call.
- the echo cancelling device determines one or more target participants in the multi-party call corresponding to the echo-cancelled packet stream based on source address information corresponding to each packet in the target packet stream; and broadcasts the echo-cancelled packet stream to the one or more target participants.
- the echo cancelling device determines, based on source address information in the header information of each packet in the echo-cancelled packet stream, one or more participants except the participant corresponding to the source address information in the multi-party call as the target participant; finally, in the step s4, the echo cancelling device broadcasts the echo-cancelled packet stream to the one or more target participants.
- the echo cancelling device determines the participant(s) corresponding to the packet based on the source address information of the packet so as to avoid sending the packet back to its originating participant, i.e., if the source address information of the packet corresponds to participant A, the packet will not be sent to A when broadcasting, thereby realizing intelligent broadcast and preventing backward broadcasting, which avoids local echo.
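The source-aware broadcast described above amounts to excluding the originating participant from the recipient list. A minimal Python sketch follows; the participant identifiers and the `send` callback are hypothetical placeholders for whatever transport the conference facility uses.

```python
from typing import Callable, Iterable, List


def target_participants(participants: Iterable[str], source: str) -> List[str]:
    """Everyone in the call except the participant the packet came from."""
    return [p for p in participants if p != source]


def broadcast(packet: bytes,
              source: str,
              participants: Iterable[str],
              send: Callable[[str, bytes], None]) -> List[str]:
    """Send `packet` to all target participants and return who received it.

    `send` is an assumed transport hook, e.g. a function writing to a socket.
    """
    targets = target_participants(participants, source)
    for participant in targets:
        send(participant, packet)
    return targets
```

A packet that originated from participant A is therefore delivered to B and C only.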
- Fig. 11 shows a flow diagram of a method for packet acoustic echo cancellation according to a preferred embodiment of the present invention.
- the echo cancelling device obtains one or more source voice packet streams in a multi-party call, wherein acoustic echoes in the one or more source voice packet streams are to be cancelled and each source voice packet stream is from one participant in the multi-party call;
- the echo cancelling device selects packets from the one or more source voice packet streams so as to obtain a broadcasted stream corresponding to the multi-party call;
- the echo cancelling device updates the target packet stream and the reference packet stream corresponding to the multi-party call based on the broadcasted stream;
- the echo cancelling device cancels echo in the target packet stream based on the reference packet stream, so as to obtain the echo-cancelled packet stream corresponding to the target packet stream;
- the echo cancelling device broadcasts the echo-cancelled packet stream to participants in the multi-party call.
- the step s1', the step s2' and the step s4' of the method are identical or substantially identical to the corresponding steps shown in Fig. 10, and are thus not detailed here but incorporated by reference.
- the above steps (obtaining one or more source voice packet streams in a multi-party call, obtaining a broadcasted stream, updating the target packet stream and the reference packet stream, obtaining the echo-cancelled packet stream, broadcasting the echo-cancelled packet stream, and the like) are performed continuously in real time, or according to a preset or real-time adjusted working pattern, until the echo cancelling device stops obtaining source voice packet streams in the multi-party call.
- the echo cancelling device updates the target packet stream and the reference packet stream corresponding to the multi-party call based on the broadcasted stream.
- the echo cancelling device regards, based on the broadcasted stream, one or more new packets in the broadcasted stream as the target packet stream corresponding to the broadcasted stream, wherein a new packet might contain an echo, and regards one or more historical packets in the same broadcasted stream as the reference packet stream corresponding to the broadcasted stream, wherein the historical packets precede the new packets in time sequence, and the historical packets may contain echo.
- the echo cancelling device may also regard, based on the broadcasted stream, the portion of the same broadcasted stream that has already been echo cancelled (i.e., the echo-cancelled packet stream that precedes the current new packets in time sequence) as the reference packet stream corresponding to the target packet stream.
- the reference packet stream does not include any echo.
- the echo cancelling device could update the target packet stream corresponding to the multi-party call based on the broadcasted stream; and determine the reference packet stream corresponding to the target packet stream based on the broadcasted stream and the target packet stream.
- the echo cancelling device regards, based on the broadcasted stream, one or more new packets in the broadcasted stream as the target packet stream corresponding to the broadcasted stream, wherein a new packet might contain an echo, and regards one or more historical packets in the same broadcasted stream corresponding to the target packet stream as the reference packet stream corresponding to the broadcasted stream, wherein the historical packets precede the new packets in time sequence, and the historical packets may contain echo.
- the method further comprises a step s6' (not shown), wherein, in the step s6', the echo cancelling device updates the reference packet stream based on the echo-cancelled packet stream.
- the echo cancelling device obtains the echo-cancelled packet stream by interacting with the step s3', and then updates the reference packet stream based on the echo-cancelled packet stream.
- using the echo-cancelled packet stream as the reference packet stream for comparison with the target packet stream gives a better reference, thereby further enhancing the PAEC accuracy.
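One way to picture the reference update is a single bounded buffer that is refreshed with echo-cancelled packets after each cancellation pass. The sketch below uses Python's `collections.deque` with a fixed `maxlen`; the class name and the idea of sizing it in packets are assumptions for illustration only.

```python
from collections import deque
from typing import Deque, Iterable, List


class ReferenceBuffer:
    """Single shared history of echo-cancelled packets for one broadcasted stream."""

    def __init__(self, window_packets: int) -> None:
        # Oldest packets fall out automatically once the window is full.
        self._packets: Deque[bytes] = deque(maxlen=window_packets)

    def update(self, echo_cancelled_packets: Iterable[bytes]) -> None:
        """Append freshly echo-cancelled packets to the history."""
        self._packets.extend(echo_cancelled_packets)

    def snapshot(self) -> List[bytes]:
        """Return the current reference packets, oldest first."""
        return list(self._packets)
```

Only one such buffer is needed per conference, independent of the number of participants, because the reference is taken from the broadcasted stream itself.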
- the echo cancelling device cancels echo in the target packet stream based on the reference packet stream, so as to obtain the echo-cancelled packet stream corresponding to the target packet stream.
- the echo cancelling device cancels echo in the target packet stream by comparing the reference packet stream with the target packet stream, for example by computing according to the PAEC algorithm, so as to obtain the echo-cancelled packet stream corresponding to the target packet stream.
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Telephonic Communication Services (AREA)
Abstract
The invention relates to a method and device for packet acoustic echo cancellation. The echo cancelling device obtains one or more source voice packet streams in a multi-party call, selects packets from the one or more source voice packet streams so as to obtain a broadcasted stream corresponding to the multi-party call, cancels echo in a target packet stream corresponding to the broadcasted stream based on a reference packet stream corresponding to the broadcasted stream, so as to obtain the echo-cancelled packet stream corresponding to the target packet stream, and broadcasts the echo-cancelled packet stream to participants in the multi-party call. Compared with the prior art, the present invention implements packet acoustic echo cancellation for a multi-party call, reduces the buffer consumption, channel allocation, and corresponding maintenance and signalling overhead of packet acoustic echo cancellation, improves packet acoustic echo cancellation performance, and makes that performance independent of the number of participants.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201410005371.9A CN104767895B (zh) | 2014-01-06 | 2014-01-06 | 一种用于分组声学回声消除的方法与设备 |
| CN201410005371.9 | 2014-01-06 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2015107382A1 (fr) | 2015-07-23 |
Family
ID=53175542
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/IB2014/003164 Ceased WO2015107382A1 (fr) | 2014-01-06 | 2014-12-29 | Procédé et dispositif de compensation d'écho acoustique par paquets |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN104767895B (fr) |
| WO (1) | WO2015107382A1 (fr) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2017111319A1 (fr) * | 2015-12-24 | 2017-06-29 | 삼성전자 주식회사 | Dispositif électronique et procédé de commande de fonctionnement de dispositif électronique |
| WO2020242670A1 (fr) * | 2019-05-31 | 2020-12-03 | Microsoft Technology Licensing, Llc | Agrégation de bouclage matériel |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108235052A (zh) * | 2018-01-09 | 2018-06-29 | 安徽小马创意科技股份有限公司 | 基于ios可选择多音频通道硬件混音、采集及播放的方法 |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6320958B1 (en) * | 1996-11-26 | 2001-11-20 | Nec Corporation | Remote conference system using multicast transmission for performing echo cancellation |
| US20090168673A1 (en) * | 2007-12-31 | 2009-07-02 | Lampros Kalampoukas | Method and apparatus for detecting and suppressing echo in packet networks |
| US20110235499A1 (en) * | 2010-02-25 | 2011-09-29 | Rajasekar Badri N | System and method for echo suppression in web browser-based communication |
| US20130155924A1 (en) * | 2011-12-15 | 2013-06-20 | Tellabs Operations, Inc. | Coded-domain echo control |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN100492494C (zh) * | 2005-12-08 | 2009-05-27 | 华为技术有限公司 | 一种对分组语音进行回声抑制的系统和方法 |
| CN103152500B (zh) * | 2013-02-21 | 2015-06-24 | 黄文明 | 多方通话中回音消除方法 |
2014
- 2014-01-06 CN CN201410005371.9A patent/CN104767895B/zh not_active Expired - Fee Related
- 2014-12-29 WO PCT/IB2014/003164 patent/WO2015107382A1/fr not_active Ceased
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2017111319A1 (fr) * | 2015-12-24 | 2017-06-29 | 삼성전자 주식회사 | Dispositif électronique et procédé de commande de fonctionnement de dispositif électronique |
| US10489109B2 (en) | 2015-12-24 | 2019-11-26 | Samsung Electronics Co., Ltd. | Electronic device and method for controlling operation of electronic device |
| WO2020242670A1 (fr) * | 2019-05-31 | 2020-12-03 | Microsoft Technology Licensing, Llc | Agrégation de bouclage matériel |
| CN113874830A (zh) * | 2019-05-31 | 2021-12-31 | 微软技术许可有限责任公司 | 聚合硬件环回 |
| US11563857B2 (en) | 2019-05-31 | 2023-01-24 | Microsoft Technology Licensing, Llc | Aggregating hardware loopback |
| CN113874830B (zh) * | 2019-05-31 | 2024-05-10 | 微软技术许可有限责任公司 | 聚合硬件环回 |
Also Published As
| Publication number | Publication date |
|---|---|
| CN104767895A (zh) | 2015-07-08 |
| CN104767895B (zh) | 2017-11-03 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US9191234B2 (en) | Enhanced communication bridge | |
| EP3729770B1 (fr) | Gestion de sessions de communications audio diffusées en continu | |
| US8289362B2 (en) | Audio directionality control for a multi-display switched video conferencing system | |
| US9237238B2 (en) | Speech-selective audio mixing for conference | |
| US10009475B2 (en) | Perceptually continuous mixing in a teleconference | |
| EP1942646A2 (fr) | Procédé de conférence multimédia et signal | |
| US9042535B2 (en) | Echo control optimization | |
| US20140278423A1 (en) | Audio Transmission Channel Quality Assessment | |
| US8553520B2 (en) | System and method for echo suppression in web browser-based communication | |
| US10439951B2 (en) | Jitter buffer apparatus and method | |
| WO2015107382A1 (fr) | Procédé et dispositif de compensation d'écho acoustique par paquets | |
| US11223716B2 (en) | Adaptive volume control using speech loudness gesture | |
| KR102056807B1 (ko) | 원격 화상 회의 시스템 | |
| KR20250140081A (ko) | 근접성 기반 오디오 회의 | |
| CN107113357B (zh) | 与语音质量估计相关的改进方法和设备 | |
| US20070129037A1 (en) | Mute processing apparatus and method | |
| GB2569650A (en) | Managing streamed audio communication sessions | |
| JP2016177176A (ja) | 音声処理装置、プログラム及び方法、並びに、交換装置 | |
| Vaighan | VoIp Voice Quality Measurement by Network Traffic Analysis | |
| Kryvyi et al. | Audio Routing for Scalable Conferencing using AAC-ELD and Bit Stream Domain Energy Estimation |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 14863049; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 14863049; Country of ref document: EP; Kind code of ref document: A1 |