WO2020134761A1 - Video conference implementation method and device, video conference system, and storage medium - Google Patents

Video conference implementation method and device, video conference system, and storage medium

Info

Publication number
WO2020134761A1
Authority
WO
WIPO (PCT)
Prior art keywords
terminal
video
stream
conference
synthesis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2019/120230
Other languages
English (en)
French (fr)
Inventor
屠要峰
赵志东
朱红军
梅君君
黄震江
陈俊
高洪
朱景升
周士俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to EP19902409.2A (EP3905668A4)
Publication of WO2020134761A1
Legal status: Ceased

Classifications

    • H ELECTRICITY > H04 ELECTRIC COMMUNICATION TECHNIQUE > H04N PICTORIAL COMMUNICATION, e.g. TELEVISION (common to all entries below)
    • H04N7/00 > H04N7/14 > H04N7/15 Conference systems
    • H04N21/00 > H04N21/20 > H04N21/25 > H04N21/258 > H04N21/25808 > H04N21/25833 Management of client data involving client hardware characteristics, e.g. manufacturer, processing or storage capabilities
    • H04N21/00 > H04N21/20 > H04N21/23 > H04N21/234 > H04N21/2343 > H04N21/234363 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by altering the spatial resolution, e.g. for clients with a lower screen resolution
    • H04N21/00 > H04N21/20 > H04N21/27 > H04N21/274 > H04N21/2743 Video hosting of uploaded data from client
    • H04N21/00 > H04N21/40 > H04N21/43 > H04N21/431 > H04N21/4312 > H04N21/4314 Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations for fitting data in a restricted space on the screen, e.g. EPG data in a rectangular grid
    • H04N21/00 > H04N21/40 > H04N21/43 > H04N21/44 > H04N21/44016 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • H04N21/00 > H04N21/40 > H04N21/47 > H04N21/478 > H04N21/4788 Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting
    • H04N7/00 > H04N7/14 > H04N7/141 > H04N7/147 Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
    • H04N7/00 > H04N7/14 > H04N7/15 > H04N7/152 Multipoint control units therefor
    • H04N7/00 > H04N7/14 > H04N7/15 > H04N7/155 Conference systems involving storage of or access to video conference sessions

Definitions

  • The embodiments of the present disclosure relate to, but are not limited to, the field of video conferencing technology.
  • Multimedia communication has long been a goal in the communications field and continues to improve. The development of 4G/5G communication technology and Internet/mobile Internet technology has created opportunities for customers to use multimedia communication. Multi-party video conferencing is increasingly widely used in telecommunications, the Internet, and the mobile Internet, and has brought great convenience to people's work and life.
  • A multi-party video conference system supports two modes: a single-stream conference mode and a multi-stream conference mode.
  • In the single-stream conference mode, the single-stream server maintains a one-to-one stream with each single-stream terminal.
  • The single-stream server receives the audio and video code streams from each single-stream terminal, decodes them, synthesizes them into one audio and video stream, re-encodes it, and sends it back to each single-stream terminal.
  • The disadvantage of the single-stream conference mode is that video encoding, decoding, and synthesis on the single-stream server are costly, which gave rise to the multi-stream conference mode.
  • In the multi-stream conference mode, the multi-stream server acts as a media distribution unit that handles conference control and mixing.
  • The multi-stream server forwards video streams rather than synthesizing them: each multi-stream terminal sends one stream and receives the upstream streams of the other multi-stream terminals, forwarded over multiple channels by the multi-stream server; the multi-stream terminal then decodes, synthesizes, and displays the multiple code streams.
  • The multi-stream conference mode is a cheaper and more scalable multi-party conference mode. However, it requires every terminal that joins the conference to support composite display of multiple video streams.
  • A single-stream terminal has no video synthesis capability of its own, so it cannot access a multi-stream conference, which limits the application scenarios of multi-stream video conferencing.
  • At least one embodiment of the present disclosure provides a video conference implementation method and device, a video conference system, and a computer-readable storage medium to implement access for single-stream terminals.
  • At least one embodiment of the present disclosure provides a video conference implementation method, including: determining the type of each terminal participating in a video conference; and, when the terminals participating in the video conference include a single-stream terminal, determining a video synthesis terminal and sending the synthesized code stream generated by the video synthesis terminal to the single-stream terminal, wherein the synthesized code stream is synthesized from the original code streams of the terminals participating in the video conference.
  • At least one embodiment of the present disclosure provides a video conference implementation method, including: a multi-stream terminal receiving original code streams sent by a media server, generating a composite code stream from the original code streams, and uploading the composite code stream to the media server, which sends the composite code stream to a single-stream terminal.
  • At least one embodiment of the present disclosure provides a video conference implementation device, including a memory and a processor, wherein the memory stores a program and, when the program is read and executed by the processor, the video conference implementation method described in any embodiment is implemented.
  • At least one embodiment of the present disclosure provides a computer-readable storage medium that stores one or more programs, and the one or more programs may be executed by one or more processors to implement the video conference implementation method described in any embodiment.
  • At least one embodiment of the present disclosure provides a video conference system, including: a signaling processing module configured to handle terminals joining and leaving the conference and to send information about joined terminals to a media processing module; and a media processing module configured to determine, based on the terminal information, the type of each terminal participating in the video conference and, when the terminals participating in the video conference include a single-stream terminal, to determine a video synthesis terminal and send the synthesized code stream generated by the video synthesis terminal to the single-stream terminal, wherein the synthesized code stream is synthesized from the original code streams of the terminals participating in the video conference.
  • FIG. 1 is a schematic diagram of converged conference networking.
  • FIG. 2 is a flowchart of a video conference implementation method (server side) provided by an embodiment of the present disclosure.
  • FIG. 3a is a schematic diagram of the code streams delivered by a server in an embodiment of the present disclosure.
  • FIG. 3b is a schematic diagram of the composite code streams uploaded by a multi-stream terminal in an embodiment of the present disclosure.
  • FIG. 4 is a flowchart of a video conference implementation method (multi-stream terminal side) provided by an embodiment of the present disclosure.
  • FIG. 5 is a block diagram of a video conference system provided by an embodiment of the present disclosure.
  • FIG. 6a is a flowchart of a pure single-stream terminal video conference implementation provided by an embodiment of the present disclosure.
  • FIG. 6b is a flowchart of a pure single-stream terminal video conference implementation provided by an application example of the present disclosure.
  • FIG. 6c is a schematic diagram of pure single-stream terminal networking provided by an embodiment of the present disclosure.
  • FIG. 7a is a flowchart of a forced single-stream terminal video conference implementation provided by an embodiment of the present disclosure.
  • FIG. 7b is a flowchart of a forced single-stream terminal video conference implementation provided by an application example of the present disclosure.
  • FIG. 7c is a schematic diagram of forced single-stream terminal networking provided by an embodiment of the present disclosure.
  • FIG. 8a is a schematic diagram of a single-stream terminal and a multi-stream terminal holding a converged video conference, provided by an embodiment of the present disclosure.
  • FIG. 8b is a schematic diagram of a single-stream terminal and a multi-stream terminal holding a converged video conference, provided by an application example of the present disclosure.
  • FIG. 8c is a schematic diagram of single-stream terminal and multi-stream terminal networking provided by an embodiment of the present disclosure.
  • FIG. 8d is a schematic diagram of the downlink code streams received by a multi-stream terminal, provided by an application example of the present disclosure.
  • FIG. 8e is a schematic diagram of the code streams uploaded by a multi-stream terminal, provided by an application example of the present disclosure.
  • FIG. 9a is a schematic diagram of converged video conference processing after a single-stream terminal exits the conference, provided by an embodiment of the present disclosure.
  • FIG. 9b is a schematic diagram of converged video conference processing after a single-stream terminal exits the conference, provided by an application example of the present disclosure.
  • FIG. 9c is a schematic diagram of the networking after a single-stream terminal exits the conference, provided by an application example of the present disclosure.
  • FIG. 10a is a schematic diagram of converged video conference processing after a multi-stream terminal exits the conference, provided by an embodiment of the present disclosure.
  • FIG. 10b is a schematic diagram of converged video conference processing after a multi-stream terminal exits the conference, provided by an application example of the present disclosure.
  • FIG. 10c is a schematic diagram of the networking after a multi-stream terminal exits the conference, provided by an application example of the present disclosure.
  • FIG. 11a is a schematic diagram of the processing for calling out to an idle multi-stream terminal, provided by an embodiment of the present disclosure.
  • FIG. 11b is a schematic diagram of the processing for calling out to an idle multi-stream terminal, provided by an application example of the present disclosure.
  • FIG. 11c is a schematic diagram of the code streams received by a called-out idle multi-stream terminal, provided by an application example of the present disclosure.
  • FIG. 11d is a schematic diagram of the code streams uploaded by a called-out idle multi-stream terminal, provided by an application example of the present disclosure.
  • FIG. 12 is a block diagram of a video conference implementation device provided by an embodiment of the present disclosure.
  • FIG. 13 is a block diagram of a computer-readable storage medium provided by an embodiment of the present disclosure.
  • At least one embodiment of the present disclosure provides a video conference implementation method that supports simultaneous access by single-stream and multi-stream terminals: the terminal type is determined and, when a single-stream terminal is present, a video synthesis end is determined for the single-stream terminal, thereby enabling single-stream terminal access.
  • FIG. 1 is a schematic diagram of a converged conference networking.
  • The converged conference network has three parts. The first is the terminals 101, which include single-stream terminals and multi-stream terminals.
  • The second is the signaling access and bearer network 102, which is responsible for signaling access and media bearing for multi-stream and single-stream terminals.
  • Signaling access handles the signaling access processing of the various terminals, using elements such as the MGCF (Media Gateway Control Function), SS (Soft Switch), and SIU (Signaling Interface Unit, a signaling front end).
  • The media bearer network handles media access processing, using elements such as the MGW (Media Gateway).
  • The third is the video conference system 103, which includes the AS (Application Server) and the MRF (Media Resource Function, that is, the media server).
  • In the figure, solid lines are media streams and broken lines are control signaling.
  • The AS is used for signaling processing and service processing.
  • The MRF receives AS control signaling and performs the corresponding media control functions, including but not limited to joining a conference, leaving a conference, conference chairperson functions, conference control, audio playback, video playback, in-band and out-of-band number receiving, audio conferencing, and video conferencing.
  • FIG. 2 is a flowchart of a video conference implementation method provided by an embodiment of the present disclosure. As shown in FIG. 2, the method includes steps 201 and 202.
  • Step 201: Determine the type of each terminal participating in the video conference.
  • Step 202: When the terminals participating in the video conference include a single-stream terminal, determine a video synthesis terminal and send the synthesized code stream generated by the video synthesis terminal to the single-stream terminal, wherein the synthesized code stream is synthesized from the original code streams of the terminals participating in the video conference.
  • A single-stream terminal is a terminal that does not have video synthesis capability.
  • A multi-stream terminal is a terminal that has video synthesis capability.
  • When there is a single-stream terminal in the video conference, a video synthesis terminal is determined and performs video synthesis for the single-stream terminal. This lets the single-stream terminal access the conference and avoids the waste that would result if existing single-stream terminals could not join video conferences.
  • The video conference implementation method may further include: when the terminals participating in the video conference include a multi-stream terminal, sending the original code streams generated by the other terminals participating in the video conference to the multi-stream terminal.
  • That is, for a multi-stream terminal the original code streams are sent directly, and the multi-stream terminal performs video synthesis itself.
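The routing rule in steps 201 and 202 can be illustrated with a minimal Python sketch. The class names, the `plan_downlink` function, and the stream labels are hypothetical and are not part of the disclosure; the sketch only shows that a single-stream terminal is sent one composite stream while a multi-stream terminal is sent the original streams of the other participants.

```python
from dataclasses import dataclass
from enum import Enum


class TerminalType(Enum):
    SINGLE_STREAM = "single"   # no video synthesis capability
    MULTI_STREAM = "multi"     # can synthesize several streams itself


@dataclass(frozen=True)
class Terminal:
    name: str
    type: TerminalType


def plan_downlink(terminals, synthesizer):
    """Return {terminal name: list of stream labels to send to it}.

    `synthesizer` is a hypothetical label for whichever party produces the
    composite stream (a media server or a multi-stream terminal).
    """
    plan = {}
    for t in terminals:
        if t.type is TerminalType.SINGLE_STREAM:
            # Step 202: a single-stream terminal gets one composite stream.
            plan[t.name] = [f"composite-from:{synthesizer}"]
        else:
            # A multi-stream terminal gets the original streams of the others.
            plan[t.name] = [f"original:{o.name}" for o in terminals if o is not t]
    return plan


terminals = [
    Terminal("A", TerminalType.MULTI_STREAM),
    Terminal("B", TerminalType.MULTI_STREAM),
    Terminal("C", TerminalType.SINGLE_STREAM),
]
plan = plan_downlink(terminals, synthesizer="A")
```

Here terminal C receives a single composite stream, while A and B each receive the two original streams of the other participants.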
  • Determining the type of a terminal participating in the video conference includes determining the type according to the terminal information reported by the terminal.
  • When a terminal accesses a video conference, it reports terminal information, from which the terminal type can be determined.
  • Single-stream and multi-stream terminals may also carry terminal type identification information to indicate their type.
  • For a multi-stream terminal, the video capability description includes parameters such as video port, video codec, video resolution, transmission port, bandwidth, and send/receive mode. These parameters determine which video formats the multi-stream terminal supports.
  • A single-stream terminal carries audio and video capabilities. The audio capability is, for example, G.711, and the video capability description includes parameters such as video port, video codec, video resolution, transmission port, bandwidth, and send/receive mode. These parameters determine which video formats the single-stream terminal supports.
  • The video synthesis terminal includes a media server and/or a multi-stream terminal. That is, the media server may be used as the video synthesis terminal, a multi-stream terminal may be used as the video synthesis terminal, or both may be used.
  • Using multi-stream terminals as video synthesis terminals makes full use of their video synthesis capability, which saves computing power on the media server, improves its processing performance, and facilitates converged conferences in which single-stream and multi-stream terminals take part in the same video conference.
  • One or more multi-stream terminals may be used as video synthesis terminals.
  • Multiple multi-stream terminals can provide composite code streams in different formats to meet the needs of different single-stream terminals.
  • For example, the participating single-stream terminals may require multiple code stream formats, in which case multiple multi-stream terminals can each provide composite code streams in multiple formats, and so on.
  • The multi-stream terminal acting as a video synthesizer may be a multi-stream terminal participating in the video conference or an idle multi-stream terminal. In the latter case, the multi-stream terminal is used only as a video synthesizer and does not upload audio and video streams that it collects itself; the server of the video conference system actively calls the idle multi-stream terminal into the conference.
  • Determining the video synthesis end includes selecting one or more of the multi-stream terminals participating in the video conference as the video synthesis end.
  • The video synthesizer can be selected from the terminals that have joined the video conference according to a preset strategy.
  • For example, the preset strategy may select the video synthesis end according to the computing power of each terminal. This is only an example; other strategies may be set.
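As an illustration of the computing-power strategy mentioned above, the following Python sketch ranks candidate multi-stream terminals by a score and picks the top ones. The `compute_score` field is an assumption made for the example; the disclosure does not say how computing power is measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    compute_score: float  # hypothetical measure of available computing power


def select_synthesizers(candidates, count=1):
    """Pick the `count` candidates with the highest compute score."""
    if count < 1:
        raise ValueError("count must be at least 1")
    ranked = sorted(candidates, key=lambda c: c.compute_score, reverse=True)
    return ranked[:count]


multi_stream_terminals = [
    Candidate("T1", 2.5),
    Candidate("T2", 7.0),
    Candidate("T3", 4.0),
]
chosen = select_synthesizers(multi_stream_terminals, count=2)
```

Selecting more than one terminal, as in this example, also fits the case where several synthesizers are needed to cover different formats.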
  • Determining the video synthesis terminal includes calling an idle multi-stream terminal into the conference and then using the called multi-stream terminal as the video synthesis terminal.
  • An idle multi-stream terminal is called into the conference and used as a video synthesis terminal so that single-stream terminals can also hold a video conference.
  • This solution expands the processing capacity of the whole video conference system, allowing more terminals to access it.
  • The converged conference processing module queries the idle multi-stream terminal configuration table and initiates a conference call to an idle multi-stream terminal. After the call succeeds, the newly joined multi-stream terminal is set as the video synthesis terminal and provides video synthesis services for each single-stream terminal.
  • The idle multi-stream terminal configuration table includes the following information: terminal number, terminal IP, terminal port (PORT), terminal name, transmission protocol, control protocol, whether SVC (Scalable Video Coding) is supported, and so on.
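The configuration table fields listed above map naturally onto a small record type. The sketch below is a hypothetical Python representation: the field names, the example values (addresses from the 192.0.2.0/24 documentation range, "UDP", "SIP"), and the `pick_idle_terminal` helper are assumptions for illustration, not values from the disclosure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class IdleTerminalEntry:
    """One row of the idle multi-stream terminal configuration table."""
    number: str
    ip: str
    port: int
    name: str
    transport_protocol: str
    control_protocol: str
    supports_svc: bool


def pick_idle_terminal(table: List[IdleTerminalEntry],
                       need_svc: bool = False) -> Optional[IdleTerminalEntry]:
    """Return the first entry that satisfies the SVC requirement, if any."""
    for entry in table:
        if need_svc and not entry.supports_svc:
            continue
        return entry
    return None


# Illustrative rows only; none of these values come from the disclosure.
idle_table = [
    IdleTerminalEntry("8001", "192.0.2.10", 5060, "idle-1", "UDP", "SIP", False),
    IdleTerminalEntry("8002", "192.0.2.11", 5060, "idle-2", "UDP", "SIP", True),
]
first = pick_idle_terminal(idle_table)
svc_capable = pick_idle_terminal(idle_table, need_svc=True)
```

A server following the described flow would then place a call to the selected entry and, once the call succeeds, mark that terminal as the video synthesis terminal.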
  • The italicized lines are added signaling lines; they indicate that the idle multi-stream terminal is a video synthesis terminal and give the corresponding video synthesis format.
  • The outgoing call negotiation signaling contains four receive-only downlink code streams, each identified as "recvonly".
  • The four downlink code streams received by the called-out multi-stream terminal are used for synthesis. There may also be more than four code streams.
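The negotiation signaling itself is not reproduced in this text. Assuming it is carried in SDP, where `a=recvonly` is a standard direction attribute, four receive-only video sections as seen from the terminal's side could look like the output of the sketch below. The port numbers, payload type 96, and the H.264 codec line are illustrative assumptions.

```python
def recvonly_video_sections(ports, payload_type=96, codec="H264/90000"):
    """Build one receive-only SDP video section per port.

    Only the media-level lines are produced; session-level lines
    (v=, o=, s=, c=, t=) are omitted for brevity.
    """
    lines = []
    for port in ports:
        lines.append(f"m=video {port} RTP/AVP {payload_type}")
        lines.append(f"a=rtpmap:{payload_type} {codec}")
        lines.append("a=recvonly")  # this stream is only received
    return "\r\n".join(lines) + "\r\n"


# Four downlink streams to be synthesized; more could be added the same way.
sdp = recvonly_video_sections([40000, 40002, 40004, 40006])
```

Each additional downlink stream adds one more `m=video` section, which matches the remark that there may be more than four streams.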
  • When multiple idle terminals are configured, calls can be placed to several of them as the situation requires.
  • With SVC, the synthesized code stream can contain code streams in multiple formats at the same time. In that case a single multi-stream terminal that supports SVC streams can serve multiple single-stream terminals with different video formats.
  • The media server of the video conference system may also be used as the video synthesis terminal. That is, the video synthesis terminal may be the media server, a called-in idle multi-stream terminal, or both, and one or more idle multi-stream terminals can be called in as video synthesis terminals.
  • A media server, a multi-stream terminal that has joined the video conference (it participates in the conference itself and uploads the audio and video streams it collects), and a called-in idle multi-stream terminal (it does not participate in the conference itself and does not upload audio and video streams of its own) may all be used as video synthesis ends.
  • The video synthesis end may be one or more of a media server, a multi-stream terminal that has joined the video conference, and an idle multi-stream terminal.
  • Selection of the video synthesizer can be triggered when a single-stream terminal joins, when all terminals have joined the video conference, or when a multi-stream terminal acting as a video synthesizer exits or becomes abnormal.
  • After the media server selects the video synthesizer, it can generate a record table for it that records the information of the video synthesizer and the single-stream terminals it serves, that is, the single-stream terminals to which the video synthesizer provides synthesized code streams.
  • The video synthesis end sends the generated synthesized code stream to the media server, and the media server sends it on to the single-stream terminal.
  • A single-stream terminal served by a video synthesis end is referred to as a distribution terminal of that video synthesis end.
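The record table and the forwarding path described above can be sketched as a small in-memory structure. The `SynthesizerRecords` class and its method names are hypothetical; the disclosure only states that the media server records each video synthesizer together with the single-stream terminals it serves and forwards the synthesized stream to them.

```python
from collections import defaultdict


class SynthesizerRecords:
    """Maps each video synthesizer to its distribution terminals."""

    def __init__(self):
        self._served = defaultdict(set)

    def assign(self, synthesizer, terminal):
        """Record that `terminal` is served by `synthesizer`."""
        self._served[synthesizer].add(terminal)

    def remove_terminal(self, terminal):
        """Drop a single-stream terminal that has left the conference."""
        for served in self._served.values():
            served.discard(terminal)

    def distribution_terminals(self, synthesizer):
        """Return the served terminals in a stable (sorted) order."""
        return sorted(self._served.get(synthesizer, ()))

    def forward(self, synthesizer, composite_stream):
        """Return {terminal: stream} for every distribution terminal."""
        return {t: composite_stream
                for t in self.distribution_terminals(synthesizer)}


records = SynthesizerRecords()
records.assign("multi-A", "single-1")
records.assign("multi-A", "single-2")
deliveries = records.forward("multi-A", "composite-720p")
records.remove_terminal("single-2")
remaining = records.distribution_terminals("multi-A")
```

Keeping the mapping on the media server means the synthesizer itself does not need to know which terminals consume its output.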
  • The video conference implementation method may further include: when the video synthesis terminal is a multi-stream terminal, sending to the multi-stream terminal indication information indicating that it has been selected as the video synthesis end.
  • The video conference implementation method may further include: determining synthesized video format information according to a preset rule, wherein the synthesized video format information is used by the video synthesis terminal to generate the synthesized code stream.
  • The preset rules include but are not limited to: the most satisfaction principle, the minimum satisfaction principle, the priority satisfaction principle, and the multi-file distribution principle. The principles are described in detail in Table 1 below.
  • The video conference implementation method may further include: sending a first instruction to the video synthesis terminal, where the first instruction carries the synthesized video format information.
  • The composite video format information includes at least one of the following: composite video transmission port, composite video codec type, bandwidth, send/receive mode, composite video resolution, composite video bit rate, and composite video frame rate.
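The format information carried by the first instruction can be sketched as a record with the listed fields. In the Python sketch below, the `most_requested` function is only one plausible reading of a majority-style rule (pick the resolution that the largest number of single-stream terminals ask for); it is not the definition from Table 1, which is not reproduced here. All field names, values, and the instruction layout are assumptions.

```python
from collections import Counter
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CompositeVideoFormat:
    port: int              # composite video transmission port
    codec: str             # composite video codec type
    bandwidth_kbps: int
    direction: str         # send/receive mode
    resolution: str
    bitrate_kbps: int
    frame_rate: int


def most_requested(resolution_by_terminal):
    """Hypothetical rule: the resolution requested by the most terminals."""
    counts = Counter(resolution_by_terminal.values())
    return counts.most_common(1)[0][0]


def first_instruction(fmt):
    """Hypothetical message layout carrying the format information."""
    return {"type": "set-composite-format", **asdict(fmt)}


requests = {"single-1": "720p", "single-2": "360p", "single-3": "720p"}
resolution = most_requested(requests)
fmt = CompositeVideoFormat(port=40000, codec="H264", bandwidth_kbps=2048,
                           direction="sendonly", resolution=resolution,
                           bitrate_kbps=1500, frame_rate=30)
instruction = first_instruction(fmt)
```

Other preset rules would replace only `most_requested`; the record and the instruction would stay the same.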
  • the terminal and the media server include an upstream audio stream and a video stream, and a downstream audio stream and a video stream.
  • the downlink code stream received by the multi-stream terminal is the original code stream of another terminal that needs to be synthesized
  • the upstream video code stream is its own original code stream collected by the multi-stream terminal; in the embodiment of the present disclosure, If the multi-stream terminal is selected as the video synthesis terminal, the received downstream stream is still the original stream of other terminals that need to be synthesized.
  • the upstream stream includes the synthesized stream in addition to its original stream At this time, the synthesized code stream uploaded by the multi-stream terminal will be forwarded to other single-stream terminals and displayed to the user through the single-stream terminal.
  • the downlink code stream delivered by the media server is a single screen of other terminals, the uplink code stream sent by the terminal is the screen of the terminal itself, and the rest are composite screens of different resolutions, code rates, and frame rates.
  • FIG. 3a it is a four-way stream image forwarded by the media server to the terminal, and the delivered stream is a single-screen image, which comes from the upstream original stream of other terminals.
  • the four-way code stream of the terminal is shown in Figure 3b. It can be seen that one code stream is the terminal's own image, and the other three are composite images; as shown in Figure 3b, the four-screen mode is synthesized. There may be There are many ways to synthesize images, but only one stream is shown. The resolution, code rate, and frame rate of the three-way composite images are different, which is needed for the terminal media processing module to forward the code stream of the appropriate size to the appropriate terminal. In addition, the number of upstream code streams of the terminal is also variable. According to policy selection, it may be three code streams, or one or two code streams or more composite code streams.
  • A single upstream code stream may also carry an SVC video stream with multiple resolutions, code rates, and frame rates; the media server then extracts and forwards the SVC stream in the video formats required.
  • the multi-stream terminal may simply uplink one composite code stream.
  • the video conference implementation method further includes: when the video synthesis terminal is a multi-stream terminal, recording a single-stream terminal served by the video synthesis terminal.
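  • When all single-stream terminals served by the multi-stream terminal acting as the video synthesis end have left the conference, a second instruction, which cancels the selection of the multi-stream terminal as the video synthesis end, is sent to the multi-stream terminal.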
  • the second instruction can be realized by renegotiating to set the synthesis port to 0. It should be noted that this is only an example, and the multi-stream terminal may be notified by other means that the multi-stream terminal has been canceled as a video synthesis terminal.
  • When determining the video synthesis end, multiple video synthesis ends may be determined, each serving as a backup for the others.
  • When one of the video synthesis ends leaves the conference or becomes abnormal, the system switches to another video synthesis end.
  • The video synthesis end can thus be switched adaptively without the end user noticing, which effectively safeguards the video quality of the converged conference.
  • at least two video synthesizers are selected, and both the multi-stream terminal and the media server are used as the video synthesizers. There is no actual difference in the video synthesis effect between the two, and they can be used as backups for each other.
  • the media server can also synthesize the synthesized code stream required by the single-stream terminal to meet the requirements of the converged video conference.
  • the multi-stream terminal is preferentially selected as the video synthesizer, which can save the processing capacity of the media server and improve the overall service performance of the system.
  • The video conference implementation method further includes: when the multi-stream terminal acting as the video synthesis end leaves the conference or becomes abnormal, re-determining the video synthesis end.
  • An abnormality of the multi-stream terminal includes, for example, congestion caused by changing network conditions that prevents the multi-stream terminal selected as the video synthesis end from continuing to provide the video synthesis service.
  • When the video synthesis end leaves or becomes abnormal, the system dynamically switches to another video synthesis end. This limits the adverse effects: the conference service is not terminated because a multi-stream terminal (the video synthesis end) has exited, and the continuity of the system function is ensured.
  • Re-determining the video synthesis end includes: using the backup video synthesis end as the new video synthesis end; or selecting a video synthesis end from among the multi-stream terminals that participate in the video conference and are not currently selected as the video synthesis end; or calling an idle multi-stream terminal to act as the video synthesis end. That is, a backup video synthesis end can be set when the video synthesis end is selected, and when the current video synthesis end leaves the conference or becomes abnormal, the backup takes over code stream synthesis.
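The re-determination step can be sketched as follows. This is a minimal illustration, not the disclosed implementation. The source lists the three options as alternatives, so the priority order used here (backup first, then another multi-stream terminal already in the conference, then an idle multi-stream terminal, and finally the media server) is an assumption, as are all names such as `ConferenceState` and `redetermine_synthesizer`.

```python
from dataclasses import dataclass, field
from typing import List

MEDIA_SERVER = "media-server"  # hypothetical identifier for the media server


@dataclass
class ConferenceState:
    """Hypothetical bookkeeping for one conference."""
    active: str  # current video synthesis end
    backups: List[str] = field(default_factory=list)
    in_conference_multi_stream: List[str] = field(default_factory=list)
    idle_multi_stream: List[str] = field(default_factory=list)


def redetermine_synthesizer(state: ConferenceState) -> str:
    """Pick a replacement after the active synthesis end leaves or fails."""
    failed = state.active
    # Assumed priority: backup, then in-conference terminal, then idle terminal.
    pools = (state.backups, state.in_conference_multi_stream, state.idle_multi_stream)
    for pool in pools:
        for candidate in pool:
            if candidate != failed:
                state.active = candidate
                return candidate
    # No multi-stream terminal is available: fall back to the media server.
    state.active = MEDIA_SERVER
    return MEDIA_SERVER
```

Falling back to the media server mirrors the statement that the media server can also produce the synthesized stream required by single-stream terminals.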
  • Determining the video synthesis end includes: when the configured conference mode is a preset mode, using the media server as the video synthesis end. For example, the preset mode is a forced single-stream mode, that is, the video conference is held as a single-stream conference, in which case the media server serves as the video synthesis end.
  • The video conference implementation method further includes: when the terminals participating in the video conference include a multi-stream terminal, sending the synthesized code stream generated by the media server to the multi-stream terminal. That is, in the forced single-stream mode, multi-stream terminals are also treated as single-stream terminals.
  • another embodiment of the present disclosure provides a video conference implementation method, including step 401 and step 402.
  • Step 401 a multi-stream terminal receives an original code stream, and generates a synthesized code stream according to the original code stream.
  • Step 402 The multi-stream terminal uploads the synthesized code stream to a media server to send the synthesized code stream to a single-stream terminal through the media server.
  • a multi-stream terminal serves as a video synthesizer and uploads a synthesized code stream to a media server, thereby providing support for a single-stream terminal to join a multi-stream conference.
  • Before the multi-stream terminal receives the original code stream, the method further includes: the multi-stream terminal receives a conference joining request and joins the video conference. That is, an idle multi-stream terminal responds to the joining request, joins the video conference, and performs code stream synthesis as the video synthesis end.
  • In step 401, generating the synthesized code stream according to the original code stream includes: after receiving a first instruction indicating that the device (that is, the multi-stream terminal) is selected as the video synthesis end, generating the synthesized code stream according to the original code stream. After selecting the multi-stream terminal as the video synthesis end, the media server sends an instruction to notify the multi-stream terminal of the selection.
  • The first instruction carries video synthesis information.
  • Generating the synthesized code stream includes: generating the synthesized code stream according to the video synthesis information in the first instruction.
  • the first instruction includes at least one of the following: a composite video transmission port, a composite video codec type, a bandwidth, a transceiver mode, a composite video resolution, a composite video code rate, and a composite video frame rate.
  • The video conference implementation method further includes: when the multi-stream terminal receives the second instruction, which cancels its selection as the video synthesis end, stopping generation of the synthesized code stream and releasing the related resources, that is, the resources used to generate and upload the synthesized code stream.
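The terminal-side handling of the two instructions can be sketched as below. This is a minimal sketch under stated assumptions: the class and field names (`FirstInstruction`, `MultiStreamTerminal`, `bitrate_kbps`, and so on) are hypothetical, only a subset of the listed fields is modelled, and real resource handling (encoder, transport port) is reduced to setting or clearing one attribute.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FirstInstruction:
    """Video synthesis information; field names are hypothetical."""
    port: int
    codec: str
    resolution: str
    frame_rate: int
    bitrate_kbps: int


class MultiStreamTerminal:
    def __init__(self) -> None:
        # None means the terminal is not acting as a video synthesis end.
        self.synthesis: Optional[FirstInstruction] = None

    def on_first_instruction(self, instruction: FirstInstruction) -> None:
        # Selected as video synthesis end: start producing the synthesized stream.
        self.synthesis = instruction

    def on_second_instruction(self) -> None:
        # Selection cancelled: stop synthesizing and release related resources.
        self.synthesis = None

    def upstream_streams(self) -> List[str]:
        # The terminal always uploads its own original stream.
        streams = ["original"]
        if self.synthesis is not None:
            s = self.synthesis
            streams.append(f"synthesized:{s.resolution}/{s.frame_rate}/{s.bitrate_kbps}K")
        return streams
```

After the second instruction the terminal behaves like any other multi-stream participant and uploads only its original stream.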
  • The synthesized code stream is either a single synthesized code stream or multiple synthesized code streams, and includes code streams in one or more video formats.
  • A single synthesized code stream can have only one resolution, or it can be an SVC (Scalable Video Coding) code stream with multiple resolutions.
  • Multiple synthesized code streams can be code streams in multiple formats (one format per stream), where a format comprises resolution, frame rate, and code rate.
  • The code streams in multiple formats can be video streams with different resolutions, frame rates, and bit rates, used to meet the supported-format requirements of different single-stream terminals.
  • FIG. 5 is a schematic diagram of the system architecture of the video conference.
  • The video conference implementation system includes a signaling processing module 501 and a media processing component 502. There can be multiple media processing components 502, deployed in a distributed manner. The signaling processing module 501 can be deployed partly on the AS and partly on the MRF, and the media processing component 502 can be deployed on the MRF; other deployment methods are of course also possible.
  • the signaling processing module 501 is configured to process the joining and exiting of the terminal, and send the information of the joining terminal to the media processing component 502.
  • the signaling processing module 501 is a control interface module that interacts with external applications.
  • The signaling processing module 501 is responsible for handling terminals (both single-stream and multi-stream terminals) joining and leaving the conference, including terminal capability negotiation, joining processing, and leaving processing. It also sends information about the joined terminals to the media processing component 502 through internal instructions, so that the media processing component 502 can identify, record, and subsequently process the various conference terminals.
  • Terminal information includes but is not limited to terminal type and/or terminal capability information. Terminal capability information includes but is not limited to at least one of the following: audio capability, video resolution, video frame rate, video code rate, transmission port, bandwidth, and so on.
  • The media processing component 502 is configured to determine the types of the terminals participating in the video conference according to the terminal information and, when the participating terminals include a single-stream terminal, to determine the video synthesis end and send the synthesized code stream generated by the video synthesis end to the single-stream terminal.
  • The synthesized code stream is synthesized from the original code streams of the terminals participating in the video conference.
  • The media processing component 502 includes a terminal media processing module 5021 and a converged conference processing module 5022. The terminal media processing module 5021 is configured to receive the information of the joining terminal sent by the signaling processing module 501 and send it to the converged conference processing module 5022; to receive the original code streams; to receive the video synthesis end information sent by the converged conference processing module 5022; and, according to that information, to send the synthesized code stream generated by the video synthesis end to the single-stream terminal. The terminal media processing module 5021 also performs various audio and video processing functions.
  • Audio processing functions include audio playback, audio recording, audio conference, and audio conference control functions; video processing functions include video playback, video recording, video conference, and video conference control functions.
  • The video conference function refers to receiving the code stream of each terminal and forwarding it to the other terminals. The code streams are of two types.
  • The first is the original video stream collected by the terminal, which is generally a single-picture code stream. This type of code stream is forwarded to the other multi-stream terminals, which synthesize the received streams and display the result to the end user.
  • The second is the synthesized code stream, produced by a multi-stream terminal. This type of code stream is forwarded to single-stream terminals, which display it directly to the end user.
  • The converged conference processing module 5022 is responsible for converged conference processing, including forced single-stream conference processing, video synthesis end selection, multi-stream adaptation, and outgoing calls to multi-stream terminals.
  • Forced single-stream conference processing determines the form in which the multi-party video conference is held: the converged conference processing module 5022 determines whether the conference is configured as forced single-stream. If it is, a multi-stream terminal in the video conference is never selected as the video synthesis end and is treated as an ordinary single-stream terminal.
  • The converged conference processing module 5022 determines the types of the terminals participating in the video conference according to the terminal information. When the participating terminals include a single-stream terminal, it determines the video synthesis end and notifies the terminal media processing module 5021 of the video synthesis end information.
  • The converged conference processing module 5022 determining the video synthesis end includes: using a media server and/or a multi-stream terminal as the video synthesis end.
  • The converged conference processing module 5022 determining the video synthesis end includes: selecting one or more of the multi-stream terminals participating in the video conference as the video synthesis end.
  • The converged conference processing module 5022 determining the video synthesis end includes: calling an idle multi-stream terminal to join the conference and, after the called multi-stream terminal has joined, using it as the video synthesis end.
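The selection options above can be combined into one decision function, sketched below. The function name, the string labels, and in particular the order of preference (multi-stream terminal in the conference, then an idle multi-stream terminal, then the media server) are assumptions made for illustration; the text only states that multi-stream terminals are preferred in order to save media server capacity. Returning an empty list for a pure multi-stream conference is likewise an inference from the condition that a synthesis end is determined when a single-stream terminal participates.

```python
from typing import Dict, List

MEDIA_SERVER = "media-server"  # hypothetical identifier for the media server


def choose_synthesizers(terminals: Dict[str, str],
                        forced_single_stream: bool,
                        idle_multi_stream: List[str]) -> List[str]:
    """terminals maps a terminal id to "single" or "multi"."""
    if forced_single_stream:
        # Forced single-stream mode: the media server always synthesizes.
        return [MEDIA_SERVER]
    if not any(kind == "single" for kind in terminals.values()):
        # Pure multi-stream conference: no synthesized stream is needed.
        return []
    multi = sorted(t for t, kind in terminals.items() if kind == "multi")
    if multi:
        # Prefer a participating multi-stream terminal (first id, for simplicity).
        return multi[:1]
    if idle_multi_stream:
        # Otherwise call an idle multi-stream terminal into the conference.
        return idle_multi_stream[:1]
    return [MEDIA_SERVER]
```

A production policy would rank candidates by capability rather than by identifier, and could return several synthesis ends so that they back each other up.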
  • The converged conference processing module 5022 is further configured to, after determining the video synthesis end and when the video synthesis end is a multi-stream terminal, send to the terminal media processing module 5021 indication information indicating that the multi-stream terminal is selected as the video synthesis end.
  • the terminal media processing module 5021 is further configured to send the instruction information indicating that the multi-stream terminal is selected as the video synthesis terminal to the corresponding multi-stream terminal.
  • the video synthesis terminal selection function determines the terminal type, and the fusion conference processing module 5022 identifies the multi-stream terminal.
  • the converged conference processing module 5022 selects a terminal with video synthesis capability (that is, a multi-stream terminal) to perform video synthesis.
  • The terminal media processing module 5021 sends a first instruction to notify the video synthesis end.
  • The first instruction includes video synthesis end indication information (that is, information indicating that the multi-stream terminal is selected as the video synthesis end) and/or video synthesis information.
  • The video synthesis information includes but is not limited to the following: video port, video codec, video resolution, video frame rate, video code rate, transmission port, bandwidth, transmission mode, and so on. The video synthesis end indication information can be conveyed by adding description information to the media description information.
  • The video synthesis information can likewise be conveyed by adding description information to the media description information.
  • The first instruction can be transmitted to the multi-stream terminal through renegotiation.
  • For example, one or more video media description lines can be added to the renegotiation signaling, such as a media description line dedicated to the synthesized video media stream channel.
  • the information added in the renegotiation signaling is as follows.
  • The converged conference processing module 5022 is further configured to, when the video synthesis end is a multi-stream terminal, record the single-stream terminals served by the video synthesis end; and, when all single-stream terminals served by the multi-stream terminal acting as the video synthesis end have left the conference, send to the terminal media processing module 5021 a second instruction that cancels the selection of the multi-stream terminal as the video synthesis end.
  • The terminal media processing module 5021 is further configured to send the second instruction, which cancels the selection of the multi-stream terminal as the video synthesis end, to the multi-stream terminal.
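The bookkeeping behind the second instruction can be sketched as follows. `SynthesizerRegistry` and its method names are hypothetical; the sketch only shows the rule stated above, namely that a multi-stream terminal is released from its synthesis role once the last single-stream terminal it serves has left.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set


class SynthesizerRegistry:
    """Tracks which single-stream terminals each video synthesis end serves."""

    def __init__(self) -> None:
        self._served: DefaultDict[str, Set[str]] = defaultdict(set)

    def record(self, synthesizer: str, single_stream_terminal: str) -> None:
        self._served[synthesizer].add(single_stream_terminal)

    def on_leave(self, single_stream_terminal: str) -> List[str]:
        """Return the synthesis ends that should receive the second instruction."""
        to_cancel = []
        for synthesizer in list(self._served):
            served = self._served[synthesizer]
            served.discard(single_stream_terminal)
            if not served:
                # Nobody left to serve: cancel this terminal's synthesis role.
                del self._served[synthesizer]
                to_cancel.append(synthesizer)
        return to_cancel
```

Each identifier returned by `on_leave` corresponds to one second instruction passed from module 5022 to module 5021 and on to the multi-stream terminal.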
  • The converged conference processing module 5022 is further configured to re-determine the video synthesis end when the multi-stream terminal acting as the video synthesis end leaves the conference or becomes abnormal.
  • Re-determining the video synthesis end may consist of using the backup video synthesis end as the video synthesis end (in this case, the single-stream terminals served by the departed or abnormal multi-stream terminal are added to the distribution terminal list of the backup video synthesis end),
  • selecting a new multi-stream terminal as the video synthesis end, or calling an idle multi-stream terminal to act as the video synthesis end.
  • Scenario 1 A converged video conference held by single-stream terminals only
  • the video conference implementation method includes steps 601a to 604a.
  • Step 601a Receive a conference joining request from a single-stream terminal.
  • Step 602a Record the information of the single-stream terminal, negotiate with the single-stream terminal, and establish a conference connection.
  • Step 603a Determine the synthesized video format information according to the information of the single stream terminal and the preset principle, and select the video synthesis terminal.
  • In this embodiment, the media server is selected as the video synthesis end.
  • In other embodiments, an idle multi-stream terminal may be called to act as the video synthesis end, or both the media server and the called idle multi-stream terminal may serve as video synthesis ends.
  • Step 604a Apply for video synthesis resources, use the applied video synthesis resources, synthesize a code stream according to the synthesized video format information to obtain a synthesized code stream, and send the synthesized code stream to a single-stream terminal.
  • the video conference implementation method includes steps 601b to 608b.
  • Step 601b The single-stream terminal initiates a request for joining a conference to the signaling processing module.
  • Step 602b The signaling processing module processes the conference joining request and sends the information of the single-stream terminal to the terminal media processing module, where the information of the single-stream terminal includes but is not limited to at least one of the following: terminal type, audio port, audio format, video port, video resolution, video frame rate, video bit rate, transmission port, bandwidth.
  • Step 603b The terminal media processing module records the information of the single stream terminal, and feeds back the negotiated media information to the signaling processing module.
  • Step 604b The signaling processing module completes handshake negotiation with the single-stream terminal, and the single-stream terminal's in-conference connection is established.
  • the network includes a single-stream terminal C, a single-stream terminal D, and a single-stream terminal E. All terminals in the network are single-stream terminals.
  • Step 605b The terminal media processing module sends the information of the single-stream terminal to the converged conference processing module.
  • Step 606b The converged conference processing module obtains the terminal information from the terminal media processing module, examines all terminals, finds that every terminal participating in the video conference is a single-stream terminal, and sorts the single-stream terminals according to their negotiated media information. This sorting is used to select the format of the synthesized video. The terminals can be sorted by video resolution, frame rate, and code rate.
  • the sorting table of each single-stream terminal is shown in Table 4, including the terminal number, terminal type, audio and video port, video resolution, video frame rate, video bit rate, etc.
  • Step 607b The converged conference processing module selects a synthesized video format according to a preset rule, applies for video synthesis resources according to the selected format, and sends the internal synthesis resource number to the terminal media processing module.
  • the internal synthetic resource number is the index information of the resource.
  • the converged conference processing module chooses to use a media server as the video synthesis end, and notifies the terminal media processing module of the internal synthesis resource number of the video synthesis end.
  • Step 608b The terminal media processing module uses the internal synthesized resource number to synthesize the original code stream to obtain a synthesized code stream, and forwards the synthesized code stream to each single-stream terminal.
  • the original code stream is sent by the terminal (in this embodiment, a single stream terminal) to the terminal media processing module.
  • In this embodiment, the media server is selected as the video synthesis end, which allows the single-stream terminals to access the conference.
  • Scenario 2 A converged video conference held in forced single-stream mode
  • the video conference implementation method according to an embodiment of the present disclosure includes steps 701a to 704a.
  • Step 701a Receive conference joining requests from a single-stream terminal and a multi-stream terminal.
  • the joining requests of the single-stream terminal and the multi-stream terminal may be initiated at the same time, or may be initiated separately.
  • Step 702a Record the information of the single-stream terminal and the multi-stream terminal, negotiate with the single-stream terminal, and establish a conference connection.
  • the preset mode is a forced single-stream conference mode, that is, regardless of the type of terminal accessed, the media server is used as the video synthesis end.
  • Step 703a Determine the synthesized video format information according to the terminal information and the preset principle, and select the media server as the video synthesis end.
  • Step 704a Apply for video synthesis resources, synthesize the code stream to obtain a synthesized code stream, and send the synthesized code stream to a single-stream terminal and a multi-stream terminal.
  • the video conference implementation method includes steps 701b to 709b.
  • Step 701b The single-stream terminal and the multi-stream terminal each initiate a conference joining request to the signaling processing module.
  • Step 702b The signaling processing module analyzes and processes the conference control signaling and checks whether the conference is configured as a forced single-stream conference. If it is, the conference mode is set to the forced single-stream mode when the terminal information is sent to the terminal media processing module.
  • The terminal information includes but is not limited to at least one of the following: terminal type, audio port, audio format, video port, video resolution, video frame rate, video bit rate, transmission port, bandwidth, and so on.
  • The terminal information is delivered to the terminal media processing module through internal messages.
  • Step 703b The terminal media processing module records the information of the single-stream terminal and the multi-stream terminal, and determines whether the conference is in the forced single-stream mode.
  • In the forced single-stream mode, the media information negotiated for the multi-stream terminal is fed back to the signaling processing module in the format used for a single-stream terminal.
  • Step 704b The signaling processing module completes handshake negotiation with the terminal, and the terminal's conference connection is established.
  • The terminal information table is shown in Table 5 below. The information in Table 5 is only an example and is not exhaustive; items such as transmission port and bandwidth are also necessary but are not listed for reasons of space. Note that the terminal number in Table 5 is the index information of the terminal.
  • When a single-stream terminal and a multi-stream terminal hold a converged conference in this mode, the multi-stream terminal is forcibly treated as a single-stream terminal: it sends one upstream audio stream and one upstream video stream, and receives one downlink synthesized code stream and one downlink audio code stream delivered by the media server.
  • Step 705b The terminal media processing module sends the terminal information to the converged conference processing module.
  • Step 706b The converged conference processing module obtains terminal information from the terminal media processing module, and determines whether it is a forced single-stream mode. If it is a forced single-stream mode, the single-stream terminal and the multi-stream terminal are sorted at the same time. This sorting is used to select the video synthesis format. In this example, the sorting table is described in Table 6 below.
  • Step 707b The converged conference processing module selects the synthesized video format according to the preset rules.
  • the preset rules here include, but are not limited to: the majority satisfaction principle, the minimum satisfaction principle, the priority satisfaction principle, and the multi-file distribution principle. Various principles are described in detail in Table 1.
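Table 1 is not reproduced in this section, so the exact definitions of these principles are not available here. The sketch below shows one plausible reading of three of them and should be taken as an assumption throughout: the function names, the tuple representation of a format, and the interpretation of each rule are illustrative only. The priority satisfaction principle is omitted because it would require priority data that this section does not describe.

```python
from collections import Counter
from typing import List, Tuple

# (vertical resolution, frame rate, bit rate in kbit/s); a hypothetical encoding
Format = Tuple[int, int, int]


def minimum_satisfaction(formats: List[Format]) -> Format:
    # Assumed meaning: pick the lowest format so every terminal can handle it.
    # Tuples compare element by element, so resolution dominates the ordering.
    return min(formats)


def majority_satisfaction(formats: List[Format]) -> Format:
    # Assumed meaning: pick the format negotiated by the most terminals.
    return Counter(formats).most_common(1)[0][0]


def multi_file_distribution(formats: List[Format]) -> List[Format]:
    # Assumed meaning: synthesize one stream per distinct format, highest first.
    return sorted(set(formats), reverse=True)
```

With one terminal at 720P/10/512K and two at 480P/10/384K, the first two rules would both select 480P/10/384K, while the third would yield two synthesized streams.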
  • Step 708b The converged conference processing module applies for video synthesis resources according to the format of the synthesized video.
  • Because the conference is in the forced single-stream mode, the converged conference processing module selects the media server as the video synthesis end and sends the internal synthesis resource number to the terminal media processing module.
  • Step 709b The terminal media processing module performs video synthesis according to the internal synthesized resource number, obtains a synthesized code stream, and forwards the synthesized code stream to each terminal (including single-stream terminals and multi-stream terminals).
  • Scenario 3 A converged video conference held by single-stream and multi-stream terminals
  • the video conference implementation method according to an embodiment of the present disclosure includes steps 801a to 805a.
  • Step 801a A conference joining request is received.
  • Step 802a Record the terminal information and negotiate with the terminal to establish the terminal's conference connection; wherein, the terminal information includes terminal capability information.
  • step 803a it is judged that the terminal type includes a single-stream terminal, and if it is not configured as a forced single-stream conference, a composite video format and a video synthesis terminal are selected.
  • the composite video format is selected according to preset rules.
  • The video synthesis end is selected according to a preset strategy, for example, selecting the terminal with the strongest capability as the video synthesis end.
  • When selecting the video synthesis end, at least one of the following is performed: selecting a multi-stream terminal as the video synthesis end, where one or more multi-stream terminals can be selected; selecting the media server as the video synthesis end;
  • calling an idle multi-stream terminal to act as the video synthesis end, where one or more idle multi-stream terminals can be called.
  • a multi-stream terminal is selected as the video synthesis terminal.
  • Step 804a When the video synthesizing terminal is a multi-stream terminal, send a first instruction to the selected video synthesizing terminal, where the first instruction carries video synthesizing terminal instruction information and synthesized video format information.
  • Step 805a Receive the synthesized code stream sent by the multi-stream terminal and forward it to the single-stream terminal.
  • the code stream interaction in the video conference includes: the single stream terminal sends the original code stream to the media server, and the media server delivers the synthesized code stream (from the video synthesis end) to the single stream terminal.
  • the multi-stream terminal that is not selected as the video synthesis end sends the original stream to the server, receives the original stream from other terminals forwarded by the media server, and combines the original stream to display it locally.
  • A multi-stream terminal that participates in the video conference and is selected as the video synthesis end sends its original stream to the media server, receives the original streams of the other terminals forwarded by the media server, synthesizes the original streams and displays the result locally, and also synthesizes the original streams into a synthesized code stream that it uploads to the media server.
  • The synthesized picture displayed locally and the uploaded synthesized code stream may differ.
  • Which original code streams are synthesized, the layout of the synthesized picture, and so on may be controlled by the media server.
  • A multi-stream terminal selected as the video synthesis end receives the original streams of the other terminals forwarded by the media server, synthesizes them into a synthesized code stream, and uploads the synthesized code stream to the media server.
  • Which original code streams are synthesized, the layout of the synthesized picture, and so on can be controlled by the media server.
  • In this embodiment, when a single-stream terminal participates in a video conference, the video synthesis end is determined and performs code stream synthesis for the single-stream terminal. Single-stream terminals can thus participate in the video conference, and the investment already made in them is not wasted.
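The forwarding rule described in this scenario can be summarized in a few lines. The sketch is illustrative: `downstream_for` and the string labels are hypothetical names, and audio streams are left out.

```python
from typing import Dict, List


def downstream_for(receiver: str,
                   kinds: Dict[str, str],
                   forced_single_stream: bool = False) -> List[str]:
    """kinds maps a terminal id to "single" or "multi"."""
    if forced_single_stream or kinds[receiver] == "single":
        # Single-stream terminals (and every terminal in forced single-stream
        # mode) receive the synthesized code stream.
        return ["synthesized"]
    # Multi-stream terminals receive the original streams of the other terminals
    # and compose the picture locally.
    return [f"original:{t}" for t in sorted(kinds) if t != receiver]
```

In this model the media server only forwards streams in Scenario 3; the synthesized stream it delivers was uploaded by the multi-stream terminal acting as the video synthesis end.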
  • the video conference implementation method includes steps 801b to 810b.
  • step 801b the single-stream terminal and the multi-stream terminal respectively initiate a request for joining a conference.
  • Step 802b The signaling processing module analyzes and processes the conference control signaling and determines whether the conference is configured as a forced single-stream conference. If it is not, the conference mode is set to the converged conference mode and the terminal information is sent to the terminal media processing module. The information sent includes but is not limited to the terminal type, audio port, audio format, video port, video resolution, video frame rate, video bit rate, transmission port, bandwidth, and so on.
  • the terminal information is distributed to the terminal media processing module through internal messages.
  • Step 803b The terminal media processing module records the information of the single-stream terminal and the multi-stream terminal, and feeds back the negotiated media information to the signaling processing module.
  • Step 804b The signaling processing module completes handshake negotiation with the terminal, and the terminal's conference connection is established.
  • the terminal information table is shown in Table 7 below (the following is just an example, but not limited to these information. For example, it may also include information such as transmission port and bandwidth).
  • Step 805b The terminal media processing module sends the terminal information to the converged conference processing module.
  • Step 806b The converged conference processing module obtains the terminal information from the terminal media processing module. Since this embodiment is in the converged conference mode, the single-stream terminals and the multi-stream terminals are sorted separately by capability according to the media information negotiated by each terminal. The multi-stream terminal capability ranking is used to select the video synthesis end, and the single-stream terminal capability ranking is used to select the synthesized video format.
  • the multi-stream terminal sorting table is shown in Table 8 below.
  • Sort number | Video resolution, frame rate, code rate | Terminal number | ....
  • 1 | 720P/15/1M | 1, 2 | A
  • the single-stream sorting table is shown in Table 9 below.
  • Sort number | Video resolution, frame rate, code rate | Terminal number | ....
  • 1 | 720P/10/512K | 3 | A
  • 2 | 480P/10/384K | 4, 5 | A
  • sorting may not be performed.
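The capability ranking behind Tables 8 and 9 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the numeric resolution weights, the helper names `parse_rate` and `rank_terminals`, and the tuple layout of the result are all assumptions introduced here.

```python
from collections import defaultdict

# Hypothetical weights so that "720P" ranks above "480P"; not defined in the source.
RESOLUTION_RANK = {"480P": 480, "720P": 720, "1080P": 1080}


def parse_rate(text):
    """Convert a code rate such as '512K' or '1M' to kbit/s."""
    unit = text[-1].upper()
    value = float(text[:-1])
    return int(value * 1000) if unit == "M" else int(value)


def rank_terminals(terminals):
    """Group terminals by negotiated format and rank the groups, strongest first.

    `terminals` maps terminal number -> 'resolution/frame rate/code rate'.
    Returns a list of (sort number, format, [terminal numbers]).
    """
    groups = defaultdict(list)
    for number, fmt in terminals.items():
        groups[fmt].append(number)

    def capability(fmt):
        resolution, fps, rate = fmt.split("/")
        return (RESOLUTION_RANK[resolution], int(fps), parse_rate(rate))

    ordered = sorted(groups, key=capability, reverse=True)
    return [(i, fmt, sorted(groups[fmt])) for i, fmt in enumerate(ordered, start=1)]


# Values taken from Tables 8 and 9.
table8 = rank_terminals({1: "720P/15/1M", 2: "720P/15/1M"})
table9 = rank_terminals({3: "720P/10/512K", 4: "480P/10/384K", 5: "480P/10/384K"})
```

Grouping by identical format before sorting mirrors the way the tables list several terminal numbers on one row.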
  • Step 807b The converged conference processing module selects the synthesized video format according to the preset rules and, referring to the terminal capability rankings, selects the video synthesis end according to the preset strategy.
  • the preset strategy may be: select a multi-stream terminal with strong computing power as the video synthesis terminal. It should be noted that this is only an example, and the video synthesis end can be selected according to other strategies.
  • when selection of the video synthesis end is triggered, the converged conference processing module applies the preset rules and consults the multi-stream capability ranking. In this example, multi-stream terminal A serves as one video synthesis end and multi-stream terminal B serves as another. If a backup video synthesis end is required, a different multi-stream terminal must be selected for that role. With multi-stream terminal A selected as the video synthesis end for the 512K code rate and multi-stream terminal B for the 384K code rate, the video synthesis end record table is as shown in Table 10 below.
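A record table of the kind described for Table 10 could be built as in the sketch below. The field names (`synthesis_no`, `mixer`, `format`, `distribution`) and the one-mixer-per-format assignment are assumptions made for illustration; the source only states that A handles the 512K format and B the 384K format.

```python
def build_record_table(multi_stream_ids, single_stream_ranking):
    """Assign one multi-stream terminal to each required synthesized format.

    `multi_stream_ids` is ordered by preference (for example, by computing power).
    `single_stream_ranking` is a list of (format, [single-stream terminal numbers]).
    """
    if len(multi_stream_ids) < len(single_stream_ranking):
        # Assumption: the caller falls back to the media server or an idle terminal.
        raise ValueError("not enough multi-stream terminals for the required formats")
    table = []
    pairs = zip(multi_stream_ids, single_stream_ranking)
    for synthesis_no, (mixer, (fmt, served)) in enumerate(pairs, start=1):
        table.append({
            "synthesis_no": synthesis_no,
            "mixer": mixer,                # multi-stream terminal acting as synthesis end
            "format": fmt,                 # synthesized video format it must produce
            "distribution": list(served),  # single-stream terminals it serves
        })
    return table


record_table = build_record_table(
    ["A", "B"],
    [("720P/10/512K", [3]), ("480P/10/384K", [4, 5])],
)
```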
  • Step 808b the converged conference processing module notifies the terminal media processing module of the selected video synthesis end and the synthesized video format.
  • Step 809b The terminal media processing module sends the first instruction to the multi-stream terminal, where the first instruction includes but is not limited to the following information: synthesis port, synthesized video codec type, synthesized video resolution, synthesized video frame rate, and synthesized video code rate.
  • Step 810b The terminal media processing module obtains the synthesized code stream through the server port corresponding to the video synthesis terminal, and forwards the synthesized code stream to each single stream terminal.
  • the resulting code streams of multi-stream terminal A are as follows. As shown in FIG. 8d, multi-stream terminal A receives four downstream code streams. As shown in FIG. 8e, multi-stream terminal A sends two upstream code streams: its own code stream and the synthesized code stream. In this embodiment the synthesized code stream is synthesized from the four downstream code streams. In other embodiments, it may be synthesized from one or more of the four downstream code streams and multi-stream terminal A's own code stream.
  • Scenario 4 Converged video conference processing after a single-stream terminal leaves the conference
  • the video conference implementation method according to an embodiment of the present disclosure includes steps 901a to 904a.
  • step 901a a withdrawal request of the single-stream terminal is received.
  • Step 902a Clear the related data of the single-stream terminal withdrawing from the conference.
  • Step 903a Delete the single-stream terminal's information from the video synthesis end record table, and determine whether the video synthesis end that served it (a multi-stream terminal in this case) still has any single-stream terminal to serve. If not, delete that video synthesis end from the record table, send a second instruction to the multi-stream terminal, and release the receiving port resources corresponding to the multi-stream terminal. The second instruction indicates that the multi-stream terminal is no longer the video synthesis end.
  • on receiving the second instruction, the multi-stream terminal stops video synthesis (it stops producing the synthesized code stream that is uploaded; synthesis of the stream displayed locally is unaffected) and releases the related resources, including video port resources.
  • the solution provided by this embodiment releases relevant resources in time after the single-stream terminal withdraws from the meeting to improve system performance and efficiency.
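Steps 902a and 903a amount to a small bookkeeping routine on the record table. The sketch below assumes the table is a list of dictionaries with hypothetical `mixer` and `distribution` fields; sending the second instruction and releasing ports are left to the caller.

```python
def handle_single_stream_leave(record_table, terminal_no):
    """Remove a departing single-stream terminal from the record table.

    Returns the mixers that no longer serve anyone, so that the caller can send
    them the second instruction and release the matching receiving ports.
    """
    released = []
    remaining = []
    for entry in record_table:
        if terminal_no in entry["distribution"]:
            entry["distribution"].remove(terminal_no)
            if not entry["distribution"]:
                released.append(entry["mixer"])
                continue  # drop the record of a mixer with no distribution terminals
        remaining.append(entry)
    record_table[:] = remaining  # update the caller's table in place
    return released


table = [
    {"synthesis_no": 1, "mixer": "A", "distribution": [3]},
    {"synthesis_no": 2, "mixer": "B", "distribution": [4, 5]},
]
released = handle_single_stream_leave(table, 3)
```

With these inputs the record for synthesis number 1 disappears and mixer A is reported for release, while mixer B keeps serving terminals 4 and 5.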
  • the single-stream terminal C initiates a withdrawal.
  • the video conference implementation method includes steps 901b to 907b.
  • Step 901b the single stream terminal C initiates a withdrawal request to the signaling processing module.
  • step 902b the signaling processing module processes the withdrawal request and notifies the terminal media processing module of the information of the withdrawal single-stream terminal through an internal instruction, that is, sends the single-stream terminal withdrawal instruction to the terminal media processing module.
  • Step 903b the signaling processing module notifies the converged conference processing module of the information of the terminal withdrawn through an internal instruction, that is, sends a single-stream terminal withdrawal instruction to the converged conference processing module.
  • the multi-stream terminal needs to be notified to release the video synthesis resource, so the withdrawal information of the single-stream terminal needs to be synchronously sent to the terminal media processing module and the converged conference processing module.
  • the single-stream terminal C withdraws from the meeting, leaving only the multi-stream terminal A, the multi-stream terminal B, the single-stream terminal D, and the single-stream terminal E.
  • step 904b the terminal media processing module processes the single-stream terminal withdrawal instruction to clear data related to the single-stream terminal withdrawal.
  • Step 905b The converged conference processing module receives the single-stream terminal withdrawal instruction, deletes the single-stream terminal's information from the video synthesis end record table (i.e., Table 10), and determines whether all distribution terminals of the video synthesis end that served the single-stream terminal have exited. If they have all exited, the terminal media processing module is notified that the video synthesis end has exited, and step 906b is executed. If they have not all exited, the process ends.
  • in the video synthesis end record table (i.e., Table 10), the distribution terminals of the video synthesis end with synthesis number 1 have all exited, so the record for synthesis number 1 is deleted from the record table and the terminal media processing module is notified that the video synthesis end with synthesis number 1 has exited. The updated video synthesis end record table is shown in Table 11.
  • Step 906b The terminal media processing module sends a second instruction to the multi-stream terminal to release the synthetic video receiving port resource on the server side, where the second instruction is used to instruct to cancel the multi-stream terminal as the video synthesis terminal.
  • the multi-stream terminal may be instructed to cancel as a video synthesis terminal in other ways.
  • step 907b the multi-stream terminal A receives the second instruction, stops video synthesis, and releases related resources (including video port resources).
  • the solution provided by this embodiment can timely release the resources of the video synthesis end and the resources of the server end (that is, the server end of the video conference system) serving terminal C after terminal C exits, thereby improving system performance.
  • the multi-stream terminal serving terminal C here is a multi-stream terminal participating in the video conference. In other embodiments, it may instead be an idle multi-stream terminal that was called into the conference; the processing is similar and is not described again.
  • Scenario 5 Adaptive dynamic switching for converged video conference
  • the video conference implementation method according to an embodiment of the present disclosure includes steps 1001a to 1003a.
  • step 1001a a withdrawal request initiated by a multi-stream terminal is received.
  • Step 1002a Clear the data related to the withdrawing multi-stream terminal. If the multi-stream terminal is a video synthesis end, query whether a backup video synthesis end exists. If one exists, select it as the new video synthesis end and add the single-stream terminals served by the withdrawn multi-stream terminal to the distribution terminal list of the newly selected video synthesis end. If none exists, reselect a video synthesis end; when the selected video synthesis end is a multi-stream terminal, send the first instruction to that multi-stream terminal.
  • the video synthesizer when re-selecting the video synthesizer, the video synthesizer may be selected among the multi-stream terminals that have joined the video conference, or an idle multi-stream terminal may be called as the video synthesizer, or the media server may be used as the video synthesizer.
  • Step 1003a Obtain the synthesized code stream through the server port corresponding to the video synthesis end, and forward the synthesized code stream to the single-stream terminal.
  • the solution provided by this embodiment switches promptly when a multi-stream terminal serving as the video synthesis end withdraws, so that the conference service is not terminated by the exit or abnormality of that multi-stream terminal, which ensures the continuity of the system's functions.
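The switch-over in step 1002a can be sketched as below. Treating the first remaining record as the backup video synthesis end is an assumption made for brevity, as are the field names and the returned action strings.

```python
def handle_mixer_leave(record_table, leaving_mixer):
    """Handle the withdrawal of a multi-stream terminal.

    Returns (action, orphans): action is 'none' when the terminal was not a
    synthesis end, 'switched' when a backup took over its distribution
    terminals, and 'reselect' when a new synthesis end must be chosen.
    """
    entry = next((e for e in record_table if e["mixer"] == leaving_mixer), None)
    if entry is None:
        return "none", []
    record_table.remove(entry)
    orphans = list(entry["distribution"])
    if record_table:
        backup = record_table[0]  # assumption: first remaining record is the backup
        backup["distribution"].extend(orphans)
        return "switched", orphans
    return "reselect", orphans


table = [
    {"synthesis_no": 1, "mixer": "A", "distribution": [3]},
    {"synthesis_no": 2, "mixer": "B", "distribution": [4, 5]},
]
action, orphans = handle_mixer_leave(table, "A")
```

A 'reselect' result corresponds to the branch where a participating multi-stream terminal, an idle multi-stream terminal, or the media server has to be chosen instead.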
  • the video conference implementation method includes:
  • Step 1001b the multi-stream terminal initiates a withdrawal request to the signaling processing module.
  • Step 1002b the signaling processing module processes the withdrawal request, and notifies the terminal media processing module of the withdrawal terminal information through an internal instruction, that is, sends the multi-stream terminal withdrawal instruction to the terminal media processing module.
  • Step 1003b the signaling processing module notifies the converged conference processing module of the information of the terminal withdrawn through an internal instruction. That is, a multi-stream terminal withdrawal instruction is sent to the converged conference processing module.
  • Step 1004b the terminal media processing module processes the multi-stream terminal withdrawal instruction, and clears the data related to the multi-stream terminal withdrawal.
  • Step 1005b The converged conference processing module receives the multi-stream terminal withdrawal instruction and determines whether the conference is a forced single-stream conference.
  • if it is a forced single-stream conference, the media server is the video synthesis end, so no processing is performed.
  • if it is not a forced single-stream conference and the multi-stream terminal is a video synthesis end, the multi-stream terminal is deleted from the video synthesis end record table, and the module determines whether a backup video synthesis end exists.
  • if a backup video synthesis end exists, it is used as the new video synthesis end and a switching instruction is sent to the terminal media processing module, indicating that the backup video synthesis end replaces the withdrawn multi-stream terminal.
  • if no backup video synthesis end exists, the module selects a new video synthesis end that meets the requirements from the multi-stream terminals that have joined the video conference, fills the new video synthesis end's information into the video synthesis end record table, and notifies the terminal media processing module. If no new video synthesis end can be selected, an idle multi-stream terminal is called to serve the single-stream terminals as the new video synthesis end, and a video synthesis end selection instruction is sent to the terminal media processing module.
  • the record of the synthesis number 1 in the record table of the video synthesis end is deleted.
  • the video synthesis terminal of the synthesis serial number 2 is selected as the video synthesis terminal of the single stream terminal 3, and the single stream terminal 3 is added to the distribution terminal serial number item of the synthesis serial number 2.
  • the updated video synthesis terminal recording table is shown in Table 12.
  • Step 1006b If the terminal media processing module receives the switching instruction, it forwards the synthesized code stream according to the new distribution terminal number list. If it receives the video synthesis end selection instruction, it sends the first instruction to the multi-stream terminal, where the first instruction includes but is not limited to the following information: synthesis port, synthesized video codec type, synthesized video resolution, synthesized video frame rate, and synthesized video code rate.
  • the subsequent terminal media processing module obtains the synthesized code stream through the server port corresponding to the video synthesis end, and forwards the synthesized code stream to each single-stream terminal.
  • the video conference implementation method according to an embodiment of the present disclosure includes steps 1101a to 1103a.
  • step 1101a when the video synthesis terminal is selected, if the idle multi-stream terminal table is configured, the idle multi-stream terminal is selected in the idle multi-stream terminal table, and an outgoing instruction is issued to the selected idle multi-stream terminal.
  • an idle multi-stream terminal that meets the code stream synthesis requirements and is in a normal state is selected to initiate an outgoing call instruction.
  • Step 1102a after the idle multi-stream terminal joins the conference, use the multi-stream terminal as a video synthesizer, add the multi-stream terminal to the video synthesizer record table, and issue a first instruction to the multi-stream terminal; wherein, The first instruction carries video synthesis terminal indication information and video synthesis information.
  • Step 1103a Receive the synthesized code stream uploaded by the multi-stream terminal, and send the synthesized code stream to the single-stream terminal.
  • the video conference implementation method includes steps 1101b to 1105b.
  • Step 1101b When the converged conference processing module selects the video synthesis end, if an idle multi-stream terminal table is configured, it preferentially selects from that table an idle multi-stream terminal that satisfies the code stream synthesis requirements and is in a normal state, sends an outbound call instruction to the signaling processing module, and updates the terminal's state in the idle multi-stream terminal table to the use state.
  • the status can include three states: idle, use, and offline.
  • the idle multi-stream terminal table is shown in Table 13 below, and the selected idle multi-stream terminal is the serial number 2 multi-stream terminal.
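Selection from the idle multi-stream terminal table can be sketched as follows. The three state names follow the text; the `max_streams` field used to model "satisfies the code stream synthesis requirements" and the other field names are hypothetical, since Table 13 itself is not reproduced here.

```python
IDLE, IN_USE, OFFLINE = "idle", "use", "offline"


def pick_idle_terminal(idle_table, required_streams):
    """Return the serial number of the first usable idle terminal, or None.

    A terminal is usable when its state is idle and it can take at least
    `required_streams` input streams (a hypothetical capability field).
    The chosen row is marked as in use before the outbound call is issued.
    """
    for row in idle_table:
        if row["state"] == IDLE and row["max_streams"] >= required_streams:
            row["state"] = IN_USE
            return row["no"]
    return None


idle_table = [
    {"no": 1, "state": OFFLINE, "max_streams": 4},
    {"no": 2, "state": IDLE, "max_streams": 4},
    {"no": 3, "state": IDLE, "max_streams": 2},
]
chosen = pick_idle_terminal(idle_table, required_streams=4)
```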
  • Step 1102b The signaling processing module places an outbound call to the selected idle multi-stream terminal, using the terminal information from the idle multi-stream terminal table, so that it joins the conference. Because the called idle terminal does not take part in the video conference, it does not need to send its own audio or video stream to the server. The media negotiation signaling carried by the outbound call therefore contains neither upstream audio stream information nor upstream video stream information for the terminal's own media.
  • Step 1103b After the outgoing call is successful, the signaling processing module notifies the fusion conference processing module through an internal instruction.
  • step 1104b the converged conference processing module updates the record table of the video synthesis terminal, and notifies the terminal media processing module that the idle multi-stream terminal that is called is selected as the video synthesis terminal.
  • Step 1105b the terminal media processing module sends the first instruction to the multi-stream terminal, indicating that the multi-stream terminal is selected as the video synthesis terminal.
  • the terminal media processing module obtains the synthesized code stream through the server port corresponding to the video synthesis end, and forwards the synthesized code stream to the single-stream terminal.
  • the final code stream of the idle multi-stream terminal is as follows: as shown in FIG. 11c, the idle multi-stream terminal will receive four code streams; as shown in FIG. 11d, the idle multi-stream terminal will upload one code stream.
  • the code stream is a composite code stream.
  • an idle multi-stream terminal is called as a video synthesizing terminal, which can expand the processing capacity of the entire video conference system, thereby being able to accommodate more terminals to access the video conference system.
  • an embodiment of the present disclosure provides a video conference implementation device 120, including a memory 1210 and a processor 1220, where the memory 1210 stores a program that, when read and executed by the processor 1220, implements the video conference implementation method described in any embodiment.
  • an embodiment of the present disclosure provides a computer-readable storage medium 130 that stores one or more programs 131, and the one or more programs 131 can be executed by one or more processors to implement the video conference implementation method described in any embodiment.
  • the term computer storage medium includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data.
  • Computer storage media include but are not limited to RAM, ROM, EEPROM, flash memory or other memory technologies, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cartridges, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • a communication medium generally contains computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transmission mechanism, and may include any information delivery medium.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Telephonic Communication Services (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

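The present application discloses a video conference implementation method and device, a video conference system, and a storage medium. The video conference implementation method includes: determining the types of the terminals participating in a video conference; when the participating terminals include a single-stream terminal, determining a video synthesis end; and sending the synthesized code stream generated by the video synthesis end to the single-stream terminal, where the synthesized code stream is synthesized from the original code streams of the terminals participating in the video conference.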

Description

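Video conference implementation method and device, video conference system, and storage medium
Technical Field
Embodiments of the present disclosure relate to, but are not limited to, the technical field of video conferencing.
Background
Multimedia communication has long been a goal of the communications field and continues to be refined. The development of 4G/5G communication technology and of Internet and mobile Internet technology has created an opportunity for customers to use multimedia communication widely. Multi-party video conferencing is being applied ever more widely in the telecommunications, Internet, and mobile Internet fields, and brings great convenience to people's work and life.
A multi-party video conference system has two modes: the single-stream conference mode and the multi-stream conference mode. The single-stream conference mode relies on the single-stream server maintaining a one-to-one stream with each single-stream terminal. The single-stream server receives the incoming audio and video code streams from every single-stream terminal, combines them into one audio and video stream through decoding and encoding, and sends that stream back to every single-stream terminal. The drawback of the single-stream conference mode is that video encoding and decoding on the single-stream server is expensive and the cost of synthesis is high, which gave rise to the multi-stream conference mode.
In the multi-stream conference mode, the multi-stream server acts as a media distribution unit responsible for conference control and audio mixing. For video code streams the multi-stream server only forwards: each multi-stream terminal sends one code stream and receives the upstream code streams of the other multi-stream terminals forwarded by the multi-stream server, and the multi-stream terminal itself decodes, synthesizes, and displays them. The multi-stream conference mode is a cheaper and more scalable multi-party conference mode, but it places requirements on the terminals that join. A single-stream terminal cannot join a multi-stream conference in this mode, mainly because the mode requires every participating terminal to be able to synthesize and display multiple video code streams, whereas a single-stream terminal has no video synthesis capability. This limits the application scenarios of multi-stream video conferencing.
Summary
At least one embodiment of the present disclosure provides a video conference implementation method and device, a video conference system, and a computer-readable storage medium that enable single-stream terminals to access the conference.
At least one embodiment of the present disclosure provides a video conference implementation method, including: determining the types of the terminals participating in a video conference; when the participating terminals include a single-stream terminal, determining a video synthesis end; and sending the synthesized code stream generated by the video synthesis end to the single-stream terminal, where the synthesized code stream is synthesized from the original code streams of the terminals participating in the video conference.
At least one embodiment of the present disclosure provides a video conference implementation method, including: a multi-stream terminal receiving original code streams sent by a media server, generating a synthesized code stream from the original code streams, and uploading the synthesized code stream to the media server, so that the media server sends the synthesized code stream to a single-stream terminal.
At least one embodiment of the present disclosure provides a video conference implementation device, including a memory and a processor, where the memory stores a program that, when read and executed by the processor, implements the video conference implementation method described in any embodiment.
At least one embodiment of the present disclosure provides a computer-readable storage medium that stores one or more programs, and the one or more programs can be executed by one or more processors to implement any of the video conference implementation methods described.
The present disclosure provides a video conference implementation system, including: a signaling processing module configured to handle terminals joining and withdrawing from the conference and to send the information of joined terminals to a media processing component; and the media processing component, configured to determine the types of the terminals participating in the video conference according to the terminal information, to determine a video synthesis end when the participating terminals include a single-stream terminal, and to send the synthesized code stream generated by the video synthesis end to the single-stream terminal, where the synthesized code stream is synthesized from the original code streams of the terminals participating in the video conference.
Other features and advantages of the present disclosure will be set forth in the description that follows, and in part will become apparent from the description or be understood by practicing the present disclosure. The objectives and other advantages of the present disclosure can be realized and obtained through the structures particularly pointed out in the description, the claims, and the drawings.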
附图说明
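FIG. 1 is a schematic diagram of converged conference networking.
FIG. 2 is a flowchart of a video conference implementation method provided by an embodiment of the present disclosure (server side).
FIG. 3a is a schematic diagram of the code streams delivered by the server side in an embodiment of the present disclosure.
FIG. 3b is a schematic diagram of the synthesized code streams uploaded by a multi-stream terminal in an embodiment of the present disclosure.
FIG. 4 is a flowchart of a video conference implementation method provided by an embodiment of the present disclosure (multi-stream terminal side).
FIG. 5 is a block diagram of a video conference system provided by an embodiment of the present disclosure.
FIG. 6a is a flowchart of a video conference with only single-stream terminals provided by an embodiment of the present disclosure.
FIG. 6b is a flowchart of a video conference with only single-stream terminals provided by an application example of the present disclosure.
FIG. 6c is a schematic diagram of networking with only single-stream terminals provided by an embodiment of the present disclosure.
FIG. 7a is a flowchart of a forced single-stream terminal video conference provided by an embodiment of the present disclosure.
FIG. 7b is a flowchart of a forced single-stream terminal video conference provided by an application example of the present disclosure.
FIG. 7c is a schematic diagram of forced single-stream terminal networking provided by an embodiment of the present disclosure.
FIG. 8a is a schematic diagram of single-stream and multi-stream terminals holding a converged video conference, provided by an embodiment of the present disclosure.
FIG. 8b is a schematic diagram of single-stream and multi-stream terminals holding a converged video conference, provided by an application example of the present disclosure.
FIG. 8c is a schematic diagram of networking with single-stream and multi-stream terminals provided by an embodiment of the present disclosure.
FIG. 8d is a schematic diagram of the downstream code streams received by a multi-stream terminal, provided by an application example of the present disclosure.
FIG. 8e is a schematic diagram of the code streams uploaded by a multi-stream terminal, provided by an application example of the present disclosure.
FIG. 9a is a schematic diagram of converged video conference processing after a single-stream terminal withdraws, provided by an embodiment of the present disclosure.
FIG. 9b is a schematic diagram of converged video conference processing after a single-stream terminal withdraws, provided by an application example of the present disclosure.
FIG. 9c is a schematic diagram of the networking after a single-stream terminal withdraws, provided by an application example of the present disclosure.
FIG. 10a is a schematic diagram of converged video conference processing after a multi-stream terminal withdraws, provided by an embodiment of the present disclosure.
FIG. 10b is a schematic diagram of converged video conference processing after a multi-stream terminal withdraws, provided by an application example of the present disclosure.
FIG. 10c is a schematic diagram of the networking after a multi-stream terminal withdraws, provided by an application example of the present disclosure.
FIG. 11a is a schematic diagram of the processing of an outbound call to an idle multi-stream terminal, provided by an embodiment of the present disclosure.
FIG. 11b is a schematic diagram of the processing of an outbound call to an idle multi-stream terminal, provided by an application example of the present disclosure.
FIG. 11c is a schematic diagram of the code streams received by the called idle multi-stream terminal, provided by an application example of the present disclosure.
FIG. 11d is a schematic diagram of the code stream uploaded by the called idle multi-stream terminal, provided by an application example of the present disclosure.
FIG. 12 is a block diagram of a video conference implementation device provided by an embodiment of the present disclosure.
FIG. 13 is a block diagram of a computer-readable storage medium provided by an embodiment of the present disclosure.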
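Detailed Description
To make the objectives, technical solutions, and advantages of the present disclosure clearer, the embodiments of the present disclosure are described in further detail below through specific implementations with reference to the drawings. The specific embodiments described here are only intended to explain the present disclosure and not to limit it.
The steps shown in the flowcharts of the drawings may be executed in a computer system such as a set of computer-executable instructions. Although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in a different order.
At present, the market contains a large number of traditional video terminals, all of which are single-stream terminals. If they cannot access multi-stream conferences, a great deal is wasted; from the standpoint of protecting users' investment, it is also necessary to support single-stream terminals accessing multi-stream video conferences.
At least one embodiment of the present disclosure provides a video conference implementation method that supports simultaneous access by single-stream and multi-stream terminals: the terminal types are determined, and when a single-stream terminal is present, a video synthesis end is determined for it, thereby enabling the single-stream terminal to access the conference.
FIG. 1 is a schematic diagram of converged conference networking. As shown in FIG. 1, the networking has three parts. The first is the terminals 101, which include single-stream terminals and multi-stream terminals. The second is the signaling access and bearer network 102, which handles signaling access and media bearing for the multi-stream and single-stream terminals. Signaling access handles the signaling of the various terminals, for example through an MGCF (Media Gateway Control Function), an SS (Soft Switch), or an SIU (Signaling Interface Unit); the media bearer network handles media access, for example through an MGW (Media Gateway). The third is the video conference system 103, which includes an AS (Application Server) and an MRF (Media Resource Function, i.e., the media server). In the figure, solid lines are media streams and dashed lines are control signaling. The AS performs signaling processing and service processing; the MRF receives control signaling from the AS and performs the corresponding media control functions, including but not limited to joining and withdrawing from a conference, conference chair, conference control, audio playback, video playback, in-band and out-of-band digit collection, audio conferencing, and video conferencing.
FIG. 2 is a flowchart of a video conference implementation method provided by an embodiment of the present disclosure. As shown in FIG. 2, the method includes step 201 and step 202.
Step 201: determine the types of the terminals participating in the video conference.
Step 202: when the terminals participating in the video conference include a single-stream terminal, determine a video synthesis end and send the synthesized code stream generated by the video synthesis end to the single-stream terminal, where the synthesized code stream is synthesized from the original code streams of the terminals participating in the video conference.
In the present disclosure, a single-stream terminal is a terminal without video synthesis capability, and a multi-stream terminal is a terminal with video synthesis capability.
In the solution of this embodiment, when single-stream terminals are present in the video conference, a video synthesis end is determined and performs video synthesis for them, so that single-stream terminals can access the conference and existing single-stream terminals are not wasted.
In some embodiments, the video conference implementation method further includes: when the terminals participating in the video conference include a multi-stream terminal, sending the original code streams generated by the other participating terminals to the multi-stream terminal. No video synthesis is needed for a multi-stream terminal; the original code streams are sent directly and the multi-stream terminal performs the synthesis itself.
In some embodiments, determining the types of the terminals participating in the video conference includes: determining the type of each participating terminal from the terminal information it reports. A terminal must report its terminal information when it accesses the video conference, and the terminal type can be determined from that information. For example, the terminal information reported by a multi-stream terminal usually contains multiple video media descriptions (multiple lines beginning with "m=video"), whereas the terminal information reported by a single-stream terminal usually contains one video media description (one line beginning with "m=video"). Single-stream and multi-stream terminals can therefore be distinguished on this basis. A terminal may also carry terminal type identification information in its report to indicate its type.
As the multi-stream negotiation example below shows, a multi-stream terminal carries audio and video capabilities. Its audio capability supports SILK, G711, and AMR-WB, and its video capability contains multiple video media descriptions (the "m=video" lines in the example), which indicates a multi-stream terminal. The video capability description includes parameters such as video port, video codec, video resolution, transmission port, bandwidth, and send/receive mode; these parameters determine which kinds of video the multi-stream terminal can support.
An example of multi-stream terminal negotiation signaling follows; only the m=video lines are shown and the rest is omitted.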
m=video 20594 RTP/AVP 118
m=video 20620 RTP/AVP 117
m=video 20646 RTP/AVP 117
m=video 20672 RTP/AVP 117
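A single-stream terminal generally carries one video media description (the "m=video" line in the example below). Such a terminal has no video synthesis capability, that is, it is a single-stream terminal. As the single-stream negotiation example below shows, the terminal carries audio and video capabilities: its audio capability is G711, and its video capability description includes parameters such as video port, video codec, video resolution, transmission port, bandwidth, and send/receive mode. These parameters determine which video formats the single-stream terminal can support.
An example of single-stream terminal negotiation signaling follows; only the m=video line is shown and the rest is omitted.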
m=video 20020 RTP/AVP 106
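The rule described above, counting the "m=video" lines in the reported negotiation signaling, can be sketched as follows. The function name and the returned labels are illustrative assumptions; a terminal may also report an explicit type identifier, which this sketch does not model.

```python
def classify_terminal(sdp_text):
    """Classify a terminal by the number of video media descriptions it reports."""
    video_lines = [
        line for line in sdp_text.splitlines()
        if line.strip().startswith("m=video")
    ]
    if not video_lines:
        return "unknown"  # no video description reported at all
    return "multi-stream" if len(video_lines) > 1 else "single-stream"


multi_stream_offer = """m=video 20594 RTP/AVP 118
m=video 20620 RTP/AVP 117
m=video 20646 RTP/AVP 117
m=video 20672 RTP/AVP 117"""
single_stream_offer = "m=video 20020 RTP/AVP 106"

multi_kind = classify_terminal(multi_stream_offer)
single_kind = classify_terminal(single_stream_offer)
```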
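In some embodiments, the video synthesis end includes a media server and/or a multi-stream terminal. That is, the media server may serve as the video synthesis end, a multi-stream terminal may serve as the video synthesis end, or both may. Using a multi-stream terminal as the video synthesis end makes full use of its video synthesis capability, which saves computing capacity on the media server, improves the media server's processing performance, and makes it convenient to hold converged conferences in which single-stream and multi-stream terminals take part at the same time.
In some embodiments, when multi-stream terminals serve as the video synthesis end, one or more multi-stream terminals may be used. Several multi-stream terminals may each provide a synthesized code stream in a different format to meet the needs of different single-stream terminals. For example, when the participating single-stream terminals need several code stream formats, different multi-stream terminals can synthesize the video code streams for the different format requirements. Alternatively, each of several multi-stream terminals may provide synthesized code streams in several formats, and so on.
In addition, a multi-stream terminal serving as the video synthesis end may be one that participates in the video conference or an idle one. An idle multi-stream terminal acts only as the video synthesis end and does not upload the audio and video streams it captures; the server side of the video conference system actively calls it into the conference.
In some embodiments, determining the video synthesis end includes: selecting one or more multi-stream terminals from those participating in the video conference as the video synthesis end. The selection may follow a preset strategy applied to the terminals that have joined the video conference. The preset strategy may be to select the video synthesis end according to terminal computing power; this is only an example, and other strategies may be set.
In some embodiments, determining the video synthesis end includes: calling an idle multi-stream terminal into the conference and then using the called multi-stream terminal as the video synthesis end. A list of idle multi-stream terminals may be configured in the video conference system, and the multi-stream terminals in that list are called into the conference. For example, when only single-stream terminals have joined the video conference, an idle multi-stream terminal is called into the conference and used as the video synthesis end in order to reduce the load on the media server of the video conference system, so that the single-stream terminals can still hold a video conference. The solution of this embodiment expands the processing capacity of the whole video conference system, which can therefore accommodate more terminals.
For a video conference in which only single-stream terminals participate, no multi-stream terminal is available when video synthesis resources are requested. Two approaches are possible. In the first, the media server's own synthesis resources are requested and the media server performs the video synthesis. In the second, the converged conference processing module queries the idle multi-stream terminal configuration table and calls an idle multi-stream terminal into the conference; once the call succeeds, the newly joined multi-stream terminal is set as the video synthesis end and provides video synthesis for the single-stream terminals. The idle multi-stream terminal configuration table includes the following information: terminal number, terminal IP, terminal port (PORT), terminal name, transport protocol, control protocol, and whether SVC (Scalable Video Coding) is supported. A multi-stream terminal reached by outbound call is used only for its video synthesis capability. Because the video it captures does not take part in the synthesis, it does not need to upload its own captured audio or video code streams, and no audio code stream exchange is needed. An example of outbound call negotiation follows.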
v=0
o=ZTE-SOMTMS 4245362188 4245362189 IN IP4 192.168.1.118
s=session SDP
c=IN IP4 192.168.1.118
t=0 0
m=video 20594 RTP/AVP 118
b=AS:3584
a=rtpmap:118 H264/90000
a=fmtp:118 profile-level-id=64001f;packetization-mode=1;MaxBR=35840
a=recvonly
a=zimeclient
m=video 20620 RTP/AVP 117
b=AS:3584
a=rtpmap:117 H264/90000
a=fmtp:117 profile-level-id=42e01f;packetization-mode=1;MaxBR=35840
a=recvonly
a=zimeclient
m=video 20646 RTP/AVP 117
b=AS:3584
a=rtpmap:117 H264/90000
a=fmtp:117 profile-level-id=42e01f;packetization-mode=1;MaxBR=35840
a=recvonly
a=zimeclient
m=video 20672 RTP/AVP 117
b=AS:3584
a=rtpmap:117 H264/90000
a=fmtp:117 profile-level-id=42e01f;packetization-mode=1;MaxBR=35840
a=recvonly
m=video 20678 RTP/AVP 117
b=AS:3584
a=rtpmap:117 H264/90000
a=fmtp:117 profile-level-id=42e01f;packetization-mode=1;MaxBR=35840
a=sendonly
a=videoconfmixer
a=zimeclient
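The final media block above (the m=video line on port 20678 marked a=sendonly and a=videoconfmixer), shown in italics in the original, is the added signaling. It indicates that the idle multi-stream terminal is the video synthesis end and gives the corresponding video synthesis format.
The outbound call negotiation signaling contains four receive-only downstream code streams, marked "recvonly". The called multi-stream terminal synthesizes the four downstream code streams it receives; there may also be more than four. Similarly, if several idle terminals are configured, several terminals may be called as the situation requires. For a multi-stream terminal that supports SVC, the synthesized code stream can carry several formats at once. In that case a single multi-stream terminal that supports SVC code streams can serve single-stream terminals with several different video format requirements.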
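The structure of the outbound negotiation can be checked with a small parser such as the sketch below. It handles only the attribute lines that matter here; the dictionary layout and function name are assumptions, and `a=videoconfmixer` is the attribute introduced by this disclosure rather than a standard SDP attribute. The default direction of `sendrecv` follows the SDP specification.

```python
DIRECTIONS = {"a=recvonly", "a=sendonly", "a=sendrecv", "a=inactive"}


def parse_video_sections(sdp_text):
    """Return one dict per m=video section with its port, direction and mixer flag."""
    sections = []
    for raw in sdp_text.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            media, port = line[2:].split()[:2]
            sections.append({
                "media": media,
                "port": int(port),
                "direction": "sendrecv",  # SDP default when no direction is given
                "mixer": False,
            })
        elif sections and line in DIRECTIONS:
            sections[-1]["direction"] = line[2:]
        elif sections and line == "a=videoconfmixer":
            sections[-1]["mixer"] = True
    return [s for s in sections if s["media"] == "video"]


offer = """v=0
m=video 20594 RTP/AVP 118
a=recvonly
m=video 20620 RTP/AVP 117
a=recvonly
m=video 20678 RTP/AVP 117
a=sendonly
a=videoconfmixer
"""
sections = parse_video_sections(offer)
downstream_ports = [s["port"] for s in sections if s["direction"] == "recvonly"]
mixer_ports = [s["port"] for s in sections if s["mixer"]]
```

Applied to the full example above, the same function would report four receive-only sections and port 20678 as the synthesized upstream channel.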
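In some embodiments, the media server of the video conference system may also serve as the video synthesis end. That is, the media server is used as the video synthesis end, or an idle multi-stream terminal is called to serve as the video synthesis end, or both the media server and the called idle multi-stream terminal serve as video synthesis ends; one or more idle multi-stream terminals may be called.
In other embodiments, the media server, multi-stream terminals that have joined the video conference (which participate in the conference and must upload the audio and video streams they capture), and idle multi-stream terminals that have been called in (which do not participate in the conference and do not upload the audio and video streams they capture) may all serve as video synthesis ends. In other words, the video synthesis end may be one or more of the media server, a multi-stream terminal that has joined the video conference, and an idle multi-stream terminal.
Selection of the video synthesis end may be triggered when a single-stream terminal joins, when all terminals have joined the video conference, or when a multi-stream terminal serving as the video synthesis end withdraws or becomes abnormal. After selecting the video synthesis end, the media server may generate a video synthesis end record table that records the information of the video synthesis end and the single-stream terminals it serves, i.e., the single-stream terminals for which it provides code stream synthesis. After the video synthesis end sends the synthesized code stream it generates to the media server, the media server sends that stream to those single-stream terminals. Hereinafter, the single-stream terminals served by a video synthesis end are called its distribution terminals.
In some embodiments, after the video synthesis end is determined, the video conference implementation method further includes: when the video synthesis end is a multi-stream terminal, sending the multi-stream terminal indication information indicating that it has been selected as the video synthesis end. The indication information may be delivered to the multi-stream terminal through renegotiation signaling, for example by adding an "a=videoconfmixer" line to the renegotiation signaling to show that the multi-stream terminal is used as the video synthesis end. This is only an example; the multi-stream terminal may be notified in other ways.
In some embodiments, after the types of the terminals participating in the video conference are determined, the video conference implementation method further includes: determining synthesized video format information according to preset rules, where the synthesized video format information is used by the video synthesis end to generate the synthesized code stream.
The preset rules include but are not limited to: the majority-satisfaction principle, the minimum-satisfaction principle, the priority-satisfaction principle, and the multi-tier allocation principle. Each principle is described in detail in Table 1 below.
Table 1 Preset principle table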
Figure PCTCN2019120230-appb-000001
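In some embodiments, after the video synthesis end is determined, the video conference implementation method further includes: sending a first instruction to the video synthesis end, where the first instruction carries the synthesized video format information.
In some embodiments, the synthesized video format information includes at least one of the following: synthesized video transmission port, synthesized video codec type, bandwidth, send/receive mode, synthesized video resolution, synthesized video code rate, and synthesized video frame rate.
The code stream exchange between a terminal and the media server is described below. Between the terminal and the media server there are upstream audio and video code streams and downstream audio and video code streams. In the related art, the downstream code streams received by a multi-stream terminal are the original code streams of the other terminals that need to be synthesized, and its upstream video code stream is the original code stream it captures itself. In the embodiments of the present disclosure, if a multi-stream terminal is selected as the video synthesis end, the downstream code streams it receives are still the original code streams of the other terminals, but its upstream code streams include, in addition to its own original code stream, the synthesized code stream it produces. The synthesized code stream uploaded by the multi-stream terminal is forwarded to the single-stream terminals, which display it to their users.
Each downstream code stream delivered by the media server is the single picture of another terminal. Of the upstream code streams sent by the terminal, one is the terminal's own picture and the rest are synthesized pictures with different resolutions, code rates, and frame rates.
FIG. 3a shows the images of the four code streams that the media server forwards to the terminal. Each delivered code stream is a single-picture image taken from the upstream original code stream of another terminal.
The correspondence between ports and displayed images for the code streams delivered by the media server is shown in Table 2 below.
Table 2 Correspondence table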
Figure PCTCN2019120230-appb-000002
Figure PCTCN2019120230-appb-000003
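The terminal's four upstream code streams are shown in FIG. 3b. One stream is the terminal's own image and the other three are synthesized images. FIG. 3b shows a four-way split-screen layout; other layouts are possible, but each synthesized image is still carried as a single code stream. The three synthesized images differ in resolution, code rate, and frame rate so that the terminal media processing module can forward a stream of suitable size to each terminal. The number of upstream code streams is also variable and depends on the chosen strategy: there may be three synthesized streams, one or two, or more. If the terminal supports SVC, one upstream code stream can contain SVC video streams of multiple resolutions, code rates, and frame rates, and the media server extracts and forwards the SVC streams of the required video formats. In some embodiments, following the principle of preferring terminal capability, one upstream synthesized code stream from the multi-stream terminal is sufficient.
The correspondence between ports and displayed images for the code streams uploaded by the terminal is shown in Table 3 below.
Table 3 Correspondence table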
Figure PCTCN2019120230-appb-000004
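In some embodiments, the video conference implementation method further includes: when the video synthesis end is a multi-stream terminal, recording the single-stream terminals served by that video synthesis end. When all single-stream terminals served by a multi-stream terminal acting as the video synthesis end have withdrawn, a second instruction is sent to the multi-stream terminal to cancel its role as the video synthesis end. The second instruction may be implemented through renegotiation that sets the synthesis port to 0. This is only an example; the multi-stream terminal may be notified of the cancellation in other ways.
In some embodiments, several video synthesis ends are determined and back each other up. When one video synthesis end withdraws or becomes abnormal, the system switches to another, so that the switch is adaptive and hardly noticeable to end users, which protects the video quality of the converged conference. For example, at least two video synthesis ends are selected, with both a multi-stream terminal and the media server serving as video synthesis ends; their synthesis results do not differ in practice, so they can back each other up. If the multi-stream terminal exits abnormally, the media server can still produce the synthesized code stream that the single-stream terminals need, which meets the requirements of the converged video conference. In some embodiments, when other multi-stream terminals are available, a multi-stream terminal is preferred as the video synthesis end, which saves media server processing capacity and improves overall service performance.
In some embodiments, the video conference implementation method further includes: when a multi-stream terminal serving as the video synthesis end withdraws or becomes abnormal, determining the video synthesis end again. An abnormality includes, for example, congestion caused by changing network conditions that prevents the selected multi-stream terminal from continuing to provide video synthesis. In this embodiment, when the video synthesis end withdraws or becomes abnormal, the system switches dynamically to another video synthesis end, which limits the adverse impact, keeps the conference service from terminating because a multi-stream terminal (video synthesis end) exits, and ensures the continuity of the system's functions.
In some embodiments, determining the video synthesis end again includes: using a backup video synthesis end as the new video synthesis end, or selecting a video synthesis end from the participating multi-stream terminals that have not been selected as a video synthesis end, or calling in an idle multi-stream terminal as the video synthesis end. That is, a backup video synthesis end may be set when the video synthesis end is selected, and when the current video synthesis end withdraws or becomes abnormal, the backup performs the code stream synthesis.
In some embodiments, determining the video synthesis end includes: when the configured conference mode is a preset mode, using the media server as the video synthesis end. For example, if the preset mode is the forced single-stream mode, the video conference is held as a single-stream conference and the media server serves as the video synthesis end.
The video conference implementation method further includes: when the terminals participating in the video conference include a multi-stream terminal, sending the synthesized code stream generated by the media server to the multi-stream terminal. In other words, in the forced single-stream mode a multi-stream terminal is also treated as a single-stream terminal.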
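The alternatives for determining the video synthesis end can be condensed into one decision function. The sketch below is an illustration under stated assumptions: the priority order (forced single-stream first, then participating multi-stream terminals, then idle ones, then the media server) is one plausible reading, and the function name and return labels are hypothetical.

```python
def choose_synthesis_end(forced_single_stream, multi_stream_ids, idle_multi_stream_ids=()):
    """Return (kind, terminal id) for the video synthesis end.

    `multi_stream_ids` lists participating multi-stream terminals in order of
    preference; `idle_multi_stream_ids` lists configured idle terminals.
    """
    if forced_single_stream:
        # Forced single-stream mode: the media server always synthesizes.
        return ("media_server", None)
    if multi_stream_ids:
        return ("multi_stream_terminal", multi_stream_ids[0])
    if idle_multi_stream_ids:
        # Only single-stream terminals joined: call in an idle multi-stream terminal.
        return ("idle_multi_stream_terminal", idle_multi_stream_ids[0])
    return ("media_server", None)


forced = choose_synthesis_end(True, ["A", "B"])
converged = choose_synthesis_end(False, ["A", "B"])
pure_single = choose_synthesis_end(False, [], ["idle-2"])
fallback = choose_synthesis_end(False, [])
```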
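As shown in FIG. 4, another embodiment of the present disclosure provides a video conference implementation method that includes step 401 and step 402.
Step 401: a multi-stream terminal receives original code streams and generates a synthesized code stream from them.
Step 402: the multi-stream terminal uploads the synthesized code stream to the media server, so that the media server sends the synthesized code stream to a single-stream terminal.
In the solution of this embodiment, the multi-stream terminal acts as the video synthesis end and uploads the synthesized code stream to the media server, which supports single-stream terminals joining a multi-stream conference.
In some embodiments, in step 401, before the multi-stream terminal receives the original code streams, the method further includes: the multi-stream terminal receives a request to join and joins the video conference. That is, an idle multi-stream terminal responds to the request, joins the video conference, and performs code stream synthesis as the video synthesis end.
In some embodiments, in step 401, generating the synthesized code stream from the original code streams includes: generating the synthesized code stream from the original code streams after receiving a first instruction indicating that this device (i.e., the multi-stream terminal) has been selected as the video synthesis end. After selecting a multi-stream terminal as the video synthesis end, the media server sends an instruction to notify the multi-stream terminal of its selection.
In some embodiments, the first instruction carries video synthesis information, and generating the synthesized code stream includes: generating the synthesized code stream according to the video synthesis information in the first instruction.
In some embodiments, the first instruction includes at least one of the following: synthesized video transmission port, synthesized video codec type, bandwidth, send/receive mode, synthesized video resolution, synthesized video code rate, and synthesized video frame rate.
In some embodiments, the video conference implementation method further includes: on receiving a second instruction that cancels its role as the video synthesis end, the multi-stream terminal stops generating the synthesized code stream and releases the related resources, i.e., the resources used to generate and upload the synthesized code stream.
In some embodiments, the synthesized code stream is one synthesized code stream or multiple synthesized code streams, and one synthesized code stream includes code streams in one or more video formats. One synthesized code stream may have a single resolution or may be an SVC (Scalable Video Coding) stream with multiple resolutions. Multiple synthesized code streams may be in multiple formats (one format per stream), where a format comprises resolution, frame rate, code rate, and so on. The code streams in multiple formats may be video code streams with different resolutions, frame rates, and code rates, to meet the format requirements of different single-stream terminals.
FIG. 5 is a schematic diagram of the architecture of the video conference implementation system. As shown in FIG. 5, the system includes a signaling processing module 501 and a media processing component 502. There may be several sets of the media processing component 502 in a distributed deployment. The signaling processing module 501 may be deployed partly on the AS and partly on the MRF, and the media processing component 502 may be deployed on the MRF; other deployments are also possible. The signaling processing module 501 is configured to handle terminals joining and withdrawing from the conference and to send the information of joined terminals to the media processing component 502. It is the control interface module that interacts with external applications: it parses control signaling, extracts terminal information from it, converts it into internal instructions, and sends those to the media processing component 502 (for example, to the terminal media processing module 5021); it also converts internal requests or responses into control signaling and sends them to external applications. The signaling processing module 501 handles the joining and withdrawal of terminals (both single-stream and multi-stream), including terminal capability negotiation, joining, and withdrawal. It sends the information of joined terminals to the media processing component 502 through internal instructions so that the media processing component 502 can identify, record, and subsequently process each joined terminal. The terminal information includes but is not limited to the terminal type and/or terminal capability information. The terminal capability information includes but is not limited to at least one of the following: audio capability, video resolution, video frame rate, video code rate, transmission port, and bandwidth.
The media processing component 502 is configured to determine the types of the terminals participating in the video conference according to the terminal information, to determine a video synthesis end when the participating terminals include a single-stream terminal, and to send the synthesized code stream generated by the video synthesis end to the single-stream terminal, where the synthesized code stream is synthesized from the original code streams of the terminals participating in the video conference.
In some embodiments, the media processing component 502 includes a terminal media processing module 5021 and a converged conference processing module 5022. The terminal media processing module 5021 is configured to receive the information of joined terminals from the signaling processing module 501 and send it to the converged conference processing module 5022, and to receive the original code streams, receive the video synthesis end information from the converged conference processing module 5022, and send the synthesized code stream generated by the video synthesis end to the single-stream terminals according to that information. The terminal media processing module 5021 also performs the various audio and video processing functions. The audio processing functions include audio playback, audio recording, audio conferencing, and audio conference control; the video processing functions include video playback, video recording, video conferencing, and video conference control. In a multi-stream video conference, the video conferencing function means receiving the code streams of the terminals and forwarding each terminal's code stream to the other terminals. Two kinds of code streams are received. The first is the original video code stream captured by a terminal, generally a single-image stream; it is forwarded to the other multi-stream terminals, each of which synthesizes all the original code streams it receives and presents the result to its user. The second is the synthesized code stream produced by a multi-stream terminal; it is forwarded to the single-stream terminals, which display it directly to their users.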
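The two forwarding paths handled by the terminal media processing module, original code streams to multi-stream terminals and synthesized code streams to single-stream terminals, can be sketched as a routing plan. The data layout and names below are assumptions for illustration; the terminal labels follow the application example with multi-stream terminals A and B and single-stream terminals C, D, and E.

```python
def plan_forwarding(terminal_types, mixer_of):
    """Decide which streams the media server forwards to each terminal.

    `terminal_types` maps terminal id -> 'multi' or 'single'.
    `mixer_of` maps each single-stream terminal to its video synthesis end.
    Returns a map of receiver -> list of (stream kind, source terminal).
    """
    plan = {}
    for receiver, kind in terminal_types.items():
        if kind == "multi":
            # A multi-stream terminal gets every other terminal's original stream.
            plan[receiver] = [
                ("original", sender) for sender in terminal_types if sender != receiver
            ]
        else:
            # A single-stream terminal gets one synthesized stream from its mixer.
            plan[receiver] = [("synthesized", mixer_of[receiver])]
    return plan


types = {"A": "multi", "B": "multi", "C": "single", "D": "single", "E": "single"}
plan = plan_forwarding(types, {"C": "A", "D": "B", "E": "B"})
```

In this plan terminal A receives four original streams, which is consistent with the four downstream code streams described for multi-stream terminal A.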
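The converged conference processing module 5022 is responsible for converged conference processing, including forced single-stream conference processing, video synthesis end selection, multi-stream adaptation, and outbound calls to multi-stream terminals. Forced single-stream conference processing determines the form the multi-party video conference takes: the converged conference processing module 5022 determines whether the conference is configured as forced single-stream, and if so, the multi-stream terminals in the video conference are no longer candidates for the video synthesis end. They are treated as ordinary single-stream terminals and no multi-stream terminal is selected as the video synthesis end. In some embodiments, the converged conference processing module 5022 determines the types of the terminals participating in the video conference according to the terminal information and, when the participating terminals include a single-stream terminal, determines the video synthesis end and notifies the terminal media processing module 5021 of the video synthesis end information.
In some embodiments, the converged conference processing module 5022 determining the video synthesis end includes: using the media server and/or a multi-stream terminal as the video synthesis end.
In some embodiments, the converged conference processing module 5022 determining the video synthesis end includes: selecting one or more multi-stream terminals from those participating in the video conference as the video synthesis end.
In some embodiments, the converged conference processing module 5022 determining the video synthesis end includes: calling an idle multi-stream terminal into the conference and, after it has joined, using it as the video synthesis end.
In some embodiments, the converged conference processing module 5022 is further configured, after the video synthesis end is determined and when it is a multi-stream terminal, to send to the terminal media processing module 5021 indication information indicating that the multi-stream terminal has been selected as the video synthesis end.
The terminal media processing module 5021 is further configured to send that indication information to the corresponding multi-stream terminal.
For the video synthesis end selection function, the converged conference processing module 5022 determines the terminal types and marks the multi-stream terminals. It selects a terminal with video synthesis capability (i.e., a multi-stream terminal) to perform video synthesis. After selecting a multi-stream terminal as the video synthesis end, it notifies that terminal by sending a first instruction through the terminal media processing module 5021. The first instruction includes video synthesis end indication information (indicating that the multi-stream terminal has been selected as the video synthesis end) and/or video synthesis information. The video synthesis information includes but is not limited to: video port, video codec, video resolution, video frame rate, video code rate, transmission port, bandwidth, and send/receive mode. Both the indication information and the video synthesis information can be conveyed by adding description information to the media description. The first instruction may be delivered to the multi-stream terminal through renegotiation. In some embodiments, one or more video media description lines are added to the renegotiation signaling, for example a media description line dedicated to the media channel for the synthesized video. Several such video media description lines can be used for synthesized video in several video formats. For example, the video format described by the added "m=video" line is used to generate the upstream synthesized code stream, and the added "a=videoconfmixer" line shows that the multi-stream terminal is used as the video synthesis end.
As the renegotiation signaling below shows, the following information is added: with the added "m=video" line, the synthesized video is sent upstream to the server through port 20678 in a format that satisfies the description of the added "m=video" section, and the added "a=videoconfmixer" line shows that the multi-stream terminal is used as the video synthesis end.
The information added to the renegotiation signaling is as follows.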
m=video 20678 RTP/AVP 117
b=AS:3584
a=rtpmap:117 H264/90000
a=fmtp:117 profile-level-id=42e01f;packetization-mode=1;MaxBR=35840
a=sendonly
a=videoconfmixer
a=zimeclient
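The added media section can be generated mechanically from a handful of parameters. The following is a minimal sketch, assuming the attribute values shown in the listing above are used as defaults; the function name `build_mixer_media_section` and its parameter names are hypothetical and do not come from the disclosure.

```python
def build_mixer_media_section(port, payload_type=117, bandwidth_kbps=3584,
                              profile_level_id="42e01f", max_br=35840):
    """Return the SDP lines of the extra video section that marks a mixer.

    Hypothetical helper: the defaults simply mirror the example listing.
    """
    return [
        # Port on which the synthesized video is sent uplink to the server.
        f"m=video {port} RTP/AVP {payload_type}",
        f"b=AS:{bandwidth_kbps}",
        f"a=rtpmap:{payload_type} H264/90000",
        f"a=fmtp:{payload_type} profile-level-id={profile_level_id};"
        f"packetization-mode=1;MaxBR={max_br}",
        # The terminal only sends on this channel.
        "a=sendonly",
        # Marks the multi-stream terminal as the video synthesizer.
        "a=videoconfmixer",
        "a=zimeclient",
    ]


if __name__ == "__main__":
    print("\r\n".join(build_mixer_media_section(20678)))
```

Calling the helper with port 20678 reproduces the seven lines listed above; a different port or payload type only changes the corresponding fields.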
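In some embodiments, the converged conference processing module 5022 is further configured to: when the video synthesizer is a multi-stream terminal, record the single-stream terminals served by that video synthesizer; and, when all the single-stream terminals served by the multi-stream terminal acting as the video synthesizer have left the conference, send to the terminal media processing module 5021 a second instruction indicating that the multi-stream terminal is no longer to act as the video synthesizer.
The terminal media processing module 5021 is further configured to send to the multi-stream terminal the second instruction indicating that the multi-stream terminal is no longer to act as the video synthesizer.
In some embodiments, the converged conference processing module 5022 is further configured to re-determine the video synthesizer when the multi-stream terminal acting as the video synthesizer leaves the conference or becomes abnormal. Re-determining the video synthesizer may mean using a backup video synthesizer as the video synthesizer (in which case the single-stream terminals served by the departed or abnormal multi-stream terminal are added to the distribution terminal list of the backup video synthesizer), selecting a new multi-stream terminal as the video synthesizer, or calling in an idle multi-stream terminal as the video synthesizer.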
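Scenario 1: a converged video conference held by single-stream terminals only
As shown in FIG. 6a, in Scenario 1 the video conference implementation method of an embodiment of the present disclosure includes steps 601a to 604a.
Step 601a: receive a join request from a single-stream terminal.
Step 602a: record the information of the single-stream terminal, negotiate with the single-stream terminal, and establish the conference connection.
Step 603a: determine the synthesized video format information according to the information of the single-stream terminal and a preset principle, and select a video synthesizer. In this embodiment the media server is selected as the video synthesizer. Note that in other embodiments an idle multi-stream terminal may instead be called in as the video synthesizer, or both the media server and a called-in idle multi-stream terminal may be selected as video synthesizers.
Step 604a: apply for video synthesis resources, use the resources obtained to synthesize the streams according to the synthesized video format information to obtain a synthesized stream, and send the synthesized stream to the single-stream terminal.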
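A specific application example is described below.
As shown in FIG. 6b, the video conference implementation method includes steps 601b to 608b.
Step 601b: a single-stream terminal sends a join request to the signaling processing module.
Step 602b: the signaling processing module processes the join request and sends the information of the single-stream terminal to the terminal media processing module, where the information of the single-stream terminal includes, but is not limited to, at least one of the following: terminal type, audio port, audio format, video port, video resolution, video frame rate, video bit rate, transmission port, bandwidth.
Step 603b: the terminal media processing module records the information of the single-stream terminal and returns the negotiated media information to the signaling processing module.
Step 604b: the signaling processing module completes the handshake negotiation with the single-stream terminal, and the conference connection of the single-stream terminal is established.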
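As shown in FIG. 6c, the network includes single-stream terminal C, single-stream terminal D and single-stream terminal E. All terminals in the network are single-stream terminals.
Step 605b: the terminal media processing module sends the information of the single-stream terminals to the converged conference processing module.
Step 606b: the converged conference processing module obtains the terminal information from the terminal media processing module, examines all terminals, finds that the terminals participating in the video conference are single-stream terminals, and ranks the single-stream terminals according to the negotiated single-stream media information. This ranking is used to select the synthesized video format. The ranking may be based on video resolution, frame rate and bit rate.
Note that the ranking may also be omitted.
The ranking table of the single-stream terminals is shown in Table 4. It contains the terminal index, terminal type, audio and video ports, video resolution, video frame rate, video bit rate, and so on.
Table 4 Terminal ranking table
Figure PCTCN2019120230-appb-000005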
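Step 607b: the converged conference processing module selects the synthesized video format according to a preset rule, applies for video synthesis resources according to the selected format, and sends the internal synthesis resource number to the terminal media processing module.
The internal synthesis resource number is the index information of the resource.
This embodiment is a pure single-stream conference, so the converged conference processing module chooses the media server as the video synthesizer and notifies the terminal media processing module of the internal synthesis resource number of the video synthesizer.
Step 608b: the terminal media processing module uses the internal synthesis resource number to synthesize the original streams into a synthesized stream and forwards the synthesized stream to each single-stream terminal.
The original streams are sent to the terminal media processing module by the terminals (single-stream terminals in this embodiment).
With the solution provided by this embodiment, when only single-stream terminals participate in the video conference, the media server is selected as the video synthesizer, which enables single-stream terminals to access the conference.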
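Scenario 2: a converged video conference with terminals forced to single-stream
As shown in FIG. 7a, in Scenario 2 the video conference implementation method of an embodiment of the present disclosure includes steps 701a to 704a.
Step 701a: receive join requests from single-stream terminals and multi-stream terminals.
The join requests of the single-stream terminals and the multi-stream terminals may be initiated at the same time or separately.
Step 702a: record the information of the single-stream terminals and the multi-stream terminals, negotiate with the terminals, and establish the conference connections.
In this embodiment the preset mode is the forced single-stream conference mode, that is, regardless of the types of the terminals that join, the media server acts as the video synthesizer.
Step 703a: determine the synthesized video format information according to the terminal information and a preset principle, and select the media server as the video synthesizer.
Step 704a: apply for video synthesis resources, synthesize the streams to obtain a synthesized stream, and send the synthesized stream to the single-stream terminals and the multi-stream terminals.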
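A specific example is described below.
As shown in FIG. 7b, the video conference implementation method includes steps 701b to 709b.
Step 701b: single-stream terminals and multi-stream terminals each send a join request to the signaling processing module.
Step 702b: the signaling processing module parses and processes the join control signaling and checks whether the conference is configured as a forced single-stream conference. If it is, the conference mode is set to forced single-stream mode when the terminal information is sent to the terminal media processing module. The terminal information includes, but is not limited to, at least one of the following: terminal type, audio port, audio format, video port, video resolution, video frame rate, video bit rate, transmission port, bandwidth, and the like.
As shown in FIG. 7c, two multi-stream terminals (multi-stream terminal A and multi-stream terminal B) and three single-stream terminals (single-stream terminal C, single-stream terminal D and single-stream terminal E) join the conference. Each time a terminal joins, its information is distributed to the terminal media processing module through an internal message.
Step 703b: the terminal media processing module records the information of the single-stream and multi-stream terminals and judges whether the mode is forced single-stream. If it is, the media information negotiated for the multi-stream terminals is returned to the signaling processing module in the format used for single-stream terminals.
Step 704b: the signaling processing module completes the handshake negotiation with the terminals, and the conference connections of the terminals are established.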
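After all terminals have joined, the terminal information table is as shown in Table 5 (the information in Table 5 is only an example and is not exhaustive; items such as transmission port and bandwidth are also required but are omitted here for reasons of space). Note that the terminal index in Table 5 is the index information of the terminal.
Table 5 Terminal information table
Figure PCTCN2019120230-appb-000006
In this case, as shown in FIG. 7c, single-stream and multi-stream terminals hold a converged conference and each multi-stream terminal is forced to be treated as a single-stream terminal: it sends one uplink audio stream and one uplink video stream, and receives one downlink synthesized stream and one downlink audio stream delivered by the media server.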
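Step 705b: the terminal media processing module sends the information of the terminals to the converged conference processing module.
Step 706b: the converged conference processing module obtains the terminal information from the terminal media processing module and judges whether the mode is forced single-stream. If it is, the single-stream terminals and the multi-stream terminals are ranked together; this ranking is used to select the synthesized video format. The ranking table in this example is shown in Table 6.
Table 6 Terminal ranking table
Figure PCTCN2019120230-appb-000007
Step 707b: the converged conference processing module selects the synthesized video format according to a preset rule. The preset rules include, but are not limited to: the majority-satisfaction principle, the minimum-satisfaction principle, the priority-satisfaction principle, and the multi-tier allocation principle. See Table 1 for a detailed description of each principle.
Step 708b: the converged conference processing module applies for video synthesis resources according to the synthesized video format. Because the mode is forced single-stream, the media server is selected as the video synthesizer, and the internal synthesis resource number is sent to the terminal media processing module.
Step 709b: the terminal media processing module performs video synthesis according to the internal synthesis resource number, obtains the synthesized stream, and forwards it to every terminal (both single-stream and multi-stream terminals).
In this embodiment, setting the conference mode to forced single-stream mode keeps compatibility with existing single-stream conferences and likewise enables single-stream terminals to access the conference.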
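Scenario 3: a converged video conference held by single-stream and multi-stream terminals
As shown in FIG. 8a, in Scenario 3 the video conference implementation method of an embodiment of the present disclosure includes steps 801a to 805a.
Step 801a: receive a join request.
Step 802a: record the terminal information, negotiate with the terminal, and establish the conference connection of the terminal; the terminal information includes terminal capability information.
Step 803a: determine that the terminal types include a single-stream terminal; if the conference is not configured as a forced single-stream conference, select the synthesized video format and the video synthesizer.
The synthesized video format is selected according to a preset rule.
The video synthesizer is selected according to a preset policy, for example by choosing the most capable terminal as the video synthesizer.
The selected video synthesizer and its distribution terminals are recorded. When the video synthesizer is selected, at least one of the following is performed: selecting a multi-stream terminal as the video synthesizer, where one or more multi-stream terminals may be selected; selecting the media server as the video synthesizer; calling in an idle multi-stream terminal as the video synthesizer, where one or more idle multi-stream terminals may be called in.
In this embodiment a multi-stream terminal is selected as the video synthesizer.
Step 804a: when the video synthesizer is a multi-stream terminal, send a first instruction to the selected video synthesizer; the first instruction carries the video synthesizer indication information and the synthesized video format information.
Step 805a: receive the synthesized stream sent by the multi-stream terminal and forward it to the single-stream terminals.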
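Note that the stream exchange in the video conference is as follows. A single-stream terminal sends its original stream to the media server, and the media server delivers the synthesized stream (from the video synthesizer) to the single-stream terminal. A multi-stream terminal that has not been selected as the video synthesizer sends its original stream to the server, receives the original streams of the other terminals forwarded by the media server, synthesizes them, and displays the result locally. A multi-stream terminal that has been selected as the video synthesizer and participates in the video conference sends its original stream to the server, receives the original streams of the other terminals forwarded by the media server, synthesizes them for local display, and also synthesizes original streams into a synthesized stream that it uploads to the media server. The stream displayed locally and the stream uploaded may differ; which original streams are synthesized, the layout of the synthesized image, and so on can be controlled by the media server. A multi-stream terminal that has been selected as the video synthesizer but does not participate in the video conference (an idle multi-stream terminal) receives the original streams of the other terminals forwarded by the media server, synthesizes them into a synthesized stream, and uploads the synthesized stream to the media server. Again, which original streams are synthesized, the layout of the synthesized image, and so on can be controlled by the media server.
With the solution provided by this embodiment, when a single-stream terminal participates in the video conference, a video synthesizer is determined and performs stream synthesis for the single-stream terminal. Single-stream terminals can therefore take part in the video conference, and the existing investment in single-stream terminals is not wasted.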
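The choice of video synthesizer across Scenarios 1 to 3 can be summarized as a small decision function. The sketch below is a simplification under stated assumptions: it uses every participating multi-stream terminal as a synthesizer in the mixed case and ignores the option of calling in idle terminals. The names `Terminal`, `MEDIA_SERVER` and `choose_synthesizers` are hypothetical.

```python
from dataclasses import dataclass

MEDIA_SERVER = "media-server"  # hypothetical label for the media server


@dataclass
class Terminal:
    index: int
    kind: str  # "single" or "multi"


def choose_synthesizers(terminals, forced_single_stream=False):
    """Return the video synthesizers for a conference, or [] if none is needed."""
    singles = [t for t in terminals if t.kind == "single"]
    multis = [t for t in terminals if t.kind == "multi"]
    if forced_single_stream:
        # Scenario 2: every terminal is treated as single-stream.
        return [MEDIA_SERVER] if terminals else []
    if not singles:
        # Pure multi-stream conference: each terminal synthesizes locally.
        return []
    if not multis:
        # Scenario 1: only single-stream terminals, so the media server mixes.
        return [MEDIA_SERVER]
    # Scenario 3 (assumption): all multi-stream terminals are candidates.
    return [t.index for t in multis]


if __name__ == "__main__":
    conference = [Terminal(1, "multi"), Terminal(2, "multi"), Terminal(3, "single")]
    print(choose_synthesizers(conference))
    print(choose_synthesizers(conference, forced_single_stream=True))
```

A real policy would narrow the Scenario 3 result according to the preset policy, for instance by capability.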
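A specific application example is described below.
As shown in FIG. 8b, the video conference implementation method includes steps 801b to 810b.
Step 801b: single-stream terminals and multi-stream terminals each initiate a join request.
Step 802b: the signaling processing module parses and processes the join control signaling and judges whether the conference is configured as a forced single-stream conference. If it is not, the conference mode is set to converged conference mode and the terminal information is sent to the terminal media processing module; the information sent includes, but is not limited to, terminal type, audio port, audio format, video port, video resolution, video frame rate, video bit rate, transmission port, bandwidth, and the like.
As shown in FIG. 8c, two multi-stream terminals (multi-stream terminal A and multi-stream terminal B) and three single-stream terminals (single-stream terminal C, single-stream terminal D and single-stream terminal E) join the conference. Each time a terminal joins, its information is distributed to the terminal media processing module through an internal message.
Step 803b: the terminal media processing module records the information of the single-stream and multi-stream terminals and returns the negotiated media information to the signaling processing module.
Step 804b: the signaling processing module completes the handshake negotiation with the terminals, and the conference connections of the terminals are established.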
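After all terminals have joined, the terminal information table is as shown in Table 7 (this is only an example and is not exhaustive; information such as transmission port and bandwidth may also be included).
Table 7 Terminal information table
Figure PCTCN2019120230-appb-000008
Step 805b: the terminal media processing module sends the information of the terminals to the converged conference processing module.
Step 806b: the converged conference processing module obtains the terminal information from the terminal media processing module. Because this embodiment uses converged conference mode, the single-stream terminals and the multi-stream terminals are ranked separately by terminal capability, according to the media information each terminal negotiated. The capability ranking of the multi-stream terminals is used to select the video synthesizer, and the capability ranking of the single-stream terminals is used to select the synthesized video format.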
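In this example the multi-stream terminal ranking table is shown in Table 8.
Table 8 Multi-stream terminal ranking table
Rank    Video resolution, frame rate, bit rate    Terminal index    ....
1       720P/15/1M                                1, 2
In this example the single-stream terminal ranking table is shown in Table 9.
Table 9 Single-stream terminal ranking table
Rank    Video resolution, frame rate, bit rate    Terminal index    ....
1       720P/10/512K                              3
2       480P/10/384K                              4, 5
Note that in other embodiments the ranking may be omitted.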
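The two ranking tables can be derived by grouping terminals with identical negotiated formats and sorting the groups from strongest to weakest. The following sketch reproduces Tables 8 and 9 under two assumptions: "1M" is taken as 1000 kbit/s, and resolution, frame rate and bit rate are compared in that order. The names `TERMINALS` and `rank_by_capability` are hypothetical.

```python
from collections import defaultdict

# (terminal index, kind, resolution height, frame rate, bit rate in kbit/s)
# Values follow Tables 8 and 9; "1M" is assumed to mean 1000 kbit/s.
TERMINALS = [
    (1, "multi", 720, 15, 1000),
    (2, "multi", 720, 15, 1000),
    (3, "single", 720, 10, 512),
    (4, "single", 480, 10, 384),
    (5, "single", 480, 10, 384),
]


def rank_by_capability(terminals, kind):
    """Group terminals of one kind by (height, fps, kbps), strongest group first."""
    groups = defaultdict(list)
    for index, t_kind, height, fps, kbps in terminals:
        if t_kind == kind:
            groups[(height, fps, kbps)].append(index)
    # Tuples compare element by element: resolution, then frame rate, then bit rate.
    ordered = sorted(groups.items(), key=lambda item: item[0], reverse=True)
    return [(rank, fmt, members)
            for rank, (fmt, members) in enumerate(ordered, start=1)]


if __name__ == "__main__":
    for row in rank_by_capability(TERMINALS, "single"):
        print(row)
```

Terminals 4 and 5 share a format and therefore share rank 2, which matches the single row they occupy in Table 9.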
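Step 807b: the converged conference processing module selects the synthesized video format according to a preset rule and selects the video synthesizer according to a preset policy; both are chosen with reference to the terminal capability lists.
The preset policy may be: select a multi-stream terminal with strong computing capability as the video synthesizer. Note that this is only an example; the video synthesizer may be selected according to other policies.
Following the example above, assume terminals A, B, C, D and E join in that order. When terminal C joins, video synthesizer selection is triggered: the converged conference processing module, according to the preset rule and the multi-stream capability ranking, selects multi-stream terminal A as one video synthesizer and multi-stream terminal B as another. If a backup video synthesizer is required, a further, different multi-stream terminal must also be selected as the backup video synthesizer. If multi-stream terminal A is selected as the video synthesizer for the 512K bit rate and multi-stream terminal B as the video synthesizer for the 384K bit rate, the video synthesizer record table is as shown in Table 10.
Table 10 Video synthesizer record table
Synthesis number    Synthesized video format    Video synthesizer terminal index    Distribution terminal index
1                   720P/10/512K                1 (terminal A)                      3 (terminal C)
2                   480P/10/384K                2 (terminal B)                      4 (terminal D), 5 (terminal E)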
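The record table can be built by pairing each single-stream format group with one multi-stream terminal. The sketch below assumes a simple one-to-one pairing in ranking order, which is only one possible policy; the function name `build_synthesizer_records` and the dictionary keys are hypothetical.

```python
def build_synthesizer_records(single_groups, multi_terminals):
    """Pair each single-stream format group with one multi-stream terminal.

    single_groups: list of (format string, [single-stream terminal indexes]).
    multi_terminals: multi-stream terminal indexes in order of preference.
    """
    if len(single_groups) > len(multi_terminals):
        # Assumption: the caller falls back to the media server or to
        # calling in an idle multi-stream terminal in this case.
        raise ValueError("not enough multi-stream terminals for every format")
    records = []
    pairs = zip(single_groups, multi_terminals)
    for number, ((fmt, members), mixer) in enumerate(pairs, start=1):
        records.append({
            "record": number,               # synthesis number
            "format": fmt,                  # synthesized video format
            "synthesizer": mixer,           # video synthesizer terminal index
            "distribution": list(members),  # distribution terminal indexes
        })
    return records


if __name__ == "__main__":
    groups = [("720P/10/512K", [3]), ("480P/10/384K", [4, 5])]
    for record in build_synthesizer_records(groups, [1, 2]):
        print(record)
```

With the groups of Table 9 and multi-stream terminals 1 and 2, the result contains the same two rows as Table 10.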
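Step 808b: the converged conference processing module notifies the terminal media processing module of the selected video synthesizers and synthesized video formats.
Step 809b: the terminal media processing module sends the first instruction to the multi-stream terminal, where the first instruction includes, but is not limited to, the following information: synthesis port, synthesized video codec type, synthesized video resolution, synthesized video frame rate, synthesized video bit rate, and the like.
Step 810b: the terminal media processing module obtains the synthesized stream through the server port corresponding to the video synthesizer and forwards it to each single-stream terminal.
After the above operations, the streams of multi-stream terminal A are as follows. As shown in FIG. 8d, multi-stream terminal A receives four downlink streams. As shown in FIG. 8e, multi-stream terminal A sends two uplink streams: its own stream and the synthesized stream. Note that in this embodiment the synthesized stream is synthesized from the four downlink streams; in other embodiments it may be synthesized from one or more of the four downlink streams and multi-stream terminal A's own stream.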
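The stream counts quoted above follow directly from a terminal's role. The sketch below counts video streams only and assumes that each synthesizer uploads exactly one synthesized stream; the role labels and the function name `stream_counts` are hypothetical.

```python
def stream_counts(role, participants):
    """Return (uplink, downlink) video stream counts for one terminal.

    role: "single", "multi", "multi_synth" (a participating multi-stream
          terminal chosen as synthesizer) or "idle_synth" (a called-in idle
          multi-stream terminal that does not take part in the conference).
    participants: number of terminals taking part in the conference.
    """
    if role == "single":
        # Own original stream up, one synthesized stream down.
        return 1, 1
    if role == "multi":
        # Own original stream up, every other terminal's original stream down.
        return 1, participants - 1
    if role == "multi_synth":
        # Own original stream plus the synthesized stream go up.
        return 2, participants - 1
    if role == "idle_synth":
        # Only the synthesized stream goes up; all participants' streams come down.
        return 1, participants
    raise ValueError(f"unknown role: {role}")


if __name__ == "__main__":
    # Multi-stream terminal A in the five-terminal conference of FIG. 8c.
    print(stream_counts("multi_synth", 5))
```

For multi-stream terminal A in the five-terminal example this gives two uplink streams and four downlink streams, as described for FIG. 8d and FIG. 8e.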
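Scenario 4: converged video conference processing after a single-stream terminal leaves
As shown in FIG. 9a, in Scenario 4 the video conference implementation method of an embodiment of the present disclosure includes steps 901a to 904a.
Step 901a: receive a leave request from a single-stream terminal.
Step 902a: clear the data related to the departing single-stream terminal.
Step 903a: delete the single-stream terminal information from the video synthesizer record table and judge whether the video synthesizer serving that single-stream terminal (here a multi-stream terminal) still has single-stream terminals to serve. If not, delete the information of that video synthesizer from the video synthesizer record table, send a second instruction to the multi-stream terminal, and release the receiving port resource corresponding to the multi-stream terminal; the second instruction indicates that the multi-stream terminal is no longer to act as the video synthesizer.
Step 904a: the multi-stream terminal receives the second instruction, stops video synthesis (it stops producing the synthesized stream that would be uploaded; synthesis of the stream displayed locally is not affected), and releases the related resources, which include the video port resource.
With the solution provided by this embodiment, the related resources are released promptly after a single-stream terminal leaves, which improves system performance and efficiency.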
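The bookkeeping in step 903a can be sketched as a pure function over the video synthesizer record table. The record layout (dictionaries with `synthesizer` and `distribution` keys) and the function name `remove_single_stream_terminal` are assumptions made for illustration.

```python
def remove_single_stream_terminal(records, terminal):
    """Drop a departed single-stream terminal from the record table.

    Returns (remaining records, synthesizers that should receive the
    second instruction because they no longer serve any terminal).
    """
    remaining, to_cancel = [], []
    for record in records:
        if terminal not in record["distribution"]:
            remaining.append(record)
            continue
        distribution = [t for t in record["distribution"] if t != terminal]
        if distribution:
            # The synthesizer still serves other single-stream terminals.
            remaining.append({**record, "distribution": distribution})
        else:
            # No terminal left to serve: delete the record and cancel the role.
            to_cancel.append(record["synthesizer"])
    return remaining, to_cancel


if __name__ == "__main__":
    table = [
        {"record": 1, "format": "720P/10/512K", "synthesizer": 1, "distribution": [3]},
        {"record": 2, "format": "480P/10/384K", "synthesizer": 2, "distribution": [4, 5]},
    ]
    print(remove_single_stream_terminal(table, 3))
```

Starting from the contents of Table 10, removing terminal 3 deletes record 1 and marks synthesizer 1 for cancellation, while removing terminal 4 only shortens the distribution list of record 2.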
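The departure of a single-stream terminal is described below through a specific application example.
On the basis of the embodiment shown in FIG. 8b, single-stream terminal C initiates a leave request. As shown in FIG. 9b, the video conference implementation method includes steps 901b to 907b.
Step 901b: single-stream terminal C sends a leave request to the signaling processing module.
Step 902b: the signaling processing module processes the leave request and notifies the terminal media processing module of the information of the departing single-stream terminal through an internal instruction, that is, it sends a single-stream terminal leave instruction to the terminal media processing module.
Step 903b: the signaling processing module notifies the converged conference processing module of the information of the departing terminal through an internal instruction, that is, it sends a single-stream terminal leave instruction to the converged conference processing module.
Once all the single-stream terminals to which a multi-stream terminal's synthesized video is forwarded have left, the multi-stream terminal must be told to release its video synthesis resources. The leave information of a single-stream terminal is therefore sent to both the terminal media processing module and the converged conference processing module.
In this embodiment, as shown in FIG. 9c, single-stream terminal C leaves, and only multi-stream terminal A, multi-stream terminal B, single-stream terminal D and single-stream terminal E remain.
Step 904b: the terminal media processing module processes the single-stream terminal leave instruction and clears the data related to the departing single-stream terminal.
Step 905b: the converged conference processing module receives the single-stream terminal leave instruction, deletes the single-stream terminal information from the video synthesizer record table (Table 10), and judges whether all distribution terminals of the video synthesizer corresponding to that single-stream terminal have left. If they all have, it notifies the terminal media processing module that the video synthesizer has exited, and step 906b is performed. If not all of them have left, the procedure ends.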
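In this embodiment, after single-stream terminal C (terminal index 3) leaves, all distribution terminals of video synthesizer 1 have left. The record with synthesis number 1 in the video synthesizer record table is deleted, the terminal media processing module is notified that the video synthesizer of synthesis number 1 has exited, and the updated video synthesizer record table is as shown in Table 11.
Table 11 Video synthesizer record table
Figure PCTCN2019120230-appb-000009
Step 906b: the terminal media processing module sends a second instruction to the multi-stream terminal in order to release the server-side port resource for receiving the synthesized video, where the second instruction indicates that the multi-stream terminal is no longer to act as the video synthesizer.
In this embodiment, the multi-stream terminal corresponding to terminal index 1 (multi-stream terminal A) is instructed to stop acting as the video synthesizer. This can be done through renegotiation by setting the synthesis port to 0.
The following example is such a renegotiation instruction; note that the synthesis port has been set to 0 (the "m=video 0 RTP/AVP 117" line in the example):
Renegotiation example: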
v=0
o=ZTE-SOMTMS 4245362188 4245362189 IN IP4 192.168.1.118
s=session SDP
c=IN IP4 192.168.1.118
t=0 0
m=audio 20580 RTP/AVP 8
a=ptime:20
a=rtpmap:8 PCMA/8000/1
a=sendrecv
a=zimeclient
m=video 20594 RTP/AVP 118
b=AS:3584
a=rtpmap:118 H264/90000
a=fmtp:118 profile-level-id=64001f;packetization-mode=1;MaxBR=35840
a=sendrecv
a=zimeclient
m=video 20620 RTP/AVP 117
b=AS:3584
a=rtpmap:117 H264/90000
a=fmtp:117 profile-level-id=42e01f;packetization-mode=1;MaxBR=35840
a=recvonly
a=zimeclient
m=video 20646 RTP/AVP 117
b=AS:3584
a=rtpmap:117 H264/90000
a=fmtp:117 profile-level-id=42e01f;packetization-mode=1;MaxBR=35840
a=recvonly
a=zimeclient
m=video 20672 RTP/AVP 117
b=AS:3584
a=rtpmap:117 H264/90000
a=fmtp:117 profile-level-id=42e01f;packetization-mode=1;MaxBR=35840
a=recvonly
m=video 0 RTP/AVP 117
b=AS:3584
a=rtpmap:117 H264/90000
a=fmtp:117 profile-level-id=42e01f;packetization-mode=1;MaxBR=35840
a=sendonly
a=videoconfmixer
a=zimeclient
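Note that the indication method above is only an example; cancelling a multi-stream terminal's role as the video synthesizer may also be indicated in other ways.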
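Setting the synthesis port to 0 can be done by rewriting the SDP text before it is sent. The sketch below is one possible approach, assuming the synthesized-video section is the one that carries "a=videoconfmixer"; the function name `cancel_mixer_section` is hypothetical, and the result is rejoined with plain "\n" for simplicity.

```python
def cancel_mixer_section(sdp_text):
    """Set the port to 0 in every media section that carries a=videoconfmixer."""
    lines = sdp_text.splitlines()
    # Indexes of the lines on which a media section starts.
    starts = [i for i, line in enumerate(lines) if line.startswith("m=")]
    for n, start in enumerate(starts):
        end = starts[n + 1] if n + 1 < len(starts) else len(lines)
        if "a=videoconfmixer" in lines[start:end]:
            # An "m=" line has the form: m=<media> <port> <proto> <fmt> ...
            fields = lines[start].split(" ")
            fields[1] = "0"
            lines[start] = " ".join(fields)
    return "\n".join(lines)


if __name__ == "__main__":
    offer = "\n".join([
        "v=0",
        "m=audio 20580 RTP/AVP 8",
        "a=sendrecv",
        "m=video 20678 RTP/AVP 117",
        "a=sendonly",
        "a=videoconfmixer",
    ])
    print(cancel_mixer_section(offer))
```

Only the section marked as the mixer channel is changed; the audio section and the other video sections keep their ports.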
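Step 907b: multi-stream terminal A receives the second instruction, stops video synthesis, and releases the related resources (including the video port resource).
With the solution provided by this embodiment, after terminal C leaves, the resources of the video synthesizer serving terminal C and the resources on the server side (that is, the server side of the video conference system) can be released promptly, which improves system performance. Note that the multi-stream terminal serving terminal C here is a multi-stream terminal participating in the video conference; in other embodiments it may be a called-in idle multi-stream terminal, in which case the processing is similar and is not described again.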
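Scenario 5: adaptive dynamic switching in a converged video conference
As shown in FIG. 10a, in Scenario 5 the video conference implementation method of an embodiment of the present disclosure includes steps 1001a to 1003a.
Step 1001a: receive a leave request initiated by a multi-stream terminal.
Step 1002a: clear the data related to the departing multi-stream terminal. When the multi-stream terminal is a video synthesizer, check whether there is a backup video synthesizer. If there is, select the backup video synthesizer as the new video synthesizer and add the single-stream terminals served by the departing multi-stream terminal to the distribution terminal list of the newly selected video synthesizer. If there is no backup video synthesizer, reselect a video synthesizer and, when the selected video synthesizer is a multi-stream terminal, send the first instruction to that multi-stream terminal.
When the video synthesizer is reselected, it may be chosen from the multi-stream terminals that have already joined the video conference, an idle multi-stream terminal may be called in as the video synthesizer, or the media server may be used as the video synthesizer.
Step 1003a: obtain the synthesized stream through the server port corresponding to the video synthesizer and forward it to the single-stream terminals.
With the solution provided by this embodiment, when a multi-stream terminal acting as the video synthesizer leaves, switching is performed promptly, so that the conference service is not terminated because a multi-stream terminal (video synthesizer) has exited or become abnormal, which ensures the continuity of the system's functions.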
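A specific application example is described below.
On the basis of the embodiment shown in FIG. 8b, a multi-stream terminal initiates a leave request. As shown in FIG. 10b, the video conference implementation method includes:
Step 1001b: the multi-stream terminal sends a leave request to the signaling processing module.
Step 1002b: the signaling processing module processes the leave request and notifies the terminal media processing module of the information of the departing terminal through an internal instruction, that is, it sends a multi-stream terminal leave instruction to the terminal media processing module.
Step 1003b: the signaling processing module notifies the converged conference processing module of the information of the departing terminal through an internal instruction, that is, it sends a multi-stream terminal leave instruction to the converged conference processing module.
In this embodiment, after multi-stream terminal A leaves, the network is as shown in FIG. 10c.
Step 1004b: the terminal media processing module processes the multi-stream terminal leave instruction and clears the data related to the departing multi-stream terminal.
Step 1005b: the converged conference processing module receives the multi-stream terminal leave instruction and judges whether the conference is a forced single-stream conference. For a forced single-stream conference no processing is needed, because the media server acts as the video synthesizer. If the conference is not forced single-stream and the multi-stream terminal is a video synthesizer, the multi-stream terminal is deleted from the video synthesizer record table and the module judges whether there is a backup video synthesizer. If there is, the backup video synthesizer becomes the new video synthesizer, and a switching instruction is sent to the terminal media processing module, indicating that the backup video synthesizer replaces the departing multi-stream terminal. If there is no backup video synthesizer, a new video synthesizer that meets the requirements is selected from the multi-stream terminals that have joined the video conference, the terminal media processing module is notified, and the information of the new video synthesizer is entered in the video synthesizer record table. If no new video synthesizer can be selected, an idle multi-stream terminal is called in for the single-stream terminals as the new video synthesizer, and a video synthesizer selection instruction is sent to the terminal media processing module.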
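In this embodiment, after multi-stream terminal A leaves, the record with synthesis number 1 in the video synthesizer record table is deleted. The video synthesizer of synthesis number 2 is selected as the video synthesizer for single-stream terminal 3, and single-stream terminal 3 is added to the distribution terminal index entry of synthesis number 2. The updated video synthesizer record table is as shown in Table 12.
Table 12 Video synthesizer record table
Figure PCTCN2019120230-appb-000010
Step 1006b: if the terminal media processing module receives a switching instruction, it forwards the synthesized stream according to the new distribution terminal index list. If it receives a video synthesizer selection instruction, it sends the first instruction to the multi-stream terminal, where the first instruction includes, but is not limited to, the following information: synthesis port, synthesized video codec type, synthesized video resolution, synthesized video frame rate, synthesized video bit rate, and the like.
Subsequently, the terminal media processing module obtains the synthesized stream through the server port corresponding to the video synthesizer and forwards it to each single-stream terminal.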
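The record-table update performed when a synthesizer leaves can be sketched as follows. This is a simplification under stated assumptions: the orphaned single-stream terminals are moved to the backup synthesizer's record if one is named, otherwise to the first remaining record, and terminals that cannot be reassigned are returned so that the caller can reselect or call in an idle terminal. The function name `reassign_after_exit` and the record layout are hypothetical.

```python
def reassign_after_exit(records, departed, backup=None):
    """Remove the departed synthesizer and move its terminals to another one.

    Returns (updated records, single-stream terminals left without a synthesizer).
    """
    orphans = [t for r in records if r["synthesizer"] == departed
               for t in r["distribution"]]
    # Copy the surviving records so the input table is not modified.
    remaining = [dict(r, distribution=list(r["distribution"]))
                 for r in records if r["synthesizer"] != departed]
    # Prefer the configured backup synthesizer, if it has a record.
    target = next((r for r in remaining if r["synthesizer"] == backup), None)
    if target is None and remaining:
        target = remaining[0]
    if target is None:
        # Nothing left to take over: reselection or an outbound call is needed.
        return remaining, orphans
    target["distribution"] = sorted(set(target["distribution"]) | set(orphans))
    return remaining, []


if __name__ == "__main__":
    table = [
        {"record": 1, "format": "720P/10/512K", "synthesizer": 1, "distribution": [3]},
        {"record": 2, "format": "480P/10/384K", "synthesizer": 2, "distribution": [4, 5]},
    ]
    print(reassign_after_exit(table, departed=1))
```

Starting from Table 10, the departure of multi-stream terminal A (index 1) leaves a single record for synthesizer 2 whose distribution list now also contains terminal 3, as described for this embodiment.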
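Scenario 6: outbound calls to idle multi-stream terminals in a converged video conference
As shown in FIG. 11a, in Scenario 6 the video conference implementation method of an embodiment of the present disclosure includes steps 1101a to 1103a.
Step 1101a: when a video synthesizer is being selected, if an idle multi-stream terminal table is configured, select an idle multi-stream terminal from that table and issue an outbound call instruction to the selected idle multi-stream terminal.
Note that the outbound call instruction is issued to an idle multi-stream terminal that meets the stream synthesis requirements and whose state is normal.
Step 1102a: after the idle multi-stream terminal has joined, use it as the video synthesizer, add it to the video synthesizer record table, and deliver the first instruction to it; the first instruction carries the video synthesizer indication information and the video synthesis information.
Step 1103a: receive the synthesized stream uploaded by the multi-stream terminal and send it to the single-stream terminals.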
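A specific example is described below.
As shown in FIG. 11b, the video conference implementation method includes steps 1101b to 1105b.
Step 1101b: when the converged conference processing module selects a video synthesizer, if an idle multi-stream terminal table is configured, it preferentially selects from that table an idle multi-stream terminal that meets the stream synthesis requirements and whose state is normal, issues an outbound call instruction to the signaling processing module, and updates the state in the idle multi-stream terminal table to in use. The state may be one of three values: idle, in use, or offline.
The idle multi-stream terminal table in this example is shown in Table 13; the idle multi-stream terminal selected is the multi-stream terminal with index 2.
Table 13 Idle multi-stream terminal table
Figure PCTCN2019120230-appb-000011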
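Step 1102b: the signaling processing module, according to the information of the terminal selected from the idle multi-stream terminal table, places an outbound call to bring the idle multi-stream terminal into the conference. Because the called idle terminal does not need to participate in the video conference, it does not need to send its own audio and video streams to the server side; the media negotiation signaling carried by the outbound call therefore contains neither uplink audio stream information nor uplink video stream information.
Step 1103b: after the outbound call succeeds, the signaling processing module notifies the converged conference processing module through an internal instruction.
Step 1104b: the converged conference processing module updates the video synthesizer record table and notifies the terminal media processing module that the called-in idle multi-stream terminal has been selected as the video synthesizer.
Step 1105b: the terminal media processing module sends the first instruction to the multi-stream terminal, indicating that the multi-stream terminal has been selected as the video synthesizer.
Subsequently, the terminal media processing module obtains the synthesized stream through the server port corresponding to the video synthesizer and forwards it to the single-stream terminals.
After the above operations, the streams of the idle multi-stream terminal are as follows: as shown in FIG. 11c, the idle multi-stream terminal receives four streams; as shown in FIG. 11d, the idle multi-stream terminal sends one uplink stream, which is the synthesized stream.
With the solution provided by this embodiment, calling in an idle multi-stream terminal as the video synthesizer extends the processing capacity of the whole video conference system, so that more terminals can access the system.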
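Selection from the idle multi-stream terminal table can be sketched as a scan for the first usable entry. Table 13 is only available as an image, so the entries below are invented for illustration, and "meets the stream synthesis requirements" is modeled, as an assumption, by a single maximum bit rate field. The names `pick_idle_terminal`, `max_kbps` and the state constants are hypothetical.

```python
IDLE, IN_USE, OFFLINE = "idle", "in_use", "offline"


def pick_idle_terminal(idle_table, required_kbps):
    """Pick the first idle terminal able to produce the required bit rate.

    The chosen entry is marked as in use; returns its index, or None if no
    entry qualifies (the caller could then fall back to the media server).
    """
    for entry in idle_table:
        if entry["state"] == IDLE and entry["max_kbps"] >= required_kbps:
            entry["state"] = IN_USE
            return entry["index"]
    return None


if __name__ == "__main__":
    # Hypothetical table contents; not taken from Table 13.
    table = [
        {"index": 1, "state": OFFLINE, "max_kbps": 1000},
        {"index": 2, "state": IDLE, "max_kbps": 1000},
        {"index": 3, "state": IDLE, "max_kbps": 384},
    ]
    print(pick_idle_terminal(table, required_kbps=512))
```

With these invented entries the terminal with index 2 is chosen and its state changes to in use, so a second request for the same bit rate finds no candidate.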
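As shown in FIG. 12, an embodiment of the present disclosure provides a video conference implementation device 120, including a memory 1210 and a processor 1220. The memory 1210 stores a program which, when read and executed by the processor 1220, implements the video conference implementation method of any of the embodiments.
As shown in FIG. 13, an embodiment of the present disclosure provides a computer-readable storage medium 130. The computer-readable storage medium 130 stores one or more programs 131, and the one or more programs 131 can be executed by one or more processors to implement the video conference implementation method of any of the embodiments.
Those of ordinary skill in the art will understand that all or some of the steps of the methods disclosed above, and the functional modules/units of the systems and devices, may be implemented as software, firmware, hardware, or an appropriate combination thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to a division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically contain computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.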

Claims (32)

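  1. A video conference implementation method, comprising:
    determining the types of the terminals participating in a video conference; when the terminals participating in the video conference include a single-stream terminal, determining a video synthesizer and sending the synthesized stream generated by the video synthesizer to the single-stream terminal, wherein the synthesized stream is synthesized from the original streams of the terminals participating in the video conference.
  2. The video conference implementation method according to claim 1, wherein the method further comprises: when the terminals participating in the video conference include a multi-stream terminal, sending the original streams generated by the other terminals participating in the video conference to the multi-stream terminal.
  3. The video conference implementation method according to claim 1, wherein determining the types of the terminals participating in the video conference comprises:
    determining the types of the terminals participating in the video conference according to terminal information reported by the terminals.
  4. The video conference implementation method according to claim 1, wherein the video synthesizer comprises a media server and/or a multi-stream terminal.
  5. The video conference implementation method according to claim 1, wherein determining the video synthesizer comprises:
    selecting one or more multi-stream terminals from the multi-stream terminals participating in the video conference as the video synthesizer.
  6. The video conference implementation method according to claim 5, wherein selecting one or more multi-stream terminals from the multi-stream terminals participating in the video conference as the video synthesizer comprises: selecting, according to a preset policy, one or more multi-stream terminals from the multi-stream terminals participating in the video conference as the video synthesizer.
  7. The video conference implementation method according to claim 1, wherein determining the video synthesizer comprises:
    calling an idle multi-stream terminal into the conference and, after the called multi-stream terminal has joined, using the called multi-stream terminal as the video synthesizer.
  8. The video conference implementation method according to claim 1, further comprising, after the video synthesizer is determined: when the video synthesizer is a multi-stream terminal, sending to the multi-stream terminal indication information indicating that the multi-stream terminal has been selected as the video synthesizer.
  9. The video conference implementation method according to claim 1, further comprising, after the types of the terminals participating in the video conference are determined: determining synthesized video format information according to a preset rule, the synthesized video format information being used by the video synthesizer to generate the synthesized stream.
  10. The video conference implementation method according to claim 9, further comprising, after the video synthesizer is determined: sending a first instruction to the video synthesizer, the first instruction carrying the synthesized video format information.
  11. The video conference implementation method according to claim 9, wherein the synthesized video format information comprises at least one of the following: synthesized video transmission port, synthesized video codec type, bandwidth, send/receive mode, synthesized video resolution, synthesized video bit rate, synthesized video frame rate.
  12. The video conference implementation method according to claim 8, wherein the method further comprises: when the video synthesizer is a multi-stream terminal, recording the single-stream terminals served by the video synthesizer;
    when all the single-stream terminals served by the multi-stream terminal acting as the video synthesizer have left the conference, sending to the multi-stream terminal a second instruction indicating that the multi-stream terminal is no longer to act as the video synthesizer.
  13. The video conference implementation method according to claim 1, wherein the method further comprises: when the multi-stream terminal acting as the video synthesizer leaves the conference or becomes abnormal, re-determining the video synthesizer.
  14. The video conference implementation method according to claim 13, wherein re-determining the video synthesizer comprises: using a backup video synthesizer as the new video synthesizer, or selecting a video synthesizer from the multi-stream terminals that participate in the video conference and have not been selected as a video synthesizer, or calling in an idle multi-stream terminal as the video synthesizer.
  15. The video conference implementation method according to any one of claims 1 to 14, wherein determining the video synthesizer comprises: when the configured conference mode is a preset mode, using the media server as the video synthesizer;
    the method further comprises: when the terminals participating in the video conference include a multi-stream terminal, sending the synthesized stream generated by the media server to the multi-stream terminal.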
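  16. A video conference implementation method, comprising:
    receiving original streams sent by a media server, generating a synthesized stream from the original streams, and uploading the synthesized stream to the media server, so that the synthesized stream is sent to a single-stream terminal through the media server.
  17. The video conference implementation method according to claim 16, further comprising, before the original streams are received: receiving a join request and joining the video conference.
  18. The video conference implementation method according to claim 16, wherein generating the synthesized stream from the original streams comprises: generating the synthesized stream from the original streams after receiving a first instruction indicating that the present device has been selected as the video synthesizer.
  19. The video conference implementation method according to claim 18, wherein the first instruction carries synthesized video format information;
    generating the synthesized stream comprises: generating the synthesized stream according to the synthesized video format information in the first instruction.
  20. The video conference implementation method according to claim 19, wherein the synthesized video format information comprises at least one of the following: synthesized video transmission port, synthesized video codec type, bandwidth, send/receive mode, synthesized video resolution, synthesized video bit rate, synthesized video frame rate.
  21. The video conference implementation method according to claim 16, wherein the method further comprises: upon receiving a second instruction indicating that the present device is no longer to act as the video synthesizer, stopping generation of the synthesized stream and releasing the related resources.
  22. The video conference implementation method according to any one of claims 16 to 21, wherein the synthesized stream is one synthesized stream or multiple synthesized streams, and the one synthesized stream comprises streams in one or more video formats.
  23. A video conference implementation device, comprising a memory and a processor, wherein the memory stores a program which, when read and executed by the processor, implements the video conference implementation method according to any one of claims 1 to 22.
  24. A computer-readable storage medium, wherein the computer-readable storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the video conference implementation method according to any one of claims 1 to 22.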
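  25. A video conference implementation system, comprising:
    a signaling processing module, configured to handle terminals joining and leaving the conference and to send the information of the terminals that have joined to a media processing component;
    the media processing component, configured to determine, according to the information of the terminals, the types of the terminals participating in the video conference and, when the terminals participating in the video conference include a single-stream terminal, to determine a video synthesizer and send the synthesized stream generated by the video synthesizer to the single-stream terminal, wherein the synthesized stream is synthesized from the original streams of the terminals participating in the video conference.
  26. The video conference implementation system according to claim 25, wherein the media processing component comprises a terminal media processing module and a converged conference processing module, wherein:
    the terminal media processing module is configured to receive the information of joined terminals sent by the signaling processing module, send the information of the terminals to the converged conference processing module, receive original streams, receive the video synthesizer information sent by the converged conference processing module, and, according to the video synthesizer information, send the synthesized stream generated by the video synthesizer to the single-stream terminal;
    the converged conference processing module is configured to determine, according to the information of the terminals, the types of the terminals participating in the video conference and, when the terminals participating in the video conference include a single-stream terminal, to determine a video synthesizer and notify the terminal media processing module of the video synthesizer information.
  27. The video conference implementation system according to claim 26, wherein the converged conference processing module determining the video synthesizer comprises: using a media server and/or a multi-stream terminal as the video synthesizer.
  28. The video conference implementation system according to claim 26, wherein the converged conference processing module determining the video synthesizer comprises: selecting one or more multi-stream terminals from the multi-stream terminals participating in the video conference as the video synthesizer.
  29. The video conference implementation system according to claim 26, wherein the converged conference processing module determining the video synthesizer comprises: calling an idle multi-stream terminal into the conference and, after the called multi-stream terminal has joined, using the called multi-stream terminal as the video synthesizer.
  30. The video conference implementation system according to claim 26, wherein
    the converged conference processing module is further configured to, after the video synthesizer is determined and when the video synthesizer is a multi-stream terminal, send to the terminal media processing module indication information indicating that the multi-stream terminal has been selected as the video synthesizer;
    the terminal media processing module is further configured to send the indication information indicating that the multi-stream terminal has been selected as the video synthesizer to the corresponding multi-stream terminal.
  31. The video conference implementation system according to claim 26, wherein
    the converged conference processing module is further configured to, after a multi-stream terminal is determined to be the video synthesizer, record the single-stream terminals served by the video synthesizer, and, when all the single-stream terminals served by the multi-stream terminal acting as the video synthesizer have left the conference, send to the terminal media processing module a second instruction indicating that the multi-stream terminal is no longer to act as the video synthesizer;
    the terminal media processing module is further configured to send to the multi-stream terminal the second instruction indicating that the multi-stream terminal is no longer to act as the video synthesizer.
  32. The video conference implementation system according to any one of claims 25 to 31, wherein the converged conference processing module is further configured to re-determine the video synthesizer when the multi-stream terminal acting as the video synthesizer leaves the conference or becomes abnormal.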
PCT/CN2019/120230 2018-12-28 2019-11-22 Video conference implementation method and device, video conference system, and storage medium Ceased WO2020134761A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP19902409.2A EP3905668A4 (en) 2018-12-28 2019-11-22 METHOD AND DEVICE FOR IMPLEMENTING VIDEOCONFERENCE, VIDEOCONFERENCING SYSTEM, RECORDING MEDIA

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811627807.2 2018-12-28
CN201811627807.2A CN109660751A (zh) 2018-12-28 2018-12-28 Video conference implementation method and device, video conference system, and storage medium

Publications (1)

Publication Number Publication Date
WO2020134761A1 true WO2020134761A1 (zh) 2020-07-02

Family

ID=66118068

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/120230 Ceased WO2020134761A1 (zh) 2018-12-28 2019-11-22 Video conference implementation method and device, video conference system, and storage medium

Country Status (3)

Country Link
EP (1) EP3905668A4 (zh)
CN (1) CN109660751A (zh)
WO (1) WO2020134761A1 (zh)


Also Published As

Publication number Publication date
CN109660751A (zh) 2019-04-19
EP3905668A4 (en) 2022-11-02
EP3905668A1 (en) 2021-11-03


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19902409

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019902409

Country of ref document: EP

Effective date: 20210728

WWW Wipo information: withdrawn in national office

Ref document number: 2019902409

Country of ref document: EP