WO2020063850A1 - Method for processing media data, terminal, and server - Google Patents

Method for processing media data, terminal, and server

Info

Publication number
WO2020063850A1
Authority
WO
WIPO (PCT)
Prior art keywords
overlay
information
group
overlays
operation function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2019/108514
Other languages
English (en)
Chinese (zh)
Inventor
宋翼
范宇群
邸佩云
王业奎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/485 End-user interface for client configuration

Definitions

  • the embodiments of the present application relate to the technical field of streaming media transmission, and in particular, to a method, terminal, and server for processing media data.
  • The ISO/IEC 23090-2 standard specification is also called the OMAF (Omnidirectional Media Format) standard specification.
  • This specification defines a media application format that can implement omnidirectional media presentation in applications. Omnidirectional media mainly refers to panoramic video (360-degree video) and related audio.
  • The OMAF specification first specifies a list of projection methods that can be used to convert spherical video into two-dimensional video; second, how to use the ISO base media file format (ISOBMFF) to store omnidirectional media and the metadata associated with that media; and third, how to encapsulate and transmit omnidirectional media data in a streaming media system, for example through Dynamic Adaptive Streaming over HTTP (DASH), the dynamic adaptive streaming transmission specified in the ISO/IEC 23009-1 standard.
  • The basic data structure of the overlay defines some basic properties of the overlay structure (for example, the number of overlays, the ID number, the control symbol, and the control structure). The specific function of each structure is defined in the semantics of the control symbol syntax element overlay_control_flag. After the terminal parses the overlay, it can determine how to handle the overlay based on these syntax elements.
  • The embodiments of the present application provide a method, a terminal, and a server for processing media data, so as to reduce the operation complexity that arises when users must perform the same operation on each overlay to achieve a given purpose, make operations on overlays more efficient, and improve the user's subjective experience.
  • An embodiment of the present application provides a method for processing media data, including: a terminal receives at least two overlays corresponding to media data, where each overlay corresponds to first information, or each overlay corresponds to second information and third information. The first information includes the group identification information of the overlay. The second information is used to indicate the operation function corresponding to the overlay. The third information is used to indicate the group identification information of the overlay, or the identification information of other overlays that belong to the same group as the overlay.
  • When the overlay corresponds to the first information, the terminal processes the at least two overlays according to the first information of the at least two overlays; or, when the overlay corresponds to the second information and the third information, the terminal processes the at least two overlays according to the second information and the third information corresponding to the at least two overlays.
  • An embodiment of the present application provides a method for processing media data.
  • a terminal can process one or more overlays having the same group identification information by using first information corresponding to each overlay in at least two overlays.
  • This avoids having to process each overlay one by one, which can reduce the complexity of operations and improve the user's subjective experience.
  • The foregoing terminal may perform the same processing on the overlays belonging to the same group according to the first information of the at least two overlays. For example, it may display all overlays in the same group at group granularity, close all overlays in the same group at group granularity, or scale all overlays in the same group at group granularity.
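The group-granularity processing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the `Overlay` class, its fields, and the helpers `group_overlays` and `apply_to_group` are hypothetical names, and the first information is modeled simply as an integer `group_id`.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Overlay:
    overlay_id: int
    group_id: int          # stands in for the "first information" (assumption)
    visible: bool = False
    scale: float = 1.0


def group_overlays(overlays):
    """Bucket overlays by their group identification information."""
    groups = defaultdict(list)
    for ov in overlays:
        groups[ov.group_id].append(ov)
    return dict(groups)


def apply_to_group(groups, group_id, operation, factor=1.0):
    """Apply one operation to every overlay in the given group."""
    for ov in groups.get(group_id, []):
        if operation == "display":
            ov.visible = True
        elif operation == "close":
            ov.visible = False
        elif operation == "scale":
            ov.scale *= factor
        else:
            raise ValueError(f"unknown operation: {operation}")


overlays = [Overlay(1, 10), Overlay(2, 10), Overlay(3, 20)]
groups = group_overlays(overlays)
apply_to_group(groups, 10, "display")            # both overlays in group 10 appear
apply_to_group(groups, 10, "scale", factor=2.0)  # and are scaled together
```

Overlay 3, which belongs to group 20, is left untouched by both calls; a single user action therefore replaces one action per overlay.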
  • The terminal processing the at least two overlays according to the first information of the at least two overlays includes: the terminal displays at least one group, together with information indicating the operation function corresponding to each of the at least one group and the overlays belonging to each group. The at least one group is determined by the first information corresponding to each of the at least two overlays; the operation function corresponding to a group is determined by the overlay structure included in the overlays in that group.
  • The terminal processing the at least two overlays according to the second information and the third information corresponding to the at least two overlays includes: the terminal displays at least one group, together with information indicating the operation function corresponding to each of the at least one group and the overlays belonging to each group. The at least one group is determined by the third information corresponding to each of the at least two overlays, and the operation function corresponding to a group is determined by the overlay-related area control structure included in the overlays in that group.
  • The terminal may display, on the display interface, at least one group and information indicating the operation function corresponding to each group. It may also display information indicating the overlays in each group, so that the user can easily see the operation function corresponding to each group and which overlays each group contains.
  • The terminal processing the at least two overlays according to the first information of the at least two overlays, or according to the second information and the third information corresponding to the at least two overlays, includes: when any group is triggered, all overlays in that group respond to the operation function of that group.
  • The terminal processing the at least two overlays according to the first information of the at least two overlays, or according to the second information and the third information corresponding to the at least two overlays, includes: when any overlay in any group is triggered, the other overlays that belong to the same group as that overlay also respond to the operation function of that group.
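The trigger propagation described above can be sketched for the variant in which the third information lists the identifiers of the other overlays in the same group. This is only an illustrative sketch: the `Overlay` class, the `peer_ids` field, the string encoding of the operation function, and the `trigger` helper are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Overlay:
    overlay_id: int
    operation: str                                 # "second information" (assumed encoding)
    peer_ids: list = field(default_factory=list)   # "third information" as peer IDs
    visible: bool = False


def trigger(overlays_by_id, triggered_id):
    """Make the triggered overlay and its peers respond to its operation function."""
    source = overlays_by_id[triggered_id]
    members = {triggered_id, *source.peer_ids}
    for oid in sorted(members):
        ov = overlays_by_id.get(oid)
        if ov is None:
            continue  # a listed peer was not received; skip it
        # "display" shows the overlay; any other value (e.g. "close") hides it
        ov.visible = source.operation == "display"
    return sorted(m for m in members if m in overlays_by_id)


overlays = {
    1: Overlay(1, "display", peer_ids=[2]),
    2: Overlay(2, "display", peer_ids=[1]),
    3: Overlay(3, "display"),
}
responded = trigger(overlays, 2)  # triggering overlay 2 also displays overlay 1
```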
  • the operation function is display.
  • the method provided in the embodiment of the present application further includes: when any one of the at least one group is triggered, the overlay in any one of the groups is displayed.
  • The terminal can display all overlays belonging to the same group at group granularity based on the trigger operation.
  • Display is taken here as an example of the operation function of the overlay. In an actual process, the operation function of the overlay may also be size scaling, position change, and the like.
  • The operation function is closing.
  • The method provided in the embodiment of the present application further includes: when any overlay displayed on the terminal is triggered, that overlay and the other overlays that belong to the same group as that overlay are closed.
  • The terminal can close all overlays belonging to the same group at group granularity based on the trigger operation.
  • This avoids having to close each overlay one by one, which can reduce the operation complexity.
  • the overlay also corresponds to fourth information.
  • the fourth information is used to indicate that, when the first operation function is performed on the overlay, all the overlays in the group to which the overlay belongs respond to the first operation function.
  • The terminal processing the at least two overlays according to the first information of the at least two overlays, or according to the second information and the third information corresponding to the at least two overlays, includes: the terminal processes the at least two overlays according to the first information and the fourth information of the at least two overlays; or the terminal processes the at least two overlays according to the second information, the third information, and the fourth information of the at least two overlays.
  • the terminal can determine the overlay for group operation processing.
  • each group in the at least one group further corresponds to indication information used to indicate a group operation.
  • the terminal may further display, according to the fourth information, instruction information indicating the group operation corresponding to each group.
  • the overlay also corresponds to the fifth information.
  • The fifth information is used to indicate that, when the first operation function is performed on the overlay, either all the overlays in the group to which the overlay belongs respond to the first operation function, or only the overlay itself responds to the first operation function.
  • The terminal processing the at least two overlays according to the first information of the at least two overlays, or according to the second information and the third information corresponding to the at least two overlays, further includes: the terminal processes the at least two overlays according to the first information and the fifth information of the at least two overlays; or the terminal processes the at least two overlays according to the second information, the third information, and the fifth information of the at least two overlays.
  • the terminal can determine that the overlay can be processed in a group operation and an individual operation.
  • each group in the at least one group further corresponds to instruction information for indicating a group operation and instruction information for indicating an independent operation. This is convenient for choosing whether to handle overlays in a group operation or in a single operation.
  • the terminal may further display, according to the fifth information, the instruction information indicating the group operation and the instruction information indicating the separate operation corresponding to each group.
  • the terminal may determine whether to operate alone or in groups according to the fourth information and the fifth information corresponding to the overlay.
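The choice between group operation and separate operation can be sketched as a small dispatch function. The sketch below is illustrative only: the `op_mode` strings standing in for the fourth and fifth information, the `user_choice` parameter, and the function name `resolve_targets` are hypothetical and do not come from the application.

```python
def resolve_targets(overlay_id, group_members, op_mode, user_choice="group"):
    """Return the overlay IDs that should respond to an operation.

    op_mode is a hypothetical encoding:
      "group_only"      -> fourth information: the whole group responds
      "group_or_single" -> fifth information: group or separate operation
    """
    if op_mode == "group_only":
        return sorted(group_members)
    if op_mode == "group_or_single":
        if user_choice == "single":
            return [overlay_id]        # separate operation on one overlay
        return sorted(group_members)   # group operation
    raise ValueError(f"unknown op_mode: {op_mode}")


members = {4, 5, 6}
all_targets = resolve_targets(5, members, "group_only")
single_target = resolve_targets(5, members, "group_or_single", "single")
```

With the fourth information only the group result is possible; with the fifth information the same overlay can be handled either way, depending on what the user selects.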
  • When the overlay corresponds to the first information, the overlay also corresponds to sixth information.
  • the sixth information is used to indicate an operation function corresponding to the overlay.
  • The terminal processing the at least two overlays according to the first information of the at least two overlays includes: the terminal processes the at least two overlays according to the first information and the sixth information of the at least two overlays; or the terminal processes the at least two overlays according to the second information, the third information, and the sixth information of the at least two overlays.
  • After the terminal parses the sixth information, it can determine the operation function of each overlay based on the operation function indicated by the sixth information.
  • The operation function of the group may also be determined based on the sixth information.
  • the second information and the third information are carried in an overlay file format.
  • the second information and the third information are carried in the OMAF file format of the overlay.
  • The file format includes an overlay structure, and an overlay-related area control structure and an overlay group box located in the overlay structure. The second information is located in the overlay-related area control structure, and the third information is located in the overlay group box. It should be understood that in this case, the terminal can obtain the overlay-related area control structure and the third information by parsing the overlay structure, then determine the operation function of the overlay according to the overlay-related area control structure, and obtain the group identification information of the overlay according to the third information, so as to determine the group to which the overlay belongs.
  • the third information is located in an overlay control structure included in the overlay structure.
  • the first information is carried in an overlay file format. It should be understood that the first information is located in an overlay group box included in the overlay. At this time, the overlay structure may not carry an overlay-related area control structure. It should be understood that when the first information is carried in the file format, after receiving the overlay, the terminal may obtain the group identification information of the overlay by analyzing the overlay group box of the overlay, and then determine the group to which the overlay belongs according to the group identification information.
  • The third information is carried in supplemental enhancement information (SEI) of the overlay code stream corresponding to the overlay, and the second information is carried in an overlay-related area control structure of the overlay.
  • the overlay structure of the overlay at this time includes an overlay-related area control structure.
  • the operation function corresponding to each overlay can be indicated by the overlay-related area control structure.
  • the overlay-associated area control structure may indicate that the operation function is displaying or closing.
  • the first information is carried in an SEI of an overlay code stream corresponding to the overlay.
  • the overlay structure of the overlay may not include an overlay-related area control structure.
  • the overlay structure may include an overlay control symbol for indicating an operation function corresponding to the overlay.
  • the SEI payload type is used to indicate that the SEI carries the group identification information of the overlay.
  • When the terminal processes the overlay and parses the SEI, it can determine from the payload type that the SEI carries the group identification information of the overlay, and then further parse the SEI to obtain the group identification information of the overlay.
  • The SEI payload type is also used to indicate the attribute of the group.
  • The group attribute indicated by the SEI payload type is either a common display group or a common interaction group. For example, if the group is a common interaction group, an interactive operation may be performed on the group. If the group is a common display group, the overlays in the group can be displayed or closed in a group operation.
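The mapping from SEI payload type to group attribute can be sketched as a lookup table. The text does not give concrete payload type values, so the numbers 200 and 201 below are placeholders chosen for illustration, and the operation names and the `allowed_operations` helper are likewise hypothetical.

```python
from enum import Enum


class GroupAttribute(Enum):
    COMMON_DISPLAY = "common display group"
    COMMON_INTERACTION = "common interaction group"


# Hypothetical payload type values; the real values are not given in the text.
PAYLOAD_TYPE_TO_ATTRIBUTE = {
    200: GroupAttribute.COMMON_DISPLAY,
    201: GroupAttribute.COMMON_INTERACTION,
}


def allowed_operations(payload_type):
    """Return the group operations implied by an SEI payload type."""
    attribute = PAYLOAD_TYPE_TO_ATTRIBUTE.get(payload_type)
    if attribute is GroupAttribute.COMMON_DISPLAY:
        return {"display", "close"}
    if attribute is GroupAttribute.COMMON_INTERACTION:
        return {"interact"}
    return set()  # this SEI message does not carry overlay group information
```

A terminal following this pattern would first check the payload type, and only parse the group identification information out of the SEI when the lookup succeeds.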
  • The third information is carried in a media presentation description (MPD) of a media data stream that includes the overlay, and the second information is carried in an overlay-related area control structure of the overlay.
  • the overlay structure of the overlay at this time includes an overlay-related area control structure.
  • the operation function corresponding to each overlay can be indicated by the overlay-related area control structure.
  • the overlay-associated area control structure may indicate that the operation function is displaying or closing.
  • the first information is carried in an MPD of a media data stream including an overlay.
  • the overlay's overlay structure may not include an overlay-related area control structure.
  • the overlay structure may include an overlay control symbol for indicating an operation function corresponding to the overlay.
  • The first information or the third information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
  • When the first information or the third information is carried in the MPD, the server also needs to send the MPD corresponding to the media data that includes the at least two overlays before sending the code stream to the terminal.
  • the MPD includes first information or third information of each of at least two overlays. This embodiment of the present application does not limit this.
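Reading group identification information from an adaptation-set-level descriptor in an MPD might look like the sketch below. The MPD namespace is the standard DASH one, but the descriptor scheme URI `urn:example:overlay:group`, the use of `SupplementalProperty` as the carrier, and the function name are assumptions made purely for illustration; the text does not specify the descriptor syntax.

```python
import xml.etree.ElementTree as ET

MPD_NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}
# Hypothetical scheme URI for the overlay descriptor (assumption).
OVERLAY_SCHEME = "urn:example:overlay:group"

MPD_XML = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet id="1">
      <SupplementalProperty schemeIdUri="urn:example:overlay:group" value="10"/>
    </AdaptationSet>
    <AdaptationSet id="2">
      <SupplementalProperty schemeIdUri="urn:example:overlay:group" value="10"/>
    </AdaptationSet>
    <AdaptationSet id="3"/>
  </Period>
</MPD>"""


def overlay_groups_from_mpd(xml_text):
    """Map each group ID to the AdaptationSet ids that carry it."""
    root = ET.fromstring(xml_text)
    groups = {}
    for aset in root.findall(".//mpd:AdaptationSet", MPD_NS):
        for prop in aset.findall("mpd:SupplementalProperty", MPD_NS):
            if prop.get("schemeIdUri") == OVERLAY_SCHEME:
                groups.setdefault(prop.get("value"), []).append(aset.get("id"))
    return groups


mpd_groups = overlay_groups_from_mpd(MPD_XML)
```

Because the MPD arrives before the code stream, a terminal could build this grouping before it downloads any overlay data.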
  • the code stream here includes information of at least two overlays. The information of each of the at least two overlays is the information defined in the overlay structure corresponding to the overlay.
  • the group identification information of the overlay includes at least one group identification information. It should be understood that when the overlay corresponds to at least two group identification information, one overlay may belong to at least two groups.
  • The overlay may correspond to multiple groups. When the overlay is triggered, the overlay responds to the operation functions corresponding to the multiple groups respectively, or all overlays in the multiple groups respond to the operation functions corresponding to their respective groups.
  • For example, the overlay belongs to a first group and a second group, and different groups correspond to different operation functions. When the overlay is triggered, all overlays in the first group respond to the operation function corresponding to the first group, and all overlays in the second group respond to the operation function corresponding to the second group.
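The multi-group case can be sketched as follows, for the variant in which every overlay in each affected group responds. The data layout (a mapping from group ID to member IDs and another from group ID to operation function) and the function name `respond_to_trigger` are hypothetical choices for the example.

```python
def respond_to_trigger(triggered_id, membership, group_ops):
    """Return {overlay_id: [operations]} for one trigger.

    membership: group_id -> set of overlay IDs in that group
    group_ops:  group_id -> operation function of that group
    """
    responses = {}
    for group_id in sorted(membership):
        members = membership[group_id]
        if triggered_id not in members:
            continue  # the triggered overlay is not in this group
        for oid in sorted(members):
            responses.setdefault(oid, []).append(group_ops[group_id])
    return responses


membership = {1: {"A", "B"}, 2: {"A", "C"}, 3: {"D"}}
group_ops = {1: "display", 2: "scale", 3: "close"}
result = respond_to_trigger("A", membership, group_ops)
```

Overlay A, which is in both groups, receives both operation functions, while B and C each receive only the function of the group they share with A.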
  • The fifth information carried in the overlay is used to indicate that both group operations and separate operations are available.
  • If the group operation is chosen, the terminal processes the at least two overlays according to the group operation.
  • If the separate operation is chosen, the terminal processes the at least two overlays according to the separate operation. For the specific process, refer to the corresponding description above; details are not repeated here.
  • the overlay in the first group when the overlay in the first group is triggered, the overlay responds to the operation function corresponding to the first group, and the overlay in the second group responds to the operation corresponding to the second group.
  • The file format of the overlay further includes an overlay group box, and the overlay group box carries name information of the overlay group. The method further includes: the terminal displays the group name indicated by the name information of the overlay group.
  • An embodiment of the present application provides a method for processing media data, including: the server obtains the media data; the server processes the media data to obtain at least two overlays corresponding to the media data, where each overlay corresponds to first information, or each overlay corresponds to second information and third information. The first information includes group identification information of the overlay or identification information of other overlays that belong to the same group as the overlay; the second information is used to indicate the operation function corresponding to the overlay; the third information is used to indicate the group identification information of the overlay or identification information of other overlays that belong to the same group as the overlay. The overlays in the same group correspond to the same operation function. The server sends the at least two overlays to the terminal.
  • the processing of the media data by the server may refer to encoding the media data and encapsulating the media data stream obtained after the encoding.
  • After the server obtains the media data, it can process the media data so that the one or more overlays obtained after processing correspond to group identification information, or so that the at least two overlays obtained after processing correspond to group identification information and operation functions.
  • When the terminal obtains the at least two overlays, it can process them based on the first information, or based on the second information and the third information. Since the first information and the third information both indicate group identification information, the terminal can process the at least two overlays at group granularity.
  • the second information and the third information are carried in an overlay file format.
  • the server may encode the media data and use the video file format (for example, the OMAF standard file format) to encapsulate the media data stream obtained after the encoding. Therefore, the second information and the third information are carried in a file format obtained by encapsulating the overlay.
  • the file format includes an overlay structure, an overlay-related area control structure located in the overlay structure, and an overlay group box.
  • the third information is located in the overlay group box, and the second information is located in the overlay association area control structure.
  • the third information is located in an overlay control structure included in the overlay structure.
  • the first information is carried in an overlay file format.
  • The server may encode the media data and use a video file format (for example, the OMAF standard file format) to encapsulate the media data stream obtained after the encoding, so that the first information is carried in the file format obtained after the overlay is encapsulated.
  • the file format includes an overlay group box, and the first information is located in the overlay group box. It should be understood that the file format at this time also includes an overlay structure.
  • The third information is carried in the supplemental enhancement information (SEI) of the overlay code stream corresponding to the overlay, and the second information is carried in the overlay-related area control structure of the overlay.
  • The server may encode the media data to obtain a media data stream, and then encode the one or more overlays included in the media data to obtain an overlay code stream corresponding to each overlay.
  • The server carries the third information in the SEI included in the overlay code stream corresponding to the overlay.
  • The encoded media data stream and the overlay code stream corresponding to each overlay are then encapsulated in a video file format (for example, the OMAF standard file format), so that the second information is carried in the file format obtained by encapsulating the overlay.
  • the first information is carried in an SEI of an overlay code stream corresponding to the overlay.
  • The server may encode the media data to obtain a media data stream, and then encode the one or more overlays included in the media data to obtain an overlay code stream corresponding to each overlay.
  • The server carries the first information in the SEI included in the overlay code stream corresponding to the overlay.
  • The encoded media data stream and the overlay code stream corresponding to each overlay are then encapsulated.
  • the encapsulated overlay has an overlay structure.
  • The payload type of the SEI is used to indicate that the SEI carries the group identification information of the overlay. It should be understood that the payload type of the SEI may also be used to indicate the attributes of the group.
  • the third information is carried in a media presentation description MPD corresponding to the media data stream containing the overlay, and the second information is carried in an overlay-related area control structure of the overlay.
  • The server may encapsulate the overlay based on the Dynamic Adaptive Streaming over HTTP (DASH) protocol to obtain the MPD.
  • the third information is then carried in the MPD.
  • The third information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
  • the first information is carried in a media presentation description MPD corresponding to a media data stream containing an overlay.
  • the encapsulated overlay has an overlay structure.
  • The first information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
  • The overlay also corresponds to fourth information, and the fourth information is used to indicate that, when the first operation function is performed on the overlay, all overlays in the group to which the overlay belongs respond to the first operation function. It should be understood that the terminal can thus determine that the overlays are to be operated on in groups.
  • The overlay also corresponds to fifth information, and the fifth information is used to indicate that, when the first operation function is performed on the overlay, either all overlays in the group to which the overlay belongs respond to the first operation function, or only the overlay responds to the first operation function. It should be understood that, in this way, the terminal can determine that the overlay supports either a group operation or an independent operation.
  • When the overlay corresponds to the first information, the overlay also corresponds to sixth information, and the sixth information is used to indicate an operation function corresponding to the overlay.
  • the file format of the overlay further includes: an overlay group box, and the overlay group box carries name information of the overlay group.
  • An embodiment of the present application provides a terminal, and the terminal includes a module for executing the method in any one of the foregoing implementation manners of the first aspect.
  • A terminal is a device capable of presenting media data (e.g., video images) and/or one or more overlays to a user.
  • an embodiment of the present application provides a server, and the server includes a module for executing a method in any one of the foregoing implementation manners of the second aspect.
  • the server is a device capable of storing media data and processing one or more overlays corresponding to the media data.
  • The server may provide video images and the processed one or more overlays to the terminal, so that the terminal can present the media data and the one or more overlays to the user.
  • A terminal is provided, including a non-volatile memory and a processor coupled to each other, where the processor is configured to call program code stored in the memory to execute some or all of the steps of the method in any implementation manner of the first aspect.
  • A server is provided, including a non-volatile memory and a processor coupled to each other, where the processor is configured to call program code stored in the memory to execute some or all of the steps of the method in any implementation manner of the second aspect.
  • A computer-readable storage medium is provided that stores program code, where the program code includes instructions for performing some or all of the steps of the method in any implementation manner of the first aspect.
  • A computer-readable storage medium is provided that stores program code, where the program code includes instructions for performing some or all of the steps of the method in any implementation manner of the second aspect.
  • a computer program product is provided, and when the computer program product runs on a computer, the computer is caused to execute instructions of some or all steps of the method in any one of the implementation manners of the first aspect.
  • a computer program product is provided, and when the computer program product runs on a computer, the computer is caused to execute instructions of some or all steps of the method in any one of the implementation manners of the second aspect.
  • FIG. 1 is a schematic diagram of a communication system according to an embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of a communication device according to an embodiment of the present application.
  • FIG. 3 is a first schematic flowchart of a media data processing method according to an embodiment of the present application.
  • FIG. 4 is a second schematic flowchart of a method for processing media data according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a display interface according to an embodiment of the present application.
  • FIG. 6 is a schematic diagram of another display interface according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a device for processing media data according to an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of another apparatus for processing media data according to an embodiment of the present application.
  • Panoramic video: also known as 360-degree panoramic video, it is composed of a series of panoramic pictures whose content covers the entire sphere surface in three-dimensional space. It is shot in all directions over 360 degrees using a 3D camera. When watching the video, the user can freely adjust the viewing direction up, down, left, and right.
  • MPD: media presentation description.
  • Track: defined in the standard ISO / IEC 14496-12 as a "timed sequence of related samples (q.v.) in an ISO media file".
  • For media data, a track corresponds to a sequence of images or sampled audio; for hint tracks, a track corresponds to a streaming channel.
  • In other words, a track is a series of timed samples encapsulated according to ISOBMFF, such as a video track.
  • For a video track, the video samples are the code streams that the video encoder generates by encoding each frame; each of them is encapsulated according to the ISOBMFF specification to produce a sample.
  • Sample: in a track other than a hint track, a sample is, for example, an individual frame of video, a series of video frames in decoding order, or a compressed section of audio in decoding order; in a hint track, a sample defines the formation of one or more streaming packets.
  • Box: defined in the ISO / IEC 14496-12 standard as an "object-oriented building block defined by a unique type identifier and length".
  • ISOBMFF files are made up of multiple boxes, and boxes can contain other boxes.
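  • As a rough illustration of the nesting described above, the following Python sketch walks the boxes in a byte buffer. It relies on the ISOBMFF box header layout (a 32-bit big-endian size followed by a four-character type, with size 1 signalling a 64-bit size and size 0 meaning "to the end of the container"). The helper names iter_boxes and make_box and the box types "prnt", "aaaa", and "bbbb" are made up for this example.

```python
import struct

def iter_boxes(data, offset=0, end=None):
    """Yield (box_type, payload_start, payload_end) for each box in data[offset:end]."""
    end = len(data) if end is None else end
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the enclosing container.
            size = end - offset
        if size < header or offset + size > end:
            raise ValueError("malformed box at offset %d" % offset)
        yield box_type.decode("ascii"), offset + header, offset + size
        offset += size

def make_box(box_type, payload=b""):
    """Build a box with a 32-bit size field (hypothetical helper for the demo)."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("ascii")) + payload

# A container box holding two child boxes; the types are invented for the demo.
child_a = make_box("aaaa", b"\x01\x02")
child_b = make_box("bbbb")
parent = make_box("prnt", child_a + child_b)

top = list(iter_boxes(parent))
children = [t for t, _, _ in iter_boxes(parent, top[0][1], top[0][2])]
```

  • Because a container's payload is itself a sequence of boxes, the same function can be applied again to the payload range it returns, which is how the children are listed here.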
  • SEI: supplemental enhancement information, a type of network abstraction layer unit (NALU) defined in the video codec standards (H.264, H.265).
  • NALU: network abstraction layer unit.
  • Overlay: media content superimposed on the background video. Specifically, it can be an additional layer of video or picture rendered over a certain area of the background video picture.
  • The overlay can also be information, such as the name and age of an element displayed on the background video.
  • Background video (background visual media): video on which an overlay can be superimposed. The OMAF standard describes background visual media as a piece of visual media on which an overlay is superimposed.
  • multiple means two or more.
  • "And / or" describes an association between related objects and indicates that three relationships are possible. For example, "A and / or B" can represent: A exists alone, A and B both exist, or B exists alone, where A and B can each be singular or plural.
  • The character "/" generally indicates that the related objects are in an "or" relationship.
  • "At least one of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items.
  • For example, at least one of a, b, or c can represent: a; b; c; a and b; a and c; b and c; or a, b, and c, where a, b, and c can each be single or multiple.
  • Words such as "first" and "second" are used to distinguish between the same or similar items having substantially the same functions and effects. Those skilled in the art can understand that the words "first", "second", and the like do not limit the number or the execution order, and that items qualified by "first" and "second" are not necessarily different.
  • FIG. 1 shows a schematic diagram of a communication system provided by an embodiment of the present application.
  • the communication system includes a server 100 and at least one terminal 200 that communicates with the server 100.
  • the server 100 may be a media server having a function of processing panoramic video.
  • the terminal 200 may be a device having a function of playing a panoramic video.
  • the terminal 200 may be an electronic device such as VR glasses, a mobile phone, a tablet, a television, and a computer that can be connected to a network.
  • the terminal 200 receives the data sent by the media server, and decapsulates the code stream, and decodes and displays it.
  • The server 100 includes a pre-encoding processor 1001, a video encoder 1002, a bitstream encapsulation device 1003, and a sending and transmission device 1004.
  • the pre-encoding processor 1001 performs pre-processing on the panoramic video, such as image stitching, format conversion, etc., to convert the original panoramic video into a video that can be compression-encoded.
  • The video encoder 1002 performs compression encoding or transcoding on the video content obtained from the pre-encoding processor 1001 and outputs the encoded video bitstream.
  • The bitstream encapsulation device 1003 encapsulates the encoded bitstream data into a transportable file, which is transmitted through the network to the terminal or to the content distribution network.
  • the server 100 may select the content to be transmitted for signal transmission according to the information (such as a user perspective) fed back by the terminal 200.
  • The terminal 200 includes a receiving device 2001, a code stream decapsulating device 2002, a video decoder 2003, and a display device 2004.
  • the receiving device 2001 is configured to receive media data sent by the server 100.
  • the code stream decapsulating device 2002 is used for decapsulating the media data received by the receiving device 2001 to obtain a video code stream and code stream information corresponding to the code stream.
  • the video decoder 2003 is used to decode a video code stream and output a video image frame for display and playback.
  • FIG. 2 is a schematic diagram of a hardware structure of an apparatus for processing media data according to an embodiment of the present application.
  • The apparatus for processing media data shown in FIG. 2 may be regarded as a computer device, and may be used as an implementation manner of the server 100 or the terminal 200 in the embodiments of the present application.
  • the apparatus for processing media data includes a processor 110, a memory 120, an input / output interface 130, and a bus 150.
  • the apparatus for processing media data may further include a communication interface 140.
  • The apparatus for processing media data may further include a display 160 for displaying the video data to be played, for example, a background video and one or more overlays.
  • the processor 110, the memory 120, the input / output interface 130, the communication interface 140, and the display 160 implement a communication connection with each other through the bus 150.
  • The processor 110 may use a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits to execute related programs, so as to implement the functions required by the modules in the server in the embodiment of the present application, or to execute the method for processing media data in the method embodiment of the present application.
  • the processor 110 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the above method may be completed by an integrated logic circuit of hardware in the processor 110 or an instruction in the form of software.
  • The aforementioned processor 110 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • The processor can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in combination with the embodiments of the present application may be directly implemented by a hardware decoding processor, or may be performed by using a combination of hardware and software modules in the decoding processor.
  • The software module may be located in a storage medium that is well established in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register.
  • The storage medium is located in the memory 120. The processor 110 reads the information in the memory 120 and, in combination with its hardware, completes the functions required by the modules included in the server in the embodiment of the present application, or performs the method for processing media data in the method embodiment of the present application.
  • the memory 120 may be a read-only memory (Read Only Memory, ROM), a static storage device, a dynamic storage device, or a random access memory (Random Access Memory, RAM).
  • The memory 120 may store an operating system and other application programs. When software or firmware is used to implement the functions required by the modules included in the server in the embodiment of the present application, or the method for processing media data in the method embodiment of the present application, the program code for implementing the technical solution provided in the embodiment of the present application is stored in the memory 120.
  • The processor 110 executes this program code to perform the operations required by the modules included in the server 100, or to execute the method for processing media data provided by the method embodiment of the present application.
  • the input / output interface 130 is used to receive input data and information, and output data such as operation results.
  • the communication interface 140 uses a transceiving device such as, but not limited to, a transceiver to implement communication between a device that processes media data and other devices or a communication network. It can be used as an obtaining module or a sending module in a device for processing media data.
  • the bus 150 may include a path for transmitting information between various components of a device that processes media data, such as the processor 110, the memory 120, the input / output interface 130, and the communication interface 140.
  • It should be noted that although the apparatus for processing media data shown in FIG. 2 shows only the processor 110, the memory 120, the input / output interface 130, the communication interface 140, and the bus 150, those skilled in the art should understand that, in a specific implementation, the apparatus also includes other devices necessary for normal operation. Those skilled in the art should also understand that, according to specific needs, the apparatus for processing media data may further include hardware devices that implement other additional functions, and that it may include only the components necessary to implement the embodiments of the present application, not necessarily all the components shown in FIG. 2.
  • the apparatus for processing media data may further include one or more network cards for forming a session channel between the server 100 and the terminal 200 to transmit media services.
  • the overlay in the embodiments of the present application refers to the overlay media content superimposed on the background layer media content, and the overlay may be separately encoded as the media content or may be a part of the background layer media content. If the overlay is part of the media content of the background layer, the overlay may not be separately encoded, and the media data code stream obtained by the server after the media data is encapsulated will include the overlay information. If the overlay can be separately encoded as media content, the overlay codestream corresponding to each overlay will be obtained.
  • the media content is content displayed by playing media data.
  • In the current OMAF standard document, the basic data structure of the overlay (abbreviated as the overlay structure) and the method of carrying it have been defined, as shown in Table 1 below:
  • The overlay structure shown in Table 1 defines some basic attributes of the overlay, including the number of overlays (num), identification information (for example, an ID number), overlay control flags, the overlay control structure, and so on.
  • The value of the overlay control flag syntax element overlay_control_flag can be used to indicate the function of the overlay control structure.
  • The semantics of overlay_control_flag cover the overlay's associated source, hierarchical order, transparency, user operation information, flags, and priorities, as shown in Table 2:
  • An interactive control structure (OverlayInteraction control structure) is defined. It can be understood that the OverlayInteraction control structure is one of the overlay control structures. The OverlayInteraction control structure contains the types of interaction through which the user may operate the overlay. The structure is shown in Table 3:
  • Table 4 lists only some operation functions. In practice, the overlay may also have other operation functions.
  • FIG. 3 shows a schematic flowchart of a method for processing media data according to an embodiment of the present application.
  • the method includes:
  • Step 101: The server obtains media data.
  • the foregoing media data may be a video image, for example, a panoramic video.
  • the one or more overlays corresponding to the media data may be one or more overlays displayed on the media data.
  • the overlay layer may be a video or a picture displayed on the media data.
  • the picture overlaid on it may be a name or an age.
  • Step 102: The server processes the media data to obtain at least two overlay layers corresponding to the media data.
  • the overlay layer is a video, image, or text that is used to be superimposed on a background video or a background image for display.
  • processing of media data includes operations such as preprocessing, encoding, and encapsulation of the media data.
  • In one example, the overlay corresponds to the first information.
  • The first information includes group identification information of the overlay or identification information of other overlays that belong to the same group as the overlay.
  • The identification information of other overlays is used to determine the other overlays that belong to the same group as the overlay.
  • For example, if the identification information of other overlays corresponding to overlay1 is overlay2 and overlay3, then overlay1, overlay2, and overlay3 belong to the same group.
  • In another example, the overlay corresponds to the second information and the third information.
  • the first information and the third information are used to determine a group of the overlay, respectively.
  • the second information is used to indicate an operation function corresponding to the overlay.
  • the third information is used to indicate group identification information of the overlay or identification information of other overlays that belong to the same group as the overlay.
  • the group identification information of the overlay in the same group is the same.
  • In one interpretation, overlays in the same group having the same operation functions means that the operation functions of all overlays included in the group are exactly the same.
  • For example, overlay1 and overlay2 belong to group 1, and the operation functions of both overlay1 and overlay2 include rotation and window resizing.
  • In another interpretation, it means that all overlays included in the group have at least one operation function in common.
  • For example, the operation functions of overlay1 include rotation and window resizing, while the operation functions of overlay2 include only rotation. In this case, overlay1 and overlay2 can also be placed in group 1.
  • the group identification information is used to determine a group to which the overlay belongs.
  • the group identification information may be a group ID or a group name, which is not limited herein.
  • each overlay in the embodiment of the present application includes an overlay structure, and the overlay structure includes indication information for indicating an overlay operation function.
  • The operation function can be determined through the OverlayInteraction control structure, for example, rotation, freely selectable depth, or a resizable window.
  • The server encodes the media data to obtain one or more overlays included in the media data stream, and then, when encapsulating the media data stream, determines at least one operation function that each overlay has. For any two or more overlays that have at least one operation function in common, the server may set their group identification information to be the same. For example, if the operation function corresponding to both overlay1 and overlay2 is rotation, the server may use the first information or the third information corresponding to overlay1 and overlay2 to indicate group 1.
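  • The grouping rule above (overlays that share at least one operation function receive the same group identification information) can be sketched in Python as follows. This is only an illustration under assumptions: the function name group_by_shared_operation, the operation labels, and the numeric group IDs are hypothetical and are not defined by the application or by OMAF.

```python
from collections import defaultdict

def group_by_shared_operation(overlay_ops):
    """Map each operation function to the overlays that support it.

    Only operation functions shared by two or more overlays form a group.
    """
    members = defaultdict(list)
    for overlay_id, ops in overlay_ops.items():
        for op in sorted(ops):
            members[op].append(overlay_id)
    return {op: ids for op, ids in members.items() if len(ids) >= 2}

# Operation functions per overlay; the labels are illustrative only.
overlay_ops = {
    "overlay1": {"rotation", "resize"},
    "overlay2": {"rotation"},
    "overlay3": {"change_position"},
}

groups = group_by_shared_operation(overlay_ops)
# Assign numeric group IDs in a stable order (an assumption of this sketch).
group_ids = {op: i + 1 for i, op in enumerate(sorted(groups))}
```

  • Here overlay1 and overlay2 share rotation and therefore end up in the same group, while overlay3 shares nothing and stays ungrouped. This corresponds to the "at least one operation function in common" interpretation; the stricter "all operation functions identical" interpretation would instead compare the full sets.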
  • the overlay control structure also defines a control structure of an area (eg, a spherical area) associated with the overlay, which is used to indicate that when an area in a video image is triggered, the overlay display associated with the area can be triggered.
  • The following uses an overlay associated area control structure (AssociatedSphereRegionStruct) as an example of the control structure of the area associated with the overlay.
  • The syntax of AssociatedSphereRegionStruct is shown in Table 5 below:
  • SphereRegionStruct (1) in Table 5 defines a spherical area associated with the overlay.
  • The user can click the spherical area to trigger the overlay associated with the spherical area to be displayed or closed.
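  • A minimal Python sketch of this trigger behaviour is given below. Since Table 5 is not reproduced here, the field names (centre_azimuth, centre_elevation, azimuth_range, elevation_range) are assumptions modelled on a sphere region expressed in degrees, tilt is ignored, and the SphereRegion and Overlay classes and the on_click function are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SphereRegion:
    centre_azimuth: float    # degrees
    centre_elevation: float  # degrees
    azimuth_range: float     # full width in degrees
    elevation_range: float   # full height in degrees

    def contains(self, azimuth, elevation):
        # Wrap the azimuth difference into [-180, 180) so that regions
        # crossing the +/-180 degree seam are handled.
        d_az = (azimuth - self.centre_azimuth + 180.0) % 360.0 - 180.0
        d_el = elevation - self.centre_elevation
        return (abs(d_az) <= self.azimuth_range / 2
                and abs(d_el) <= self.elevation_range / 2)

@dataclass
class Overlay:
    overlay_id: int
    region: SphereRegion
    shown: bool = False

def on_click(overlays, azimuth, elevation):
    """Toggle every overlay whose associated region contains the clicked direction."""
    toggled = []
    for ov in overlays:
        if ov.region.contains(azimuth, elevation):
            ov.shown = not ov.shown
            toggled.append(ov.overlay_id)
    return toggled

overlays = [
    Overlay(1, SphereRegion(170.0, 0.0, 40.0, 20.0)),
    Overlay(2, SphereRegion(0.0, 0.0, 20.0, 20.0)),
]
first = on_click(overlays, -175.0, 5.0)   # inside region 1, across the seam
state_after_first = overlays[0].shown
second = on_click(overlays, -175.0, 5.0)  # same click again closes it
miss = on_click(overlays, 90.0, 0.0)      # outside both regions
```

  • The first click displays overlay 1 and the second closes it again, matching the display-or-close behaviour described above; a real player would test the region on the sphere with tilt taken into account.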
  • The above-mentioned area of the overlay may refer to the area exactly covered or occupied by the overlay; that is, all media data in the area of the overlay belongs to the overlay, and all media data of the overlay lies within that area.
  • The above-mentioned area spatial information of the overlay may also be referred to as the spatial information of the area of the overlay.
  • The area spatial information of the overlay indicates the spatial range or spatial position of the area associated with the overlay. In this way, when the user is watching a video image, the overlay associated with an area can be displayed in the video image by triggering that area.
  • The spatial position of the area associated with the overlay may be expressed relative to a coordinate system, which may be a three-dimensional coordinate system or a two-dimensional coordinate system.
  • The origin of the three-dimensional coordinate system may be the center point of the panoramic video image, the point in the upper left corner of the panoramic video image, or another fixed point in the panoramic video image.
  • The spatial position of the area associated with the overlay may also be the position of the overlay within the panoramic video image area.
  • Scenario 2: Based on the interactions defined in the OverlayInteraction control structure, one or more overlays that have a certain type of interaction can be added to a group. This group may be called an interaction group; it should be understood that the group used for interactive operations may also have another name. This enables the terminal to perform a certain type of defined operation function, such as an interactive operation, based on a trigger operation on the interaction group. Exemplarily, the interactive operations may be those shown in Table 4, and details are not described herein again. In this case, each of the one or more overlays corresponds to the first information.
  • Step 102 in the embodiment of the present application may be specifically implemented in the following manner:
  • S1. The server encodes the media data to obtain a media data code stream corresponding to the media data.
  • S2. The server encapsulates the media data stream obtained after encoding.
  • The encapsulated media data stream includes information of one or more overlays, together with either the first information corresponding to each of the one or more overlays, or the second information and the third information corresponding to each overlay.
  • Each overlay in the one or more overlays corresponds to a file format.
  • the overlay in the following embodiments may be part of the media content (ie, media data) of the background layer, and the overlay may not be separately encoded at this time. That is, when the server encodes the media data, the obtained media data stream includes one or more overlays. The server may then encapsulate the media data stream including one or more overlays. For example, make the encapsulated media data stream correspond to the file description. Or the server encapsulates the media data stream so that one or more overlays included in the media data stream have an overlay structure.
  • the overlay can also be separately encoded as media content.
  • the server encodes the media data to obtain a media data stream, and then encodes the overlay included in the media data to obtain an overlay code stream.
  • the server encapsulates the media data stream and the overlay code stream, the encapsulated media data stream has overlay information.
  • the overlay information may be an overlay structure.
  • the second information and the third information may be carried in a file format of an overlay that encapsulates a media data stream including the overlay.
  • the file format includes: an overlay structure, and an overlay-related area control structure and an overlay group box located in the overlay structure.
  • the third information is located in the overlay group box, and the second information is located in the overlay associated area control structure.
  • the second information may be an overlay associated area control structure.
  • The operation function indicated by the second information may be display or closing.
  • the first information may be carried in the file format of the overlay. That is, one or more overlay file formats obtained after encapsulation may not have an overlay-related area control structure.
  • the file format includes: overlay group box.
  • the first information is located in an overlay group box. It should be understood that in scenario 2, the file format may also include an overlay structure.
  • the server may encapsulate the media data stream including one or more overlays according to the OMAF standard file format.
  • the file format of each overlay in the one or more overlays has an overlay control structure.
  • the overlay structure may have an overlay-related area control structure.
  • the overlay structure may be provided with an OverlayInteraction control structure.
  • the server may add a box corresponding to the overlay control region control structure to the file format, so that the overlay file format has an overlay control structure.
  • Entity groups are defined for multiple overlays.
  • One type of group is, for example, a switching group: multiple overlays in a switching group can be switched between one another.
  • the specific syntax for switching groups is shown in Table 6 below:
  • ref_overlay_id [i] represents identification information of other overlays that belong to the same group as an overlay.
  • In the overlay structure of each overlay, the server can also define the identification information of the other overlays that belong to the same group as the overlay, in place of the above-mentioned group identification information.
  • the identification information of the overlay is used to identify the overlay.
  • the identification information may be an ID number of the overlay.
  • overlay1 belongs to group1 and overlay2 also belongs to group1. Therefore, the identification information of group1 and the identification information of overlay2 can be defined in overlay1. In overlay2, identification information of group 1 and identification information of overlay1 can be defined.
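  • The overlay1 / overlay2 example above can be sketched in Python as follows, showing how per-overlay reference lists could be derived from group membership and how a receiver could recover the group from a single overlay's list. The function names refs_from_groups and same_group and the dictionary layout are hypothetical; only the ref_overlay_id idea comes from the text.

```python
def refs_from_groups(groups):
    """For each overlay, record its group and the IDs of the other group members."""
    refs = {}
    for group_id, members in groups.items():
        for overlay_id in members:
            refs[overlay_id] = {
                "group_id": group_id,
                "ref_overlay_id": [m for m in members if m != overlay_id],
            }
    return refs

def same_group(refs, overlay_id):
    """Overlays grouped with overlay_id, recovered from its reference list alone."""
    return sorted([overlay_id] + refs[overlay_id]["ref_overlay_id"])

refs = refs_from_groups({"group1": ["overlay1", "overlay2"]})
```

  • With this layout, either the group identifier or the reference list is enough to determine membership, which is why the text treats them as alternatives.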
  • There may be an entity group and an overlay structure in the file format corresponding to each overlay.
  • The entity group is carried in an entity group box (EntityToGroupBox).
  • the file format in the embodiment of the present application further includes an overlay group box.
  • The overlay group box is used to indicate that, when an operation function is triggered on any one overlay in the overlay group box, all overlays in the overlay group box respond to this operation function.
  • the overlay group box can be defined as an OverlayConditionalShownGroupBox, which means that a group of overlays can be displayed or closed together when the user targets and triggers a certain overlay.
  • an overlay group box with an OverlayInteraction control structure may be an OverlayRelationGroupBox.
  • OverlayConditionalShownGroupBox and OverlayRelationGroupBox in the overlay group box in the embodiment of the present application may also have other names, which are not limited in the embodiment of the present application.
  • The group may be named a common display group.
  • The common display group indicates that multiple overlays in the group may be displayed or closed together.
  • The first prompt information corresponding to the group is used to indicate that the group can be expanded or closed together.
  • It can be understood that this is only an example; the group may also have another name, which is not limited in the embodiment of the present application.
  • The overlay group box in the embodiment of the present application may be an OverlayConditionalShownGroupBox, which means that a group of overlays can be shown together when the user triggers display for any overlay in the group or for the group itself.
  • the third information may be carried in the OverlayConditionalShownGroupBox.
  • the ref_overlay_id [i] in the above Table 7 indicates that the overlay_id corresponding to the track or image item indicated by the i-th entity_id is an overlay that can be displayed under the trigger of the user in this group. There will be an overlay_id corresponding to ref_overlay_id [i] in the referenced i-th track or image item.
  • the ref_overlay_id [i] syntax element in the structure is also allowed to exist.
  • Example 2-1 Taking interactive groups as an example, the file format of each overlay in one or more overlays obtained by the server after processing the media data has the group identification information of the overlay.
  • the overlay group box in the embodiment of the present application may be an OverlayRelationGroupBox, which is used to form multiple overlays into an interaction group, and all overlays in the interaction group may have the same interaction operation. At this time, the first information is carried in the OverlayRelationGroupBox.
  • An interaction group specifies that the interactive operation indicated by a certain type of operation function can be performed on all overlays in the interaction group.
  • the other overlays in the OverlayRelationGroupBox also respond to the operation function corresponding to the OverlayRelationGroupBox.
  • The specific syntax is shown in Table 8:
  • The OverlayInteraction control structure contains the interaction information syntax elements that apply when multiple overlays form an interaction group (OverlayRelationGroupBox). If any overlay in the interaction group is triggered, the operation functions defined in the OverlayInteraction control structure are applied to every overlay in the interaction group together.
  • Take as an example an OverlayRelationGroupBox that defines an operation function for scaling all overlays in the OverlayRelationGroupBox.
  • overlayA, overlayB, and overlayC in the OverlayRelationGroupBox form interaction group 1, and resize_flag = 1 in the OverlayInteraction control structure corresponding to each of overlayA, overlayB, and overlayC.
  • When scaling is triggered on any one of them, overlayA, overlayB, and overlayC are all scaled.
  • Similarly, if the OverlayRelationGroupBox defines an operation function for changing the position of all overlays in the OverlayRelationGroupBox, then change_position_flag = 1 in the OverlayInteraction control structure corresponding to each of overlayA, overlayB, and overlayC.
  • When a position change is triggered on any one of them, overlayA, overlayB, and overlayC all perform the position change operation.
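  • The group-wide behaviour in this example can be sketched in Python as follows. The flag names resize_flag and change_position_flag come from the text; the InteractiveOverlay class, the trigger function, and the scale and position fields are assumptions made for illustration and do not reflect any particular player implementation.

```python
from dataclasses import dataclass

@dataclass
class InteractiveOverlay:
    name: str
    resize_flag: int = 0
    change_position_flag: int = 0
    scale: float = 1.0
    position: tuple = (0.0, 0.0)

def trigger(group, operation, value):
    """Apply an operation triggered on one member to every overlay in the
    group whose corresponding flag is set; return the names affected."""
    applied = []
    for ov in group:
        if operation == "resize" and ov.resize_flag == 1:
            ov.scale *= value
            applied.append(ov.name)
        elif operation == "change_position" and ov.change_position_flag == 1:
            ov.position = (ov.position[0] + value[0], ov.position[1] + value[1])
            applied.append(ov.name)
    return applied

# Interaction group 1 from the example: all three overlays allow resizing.
group1 = [
    InteractiveOverlay("overlayA", resize_flag=1),
    InteractiveOverlay("overlayB", resize_flag=1),
    InteractiveOverlay("overlayC", resize_flag=1),
]
resized = trigger(group1, "resize", 2.0)
moved = trigger(group1, "change_position", (1.0, 0.0))
```

  • In this sketch, a resize triggered on any member scales all three overlays, while the position change has no effect because change_position_flag is 0 for every member of the group.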
  • the overlay in the embodiment of the present application can be displayed together with the background video, and the overlay and the background video can be bound for common display.
  • the syntax structure of overlay and background video is shown in Table 9:
  • Step 102 in the embodiment of the present application may be specifically implemented in the following manner:
  • S3. The server encodes the media data to obtain a media data stream, and encodes one or more overlays included in the media data to obtain an overlay code stream corresponding to each overlay, where each overlay code stream includes an SEI.
  • S4. The server encapsulates the media data stream and the overlay code stream corresponding to each overlay to obtain a media data stream including information of one or more overlays.
  • the server can also separately encapsulate the overlay code stream. Then send the encapsulated overlay code stream to the terminal.
  • the SEI payload type is used to indicate that the SEI carries overlay group identification information.
  • the third information may be carried as an indication field in the SEI of the overlay code stream.
  • the first information may be used as an SEI of an overlay code stream.
  • the SEI has an indication field for indicating the group identification information of the overlay.
  • the encapsulated overlay may also include: an overlay associated area control structure.
  • the specific encapsulation process can refer to the above S2, which will not be repeated here.
  • the third information is carried in the SEI of the overlay code stream corresponding to the overlay, and the second information is carried in the overlay associated area control structure of the overlay.
  • When carrying the group identification information, the SEI corresponding to scenario 1 may be named a common group, so that the second information need not be carried; that is, the overlay associated area control structure is not defined in the overlay structure of the overlay.
  • the first information may be carried as an indication field in the SEI of the overlay code stream corresponding to the overlay.
  • the encapsulated overlay may not have an overlay-related area control structure.
  • the operation function corresponding to each overlay in the group can be used to name the group.
  • the first information is carried in the SEI of the overlay code stream corresponding to the overlay.
  • the SEI is used to indicate group identification information of the overlay.
  • the syntax structure of the SEI is shown in Table 10:
  • the sei_payload in Table 10 defines the SEI payload information, including two parameters payloadType and payloadSize. Among them, the payloadType indicates the type of the SEI, and the payloadSize indicates the size of the SEI.
  • OLG in Table 10 is a variable, which represents the value of the payloadType of an SEI.
  • the value of OLG may be 190.
  • payloadSize indicates the payload size.
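  • As an illustration of how a terminal might recognise such an SEI, the following minimal sketch reads payloadType and payloadSize using the 0xFF-extended byte coding of SEI messages in H.264/HEVC and checks the type against OLG. It assumes that emulation prevention bytes have already been removed. The function names are hypothetical, and the payload is returned undecoded because the layout given in Tables 10 and 11 is not reproduced here.

```python
OLG = 190  # example payloadType value given in the text for overlay group SEI


def parse_sei_message(data: bytes, pos: int = 0):
    """Return (payload_type, payload_bytes, next_pos) for one SEI message."""

    def read_ff_coded(p: int):
        # Each 0xFF byte adds 255; the first non-0xFF byte ends the value.
        value = 0
        while data[p] == 0xFF:
            value += 255
            p += 1
        return value + data[p], p + 1

    payload_type, pos = read_ff_coded(pos)
    payload_size, pos = read_ff_coded(pos)
    payload = data[pos:pos + payload_size]
    if len(payload) != payload_size:
        raise ValueError("truncated SEI payload")
    return payload_type, payload, pos + payload_size


def carries_overlay_group(payload_type: int) -> bool:
    """True if the SEI payload type is the overlay group type (OLG)."""
    return payload_type == OLG
```

  • For instance, `parse_sei_message(bytes([190, 2, 0, 7]))` yields a payload type equal to OLG and a two-byte payload that a real parser would then decode according to Table 11.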
  • an overlay group can be represented as an overlay condition display group (overlay_conditional_shown_group).
  • the group identification information of the overlay in Table 10 may be replaced with overlay_conditional_shown_group_info (information).
  • overlay_conditional_shown_group_info information
  • Table 11 the syntax structure of overlay_conditional_shown_group_info
  • overlay_conditional_shown_group_id This value indicates the ID number of the group of the overlay.
  • the overlay group can be overlay_relation_group, and overlay_relation_group_info can be used to replace the overlay group identification information in Table 10.
  • the above interactive operations may refer to a common operation on a certain type of operation function, or a common operation on all operation functions supported by the overlay, which are not limited in the embodiments of the present application.
  • step S102 in the embodiment of the present application may be specifically implemented in the following manner: S5. Encode the media data to obtain a media data stream including one or more overlays. S6. The server encapsulates a media data stream including one or more overlays, and obtains a description file corresponding to the media data stream.
  • S6 may be specifically implemented in the following manner:
  • the server may encapsulate a media data stream including one or more overlays based on a DASH transmission protocol standard to obtain a media presentation description MPD of the media data stream as a description file.
  • the overlay descriptor at the adaptation set level or the representation level of the MPD carries the group identification information of the overlay.
  • each overlay in the one or more overlays corresponds to the second information and the third information.
  • the description file includes at least third information of each overlay in one or more overlays, and the third information may be carried as an indication field in the description file of the media data stream. It should be understood that after the media data stream is encapsulated, one or more overlays included in the media data stream have an overlay associated area control structure. The specific encapsulation process can refer to the above S2, which will not be repeated here.
  • the third information is carried in an MPD corresponding to a media data stream obtained by encapsulating a media data stream including one or more overlays, and the second information is carried in an overlay-related area control structure of the overlay.
  • corresponding to scenario 1 can also be replaced in the following manner, that is, the overlay group in the description file containing the media data of the overlay is named after the operation function of the overlay. Has an overlay associated area control structure.
  • each overlay in one or more overlays corresponds to the first information.
  • the first information may be carried as an indication field in a description file containing the media data of the overlay. It can be understood that the name of the group can also be named by the operation function of the code stream.
  • the first information is carried in a media presentation description MPD of the media data obtained by encapsulating the media data including the overlay.
  • the first information or the third information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
  • Example 5-1 Take the operation function being display or close as an example. A new @schemeIdUri can be defined for the overlay descriptor, with the value "urn:mpeg:mpegI:omaf:2018:ocsg"; its semantics are those of the overlay common display group (OCSG) descriptor. At most one OCSG descriptor is allowed to appear at the adaptation set level or the representation level.
  • the value of the OCSG descriptor is a comma-separated string.
  • the specific values and semantics are defined in Table 12 below:
  • M represents a required parameter
  • O represents an optional parameter.
  • An adaptation set that has the same overlay_relation_group_id value belongs to the same interaction group.
  • the values in the adaptation set that belong to different groups can be different.
  • Table 13 shows an example of an MPD carrying a group indicating an overlay as a common display group:
  • Each of the two common display groups shown in Table 13 includes: two overlays.
  • common display group 1 includes overlay1 and overlay2, and common display group 2 includes overlay3 and overlay4.
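  • Since Tables 12 and 13 are not reproduced here, the following is only a sketch of how a terminal might read the group ID from an OCSG descriptor in an MPD. It assumes that the descriptor is carried as a SupplementalProperty element at the adaptation set level, that the first comma-separated field of its value is the group ID, and that the scheme URI is written without spaces; the sample MPD and the adaptation set IDs are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DASH_NS = "urn:mpeg:dash:schema:mpd:2011"
OCSG_SCHEME = "urn:mpeg:mpegI:omaf:2018:ocsg"  # assumed spelling without spaces

# Hypothetical MPD fragment: four overlay adaptation sets in two groups.
SAMPLE_MPD = f"""<MPD xmlns="{DASH_NS}"><Period>
  <AdaptationSet id="overlay1"><SupplementalProperty schemeIdUri="{OCSG_SCHEME}" value="1"/></AdaptationSet>
  <AdaptationSet id="overlay2"><SupplementalProperty schemeIdUri="{OCSG_SCHEME}" value="1"/></AdaptationSet>
  <AdaptationSet id="overlay3"><SupplementalProperty schemeIdUri="{OCSG_SCHEME}" value="2"/></AdaptationSet>
  <AdaptationSet id="overlay4"><SupplementalProperty schemeIdUri="{OCSG_SCHEME}" value="2"/></AdaptationSet>
</Period></MPD>"""


def common_display_groups(mpd_xml: str) -> dict:
    """Map each group ID to the adaptation sets that carry it."""
    root = ET.fromstring(mpd_xml)
    groups = defaultdict(list)
    for aset in root.iter(f"{{{DASH_NS}}}AdaptationSet"):
        for prop in aset.findall(f"{{{DASH_NS}}}SupplementalProperty"):
            if prop.get("schemeIdUri") == OCSG_SCHEME:
                # Assumption: the group ID is the first comma-separated field.
                group_id = int(prop.get("value").split(",")[0])
                groups[group_id].append(aset.get("id"))
    return dict(groups)
```

  • Applied to the sample, the function places overlay1 and overlay2 in group 1 and overlay3 and overlay4 in group 2, matching the grouping described for Table 13.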
  • Example 6-1 Take the common interaction group of the group where the overlay is located as an example.
  • the server can define a new @schemeIdUri for the overlay descriptor, with the value "urn:mpeg:mpegI:omaf:2018:ovly". Its semantics are those of the overlay common interaction grouping information (OVLY) descriptor, which describes a group whose overlays jointly perform some kind of interaction, for example moving the position of the overlay. At most one OVLY descriptor can appear at the adaptation set level or the representation level.
  • OVLY overlay common interaction grouping information
  • the value of the OVLY descriptor is a string separated by commas.
  • the specific values and semantic definitions are shown in Table 14 below:
  • adaptation sets having the same overlay_relation_group_id value belong to the same common interaction group, and the values in the adaptation sets belonging to different groups must be different.
  • Table 15 shows an example of an MPD carrying a group indicating that the overlay is a common interaction group:
  • the common interaction group 1 includes overlay1 and overlay2
  • common interaction group 2 includes overlay3 and overlay4.
  • the server may determine that one or more overlays having the same operation function belong to the same group. In other words, if two or more overlays have the same operation function, they can be placed in the same group, and the group can be named after the operation that the overlays have in common. The group can then correspond to an operation option, which prompts the operation functions that the overlays in the group have in common.
  • the server may carry identification information indicating group 1 in overlay1 and overlay2. It should be understood that the operation option corresponding to group 1 is used to indicate the operation function shared by overlay1 and overlay2.
  • the server may determine the respective operation function of each overlay through the respective overlay control structure of each overlay.
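  • The server-side grouping rule described above can be sketched as follows. This is an illustrative sketch only: the function name, the dictionary layout, and the operation names are assumptions, and each group is named after the operation function that its overlays share.

```python
from collections import defaultdict


def group_by_operation(overlay_ops: dict) -> dict:
    """Group overlays that share an operation function.

    overlay_ops maps an overlay ID to the set of operation functions
    read from that overlay's control structures (hypothetical input).
    """
    groups = defaultdict(list)
    for overlay_id, ops in overlay_ops.items():
        for op in sorted(ops):
            groups[op].append(overlay_id)
    # Keep only operations shared by two or more overlays.
    return {op: ids for op, ids in groups.items() if len(ids) >= 2}
```

  • With overlay1 supporting rotation and scaling, overlay2 supporting rotation, and overlay3 supporting move, only a "rotation" group containing overlay1 and overlay2 is formed.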
  • Step 103 The server sends one or more overlays to the terminal.
  • the server may send one or more overlays to the terminal through a sending device.
  • the server may directly send the processed one or more overlays to the terminal. It is also possible to send the processed one or more overlays after receiving a request message for requesting an overlay sent by the terminal.
  • the first information, the second information, and the third information are included in the overlay.
  • the first information, the second information, and the third information are included in the MPD file.
  • if the server uses the third possible implementation manner to process the media data, it should be understood that, in S103, in addition to the one or more overlays, the server also sends the MPD corresponding to the one or more overlays to the terminal.
  • the MPD corresponding to one or more overlays includes information of each overlay.
  • when the first information or the third information is carried in the SEI, it may be carried in the SEI of the overlay code stream corresponding to the overlay. If the server sends an overlay code stream to the terminal, the terminal can display the overlay when decoding and playing the overlay code stream.
  • Step 104 The terminal receives one or more overlays sent by the server.
  • the terminal may receive one or more overlays sent by the server through a receiving device.
  • the one or more overlays sent by the server may be implemented in the following manner: the server sends the encapsulated media data stream and the one or more overlays included in the media data stream to the terminal. Or, the server sends an overlay code stream corresponding to each overlay in the encapsulated one or more overlays to the terminal.
  • the terminal also needs to receive the MPD of the media data stream that includes the one or more overlays.
  • the terminal can obtain the group identification information from the overlay group box in the file format by analyzing the file format of the overlay.
  • Step 105 When the overlay corresponds to the first information, the terminal processes at least two overlays according to the first information of the at least two overlays; or, when the overlay corresponds to the second information and the third information, the terminal processes the at least two overlays according to the second information and the third information corresponding to the at least two overlays.
  • S105 may be implemented in the following manner: After receiving one or more overlays sent by the server, the terminal decapsulates the overlays to obtain the first information corresponding to the one or more overlays, or to obtain the second information and the third information corresponding to the one or more overlays. Then, when the terminal decodes and plays the media data, the client configuration or user interface prompt can include an operation option corresponding to the overlays in the same group, which prompts the operation functions that all overlays in the group perform in common.
  • the terminal may determine each group of each overlay according to the first information corresponding to each overlay, and then may determine all overlays belonging to the same group.
  • the server may determine the interactive operation corresponding to each overlay according to the OverlayInteraction control structure of each overlay.
  • the terminal may determine a respective group of each overlay according to the third information corresponding to each overlay. You can then determine all overlays that belong to the same group. The terminal can determine the corresponding display or close operation function of each overlay according to the AssociatedSphereRegionStruct of each overlay.
  • the terminal may determine all overlays belonging to the same group in the following manner: The terminal divides the overlays with the same group identification information into the same group according to the group identification information corresponding to each overlay.
  • the terminal may determine that there are two groups, that is, group 1 and group 2.
  • the terminal may determine all overlays belonging to the same group in the following manner: according to the identification information of an overlay and the identification information of the other overlays indicated for that overlay, the terminal places that overlay and the other overlays in the same group.
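  • The second manner, in which each overlay lists the overlays it is grouped with, can be sketched with a small union-find. This is a minimal illustration under the assumption that the references are available as a mapping from an overlay ID to its list of ref_overlay_id values; the function name and data layout are hypothetical.

```python
def groups_from_refs(refs: dict) -> list:
    """Merge overlays into groups from per-overlay reference lists.

    refs maps an overlay ID to the IDs of the other overlays that are
    indicated as belonging to the same group.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for overlay_id, others in refs.items():
        find(overlay_id)
        for other in others:
            parent[find(other)] = find(overlay_id)

    groups = {}
    for x in parent:
        groups.setdefault(find(x), set()).add(x)
    return sorted(sorted(g) for g in groups.values())
```

  • The first manner (identical group identification information) reduces to bucketing overlays by their group ID; the sketch above covers the case where only cross-references between overlays are signalled.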
  • An embodiment of the present application provides a method for processing media data.
  • a terminal can process one or more overlays having the same group identification information by using first information corresponding to each overlay in at least two overlays.
  • this avoids having to process each overlay one by one, which can reduce the complexity of operations and improve the user's subjective experience.
  • the method provided in this embodiment of the present application further includes:
  • Step 106 The terminal displays at least one group, and information indicating an operation function corresponding to each group in the at least one group and an overlay belonging to each group.
  • At least one group is determined by first information corresponding to each overlay in at least two overlays; an operation function corresponding to one group is determined by an overlay structure included in the overlay in each group.
  • At least one group is determined by third information corresponding to each overlay in the at least two overlays, and an operation function corresponding to one group is determined by the overlay associated area control structure included in the overlays in the group.
  • the terminal may also decode and play the received media data stream to display the media data.
  • the at least one group may be displayed overlaid on the media data.
  • Step 107 When any group in at least one group is triggered, all overlays belonging to any one group respond to the operation function corresponding to the group. Or when any overlay is triggered, any overlay and other overlays belonging to the same group as any overlay respond to the operation function of any overlay being triggered.
  • the operation function corresponding to a group is determined by the operation function shared by all overlays in the group.
  • a group can correspond to multiple operating functions.
  • when any group is triggered, all overlays in that group respond to the group's triggered operation functions. It should be understood that if multiple operation functions corresponding to the group are triggered, all overlays in the group respond to the multiple operation functions. If any one of the operation functions corresponding to the group is triggered, all overlays in the group respond to that triggered operation function.
  • overlay1 and overlay2 belong to group 1, where the operation functions corresponding to overlay1 and overlay2 are rotation and size scaling.
  • the operation function corresponding to group 1 is also rotation and size scaling. If both rotation and size scaling are triggered, overlay1 and overlay2 respond to the rotation and size scaling operations. If the triggered operation function is rotation, overlay1 and overlay2 respond to the rotation operation.
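  • The relationship between a group's operation functions and the response of its overlays can be sketched as below. The sketch assumes that a group's operation functions are the intersection of its members' functions, as stated above; the function names and data layout are hypothetical, and the member list is assumed to be non-empty.

```python
def group_operations(overlay_ops: dict, members: list) -> set:
    """Operation functions shared by every overlay in the group."""
    return set.intersection(*(set(overlay_ops[m]) for m in members))


def trigger(overlay_ops: dict, members: list, triggered: list) -> dict:
    """Return, per overlay, the triggered operations it responds to."""
    allowed = group_operations(overlay_ops, members)
    applied = [op for op in triggered if op in allowed]
    # Every member of the group responds to the same triggered operations.
    return {m: list(applied) for m in members}
```

  • With overlay1 and overlay2 both supporting rotation and size scaling, triggering only rotation makes both overlays respond to rotation, and triggering both operations makes both overlays respond to both.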
  • the terminal may assign an operation option to each group, and the operation option is used to prompt an operation function that all overlays in the group can respond to.
  • each group may be assigned an operation option 1 for instructing execution of all operation functions.
  • when operation option 1 is triggered, if the group has multiple operation functions, all overlays in the group respond to the multiple operation functions.
  • the user may be prompted for all overlays included in the group and the operation functions common to all overlays in the group. For example, when the mouse is on the operation option, but the click operation is not triggered, the user can also be prompted for all overlays included in the group and the operation functions common to all overlays in the group.
  • step 105 may be specifically implemented in the following manner:
  • the terminal parses each overlay to obtain the respective AssociatedSphereRegionStruct of each overlay.
  • the terminal determines, according to the AssociatedSphereRegionStruct, whether the respective operation function of each overlay is display or close.
  • the terminal parses the entity group in the media data stream containing one or more overlays, and obtains the OverlayConditionalShownGroupBox.
  • the terminal can further obtain ref_overlay_id. Therefore, the terminal can determine whether the operation function that can be performed on the one overlay and other overlays in the same common display group as the overlay is display or close.
  • step 106 may be specifically implemented in the following manner (1-1):
  • the client configuration or the user interface prompt may include operation options for triggering the display or closing of the common display group.
  • the media data may be displayed on the client or user interface of the terminal.
  • Step 107 may be specifically implemented in the following manner (1-2) or (1-3):
  • Method (1-2) When any group is triggered, the terminal displays all overlays in the any group. That is, all overlays in the group are displayed on the display interface.
  • the terminal closes all overlays in the first group. That is, all overlays in the first group are canceled.
  • the operation option may be displayed on the display interface in the form of icons or text.
  • when the operation option is displayed in the form of an icon and the user's touch operation or click operation is located on or near the icon, text prompting the function corresponding to the operation option can be displayed on the display interface.
  • an operation option of triggering display or closing the display may be performed on the overlay in the common display group in the client configuration or the user interface prompt of the terminal.
  • group 1 in FIG. 5 corresponds to operation option 1
  • group 2 corresponds to operation option 2.
  • the display mode of the operation options is text as an example.
  • the terminal controls all overlays in the first group to perform the operation function.
  • the common display is taken as an example
  • the first group is taken as the group 1 shown in FIG. 5 as an example.
  • the terminal displays all overlays in group 1 on the display interface.
  • a touch operation is used as an example for the operation option 1 of the group 1 being triggered.
  • the display of group 1 includes: the name corresponding to media content 1, name 2 corresponding to media content 2, and name 3 corresponding to media content 3.
  • the terminal closes all overlays included in group 1 and displays group 1 on the display interface. That is, in response to the close-display operation function, the display interface at this time may be as shown in FIG. 5.
  • the terminal can simultaneously display the one or more overlays on the background video in response to the user's trigger operation.
  • the one or more overlays can be closed at the same time in response to the user's trigger operation.
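  • The joint display and close behaviour of a common display group can be sketched as a single state update applied to every member. This is only an illustration: the function name and the visibility mapping are hypothetical, and actual rendering on the background video is outside the sketch.

```python
def set_group_visible(visibility: dict, members: list, visible: bool) -> dict:
    """Return a new visibility map with every group member shown or closed."""
    updated = dict(visibility)
    for overlay_id in members:
        updated[overlay_id] = visible
    return updated
```

  • Triggering the display option of group 1 would call the function with `visible=True` for its members, and triggering the close option would call it with `visible=False`, leaving overlays of other groups unchanged.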
  • Example 2-2 corresponds to Example 2-1.
  • step 105 can be specifically implemented in the following manner:
  • the terminal parses the overlay structure of each overlay to obtain the first information carried in the overlay structure of each overlay, and then determines the group identification information of each overlay according to the first information.
  • the terminal can parse the overlay structure of each overlay to obtain the operation function of each overlay.
  • the terminal may determine other overlays that belong to the same group as the overlay.
  • the terminal parses the entity group in the media data stream that contains one or more overlays.
  • the terminal may obtain the OverlayRelationGroupBox from the entity group, thereby obtaining the respective group identification information of each overlay, and / or ref_overlay_id.
  • the terminal can also determine the operation function of the overlay according to the overlay structure, and can then determine the identification information of all overlays belonging to the same group, as well as all the syntax elements in the OverlayInteraction control structure of the overlay.
  • step 106 may be specifically implemented in the following manner (2-1):
  • the terminal configuration or the user interface prompt may include operation options corresponding to a common interaction group for a corresponding interaction operation.
  • the terminal determines all overlays in the first group according to the ref_overlay_id or the identification information of all overlays corresponding to the overlay. The terminal then performs a common operation function on all overlays in the first group.
  • the operation function corresponding to the OverlayRelationGroupBox semantics is size scaling.
  • all overlays in the first group respond to the size scaling operation. If all overlays in the first group are displayed on the display interface and the size scaling function of any overlay in the first group is triggered, the triggered overlay responds to the size scaling operation, and the other, untriggered overlays in the group also respond to it.
  • Example 3-2 corresponds to Example 3-1.
  • step 105 can be specifically implemented in the following manner:
  • the terminal decodes the NALU of one or more overlays to obtain the SEI contained in each overlay stream in the one or more overlays.
  • if the SEI payload type is the value represented by OLG, the SEI carries a common display group message.
  • the terminal continues to decode the SEI to obtain the overlay_conditional_shown_group_id, or the terminal continues to decode the SEI to obtain the ref_overlay_id corresponding to each overlay.
  • the terminal can determine all overlays belonging to the same common display group according to the overlay_conditional_shown_group_id or ref_overlay_id corresponding to each overlay.
  • the terminal locates and parses the AssociatedSphereRegionStruct in the overlay control structure of each overlay, and learns whether the operation function of the overlay is user-triggered display or close.
  • Example 4-2 corresponds to Example 4-1.
  • step 105 may be specifically implemented in the following manner:
  • the terminal decodes the NALU of one or more overlays to obtain the SEI contained in each overlay stream in the one or more overlays.
  • if the SEI payload type is the value represented by OLG, the SEI carries a common interaction group message.
  • the terminal continues to decode the SEI to obtain the overlay_relation_group_id, which indicates the ID number of the common interaction group of the overlay, or the terminal continues to decode the SEI to obtain the ref_overlay_id.
  • the terminal may determine all overlays belonging to the same common interaction group according to the obtained overlay_relation_group_id corresponding to each overlay or ref_overlay_id corresponding to each overlay.
  • Example 5-2 corresponds to Example 5-1.
  • step 105 can be specifically implemented in the following manner:
  • the terminal obtains the MPD corresponding to the one or more overlays, parses it to obtain the adaptation set level attributes, and obtains the overlay descriptor and its value. The identification information of the group to which each overlay belongs, or the identification information of other overlays belonging to the same group, can then be obtained.
  • after the terminal decapsulates each overlay, it can learn that the operation function of the overlay is triggered display or close by parsing the AssociatedSphereRegionStruct in the overlay structure of each overlay.
  • the terminal can determine that the operation function is display or close according to the group operation function of the overlay.
  • Example 6-2 corresponds to Example 6-1.
  • step 105 may be specifically implemented in the following manner:
  • the terminal obtains the MPD corresponding to the one or more overlays, parses it to obtain the adaptation set level attributes, and obtains the overlay descriptor and its value. The identification information of the group to which each overlay belongs, or the identification information of other overlays belonging to the same group, can then be obtained. The terminal also determines the interactive operation that each overlay has according to the overlay structure in each overlay.
  • the terminal may place the overlays with the same group identification information in the same group. Alternatively, the terminal places overlays in the same group according to the identification information carried in each overlay and the identification information of the other overlays that belong to the same group.
  • the embodiment of the present application can define conditions for performing a group operation on a group where an overlay is located.
  • one or more overlays in the embodiment of the present application further include fourth information, where the fourth information is used to indicate that, in the case where the overlay responds to the first operation function, all overlays in the group to which the overlay belongs respond to the first operation function (i.e., a group operation).
  • the first operation function is any one of a plurality of operation functions of the overlay. It should be noted that the first operation function is any operation function that is triggered among a plurality of operation functions corresponding to the overlay.
  • the group operation refers to: in the case of performing an operation function on any overlay in a group, all overlays in the group to which the overlay belongs perform the operation function triggered by the overlay.
  • overlay1 and overlay2 are located in group 1.
  • overlay1 and overlay2 jointly respond to the operation function of overlay1 being triggered.
  • step 105 in the embodiment of the present application may be specifically implemented in the following manner: the terminal processes the at least two overlays according to the first information and the fourth information of the at least two overlays. Specifically, if the overlay corresponds to the fourth information, step 106 further includes that the terminal may include an operation option for performing a group operation on the group to which the overlay belongs in a client configuration or a user interface prompt.
  • step 107 For a specific implementation of step 107 at this time, reference may be made to the foregoing description, that is, the terminal uses the group as a granularity to perform triggered operation functions on all overlays in the group, and details are not described herein again. That is, any overlay is triggered, and all overlays in the group corresponding to any overlay respond to the operation function of any overlay being triggered.
  • one or more overlays in the embodiments of the present application further include: fifth information.
  • the fifth information is used to indicate that, in the case where the overlay responds to the first operation function, either all overlays in the group to which the overlay belongs respond to the first operation function (group operation), or only the overlay responds to the first operation function (individual operation).
  • step 105 may be specifically implemented in the following manner: the terminal processes the at least two overlays according to the first information and the fifth information of the at least two overlays.
  • the corresponding step 106 further specifically includes: the terminal displaying an operation option corresponding to the at least one group to perform a group operation and an operation option to perform an individual operation.
  • the corresponding step 107 further includes: if an individual operation is triggered, when any overlay is triggered, any overlay responds to the triggered operation function. If a group operation is triggered, when any overlay is triggered, any overlay and other overlays belonging to the same group as the overlay respond to the triggered operation function.
  • all overlays in the same group have fourth information.
  • all overlays in the same group have fifth information.
  • the individual operation refers to: in the case of performing an operation function on any overlay in a group, the any overlay responds to the triggered operation function. For example, when overlay1 is triggered, other overlays in group 1 do not respond to the operation function corresponding to overlay1, and only overlay1 responds to the operation function of overlay1.
  • the fourth information or the fifth information may be carried in an MPD of a media data stream, an SEI of an overlay code stream, or a file format.
  • if the overlay carries the fourth information, the terminal may determine that the group to which the overlay belongs permits operations only at the granularity of the group. If the overlay carries the fifth information, the terminal may determine that the group to which the overlay belongs permits both group operations and individual operations. The granularity actually used depends on the user's choice.
  • Example 2-3 combined with the above example 2-1, for example, the fourth information or the fifth information is located in the entity group defined in the overlay structure.
  • the following takes the case where the fourth information and the fifth information are a condition type (condition_type) as an example.
  • condition_type is used to indicate a condition for the user to perform a certain type of common operation on the group.
  • condition_type may have different values, and different condition_type values indicate that the groups of the overlay have different permissions.
  • condition_type value is 0, which means that it has group operation authority. That is, when any overlay in the group is triggered, other overlays in the group also respond to the operation function of any overlay.
  • the condition_type value is 1, indicating that the group has group operation and individual operation permissions. If a group operation is triggered, when any overlay in the group is operated, the other overlays in the group also respond to the triggered operation function of that overlay. If an individual operation is triggered, when any overlay in the group is operated, only that overlay responds to the triggered operation function.
  • condition_type corresponding to all overlays in the same group are the same.
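  • The two condition_type values described above can be sketched as a small decision function that returns which overlays respond when one overlay is triggered. The values 0 and 1 follow the text; the function name, the constant names, and the `group_mode_selected` flag representing the user's choice are assumptions made for illustration.

```python
GROUP_ONLY = 0            # condition_type 0: group operation only
GROUP_OR_INDIVIDUAL = 1   # condition_type 1: group or individual operation


def responders(triggered: str, members: list, condition_type: int,
               group_mode_selected: bool = True) -> list:
    """Return the overlays that respond when `triggered` is operated."""
    if triggered not in members:
        raise ValueError("triggered overlay is not in the group")
    if condition_type == GROUP_ONLY:
        return list(members)
    if condition_type == GROUP_OR_INDIVIDUAL:
        # The user's choice decides the granularity of the operation.
        return list(members) if group_mode_selected else [triggered]
    raise ValueError("unknown condition_type")
```

  • Because all overlays in a group carry the same condition_type, the value can be read from any member before the function is called.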
  • the terminal can obtain the fourth information or the fifth information by parsing to the entity group in the overlay structure.
  • when condition_type is 1, the client configuration or the user interface prompt may include an operation option for performing overlay group operations or an operation option for performing individual operations.
  • when condition_type is 1 and the operation option of the group operation is not triggered, only the triggered overlay responds to the interactive operation.
  • when any overlay in the group triggers the operation function corresponding to the OverlayRelationGroupBox semantics, all overlays in the group, identified by all ref_overlay_id values in the group, jointly respond to the interaction of the triggered overlay.
  • Example 4-3 combined with the above example 4-1, for example, the fourth information or the fifth information is located in the overlay code stream SEI. That is, the condition_type is defined in the SEI of the overlay stream corresponding to each overlay in one or more overlays, and the syntax structure is shown in Table 17:
  • step 105 the fourth information or the fifth information is located in the overlay code stream SEI corresponding to each overlay or the description file or file format of the media data including one or more overlays
  • step 107 the specific implementation of step 105 to step 107
  • Example 6-3 Combined with the above Example 6-1, the fourth information or the fifth information is located in the MPD. Specifically, the fourth information and the fifth information may be located in the overlay descriptor of the overlay together with the group identification information.
  • overlay_relation_group_id and condition_type are defined in the MPD corresponding to one or more overlays, and the syntax structure is shown in Table 18:
  • For the definition and values of condition_type, refer to the description of Table 16 above; the details are not repeated here.
  • Table 19 shows the condition_type and the syntax structure carried in the MPD file:
  • Group 1 includes overlay 1 and overlay 2.
  • Group 2 includes overlay 3 and overlay 4.
  • The group identification information of an overlay in this embodiment of the present application may include one or more pieces of group identification information. That is, the overlay structure, the description file, or the SEI of the overlay can indicate that the overlay corresponds to multiple groups. In this case, the first information is also used to indicate the number of groups corresponding to the overlay.
  • Example A: When the information indicating that the overlay corresponds to multiple groups is carried in the overlay structure, the overlay structure of each overlay is shown in Table 20:
  • overlay_relation_group_number indicates the number of groups to which the overlay belongs.
  • overlay_relation_group_id[i] represents the ID number of the i-th group to which the overlay belongs.
  • The client configuration or the user interface prompt may provide operation options for joint operations on the overlays of the same group, and different groups have different operation options.
  • When an overlay in a group is triggered, all overlays in that group jointly respond to the operation function triggered on that overlay. If an overlay belongs to multiple groups, the overlay responds in turn to the user operations for each of the groups to which it belongs.
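The multi-group case can be sketched by inverting each overlay's list of overlay_relation_group_id[i] values into a map from group ID to member overlays. The data layout below (a dictionary keyed by a hypothetical integer overlay ID) and both function names are assumptions made for illustration; the length of each list plays the role of overlay_relation_group_number.

```python
from collections import defaultdict


def build_groups(overlay_group_ids):
    """Map each group ID to the overlay IDs that list it.

    overlay_group_ids -- {overlay_id: [overlay_relation_group_id[0], ...]}
    """
    groups = defaultdict(list)
    for overlay_id, group_ids in overlay_group_ids.items():
        for gid in group_ids:
            groups[gid].append(overlay_id)
    return dict(groups)


def responses_for(triggered_id, overlay_group_ids):
    """List (group_id, members) pairs, in turn, for every group of the triggered overlay."""
    groups = build_groups(overlay_group_ids)
    return [(gid, groups[gid]) for gid in overlay_group_ids[triggered_id]]


# Overlay 2 belongs to two groups, so it takes part in both group responses.
membership = {1: [10], 2: [10, 20], 3: [20]}
print(responses_for(2, membership))
```

Keeping the per-overlay list as the source of truth mirrors how the information is signalled (per overlay), while the inverted map is what a user interface would need to offer one operation option per group.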
  • the overlay structure of each overlay may also have a syntax element of condition_type, as shown in Table 21 below:
  • Example B: When the information indicating that the overlay corresponds to multiple groups is carried in the SEI of the overlay stream, the SEI syntax structure corresponding to each overlay stream is shown in Table 22:
  • the SEI syntax shown in Table 22 may also have a syntax element of condition_type, as shown in Table 23 below:
  • The following takes as an example the case in which the information indicating that the overlay corresponds to multiple groups is carried in the description file.
  • The information about the multiple groups is located in the OVLY descriptor, and the OVLY descriptor corresponding to each overlay is shown in Table 24:
  • the adaptation sets having the same overlay_relation_group_id value belong to the same group, and the same overlay can belong to multiple different groups.
  • overlay_relation_group_number specifies the number of groups to which the overlay belongs, and overlay_relation_group_id indicates the ID number of a group to which the overlay belongs.
  • condition_type is the condition type under which the overlays of the group identified by overlay_relation_group_id interact together.
  • Table 25 shows a specific example of syntax elements with multiple pieces of group identification information and condition_type in the MPD:
  • the multiple groups correspond to different operation options.
  • Take as an example the case in which the overlay has the fourth information and is located in both the second group and the third group.
  • The terminal performs the operation function corresponding to the second group on all overlays in the second group, and performs the operation function corresponding to the third group on all overlays in the third group. That is, all overlays in the second group respond to the operation function corresponding to the second group, and all overlays in the third group respond to the operation function corresponding to the third group.
  • The terminal performs the operation function corresponding to the second group on all overlays in the second group (that is, all overlays in the second group respond to the operation function corresponding to the second group), and performs the operation function corresponding to the third group on all overlays in the third group (that is, all overlays in the third group respond to the operation function corresponding to the third group). If an individual operation is triggered, then when the overlay in the second group is triggered, the terminal performs the operation function corresponding to the second group on that overlay, and performs the operation function corresponding to the third group on the overlay in the third group.
  • When each of the one or more overlays in this embodiment of the present application corresponds to first information, the first information further includes first indication information, and the operation function corresponding to the group in which the overlay is located is determined by the operation type indicated by the first indication information.
  • the first indication information is used to indicate a type of an interaction operation (overlay_interaction_type). That is, it is used to indicate the specific operation function corresponding to the overlay.
  • Table 26 shows an example of the first indication information carried in the SEI syntax.
  • overlay_relation_group_info(payloadSize) {        Descriptor
        overlay_relation_group_id
        overlay_interaction_type
    }
  • overlay_interaction_type indicates the types of joint interaction operations that the group of the current overlay can perform.
  • One representation is a bit field, as shown in Table 27 below:
  • When a bit of overlay_interaction_type is set, the overlay can perform the interaction operation corresponding to that bit in group operation mode.
  • Table 27 lists only some types of interaction operations, and the types of joint interaction operations of the overlay are not limited to those shown in Table 27.
  • The terminal determines the types of interaction operation that can be performed as a group operation based on the value of overlay_interaction_type. For example, when the terminal determines that the overlay_interaction_type values corresponding to all overlays in the same group have bit index 6 of Table 27 set, the terminal determines that the overlays of the group can be rotated together. If the terminal determines that one or more overlays belong to multiple groups, the terminal provides, for each group, an operation option corresponding to the type of interaction operation indicated by the first indication information. When the operation option corresponding to a group is triggered, all overlays in that group are operated according to the operation function indicated by the first indication information.
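The bit-field check described above can be sketched as follows. Only the association of bit index 6 with joint rotation comes from the text; treating bit index 0 as the least significant bit, intersecting the values of all overlays in a group, and the names INTERACTION_BITS and group_interactions are assumptions made for this illustration.

```python
# Bit index -> operation name. Only index 6 ("rotate") is taken from the text;
# further entries would have to come from the bit table of the specification.
INTERACTION_BITS = {6: "rotate"}


def group_interactions(interaction_types, names=INTERACTION_BITS):
    """Return the operations that every overlay in the group supports.

    interaction_types -- overlay_interaction_type value of each overlay in the group
    Assumes bit index 0 is the least significant bit.
    """
    if not interaction_types:
        return []
    common = interaction_types[0]
    for value in interaction_types[1:]:
        common &= value  # keep only the bits set for every overlay
    return [name for bit, name in sorted(names.items()) if (common >> bit) & 1]
```

With this sketch, a group whose overlays all have bit 6 set yields ["rotate"], while a group in which one overlay lacks that bit yields an empty list.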
  • the terminal performs NALU decoding on one or more overlay code streams to obtain the SEI contained in the overlay code stream.
  • If the SEI payload type is the value represented by OLG, the SEI is an overlay common interaction group message.
  • The terminal continues decoding to obtain, from the SEI, the overlay_relation_group_id (or the identification information of the other overlays that belong to the same group) and the overlay_interaction_type value, that is, the ID number of the common interaction group of the overlay and the type of group interaction.
  • After the terminal finishes decoding, it obtains the group identification information of all overlays and determines which overlays belong to the same group.
  • the terminal can also determine the type of interaction that can be performed according to the overlay_interaction_type value.
  • the client configuration or user interface prompt can set user interaction option information for overlays with the same ID number as a group, and specify the type of group operation that can be performed according to the overlay_interaction_type value. Different interaction options are given for overlay groups with different ID numbers. When the user clicks or activates the option corresponding to the ID number, the overlay of the group corresponding to the ID number can be operated simultaneously according to the operation type of the specified group.
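The flow from decoded SEI messages to per-group interaction options can be sketched as below. The SeiMessage record, the overlay_id field (standing for the overlay stream the SEI was decoded from), and the olg_payload_type parameter are hypothetical; the numeric payload type represented by OLG is not given in the text, so the caller has to supply it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeiMessage:
    payload_type: int               # SEI payload type of the decoded message
    overlay_id: int                 # hypothetical: overlay stream the SEI came from
    overlay_relation_group_id: int
    overlay_interaction_type: int


def build_group_options(messages, olg_payload_type):
    """Collect one interaction option per group ID from OLG-type SEI messages."""
    options = {}
    for msg in messages:
        if msg.payload_type != olg_payload_type:
            continue  # not an overlay common interaction group message
        entry = options.setdefault(
            msg.overlay_relation_group_id,
            {"overlays": [], "interaction_type": msg.overlay_interaction_type},
        )
        entry["overlays"].append(msg.overlay_id)
    return options
```

Each key of the returned dictionary corresponds to one user-selectable option; activating it would apply the listed interaction type to every overlay in the "overlays" list.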
  • overlay_interaction_type may also be carried in the MPD.
  • For the specific process, refer to the description of Table 27, which is not repeated here.
  • the server may add common interaction group description information to a file format corresponding to each overlay.
  • the difference from Example 1-1 is that in this embodiment, the group identification information of the overlay is located in the overlay control structure. That is, the third information is located in an overlay control structure included in the overlay structure.
  • the server encodes one or more overlay media data to obtain one or more overlay media data streams, and then encapsulates the one or more overlay media data streams.
  • Each overlay has an overlay control structure, which may further include an overlay control symbol, Overlay group, used to indicate the group identification information of the overlay.
  • the Overlay control symbols are shown in Table 28:
  • overlay_relation_group_id represents the ID number of the group to which the overlay belongs.
  • the group identification information of the overlays belonging to the same group is the same, and the group identification information of the overlays belonging to different groups is different.
  • After decapsulating the one or more overlays, the terminal obtains the overlay control structure syntax element overlay_control_flag, from which it identifies the Overlay group represented by the tenth bit in Table 29, and then parses the OverlayGroup structure to obtain the group identification information of the overlay. After the parsing is completed, the terminal has the group information of all overlays.
  • The embodiment of the present application adds the group identification information of the overlay to the file format, to the SEI of the overlay code stream, or to the MPD, so that the terminal can display, in groups, the one or more overlays that have the same group identification information. The user can then apply the operation function of a group to all overlays of that group at the same time, which reduces the number of steps needed to perform the same operation on one or more overlays and improves the user's subjective experience of watching VR video.
  • The second information and the third information in the embodiment of the present application may be carried in an overlay file format that encapsulates a media data stream including an overlay. The file format includes an entity group box (EntityToGroupBox) and an overlay structure, where the overlay structure has an overlay association area control structure.
  • the second information is located in the overlay association area control structure, and the third information is located in the EntityToGroupBox.
  • the group_id in Table 30 indicates the unique ID number of the group, which is different from any other EntityToGroupBox structure ID number.
  • num_entities_in_group represents the number of entities in the current group, and entity_id corresponds to the ID number of an entity in the file format.
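Reading the three fields named above can be sketched as a small payload parser. The field widths are not stated in this passage; the sketch assumes 32-bit big-endian unsigned integers, as is usual for ISO base media file format boxes, and assumes the box header (size, type, version and flags) has already been consumed. The function name is hypothetical.

```python
import struct


def parse_entity_to_group_payload(payload: bytes):
    """Parse group_id, num_entities_in_group and the entity_id list.

    Returns (group_id, entity_ids, remaining_bytes); the remaining bytes
    would hold any fields added by a derived group box.
    """
    if len(payload) < 8:
        raise ValueError("payload too short for group_id and num_entities_in_group")
    group_id, num_entities = struct.unpack_from(">II", payload, 0)
    end = 8 + 4 * num_entities
    if len(payload) < end:
        raise ValueError("payload truncated inside the entity_id list")
    entity_ids = list(struct.unpack_from(f">{num_entities}I", payload, 8))
    return group_id, entity_ids, payload[end:]
```

Returning the unparsed tail keeps the sketch usable for group boxes that append their own fields after the entity list.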
  • The file format in the embodiment of the present application may also have the syntax shown in Table 31:
  • ocsg represents a type of grouping_type, which is used to indicate that the group type is a common display group.
  • The terminal can parse the EntityToGroupBox to obtain the OverlayConditionalShownGroupBox, which indicates that the overlays of a group can be displayed together, or closed together, when the user triggers the display or closing of any overlay in the group or of the group itself.
  • the ID number of the overlay contained in the current overlay group is represented by the entity_id in the current EntityToGroupBox.
  • the OverlayConditionalShownGroupBox can also contain information about the name of the overlay group.
  • the syntax structure is shown in Table 32 below:
  • overlay_group_label is a UTF-8 encoded string of unlimited length that describes the overlay group. It may be null.
  • overlay_group_label is used to give group description information, and it can be displayed on the user's display interface as group information.
  • The first information in the embodiment of the present application may be carried in an overlay file format that encapsulates a media data stream including an overlay. The file format includes an EntityToGroupBox and an overlay structure, where the overlay structure includes control symbols.
  • the first information is in EntityToGroupBox.
  • The file format in the embodiment of the present application may also have the syntax shown in Table 33:
  • ovrg also represents a type of grouping_type (group type), which is used to indicate that the group type is a common interaction group.
  • the overlay structure can adopt the above description, which is not limited here.
  • The terminal can parse the EntityToGroupBox to obtain the OverlayRelationGroupBox, which indicates that interactive operations can be performed jointly on the overlays of a group when any overlay, or all overlays, in the group are operated. The operation function of the overlay interactive operation is determined based on the overlay structure of each overlay.
  • the ID number of the overlay contained in the current overlay group is represented by the entity_id in the current EntityToGroupBox.
  • the OverlayRelationGroupBox can also contain information about the name of the overlay group.
  • the syntax structure is shown in Table 34 below:
  • overlay_group_label is used to give group name information, which can be displayed on the user's display interface as group information.
  • the name information of the overlay group can be carried in the overlay group box.
  • the terminal can resolve the name information of the overlay group carried by the overlay group box, and display the group name indicated by the name information of the overlay group.
  • When the overlay group box is defined as an OverlayRelationGroupBox, the name information of the overlay group is located in the OverlayRelationGroupBox.
  • When the overlay group box is defined as an OverlayConditionalShownGroupBox, the name information of the overlay group is located in the OverlayConditionalShownGroupBox.
  • Each network element, such as a device for processing media data, includes a hardware structure and/or a software module corresponding to each function.
  • this application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is performed by hardware or computer software-driven hardware depends on the specific application of the technical solution and design constraints. Professional technicians can use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of this application.
  • each functional unit may be divided corresponding to each function, or two or more functions may be integrated into one processing unit.
  • the above integrated unit may be implemented in the form of hardware or in the form of software functional unit. It should be noted that the division of the units in the embodiments of the present application is schematic, and is only a logical function division. There may be another division manner in actual implementation.
  • FIG. 7 is a schematic block diagram of an apparatus for processing media data according to an embodiment of the present application.
  • the apparatus for processing media data may be a terminal or a chip applied to the terminal.
  • the apparatus 500 for processing media data shown in FIG. 7 includes an obtaining module 510 and a processing module 520.
  • the obtaining module 510 and the processing module 520 in the apparatus 500 for processing media data may perform various steps performed by the terminal in the methods shown in FIG. 3 and FIG. 4.
  • the specific functions of the obtaining module 510 and the processing module 520 are as follows:
  • the obtaining module 510 is configured to receive at least two overlay layers corresponding to the media data.
  • Each overlay corresponds to first information, or each overlay corresponds to second information and third information.
  • The first information includes group identification information of the overlay.
  • The second information is used to indicate the operation function of the overlay.
  • The third information includes group identification information of the overlay or identification information of other overlays that belong to the same group as the overlay.
  • The processing module 520 is configured to process the at least two overlays according to the first information of the at least two overlays when the overlay corresponds to the first information.
  • Alternatively, the processing module 520 is configured to process the at least two overlays according to the second information and the third information corresponding to the at least two overlays when the overlay corresponds to the second information and the third information.
  • When the apparatus 500 for processing media data executes the method shown in FIG. 4, the apparatus 500 for processing media data further includes a display module 530.
  • the specific functions of the display module 530 and the processing module 520 are as follows:
  • the display module 530 is configured to display at least one group, and information indicating an operation function corresponding to each group in the at least one group and an overlay belonging to each group.
  • The at least one group is determined by the first information corresponding to each of the at least two overlays, and the operation function corresponding to a group is determined by the overlay structure included in the overlays of that group.
  • Alternatively, the at least one group is determined by the third information corresponding to each of the at least two overlays, and the operation function corresponding to a group is determined by the overlay association area control structure included in the overlays of that group.
  • The processing module 520 is configured to, when any one of the at least one group is triggered, process all overlays belonging to that group in response to the operation function corresponding to the group; or, when any overlay is triggered, process that overlay and the other overlays that belong to the same group in response to the operation function triggered on that overlay.
  • FIG. 8 is a schematic block diagram of an apparatus 600 for processing media data according to an embodiment of the present application.
  • The apparatus 600 for processing media data may be a server or a chip used in a server.
  • the apparatus 600 for processing media data shown in FIG. 8 includes an obtaining module 610, a processing module 620, and a sending module 630.
  • The obtaining module 610, the processing module 620, and the sending module 630 in the apparatus 600 for processing media data may perform the steps performed by the server in the methods shown in FIG. 3 and FIG. 4.
  • the specific functions of the obtaining module 610, the processing module 620, and the sending module 630 are as follows:
  • the obtaining module 610 is configured to obtain media data.
  • a processing module 620 configured to process media data to obtain at least two overlay layers corresponding to the media data
  • the sending module 630 is configured to send one or more overlays to the terminal.
  • the input / output interface 130 is configured to acquire media data.
  • the processor 110 is configured to process media data to obtain at least two overlay layers corresponding to the media data.
  • the input / output interface 130 is also used to send one or more overlays to the terminal.
  • the input / output interface 130 is configured to receive at least two overlay layers corresponding to the media data.
  • Each overlay corresponds to first information, or each overlay corresponds to second information and third information.
  • The first information includes group identification information of the overlay.
  • The second information is used to indicate the operation function of the overlay.
  • The third information includes group identification information of the overlay or identification information of other overlays that belong to the same group as the overlay.
  • The processor 110 is configured to process the at least two overlays according to the first information of the at least two overlays when the overlay corresponds to the first information. Alternatively, the processor 110 is configured to process the at least two overlays according to the second information and the third information corresponding to the at least two overlays when the overlay corresponds to the second information and the third information.
  • the display 160 is configured to display at least one group, and information used to indicate an operation function corresponding to each group in the at least one group and an overlay belonging to each group.
  • The at least one group is determined by the first information corresponding to each of the at least two overlays, and the operation function corresponding to a group is determined by the overlay structure included in the overlays of that group.
  • Alternatively, the at least one group is determined by the third information corresponding to each of the at least two overlays, and the operation function corresponding to a group is determined by the overlay association area control structure included in the overlays of that group.
  • The display 160 is configured to, when any one of the groups is triggered, process all overlays belonging to that group in response to the operation function corresponding to the group; or, when any overlay is triggered, process that overlay and the other overlays that belong to the same group in response to the operation function triggered on that overlay.
  • the disclosed systems, devices, and methods may be implemented in other ways.
  • the device embodiments described above are only schematic.
  • the division of the unit is only a logical function division.
  • Multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, which may be electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each of the units may exist separately physically, or two or more units may be integrated into one unit.
  • If the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • The technical solution of this application, in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product.
  • The computer software product is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application.
  • The aforementioned storage media include USB flash drives, removable hard disks, read-only memories (ROMs), random access memories (RAMs), magnetic disks, compact discs, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present invention relates to a multimedia data processing method, a terminal, and a server, which relate to the technical field of streaming media transmission and are used to reduce the operational complexity that arises when a user must perform the same operation separately on each overlay to achieve a given goal, so as to make the user's operation on overlays more efficient and improve the user's subjective experience. The method comprises the following steps: a terminal acquires at least two overlays corresponding to media data, where the overlay corresponds to first information, or the overlay corresponds to second information and third information, the first information comprising group identification information, the second information being used to indicate an operation function of the overlay, the third information being used to indicate the group identification information, and the overlays in the same group corresponding to the same operation function; and the terminal processes the at least two overlays according to the first information of the at least two overlays.
PCT/CN2019/108514 2018-09-27 2019-09-27 Procédé de traitement de données multimédias, terminal et serveur Ceased WO2020063850A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201862737900P 2018-09-27 2018-09-27
US62/737,900 2018-09-27

Publications (1)

Publication Number Publication Date
WO2020063850A1 true WO2020063850A1 (fr) 2020-04-02

Family

ID=69950275

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/108514 Ceased WO2020063850A1 (fr) 2018-09-27 2019-09-27 Procédé de traitement de données multimédias, terminal et serveur

Country Status (1)

Country Link
WO (1) WO2020063850A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105828160A (zh) * 2016-04-01 2016-08-03 腾讯科技(深圳)有限公司 视频播放方法及装置
WO2017202699A1 (fr) * 2016-05-23 2017-11-30 Canon Kabushiki Kaisha Procédé, dispositif et programme informatique pour la diffusion continue adaptative de contenu multimédia de réalité virtuelle
CN107770601A (zh) * 2016-08-16 2018-03-06 上海交通大学 一种面向多媒体内容组件个性化呈现的方法及系统
CN107888939A (zh) * 2016-09-30 2018-04-06 华为技术有限公司 一种视频数据的处理方法及装置
CN108271044A (zh) * 2016-12-30 2018-07-10 华为技术有限公司 一种信息的处理方法及装置

Similar Documents

Publication Publication Date Title
US11902350B2 (en) Video processing method and apparatus
US20190325652A1 (en) Information Processing Method and Apparatus
US20200092600A1 (en) Method and apparatus for presenting video information
US20200145736A1 (en) Media data processing method and apparatus
US10931930B2 (en) Methods and apparatus for immersive media content overlays
CN111937397A (zh) 媒体数据处理方法及装置
WO2018068236A1 (fr) Procédé de transmission de flux vidéo, dispositif associé, et système
CN115396647B (zh) 一种沉浸媒体的数据处理方法、装置、设备及存储介质
US20210218792A1 (en) Media data transmission method, client, and server
CN110035316A (zh) 处理媒体数据的方法和装置
US20250240383A1 (en) Method for processing media data, client, and server
WO2019062613A1 (fr) Procédé et appareil de traitement d'informations multimédia
US20200145716A1 (en) Media information processing method and apparatus
WO2024245182A1 (fr) Procédés et appareils de traitement de fichier de nuage de points, support, dispositif électronique et produit-programme
WO2020063850A1 (fr) Procédé de traitement de données multimédias, terminal et serveur
WO2023169003A1 (fr) Procédé et appareil de décodage multimédia de nuage de points et procédé et appareil de codage multimédia de nuage de points
EP4633174A1 (fr) Format de métadonnées d'avatar
WO2025180072A1 (fr) Procédé et dispositif de traitement de données de nuage de points et support de stockage
WO2024114519A1 (fr) Procédé et appareil d'encapsulation de nuage de points, procédé et appareil de désencapsulation de nuage de points, et support et dispositif électronique
HK40092358A (zh) 点云媒体的编解码方法及相关产品
CN115102932A (zh) 点云媒体的数据处理方法、装置、设备、存储介质及产品

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19866354

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19866354

Country of ref document: EP

Kind code of ref document: A1