WO2024148751A1 - Method and Device for Generating a Scene Description File
- Publication number
- WO2024148751A1 (PCT/CN2023/097873)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- description
- target
- scene
- accessor
- cache
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- G06T9/001—Model-based coding, e.g. wire frame
- G06T9/00—Image coding
- G06T9/40—Tree coding, e.g. quadtree, octree
- G06T13/20—Three-dimensional [3D] animation
- G06T13/40—Three-dimensional [3D] animation of characters, e.g. humans, animals or virtual beings
- G06T15/00—Three-dimensional [3D] image rendering
- G06T15/005—General purpose rendering architectures
- G06T15/04—Texture mapping
- G06T17/20—Finite element generation, e.g. wire-frame surface description, tessellation
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
- H04N21/816—Monomedia components thereof involving special video data, e.g. 3D video
- H04N21/23412—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs for generating or manipulating the scene composition of objects, e.g. MPEG-4 objects
- H04N21/2343—Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/85406—Content authoring involving a specific file format, e.g. MP4 format
Description
- Some embodiments of the present application relate to the field of video processing technology, and in particular, to a method and device for generating a scene description file.
- A point cloud refers to a massive collection of three-dimensional points.
- the compression standards for point clouds mainly include Geometry-based Point Cloud Compression (G-PCC) and Video-based Point Cloud Compression (V-PCC).
- the current mainstream immersive media mainly include point clouds, 3D meshes, 6DoF panoramic videos, MPEG immersive videos (MIV), etc.
- In a 3D scene, multiple types of immersive media often exist at the same time.
- Different types of rendering engines are developed according to the types and number of codecs they support.
- the Moving Picture Experts Group (MPEG) initiated the formulation of the MPEG scene description standard, with the standard number ISO/IEC 23090-14.
- This standard mainly solves the problem of cross-platform description of MPEG media (including codecs developed by MPEG, MPEG file formats, and MPEG transmission mechanisms) in 3D scenes.
- the extensions made to the first version of the ISO/IEC 23090-14 MPEG-I scene description standard have met the key requirements of immersive scene description solutions.
- the current scene description standard does not support media files of type G-PCC coded point cloud.
- Point cloud is an important form of 3D media, and G-PCC is one of the current mainstream point cloud compression algorithms. Therefore, it is of great significance and value to support media files of type G-PCC coded point cloud in the scene description framework.
- some embodiments of the present application provide a method for generating a scene description file, including:
- when the type of a target media file in the three-dimensional scene to be rendered is geometry-based point cloud compression (G-PCC) coded point cloud, generating a target media description module corresponding to the target media file according to the description information of the target media file;
- adding the target media description module to the media list of the MPEG media of the scene description file of the three-dimensional scene to be rendered.
- some embodiments of the present application provide a device for generating a scene description file, including:
- a memory configured to store a computer program; and
- a processor configured to, when calling the computer program, cause the scene description file generation device to implement the scene description file generation method described in the first aspect.
- FIG. 1 is a schematic diagram showing the structure of an immersive media description framework in some embodiments.
- FIG. 2 is a schematic diagram showing the structure of a scene description file in some embodiments.
- FIG. 3 is a schematic diagram showing the structure of a scene description file in some other embodiments of the present application.
- FIG. 4 is a schematic diagram showing the structure of a G-PCC encoder in some embodiments.
- FIG. 5 is a schematic diagram showing a LOD partitioning process in some embodiments.
- FIG. 6 is a schematic diagram showing a lifting transformation process in some embodiments.
- FIG. 7 is a schematic diagram showing a RAHT transformation process in some embodiments.
- FIG. 8 is a schematic diagram showing the structure of a G-PCC decoder in some embodiments.
- FIG. 9 is a schematic diagram showing the structure of a scene description file in some other embodiments.
- FIG. 10 is a schematic diagram showing the structure of a scene description file in some other embodiments.
- FIG. 11 is a schematic diagram showing a pipeline corresponding to a media file of type G-PCC coded point cloud in some embodiments.
- FIG. 12 is a flowchart showing the steps of a method for generating a scene description file in some embodiments.
- FIG. 13 is a flowchart showing the steps of a scene description file parsing method in some embodiments.
- FIG. 14 is a flowchart showing the steps of a method for processing a media file in some embodiments.
- FIG. 15 is a flowchart showing the steps of a method for rendering a three-dimensional scene in some embodiments.
- FIG. 16 is a flowchart showing the steps of a cache management method in some embodiments.
- FIG. 17 is an interactive flowchart of a method for rendering a three-dimensional scene in some embodiments.
- Some embodiments of the present application involve scene description of immersive media.
- the scene description framework of immersive media decouples the access and processing of media files from the rendering of media files, and designs a media access function (Media Access Function, MAF) 12 to be responsible for the access and processing of media files.
- A media access function application programming interface (Application Programming Interface, API) is designed; the display engine 11 and the media access function 12 exchange commands through the media access function API.
- the display engine 11 can issue commands to the media access function 12 through the media access function API, and the media access function 12 can also request commands from the display engine 11 through the media access function API.
- the general workflow of the scene description framework of immersive media includes: 1) The display engine 11 obtains the scene description file (Scene Description Documents) provided by the immersive media service provider. 2) The display engine 11 parses the scene description file, obtains the access address of the media file, the attribute information of the media file (media type and encoding and decoding parameters, etc.) and the format requirements of the processed media file and other parameters or information, and calls the media access function API to pass all or part of the information obtained by parsing the scene description file to the media access function 12.
- 3) The media access function 12 requests to download the specified media file from the media resource server or obtains the specified media file locally, establishes a corresponding pipeline for the media file, and then decapsulates, decrypts, decodes, and post-processes the media file in the pipeline to convert the media file from the encapsulation format into the format specified by the display engine 11.
- 4) The pipeline stores the output data obtained after all processing in the specified cache.
- 5) The display engine 11 reads the fully processed data from the specified cache and renders the media file according to the data read from the cache.
- the scene description file is used to describe the structure of the three-dimensional scene (its features can be described by a three-dimensional mesh), texture (such as texture mapping, etc.), animation (rotation, translation), camera viewpoint position (rendering perspective), and other contents.
- GL Transmission Format 2.0 (glTF2.0) has been identified as a candidate format for a scene description file that can meet the requirements of MPEG-Immersive (MPEG-I) and 6-degrees of freedom (6DoF) applications.
- glTF2.0 is described in the GL Transmission Format (glTF) version 2.0 of the Khronos Group available at github.com/KhronosGroup/glTF/tree/master/specification/2.0#specifying-extensions.
- FIG. 2 is a schematic diagram of the structure of a scene description file in the glTF2.0 scene description standard (ISO/IEC 12113).
- the scene description file in the glTF2.0 scene description standard includes but is not limited to: scene description module (scene) 201, node description module (node) 202, mesh description module (mesh) 203, accessor description module (accessor) 204, cache slice description module (bufferView) 205, buffer description module (buffer) 206, camera description module (camera) 207, lighting description module (light) 208, material description module (material) 209, texture description module (texture) 210, sampler description module (sampler) 211 and texture map description module (image) 212, animation description module (animation) 213, skin description module (skin) 214.
- the scene description module (scene) 201 in the scene description file shown in FIG2 is used to describe the three-dimensional scene contained in the scene description file.
- a scene description file may contain any number of three-dimensional scenes, and each three-dimensional scene is represented by a scene description module 201.
- The scene description modules 201 are peers of one another; that is, the three-dimensional scenes are independent of one another.
- the node description module (node) 202 in the scene description file shown in FIG2 is a description module at the next level of the scene description module 201, and is used to describe the objects contained in the three-dimensional scene described by the scene description module 201. There may be many specific objects in each three-dimensional scene, such as virtual digital humans, three-dimensional objects in the near distance, and background images in the far distance. The scene description file describes these specific objects through the node description module 202. Each node description module 202 can represent an object or a group of objects consisting of several objects. The relationship between the node description modules 202 reflects the relationship between the various components in the three-dimensional scene described by the scene description module 201.
- a scene described by a scene description module 201 can contain one or more nodes.
- The relationship between multiple nodes can be a parallel relationship or a hierarchical relationship; that is, there is a relationship of containing and being contained between node description modules 202, which allows multiple specific objects to be described together or to be described separately. If a node is contained by another node, the contained node is called a child node (children), and the child node is referenced through "children" instead of "node". By flexibly combining nodes and child nodes, a hierarchical node structure can be formed to express rich scene content.
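- As an illustrative sketch (node names and index values are hypothetical, not taken from the patent), such a node hierarchy could be written in a glTF2.0 scene description file as follows:
  {
    "scenes": [ { "nodes": [0] } ],
    "nodes": [
      { "name": "group", "children": [1, 2] },
      { "name": "object_a", "mesh": 0 },
      { "name": "object_b", "mesh": 1 }
    ]
  }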
- the mesh description module (mesh) 203 in the scene description file shown in FIG2 is a description module of the next level of the node description module 202, and is used to describe the characteristics of the object represented by the node description module 202.
- the mesh description module 203 is a set of one or more primitives, each of which may include an attribute, and the attribute of the primitive defines the attribute required for the graphics processing unit (GPU) to render.
- the attributes may include: POSITION (three-dimensional coordinates), NORMAL (normal vector), TANGENT (tangent vector), TEXCOORD_n (texture coordinates), COLOR_n (color: RGB or RGBA), JOINTS_n and WEIGHTS_n (attributes related to the skin description module 214), etc.
- the access address (Uniform Resource Identifier, URI) of the media file is pointed out in the scene description file, and the data in the media file can be downloaded when it is needed, thereby realizing the separation of the scene description file and the media file.
- the mesh description module 203 does not store media data, but stores the index value of the accessor description module (accessor) 204 corresponding to each attribute, and points to the corresponding data in the cache slice (bufferView) of the buffer (buffer) through the accessor description module 204.
- the scene description file and the media file may be merged to form a binary file, thereby reducing the types and number of files.
- There may be a syntax element "mode" in the primitives of the mesh description module 203, which declares the topology type of the primitive (for example, points, lines, or triangles).
- For example, the value of "POSITION" is 1, pointing to the accessor description module 204 with index 1 and finally to the vertex coordinate data stored in the buffer; the value of "COLOR_0" is 2, pointing to the accessor description module 204 with index 2 and finally to the color data stored in the buffer.
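- As an illustrative sketch only (the accessor indices 1 and 2 follow the example above; "mode": 0 declares a point primitive in glTF2.0, and the rest is hypothetical), the corresponding primitive in the mesh description module could look like:
  {
    "meshes": [
      {
        "primitives": [
          {
            "mode": 0,
            "attributes": { "POSITION": 1, "COLOR_0": 2 }
          }
        ]
      }
    ]
  }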
- the accessor description module (accessor) 204, the buffer slice description module (bufferView) 205 and the buffer description module (buffer) 206 in the scene description file shown in FIG2 jointly implement the layer-by-layer refined indexing of the data of the media file by the mesh description module 203.
- The mesh description module 203 does not store specific media data; instead, it stores the index value of the corresponding accessor description module 204 and accesses the specific media data through the accessor described by the accessor description module 204 indexed by that value.
- The indexing process of the media data by the mesh description module 203 includes: first, the index value declared by the syntax element of the mesh description module 203 points to the corresponding accessor description module 204; then, the accessor description module 204 points to the corresponding cache slice description module 205; finally, the cache slice description module 205 points to the corresponding cache description module 206.
- the cache description module 206 in the scene description file shown in FIG2 is mainly responsible for pointing to the corresponding media file, including the URI of the media file, the byte length of the media file and other information, and is used to describe the buffer for caching the media data of the media file.
- a buffer can be divided into one or more cache slices.
- the cache slice description module 205 is mainly responsible for partial access to the media data in the buffer, including the starting byte offset of the access data and the byte length of the access data, etc.
- the accessor description module 204 is mainly responsible for adding additional information to the partial data delineated in the cache slice description module 205, such as the data type, the number of data of this type, the numerical range of data of this type, etc.
- Such a three-layer structure can realize the function of retrieving partial data from a media file, which is conducive to the accurate retrieval of data and also convenient for reducing the number of media files.
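- A minimal sketch of this three-layer structure (all values are illustrative; in glTF2.0, "componentType": 5126 denotes 32-bit floats and "type": "VEC3" denotes three components per element):
  {
    "accessors": [
      { "bufferView": 0, "byteOffset": 0, "componentType": 5126, "count": 1000, "type": "VEC3" }
    ],
    "bufferViews": [
      { "buffer": 0, "byteOffset": 0, "byteLength": 12000 }
    ],
    "buffers": [
      { "uri": "media.bin", "byteLength": 12000 }
    ]
  }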
- the camera description module (camera) 207 in the scene description file shown in FIG2 is a next-level description module of the node description module 202, and is used to describe the viewpoint, viewing angle, and other information related to visual viewing when the user views the object described by the node description module 202.
- the node description module 202 can also point to the camera description module 207, and the camera description module 207 describes the viewpoint, viewing angle, and other information related to visual viewing when the user views the object described by the node description module 202.
- the light description module (light) 208 in the scene description file shown in FIG. 2 is a next-level description module of the node description module 202 , and is used to describe light intensity, ambient light color, light direction, light source position and other light-related information of the object described by the node description module 202 .
- the material description module (material) 209 in the scene description file shown in FIG2 is a description module at the next level of the mesh description module 203, and is used to describe the material information of the three-dimensional object described by the mesh description module 203.
- this process can also be referred to as texture mapping or adding textures.
- the scene description file in the glTF2.0 scene description standard also uses this description module.
- the material description module 209 uses a set of general parameters to define the material to describe the material information of the geometric objects appearing in the three-dimensional scene.
- the material description module 209 generally uses the metal-roughness model to describe the material of the virtual object, and the material characteristic parameters based on the metal-roughness model are represented by the widely used physically based rendering (PBR) material. Based on this, the material description module 209 makes a detailed description of the metal-roughness material attribute of the object.
- the syntax elements in the metal-roughness (material.PbrMetarialRoughness) of the material description module 209 are defined as shown in Table 5 below:
- each attribute in the metal-roughness of the material description module 209 can be defined using factors and/or textures (e.g., baseColorTexture and baseColorFactor). If no texture is given, it can be determined that all corresponding texture components in this material model have a value of 1.0. If both factors and textures are present, the factor value acts as a linear multiplier for the corresponding texture value. Texture binding is defined by the index of the texture object and the optional texture coordinate index.
- By parsing the material description module 209, it is possible to determine that the current material is named "gold" through the material name syntax element and its value ("name":"gold"); that the base color of the current material is [1.000, 0.766, 0.336, 1.0] through the base color syntax element under the pbrMetallicRoughness object and its value ("baseColorFactor":[1.000,0.766,0.336,1.0]); that the metallic value of the current material is 1.0 through the metallic syntax element and its value ("metallicFactor":1.0); and that the roughness value of the current material is 0.0 through the roughness syntax element and its value ("roughnessFactor":0.0).
- the texture description module (texture) 210 in the scene description file shown in FIG2 is a next-level description module of the material description module 209, which is used to describe the color of the three-dimensional object described by the material description module 209 and other characteristics used in the material definition. Texture is an important aspect of giving an object a real appearance. Texture can be used to define the main color of the object and other characteristics used in the material definition in order to accurately describe the appearance of the rendered object.
- the material itself can define multiple texture objects, which can be used as textures of virtual objects during rendering and can be used to encode different material properties.
- the texture description module 210 uses sampler syntax elements and texture map syntax element indexes to reference a sampler description module (sampler) 211 and a texture map description module (image) 212.
- the texture map description module 212 contains a uniform resource identifier (URI), which links to the texture map or binary file package actually used by the texture description module 210.
- the sampler description module 211 is used to describe the filtering and packaging mode of the texture.
- the respective responsibilities and cooperation relationships of the material description module 209, the texture description module 210, the sampler description module 211 and the texture map description module 212 include: the material description module 209 and the texture description module 210 together define the color and physical information of the object surface.
- the sampler description module 211 defines how to attach the texture map to the object surface.
- The texture description module 210 specifies the sampler description module 211 and the texture map description module 212. The addition of textures is implemented through the texture map description module 212, which uses a URI for identification and indexing and uses the accessor description module 204 to access data.
- the sampler description module 211 implements the specific adjustment and packaging of textures.
- The definition of the syntax elements in the texture description module 210 is shown in Table 6 below:
- the definition of the syntax elements in the sampler (texture.sampler) of the texture description module 210 is shown in Table 7 below:
- For example, the following is a JSON example of a material description module 209, a texture description module 210, a sampler description module 211, and a texture map description module 212:
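- The JSON text of this example is not included above; a minimal sketch consistent with the "gold" values described earlier (the texture file name and indices are illustrative; 9729, 9987 and 10497 are the glTF/OpenGL codes for LINEAR, LINEAR_MIPMAP_LINEAR and REPEAT) is:
  {
    "materials": [
      {
        "name": "gold",
        "pbrMetallicRoughness": {
          "baseColorFactor": [1.000, 0.766, 0.336, 1.0],
          "baseColorTexture": { "index": 0 },
          "metallicFactor": 1.0,
          "roughnessFactor": 0.0
        }
      }
    ],
    "textures": [ { "sampler": 0, "source": 0 } ],
    "samplers": [ { "magFilter": 9729, "minFilter": 9987, "wrapS": 10497, "wrapT": 10497 } ],
    "images": [ { "uri": "gold_basecolor.png" } ]
  }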
- the animation description module (animation) 213 in the scene description file shown in FIG2 is a description module of the next level of the node description module 202, and is used to describe the animation information added to the object described by the node description module 202.
- Animation can be added to the object described by the node description module 202. Therefore, the description level of the animation description module 213 in the scene description file is specified by the node description module 202; that is, the animation description module 213 is a description module at the next level of the node description module 202, and the animation description module 213 also has a corresponding relationship with the mesh description module 203.
- The animation description module 213 can describe animation in three ways: position translation, angle rotation, and size scaling, and can also specify the start and end times of the animation and the implementation method of the animation. For example, if an animation is added to a mesh description module 203 representing a three-dimensional object, the three-dimensional object represented by the mesh description module 203 can complete the specified animation process within the specified time window through a combination of translation, rotation, and scaling.
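- As an illustrative sketch (the accessor indices are hypothetical; accessor 5 would hold keyframe times, and accessors 6 and 7 the translation and rotation values), an animation description module combining translation and rotation on node 0 could look like:
  {
    "animations": [
      {
        "channels": [
          { "sampler": 0, "target": { "node": 0, "path": "translation" } },
          { "sampler": 1, "target": { "node": 0, "path": "rotation" } }
        ],
        "samplers": [
          { "input": 5, "interpolation": "LINEAR", "output": 6 },
          { "input": 5, "interpolation": "LINEAR", "output": 7 }
        ]
      }
    ]
  }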
- the skin description module (skin) 214 in the scene description file shown in FIG2 is a description module at the next level of the node description module 202, and is used to describe the motion cooperation relationship between the skeleton added to the node described by the node description module 202 and the mesh representing the surface information of the object.
- When the node described by the node description module 202 represents an object with a large degree of freedom of movement, such as a person, an animal, or a machine, a skeleton can be filled into the interior of the object in order to optimize the motion performance of the object, and the three-dimensional mesh representing the surface information of the object is conceptually the skin.
- The description level of the skin description module 214 is specified by the node description module 202; that is, the skin description module 214 is a description module at the next level of the node description module 202, and the skin description module 214 has a corresponding relationship with the mesh description module 203.
- Each description module of the scene description file in the above glTF2.0 scene description standard only has the most basic ability to describe three-dimensional objects. There are problems such as not supporting dynamic three-dimensional immersive media, not supporting audio files, and not supporting scene updates.
- glTF also declares that each of its objects has an optional extension attribute (extensions), allowing any part of it to be extended to achieve more complete functions: the scene description module (scene), node description module (node), mesh description module (mesh), accessor description module (accessor), cache description module (buffer), animation description module (animation), etc., and the syntax elements defined inside them all have optional extension attributes to support functional extensions based on glTF2.0.
- Based on this, the Moving Picture Experts Group (MPEG) initiated the formulation of the MPEG scene description standard, with the standard number ISO/IEC 23090-14. This standard mainly solves the cross-platform description problem of MPEG media (including MPEG-developed codecs, MPEG file formats, and MPEG transmission mechanisms) in 3D scenes.
- the MPEG#128 meeting decided to develop the MPEG-I Scene Description standard based on glTF2.0 (ISO/IEC 12113).
- the first version of the MPEG scene description standard has been developed and is in the FDIS voting stage.
- the MPEG scene description standard adds corresponding extensions to address the unrealized requirements in the cross-platform description of three-dimensional scenes, including interactivity, AR anchoring, user and avatar representation, tactile support, and extended support for immersive media codecs.
- the first version of the MPEG scene description standard has been formulated mainly to define the following contents:
- the MPEG scene description standard defines a scene description file format for describing immersive 3D scenes. This format combines the original glTF2.0 (ISO/IEC 12113) content and makes a series of extensions based on it.
- MPEG scene description defines a scene description framework and an application programming interface (API) for inter-module collaboration, which decouples the acquisition and processing of immersive media from the media rendering process, and is beneficial for optimizing the adaptation of immersive media to different network conditions, partial acquisition of immersive media files, access to different levels of detail of immersive media, and content quality adjustment. Decoupling the acquisition and processing of immersive media from the immersive media rendering process is the key to achieving cross-platform description of 3D scenes.
- MPEG scene description proposes a series of extensions based on the ISO Base Media File Format (ISOBMFF) (ISO/IEC 14496-12) for transmitting immersive media content.
- the scene description file is extended in the MPEG scene description standard based on the scene description file shown in FIG2 .
- the extension of the scene description file in the MPEG scene description standard can be divided into two groups:
- the first group of extensions includes: MPEG media (MPEG_media) 301, MPEG time-varying accessor (MPEG_accessor_timed) 302 and MPEG circular buffer (MPEG_buffer_circular) 303.
- MPEG media 301 is an independent extension used to reference external media sources
- MPEG time-varying accessor 302 is an extension of the accessor level used to access time-varying media
- MPEG circular buffer is an extension of the buffer level used to support circular buffers.
- the first group of extensions provides a basic description and format of the media in the scene, meeting the basic requirements of describing time-varying immersive media in the scene description framework.
- MPEG time-varying accessor (MPEG_accessor_timed) 302 is used to access time-varying media.
- the glTF2.0 scene description standard does not support time-varying media, when media data needs to change over time, it is necessary to update the scene description file under the glTF2.0 scene description standard to achieve this. For example, if the texture map on the surface of an object needs to be updated in the glTF2.0 scene description standard so that the texture map on the surface of the object can change over time, the scene description file under the glTF2.0 scene description standard must be updated. Frequent updates of scene description files require frequent parsing, processing, and transmission of scene description files, which increases the performance overhead in the 3D scene rendering process. Based on this, MPEG has designed the MPEG time-varying accessor (MPEG_accessor_timed) 302. The parameters in the MPEG time-varying accessor can change over time to change the access method of media data, thereby realizing that the accessed data changes over time, thereby avoiding frequent parsing, processing, and transmission of scene description files.
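- As a sketch, assuming the property names defined for this extension in ISO/IEC 23090-14 (bufferView, immutable, suggestedUpdateRate; the values here are illustrative and should be checked against the standard), a time-varying accessor could be declared as follows:
  {
    "accessors": [
      {
        "componentType": 5126,
        "type": "VEC3",
        "count": 1000,
        "extensions": {
          "MPEG_accessor_timed": {
            "bufferView": 0,
            "immutable": false,
            "suggestedUpdateRate": 25.0
          }
        }
      }
    ]
  }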
- the second group of extensions includes: MPEG dynamic scene (MPEG_scene_dynamic) 304, MPEG texture (MPEG_texture_video) 305, MPEG audio space (MPEG_audio_spatial) 306, MPEG viewport recommendation (MPEG_viewport_recommended) 307, MPEG mesh mapping (MPEG_mesh_linking) 308 and MPEG animation time (MPEG_animation_timing) 309.
- MPEG_scene_dynamic 304 is a scene level extension to support dynamic scene updates
- MPEG_texture_video 305 is a texture level extension to support textures in video form
- MPEG_audio_spatial 306 is a node level and camera level extension to support spatial 3D audio
- MPEG_viewport_recommended 307 is a scene level extension to support the description of recommended viewing angles in two-dimensional display
- MPEG_mesh_linking 308 is a mesh level extension to support linking two meshes and providing mapping information
- MPEG_animation_timing 309 is a scene level extension to support controlling the animation timeline.
- the MPEG media in the MPEG scene description file is used to describe the type of media files and to provide necessary instructions for MPEG type media files so that these MPEG type media files can be used later.
- the definition of the first level syntax elements of MPEG media is shown in Table 8 below:
- ISO/IEC 23090-14 also defines the transmission format for the delivery of scene description files and data related to the glTF 2.0 extension.
- ISO/IEC 23090-14 defines how to encapsulate glTF files and related data as non-time-varying and time-varying data (for example, as track samples) in ISOBMFF files.
- MPEG_scene_dynamic, MPEG_mesh_linking, and MPEG_animation_timing provide a specific form of time-varying data to the display engine, and the display engine 11 should perform corresponding operations based on this changing information.
- ISO/IEC 23090-14 also defines the format of each extended time-varying data and how to encapsulate it in the ISOBMFF file.
- MPEG_media allows reference to external media streams delivered via protocols such as RTP/SRTP, MPEG-DASH, etc.
- The referencing scheme requires the presence of a stream identifier in the query part of the URL, but does not specify a specific type of identifier, allowing the use of the Media Stream Identification scheme (RFC 5888), the labeling scheme (RFC 4575), or a zero-based indexing scheme.
- the main functions of the display engine 11 include obtaining a scene description file and parsing the obtained scene description file to obtain the composition structure of the three-dimensional scene to be rendered and the detailed information in the three-dimensional scene to be rendered, and rendering and displaying the three-dimensional scene to be rendered according to the information obtained by parsing the scene description file.
- The specific workflow and principle of the display engine 11 are not limited in the embodiments of the present application, as long as the display engine 11 can parse the scene description document, issue instructions to the media access function 12 through the media access function API, issue instructions to the cache management module 13 through the cache API, and retrieve the processed data from the cache to complete the rendering and display of the three-dimensional scene and the objects therein.
- the media access function 12 can receive instructions from the display engine 11, and complete the access and processing functions of the media files according to the instructions sent by the display engine 11. Specifically, it includes: after obtaining the media file, the media file is processed. There are large differences in the processing process of different types of media files. In order to achieve a wide range of media type support and considering the work efficiency of the media access function, a variety of pipelines are designed in the media access function. During the processing process, the pipeline that matches the media type can be enabled.
- the input of the pipeline is the media files downloaded from the server or the media files read from local storage. These media files often have a relatively complex structure and cannot be directly used by the display engine 11. Therefore, the main function of the pipeline is to process the data of such media files so that the data of the media files meets the requirements of the display engine 11.
- the media data processed by the pipeline needs to be delivered to the display engine 11 for use in a standardized arrangement structure, which requires the participation of the cache API and the cache management module 13.
- the cache API and cache management realize the creation of corresponding caches according to the format of the processed media data, and are responsible for the subsequent management of the cache, such as update, release and other operations.
- the cache management module 13 can communicate with the media access function 12 through the cache API, and can also communicate with the display engine 11. The goal of communicating with the display engine 11 and/or the media access function 12 is to achieve cache management.
- In some embodiments, the display engine 11 needs to first send the cache management instructions to the media access function 12 through the media access function API, and the media access function 12 then sends the cache management instructions to the cache management module 13 through the cache API.
- In other embodiments, the display engine 11 only needs to send the cache management description information parsed from the scene description document directly to the cache management module 13 through the cache API.
- the above embodiments introduce the basic process of rendering a three-dimensional scene including immersive media using a scene description framework, as well as the content and function of each functional module or file in the scene description framework.
- the immersive media in the three-dimensional scene can be a point cloud-based media file, a three-dimensional grid-based media file, a 6DoF-based media file, an MIV media file, etc.
- Some embodiments of the present application involve rendering a three-dimensional scene including a point cloud based on a scene description framework, so the following first describes the point cloud-related content.
- A point cloud refers to a massive collection of three-dimensional points. After the spatial coordinates of each sampling point on the surface of an object are obtained, a collection of points is obtained, which is called a point cloud. In addition to geometric coordinates, the points in a point cloud may also include other attribute information, such as color, normal vector, reflectivity, transparency, material type, etc. A point cloud can be obtained in a variety of ways. In some embodiments, the implementation method of obtaining a point cloud includes: observing an object with a camera array whose positions in space are known and fixed, and using the two-dimensional images obtained by the camera array together with related algorithms to obtain a three-dimensional representation of the object, thereby obtaining the point cloud corresponding to the object.
- the implementation method of obtaining point cloud includes: using a laser radar scanning device to obtain the point cloud corresponding to the object.
- the sensor of the laser radar scanning device records the electromagnetic waves emitted by the radar and reflected by the surface of the object, thereby obtaining the object volume information, and obtaining the point cloud corresponding to the object according to the object volume information.
- the implementation method of obtaining point cloud may also include: using artificial intelligence or computer vision algorithms to create three-dimensional volume information based on two-dimensional images, thereby obtaining the point cloud corresponding to the object.
- Point cloud provides a high-precision 3D expression for the fine digitization of the physical world and is widely used in 3D modeling, smart cities, autonomous navigation systems, augmented reality and other fields.
- the G-PCC encoder 400 can be divided into two parts: a geometry encoding module 41 and an attribute encoding module 42 .
- the geometry encoding module 41 can be further divided into an octree-based geometry encoding unit 411 and a prediction tree-based geometry encoding unit 412 .
- the main encoding steps of the geometric information of the point cloud to be encoded by the geometric encoding module 41 of the G-PCC encoder include: S401, extracting the geometric information (positions) in the point cloud to be encoded; S402, performing coordinate conversion on the geometric information so that the point cloud to be encoded is all contained in a bounding box; S403, voxelizing the geometric information after the coordinate conversion. That is, firstly, the geometric information after the coordinate conversion is quantized to scale the point cloud to be encoded.
- the quantization of the geometric information after the coordinate conversion also needs to determine whether to remove duplicate points according to parameters, and the process of quantization and removal of duplicate points is called voxelization.
- After the voxelization of the geometric information is completed, the geometric information is encoded by the octree-based geometry encoding unit 411 or by the prediction tree-based geometry encoding unit 412 to obtain the geometry information code stream of the point cloud to be encoded.
- the coding process of the prediction tree-based geometric coding unit 412 includes: S406, constructing a prediction tree structure. Including: sorting the points in the point cloud to be coded, the sorting methods include: unordered, Morton order, azimuth order and radial distance order, and using two different methods (high-latency slow method and low-latency fast method) to construct the prediction tree structure.
- S407 based on the structure of the prediction tree, traverse each node in the prediction tree, predict the geometric position information of the node by selecting different prediction modes to obtain the prediction residual, and quantize the geometric prediction residual using the quantization parameter.
- S408 arithmetic coding; including: through continuous iteration, arithmetic coding of the prediction residual of the prediction tree node position information, the prediction tree structure and the quantization parameters, etc., to generate a binary geometric information code stream.
- the process of encoding the attribute information of the point cloud to be encoded by the attribute encoding module 42 of the G-PCC encoder mainly includes: S408, extracting the attribute information (attributes) in the point cloud to be encoded; S409, performing attribute prediction on the attribute information; S410, performing lifting transformation on the attribute information; S411, performing region adaptive hierarchical transformation (RAHT) on the attribute information; S412, quantizing the coefficients of the RAHT transformation and the coefficients of the lifting transformation; S413, performing arithmetic coding on the quantized coefficients of the RAHT transformation and the coefficients of the lifting transformation to obtain the attribute information code stream.
- Step S414: reconstruct the geometric information according to the geometry code stream, and match the original attribute information (attributes) with the reconstructed geometric information.
- S415: recolor the reconstructed geometric information.
- the recoloring part in step S415 is to use the original point cloud to assign attribute information to the reconstructed point cloud, the goal is to make the attribute value of the reconstructed point cloud as similar as possible to the attribute value of the point cloud to be encoded, so as to minimize the error.
- the attribute prediction algorithm is an algorithm that uses the weighted sum of the reconstructed attribute values of the points that have been reconstructed in the three-dimensional space to obtain the predicted attribute value of the current point to be predicted.
- the attribute prediction algorithm can effectively remove the redundancy of the attribute space, thereby achieving the purpose of compressing the attribute information.
- the implementation method of attribute prediction may include: first, hierarchically divide the point cloud to be encoded by the level of detail (LOD) algorithm to establish a hierarchical structure of the point cloud to be encoded. Secondly, first encode and decode the low-level points and use the low-level points and the reconstructed points of the same level to predict the high-level points, thereby realizing progressive encoding.
- the implementation method of hierarchically dividing the point cloud to be encoded by the LOD algorithm may include: first, all points in the point cloud to be encoded are marked as unvisited, and the visited point set is represented as V. In the initial state, the visited point set V is empty. Loop through all unvisited points in the point cloud to be encoded, calculate the minimum distance D from the current point to the visited point set V, if D is less than the threshold distance, ignore the current point, otherwise mark the current point as visited, and add it to the visited point set V and the current subspace. Finally, the points in each subspace and all the subspaces before each subspace are merged to obtain the hierarchical structure of the point cloud to be encoded.
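- Expressed as a formula (with V denoting the set of already visited points and d_k an assumed per-level distance threshold that decreases as k increases), the refinement level R_k and the level of detail LOD_k can be written as:
  R_k = \{ P : \min_{Q \in V} \lVert P - Q \rVert \ge d_k \}, \qquad LOD_k = \bigcup_{j=0}^{k} R_j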
- For example, the point cloud to be encoded includes points P0 to P9.
- points P0, P2, P4, and P5 are added to the visited point set V and level R0 in sequence.
- points P1, P3, and P8 are added to the visited point set V and level R1 in sequence.
- points P6, P7, and P9 are added to the visited point set V and level R2 in sequence.
- the points in each level and all levels before each level are merged to obtain a hierarchical structure of the point cloud to be encoded including three levels.
- the first level is LOD0, including points P0, P2, P4, P5;
- the second level is LOD1, including points P0, P2, P4, P5, P1, P3, P8;
- the third level is LOD2, including points P0, P2, P4, P5, P1, P3, P8, P6, P7, P9.
- the lifting transformation is built on the prediction transformation and includes three parts: segmentation, prediction and update.
- the segmentation module 61 spatially divides the point cloud to be encoded into two parts: a high-level point cloud H(N) and a low-level point cloud L(N).
- the update module 63 defines and recursively updates the influence weight of each point based on the prediction residual D(N) and the distance between the predicted point and its neighboring points.
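- Written in the standard lifting form (with P and U denoting the prediction and update operators; this is a general formulation of the lifting scheme rather than text quoted from the patent), the relations between the split signals are:
  D(N) = H(N) - P(L(N)), \qquad L'(N) = L(N) + U(D(N))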
- RAHT transform is a hierarchical region adaptive transform algorithm based on Haar wavelet transform. Based on the hierarchical tree structure, the occupied child nodes in the same parent node are recursively transformed in a bottom-up manner along each dimension, the low-frequency coefficients obtained by the transformation are passed to the next level of the transformation process, and the high-frequency coefficients are quantized and entropy encoded.
- the above RAHT transformation can be implemented by RAHT transformation based on upsampling prediction.
- In the RAHT transformation based on upsampling prediction, the overall tree structure of the RAHT transformation is changed from bottom-up to top-down, and the transformation is still performed in 2×2×2 blocks.
- The transformation process includes: first, performing the RAHT transformation on the voxel block 71 along the first direction. If there is an adjacent voxel block in the first direction, the two are subjected to the RAHT transformation to obtain the weighted average (DC coefficient) and the residual (AC coefficient) of the attribute values of the two adjacent points.
- the DC coefficient obtained exists as the attribute information of the voxel block 122 of the parent node, and the RAHT transformation of the next layer is performed; and the AC coefficient is retained for the final encoding. If there are no adjacent points, the attribute value of the voxel block 71 is directly passed to the second-layer parent node. During the second-layer RAHT transformation, it is performed along the second direction. If there are adjacent voxel blocks in the second direction, the two are subjected to RAHT transformation, and the weighted average (DC coefficient) and residual (AC coefficient) of the attribute values of the two adjacent points are obtained.
- the third-layer RAHT transformation is performed along the third direction, and the parent node voxel block 73 with three color depths is obtained as the child node of the next layer in the octree, and then the RAHT transformation is performed cyclically along the first direction, the second direction, and the third direction until there is only one parent node in the entire point cloud to be encoded.
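- The per-pair transform commonly used in RAHT (stated here as a general formulation; a_1 and a_2 are the attribute values of two adjacent occupied voxels and w_1 and w_2 their accumulated point weights) is:
  \begin{bmatrix} a_{DC} \\ a_{AC} \end{bmatrix} = \frac{1}{\sqrt{w_1 + w_2}} \begin{bmatrix} \sqrt{w_1} & \sqrt{w_2} \\ -\sqrt{w_2} & \sqrt{w_1} \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}, \qquad w_{parent} = w_1 + w_2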
- the G-PCC decoder 800 may be divided into a geometry decoding module 81 and an attribute decoding module 82 .
- the geometry decoding module 81 may be further divided into an octree-based geometry decoding unit 811 and a prediction tree-based geometry decoding unit 812 .
- the main steps of the G-PCC decoder decoding the geometric information code stream through the octree-based geometric decoding unit 811 of the geometric decoding module 81 include: S801, arithmetic decoding; S802, octree synthesis; S803, surface fitting; S804, geometry reconstruction; S805, inverse coordinate conversion steps to obtain the geometric information of the point cloud.
- the geometric decoding of the octree-based geometric decoding unit 811 includes: in the order of breadth-first traversal, the placeholder code of each node is obtained by continuous parsing, and the nodes are continuously divided in turn until the division is stopped when the 1x1x1 unit cube is obtained, the number of points contained in each leaf node is obtained by parsing, and finally the geometric reconstruction point cloud information is restored.
- the main steps of the G-PCC decoder decoding the geometric information code stream through the prediction tree-based geometric decoding unit 812 of the geometric decoding module 81 include: S801, arithmetic decoding; S806, reconstruction of the prediction tree; S807, residual calculation; S804, reconstruction of geometry; S805, inverse coordinate conversion steps to obtain the geometric information of the point cloud.
- the main steps of attribute decoding based on the attribute decoding module 82 of the G-PCC decoder 800 include: S808, arithmetic decoding; S809, inverse quantization; executing steps S810 and S811, or executing step S812; S810, attribute prediction; S811, lifting transformation; S812, RAHT-based inverse transformation; S813, color inverse transformation to obtain the attribute information of the point cloud. Finally, the three-dimensional image model of the point cloud data to be encoded is restored based on the geometric information and attribute information.
- the main steps of the G-PCC decoder decoding the attribute information code stream based on the attribute decoding module 82 are the inverse of the main steps of the G-PCC encoder encoding the attribute information based on the attribute encoding module 42, and will not be repeated here.
- Some embodiments of the present application provide a scene description framework that supports point cloud code streams obtained by the G-PCC compression standard, and the specific contents include: scene description file support for media files of type G-PCC encoded point cloud, media access function API support for media files of type G-PCC encoded point cloud, media access function support for media files of type G-PCC encoded point cloud, cache API support for media files of type G-PCC encoded point cloud, cache management support for media files of type G-PCC encoded point cloud, and other contents.
- the process of rendering a media file of type G-PCC coded point cloud in a three-dimensional scene based on the scene description framework includes the following. First, the display engine obtains the scene description file by downloading it or reading it locally. The scene description file contains the description information of the entire three-dimensional scene and of the media files of type G-PCC coded point cloud contained in the scene.
- the description information of the media file of the type G-PCC coded point cloud may include the access address of the media file of the type G-PCC coded point cloud, the storage format of the processed decoded data of the media file of the type G-PCC coded point cloud, the playback time and playback frame rate of the media file of the type G-PCC coded point cloud, etc.
- after the display engine parses the scene description file, it passes the description information of the media file of type G-PCC coded point cloud contained in the scene description file to the media access function through the media access function API.
- the display engine calls the cache management module through the cache API to allocate the cache; alternatively, the display engine can pass the cache information to the media access function, and the media access function calls the cache management module through the cache API to allocate the cache.
- the media access function first requests the server to download the media file of the type G-PCC coded point cloud, or reads the media file of the type G-PCC coded point cloud from the local file.
- after obtaining the media file of type G-PCC coded point cloud, the media access function creates and starts the corresponding pipeline to process the media file of type G-PCC coded point cloud.
- the input of the pipeline is the encapsulated file of the media file of type G-PCC coded point cloud.
- the pipeline performs decapsulation, G-PCC decoding, post-processing and other processes in sequence, and then stores the processed data into the specified cache.
- the display engine obtains the decoded data of the media file of type G-PCC coded point cloud from the specified cache, and renders and displays the three-dimensional scene according to the data obtained in the cache.
- the following describes the scene description file, media access function API, media access function, cache API, and cache management of media files supporting the G-PCC coded point cloud type.
- some embodiments of the present application extend the values of the syntax elements in the MPEG media (MPEG_media) of the scene description file, and the specific extension includes at least one of the following:
- Extension 1: The media type syntax element (MPEG_media.media.alternatives.mimeType) used to declare the encapsulation format of the media file in the alternatives (MPEG_media.media.alternatives) of the media list (media) of the MPEG media (MPEG_media) of the scene description file is extended.
- the extension of the media type syntax element (mimeType) includes: extending the media type syntax element (mimeType) with a value "application/mp4" associated with the G-PCC coded point cloud.
- That is, when the type of the media file is G-PCC coded point cloud, the value of the media type syntax element (mimeType) is "application/mp4".
- Extension 2: The value of the first track index syntax element (MPEG_media.media.alternatives.tracks.track) used to declare the track information of the media file in the track array (MPEG_media.media.alternatives.tracks) of the alternatives of the media list (media) of the MPEG media (MPEG_media) of the scene description file is extended.
- the extension of the first track index syntax element includes: when G-PCC data is referenced by the scene description file as an item in the track array of the alternatives of the media list of the MPEG media, and the referenced item complies with the provisions on tracks in the ISO Base Media File Format (ISOBMFF): for G-PCC data encapsulated in a single track, the track referenced in the MPEG media is the G-PCC code stream track; for G-PCC data encapsulated in multiple tracks, the track referenced in the MPEG media is the G-PCC geometry code stream track.
- Extension 3: The codec parameter syntax element (MPEG_media.media.alternatives.tracks.codecs), which is used to describe the encoding and decoding parameters of the media data contained in the code stream track in the track array (tracks) of the alternatives (alternatives) of the media list (media) of the MPEG media (MPEG_media) of the scene description file, is extended.
- the specific extension includes: extending the codec parameters, defined in IETF RFC 6381, of the media files contained in the code stream track.
- the codec parameter syntax element (codecs) can be represented by a comma-separated list of codec values. Therefore, the extension of the value of the codec parameter syntax element (codecs) includes: when the type of the media file is G-PCC coded point cloud, the value of the codec parameter syntax element (codecs) should be set in accordance with the provisions of the ISO/IEC 23090-18 G-PCC data transport (Carriage of Geometry-based Point Cloud Compression Data) standard.
- the "codecs" attribute of the preselection signaling should be set to 'gpc1', indicating that the preselected media is based on a geometric point cloud;
- the "codecs" attribute of the Main G-PCC Adaptation Set should be set to 'gpcb' or 'gpeb', indicating that the adaptation set contains G-PCC Tile basic track data.
- the "codecs" attribute of the Main G-PCC adaptationsset should be set to 'gpcb'.
- the "codecs” attribute of the Main G-PCC Adaptation Set should be set to 'gpeb'.
- when G-PCC tile preselection signaling is used in an MPD file, the "codecs" attribute of the preselection signaling shall be set to 'gpt1', indicating that the preselected media is a geometry-based point cloud tile.
- some implementations of the present application extend the values of the syntax elements in the MPEG media (MPEG_media) in the scene description file, and the specific extension includes extending one or more of the following items shown in Table 12:
- At least one of the above extensions 1 to 3 is performed on the syntax element values in the MPEG media (MPEG_media) in the scene description file, so that the MPEG media (MPEG_media) part in the scene description file supports media files of the type G-PCC coded point cloud.
- a method for describing scenes and nodes in a scene description file containing a media file of type G-PCC coded point cloud includes: when a three-dimensional scene contains a media file of type G-PCC coded point cloud, the scene and node description method is used to describe the overall structure of the three-dimensional scene and the structural hierarchy and position of the media file of type G-PCC coded point cloud in the three-dimensional scene.
- the description method using a scene description module and a node description module is used to describe the overall structure of the three-dimensional scene and the structural hierarchy and position of the media file of type G-PCC coded point cloud in the three-dimensional scene, including: one three-dimensional scene is described using one scene description module.
- Each scene description file can describe one or more three-dimensional scenes, and the three-dimensional scenes can only be in a parallel relationship, not a hierarchical relationship.
- the nodes can be in a parallel relationship or a hierarchical relationship.
- a method for describing a three-dimensional mesh in a scene description file of a media file of type G-PCC encoded point cloud including: reusing the syntax elements in the attributes (mesh.primitives.attributes) of the primitives of the mesh description module to describe various types of data of the media file of type G-PCC encoded point cloud.
- a point cloud is a scattered data structure: a collection of many scattered points forms a point cloud, so describing a media file of type G-PCC encoded point cloud is equivalent to describing the data at each point in the point cloud.
- each point in a media file of type G-PCC encoded point cloud has two types of information: geometric information and attribute information.
- the geometric information represents the three-dimensional coordinates of the point in space
- the attribute information represents the color, reflectivity, normal direction, and other information attached to the point. Since the data contained in the points of a media file of type G-PCC encoded point cloud are similar to the attributes that can be declared by the syntax elements contained in the attributes of the primitives of the mesh description module, when describing the data contained in the points of a media file of type G-PCC encoded point cloud in the mesh description module (mesh), the syntax elements in the attributes (mesh.primitives.attributes) of the primitives (primitives) of the mesh description module (mesh) can be reused to describe these data.
- the value of the position syntax element (position, the first table item in Table 1 above) in the attribute of the primitive of the mesh description module is a three-dimensional vector composed of floating point numbers.
- Such a data structure can also represent the geometric information of the G-PCC coded point cloud. Therefore, the position syntax element (position) in the attributes (mesh.primitives.attributes) of the primitives of the mesh description module can be reused to represent the geometric information of the points in the media file of type G-PCC coded point cloud.
- Similarly, the color values of the points in the media file of type G-PCC coded point cloud can be represented by reusing the color syntax element (color_n, the fifth table item in Table 1 above) in the attributes (mesh.primitives.attributes) of the primitives of the mesh description module.
- The normal vectors of the points in the media file of type G-PCC coded point cloud can be represented by reusing the normal vector syntax element (normal, the third table item in Table 1 above) in the attributes (mesh.primitives.attributes) of the primitives of the mesh description module.
- a set of syntax elements supported in the attributes of the primitives of the mesh description module of the scene description file specified in the ISO/IEC 23090-14 MPEG-I scene description standard is defined as the first syntax element set. The description method of the three-dimensional mesh of the media file of type G-PCC coded point cloud includes: based on the syntax elements in the first syntax element set, adding syntax elements corresponding to the various types of data possessed by the three-dimensional mesh in the attributes of the primitives of the mesh description module corresponding to the three-dimensional mesh.
- Table 13 lists, for part of the data on the points in the media file of type G-PCC coded point cloud, the corresponding syntax elements in the attributes (mesh.primitives.attributes) of the primitives of the mesh description module:
- the G-PCC coded point cloud data may also include other data.
- Other data of the G-PCC coded point cloud can also be described by reusing the syntax elements in the attributes of the primitives of the mesh description module, such as texture coordinates (texcoord_n), joints (joints_n), weights (weights_n), etc.
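- As an illustrative sketch only (not the listing of the present example), a primitive that reuses these attributes to describe a G-PCC coded point cloud might look as follows; the accessor indices are hypothetical, the attribute names are written in the upper-case form used by glTF, and "mode": 0 marks the primitive as a set of points:

```json
{
  "primitives": [
    {
      "attributes": {
        "POSITION": 0,
        "COLOR_0": 1,
        "NORMAL": 2
      },
      "mode": 0
    }
  ]
}
```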
- a method for describing a three-dimensional mesh of a media file of type G-PCC coded point cloud comprising: adding a target extension array to an extension list of primitives (mesh.primitives.extensions) of a mesh description module, and adding syntax elements corresponding to various types of data contained in the three-dimensional mesh in the media file of type G-PCC coded point cloud in the target extension array, and describing geometric information, color data, normal vector and other data associated with each vertex of the three-dimensional mesh in the media file of type G-PCC coded point cloud through the syntax elements corresponding to various types of data.
- adding syntax elements corresponding to the various types of data contained in the corresponding three-dimensional mesh to the target extension array includes: adding the syntax elements corresponding to the various types of data contained in the corresponding three-dimensional mesh to the target extension array based on the syntax elements in a first syntax element set.
- the first syntax element set is a set of syntax elements supported by the attributes of the primitives of the mesh description module of the scene description file specified in the ISO/IEC 23090-14 MPEG-I scene description standard.
- alternatively, adding syntax elements corresponding to each type of data contained in the corresponding three-dimensional mesh to the target extension array includes: adding the syntax elements corresponding to each type of data contained in the corresponding three-dimensional mesh to the target extension array based on a preset second syntax element set composed of syntax elements corresponding to the G-PCC coded point cloud.
- syntax element used to represent the geometric information associated with each vertex is defined as the first syntax element
- syntax element used to represent the color data associated with each vertex is defined as the second syntax element
- syntax element used to represent the normal vector associated with each vertex is defined as the third syntax element.
- part of the syntax elements added to the target extension array of the extension list (mesh.primitives.extensions) of the primitives of the mesh description module is described as follows:
- FIG9 is a schematic diagram of a scene description file structure after adding a target extension array to the extension list (mesh.primitives.extensions) of the primitives of the mesh description module and extending the first syntax element, the second syntax element and the third syntax element in the target extension array based on the above embodiment.
- the scene description file includes but is not limited to the following modules: MPEG media (MPEG_media) 901, scene description module (scene) 902, node description module (node) 903, mesh description module (mesh) 904, accessor description module (accessor) 905, buffer slice description module (bufferView) 906, buffer description module (buffer) 907, skin description module (skin) 908, animation description module (animation) 909, camera description module (camera) 910, material description module (material) 911, texture description module (texture) 912, sampler description module (sampler) 913 and texture map description module (image) 914.
- the extension list of the primitives of the mesh description module 904 includes a target extension array 9000; the syntax elements extended in the target extension array 9000 include: a first syntax element 9001 for representing the geometric information associated with each vertex, a second syntax element 9002 for representing the color data associated with each vertex, and a third syntax element 9003 for representing the normal vector associated with each vertex.
- the functions, accessor types, data types and other information of other elements in the scene description file shown in FIG9 are similar to those in the scene description file shown in FIG3, and are not described in detail here.
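- A minimal sketch of this structure is given below; the extension name "MPEG_gpcc_primitive" and the key names inside it are hypothetical placeholders for the first, second and third syntax elements, and the accessor indices are illustrative:

```json
{
  "primitives": [
    {
      "extensions": {
        "MPEG_gpcc_primitive": {
          "position": 0,
          "color": 1,
          "normal": 2
        }
      },
      "mode": 0
    }
  ]
}
```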
- a mesh description method for a media file of type G-PCC coded point cloud includes: pre-configuring syntax elements corresponding to the various types of data of the G-PCC coded point cloud, and, based on the pre-configured syntax elements, adding the syntax elements corresponding to the various types of data to the attributes of the primitives of the mesh description module corresponding to the three-dimensional mesh in the G-PCC coded point cloud.
- the syntax elements corresponding to various types of data of the preconfigured G-PCC coded point cloud include: a fourth syntax element for representing geometric information associated with each vertex, a fifth syntax element for representing color data associated with each vertex, and a sixth syntax element for representing a normal vector associated with each vertex.
- syntax elements corresponding to various types of data are added to the attributes of primitives of a mesh description module corresponding to the three-dimensional mesh in the G-PCC coded point cloud, including: adding at least one of the fourth syntax element, the fifth syntax element, and the sixth syntax element to the attributes of the primitives of the mesh description module corresponding to the three-dimensional mesh in the G-PCC coded point cloud.
- the syntax element corresponding to the G-PCC coded point cloud for representing the geometric information associated with each vertex is defined as the fourth syntax element
- the syntax element corresponding to the G-PCC coded point cloud for representing the color data associated with each vertex is defined as the fifth syntax element
- the syntax element corresponding to the G-PCC coded point cloud for representing the normal vector associated with each vertex is defined as the sixth syntax element.
- part of the description method of the syntax elements in the attributes of the primitives of the mesh description module is as follows:
- Fig. 10 is a schematic diagram of the structure of a scene description file after the syntax elements in the attributes (mesh.primitives.attribute) of the primitives of the mesh description module are expanded based on the above embodiment.
- the scene description file includes but is not limited to the following modules: MPEG media (MPEG_media) 101, scene description module (scene) 102, node description module (node) 103, mesh description module (mesh) 104, accessor description module (accessor) 105, buffer slice description module (bufferView) 106, buffer description module (buffer) 107, skin description module (skin) 108, animation description module (animation) 109, camera description module (camera) 110, material description module (material) 111, texture description module (texture) 112, sampler description module (sampler) 113 and texture map description module (image) 114.
- the primitive attributes (mesh.primitives.attributes) of the mesh description module 104 include: the fourth syntax element 1041 for indicating the geometric information associated with each vertex, the fifth syntax element 1042 for indicating the color data associated with each vertex, and the sixth syntax element 1043 for indicating the normal vector associated with each vertex.
- the functions, accessor types, data types and other information of other elements in the scene description file shown in FIG10 are similar to those in the scene description file shown in FIG3, and are not described in detail here.
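- A sketch of this variant, in which new attribute names (the fourth, fifth and sixth syntax elements) are pre-configured for the G-PCC coded point cloud, might look as follows; the underscore-prefixed names are hypothetical application-specific attributes, and the accessor indices are illustrative:

```json
{
  "primitives": [
    {
      "attributes": {
        "_GPCC_POSITION": 0,
        "_GPCC_COLOR": 1,
        "_GPCC_NORMAL": 2
      },
      "mode": 0
    }
  ]
}
```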
- when the scene description file describes a three-dimensional scene containing a media file of type G-PCC coded point cloud, and the syntax elements in the attributes of the primitives of the mesh description module are reused to describe the G-PCC coded point cloud data, or a target extension array is added to the primitives of the mesh description module, or new syntax elements are extended in the attributes of the primitives of the mesh description module to describe the media file of type G-PCC coded point cloud, the mesh description module (mesh) will contain a large number of points of the G-PCC coded point cloud, and each point contains at least geometric information and attribute information. Therefore, it is inconvenient to store the data of the media file of type G-PCC coded point cloud directly in the scene description framework. Instead, the link to the media file of type G-PCC coded point cloud is pointed out in the scene description framework, and the media file is downloaded when the data of the G-PCC coded point cloud is needed.
- the scene description file may also be merged with a media file of the type of G-PCC coded point cloud to form a binary file to reduce the types and number of files.
- the media file of type G-PCC coded point cloud needs to be specified in the buffer description module (buffer), but the Uniform Resource Locator (URL) of the media file of type G-PCC coded point cloud is not directly added in the buffer description module. Instead, the value of the media index syntax element (media) in the MPEG circular buffer (MPEG_buffer_circular) in the buffer description module (buffer) points to the media description module corresponding to the media file of type G-PCC coded point cloud in the MPEG media (MPEG_media).
- for example, if the value of the uniform resource identifier syntax element (uri) in the alternatives of the media description module corresponding to the media file of type G-PCC coded point cloud in the media list (media) of the MPEG media (MPEG_media) is "http://www.example.com/G-PCCexample.mp4", and it is the first media description module in the MPEG media, then the value of the media index syntax element (media) of the MPEG circular buffer (MPEG_buffer_circular) can be set to "0". In this way, the link of the first media file in the MPEG media is indexed in the MPEG circular buffer of the buffer description module; that is, the media description module corresponding to the media file of type G-PCC coded point cloud in the MPEG media (MPEG_media) is indexed through the media index syntax element (media) in the MPEG circular buffer (MPEG_buffer_circular.media) of the buffer description module (buffer).
- a buffer description method (buffer) for a media file of type G-PCC coded point cloud includes: the value of the second track index syntax element (track) in the track array (tracks) of the MPEG circular buffer (MPEG_buffer_circular) of the buffer description module (buffer) indicates the track information of the data cached by the buffer.
- the MPEG circular buffer (MPEG_buffer_circular) is used to reduce the cache space required while still ensuring that the data can be buffered.
- the MPEG circular buffer can be regarded as connecting the head and tail of an ordinary buffer to form a ring; writing data to the circular buffer and reading data from the circular buffer rely on a write pointer and a read pointer, so that writing and reading can proceed simultaneously.
- the syntax elements contained in the MPEG circular buffer are shown in Table 16:
- the value of the media index syntax element (media) in Table 16 is the index value of the media description module corresponding to the media file of type G-PCC coded point cloud declared in the MPEG media (MPEG_media), so that the media file of type G-PCC coded point cloud can be indexed in the buffer description module (buffer). Based on the setting rules of the value of the track index syntax element (tracks) in Table 16, the value of the track index syntax element (tracks) is the index value of one or more code stream tracks of the media file of type G-PCC coded point cloud, so that the decoded data of the one or more code stream tracks can be cached in the corresponding buffer.
- a method for describing materials (material), texture (texture), sampler (sampler) and texture map (image) of a media file of type G-PCC encoded point cloud including: when a scene description file is used to describe a three-dimensional scene of a G-PCC encoded point cloud, materials (material), texture (texture), sampler (sampler) and texture map (image) are not used to describe the three-dimensional scene.
- since the G-PCC encoded point cloud has a scattered topological structure, it does not actually have the concept of a surface; the various additional information is represented directly on the points, while material, texture, sampler and texture map are all information attached to surfaces. Therefore, only the definitions of material (material), texture (texture), sampler (sampler) and texture map (image) are retained, but they are not used to describe the three-dimensional scene.
- a camera description module (camera) description method for a media file of type G-PCC encoded point cloud is supported, including: defining the viewpoint, viewing angle and other viewing-related visual information of a node in a three-dimensional scene through a camera description module.
- an animation description module (animation) description method for a media file of type G-PCC coded point cloud including: adding animation to a node description module (node) in a three-dimensional scene through an animation description module (animation).
- the animation description module may describe the animation added to the node description module (node) through one or more of position movement, angle rotation, and size scaling.
- the animation description module can also indicate at least one of the start time, duration, end time and implementation method of the animation added to the node description module (node).
- a description method of a skin description module (skin) for a media file of type G-PCC encoded point cloud includes: defining the movement and deformation relationship between a mesh (mesh) in a node description module (node) and the corresponding bone through the skin description module (skin).
- the scene description file supporting media files of type G-PCC coded point cloud provided in an embodiment of the present application is described below in conjunction with a specific scene description file.
- the pair of curly brackets between line 1 and line 118 contain the main contents of the scene description file supporting media files of type G-PCC coded point cloud, which includes: digital asset description module (asset), extension description module (extensionUsed), MPEG media (MPEG_media), scene declaration (scene), scene list (scenes), node list (nodes), mesh list (meshes), accessor list (accessors), buffer slice list (bufferViews), and buffer list (buffers).
- Digital asset description module (asset): The digital asset description module is lines 2 to 4. From the "version":"2.0" in line 3 of the digital asset description module, it can be determined that the scene description file is written based on glTF 2.0. From the parsing perspective, the display engine can determine which parser should be selected to parse the scene description file based on the digital asset description module.
- Extension description module used (extensionUsed): The extension description module used is lines 6 to 10. Since the extension description module used includes three syntax elements: MPEG media (MPEG_media), MPEG circular buffer (MPEG_buffer_circular) and MPEG time-varying accessor (MPEG_accessor_timed), it can be determined that the scene description file uses three MPEG extensions: MPEG media, MPEG circular buffer, and MPEG time-varying accessor. From the parsing perspective, the display engine can know in advance that the extension items involved in the subsequent parsing include: MPEG media, MPEG circular buffer, and MPEG time-varying accessor based on the content of the extension description module used.
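- A minimal sketch of how these two parts of the example file might look is given below (the original listing is not reproduced here; note that the glTF 2.0 specification spells the extension declaration "extensionsUsed", and the exact line layout is illustrative):

```json
{
  "asset": {
    "version": "2.0"
  },
  "extensionsUsed": [
    "MPEG_media",
    "MPEG_buffer_circular",
    "MPEG_accessor_timed"
  ]
}
```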
- MPEG media (MPEG_media): The MPEG media is lines 12 to 34.
- in the MPEG media, the track information of the media file of type G-PCC coded point cloud is indicated in the track array, the codec parameters of the media file of type G-PCC coded point cloud are indicated by "codecs":"gpc1" in line 26, the name of the media file of type G-PCC coded point cloud is indicated by "name":"G-PCCexample" in line 16, "autoplay":true in line 17 indicates that the media file of type G-PCC coded point cloud should be played automatically, and "loop":true in line 18 indicates that the media file of type G-PCC coded point cloud should be played in a loop.
- the display engine can determine that there is a media file of type G-PCC coded point cloud in the 3D scene to be rendered by parsing MPEG media, and learn the method of accessing and parsing the media file of type G-PCC coded point cloud.
- Scene declaration (scene): The scene declaration is line 36. Because a scene description file can theoretically include multiple 3D scenes, the scene description file first points out, through the scene declaration "scene":0 on line 36, that the 3D scene to be subsequently processed and rendered based on the scene description file is the first 3D scene in the scene list, that is, the 3D scene enclosed by the curly brackets on lines 39 to 43.
- Scene list (scenes): The scene list is lines 38 to 44.
- the scene list contains only one curly bracket, indicating that the scene list only includes one scene description module.
- the scene description file only contains one 3D scene.
- the "nodes":[0] in lines 40 to 42 in the curly bracket indicates that the 3D scene only includes one node, and the index value of the node description module corresponding to the node is 0.
- the content of the scene list clarifies that the entire scene description framework should select the first 3D scene in the scene list (the 3D scene with index 0) for subsequent processing and rendering, clarifies the overall structure of the 3D scene, and points to the next layer of more detailed node description modules (node).
- Node list (nodes): The node list is lines 46 to 51.
- the node list contains only one curly bracket, indicating that the node list includes only one node description module, and the three-dimensional scene has only one node, and the node is the same node as the node with an index value of 0 in the node description module in the scene description module, and the two are associated through indexing.
- "name":"G-PCCexample_node" in line 48 indicates that the name of the node is "G-PCCexample_node".
- "mesh":0 in line 49 indicates that the content mounted on the node is the three-dimensional mesh corresponding to the first mesh description module in the mesh list, which corresponds to the mesh description module of the next layer.
- the content of the node list indicates that the content mounted on the node is a three-dimensional mesh, and that the three-dimensional mesh is the three-dimensional mesh corresponding to the first mesh description module in the mesh list.
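- Based on the description above, the scene declaration, scene list and node list of the example file can be sketched roughly as follows (the exact line layout of the original listing is not reproduced):

```json
{
  "scene": 0,
  "scenes": [
    {
      "nodes": [0]
    }
  ],
  "nodes": [
    {
      "name": "G-PCCexample_node",
      "mesh": 0
    }
  ]
}
```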
- the mesh list is lines 53 to 66.
- the mesh list contains only one curly bracket, indicating that the mesh list includes only one mesh description module.
- the three-dimensional scene has only one three-dimensional mesh, and the three-dimensional mesh is the same three-dimensional mesh as the three-dimensional mesh with an index value of 0 in the node description module.
- in the curly brackets (mesh description module) describing the three-dimensional mesh, the name of the three-dimensional mesh is indicated by "name":"G-PCCexample_mesh" on line 55, which is used only as an identification mark.
- the "primitives" on line 56 indicates that the three-dimensional mesh has primitives.
- Buffer list (buffers): The buffer list is lines 106 to 117.
- the buffer list contains only one curly bracket, indicating that the scene description file only includes one buffer description module, and the display of the 3D scene only needs to access one media file.
- the MPEG circular buffer (MPEG_buffer_circular) extension is used, indicating that the buffer is a circular buffer modified using the MPEG extension.
- the "media:0" in line 112 indicates that the data source in the circular buffer is the media file corresponding to the first media description module declared in the MPEG media in the previous text.
- The track with index 1 referenced in the MPEG circular buffer is not limited here: it can be the only track of a media file of type G-PCC coded point cloud in a single-track encapsulation, or it can be the geometry code stream track of a media file of type G-PCC coded point cloud in a multi-track encapsulation.
- the syntax element "count”:5 in the MPEG circular buffer it can also be determined that the MPEG circular buffer has five storage links.
- the syntax element "by teLength”:15000 it can also be determined that the byte length (capacity) of the MPEG ring buffer is 15000 bytes.
- the buffer list realizes the correspondence between the media file of type G-PCC coded point cloud declared in the MPEG media and the buffer; in other words, the buffer references the media file of type G-PCC coded point cloud that was previously only declared but not used.
- the media files of the type G-PCC coded point cloud referenced here are unprocessed G-PCC encapsulated files.
- the G-PCC encapsulated files need to be processed by the media access function to extract the position coordinates (position) and color values (color_0) mentioned in the mesh description module, which can then be used directly for rendering.
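- Based on the values described above ("media":0, "count":5, "byteLength":15000 and the reference to the track with index 1), the buffer list can be sketched roughly as follows; the placement of the byte length and the representation of the track reference follow the description in this document and are illustrative:

```json
{
  "buffers": [
    {
      "byteLength": 15000,
      "extensions": {
        "MPEG_buffer_circular": {
          "media": 0,
          "tracks": [1],
          "count": 5
        }
      }
    }
  ]
}
```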
- Buffer slice list (bufferViews): The buffer slice list is lines 93 to 104.
- the buffer slice list contains two parallel curly brackets. Combined with the fact that there is only one buffer determined by the buffer description module, it means that the buffer used to store the media file of type G-PCC coded point cloud is divided into two cache slices, and the point cloud data in the media file of type G-PCC coded point cloud is stored in two cache slices.
- in the first curly bracket, the buffer description module with index 0 is first pointed to by "buffer":0 in line 95, that is, the only buffer description module mentioned in the buffer list, and then the data slice range of the corresponding cache slice is limited to the first 12000 bytes by the two parameters byte length (byteLength) and byte offset (byteOffset) in lines 96 and 94.
- the content in the second curly bracket is similar to the first curly bracket, except that the data slice range is defined as the last 3000 bytes. From a parsing perspective, the cache slice list groups the point cloud data in the media file of type G-PCC encoded point cloud, which facilitates the detailed definition in the subsequent accessor description modules.
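- The two cache slices described above can be sketched roughly as follows; the byte offsets follow the split into the first 12000 and the last 3000 bytes:

```json
{
  "bufferViews": [
    {
      "buffer": 0,
      "byteOffset": 0,
      "byteLength": 12000
    },
    {
      "buffer": 0,
      "byteOffset": 12000,
      "byteLength": 3000
    }
  ]
}
```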
- Accessor list (accessors): The accessor list is lines 68 to 91. The structure of the accessor list is similar to that of the cache slice list, and both contain two parallel curly braces, indicating that the accessor list includes two accessor description modules. The display of the three-dimensional scene requires access to media data through two accessors.
- both curly braces (accessor description modules) contain the extension MPEG time-varying accessor (MPEG_accessor_timed), indicating that these two accessors point to time-varying media defined by MPEG.
- in the first curly brace (the first accessor description module), the data format stored in the accessor is a three-dimensional vector composed of 32-bit floating point numbers. "count":1000 indicates that there are 1000 data elements that need to be accessed through an accessor of this format. Each 32-bit floating point number occupies 4 bytes; therefore, the accessor corresponding to the accessor description module contains 12000 bytes of data, which corresponds to the setting in the cache slice description module with an index value of 0.
- the content in the second curly brace (the second accessor description module) is similar.
- in the second accessor description module, the index value of the referenced cache slice description module is changed to 1, and the data type is redefined. From the perspective of parsing, the accessor list (accessors) completes the definition of the data required for rendering; for example, the data types missing in the cache slice description module and the cache description module are defined in the corresponding accessor description module.
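- A rough sketch of the two accessor description modules is given below; the first matches the description above (1000 three-dimensional vectors of 32-bit floating point numbers, i.e. 12000 bytes), while the component type of the second accessor is only one plausible choice (1000 three-component unsigned-byte colors, i.e. 3000 bytes), and the contents of the MPEG_accessor_timed extension are omitted:

```json
{
  "accessors": [
    {
      "bufferView": 0,
      "componentType": 5126,
      "type": "VEC3",
      "count": 1000,
      "extensions": {
        "MPEG_accessor_timed": {}
      }
    },
    {
      "bufferView": 1,
      "componentType": 5121,
      "type": "VEC3",
      "count": 1000,
      "extensions": {
        "MPEG_accessor_timed": {}
      }
    }
  ]
}
```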
- the main functions of the display engine supporting media files of type G-PCC coded point cloud are similar to the main functions of the display engine in the workflow of the scene description framework for immersive media described above, and include: 1. being able to parse the scene description file of the media file of type G-PCC coded point cloud to obtain the corresponding rendering method of the three-dimensional scene; 2. being able to pass media access instructions or media data processing instructions to the media access function through the media access function API, wherein the media access instructions or media data processing instructions come from the parsing result of the scene description file of the media file of type G-PCC coded point cloud; 3.
- the display engine can obtain a method for rendering a three-dimensional scene including a media file of type G-PCC coded point cloud by parsing the scene description file, and it needs to pass the method for rendering the three-dimensional scene to the media access function, or send instructions to the media access function based on the method for rendering the three-dimensional scene.
- the process of passing the method for rendering the three-dimensional scene to the media access function or sending instructions to the media access function based on the method for rendering the three-dimensional scene is implemented through the media access function API.
- the display engine may send a media access instruction or a media data processing instruction to the media access function through the media access function API.
- the media access instruction or the media data processing instruction sent by the display engine to the media access function through the media access function API comes from the parsing result of the scene description file of the media file of the type G-PCC coded point cloud, and the media access instruction or the media data processing instruction may include: the index of the media file of the type G-PCC coded point cloud, the URL of the media file of the type G-PCC coded point cloud, the attribute information of the media file of the type G-PCC coded point cloud, the display time window of the media file of the type G-PCC coded point cloud, the format requirements for the processed media file of the type G-PCC coded point cloud, etc.
- the media access function can also request media access instructions or media data processing instructions from the display engine through the media access function API.
- after the media access function receives the media access instruction or media data processing instruction issued by the display engine through the media access function API, it executes that instruction, for example: obtaining a media file of type G-PCC coded point cloud, establishing a suitable pipeline for a media file of type G-PCC coded point cloud, allocating a suitable cache for a processed media file of type G-PCC coded point cloud, etc.
- the media access function obtains a media file of a G-PCC coded point cloud type, including: using a network transmission service to download the media file of a G-PCC coded point cloud type from a server.
- the media access function obtains a media file of a G-PCC coded point cloud type, including: reading the media file of a G-PCC coded point cloud type from a local storage space.
- after the media access function obtains the media file of type G-PCC coded point cloud, it needs to process the media file of type G-PCC coded point cloud. There are significant differences in the processing processes of different types of media files. In order to achieve wide media type support while taking into account the work efficiency of the media access function, a variety of pipelines are designed in the media access function; in the process of processing media files, only the pipeline that matches the media type is enabled.
- the media access function needs to establish a corresponding pipeline for the media file of type G-PCC coded point cloud, and perform decapsulation, G-PCC decoding, post-processing and other processes on the media file of type G-PCC coded point cloud through the established pipeline to complete the processing of the media file of type G-PCC coded point cloud, and process the media file data of type G-PCC coded point cloud into a data format that can be used for direct rendering by the display engine.
- FIG. 11 is a schematic diagram of the structure of the pipeline corresponding to the G-PCC coded point cloud in some embodiments of the present application.
- the pipeline 1100 supporting media files of the type of G-PCC coded point cloud includes: an input module 111 , a decapsulation module 112 , a geometry decoder 113 , an attribute decoder 114 , a first post-processing module 115 , and a second post-processing module 116 .
- the input module 111 is used to receive a G-PCC encapsulation file and input the G-PCC encapsulation file into the decapsulation module 112.
- the G-PCC encapsulation file is a file obtained by encapsulating the G-PCC code stream obtained by G-PCC encoding the point cloud data. Since the G-PCC encapsulation file is presented in the form of a track, what the input module 111 receives is the track code stream of the G-PCC encapsulation file.
- the G-PCC encapsulation file can be a single track or a multi-track. Therefore, in the embodiment of the present application, the G-PCC encapsulation file received by the input module 111 can be a single track or a multi-track, and the embodiment of the present application does not limit this.
- the decapsulation module 112 is used to decapsulate the G-PCC encapsulation file input by the input module 111 to obtain a G-PCC code stream (including a geometry information code stream and an attribute information code stream), input the geometry information code stream to the geometry decoder 113, and input the attribute information code stream to the attribute decoder 114. It should be noted that, with the development of related technologies, the G-PCC code stream may also include code streams of other information. When the G-PCC code stream also includes a code stream of other information, the decapsulation module 112 decapsulates the G-PCC encapsulation file to obtain the code stream of the other information, and inputs it into the corresponding decoder.
- the geometry decoder 113 is used to decode the geometry information code stream output by the decapsulation module 112 to obtain the geometry information of the point cloud.
- the main steps of the geometry decoder 113 decoding the geometry information code stream include: obtaining the geometry information of the point cloud through arithmetic decoding, octree synthesis, surface fitting, geometry reconstruction, inverse coordinate conversion, etc.
- the specific implementation of the geometry decoder 113 decoding the geometry information code stream can refer to the workflow of the geometry decoding module 81 in Figure 8, which will not be described in detail here.
- the attribute decoder 114 is used to decode the attribute information code stream input by the decapsulation module 112 to obtain the attribute information of the point cloud.
- the main steps of the attribute decoder 114 decoding the attribute information code stream include: attribute prediction, lifting transformation, inverse RAHT transformation, etc., to obtain the attribute information of the point cloud.
- the specific implementation of the attribute decoder 114 decoding the attribute information code stream can refer to the workflow of the attribute decoding module 82 in Figure 8, which will not be described in detail here.
- the first post-processing module 115 is used to process the geometric information output by the geometry decoder 113. After the decoding of the geometric information code stream is completed, the geometric information of the points in the G-PCC encoded point cloud can be obtained, and in some cases the obtained geometric information can be directly used by the display engine, but because the scene description framework does not impose too many restrictions on the display engine or specifically define it, a wide variety of display engines may appear. These different display engines may have different requirements for input data, so after the decoding of the geometric information code stream is completed, the first post-processing module 115 is added to ensure that the geometric information output by the pipeline is available to any display engine. In some embodiments, the first post-processing module 115 processes the geometric information including: converting the format of the geometric information.
- the second post-processing module 116 is used to process the attribute information output by the attribute decoder 114. After the decoding of the attribute information code stream is completed, the attribute information of the points in the G-PCC encoded point cloud can be obtained, and in some cases the attribute information can be directly used by the display engine, but because the scene description framework does not impose too many restrictions on the display engine or specifically define it, a wide variety of display engines may appear. These different display engines may have different requirements for input data, so after the decoding of the attribute information code stream is completed, the second post-processing module 116 is added to ensure that the attribute information output by the pipeline is available to any display engine. In some embodiments, the second post-processing module 116 processes the attribute information including: format conversion of the attribute information.
- the processed geometric information output by the first post-processing module 115 and the processed attribute information output by the second post-processing module 116 are written into the buffer 117, so that the display engine 118 reads the geometric information and attribute information from the buffer as needed, and renders and displays the G-PCC encoded point cloud in the three-dimensional scene based on the read geometric information and attribute information.
- after the media access function completes the processing of the G-PCC encoded point cloud data through the pipeline, it also needs to deliver the processed data to the display engine in a standardized arrangement structure. This requires the processed G-PCC encoded point cloud data to be correctly stored in the cache. This work is completed by the cache management module, but the cache management module needs to obtain cache management instructions from the media access function or the display engine through the cache API.
- the media access function may send a cache management instruction to the cache management module via a cache API, wherein the cache management instruction is a cache management instruction sent by the display engine to the media access function via the media access function API.
- the display engine may send cache management instructions to the cache management module via a cache API.
- the cache management module can communicate with the media access function through the cache API, and can also communicate with the display engine through the cache API, and the purpose of communicating with the media access function or the display engine is to achieve cache management.
- the display engine needs to send the cache management instruction to the media access function through the media access function API first, and the media access function then sends the cache management instruction to the cache management module through the cache API;
- the display engine only needs to generate the cache management instruction based on the cache management information parsed from the scene description file, and send it to the cache management module through the cache API.
- the cache management instruction may include one or more of an instruction to create a cache, an instruction to update a cache, and an instruction to release a cache.
- the processed G-PCC encoded point cloud data needs to be delivered to the display engine in a standardized arrangement structure. This requires the processed G-PCC encoded point cloud data to be correctly stored in the cache, and this task is the responsibility of the cache management module.
- the cache management module implements management operations such as cache creation, update, and release, and the operation instructions are received through the cache API.
- the cache management rules are recorded in the scene description document, parsed by the display engine, and finally issued to the cache management module by the display engine or the media access function.
- the role of cache management is to manage these caches so that they match the format of the processed media data without disrupting the processed media data.
- the specific design method of the cache management module should be based on the design of the display engine and the media access function.
- some embodiments of the present application provide a method for generating a scene description file.
- the method for generating a scene description file includes the following steps S121 to S123:
- S121 Determine the type of the media file in the 3D scene to be rendered.
- the types of media files in the embodiments of the present application may include one or more of: G-PCC encoded point cloud, V-PCC encoded point cloud, tactile media file, 6DoF video, MIV video, etc., and any number of media files of the same type may be included.
- the three-dimensional scene to be rendered may include only one media file of the type G-PCC encoded point cloud.
- the three-dimensional scene to be rendered may include a media file of the type G-PCC encoded point cloud and a media file of the type V-PCC encoded point cloud.
- the three-dimensional scene to be rendered may include two media files of the type G-PCC encoded point cloud and a tactile media file.
- in the above step S121, if the type of the target media file in the to-be-rendered three-dimensional scene is G-PCC coded point cloud, the following step S122 is performed:
- S122 Generate a target description module corresponding to the target media file according to the description information of the target media file.
- the description information of the target media file includes: one or more of: the name of the target media file, whether the target media file needs to be played automatically, whether the target media file needs to be played in a loop, the encapsulation format of the target media file, the type of the code stream of the target media file, the encoding parameters of the target media file, etc.
- the above step S122 (generating a target description module corresponding to the target media file according to the description information of the target media file) includes at least one of the following steps 1221 to 1229:
- Step 1221 Add a media name syntax element (name) in the target media description module, and set the value of the media name syntax element according to the name of the target media file.
- For example, if the media name syntax element in the target media description module is "name" and the name of the target media file is "G-PCCexample", the syntax element "name" is added to the target media description module and its value is set to "G-PCCexample".
- Step 1222 Add an autoplay syntax element (autoplay) in the target media description module, and set the value of the autoplay syntax element according to whether the target media file needs to be automatically played.
- For example, if the autoplay syntax element in the target media description module is "autoplay": when the target media file needs to be played automatically, the syntax element "autoplay" is added to the target media description module and its value is set to "true"; when the target media file does not need to be played automatically, the syntax element "autoplay" is added to the target media description module and its value is set to "false".
- Step 1223 Add a loop playback syntax element (loop) in the target media description module, and set the value of the loop playback syntax element according to whether the target media file needs to be played in a loop.
- For example, if the loop playback syntax element in the target media description module is "loop": when the target media file needs to be played in a loop, the syntax element "loop" is added to the target media description module and its value is set to "true"; when the target media file does not need to be played in a loop, the syntax element "loop" is added to the target media description module and its value is set to "false".
- Step 1224 Add alternatives in the target media description module.
- Step 1225 Add a media type syntax element (mimeType) to the alternatives, and set the value of the media type syntax element to the encapsulation format value corresponding to the G-PCC coded point cloud.
- the encapsulation format corresponding to the G-PCC encoded point cloud is MP4, and the encapsulation format value corresponding to the G-PCC encoded point cloud is: application/mp4.
- For example, if the media type syntax element is "mimeType" and the encapsulation format value corresponding to the G-PCC encoded point cloud is "application/mp4", the syntax element "mimeType" is added to the alternatives of the target media description module and its value is set to "application/mp4".
- Step 1226 Add a uniform resource identifier syntax element (URI) to the alternatives, and set the value of the uniform resource identifier syntax element to the access address of the target media file.
- For example, if the uniform resource identifier syntax element is "uri" and the access address of the target media file is "http://www.exp.com/G-PCCexp.mp4", the syntax element "uri" is added to the alternatives of the target media description module and its value is set to "http://www.exp.com/G-PCCexp.mp4".
- Step 1227 Add an array of tracks to the alternatives.
- Step 1228 Add a first track index syntax element (track) to the track array (tracks) of the options (alternatives) of the target media description module, and set the value of the first track index syntax element (track) according to the encapsulation method of the target media file.
- setting the value of the first track index syntax element (track) according to the encapsulation method of the target media file includes:
- the target media file is a single-track encapsulation file, setting the value of the first track index syntax element to the index value of the code stream track of the target media file;
- if the target media file is a multi-track encapsulation file, the value of the first track index syntax element is set to the index value of the geometry code stream track of the target media file.
- the encapsulation method of the G-PCC coded point cloud includes single-track encapsulation and multi-track encapsulation.
- single-track encapsulation refers to the encapsulation method of encapsulating the geometric code stream and attribute code stream of the G-PCC coded point cloud in the same code stream track
- multi-track encapsulation refers to the encapsulation method of encapsulating the geometric code stream and attribute code stream of the G-PCC coded point cloud in multiple code stream tracks respectively.
- Step 1229 Add a codec parameter syntax element (codecs) to the track array (tracks) of the alternatives of the target media description module, and set the value of the codec parameter syntax element according to the encoding parameters of the target media file, the type of the code stream of the target media file, and the ISO/IEC 23090-18 G-PCC data transport standard.
- the ISO/IEC 23090-18 G-PCC data transport standard stipulates that, when the G-PCC coded point cloud is encapsulated for DASH delivery: when G-PCC preselection signaling is used in the MPD file, the "codecs" attribute of the preselection signaling should be set to 'gpc1', indicating that the preselected media is a geometry-based point cloud; when there are multiple G-PCC tile tracks in the G-PCC container, the "codecs" attribute of the Main G-PCC Adaptation Set should be set to 'gpcb' or 'gpeb', indicating that the adaptation set contains G-PCC tile base track data.
- the "codecs" attribute of the Main G-PCC adaptivesset should be set to 'gpcb'.
- the "codecs" attribute of the Main G-PCC Adaptation Set should be set to 'gpeb'.
- the "codecs” attribute of the preselection signaling should be set to 'gpt1', indicating that the preselected media is a point cloud fragment based on geometry.
- the value of "codecs" in "tracks” of "alternatives” of the target media description module can be set to 'gpc1'.
- the encapsulation format value corresponding to the G-PCC coded point cloud is "application/mp4"
- the name of the target media file is "G-PCCexample"
- the target media file is automatically played and looped
- the access address of the target media file is: http://www.exp.com/G-PCCexp.mp4
- the target media file is a single-track encapsulation file and the index value of the code stream track of the target media file is 1
- the target media file is encapsulated using DASH and the G-PCC pre-selected signaling is used in the MPD file
- the target media description module corresponding to the target media file can be as follows:
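- purely as an illustrative sketch reconstructed from the values stated above (the exact nesting of "alternatives" and "tracks" and the form of the "track" value are assumptions, not the authoritative listing), such a target media description module might look like:
    {
      "name": "G-PCCexample",
      "autoplay": true,
      "loop": true,
      "alternatives": [
        {
          "mimeType": "application/mp4",
          "uri": "http://www.exp.com/G-PCCexp.mp4",
          "tracks": [
            {
              "track": 1,
              "codecs": "gpc1"
            }
          ]
        }
      ]
    }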
- the target media description module is a media description module generated based on the description information of the target media file.
- the encapsulation format value corresponding to the G-PCC coded point cloud is application/mp4
- the name of the target media file is "G-PCCexample1"
- the target media file is automatically played and looped
- the access address ("uri") of the target media file is http://www.exp.com/G-PCCexp.mp4
- the target media file is a single-track encapsulation file
- the index value of the code stream track of the target media file is 1
- the target media file is encapsulated with DASH and the G-PCC pre-selection signaling is used in the MPD file
- the MPEG media of the scene description file can be as follows:
- the three-dimensional scene to be rendered may also include multiple media files, and the type of one or more media files among the multiple media files is G-PCC coded point cloud.
- when generating the scene description file, it is necessary to add a media description module corresponding to each media file of the G-PCC coded point cloud type according to the above embodiment, and to add the media description modules corresponding to media files of other types according to the scene description file generation methods for those types of media files.
- the media files in the three-dimensional scene to be rendered include a target media file of type G-PCC coded point cloud and a tactile media file
- the encapsulation format value corresponding to the G-PCC coded point cloud is "application/mp4"
- the name of the target media file is "G-PCCexample"
- the target media file is automatically played and looped
- the access address ("uri") of the target media file is http://www.exp.com/G-PCCexp.mp4
- the target media file is a single-track encapsulation file
- the index value of the bitstream track of the target media file is 1
- the target media file is encapsulated with DASH and G-PCC pre-selection signaling is used in the MPD file
- the MPEG media of the scene description file can be as follows:
- the media list (media) of the MPEG media includes two curly-bracket-enclosed entries.
- the first curly bracket (lines n+2 to n+18) encompasses the media description module corresponding to the target media file of type G-PCC coded point cloud, and the second curly bracket (lines n+19 to n+35) encompasses the media description module corresponding to the tactile media file.
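- a sketch of such an MPEG media list with the two entries described above; the G-PCC entry reuses the values stated earlier, while every field of the tactile (haptic) entry, including its name and uri, is a hypothetical placeholder:
    "MPEG_media": {
      "media": [
        {
          "name": "G-PCCexample",
          "autoplay": true,
          "loop": true,
          "alternatives": [
            {
              "mimeType": "application/mp4",
              "uri": "http://www.exp.com/G-PCCexp.mp4",
              "tracks": [ { "track": 1, "codecs": "gpc1" } ]
            }
          ]
        },
        {
          "name": "haptic_example",
          "alternatives": [
            {
              "mimeType": "application/mp4",
              "uri": "http://www.exp.com/haptic_example.mp4",
              "tracks": [ { "track": 1 } ]
            }
          ]
        }
      ]
    }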
- when generating a scene description file for a three-dimensional scene to be rendered, the method for generating a scene description file first determines the type of the media files in the three-dimensional scene to be rendered, and when the type of the target media file in the three-dimensional scene to be rendered is G-PCC coded point cloud, it generates a target media description module corresponding to the target media file according to the description information of the target media file and adds the target media description module to the media list of the MPEG media in the scene description file of the three-dimensional scene to be rendered.
- the method for generating a scene description file can generate a target description module corresponding to the target media file according to the description information of the target media file, and adds the target media description module to the media list of the MPEG media in the scene description file of the three-dimensional scene to be rendered when the media files in the three-dimensional scene to be rendered include a target media file of the type of a G-PCC coded point cloud.
- that is, when the media files in the three-dimensional scene to be rendered include a target media file of the G-PCC coded point cloud type, a media description module corresponding to the target media file is added to the media description module list of the MPEG media of the scene description file. Therefore, the embodiment of the present application can generate a scene description file for a three-dimensional scene that includes media files of the type G-PCC coded point cloud, thereby realizing the scene description file's support for media files of the type G-PCC coded point cloud.
- the method for generating a scene description file further includes:
- a target scene description module (scene) corresponding to the three-dimensional scene to be rendered is added to the scene list (scenes) of the scene description file, and an index value of a node description module corresponding to a node in the scene to be rendered is added to the node list (nodes) of the target scene description module.
- the three-dimensional scene to be rendered includes two nodes, and the index values of the node description modules (node) corresponding to the two nodes are 0 and 1 respectively, then the target scene description module corresponding to the three-dimensional scene to be rendered added in the scene description file can be as follows:
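- a minimal sketch of such a target scene description module (only the node index values stated above are taken from the description; all other fields are omitted):
    "scenes": [
      {
        "nodes": [ 0, 1 ]
      }
    ]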
- the 3D scene to be rendered includes two nodes, and the index values of the node description modules corresponding to the two nodes are 0 and 1 respectively, so two index values 0 and 1 are added to the node list (nodes) of the scene description module corresponding to the 3D scene to be rendered.
- the method for generating a scene description file further includes:
- a node name syntax element (name) is added to the node description module, and a value of the node name syntax element (name) in the corresponding node description module is set according to the name of the node.
- the three-dimensional scene to be rendered includes two nodes, the names of the two nodes are G-PCCexp_node1 and G-PCCexp_node2, the index values of the mesh description module corresponding to the three-dimensional mesh contained in the node G-PCCexp_node1 are 0 and 1 respectively, and the index value of the mesh description module corresponding to the three-dimensional mesh contained in the node G-PCCexp_node2 is 2, then the node list (nodes) part of the scene description file can be as follows:
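- a sketch of such a node list built from the names and mesh index values stated above; showing the two mesh indices of G-PCCexp_node1 as an array is an assumption made only for illustration:
    "nodes": [
      {
        "name": "G-PCCexp_node1",
        "mesh": [ 0, 1 ]
      },
      {
        "name": "G-PCCexp_node2",
        "mesh": 2
      }
    ]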
- the node list (nodes) of the scene description file corresponding to the 3D scene to be rendered includes two node description modules, the first node description module is the content enclosed by the curly braces of lines n+2 to n+5, and the second node description module is the content enclosed by the curly braces of lines n+6 to n+9.
- the value of the node name syntax element (name) in the first node description module is set to the name of the corresponding node "G-PCCexp_node1"
- the value of the mesh index syntax element (mesh) in the first node description module is set to the index values 0 and 1 of the mesh description module of the 3D mesh mounted on the corresponding node
- the value of the node name syntax element (name) in the second node description module is set to the name of the corresponding node "G-PCCexp_node2"
- the value of the mesh index syntax element (mesh) in the second node description module is set to the index value 2 of the mesh description module of the 3D mesh mounted on the corresponding node.
- the method for generating a scene description file further includes:
- a mesh description module (mesh) corresponding to the three-dimensional mesh in the scene to be rendered is added to the mesh list (meshes) of the scene description file, syntax elements corresponding to various types of data contained in the three-dimensional mesh corresponding to the mesh description module are added to the mesh description module, and the value of the syntax element corresponding to each type of data is set to the index value of the accessor description module corresponding to the accessor for accessing each type of data.
- the data contained in the three-dimensional mesh may include one or more of: geometric coordinates (position), color values (color), normal vectors (normal), tangent vectors (tangent), texture coordinates (texcoord), joints (joints), and weights (weights).
- adding syntax elements corresponding to various types of data contained in the three-dimensional grid corresponding to the grid description module in the grid description module includes:
- an extension list (extensions) is added to the primitives (primitives) of the mesh description module corresponding to the three-dimensional mesh in the target media file, a target extension array is added to the extension list (extensions), and syntax elements corresponding to each type of data contained in the corresponding three-dimensional mesh are added to the target extension array.
- the target extension array may be MPEG_primitve_GPCC.
- adding syntax elements corresponding to various types of data contained in the corresponding three-dimensional grid to the target extension array includes: adding syntax elements corresponding to various types of data contained in the corresponding three-dimensional grid to the target extension array based on syntax elements in a first syntax element set.
- the first syntax element set is a set of syntax elements supported by the attributes of primitives of a grid description module of a scene description file specified in the ISO/IEC 23090-14 MPEG-I scene description standard.
- the syntax elements supported by the attributes of the primitives of the grid description module of the scene description file specified in the ISO/IEC 23090-14 MPEG-I scene description standard include: position, color_n, normal, tangent, texcoord, joints, weights, so the first syntax element set is: {position, color_n, normal, tangent, texcoord, joints, weights}.
- a certain three-dimensional mesh includes geometric coordinates and color data
- the index value of the accessor description module corresponding to the accessor for accessing the geometric coordinates is 0, and the index value of the accessor description module corresponding to the accessor for accessing the color data is 1.
- adding syntax elements corresponding to each type of data contained in the corresponding three-dimensional grid to the target extension array includes: adding, based on a preset second syntax element set composed of syntax elements corresponding to the G-PCC coded point cloud, the syntax elements corresponding to each type of data contained in the corresponding three-dimensional grid to the target extension array.
- the syntax elements corresponding to the G-PCC coded point cloud may include: G-PCC_position, G-PCC_color_n, G-PCC_normal, G-PCC_tangent, G-PCC_texcoord, G-PCC_joints, G-PCC_weights, and accordingly, the second syntax element set is: {G-PCC_position, G-PCC_color_n, G-PCC_normal, G-PCC_tangent, G-PCC_texcoord, G-PCC_joints, G-PCC_weights}.
- a certain three-dimensional mesh includes geometric coordinates and color data
- the index value of the accessor description module corresponding to the accessor for accessing the geometric coordinates is 0, and the index value of the accessor description module corresponding to the accessor for accessing the color data is 1.
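- under these assumptions, the target extension array for such a mesh might be sketched as follows (the accessor index values 0 and 1 come from the description above; using G-PCC_color_0 as the instance of G-PCC_color_n and nesting the array under "extensions" follow the convention described earlier):
    "extensions": {
      "MPEG_primitve_GPCC": {
        "G-PCC_position": 0,
        "G-PCC_color_0": 1
      }
    }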
- syntax elements corresponding to various types of data contained in the three-dimensional mesh corresponding to the mesh description module are added in the mesh description module, including: adding syntax elements corresponding to various types of data contained in the three-dimensional mesh corresponding to the mesh description module in the attributes of the primitives of the mesh description module.
- adding syntax elements corresponding to various types of data contained in the three-dimensional mesh corresponding to the mesh description module to the attributes of the primitives of the mesh description module includes: adding syntax elements corresponding to various types of data contained in the three-dimensional mesh corresponding to the mesh description module to the attributes of the primitives of the mesh description module based on the first syntax element set.
- the first syntax element set is a set of syntax elements supported in the attributes of the primitives of the mesh description module of the scene description file specified in the ISO/IEC 23090-14 MPEG-I scene description standard.
- syntax elements are added to the attributes of primitives in the corresponding mesh description module based on the syntax elements in the same syntax element set.
- a certain three-dimensional mesh includes geometric coordinates and color data
- the index value of the accessor description module corresponding to the accessor for accessing the geometric coordinates is 1
- the index value of the accessor description module corresponding to the accessor for accessing the color data is 2.
- syntax elements corresponding to various types of data contained in the three-dimensional mesh corresponding to the mesh description module are added to the attributes of the primitives of the mesh description module, including: based on the syntax elements in the first syntax element set, syntax elements corresponding to various types of data contained in the corresponding three-dimensional mesh are added to the attributes of the primitives of the first mesh description module; based on the syntax elements in the second syntax element set, syntax elements corresponding to various types of data contained in the corresponding three-dimensional mesh are added to the attributes of the primitives of the second mesh description module.
- the first grid description module is a grid description module corresponding to the three-dimensional grid in a media file of a type other than G-PCC encoded point cloud
- the second grid description module is a grid description module corresponding to the three-dimensional grid in a media file of the G-PCC encoded point cloud type.
- the first syntax element set is a set of syntax elements supported in the attributes of primitives of the grid description module of the scene description file specified in the ISO/IEC 23090-14 MPEG-I scene description standard;
- the second syntax element set is a preset set of syntax elements corresponding to the G-PCC coded point cloud.
- in this way, for the three-dimensional mesh in a media file of a type other than G-PCC coded point cloud and for the three-dimensional mesh in a media file of the G-PCC coded point cloud type, syntax elements corresponding to the various types of data contained therein are added to the attributes of the primitives of the corresponding mesh description module based on the syntax elements in the first syntax element set and the second syntax element set respectively.
- the scene description file includes two 3D meshes, the names of which are GPCCexample_mesh1 and GPCCexample_mesh2.
- GPCCexample_mesh1 does not belong to a media file of type G-PCC and includes geometric coordinates and color data; the index value of the accessor description module corresponding to the accessor used to access the geometric coordinates of GPCCexample_mesh1 is 0, and the index value of the accessor description module corresponding to the accessor used to access the color data of GPCCexample_mesh1 is 1.
- GPCCexample_mesh2 belongs to a three-dimensional mesh in a media file of type G-PCC and includes geometric coordinates and color data; the index value of the accessor description module corresponding to the accessor used to access the geometric coordinates of GPCCexample_mesh2 is 2, and the index value of the accessor description module corresponding to the accessor used to access the color data of GPCCexample_mesh2 is 3.
- the mesh list (meshes) in the scene description file can be as follows:
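- an illustrative sketch of such a mesh list; the mesh names and accessor index values come from the description above, while the choice of the standard attribute names for GPCCexample_mesh1, of the G-PCC-specific names for GPCCexample_mesh2, and the omission of other fields are assumptions:
    "meshes": [
      {
        "name": "GPCCexample_mesh1",
        "primitives": [
          {
            "attributes": {
              "position": 0,
              "color_0": 1
            }
          }
        ]
      },
      {
        "name": "GPCCexample_mesh2",
        "primitives": [
          {
            "attributes": {
              "G-PCC_position": 2,
              "G-PCC_color_0": 3
            }
          }
        ]
      }
    ]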
- the method for generating a scene description file further includes:
- the value of the grid name syntax element (name) in the grid description module corresponding to the three-dimensional grid is set according to the name of the three-dimensional grid.
- the method for generating a scene description file further includes:
- the syntax elements included in the attributes of the primitives of the mesh description module corresponding to the three-dimensional mesh are set according to the data types included in the three-dimensional mesh.
- the method for generating a scene description file further includes:
- the value of the syntax element used to describe the topological type of the three-dimensional mesh in the mesh description module corresponding to the three-dimensional mesh is set.
- the syntax element used to describe the topological type of the three-dimensional mesh in the mesh description module corresponding to the three-dimensional mesh is "mode".
- the method for generating a scene description file further includes:
- An accessor description module (accessor) corresponding to a target accessor is added to the accessor list (accessors) of the scene description file, wherein the target accessor is an accessor for accessing decoded data of the target media file.
- the method for generating a scene description file further includes: adding a buffer description module (buffer) corresponding to a target buffer in a buffer list (buffers) of the scene description file, wherein the target buffer is a buffer for storing decoded data of the target media file.
- adding a buffer description module (buffer) corresponding to the target buffer in the buffer list (buffers) of the scene description file comprises at least one of the following steps a1 to a5:
- Step a1 Add a byte length syntax element (byteLength) in the buffer description module corresponding to the target buffer, and set the value of the byte length syntax element to the byte length of the target media file.
- the value of "byteLenth" in the buffer description module is set to "15000".
- Step a2 adding an MPEG circular buffer (MPEG_buffer_circular) to the buffer description module corresponding to the target buffer.
- Step a3 Add a link number syntax element (count) in the MPEG ring buffer, and set the corresponding value of the link number syntax element (count) according to the number of stored links in the target buffer.
- the "count” and its value in the circular buffer are set to: “count”:8.
- Step a4 Add a media index syntax element (media) in the MPEG ring buffer, and set the value of the media index syntax element (media) according to the index value of the target media description module.
- when the index value of the target media description module is 0, the "media" and its value in the MPEG circular buffer are set to: "media":0.
- Step a5 Add a second track index syntax element (tracks) in the MPEG circular buffer, and set the value of the second track index syntax element (tracks) according to the track index value of the source data of the data stored in the target buffer.
- the buffer description module corresponding to the target buffer added in the buffer list of the scene description file can be as follows:
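- a sketch of such a buffer description module using the values from steps a1 to a4 above (byte length 15000, 8 storage links, media index 0); nesting MPEG_buffer_circular under "extensions" and omitting the "tracks" element of step a5, whose value is not stated above, are assumptions:
    "buffers": [
      {
        "byteLength": 15000,
        "extensions": {
          "MPEG_buffer_circular": {
            "count": 8,
            "media": 0
          }
        }
      }
    ]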
- the method for generating a scene description file further includes: adding a cache slice description module corresponding to a cache slice of the target buffer to the cache slice list (bufferViews) of the scene description file.
- adding a cache slice description module corresponding to the cache slice of the target cache to the cache slice list of the scene description file includes at least one of the following steps b1 to b3:
- Step b1 add a buffer index syntax element (buffer) in the cache slice description module corresponding to the cache slice of the target buffer, and set the value of the buffer index syntax element (buffer) according to the index value of the buffer description module corresponding to the target buffer to which the cache slice belongs.
- the "buffer” and its value in the cache slice description module are set to: “buffer”:2.
- Step b2 add a second byte length syntax element (byteLength) in the cache slice description module corresponding to the cache slice of the target cache, and set the value of the second byte length syntax element (byteLength) according to the capacity of the cache slice.
- Step b3 add an offset syntax element (byteOffset) in the cache slice description module corresponding to the cache slice of the target cache, and set the value of the offset syntax element according to the offset of the storage data of the corresponding cache slice.
- when the cache slice description modules corresponding to the cache slices of the target cache added in the cache slice list (bufferViews) of the scene description file include each item in the above steps b1 to b3, the index value of the cache description module corresponding to a certain target cache is 1, the capacity of the target cache is 8000, the target cache includes two cache slices, the capacity of the first cache slice is 6000 with an offset of 0, and the capacity of the second cache slice is 2000 with an offset of 6001, then the cache slice description modules corresponding to the cache slices of the target cache added to the cache slice list of the scene description file can be as follows:
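- a sketch of the two cache slice description modules described above (buffer index 1, capacities 6000 and 2000, offsets 0 and 6001):
    "bufferViews": [
      {
        "buffer": 1,
        "byteLength": 6000,
        "byteOffset": 0
      },
      {
        "buffer": 1,
        "byteLength": 2000,
        "byteOffset": 6001
      }
    ]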
- the method for generating a scene description file further includes: adding an accessor description module corresponding to a target accessor in an accessor list (accessors) of the scene description file, wherein the target accessor is an accessor for accessing decoded data of the target media file.
- adding the accessor description module corresponding to the target accessor to the accessor list (accessors) of the scene description file includes at least one of the following steps c1 to c6:
- Step c1 add a data type syntax element (componentType) in the accessor description module corresponding to the target accessor, and set the value of the corresponding data type syntax element according to the type of data accessed by the target accessor.
- the data type syntax element and its value in the accessor description module corresponding to the accessor are set to: "componentType": 5126.
- Step c2 Add an accessor type syntax element (type) in the accessor description module corresponding to the target accessor, and set the value of the accessor type syntax element according to the preconfigured accessor type.
- the accessor type syntax element (type) and its value in the accessor description module corresponding to the accessor are set to: "type":"VEC3".
- Step c3 Add a data quantity syntax element (count) in the accessor description module corresponding to the target accessor, and set the value of the data quantity syntax element (count) according to the quantity of data accessed by the target accessor.
- Step c4 Add an MPEG time-varying accessor (MPEG_accessor_timed) to the accessor description module corresponding to the target accessor.
- Step c5 add a cache slice index syntax element (bufferView) in the MPEG time-varying accessor, and set the value of the cache slice index syntax element according to the index value of the cache slice description module corresponding to the cache slice storing the data accessed by the target accessor.
- the buffer slice index syntax element and its value in the MPEG time-varying accessor of the accessor description module corresponding to the target accessor are set to: "bufferView":3.
- Step c6 Add a time-varying syntax element (immutable) in the MPEG time-varying accessor, and set the value of the time-varying syntax element according to whether the value of the syntax element in the corresponding target accessor changes with time.
- when the value of the syntax element in the target accessor does not change with time, the time-varying syntax element and its value in the MPEG time-varying accessor of the accessor description module corresponding to the target accessor are set to: "immutable":true; when the value of a syntax element within a target accessor changes with time, the time-varying syntax element and its value in the MPEG time-varying accessor of the accessor description module corresponding to the target accessor are set to: "immutable":false.
- the accessor description module corresponding to the target accessor for accessing the data in the cache slice of the target cache added in the accessor list (accessors) of the scene description file includes each item in the above steps c1 to c6, the type of data accessed by a certain target accessor is 5121, the accessor type of the target accessor is VEC2, the amount of data accessed by the target accessor is 4000, the index value of the cache slice description module corresponding to the cache slice storing the data to be accessed by the target accessor is 1, and the value of the syntax element in the corresponding accessor does not change with time, then the accessor description module corresponding to the target accessor added in the accessor list (accessors) of the scene description file can be as follows:
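- a sketch of such an accessor description module built from the values above; the nesting of MPEG_accessor_timed under "extensions" is an assumption:
    "accessors": [
      {
        "componentType": 5121,
        "type": "VEC2",
        "count": 4000,
        "extensions": {
          "MPEG_accessor_timed": {
            "bufferView": 1,
            "immutable": true
          }
        }
      }
    ]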
- the method for generating a scene description file further includes:
- a digital asset description module (asset) is added to the scene description file, a version syntax element (version) is added to the digital asset description module, and when the scene description file is a scene description document written based on glTF 2.0, the value of the version syntax element is set to 2.0.
- the digital asset description module added to the scene description file may be as follows:
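- a minimal sketch of such a digital asset description module (writing the version as the string "2.0" is an assumption consistent with glTF 2.0 practice):
    "asset": {
      "version": "2.0"
    }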
- the method for generating a scene description file further includes:
- an extension usage description module (extensionsUsed) is added to the scene description file, and the MPEG extensions to the glTF 2.0 scene description format used by the scene description file are added to the extension usage description module.
- the MPEG extensions used in the scene description file include: MPEG media (MPEG_media), MPEG circular buffer (MPEG_buffer_circular) and MPEG time-varying accessor (MPEG_accessor_timed), and the extended usage description module added in the scene description file can be as follows:
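- a sketch of such an extension usage description module listing the three extensions named above:
    "extensionsUsed": [
      "MPEG_media",
      "MPEG_buffer_circular",
      "MPEG_accessor_timed"
    ]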
- the method for generating a scene description file further includes:
- a scene declaration (scene) is added to the scene description file, and the value of the scene declaration is set to the index value of the scene description module corresponding to the scene to be rendered.
- the index value of the scene description module corresponding to the scene to be rendered is 0, and the scene declaration added to the scene description file may be as follows:
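- a sketch of such a scene declaration:
    "scene": 0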
- Some embodiments of the present application also provide a method for parsing a scene description file. As shown in FIG. 13, the method for parsing a scene description file includes the following steps S131 to S133:
- S131 Obtain a scene description file of the three-dimensional scene to be rendered, wherein the three-dimensional scene to be rendered includes a target media file of the type of G-PCC coded point cloud.
- the 3D scene to be rendered in the embodiment of the present application may include one or more media files, and when the 3D scene to be rendered includes multiple media files, the type of one or more media files in the multiple media files may be G-PCC coded point cloud.
- the parsing method provided in the embodiment of the present application may be performed separately for each media file of the G-PCC coded point cloud type.
- S132 Acquire a target media description module corresponding to the target media file from a media list (media) of the MPEG media (MPEG_media) of the scene description file.
- the target media description module corresponding to the target media file may be as follows:
- S133 Acquire description information of the target media file according to the target media description module.
- the above step S133 (obtaining description information of the target media file according to the target media description module) includes at least one of the following steps 1331 to 1337:
- Step 1331 Obtain the name of the target media file according to the value of the media name syntax element (name) in the target media description module.
- when the media name syntax element in the target media description module and its value are: "name":"GPCCexample", it can be determined that the name of the target media file is: GPCCexample.
- Step 1332 Determine whether the target media file needs to be played automatically according to the value of the automatic play syntax element (autoplay) in the target media description module.
- whether the target media file needs to be played automatically is determined based on the value of the autoplay syntax element (autoplay) in the target media description module, including: when the autoplay syntax element (autoplay) in the target media description module and its value are: "autoplay":true, it is determined that the target media file needs to be played automatically; and when the autoplay syntax element (autoplay) in the target media description module and its value are: "autoplay":false, it is determined that the target media file does not need to be played automatically.
- Step 1333 Determine whether the target media file needs to be played in a loop according to the value of the loop playback syntax element (loop) in the target media description module.
- whether the target media file needs to be played in a loop is determined based on the value of the loop playback syntax element (loop) in the target media description module, including: when the loop playback syntax element (loop) in the target media description module and its value are: "loop":true, it is determined that the target media file needs to be played in a loop; and when the loop playback syntax element (loop) in the target media description module and its value are: "loop":false, it is determined that the target media file does not need to be played in a loop.
- Step 1334 Obtain the encapsulation format of the target media file according to the value of the media type syntax element (mimeType) in the alternatives of the target media description module.
- the value of the media type syntax element (mimeType) in the media description module corresponding to the media file will be set to the encapsulation format value corresponding to the G-PCC encoded point cloud, and the encapsulation format value corresponding to the G-PCC encoded point cloud can be: "application/mp4". Therefore, when the encapsulation format value corresponding to the G-PCC encoded point cloud is: "application/mp4", the encapsulation format of the target media file can be obtained as MP4.
- Step 1335 Obtain the access address of the target media file according to the value of the uniform resource identifier syntax element (uri) in the alternatives of the target media description module.
- when the uniform resource identifier syntax element (uri) in the alternatives of the target media description module and its value are: "uri":"http://www.example.com/GPCCexample.mp4", it can be determined that the access address of the target media file is: http://www.example.com/GPCCexample.mp4.
- Step 1336 Obtain the track information of the target media file according to the value of the first track index syntax element (track) in the track array (tracks) of the alternatives (alternatives) of the target media description module.
- the track information of the target media file is obtained according to the value of the first track index syntax element (track) in the track array (tracks) of the options (alternatives) of the target media description module, including: when the encapsulation file of the target media file is a single-track encapsulation file, the value of the first track index syntax element is determined as the index value of the codestream track of the target media file; when the target media file is a multi-track encapsulation file, the value of the first track index syntax element is determined as the index value of the geometric codestream track of the target media file.
- Step 1337 Determine the type of code stream and decoding parameters of the target media file according to the value of the codec parameter syntax element (codecs) in the track array (tracks) of the options (alternatives) of the target media description module and the ISO/IEC 23090-18 G-PCC data transmission standard.
- the above step 1337 (determining the type of code stream and decoding parameters of the target media file according to the value of the codec parameter syntax element (codecs) in the track array (tracks) of the options (alternatives) of the target media description module and the ISO/IEC 23090-18 G-PCC data transmission standard) includes the following steps 13371 and 13372:
- Step 13371 determine the type of code stream and encoding parameters of the target media file according to the value of the codec parameter syntax element (codecs) in the track array (tracks) of the options (alternatives) of the target media description module and the ISO/IEC 23090-18 G-PCC data transmission standard.
- the ISO/IEC 23090-18 G-PCC data transmission standard specifies that when a G-PCC coded point cloud is encapsulated using DASH, when G-PCC preselection signaling is used in the MPD file, the codecs attribute of the preselection signaling should be set to 'gpc1', indicating that the preselected media is a point cloud based on geometry; when there are multiple G-PCC Tile tracks in a G-PCC container, the "codecs" attribute of the Main G-PCC Adaptation Set should be set to 'gpcb' or 'gpeb', indicating that the adaptation set contains G-PCC Tile basic track data.
- the "codecs" attribute of the Main G-PCC adaptivesset should be set to 'gpcb'.
- the "codecs" attribute of the Main G-PCC Adaptation Set shall be set to 'gpeb'.
- the "codecs” attribute of the preselection signaling shall be set to 'gpt1', indicating that the preselected media is a point cloud fragment based on geometry.
- the value of "codecs” in "tracks” of "alternatives” of the target media description module shall be set to 'gpc1'. Therefore, the encapsulation method and encoding parameters of the target media file can be determined according to the value of the codec parameter syntax element (codecs) in the track array (tracks) of the options (alternatives) of the target media description module and the SO/IEC 23090-18 G-PCC data transmission standard.
- Step 13372 Determine decoding parameters of the target media file according to encoding parameters of the target media file.
- the decoding parameters of the target media file can be determined according to the encoding parameters of the target media file.
- target media description module corresponding to the target media file is as follows:
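- for illustration, a module consistent with the description information listed below might be sketched as follows (the field nesting and the form of the "track" value are assumptions):
    {
      "name": "AAAA",
      "autoplay": false,
      "loop": true,
      "alternatives": [
        {
          "mimeType": "application/mp4",
          "uri": "http://www.bbbb.com/AAAA.mp4",
          "tracks": [
            {
              "track": 0,
              "codecs": "gpc1"
            }
          ]
        }
      ]
    }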
- the description information of the target media file obtained by the target media description module includes: the name of the target media file is: AAAA, the target media file does not need to be played automatically, but needs to be played in a loop; the encapsulation format of the target media file is MP4, and the access address of the target media file is: http://www.bbbb.com/AAAA.mp4; the reference track of the target media file is the code stream track with an index value of 0, the encapsulation/decapsulation method of the target media file is MP4, and the encoding and decoding parameter of the target media file is gpc1.
- the scene description file parsing method provided in the embodiment of the present application can obtain the target media description module corresponding to the target media file from the media list of the MPEG media of the scene description file after obtaining the scene description file of the to-be-rendered three-dimensional scene including the target media file of the type of G-PCC coded point cloud, and obtain the description information of the target media file according to the target media description module.
- since the embodiment of the present application can obtain the description information of the target media file according to the target media description module, and can then render and display the to-be-rendered three-dimensional scene including the target media file of the type of G-PCC coded point cloud based on the description information of the target media file, the embodiment of the present application provides a method capable of parsing the scene description file of a three-dimensional scene including a media file of the type of G-PCC coded point cloud, and realizes the parsing of the scene description file of the three-dimensional scene including the G-PCC coded point cloud.
- the scene description file parsing method provided in the above embodiment further includes:
- a target scene description module (scene) corresponding to the to-be-rendered three-dimensional scene is obtained from a scene list (scenes) of the scene description file, and description information of the to-be-rendered three-dimensional scene is obtained according to the target scene description module.
- a scene declaration (scene) and its declared index value can be obtained from the scene description file, and a target scene description module corresponding to the three-dimensional scene to be rendered can be obtained from the scene list of the scene description file based on the scene declaration and its declared index value.
- the first scene description module can be obtained from the scene list of the scene description file according to the scene declaration and its declared index value as the target scene description module corresponding to the three-dimensional scene to be rendered.
- obtaining description information of the three-dimensional scene to be rendered according to the target scene description module includes: determining the index value of the node description module corresponding to the node in the three-dimensional scene to be rendered according to the index value declared by the node index list (nodes) of the target scene description module.
- the target scene description module is as follows:
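- for example, a target scene description module of the following sketched form (only the node index values are significant here; other fields are omitted):
    {
      "nodes": [ 0, 1 ]
    }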
- the index values of the node description modules corresponding to the nodes in the three-dimensional scene to be rendered can be determined according to the index values declared in the node index list (nodes) of the target scene description module.
- accordingly, it can be determined that the three-dimensional scene to be rendered includes two nodes, the index value of the node description module corresponding to one node is 0 (the first node description module in the node list), and the index value of the node description module corresponding to the other node is 1 (the second node description module in the node list).
- the scene description file parsing method further includes:
- the node description module corresponding to the node in the three-dimensional scene to be rendered is obtained from the node list (nodes) of the scene description file; and according to the node description module corresponding to the node in the three-dimensional scene to be rendered, the description information of the node in the three-dimensional scene to be rendered is obtained.
- for example, when an index value declared in the node index list of the target scene description module is 0, the first node description module is obtained from the node list of the scene description file as the node description module corresponding to the node in the three-dimensional scene to be rendered.
- the index values declared in the node index list of the target scene description module include 0 and 1
- the first node description module and the second node description module are obtained from the node list of the scene description file as the node description modules corresponding to the nodes in the three-dimensional scene to be rendered.
- obtaining description information of nodes in the three-dimensional scene to be rendered according to a node description module corresponding to the nodes in the three-dimensional scene to be rendered includes at least one of the following steps a1 and a2:
- Step a1 Obtain the name of the node in the three-dimensional scene to be rendered according to the value of the node name syntax element (name) in the node description module corresponding to the node in the three-dimensional scene to be rendered.
- Step a2 determining the index value of the mesh description module corresponding to the three-dimensional mesh mounted on the node in the three-dimensional scene to be rendered according to the index value declared in the mesh index list in the node description module corresponding to the node in the three-dimensional scene to be rendered.
- a node description module corresponding to a certain node is as follows:
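- for example, a node description module of the following sketched form (showing the two mesh indices as an array is an assumption made only for illustration):
    {
      "name": "GPCCexample_node",
      "mesh": [ 0, 1 ]
    }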
- based on the above step a1, it can be determined that the name of the node is: GPCCexample_node, and based on the above step a2, it can be determined that the index values of the grid description modules corresponding to the three-dimensional grids mounted on the node are 0 and 1 respectively.
- the scene description file parsing method after determining the index value of the mesh description module corresponding to the three-dimensional mesh mounted on the node in the three-dimensional scene to be rendered, also includes: obtaining the mesh description module corresponding to the three-dimensional mesh mounted on the node in the three-dimensional scene to be rendered from the mesh list (meshes) of the scene description file according to the index value of the mesh description module corresponding to the three-dimensional mesh mounted on the node in the three-dimensional scene to be rendered; and obtaining the description information of the three-dimensional mesh mounted on the node in the three-dimensional scene to be rendered according to the mesh description module corresponding to the three-dimensional mesh mounted on the node in the three-dimensional scene to be rendered.
- for example, when the index value declared in the grid index list of a node description module is 0, the first grid description module is obtained from the grid list of the scene description file as the grid description module corresponding to the three-dimensional grid mounted on the node corresponding to the node description module.
- the index values declared in the grid index list of a node description module include 1 and 2
- the second grid description module and the third grid description module are obtained from the grid list of the scene description file as the grid description modules corresponding to the three-dimensional grids mounted on the node corresponding to the node description module.
- obtaining the description information of the three-dimensional mesh mounted on the node in the three-dimensional scene to be rendered includes at least one of the following steps b1 to b4:
- Step b1 Obtain the name of the three-dimensional grid according to the grid name syntax element (name) in the grid description module corresponding to the three-dimensional grid.
- Step b2 Acquire the data type included in the three-dimensional grid according to the data type syntax element in the grid description module corresponding to the three-dimensional grid.
- the above-mentioned step b2 (obtaining the data types included in the three-dimensional grid according to the data type syntax elements in the mesh description module corresponding to the three-dimensional grid) includes: obtaining the data types included in the three-dimensional grid according to the data type syntax elements in the target extension array of the extension list (extensions) of the primitives (primitives) of the mesh description module corresponding to the three-dimensional grid.
- the target extension array may be MPEG_primitve_GPCC.
- extension list of primitives of the mesh description module corresponding to a certain three-dimensional mesh is as follows:
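- for example, an extension list of the following sketched form; the accessor index values 0, 1 and 2 are assumptions added only to make the sketch concrete:
    "extensions": {
      "MPEG_primitve_GPCC": {
        "position": 0,
        "color_0": 1,
        "normal": 2
      }
    }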
- it can be determined that the three-dimensional grid includes position coordinates according to the position coordinate syntax element (position) in the target extension array (MPEG_primitve_GPCC) of the extension list (extensions) of the primitives (primitives) of the grid description module corresponding to the three-dimensional grid, that the three-dimensional grid includes color values according to the color value syntax element (color_0) in the target extension array, and that the three-dimensional grid includes normal vectors according to the normal vector syntax element (normal) in the target extension array.
- extension list of primitives of a grid description module corresponding to a certain three-dimensional grid is as follows:
- it can be determined that the three-dimensional grid includes position coordinates according to the position coordinate syntax element (G-PCC_position) in the target extension array (MPEG_primitve_GPCC) of the extension list (extensions) of the primitives (primitives) of the grid description module corresponding to the three-dimensional grid, that the three-dimensional grid includes color values according to the color value syntax element (G-PCC_color_0) in the target extension array, and that the three-dimensional grid includes normal vectors according to the normal vector syntax element (G-PCC_normal) in the target extension array.
- the above-mentioned step b2 (obtaining the data type included in the three-dimensional grid according to the data type syntax element in the grid description module corresponding to the three-dimensional grid) includes: obtaining the data type included in the three-dimensional grid according to the data type syntax element in the attributes of the primitives of the grid description module corresponding to the three-dimensional grid.
- the attributes of the primitives of the mesh description module corresponding to a certain three-dimensional mesh are as follows:
- it can be determined that the three-dimensional mesh includes position coordinates according to the position coordinate syntax element (position) in the attributes (attributes) of the primitives (primitives) of the mesh description module corresponding to the three-dimensional mesh, that the three-dimensional mesh includes color values according to the color value syntax element (color_0) in the attributes, and that the three-dimensional mesh includes normal vectors according to the normal vector syntax element (normal) in the attributes.
- the attributes of the primitives of the mesh description module corresponding to a certain three-dimensional mesh are as follows:
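- for example, attributes of the following sketched form, using accessor index values 0, 1 and 2 consistent with the discussion that follows:
    "attributes": {
      "G-PCC_position": 0,
      "G-PCC_color_0": 1,
      "G-PCC_normal": 2
    }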
- it can be determined that the three-dimensional mesh includes position coordinates according to the position coordinate syntax element (G-PCC_position) in the attributes (attributes) of the primitives (primitives) of the mesh description module corresponding to the three-dimensional mesh, that the three-dimensional mesh includes color values according to the color value syntax element (G-PCC_color_0) in the attributes, and that the three-dimensional mesh includes normal vectors according to the normal vector syntax element (G-PCC_normal) in the attributes.
- Step b3 Obtain the index value of the accessor description module corresponding to the accessor for accessing the data of the type of the three-dimensional grid according to the value of the data type syntax element.
- the value of the position coordinate syntax element (G-PCC_position) is 0, so the index value of the accessor description module corresponding to the accessor used to access the position coordinates of the three-dimensional mesh is 0 (the first accessor in the accessor list), the value of the color value syntax element (G-PCC_color_0) is 1, so the index value of the accessor description module corresponding to the accessor used to access the color value of the three-dimensional mesh is 1 (the second accessor in the accessor list), and the value of the normal vector syntax element (G-PCC_normal) is 2, so the index value of the accessor description module corresponding to the accessor used to access the normal vector of the three-dimensional mesh is 2 (the third accessor in the accessor list).
- Step b4 Obtain the value of the mode syntax element (mode) in the mesh description module corresponding to the three-dimensional mesh, and determine the topological type of the three-dimensional mesh according to that value.
- when the value of the mode syntax element is 0, the type of the topological structure of the three-dimensional mesh can be determined as scattered points; when the value of the mode syntax element is 1, the type of the topological structure of the three-dimensional mesh can be determined as a line; when the value of the mode syntax element is 4, the type of the topological structure of the three-dimensional mesh can be determined as a triangle.
- a grid description module corresponding to a certain three-dimensional grid is as follows:
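- for example, a mesh description module of the following sketched form; placing the G-PCC-specific attribute names in the attributes of the primitives, and mode 0 for scattered points, are choices consistent with the surrounding description rather than the authoritative listing:
    {
      "name": "G-PCCexample_mesh",
      "primitives": [
        {
          "attributes": {
            "G-PCC_position": 0,
            "G-PCC_color_0": 1,
            "G-PCC_normal": 2
          },
          "mode": 0
        }
      ]
    }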
- the description information of the three-dimensional mesh obtained according to the mesh description module corresponding to the three-dimensional mesh includes: the name of the three-dimensional mesh is: G-PCCexample_mesh; the topological type of the three-dimensional mesh is scattered points; the three-dimensional mesh includes three types of data, namely position coordinates, color values and normal vectors, the index value of the accessor description module corresponding to the accessor for accessing the position coordinates of the three-dimensional mesh is 0, the index value of the accessor description module corresponding to the accessor for accessing the color value of the three-dimensional mesh is 1, and the index value of the accessor description module corresponding to the accessor for accessing the normal vector of the three-dimensional mesh is 2.
- the method further includes:
- for example, when the value of the color value syntax element is 1, the second accessor description module is obtained from the accessor list of the scene description file as the accessor description module corresponding to the accessor used to access the color value of the three-dimensional grid.
- obtaining description information of accessors for accessing various types of data of a three-dimensional grid according to accessor description modules corresponding to accessors for accessing various types of data of a three-dimensional grid includes at least one of the following steps c1 to c6:
- Step c1 Determine the type of data accessed by the accessor according to the value of the data type syntax element (componentType) in the accessor description module.
- the data type syntax element in the accessor description module corresponding to the accessor used to access the normal vector of a three-dimensional grid is: "componentType": 5126, then it can be determined that the type of the data accessed by the accessor corresponding to the accessor description module (the normal vector of the three-dimensional grid) is a 32-bit floating point number (float).
- Step c2 Determine the type of the accessor according to the value of the accessor type syntax element (type) in the accessor description module. For example, when the accessor type syntax element in the accessor description module corresponding to the accessor for accessing the position coordinates of a three-dimensional grid is: "type":"VEC3", it can be determined that the type of the accessor corresponding to the accessor description module is a three-dimensional vector.
- Step c3 Determine the number of data accessed by the accessor according to the value of the data number syntax element (count) in the accessor description module.
- the data quantity syntax element in the accessor description module corresponding to the accessor for accessing the color value of a three-dimensional grid is "count": 1000, then it can be determined that the quantity of data (the color value of the three-dimensional grid) accessed by the accessor corresponding to the accessor description module is 1000.
- Step c4 Determine whether the accessor is a time-varying accessor based on MPEG extension modification according to whether the accessor description module contains an MPEG time-varying accessor (MPEG_accessor_timed).
- whether the accessor is a time-varying accessor modified based on MPEG extension is determined based on whether the accessor description module contains an MPEG time-varying accessor, including: if the accessor description module contains an MPEG time-varying accessor, then the accessor is determined to be a time-varying accessor modified based on MPEG extension, and if the accessor description module does not contain an MPEG time-varying accessor, then the accessor is determined not to be a time-varying accessor modified based on MPEG extension.
- Step c5 determining the index value of the cache slice description module corresponding to the cache slice storing the data accessed by the accessor according to the value of the cache slice index syntax element (bufferView) in the MPEG time-varying accessor (MPEG_accessor_timed) of the accessor description module.
- Step c6 Determine whether the value of the syntax element in the accessor changes with time based on the value of the time-varying syntax element (immutable) in the MPEG time-varying accessor of the accessor description module.
- whether the value of the syntax element in the accessor changes with time is determined based on the value of the time-varying syntax element (immutable) in the MPEG time-varying accessor of the accessor description module, including: if the time-varying syntax element in the MPEG time-varying accessor of the accessor description module and its value are: "immutable":true, then it is determined that the value of the syntax element in the accessor does not change with time, and if the time-varying syntax element in the MPEG time-varying accessor of the accessor description module and its value are: "immutable":false, then it is determined that the value of the syntax element in the accessor will change with time.
- the accessor description module corresponding to a certain accessor is as follows:
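- for example, an accessor description module of the following sketched form, consistent with the description information listed below (the nesting of MPEG_accessor_timed under "extensions" is an assumption):
    {
      "componentType": 5123,
      "type": "SCALAR",
      "count": 1000,
      "extensions": {
        "MPEG_accessor_timed": {
          "bufferView": 1,
          "immutable": true
        }
      }
    }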
- the description information of the accessor obtained by the accessor description module corresponding to the accessor includes: the type of data accessed by the accessor is 5123; the accessor type is scalar (SCALAR); the number of data accessed by the accessor is 1000; the accessor is a modified time-varying accessor based on MPEG extension; the data accessed by the accessor is cached in the cache slice corresponding to the second cache slice description module in the cache slice list; the value of the syntax element in the accessor does not change with time.
- Step d obtaining a buffer description module in a buffer list (buffers) of the scene description file.
- Step e Get the value of the media index syntax element (media) in the buffer description module.
- Step f determining the buffer description module whose value of the media index syntax element is the same as the index value of the target media description module as the target buffer description module corresponding to the target buffer for caching the decoded data of the target media file.
- the buffer description module whose value of the media index syntax element is 0 is determined as the target buffer description module corresponding to the target buffer for caching the decoded data of the target media file.
- the number of target buffers for caching the decoded data of the target media file may be one or more, and this embodiment of the present application does not impose any limitation on this.
- Step g obtaining description information of the target buffer according to the target buffer description module.
- Step g1 Obtain the capacity of the target buffer according to the value of the first byte length syntax element (byteLength) in the target buffer description module.
- For example, if the value of the first byte length syntax element is 15000, the capacity of the target buffer is 15000 bytes.
- Step g2 Determine whether the target buffer is a circular buffer based on MPEG extension modification according to whether the target buffer description module includes an MPEG circular buffer (MPEG_buffer_circular).
- whether the target buffer is a circular buffer modified based on the MPEG extension is determined according to whether the target buffer description module includes an MPEG circular buffer, including: if the target buffer description module includes an MPEG circular buffer, it is determined that the target buffer is a circular buffer modified based on the MPEG extension, and if the target buffer description module does not include an MPEG circular buffer, it is determined that the target buffer is not a circular buffer modified based on the MPEG extension.
- Step g3 obtaining the number of storage links of the MPEG circular buffer according to the value of the link number syntax element (count) in the MPEG circular buffer of the target buffer description module.
- For example, if the link number syntax element in the MPEG circular buffer of the target buffer description module and its value are "count": 5, it can be determined that the MPEG circular buffer includes 5 storage links.
- Step g4 According to the value of the second track index syntax element (tracks) in the MPEG circular buffer of the target buffer description module, obtain the track index value of the source data cached by the MPEG circular buffer.
- For example, consider a buffer description module corresponding to a certain buffer.
- the description information of the buffer can be obtained from that buffer description module, including: the capacity of the buffer is 8000 bytes; the buffer is a circular buffer modified based on the MPEG extension; the number of storage links of the circular buffer is 5; the media file stored in the circular buffer is the second media file declared in the MPEG media; and the track index value of the source data cached by the circular buffer is 1.
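- As an illustrative sketch only, a buffer description module matching this description information could look roughly as follows; the structure follows the glTF buffer syntax with the MPEG circular buffer (MPEG_buffer_circular) extension, and the track reference string "#track=1" is an assumed notation for a track index value of 1:

```json
{
  "byteLength": 8000,
  "extensions": {
    "MPEG_buffer_circular": {
      "count": 5,
      "media": 1,
      "tracks": ["#track=1"]
    }
  }
}
```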
- the scene description file parsing method provided in the above embodiment further includes the following steps h to k:
- Step h Obtain the cache slice description module in the cache slice list (bufferViews) of the scene description file.
- Step i Get the value of the buffer index syntax element (buffer) in the cache slice description module.
- Step j determine the cache slice description module whose value of the buffer index syntax element is the same as the index value of the target buffer description module as the cache slice description module corresponding to the cache slice of the target buffer.
- For example, the cache slice description module whose value of the buffer index syntax element is 1 is determined as the cache slice description module corresponding to the cache slice of the target buffer.
- the number of cache slices of the target cache may be one or more, which is not limited in the embodiment of the present application.
- Step k acquiring description information of the cache slice of the target cache according to the cache slice description module corresponding to the cache slice of the target cache.
- obtaining description information of the cache slice of the target cache according to the cache slice description module corresponding to the cache slice of the target cache includes at least one of the following steps k1 and k2:
- Step k1 obtaining the capacity of the cache slice of the target cache according to the value of the second byte length syntax element (byteLength) in the cache slice description module corresponding to the cache slice of the target cache.
- For example, if the second byte length syntax element and its value in the cache slice description module corresponding to a cache slice of the target buffer are "byteLength": 12000, it can be determined that the capacity of that cache slice of the target buffer is 12000 bytes.
- Step k2 Obtain the offset of the cache slice of the target cache according to the value of the offset syntax element (byteOffset) in the cache slice description module corresponding to the cache slice of the target cache.
- For example, if the offset syntax element and its value in the cache slice description module corresponding to a cache slice of the target buffer are "byteOffset": 0, it can be determined that the offset of that cache slice of the target buffer is 0 bytes.
- For example, consider the cache slice description module corresponding to a certain cache slice.
- the description information of the cache slice can be obtained from that cache slice description module, including: the cache slice belongs to the buffer corresponding to the second buffer description module in the buffer list, the capacity of the cache slice is 8000 bytes, and the offset of the cache slice is 0; that is, the data range cached by the cache slice is the first 8000 bytes.
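- As an illustrative sketch only, a cache slice description module matching this description information could look roughly as follows (a glTF bufferView referencing the second buffer description module, i.e., index 1, in the buffer list):

```json
{
  "buffer": 1,
  "byteLength": 8000,
  "byteOffset": 0
}
```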
- the scene description file parsing method provided in the above embodiment further includes the following steps l to o:
- Step l Obtain an accessor description module in the accessor list (accessors) of the scene description file.
- Step m obtain the value of the cache slice index syntax element (bufferView) in the accessor description module.
- Step n determine the accessor description module whose value of the cache slice index syntax element is the same as the index value of the cache slice description module corresponding to the cache slice of the target cache as the accessor description module corresponding to the accessor used to access the data in the cache slice of the target cache.
- the accessor description module whose value of the cache slice index syntax element is 2 is determined as the accessor description module corresponding to the accessor used to access the data in the cache slice of the target cache.
- Step o According to the accessor description module corresponding to the accessor used to access the data in the cache slice of the target cache, obtain the description information of the accessor used to access the data in the cache slice of the target cache.
- obtaining description information of an accessor for accessing data in a cache slice of the target cache according to an accessor description module corresponding to the accessor for accessing data in a cache slice of the target cache includes at least one of the following steps o1 to o6:
- Step o1 Determine the type of data accessed by the accessor according to the value of the data type syntax element (componentType) in the accessor description module.
- Step o2 Determine the type of the accessor according to the value of the accessor type syntax element (type) in the accessor description module.
- Step o3 Determine the number of data accessed by the accessor according to the value of the data number syntax element (count) in the accessor description module.
- Step o4 Determine whether the accessor is a time-varying accessor based on MPEG extension modification according to whether the accessor description module contains an MPEG time-varying accessor (MPEG_accessor_timed).
- Step o5 determining the index value of the cache slice description module corresponding to the cache slice storing the data accessed by the accessor according to the value of the cache slice index syntax element (bufferView) in the MPEG time-varying accessor of the accessor description module.
- Step o6 Determine whether the value of the syntax element in the accessor changes with time based on the value of the time-varying syntax element (immutable) in the MPEG time-varying accessor of the accessor description module.
- the implementation of the above steps o1 to o6 can refer to the implementation of the above steps c1 to c6, and to avoid redundancy, they will not be described in detail here.
- Some embodiments of the present application further provide a method for rendering a three-dimensional scene.
- the execution subject of the method for rendering a three-dimensional scene is a display engine in an immersive media description framework. As shown in FIG. 14 , the method for rendering a three-dimensional scene includes the following steps:
- the three-dimensional scene to be rendered includes a target media file of the type of G-PCC coded point cloud.
- the implementation method of obtaining the scene description file of the three-dimensional scene to be rendered includes: sending a request message for requesting the scene description file of the three-dimensional scene to be rendered to a media resource server, and receiving a request response sent by the media resource server and carrying the scene description file of the three-dimensional scene to be rendered.
- S143 Send description information of the target media file to the media access function.
- the media access function can obtain the target media file according to the description information of the target media file, process the target media file to obtain the decoded data of the target media file, and write the decoded data of the target media file into the target buffer.
- the display engine sends the description information of the target media file to the media access function, including: the display engine may send the description information of the target media file to the media access function through a media access function API.
- S144 Read the decoded data of the target media file from the target buffer; the data read from the target buffer has been completely processed by the media access function and can be directly used for rendering the three-dimensional scene to be rendered.
- S145 Render the to-be-rendered 3D scene based on the decoded data of the target media file.
- In the method for rendering a three-dimensional scene, after the scene description file of the three-dimensional scene to be rendered (which includes a target media file of the type G-PCC coded point cloud) is obtained, the description information of the target media file is first obtained according to the media description module corresponding to the target media file in the media list of the MPEG media of the scene description file, and the description information is sent to the media access function, so that the media access function obtains the target media file according to the description information of the target media file, processes the target media file to obtain the decoded data of the target media file, and writes the decoded data of the target media file into the target buffer. The decoded data of the target media file is then read from the target buffer, and the three-dimensional scene to be rendered is rendered based on the decoded data of the target media file.
- In this way, the display engine can obtain the description information of the target media file according to the target media description module, send the description information of the target media file to the media access function, read the decoded data of the target media file of the type G-PCC coded point cloud from the target buffer, and render the three-dimensional scene to be rendered based on the decoded data of the target media file.
- the embodiment of the present application provides a rendering method for rendering a three-dimensional scene to be rendered including a media file of the type G-PCC coded point cloud, and realizes the rendering of the media file of the type G-PCC coded point cloud based on the scene description file.
- Some embodiments of the present application further provide a method for processing a media file.
- the execution subject of the method for processing a media file is a media access function in an immersive media description framework.
- the method for processing a media file includes the following steps:
- S151 Receive description information of a target media file, description information of a target buffer, and description information of a cache slice of the target buffer sent by a display engine.
- the target media file is a media file of a G-PCC coded point cloud type
- the target buffer is a buffer for caching decoded data of the target media file.
- the description information of the target media file may include at least one of the following: the name of the media file, whether the media file is played automatically, whether the media file is played in a loop, the packaging format of the media file, the access address of the media file, the track information of the packaging file of the media file, and the encoding and decoding parameters of the media file.
- the description information of the target buffer may include at least one of the following:
- the capacity of the buffer, whether it is an MPEG circular buffer, the number of storage links of the circular buffer, the index value of the media description module corresponding to the target media file, and the track index value of the source data of the data cached by the circular buffer.
- the description information of the cache slice of the target buffer may include at least one of the following:
- the buffer to which the cache slice belongs, the capacity of the cache slice, and the offset of the cache slice.
- receiving description information of a target media file, description information of a target buffer, and description information of a cache slice of the target buffer sent by a display engine includes:
- the description information of the target media file, the description information of the target buffer, and the description information of the cache slice of the target buffer sent by the display engine are received through a media access function API.
- S152 Obtain decoded data of the target media file according to the description information of the target media file.
- the media access function obtains the decoded data of the target media file according to the description information of the target media file, including:
- a target pipeline for processing the target media file is created according to the description information of the target media file, the target media file is acquired through the target pipeline, and the target media file is decapsulated and decoded to acquire decoded data of the target media file.
- obtaining the target media file through the target pipeline, and decapsulating and decoding the target media file to obtain decoded data of the target media file includes: obtaining the target media file through the input module of the target pipeline, and inputting the target media file into the decapsulation module of the target pipeline; decapsulating the target media file through the decapsulation module to obtain a geometry code stream and an attribute code stream of the target media file; decoding the geometry code stream through the geometry decoder of the target pipeline to obtain geometry decoded data of the target media file; and decoding the attribute code stream through the attribute decoder of the target pipeline to obtain attribute decoded data of the target media file.
- acquiring the target media file through the target pipeline, and decapsulating and decoding the target media file to obtain decoded data of the target media file further includes: after acquiring the geometric decoding data of the target media file, processing the geometric decoding data through a first post-processing module of the target pipeline, and after acquiring the attribute decoding data of the target media file, processing the attribute decoding data through a second post-processing module of the target pipeline.
- processing the geometric decoding data through the first post-processing module of the target pipeline may include: format conversion of the geometric decoding data through the first post-processing module of the target pipeline
- processing the attribute decoding data through the second post-processing module of the target pipeline may include: format conversion of the attribute decoding data through the second post-processing module of the target pipeline.
- S153 Write the decoded data of the target media file into the target buffer according to the description information of the target buffer and the description information of the cache slice of the target buffer.
- the display engine may read the decoded data of the target media file from the target buffer according to the description information of the target buffer and the description information of the cache slice of the target buffer.
- the method further comprises: obtaining decoded data of the target media file, and rendering a to-be-rendered three-dimensional scene including the target media file based on the decoded data of the target media file.
- the media file processing method provided in the embodiment of the present application obtains the decoded data corresponding to the target media file according to the description information of the target media file after receiving the description information of the target media file of the type of G-PCC coded point cloud sent by the display engine, the description information of the target buffer for caching the decoded data of the target media file, and the description information of the cache slices of the target buffer.
- the decoded data of the target media file is written into the target buffer according to the description information of the target buffer and the description information of the cache slices of the target buffer.
- the display engine can read the decoded data of the target media file from the target buffer according to the description information of the target buffer and the description information of the cache slices of the target buffer, and render the three-dimensional scene to be rendered including the target media file based on the decoded data of the target media file. Therefore, the embodiment of the present application can support rendering media files of the type G-PCC coded point cloud within the scene description framework.
- Some embodiments of the present application further provide a cache management method.
- the execution subject of the cache management method is a cache management module in the immersive media description framework. As shown in FIG. 16 , the cache management method includes the following steps:
- S161 Receive description information of a target buffer and description information of cache slices of the target buffer; the target buffer is a buffer for caching the decoded data of a target media file
- the target media file is a media file of a G-PCC coded point cloud type.
- the description information of the target buffer may include at least one of the following:
- the capacity of the buffer, whether it is an MPEG circular buffer, the number of storage links of the circular buffer, the index value of the media description module corresponding to the media file (the target media file) cached by the circular buffer, and the track index value of the source data of the data cached by the circular buffer.
- the description information of the cache slices of the target buffer may include at least one of the following:
- the buffer to which the cache slice belongs, the capacity of the cache slice, and the offset of the cache slice.
- S162 Create the target buffer according to the description information of the target buffer.
- the description information of the target buffer includes: the capacity of the target buffer is 8000 bytes; the target buffer is a circular buffer based on MPEG extension modification, the number of storage links of the circular buffer is 3, the media file stored in the circular buffer is the first media file declared in the MPEG media, and the track index value of the source data cached by the circular buffer is 1, then the cache management module creates a circular buffer with a capacity of 8000 bytes and 3 storage links as the target buffer.
- For example, if the description information of the first cache slice includes: the capacity is 6000 bytes and the offset is 0, and the description information of the second cache slice includes: the capacity is 2000 bytes and the offset is 6000, then the target buffer is divided into 2 cache slices: the capacity of the first cache slice is 6000 bytes, and it is used to cache the first 6000 bytes of the decoded data of the target media file; the capacity of the second cache slice is 2000 bytes, and it is used to cache the 6001st to 8000th bytes of the decoded data of the target media file.
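- As an illustrative sketch only, buffer and cache slice description modules that would lead the cache management module to create the target buffer and cache slices of this example could look roughly as follows; the track reference string is an assumed notation, and a media index of 0 denotes the first media file declared in the MPEG media:

```json
{
  "buffers": [
    {
      "byteLength": 8000,
      "extensions": {
        "MPEG_buffer_circular": { "count": 3, "media": 0, "tracks": ["#track=1"] }
      }
    }
  ],
  "bufferViews": [
    { "buffer": 0, "byteLength": 6000, "byteOffset": 0 },
    { "buffer": 0, "byteLength": 2000, "byteOffset": 6000 }
  ]
}
```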
- the media access function can write the decoded data of the target media file into the target buffer
- the display engine can read the decoded data of the target media file from the target buffer, and the decoded data of the target media file is used to render the to-be-rendered three-dimensional scene including the target media file.
- the cache management method provided in the embodiment of the present application can, after receiving the description information of the target buffer and the description information of the cache slices of the target buffer, create the target buffer according to the description information of the target buffer and divide the target buffer into cache slices according to the description information of the cache slices of the target buffer. Therefore, the media access function can write the decoded data of the media file of the type G-PCC coded point cloud into the target buffer, the display engine can read the decoded data of the target media file from the target buffer, and the three-dimensional scene to be rendered including the target media file can be rendered based on the decoded data of the target media file. In this way, the embodiment of the present application can support rendering of media files of the type G-PCC coded point cloud within the scene description framework.
- Some embodiments of the present application also provide a method for rendering a three-dimensional scene, which includes: a scene description file parsing method and a three-dimensional scene rendering method executed by a display engine, a media file processing method executed by a media access function, and a cache management method executed by a cache management module.
- the display engine obtains a scene description file of a 3D scene to be rendered.
- the three-dimensional scene to be rendered includes a target media file of the type of G-PCC coded point cloud.
- the display engine obtains a scene description file of a scene to be rendered, including: the display engine downloads the scene description file from a server using a network transmission service.
- the display engine obtains a scene description file of a scene to be rendered, including: reading the scene description file from a local storage space.
- the display engine obtains the media description module corresponding to each media file from the media list (media) of the MPEG media (MPEG_media) of the scene description file (including: obtaining the media description module corresponding to the target media file from the media list of the MPEG media of the scene description file).
- the display engine obtains description information of each media file according to the media description module corresponding to each media file (including: obtaining description information of the target media file according to the media description module corresponding to the target media file).
- the description information of the media file includes at least one of the following:
- the name of the media file, whether the media file is played automatically, whether the media file is played in a loop, the packaging format of the media file, the access address of the media file, the track information of the packaging file of the media file, and the encoding and decoding parameters of the media file.
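- As an illustrative sketch only, a media description module carrying this kind of description information could look roughly as follows; the name, mimeType, URI, track reference, and codecs string are placeholder assumptions rather than values taken from this application, and the field names follow the MPEG media (MPEG_media) extension syntax referenced above:

```json
{
  "name": "gpcc_point_cloud",
  "autoplay": true,
  "loop": true,
  "alternatives": [
    {
      "mimeType": "application/mp4",
      "uri": "https://example.com/gpcc_point_cloud.mp4",
      "tracks": [
        { "track": "#track=1", "codecs": "codec parameters per ISO/IEC 23090-18" }
      ]
    }
  ]
}
```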
- the implementation manner in which the display engine obtains the description information of the target media file according to the media description module corresponding to the target media file can refer to the implementation manner in which the media description module of the target media file is parsed in the above-mentioned scene description parsing method. To avoid redundancy, it will not be described in detail here.
- the display engine sends description information of each media file to the media access function (including: sending description information of the target media file to the media access function).
- the media access function receives the description information of each media file sent by the display engine (including: receiving the description information of the target media file sent by the display engine).
- the display engine sends the description information of each media file to the media access function, including: the display engine sends the description information of each media file to the media access function through the media access function API.
- the media access function receives the description information of each media file sent by the display engine, including: the media access function receives the description information of each media file sent by the display engine through the media access function API.
- the media access function creates a pipeline for processing each media file according to the description information of each media file.
- the method further comprises: creating a target pipeline for processing the target media file according to the target media file description information.
- the target pipeline includes: an input module, a decapsulation module and a decoding module; the input module is used to obtain the target media file (encapsulated file); the decapsulation module is used to decapsulate the target media file to obtain the code stream of the target media file (which may be a G-PCC code stream encapsulated in a single track, or a G-PCC geometry code stream and a G-PCC attribute code stream encapsulated in multiple tracks); the decoding module includes a decoder, a geometry decoder and an attribute decoder; when the code stream of the target media file is a G-PCC code stream encapsulated in a single track, the decoding module decodes the G-PCC code stream through the decoder to obtain the decoded data of the target media file; when the code stream of the target media file is a G-PCC geometry code stream and a G-PCC attribute code stream encapsulated in multiple tracks, the decoding module decodes the G-PCC geometry code stream and the G-PCC attribute code stream through the geometry decoder and the attribute decoder, respectively, to obtain the decoded data of the target media file.
- the target pipeline also includes: a first post-processing module and a second post-processing module; the first post-processing module is used to perform post-processing such as format conversion on the geometric data obtained by decoding the G-PCC geometric code stream, and the second post-processing module is used to perform post-processing such as format conversion on the attribute data obtained by decoding the G-PCC attribute code stream.
- the media access function obtains each media file through the pipeline corresponding to each media file, and decapsulates and decodes each media file to obtain decoded data corresponding to each media file (including: obtaining the target media file through the target pipeline, and decapsulating and decoding the target media file to obtain decoded data corresponding to the target media file).
- the description information of the target media file includes an access address of the target media file
- the media access function obtains the decoded data of the target media file according to the description information of the target media file, including:
- the media access function obtains the target media file according to the access address of the target media file.
- the media access function obtains the target media file according to the access address of the target media file, including: the media access function sends a media resource request to a media resource server according to the access address of the target media file, and receives a media resource response carrying the target media file sent by the media resource server.
- the media access function obtains the target media file according to the access address of the target media file, including: the media access function reads the target media file from a preset storage space according to the access address of the target media file.
- the description information of the target media file further includes index values of each code stream track of the target media file; and the media access function obtains the decoded data of the target media file according to the description information of the target media file, including:
- the media access function decapsulates the target media file according to the target media file encapsulation format, and obtains the bitstreams of each bitstream track of the target media file.
- the description information of the target media file further includes the type and encoding and decoding parameters of the code stream of the target media file; the media access function obtains the decoding data of the target media file according to the description information of the target media file, including:
- the media access function decodes the bitstreams of each bitstream track of the target media file according to the bitstream type and encoding and decoding parameters of the target media file to obtain decoded data of the target media file.
- the display engine obtains each buffer description module in the buffer list (buffers) of the scene description file (including: obtaining, from the buffer list of the scene description file, the buffer description module corresponding to the target buffer for caching the decoded data of the target media file).
- the display engine obtains description information of each buffer according to the buffer description module corresponding to each buffer (including: obtaining description information of the target buffer according to the buffer description module corresponding to the target buffer).
- the description information of the buffer may include at least one of the following:
- the capacity of the buffer (byte length), the access address of the data cached in the buffer, whether it is an MPEG circular buffer, the number of storage links in the circular buffer, the index value of the media description module corresponding to the media file cached in the circular buffer, and the track index value of the source data of the data cached in the circular buffer.
- the display engine obtains each cache slice description module in the cache slice list (bufferViews) of the scene description file (including: obtaining the cache slice description module corresponding to the cache slice of the target buffer from the cache slice list of the scene description file).
- the display engine obtains description information of the cache slices of each cache according to the cache slice description module corresponding to the cache slice of each cache (including: obtaining description information of the cache slice of the target cache according to the cache slice description module corresponding to the cache slice of the target cache).
- the description information of the cache slice may include at least one of the following:
- the buffer to which the cache slice belongs, the capacity of the cache slice, and the offset of the cache slice.
- the display engine obtains each accessor description module in the accessor list (accessors) of the scene description file (including: obtaining the accessor description module corresponding to the target accessor for accessing the decoded data of the target media file from the accessor list of the scene description file).
- the display engine obtains description information of each accessor according to the accessor description module corresponding to each accessor (including: obtaining description information of the target accessor for accessing the decoded data of the target media file according to the accessor description module corresponding to the target accessor).
- the description information of the accessor may include at least one of the following: the type of data accessed by the accessor, the type of the accessor, the number of data accessed by the accessor, whether the accessor is a time-varying accessor modified based on the MPEG extension, the index value of the cache slice description module corresponding to the cache slice storing the data accessed by the accessor, and whether the values of the syntax elements in the accessor change with time.
- the embodiment of the present application can send the description information of each cache, the description information of the cache slices of each cache, and the description information of each accessor to the media access function and the cache management module through the following scheme 1.
- one implementation of scheme 1 includes the following steps a and b:
- Step a the display engine sends the description information of each buffer, the description information of the cache slice of each buffer, and the description information of each accessor to the media access function (including: the display engine sends the description information of the target buffer, the description information of the cache slice of the target buffer, and the description information of the target accessor to the media access function).
- the media access function receives the description information of each buffer, the description information of the cache slices of each buffer, and the description information of each accessor sent by the display engine (including: the media access function receives the description information of the target buffer, the description information of the cache slices of the target buffer, and the description information of the target accessor sent by the display engine).
- the implementation of the above step a (the display engine sends the description information of each buffer, the description information of the cache slices of each buffer, and the description information of each accessor to the media access function) may be: the display engine sends the description information of each buffer, the description information of the cache slices of each buffer, and the description information of each accessor to the media access function through the media access function API.
- correspondingly, the media access function receives, through the media access function API, the description information of each buffer, the description information of the cache slices of each buffer, and the description information of each accessor sent by the display engine.
- Step b the media access function sends the description information of each cache, the description information of the cache slice of each cache, and the description information of each accessor to the cache management module (including: the media access function sends the description information of the target cache, the description information of the cache slice of the target cache, and the description information of the target accessor to the cache management module).
- the cache management module receives description information of each cache, description information of the cache slices of each cache, and description information of each accessor sent by the media access function (including: the cache management module receives the description information of the target cache, the description information of the cache slices of the target cache, and the description information of the target accessor sent by the media access function).
- the implementation of the above step b may include: the media access function sends the description information of each buffer, the description information of the cache slice of each buffer, and the description information of each accessor to the cache management module through the cache API.
- the implementation of the cache management module receiving the description information of each buffer, the description information of the cache slice of each buffer, and the description information of each accessor sent by the media access function may include: the cache management module receives the description information of each buffer, the description information of the cache slice of each buffer, and the description information of each accessor sent by the media access function through the cache API.
- another implementation of scheme 1 includes the following steps c and d:
- Step c the display engine sends the description information of each buffer, the description information of the cache slice of each buffer, and the description information of each accessor to the media access function (including: the display engine sends the description information of the target buffer, the description information of the cache slice of the target buffer, and the description information of the target accessor to the media access function).
- the media access function receives description information of each buffer, description information of the cache slices of each buffer, and description information of each accessor sent by the display engine (including: the media access function receives description information of the target buffer, description information of the cache slices of the target buffer, and description information of the accessor sent by the display engine).
- Step d the display engine sends the description information of each cache, the description information of the cache slice of each cache, and the description information of each accessor to the cache management module (including: the display engine sends the description information of the target cache, the description information of the cache slice of the target cache, and the description information of the target accessor to the cache management module).
- the cache management module receives description information of each cache, description information of cache slices of each cache, and description information of each accessor sent by the display engine.
- the implementation of the above step d (the display engine sends the description information of each buffer, the description information of the cache slices of each buffer, and the description information of each accessor to the cache management module) may be: the display engine sends the description information of each buffer, the description information of the cache slices of each buffer, and the description information of each accessor to the cache management module through the cache API.
- the implementation method of the cache management module receiving the description information of each buffer, the description information of the cache slices of each buffer, and the description information of each accessor sent by the display engine may include: the cache management module receives the description information of each buffer, the description information of the cache slices of each buffer, and the description information of each accessor sent by the display engine through the cache API.
- the embodiment of the present application can send the description information of each cache, the description information of the cache slices of each cache, and the description information of each accessor to the media access function through the following scheme two, and send the description information of each cache and the description information of the cache slices of each cache to the cache management module.
- one implementation method of scheme two (sending the description information of each buffer, the description information of the cache slices of each buffer, and the description information of each accessor to the media access function, and sending the description information of each buffer and the description information of the cache slices of each buffer to the cache management module) includes the following steps e and f:
- Step e the display engine sends the description information of each buffer, the description information of the cache slice of each buffer, and the description information of each accessor to the media access function (including: the display engine sends the description information of the target buffer, the description information of the cache slice of the target buffer, and the description information of the target accessor to the media access function).
- the media access function receives the description information of each buffer, the description information of the cache slices of each buffer, and the description information of each accessor sent by the display engine (including: the media access function receives the description information of the target buffer, the description information of the cache slices of the target buffer, and the description information of the target accessor sent by the display engine).
- Step f the display engine sends the description information of each buffer and the description information of the cache slices of each buffer to the cache management module (including: the display engine sends the description information of the target buffer and the description information of the cache slices of the target buffer to the cache management module).
- another implementation method of scheme two (sending the description information of each buffer, the description information of the cache slices of each buffer, and the description information of each accessor to the media access function, and sending the description information of each buffer and the description information of the cache slices of each buffer to the cache management module) includes the following steps g and h:
- Step g the display engine sends the description information of each buffer, the description information of the cache slice of each buffer, and the description information of each accessor to the media access function (including: the display engine sends the description information of the target buffer, the description information of the cache slice of the target buffer, and the description information of the target accessor to the media access function).
- the media access function receives the description information of each buffer, the description information of the cache slices of each buffer, and the description information of each accessor sent by the display engine (including: the media access function receives the description information of the target buffer, the description information of the cache slices of the target buffer, and the description information of the target accessor sent by the display engine).
- Step h the media access function sends the description information of each buffer and the description information of the cache slices of each buffer to the cache management module (including: the media access function sends the description information of the target buffer and the description information of the cache slices of the target buffer to the cache management module).
- the cache management module receives the description information of each cache and the description information of the cache slices of each cache sent by the media access function (including: the cache management module receives the description information of the target buffer and the description information of the cache slices of the target buffer sent by the media access function).
- the cache management module creates each cache according to the description information of each cache (including: creating the target cache according to the description information of the target cache).
- the media access function writes the decoded data corresponding to each media file into the buffer corresponding to each media file according to the description information of each buffer, the description information of the cache slice of each buffer, and the description information of each accessor (including: the media access function writes the decoded data of the target media file into the target buffer according to the description information of the target buffer, the description information of the cache slice of the target buffer, and the description information of the target accessor).
- the display engine obtains a scene description module corresponding to the to-be-rendered three-dimensional scene from the scene list of the scene description file.
- the display engine obtains description information of the three-dimensional scene to be rendered according to the scene description module corresponding to the three-dimensional scene to be rendered.
- the display engine obtains the node description module corresponding to each node in the three-dimensional scene to be rendered from the node list of the scene description file according to the index value of the node description module corresponding to each node in the three-dimensional scene to be rendered.
- the display engine obtains description information of each node in the three-dimensional scene to be rendered according to the node description module corresponding to each node in the three-dimensional scene to be rendered.
- the description information of any node includes the index value of the mesh description module corresponding to the three-dimensional mesh mounted on the node.
- the description information of any node also includes the name of the node.
- the display engine obtains the mesh description module corresponding to the three-dimensional mesh in the three-dimensional scene to be rendered from the mesh list of the scene description file according to the index value of the mesh description module corresponding to the three-dimensional mesh mounted on each node in the three-dimensional scene to be rendered.
- the display engine obtains the data types contained in the three-dimensional mesh in the three-dimensional scene to be rendered and the accessors for accessing the various types of data of each three-dimensional mesh in the three-dimensional scene to be rendered according to the mesh description module corresponding to the three-dimensional mesh in the three-dimensional scene to be rendered.
- the method further includes: acquiring the name and topological structure type of the three-dimensional mesh in the three-dimensional scene to be rendered according to a mesh description module corresponding to the three-dimensional mesh in the three-dimensional scene to be rendered.
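- As an illustrative sketch only, the scene, node, and mesh description modules walked through above could be organized roughly as follows; the node and mesh names, attribute names, and accessor index values are assumptions, and mode 0 denotes a point-type primitive topology in glTF:

```json
{
  "scenes": [ { "nodes": [0] } ],
  "nodes": [ { "name": "gpcc_node", "mesh": 0 } ],
  "meshes": [
    {
      "name": "gpcc_mesh",
      "primitives": [
        { "mode": 0, "attributes": { "POSITION": 0, "COLOR_0": 1 } }
      ]
    }
  ]
}
```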
- the display engine creates each accessor according to the description information of each accessor (including creating an accessor for accessing each type of data of each three-dimensional mesh in the three-dimensional scene to be rendered according to the description information of the accessor for accessing each type of data of each three-dimensional mesh in the three-dimensional scene to be rendered).
- the display engine reads the decoded data of each media file from the buffer corresponding to each media file through each accessor (including: reading the various types of data of each three-dimensional mesh in the three-dimensional scene to be rendered from the target buffer through the accessors used to access the various types of data of each three-dimensional mesh in the three-dimensional scene to be rendered).
- the display engine renders the to-be-rendered 3D scene based on the decoded data of each media file.
- some embodiments of the present application provide a device for generating a scene description file, the device for generating a scene description file comprising:
- a memory configured to store a computer program; and
- a processor configured to, when calling the computer program, cause the device for generating a scene description file to implement the method for generating a scene description file described in any of the above embodiments.
- some embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored.
- when the computer program is executed by a computing device, the computing device implements the method for generating a scene description file described in any of the above embodiments.
- some embodiments of the present application provide a computer program product, which, when executed on a computer, enables the computer to implement the method for generating a scene description file described in any of the above embodiments.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computer Graphics (AREA)
- Software Systems (AREA)
- Geometry (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Computer Security & Cryptography (AREA)
- Processing Or Creating Images (AREA)
- Television Signal Processing For Recording (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Image Generation (AREA)
Abstract
Description
Claims (49)
- A method for generating a scene description file, comprising: determining the type of each media file in a three-dimensional scene to be rendered; when the type of a target media file in the three-dimensional scene to be rendered is geometry-based point cloud compression (G-PCC) coded point cloud, generating a target media description module corresponding to the target media file according to description information of the target media file; and adding the target media description module to a media list of MPEG media of a scene description file of the three-dimensional scene to be rendered.
- The method according to claim 1, wherein generating the target media description module corresponding to the target media file according to the description information of the target media file comprises: adding a media type syntax element to the alternatives of the target media description module, and setting the value of the media type syntax element to the encapsulation format value corresponding to G-PCC coded point cloud.
- The method according to claim 1, wherein generating the target media description module corresponding to the target media file according to the description information of the target media file comprises: adding a first track index syntax element to the track array of the alternatives of the target media description module, and setting the value of the first track index syntax element according to the encapsulation mode of the target media file.
- The method according to claim 3, wherein setting the value of the first track index syntax element according to the encapsulation mode of the target media file comprises: when the target media file is a single-track encapsulated file, setting the value of the first track index syntax element to the index value of the bitstream track of the target media file; and when the target media file is a multi-track encapsulated file, setting the value of the first track index syntax element to the index value of the geometry bitstream track of the target media file.
- The method according to claim 1, wherein generating the target media description module corresponding to the target media file according to the description information of the target media file comprises: adding a codec parameter syntax element to the track array of the alternatives of the target media description module, and setting the value of the codec parameter syntax element according to the encoding parameters of the target media file, the type of the bitstream of the target media file, and the ISO/IEC 23090-18 standard for the carriage of G-PCC data.
- The method according to claim 1, wherein generating the target media description module corresponding to the target media file according to the description information of the target media file comprises: adding a uniform resource identifier syntax element to the alternatives of the target media description module, and setting the value of the uniform resource identifier syntax element to the access address of the target media file.
- The method according to claim 1, further comprising: adding a target scene description module corresponding to the three-dimensional scene to be rendered to the scene list of the scene description file; and adding, to the node index list of the target scene description module, the index values of the node description modules corresponding to the nodes in the scene to be rendered.
- The method according to claim 1, further comprising: adding, to the node list of the scene description file, the node description modules corresponding to the nodes in the scene to be rendered; and adding, to the mesh index list of each node description module, the index value of the mesh description module corresponding to the three-dimensional mesh mounted on the node.
- The method according to claim 1, further comprising: adding, to the mesh list of the scene description file, the mesh description modules corresponding to the three-dimensional meshes in the scene to be rendered; adding, to each mesh description module, syntax elements corresponding to the various types of data contained in the three-dimensional mesh corresponding to the mesh description module; and setting the values of the syntax elements corresponding to the various types of data to the index values of the accessor description modules corresponding to the accessors used to access the various types of data.
- The method according to claim 9, wherein adding, to the mesh description module, the syntax elements corresponding to the various types of data contained in the three-dimensional mesh corresponding to the mesh description module comprises: adding an extension list to the primitives of the mesh description module corresponding to the three-dimensional mesh in the target media file; adding a target extension array to the extension list; and adding, to the target extension array, the syntax elements corresponding to the various types of data contained in the corresponding three-dimensional mesh.
- The method according to claim 9, wherein adding, to the mesh description module, the syntax elements corresponding to the various types of data contained in the three-dimensional mesh corresponding to the mesh description module comprises: adding, to the attributes of the primitives of the mesh description module, the syntax elements corresponding to the various types of data contained in the three-dimensional mesh corresponding to the mesh description module.
- The method according to claim 11, wherein adding, to the attributes of the primitives of the mesh description module, the syntax elements corresponding to the various types of data contained in the three-dimensional mesh corresponding to the mesh description module comprises: based on the syntax elements in a first syntax element set, adding, to the attributes of the primitives of a first mesh description module, the syntax elements corresponding to the various types of data contained in the corresponding three-dimensional mesh, the first mesh description module being a mesh description module corresponding to a three-dimensional mesh in a media file whose type is G-PCC coded point cloud; and based on the syntax elements in a second syntax element set, adding, to the attributes of the primitives of a second mesh description module, the syntax elements corresponding to the various types of data contained in the corresponding three-dimensional mesh, the second mesh description module being a mesh description module corresponding to a three-dimensional mesh in a media file whose type is not G-PCC coded point cloud.
- The method according to claim 1, further comprising: adding, to the accessor list of the scene description file, an accessor description module corresponding to a target accessor, the target accessor being an accessor used to access the decoded data of the target media file.
- The method according to claim 13, wherein adding, to the accessor list of the scene description file, the accessor description module corresponding to the target accessor comprises at least one of the following: adding a data type syntax element to the accessor description module corresponding to the target accessor, and setting the value of the data type syntax element according to the type of the data accessed by the target accessor; adding an accessor type syntax element to the accessor description module corresponding to the target accessor, and setting the value of the accessor type syntax element according to the type of the target accessor; adding a data number syntax element to the accessor description module corresponding to the target accessor, and setting the value of the data number syntax element according to the number of data accessed by the target accessor; adding an MPEG time-varying accessor to the accessor description module corresponding to the target accessor; adding a cache slice index syntax element to the MPEG time-varying accessor, and setting the value of the cache slice index syntax element according to the index value of the cache slice description module corresponding to the cache slice storing the data accessed by the target accessor; and adding a time-varying syntax element to the MPEG time-varying accessor, and setting the value of the time-varying syntax element according to whether the values of the syntax elements in the corresponding target accessor change with time.
- The method according to claim 1, further comprising: adding, to the buffer list of the scene description file, a buffer description module corresponding to a target buffer, the target buffer being a buffer for storing the decoded data of the target media file.
- The method according to claim 15, wherein adding, to the buffer list of the scene description file, the buffer description module corresponding to the target buffer comprises: adding a first byte length syntax element to the buffer description module, and setting the value of the first byte length syntax element according to the capacity of the target buffer; adding an MPEG circular buffer to the buffer description module; adding a link number syntax element to the MPEG circular buffer, and setting the value of the link number syntax element according to the number of storage links of the target buffer; adding a media index syntax element to the MPEG circular buffer, and setting the value of the media index syntax element according to the index value of the target media description module; and adding a second track index syntax element to the MPEG circular buffer, and setting the value of the second track index syntax element according to the track index value of the source data of the data stored in the target buffer.
- The method according to claim 15, further comprising: adding, to the cache slice list of the scene description file, a cache slice description module corresponding to a cache slice of the target buffer.
- The method according to claim 17, wherein adding, to the cache slice list of the scene description file, the cache slice description module corresponding to the cache slice of the target buffer comprises: adding a buffer index syntax element to the cache slice description module corresponding to the cache slice of the target buffer, and setting the value of the buffer index syntax element according to the index value of the buffer description module corresponding to the target buffer to which the cache slice belongs; adding a second byte length syntax element to the cache slice description module corresponding to the cache slice of the target buffer, and setting the value of the second byte length syntax element according to the capacity of the cache slice; and adding an offset syntax element to the cache slice description module corresponding to the cache slice of the target buffer, and setting the value of the offset syntax element according to the offset of the data stored in the corresponding cache slice.
- An apparatus for generating a scene description file, comprising: a memory configured to store a computer program; and a processor configured to, when calling the computer program, cause the apparatus for generating a scene description file to implement the method for generating a scene description file according to any one of claims 1 to 18.
- A method for parsing a scene description file, comprising: obtaining a scene description file of a three-dimensional scene to be rendered, the three-dimensional scene to be rendered including a target media file whose type is G-PCC coded point cloud; obtaining a target media description module corresponding to the target media file from a media list of Moving Picture Experts Group (MPEG) media of the scene description file; and obtaining description information of the target media file according to the target media description module.
- The method according to claim 20, wherein obtaining the description information of the target media file according to the target media description module comprises at least one of the following: obtaining the name of the target media file according to the value of a media name syntax element in the target media description module; determining whether the target media file needs to be played automatically according to the value of an autoplay syntax element in the target media description module; determining whether the target media file needs to be played in a loop according to the value of a loop playback syntax element in the target media description module; obtaining the encapsulation format of the target media file according to the value of a media type syntax element in the alternatives of the target media description module; obtaining the access address of the target media file according to the value of a uniform resource identifier syntax element in the alternatives of the target media description module; obtaining the track information of the target media file according to the value of a first track index syntax element in the track array of the alternatives of the target media description module; and determining the type of the bitstream of the target media file and the decoding parameters according to the value of a codec parameter syntax element in the track array of the alternatives of the target media description module and the ISO/IEC 23090-18 standard for the carriage of G-PCC data.
- The method according to claim 20, further comprising: obtaining a target scene description module corresponding to the three-dimensional scene to be rendered from the scene list of the scene description file; and obtaining description information of the three-dimensional scene to be rendered according to the target scene description module.
- The method according to claim 22, wherein obtaining the description information of the three-dimensional scene to be rendered according to the target scene description module comprises: determining the index values of the node description modules corresponding to the nodes in the three-dimensional scene to be rendered according to the index values declared in the node index list of the target scene description module.
- The method according to claim 23, wherein after determining the index values of the node description modules corresponding to the nodes in the three-dimensional scene to be rendered according to the index values declared in the node index list of the target scene description module, the method further comprises: obtaining the node description modules corresponding to the nodes in the three-dimensional scene to be rendered from the node list of the scene description file according to the index values of the node description modules corresponding to the nodes in the three-dimensional scene to be rendered; and obtaining description information of the nodes in the three-dimensional scene to be rendered according to the node description modules corresponding to the nodes in the three-dimensional scene to be rendered.
- The method according to claim 24, wherein obtaining the description information of the nodes in the three-dimensional scene to be rendered according to the node description modules corresponding to the nodes in the three-dimensional scene to be rendered comprises at least one of the following: obtaining the names of the nodes in the three-dimensional scene to be rendered according to the values of node name syntax elements in the node description modules corresponding to the nodes in the three-dimensional scene to be rendered; and determining the index values of the mesh description modules corresponding to the three-dimensional meshes mounted on the nodes in the three-dimensional scene to be rendered according to the index values declared in the mesh index lists of the node description modules corresponding to the nodes in the three-dimensional scene to be rendered.
- The method according to claim 25, wherein after determining the index values of the mesh description modules corresponding to the three-dimensional meshes mounted on the nodes in the three-dimensional scene to be rendered, the method further comprises: obtaining the mesh description modules corresponding to the three-dimensional meshes mounted on the nodes in the three-dimensional scene to be rendered from the mesh list of the scene description file according to the index values of the mesh description modules corresponding to the three-dimensional meshes mounted on the nodes in the three-dimensional scene to be rendered; and obtaining description information of the three-dimensional meshes mounted on the nodes in the three-dimensional scene to be rendered according to the mesh description modules corresponding to the three-dimensional meshes mounted on the nodes in the three-dimensional scene to be rendered.
- The method according to claim 26, wherein obtaining the description information of the three-dimensional meshes mounted on the nodes in the three-dimensional scene to be rendered according to the mesh description modules corresponding to the three-dimensional meshes mounted on the nodes in the three-dimensional scene to be rendered comprises at least one of the following: obtaining the name of each three-dimensional mesh according to a mesh name syntax element in the mesh description module corresponding to the three-dimensional mesh; obtaining the types of data included in each three-dimensional mesh according to the data category syntax elements in the mesh description module corresponding to the three-dimensional mesh; obtaining the index values of the accessor description modules corresponding to the accessors used to access the various types of data of each three-dimensional mesh according to the values of the data category syntax elements; and obtaining the type of the topological structure of each three-dimensional mesh according to the value of a mode syntax element in the mesh description module corresponding to the three-dimensional mesh.
- The method according to claim 27, wherein after obtaining the index values of the accessor description modules corresponding to the accessors used to access the various types of data of each three-dimensional mesh according to the values of the data category syntax elements, the method further comprises: obtaining the accessor description modules corresponding to the accessors used to access the various types of data of each three-dimensional mesh from the accessor list of the scene description file according to the index values of the accessor description modules corresponding to the accessors used to access the various types of data of each three-dimensional mesh; and obtaining description information of the accessors used to access the various types of data of each three-dimensional mesh according to the accessor description modules corresponding to the accessors.
- The method according to claim 20, further comprising: obtaining each buffer description module in the buffer list of the scene description file; obtaining the value of a media index syntax element in each buffer description module; determining the buffer description module whose value of the media index syntax element is the same as the index value of the target media description module as the target buffer description module corresponding to the target buffer for caching the decoded data of the target media file; and obtaining description information of the target buffer according to the target buffer description module.
- The method according to claim 29, wherein obtaining the description information of the target buffer according to the target buffer description module comprises at least one of the following: obtaining the capacity of the target buffer according to the value of a first byte length syntax element in the target buffer description module; determining whether the target buffer is a circular buffer modified based on the MPEG extension according to whether the target buffer description module contains an MPEG circular buffer; obtaining the number of storage links of the MPEG circular buffer according to the value of a link number syntax element in the MPEG circular buffer of the target buffer description module; and obtaining the track index value of the source data of the data cached by the MPEG circular buffer according to the value of a second track index syntax element in the MPEG circular buffer of the target buffer description module.
- The method according to claim 29, further comprising: obtaining each cache slice description module in the cache slice list of the scene description file; obtaining the value of a buffer index syntax element in each cache slice description module; determining the cache slice description module whose value of the buffer index syntax element is the same as the index value of the target buffer description module as the cache slice description module corresponding to a cache slice of the target buffer; and obtaining description information of the cache slice of the target buffer according to the cache slice description module corresponding to the cache slice of the target buffer.
- The method according to claim 31, wherein obtaining the description information of the cache slice of the target buffer according to the cache slice description module corresponding to the cache slice of the target buffer comprises at least one of the following: obtaining the capacity of the cache slice of the target buffer according to the value of a second byte length syntax element in the cache slice description module corresponding to the cache slice of the target buffer; and obtaining the offset of the cache slice of the target buffer according to the value of an offset syntax element in the cache slice description module corresponding to the cache slice of the target buffer.
- The method according to claim 31, further comprising: obtaining each accessor description module in the accessor list of the scene description file; obtaining the value of a cache slice index syntax element in each accessor description module; determining the accessor description module whose value of the cache slice index syntax element is the same as the index value of the cache slice description module corresponding to the cache slice of the target buffer as the accessor description module corresponding to the accessor used to access the data in the cache slice of the target buffer; and obtaining description information of the accessor used to access the data in the cache slice of the target buffer according to the accessor description module corresponding to that accessor.
- The method according to claim 28 or 33, wherein obtaining the description information of an accessor according to the accessor description module corresponding to the accessor comprises at least one of the following: determining the type of the data accessed by the accessor according to the value of a data type syntax element in the accessor description module; determining the type of the accessor according to the value of an accessor type syntax element in the accessor description module; determining the number of data accessed by the accessor according to the value of a data number syntax element in the accessor description module; determining whether the accessor is a time-varying accessor modified based on the MPEG extension according to whether the accessor description module contains an MPEG time-varying accessor; determining the index value of the cache slice description module corresponding to the cache slice storing the data accessed by the accessor according to the value of a cache slice index syntax element in the MPEG time-varying accessor of the accessor description module; and determining whether the values of the syntax elements in the accessor change with time according to the value of a time-varying syntax element in the MPEG time-varying accessor of the accessor description module.
- An apparatus for parsing a scene description file, comprising: a memory configured to store a computer program; and a processor configured to, when calling the computer program, cause the apparatus for parsing a scene description file to implement the method for parsing a scene description file according to any one of claims 20 to 34.
- 一种三维场景的渲染方法,包括:获取待渲染三维场景的场景描述文件,所述待渲染三维场景中包括类型为基于几何的点云压缩G-PCC编码点云的目标媒体文件;根据所述场景描述文件的动态图像组专家MPEG媒体的媒体列表中所述目标媒体文件对应的媒体描述模块,获取所述目标媒体文件的描述信息;向媒体接入函数发送所述目标媒体文件的描述信息,以使所述媒体接入函数根据所述目标媒体文件的描述信息获取所述目标媒体文件,对所述目标媒体文件进行处理获取所述目标媒体文件的解码数据,以及将所述目标媒体文件的解码数据写入目标缓存器;从所述目标缓存器中读取所述目标媒体文件的解码数据;基于所述目标媒体文件的解码数据对所述待渲染三维场景进行渲染。
- 根据权利要求36所述的方法,所述向媒体接入函数发送所述目标媒体文件的描述信息,包括:通过媒体接入函数应用程序编程接口API向所述媒体接入函数发送所述目标媒体文件的描述信息。
- 根据权利要求36所述的方法,在从所述目标缓存器中读取所述目标媒体文件的解码数据之前,所述方法还包括:从所述场景描述文件的缓存器列表中获取所述目标缓存器对应的缓存器描述模块;根据所述目标缓存器对应的缓存器描述模块获取所述目标缓存器的描述信息;从所述场景描述文件的缓存切片列表中获取所述目标缓存器的缓存切片对应的缓存切片描述模块;根据所述目标缓存器的缓存切片对应的缓存切片描述模块获取所述目标缓存器的缓存切片的描述信息;向缓存管理模块发送所述目标缓存器的描述信息和所述目标缓存器的缓存切片的描述信息,以使所述缓存管理模块根据所述目标缓存器的描述信息创建所述目标缓存器以及根据所述目标缓存器的缓存切片的描述信息对所述目标缓存器进行缓存切片的划分。
- 根据权利要求38所述的方法,所述向缓存管理模块发送所述目标缓存器的描述信息和所述目标缓存器的缓存切片的描述信息,包括:通过缓存API向所述向缓存管理模块发送所述目标缓存器的描述信息和所述目标缓存器的缓存切片的描述信息。
- 根据权利要求39所述的方法,在从所述目标缓存器中读取所述目标媒体文件的解码数据之前,所述方法还包括:向所述媒体接入函数发送所述目标缓存器的描述信息和所述目标缓存器的缓存切片的描述信息,以使所述媒体接入函数根据所述目标缓存器的描述信息和所述目标缓存器的缓存切片的描述信息将所述目标媒体文件的解码数据写入所述目标缓存器中。
- 根据权利要求40所述的方法,所述向所述媒体接入函数发送所述目标缓存器的描述信息和所述目标缓存器的缓存切片的描述信息,包括:通过媒体接入函数API向所述媒体接入函数发送所述目标缓存器的描述信息和所述目标缓存器的缓存切片的描述信息。
- 根据权利要求38所述的方法,所述向缓存管理模块发送所述目标缓存器的描述信息和所述目标缓存器的缓存切片的描述信息,包括:通过媒体接入函数API向所述媒体接入函数发送所述目标缓存器的描述信息和所述目标 缓存器的缓存切片的描述信息,以使所述媒体接入函数将所述目标缓存器的描述信息和所述目标缓存器的缓存切片的描述信息转发至所述缓存管理模块。
- 根据权利要求38所述的方法,在从所述目标缓存器中读取所述目标媒体文件的解码数据之前,所述方法还包括:从所述场景描述文件的访问器列表中获取目标访问器对应的访问器描述模块,所述目标访问器为用于访问所述目标媒体文件的解码数据的访问器;根据所述目标访问器对应的访问器描述模块获取所述目标访问器的描述信息;向所述媒体接入函数发送所述目标访问器的描述信息,以使所述媒体接入函数根据所述目标访问器的描述信息将所述目标媒体文件的解码数据写入所述目标缓存器中。
- 根据权利要求36-43任一项所述的方法,所述从所述目标缓存器中读取所述目标媒体文件的解码数据,包括:通过用于访问所述待渲染三维场景中的各个三维网格的各个种类的数据的访问器,从所述目标缓存器存储的所述目标媒体文件的解码数据中读取所述待渲染三维场景中的各个三维网格的各个种类的数据。
- The method according to claim 44, wherein before reading each kind of data of each three-dimensional mesh in the three-dimensional scene to be rendered from the decoded data of the target media file stored in the target buffer through the accessors used to access each kind of data of each three-dimensional mesh, the method further comprises: obtaining, from the mesh list of the scene description file, the mesh description module corresponding to each three-dimensional mesh in the three-dimensional scene to be rendered according to the index value of the mesh description module corresponding to the three-dimensional mesh attached to each node in the three-dimensional scene to be rendered; and obtaining, according to the mesh description module corresponding to each three-dimensional mesh in the three-dimensional scene to be rendered, the kinds of data included in each three-dimensional mesh and the accessors used to access each kind of data of each three-dimensional mesh. (See the scene-graph traversal sketch following the claims list.)
- The method according to claim 45, wherein before obtaining, from the mesh list of the scene description file, the mesh description module corresponding to each three-dimensional mesh in the three-dimensional scene to be rendered according to the index value of the mesh description module corresponding to the three-dimensional mesh attached to each node, the method further comprises: obtaining, from the node list of the scene description file, the node description module corresponding to each node in the three-dimensional scene to be rendered according to the index value of the node description module corresponding to each node; and obtaining description information of each node in the three-dimensional scene to be rendered according to the node description module corresponding to the node; the description information of any node includes the index value of the mesh description module corresponding to the three-dimensional mesh attached to that node.
- The method according to claim 46, wherein before obtaining, from the node list of the scene description file, the node description module corresponding to each node in the three-dimensional scene to be rendered according to the index value of the node description module corresponding to each node, the method further comprises: obtaining the scene description module corresponding to the three-dimensional scene to be rendered from the scene list of the scene description file; and obtaining description information of the three-dimensional scene to be rendered according to the scene description module corresponding to the three-dimensional scene to be rendered, the description information of the three-dimensional scene to be rendered including the index values of the node description modules corresponding to the respective nodes in the three-dimensional scene to be rendered.
- The method according to claim 47, wherein the description information of any three-dimensional mesh further includes the index values of the accessor description modules corresponding to the accessors used to access each kind of data of that three-dimensional mesh, and obtaining the accessors used to access each kind of data of each three-dimensional mesh in the three-dimensional scene to be rendered according to the mesh description module corresponding to each three-dimensional mesh comprises: obtaining, from the accessor list of the scene description file, the accessor description modules corresponding to the accessors used to access each kind of data of each three-dimensional mesh according to the index values of those accessor description modules; obtaining description information of the accessors used to access each kind of data of each three-dimensional mesh according to the corresponding accessor description modules; and creating the accessors used to access each kind of data of each three-dimensional mesh in the three-dimensional scene to be rendered according to the description information of those accessors.
- An apparatus for rendering a three-dimensional scene, comprising: a memory configured to store a computer program; and a processor configured to, when invoking the computer program, cause the apparatus for rendering a three-dimensional scene to implement the method for rendering a three-dimensional scene according to any one of claims 36 to 48.
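The following minimal Python sketch illustrates how the mesh-related syntax elements named in claims 27-28 could be read from one mesh description module. The JSON key names ("name", "primitives", "attributes", "mode", "accessors") follow the usual glTF convention and are assumptions made for illustration; the claims only name the syntax elements abstractly, so this is not a normative implementation of the claimed method.

```python
# Non-normative sketch: reading a mesh description module's syntax elements.
# Key names follow the glTF convention and are assumed, not quoted from the claims.

GLTF_MODES = {0: "POINTS", 1: "LINES", 4: "TRIANGLES"}  # partial, illustrative mapping

def describe_mesh(scene_description: dict, mesh_index: int) -> dict:
    """Collect the description information of one 3D mesh from the mesh list."""
    mesh = scene_description["meshes"][mesh_index]       # mesh description module
    primitive = mesh["primitives"][0]                     # first primitive of the mesh
    attributes = primitive["attributes"]                  # data kind syntax elements
    return {
        "name": mesh.get("name"),                         # mesh name syntax element
        # each data kind maps to the index value of the accessor description module
        # used to access that kind of data (e.g. POSITION, COLOR_0 for a point cloud)
        "data_kinds": dict(attributes),
        # mode syntax element -> type of the mesh topology
        "topology": GLTF_MODES.get(primitive.get("mode", 4), "UNKNOWN"),
    }

def accessor_module(scene_description: dict, accessor_index: int) -> dict:
    """Claim 28: resolve an accessor description module from the accessor list by index."""
    return scene_description["accessors"][accessor_index]
```

A caller would typically resolve the mesh index from a node (e.g. `describe_mesh(sd, node["mesh"])`) and then look up each accessor index returned in `data_kinds`.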
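The next sketch shows the matching chain of claims 29-34: target media, then the buffer description module whose media index matches, then the buffer slice description modules referencing that buffer, then the timed accessor description modules referencing those slices. The key names ("buffers", "bufferViews", "accessors", "byteLength", "byteOffset", "MPEG_buffer_circular", "MPEG_accessor_timed", "media", "tracks", "count", "immutable") follow the glTF / MPEG-I scene description convention and are assumptions here.

```python
# Non-normative sketch of the buffer -> buffer slice -> timed accessor matching chain.

def find_target_buffer(sd: dict, target_media_index: int):
    """Claims 29-30: the buffer whose media index equals the target media module index."""
    for buf_idx, buf in enumerate(sd.get("buffers", [])):
        circular = buf.get("extensions", {}).get("MPEG_buffer_circular")
        if circular is not None and circular.get("media") == target_media_index:
            return buf_idx, {
                "capacity": buf.get("byteLength"),        # first byte length syntax element
                "is_circular": True,                      # MPEG circular buffer present
                "segment_count": circular.get("count"),   # segment count syntax element
                "source_tracks": circular.get("tracks"),  # track index of the source data
            }
    return None, None

def find_buffer_slices(sd: dict, target_buffer_index: int) -> dict:
    """Claims 31-32: buffer slices whose buffer index matches the target buffer."""
    return {
        bv_idx: {"capacity": bv.get("byteLength"), "offset": bv.get("byteOffset", 0)}
        for bv_idx, bv in enumerate(sd.get("bufferViews", []))
        if bv.get("buffer") == target_buffer_index
    }

def find_timed_accessors(sd: dict, slice_indices) -> dict:
    """Claims 33-34: accessors whose buffer slice index matches a target slice."""
    result = {}
    for acc_idx, acc in enumerate(sd.get("accessors", [])):
        timed = acc.get("extensions", {}).get("MPEG_accessor_timed", {})
        if timed.get("bufferView") in slice_indices:
            result[acc_idx] = {
                "data_type": acc.get("componentType"),    # data type syntax element
                "accessor_type": acc.get("type"),         # accessor type syntax element
                "data_count": acc.get("count"),           # data quantity syntax element
                "is_timed": True,                         # MPEG timed accessor present
                # assumed mapping of the timed syntax element to an "immutable" flag
                "values_change_over_time": not timed.get("immutable", True),
            }
    return result
```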
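The pipeline sketch below mirrors the rendering flow of claims 36-43: the presentation side hands the media, buffer, and buffer slice description information to a media access function and a buffer management module, the media access function writes decoded G-PCC data into a circular target buffer, and the renderer reads it back. The `MediaAccessFunction` and `BufferManager` classes and their method names are hypothetical stand-ins; the claims only name a "media access function API" and a "buffer API" abstractly, and no real decoder is invoked here.

```python
# Non-normative sketch of the claimed rendering pipeline with placeholder components.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class CircularBuffer:
    capacity: int
    segment_count: int
    slices: Dict[int, dict] = field(default_factory=dict)
    frames: List[bytes] = field(default_factory=list)

    def write(self, payload: bytes) -> None:
        self.frames.append(payload)
        if len(self.frames) > self.segment_count:   # circular behaviour: drop the oldest frame
            self.frames.pop(0)

    def read_latest(self) -> bytes:
        return self.frames[-1]

class BufferManager:
    """Stand-in for the buffer management module reached through the buffer API (claims 38-39)."""
    def __init__(self) -> None:
        self.buffers: Dict[int, CircularBuffer] = {}

    def create_buffer(self, buf_idx: int, buffer_info: dict, slice_info: dict) -> CircularBuffer:
        buf = CircularBuffer(buffer_info["capacity"], buffer_info["segment_count"], slice_info)
        self.buffers[buf_idx] = buf
        return buf

class MediaAccessFunction:
    """Stand-in for the media access function reached through the MAF API (claims 36-37, 40-43)."""
    def setup(self, media_info: dict, buffer_info: dict, slice_info: dict) -> None:
        self.media_info, self.buffer_info, self.slice_info = media_info, buffer_info, slice_info

    def fetch_decode_and_write(self, target_buffer: CircularBuffer) -> None:
        # A real implementation would fetch and decode the G-PCC media file;
        # a fixed placeholder payload is written here instead.
        target_buffer.write(b"decoded-point-cloud-frame")

def render_scene(media_info: dict, buffer_info: dict, slice_info: dict, buf_idx: int) -> str:
    maf, buffer_manager = MediaAccessFunction(), BufferManager()
    maf.setup(media_info, buffer_info, slice_info)                           # send description info
    target = buffer_manager.create_buffer(buf_idx, buffer_info, slice_info)  # create + slice buffer
    maf.fetch_decode_and_write(target)                                       # MAF fills the buffer
    decoded = target.read_latest()                                           # read decoded data back
    return f"rendered {len(decoded)} bytes of decoded point-cloud data"      # placeholder render step
```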
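Finally, a short sketch of the traversal order in claims 45-48: scene description module to node description modules to mesh description modules to accessor description modules. As above, the key names ("scenes", "scene", "nodes", "mesh", "meshes", "primitives", "attributes", "accessors") follow the glTF convention and are assumptions.

```python
# Non-normative sketch of the scene -> node -> mesh -> accessor traversal.

def collect_scene_accessors(sd: dict) -> dict:
    """Return, per node, the kinds of data of its attached mesh and the accessor modules for each kind."""
    scene = sd["scenes"][sd.get("scene", 0)]          # scene description module (claim 47)
    result = {}
    for node_index in scene["nodes"]:                 # node description module indices
        node = sd["nodes"][node_index]                # node description module (claim 46)
        mesh_index = node.get("mesh")                 # index of the attached mesh module
        if mesh_index is None:
            continue
        mesh = sd["meshes"][mesh_index]               # mesh description module (claim 45)
        attributes = mesh["primitives"][0]["attributes"]
        result[node_index] = {
            kind: sd["accessors"][accessor_index]     # accessor description modules (claim 48)
            for kind, accessor_index in attributes.items()
        }
    return result
```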
Priority Applications (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP23915515.3A EP4557723A4 (en) | 2023-01-10 | 2023-06-01 | METHOD AND DEVICE FOR GENERATING A SCENE DESCRIPTION FILE |
| KR1020257003775A KR20250034972A (ko) | 2023-01-10 | 2023-06-01 | 장면 묘사 파일의 생성 방법 및 장치 |
| CN202380034331.4A CN119013965A (zh) | 2023-01-10 | 2023-06-01 | 场景描述文件的生成方法及装置 |
| JP2025507589A JP2025529756A (ja) | 2023-01-10 | 2023-06-01 | シーン記述ファイルの生成方法及び装置 |
| US19/033,804 US20250166234A1 (en) | 2023-01-10 | 2025-01-22 | Method and apparatus for generating scene description document |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310036790 | 2023-01-10 | ||
| CN202310036790.8 | 2023-01-10 | ||
| CN202310474240.4A CN118334199A (zh) | 2023-01-10 | 2023-04-27 | 一种场景描述文件的生成方法及装置 |
| CN202310474240.4 | 2023-04-27 |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US19/033,804 Continuation US20250166234A1 (en) | 2023-01-10 | 2025-01-22 | Method and apparatus for generating scene description document |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024148751A1 true WO2024148751A1 (zh) | 2024-07-18 |
Family
ID=91763118
Family Applications (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2023/097873 Ceased WO2024148751A1 (zh) | 2023-01-10 | 2023-06-01 | 场景描述文件的生成方法及装置 |
| PCT/CN2023/120131 Ceased WO2024148849A1 (zh) | 2023-01-10 | 2023-09-20 | 场景描述文件的生成方法及装置 |
Family Applications After (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2023/120131 Ceased WO2024148849A1 (zh) | 2023-01-10 | 2023-09-20 | 场景描述文件的生成方法及装置 |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US20250166234A1 (zh) |
| EP (1) | EP4557723A4 (zh) |
| JP (1) | JP2025529756A (zh) |
| KR (1) | KR20250034972A (zh) |
| CN (17) | CN118334198A (zh) |
| WO (2) | WO2024148751A1 (zh) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12530845B2 (en) * | 2023-03-01 | 2026-01-20 | Toyota Research Institute, Inc. | Hybrid geometric primitive representation for point clouds |
| CN120599137B (zh) * | 2025-05-28 | 2025-12-09 | 慧航(江西)数字科技有限公司 | 基于3d高斯溅射的三维场景实时渲染优化系统及方法 |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10911787B2 (en) * | 2018-07-10 | 2021-02-02 | Apple Inc. | Hierarchical point cloud compression |
| US11010931B2 (en) * | 2018-10-02 | 2021-05-18 | Tencent America LLC | Method and apparatus for video coding |
| WO2020072665A1 (en) * | 2018-10-02 | 2020-04-09 | Futurewei Technologies, Inc. | Hierarchical tree attribute coding in point cloud coding |
| JP7467646B2 (ja) * | 2020-06-24 | 2024-04-15 | 中興通訊股▲ふん▼有限公司 | 3次元コンテンツ処理方法および装置 |
| EP4193602A1 (en) * | 2020-08-07 | 2023-06-14 | Vid Scale, Inc. | Tile tracks for geometry-based point cloud data |
| WO2022220278A1 (ja) * | 2021-04-14 | 2022-10-20 | ソニーグループ株式会社 | 情報処理装置および方法 |
| CN114898043A (zh) * | 2022-05-19 | 2022-08-12 | 中国南方电网有限责任公司超高压输电公司检修试验中心 | 一种激光点云数据瓦片构建方法 |
- 2023
- 2023-04-11 CN CN202310383435.8A patent/CN118334198A/zh active Pending
- 2023-04-11 CN CN202310383151.9A patent/CN118334195A/zh active Pending
- 2023-04-11 CN CN202310383395.7A patent/CN118334197A/zh active Pending
- 2023-04-11 CN CN202310383168.4A patent/CN118338024A/zh active Pending
- 2023-04-11 CN CN202310383368.XA patent/CN118334196A/zh active Pending
- 2023-04-27 CN CN202310474368.0A patent/CN118334200A/zh active Pending
- 2023-04-27 CN CN202310479017.9A patent/CN118334137A/zh active Pending
- 2023-04-27 CN CN202310479071.3A patent/CN118334139A/zh active Pending
- 2023-04-27 CN CN202310479050.1A patent/CN118334138A/zh active Pending
- 2023-04-27 CN CN202310474240.4A patent/CN118334199A/zh active Pending
- 2023-06-01 KR KR1020257003775A patent/KR20250034972A/ko active Pending
- 2023-06-01 EP EP23915515.3A patent/EP4557723A4/en active Pending
- 2023-06-01 WO PCT/CN2023/097873 patent/WO2024148751A1/zh not_active Ceased
- 2023-06-01 CN CN202380034331.4A patent/CN119013965A/zh active Pending
- 2023-06-01 JP JP2025507589A patent/JP2025529756A/ja active Pending
- 2023-06-20 CN CN202310735525.9A patent/CN118334202A/zh active Pending
- 2023-06-20 CN CN202310735430.7A patent/CN118338095A/zh active Pending
- 2023-06-20 CN CN202310735434.5A patent/CN118334201A/zh active Pending
- 2023-06-20 CN CN202310738668.5A patent/CN118334140A/zh active Pending
- 2023-06-20 CN CN202310738680.6A patent/CN118338025A/zh active Pending
- 2023-09-20 WO PCT/CN2023/120131 patent/WO2024148849A1/zh not_active Ceased
- 2023-09-20 CN CN202380091041.3A patent/CN120500859A/zh active Pending
- 2025
- 2025-01-22 US US19/033,804 patent/US20250166234A1/en active Pending
Patent Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190251737A1 (en) * | 2017-05-31 | 2019-08-15 | Verizon Patent And Licensing Inc. | Methods and Systems for Rendering Frames Based on a Virtual Entity Description Frame of a Virtual Scene |
| CN114747219A (zh) * | 2019-10-02 | 2022-07-12 | 诺基亚技术有限公司 | 用于存储和信令传送子样本条目描述的方法和装置 |
| CN112492385A (zh) * | 2020-09-30 | 2021-03-12 | 中兴通讯股份有限公司 | 点云数据处理方法、装置、存储介质及电子装置 |
| CN112700550A (zh) * | 2021-01-06 | 2021-04-23 | 中兴通讯股份有限公司 | 三维点云数据处理方法、装置、存储介质及电子装置 |
| CN115315943A (zh) * | 2021-01-06 | 2022-11-08 | 腾讯美国有限责任公司 | 用于媒体场景描述的方法和设备 |
| CN115396646A (zh) * | 2022-08-22 | 2022-11-25 | 腾讯科技(深圳)有限公司 | 一种点云媒体的数据处理方法及相关设备 |
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4557723A4 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN118334139A (zh) | 2024-07-12 |
| WO2024148849A1 (zh) | 2024-07-18 |
| CN118334196A (zh) | 2024-07-12 |
| KR20250034972A (ko) | 2025-03-11 |
| CN118338024A (zh) | 2024-07-12 |
| EP4557723A1 (en) | 2025-05-21 |
| CN118334199A (zh) | 2024-07-12 |
| CN120500859A (zh) | 2025-08-15 |
| CN118334140A (zh) | 2024-07-12 |
| CN118334198A (zh) | 2024-07-12 |
| CN119013965A (zh) | 2024-11-22 |
| CN118334200A (zh) | 2024-07-12 |
| CN118334195A (zh) | 2024-07-12 |
| EP4557723A4 (en) | 2025-12-31 |
| CN118334202A (zh) | 2024-07-12 |
| US20250166234A1 (en) | 2025-05-22 |
| CN118334201A (zh) | 2024-07-12 |
| CN118338025A (zh) | 2024-07-12 |
| CN118334137A (zh) | 2024-07-12 |
| JP2025529756A (ja) | 2025-09-09 |
| CN118334197A (zh) | 2024-07-12 |
| CN118334138A (zh) | 2024-07-12 |
| CN118338095A (zh) | 2024-07-12 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US20250166234A1 (en) | Method and apparatus for generating scene description document | |
| WO2024037247A1 (zh) | 一种点云媒体的数据处理方法及相关设备 | |
| CN118334203A (zh) | 缓存管理方法及装置 | |
| WO2024149117A1 (zh) | 场景描述文件的生成方法及装置 | |
| CN118334211A (zh) | 一种场景描述文件的生成方法及装置 | |
| WO2025012275A1 (en) | Decoder, encoder, system, data stream, method and computer program for nn rendering in scenes based on an anchoring information | |
| CN121353476A (zh) | 数字人的动画方法及装置 | |
| CN121304880A (zh) | 三维场景的场景数据的处理方法及装置 | |
| CN121304878A (zh) | 场景描述文件的生成方法及装置 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23915515; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 202380034331.4; Country of ref document: CN |
| | REG | Reference to national code | Ref country code: BR; Ref legal event code: B01A; Ref document number: 112024021335; Country of ref document: BR |
| | ENP | Entry into the national phase | Ref document number: 20257003775; Country of ref document: KR; Kind code of ref document: A |
| | WWE | Wipo information: entry into national phase | Ref document number: 1020257003775; Country of ref document: KR |
| | WWE | Wipo information: entry into national phase | Ref document number: 2025507589; Country of ref document: JP |
| | WWE | Wipo information: entry into national phase | Ref document number: 2023915515; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2023915515; Country of ref document: EP; Effective date: 20250214 |
| | WWP | Wipo information: published in national office | Ref document number: 1020257003775; Country of ref document: KR |
| | WWE | Wipo information: entry into national phase | Ref document number: 202517032074; Country of ref document: IN |
| | WWP | Wipo information: published in national office | Ref document number: 202517032074; Country of ref document: IN |
| | WWP | Wipo information: published in national office | Ref document number: 2023915515; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 112024021335; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20241014 |
| | NENP | Non-entry into the national phase | Ref country code: DE |