WO2026005485A1 - Mesh data encoding device, mesh data encoding method, mesh data decoding device, and mesh data decoding method
- Publication number
- WO2026005485A1 (PCT/KR2025/008911)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- mesh
- attribute
- information
- encoding
- bitstream
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—Three-dimensional [3D] image rendering
- G06T15/04—Texture mapping
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/119—Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/91—Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
Definitions
- a point cloud is a collection of points in 3D space.
- the sheer number of points in 3D space makes it difficult to generate point cloud data.
- the technical problem according to the embodiments is to provide a point cloud data transmission device, transmission method, point cloud data reception device, and reception method for resolving latency and encoding/decoding complexity.
- the point cloud data transmission method, transmission device, point cloud data reception method, and reception device can provide a high-quality point cloud service.
- the point cloud data transmission method, transmission device, point cloud data reception method, and reception device can achieve various video codec methods.
- Figure 3 shows a V-MESH compression method according to embodiments.
- FIG. 4 illustrates pre-processing of V-MESH compression according to embodiments.
- Figure 5 illustrates a mid-edge subdivision method according to embodiments.
- Figure 6 shows a displacement generation process according to embodiments.
- Figure 8 shows a lifting transform process for displacement according to embodiments.
- Fig. 10 shows an attribute transfer process of a V-MESH compression method according to embodiments.
- Figure 11 illustrates a V-DMC decoding process according to embodiments.
- Figure 12 illustrates a V-DMC encoding process according to embodiments.
- Figure 14 is a drawing for explaining dynamic mesh data having multiple attribute information according to embodiments.
- Figure 15 is a drawing for explaining dynamic mesh data having multiple attribute information according to embodiments.
- FIG. 16 is a diagram for explaining a method for compressing attribute information of mesh data having a plurality of attribute information according to embodiments.
- FIG. 17 is a diagram for explaining a method of selecting a representative texture map by calculating a block-by-block correlation between texture maps according to embodiments.
- Figure 18 shows an encoding method according to embodiments.
- Figure 19 shows a decoding method according to embodiments.
- Figure 1 illustrates a V-DMC based encoder and decoder according to embodiments.
- the basic structure of V-DMC (V-Mesh), which is currently under development, is shown in Figure 1.
- the encoder and decoder according to Figure 1 perform the encoding and decoding process of media representing dynamic meshes using V3C technology.
- the preprocessor converts the input dynamic mesh representation into several V3C components: a base mesh, a set of displacements, a 2D representation of attributes, and an atlas.
- the original mesh is simplified into a base mesh.
- the base mesh can be encoded using any mesh codec.
- Displacement vectors can be encoded into V3C geometry video components using any video codec, which is indicated either by a profile or by SEI messages. For example, depending on the profile, the displacement vectors (displacement data) can instead be encoded using arithmetic coding.
- the attribute data can include additional attributes.
- texture or material information can be included as additional attributes and can be encoded using any video codec.
- the atlas data contains information on how to perform inverse reconstruction and is provided to the V3C decoding and/or rendering system.
- atlas data may include how to perform subdivision of a base mesh, how to apply displacement vectors to subdivided mesh vertices, how to apply attributes to the reconstructed mesh, etc.
- the encoder may comprise a memory and at least one processor connected to the memory.
- the at least one processor may be configured to perform operations such as a preprocessor, an atlas encoder, a basemesh encoder, a displacement vector encoder, a video encoder, and a multiplexer.
- the atlas encoding unit encodes the atlas of mesh data to generate an atlas bitstream.
- the basemesh encoding unit encodes the basemesh of mesh data to generate a basemesh bitstream.
- the displacement vector encoding unit encodes the displacement vector of mesh data to generate a displacement vector bitstream.
- the video encoding unit encodes the properties (attributes) of mesh data to generate an attribute bitstream.
- the encoder generates parameter information (which may be referred to as signaling information, metadata, etc.) related to each encoding.
- the encoder may generate a bitstream including parameter information, atlas, basemesh, displacement vector, and/or attribute.
- the decoder may be configured with a memory and at least one processor connected to the memory.
- the at least one processor may be configured to perform operations such as a demultiplexing unit, an atlas decoding unit, a basemesh decoding unit, a displacement vector decoding unit, and a video decoding unit.
- the atlas decoding unit decodes the atlas within the bitstream.
- the basemesh decoding unit decodes the basemesh within the bitstream.
- the displacement vector decoding unit decodes the displacement vector within the bitstream.
- the video decoding unit decodes the attributes within the bitstream.
- the decoder can perform each decoding operation based on parameter information within the bitstream.
- the decoder can reconstruct dynamic mesh data based on the atlas, displacement vector, attributes, and base mesh.
- Figure 2 illustrates a system for providing dynamic mesh content according to embodiments.
- the system of FIG. 2 includes a point cloud data transmission device (100) and a point cloud data reception device (110) according to embodiments.
- the point cloud data transmission device may include a dynamic mesh video acquisition unit (101), a dynamic mesh video encoder (102), a file/segment encapsulator (103), and a transmitter (104).
- the point cloud data reception device (110) may include a reception unit (111), a file/segment decapsulator (112), a dynamic mesh video decoder (113), and a renderer (114).
- Each component of FIG. 1 may correspond to hardware, software, a processor, and/or a combination thereof.
- the point cloud data transmission device may be interpreted as a term referring to the transmission device (100) or a dynamic mesh video encoder (hereinafter, referred to as an encoder) (102).
- the point cloud data receiving device according to the embodiments may be interpreted as a term referring to a receiving device (110) or a dynamic mesh video decoder (hereinafter, decoder) (113).
- the system of FIG. 2 can perform video-based dynamic mesh compression and decompression.
- 3D content increasingly represents objects with greater precision and realism, enabling users to enjoy immersive experiences.
- 3D meshes are widely used for efficient data utilization and realistic object representation. Embodiments include a series of processing steps in a system that utilizes such mesh content.
- V-PCC: Video-based point cloud compression
- Point cloud data is data that contains color information at the vertex coordinates (X, Y, Z).
- Mesh data refers to data in which connectivity information between vertices is added to this vertex information.
- When creating content, it can be created in the form of mesh data from the beginning.
- Alternatively, by adding connectivity information to point cloud data, it can be converted into mesh data and used.
- the MPEG standards body defines two types of dynamic mesh data:
- Category 1: Mesh data with texture maps as color information.
- Category 2: Mesh data with vertex colors as color information.
- Mesh coding standards for Category 1 data are currently under development, and work on Category 2 data standards is also planned for the future.
- the overall process for providing mesh content services may include acquisition, encoding, transmission, decoding, rendering, and/or feedback, as shown in Figure 1.
- 3D data acquired through multiple cameras or specialized cameras can be processed into mesh data types through a series of processes and then converted into video.
- the generated mesh video is then transmitted through a series of processes, and the receiving end can reprocess the received data into mesh video and render it. This allows mesh video to be presented to users, who can then interact with the mesh content as they intend.
- a mesh compression system may include a transmitting device and a receiving device.
- the transmitting device can encode mesh video to output a bitstream, which can be delivered to the receiving device via digital storage media or a network in the form of a file or streaming segment.
- the digital storage media may include various storage media, such as USB, SD, CD, DVD, Blu-ray, HDD, or SSD.
- the transmitting device may roughly include a mesh video acquisition unit, a mesh video encoder, and a transmitting unit.
- the receiving device may roughly include a receiving unit, a mesh video decoder, and a renderer.
- the encoder may be referred to as a mesh video/video/picture/frame encoding device, and the decoder may be referred to as a mesh video/video/picture/frame decoding device.
- the transmitter may be included in the mesh video encoder.
- the receiver may be included in the mesh video decoder.
- the renderer may include a display unit, and the renderer and/or the display unit may be configured as separate devices or external components.
- the transmitting device and the receiving device may further include separate internal or external modules/units/components for a feedback process.
- Mesh data represents the surface of an object as a number of polygons. Each polygon is defined by its vertices in 3D space and connection information that describes how those vertices are connected. It can also contain vertex properties such as vertex color and normal. Mapping information that allows the surface of the mesh to be mapped to a 2D planar area can also be included as a mesh property. The mapping is typically described as a set of parametric coordinates, called UV coordinates or texture coordinates, associated with the mesh vertices. Meshes contain 2D attribute maps, which can be used to store high-resolution attribute information such as textures, normals, and displacement.
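The mesh description above can be sketched as a small container type. This is a minimal illustration, assuming triangle faces and NumPy arrays; the class name `Mesh` and its field names are hypothetical and do not come from the patent or from any V-Mesh software.

```python
from dataclasses import dataclass
from typing import Optional

import numpy as np


@dataclass
class Mesh:
    """Hypothetical container for one mesh frame (all names are illustrative)."""
    positions: np.ndarray                  # (V, 3) vertex coordinates in 3D space
    faces: np.ndarray                      # (F, 3) vertex indices per triangle (connectivity)
    uvs: Optional[np.ndarray] = None       # (T, 2) texture (UV) coordinates
    face_uvs: Optional[np.ndarray] = None  # (F, 3) UV indices per triangle
    colors: Optional[np.ndarray] = None    # (V, 3) optional per-vertex colors


def to_point_cloud(mesh: Mesh):
    """Drop the connectivity: what remains is point cloud data (positions and colors)."""
    return mesh.positions, mesh.colors
```

Dropping `faces` from such a structure leaves point cloud data, which mirrors the statement that mesh data is vertex information plus connectivity information.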
- the mesh video acquisition unit may include processing 3D object data acquired through a camera, etc. into a mesh data type with the properties described above through a series of processes and generating a video composed of such mesh data.
- the mesh video may have properties of the mesh, such as vertices, polygons, connection information between vertices, colors, normals, etc., that may change over time.
- a mesh video with properties and connection information that change over time can be expressed as a dynamic mesh video.
- a mesh video encoder can encode an input mesh video into one or more video streams.
- a single video can include multiple frames, and a single frame can correspond to a still image/picture.
- a mesh video can include a mesh image/frame/picture, and the mesh video can be used interchangeably with the mesh image/frame/picture.
- a mesh video encoder can perform a Video-based Dynamic Mesh (V-Mesh) Compression procedure.
- V-Mesh: Video-based Dynamic Mesh
- a mesh video encoder can perform a series of procedures such as prediction, transformation, quantization, and entropy coding for compression and coding efficiency.
- the encoded data (encoded video/image information) can be output in the form of a bitstream.
- the encapsulation processing unit can encapsulate encoded mesh video data and/or mesh video-related metadata in the form of a file, etc.
- the mesh video-related metadata may be received from the metadata processing unit, etc.
- the metadata processing unit may be included in the mesh video encoder, or may be configured as a separate component/module.
- the encapsulation processing unit may encapsulate the corresponding data in a file format such as ISOBMFF, or process it in the form of other DASH segments, etc.
- the encapsulation processing unit may include mesh video-related metadata in the file format according to an embodiment.
- the mesh video metadata may be included in boxes at various levels in the ISOBMFF file format, for example, or may be included as data in a separate track within the file.
- the encapsulation processing unit may encapsulate mesh video-related metadata itself in a file.
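As an illustration of the box structure mentioned above, the following sketch walks the top-level boxes of an ISOBMFF byte string. It relies only on the generic box header layout (32-bit big-endian size, four-character type, optional 64-bit size); the helper names `iter_boxes` and `make_box` are hypothetical, and the sketch does not model any mesh-specific box defined by the embodiments.

```python
import struct


def iter_boxes(data: bytes):
    """Yield (box_type, payload) for each top-level box in an ISOBMFF byte string."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box extends to the end of the data.
            size = len(data) - offset
        if size < header or offset + size > len(data):
            raise ValueError("malformed box")
        yield box_type.decode("ascii"), data[offset + header:offset + size]
        offset += size


def make_box(box_type: str, payload: bytes) -> bytes:
    """Build a box with a 32-bit size field (helper for the example only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("ascii")) + payload
```

A metadata parser on the receiving side could use such a walk to locate the boxes or tracks that carry mesh video-related metadata before handing them to the metadata processing unit.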
- the transmission processing unit can process encapsulated mesh video data for transmission according to the file format.
- the transmission processing unit can be included in the transmission unit, or can be configured as a separate component/module.
- the transmission processing unit can process mesh video data according to any transmission protocol.
- the processing for transmission can include processing for transmission through a broadcast network or processing for transmission through broadband.
- the transmission processing unit can receive not only mesh video data but also mesh video-related metadata from the metadata processing unit and process the same for transmission.
- the transmission unit can transmit encoded video/image information or data output in the form of a bitstream to the receiving unit of a receiving device via a digital storage medium or network in the form of a file or streaming.
- the digital storage medium can include various storage media such as USB, SD, CD, DVD, Blu-ray, HDD, SSD, etc.
- the transmission unit can include an element for generating a media file via a predetermined file format and can include an element for transmission via a broadcasting/communication network.
- the receiving unit can extract the bitstream and transmit it to a decoding device.
- the receiver can receive mesh video data transmitted by a mesh video transmission device. Depending on the transmission channel, the receiver can receive mesh video data via a broadcast network, via broadband, or via digital storage media.
- the receiving processing unit can perform processing on the received mesh video data according to a transmission protocol.
- the receiving processing unit can be included in the receiving unit, or can be configured as a separate component/module.
- the receiving processing unit can perform the reverse process of the aforementioned transmitting processing unit.
- the receiving processing unit can transfer the acquired mesh video data to the decapsulation processing unit, and transfer the acquired mesh video-related metadata to a metadata parser.
- the mesh video-related metadata acquired by the receiving processing unit can be in the form of a signaling table.
- a decapsulation processing unit can decapsulate mesh video data in file format received from a receiving processing unit.
- the decapsulation processing unit can decapsulate files according to ISOBMFF, etc., to obtain a mesh video bitstream or mesh video-related metadata (metadata bitstream).
- the obtained mesh video bitstream can be transmitted to a mesh video decoder, and the obtained mesh video-related metadata (metadata bitstream) can be transmitted to a metadata processing unit.
- the mesh video bitstream may include metadata (metadata bitstream).
- the metadata processing unit may be included in the mesh video decoder, or may be configured as a separate component/module.
- the mesh video-related metadata obtained by the decapsulation processing unit may be in the form of a box or track within a file format. If necessary, the decapsulation processing unit may receive metadata required for decapsulation from the metadata processing unit.
- Mesh video related metadata can be passed to a Mesh video decoder for use in the Mesh video decoding process, or passed to a renderer for use in the Mesh video rendering process.
- a mesh video decoder can receive a bitstream and perform operations corresponding to those of a mesh video encoder to decode video/images.
- the decoded mesh video can be displayed via a display unit. Users can view all or part of the rendered result via a VR/AR display or a general display.
- the feedback process may include a process of transmitting various feedback information that may be acquired during the rendering/display process to the transmitter or to the decoder on the receiver. Interactivity may be provided in mesh video consumption through the feedback process. Depending on the embodiment, head orientation information, viewport information indicating the area that the user is currently viewing, etc. may be transmitted during the feedback process. Depending on the embodiment, the user may interact with things implemented in the VR/AR/MR/autonomous driving environment, in which case information related to the interaction may be transmitted to the transmitter or the service provider during the feedback process. Depending on the embodiment, the feedback process may not be performed.
- Head orientation information can refer to information about the user's head position, angle, and movement. Based on this information, information about the area the user is currently viewing within the mesh video, i.e. viewport information, can be calculated.
- Viewport information can be information about the area the user is currently viewing in the mesh video. This can be used to perform gaze analysis to determine how the user consumes the mesh video, which area of the mesh video they are gazing at, and for how long. Gaze analysis can be performed on the receiving side and transmitted to the transmitting side through a feedback channel.
- Devices such as VR/AR/MR displays can extract the viewport area based on the user's head position/orientation, the vertical or horizontal FOV supported by the device, etc.
- the aforementioned feedback information may not only be transmitted to the transmitter but may also be consumed by the receiver. That is, the aforementioned feedback information may be utilized to perform decoding, rendering, and other processes on the receiver. For example, head orientation information and/or viewport information may be utilized to preferentially decode and render only the mesh video for the area currently being viewed by the user.
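The viewport test implied above can be sketched as follows. This is a simplified angular check, not an exact view-frustum test, and it makes several assumptions that are not in the source: yaw is measured about the y axis with 0 degrees looking along +z, pitch is positive upward, and the function name `in_viewport` is hypothetical.

```python
import math


def in_viewport(point, head_pos, yaw_deg, pitch_deg, h_fov_deg, v_fov_deg):
    """Return True if `point` lies within the angular viewport of the user.

    Assumed convention: yaw about the y axis (0 = looking along +z),
    pitch positive upward, angles in degrees.
    """
    dx, dy, dz = (p - h for p, h in zip(point, head_pos))
    point_yaw = math.degrees(math.atan2(dx, dz))
    point_pitch = math.degrees(math.atan2(dy, math.hypot(dx, dz)))
    # Wrap the yaw difference into [-180, 180) so that 359 and 1 degree are close.
    d_yaw = (point_yaw - yaw_deg + 180.0) % 360.0 - 180.0
    d_pitch = point_pitch - pitch_deg
    return abs(d_yaw) <= h_fov_deg / 2 and abs(d_pitch) <= v_fov_deg / 2
```

A receiver could apply such a test to sub-mesh centers to decide which parts of the mesh video to decode and render first.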
- Dynamic mesh video compression is a method for processing mesh connection information and properties that change over time, and it can perform lossy and lossless compression for various applications such as real-time communication, storage, free-view video, and AR/VR.
- the dynamic mesh video compression method described below is based on MPEG's V-Mesh method.
- a picture/frame can generally mean a unit representing one image at a specific time instant.
- a pixel or pel can refer to the smallest unit that constitutes a picture (or image). Additionally, the term "sample" can be used as a counterpart to a pixel.
- a sample can generally represent a pixel or a pixel value, and can represent only the pixel/pixel value of the luma component, only the pixel/pixel value of the chroma component, or only the pixel/pixel value of the depth component.
- a unit may represent a basic unit of image processing.
- a unit may include at least one of a specific region of a picture and information related to the region.
- the term "unit" may be used interchangeably with terms such as "block" or "area."
- an MxN block may include a set (or array) of samples (or sample array) or transform coefficients consisting of M columns and N rows.
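To make the M-columns-by-N-rows convention concrete, the sketch below splits a 2D sample array into MxN blocks with NumPy. The function name `split_into_blocks` is hypothetical, and the sketch assumes the picture dimensions are exact multiples of the block size.

```python
import numpy as np


def split_into_blocks(picture: np.ndarray, m: int, n: int) -> np.ndarray:
    """Split a 2D sample array into blocks of M columns by N rows.

    Returns an array of shape (rows // n, cols // m, n, m), so that
    result[i, j] is the block in block-row i and block-column j.
    """
    rows, cols = picture.shape
    if rows % n or cols % m:
        raise ValueError("picture size must be a multiple of the block size")
    return picture.reshape(rows // n, n, cols // m, m).swapaxes(1, 2)
```

Note that a NumPy array of an MxN block has shape `(N, M)`, because NumPy indexes rows first.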
- Video-based dynamic mesh compression (V-Mesh) can provide a method for compressing dynamic mesh video data based on 2D video codecs such as HEVC and VVC.
- the V-Mesh compression process receives the following data as input and performs compression.
- Input mesh: Contains the 3D coordinates (geometry) of the vertices that make up the mesh, normal information for each vertex, mapping information that maps the mesh surface to a 2D plane, and connection information between the vertices that make up the surface.
- the mesh surface can be expressed as triangles or higher-order polygons, and connection information between the vertices that make up each face is stored according to the chosen face shape.
- the input mesh can be saved in the OBJ file format.
- Attribute map (hereinafter used interchangeably with texture map): Contains information about the properties (color, normal, displacement, etc.) of the mesh, stored as a mapping of the mesh surface onto a 2D image. Which part of the mesh (surface or vertex) each item of attribute map data corresponds to is determined by the mapping information contained in the input mesh. Since there is an attribute map for each frame of the mesh video, it can also be expressed as an attribute map video (or attribute for short).
- the attribute map in the V-Mesh compression method mainly contains the color information of the mesh, and is saved in an image file format (PNG, BMP, etc.).
- Material library file: Contains information about the material properties used in a mesh, particularly information that links the input mesh to its corresponding attribute map. It is saved in the Wavefront Material Template Library (MTL) file format.
- MTL: Wavefront Material Template Library
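The input files described above can be read with a few lines of code. The following is a minimal sketch of an OBJ reader that handles only `v`, `vt`, `f`, and `mtllib` records with positive 1-based indices; the function name `parse_obj` and the returned dictionary keys are hypothetical, and a production reader would also need to handle normals, negative indices, groups, and `usemtl`.

```python
def parse_obj(text: str) -> dict:
    """Parse a small subset of the Wavefront OBJ format from a string."""
    positions, uvs, faces, face_uvs, mtllib = [], [], [], [], None
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        tag, args = parts[0], parts[1:]
        if tag == "v":        # vertex position (geometry)
            positions.append(tuple(float(a) for a in args[:3]))
        elif tag == "vt":     # texture coordinate (mapping information)
            uvs.append(tuple(float(a) for a in args[:2]))
        elif tag == "f":      # face: corners written as v, v/vt, or v/vt/vn
            corners = [a.split("/") for a in args]
            faces.append(tuple(int(c[0]) - 1 for c in corners))  # OBJ is 1-based
            if all(len(c) > 1 and c[1] for c in corners):
                face_uvs.append(tuple(int(c[1]) - 1 for c in corners))
        elif tag == "mtllib":  # link to the material library (MTL) file
            mtllib = args[0]
    return {"positions": positions, "uvs": uvs, "faces": faces,
            "face_uvs": face_uvs, "mtllib": mtllib}
```

The `mtllib` record is what ties the mesh to its MTL file, which in turn names the attribute map image.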
- the following data and information can be generated through the compression process.
- Base mesh: The input mesh is simplified (decimated) through preprocessing so that the objects of the input mesh are expressed with the minimum number of vertices allowed by user-defined criteria.
- Displacement: Displacement information used to make the base mesh represent the input mesh as closely as possible, expressed in the form of 3D coordinates.
- Atlas information: The metadata required to reconstruct a mesh from the base mesh, displacement, and attribute map information. It can be created and used in units of sub-meshes (such as patches) that make up the mesh.
- Referring to FIGS. 3 to 7, a method for encoding mesh position information (vertices) is described, and referring to FIGS. 7 to 10, etc., a method for encoding attribute information (attribute map) by restoring mesh position information is described.
- Figure 3 shows a V-MESH compression method according to embodiments.
- Fig. 3 illustrates the encoding process of Fig. 2, and the encoding process may include a pre-processing process and an encoding process.
- the encoder of Fig. 2 may include a pre-processor (200) and an encoder (201) as in Fig. 3.
- the transmitting device of Fig. 2 may be broadly referred to as an encoder, and the dynamic mesh video encoder of Fig. 2 may be referred to as an encoder.
- the V-Mesh compression method may include a pre-processing (200) and an encoding (201) process as in Fig. 3.
- the pre-processor of Fig. 3 may be located in front of the encoder of Fig. 3.
- the pre-processor and the encoder of Fig. 3 may be referred to as a single encoder.
- the preprocessor can receive a static or dynamic mesh and/or an attribute map.
- the preprocessor can generate a base mesh and/or displacement through preprocessing.
- the preprocessor can receive feedback information from the encoder and generate the base mesh and/or displacement based on the feedback information.
- the encoder can receive a base mesh, displacement, static or dynamic mesh, and/or attribute map.
- the encoder can encode mesh-related data to generate a compressed bitstream.
- FIG. 4 illustrates pre-processing of V-MESH compression according to embodiments.
- Figure 4 shows the configuration and operation of the pre-processor of Figure 3.
- Fig. 3 shows a process of performing preprocessing on an input mesh.
- the preprocessing process (200) can be broadly divided into four steps: 1) Group of Frames (GoF) generation, 2) mesh decimation, 3) UV parameterization, and 4) fitting subdivision surface (300).
- the preprocessor (200) can receive an input mesh, generate a displacement and/or base mesh, and transmit the generated displacement and/or base mesh to the encoder (201).
- the preprocessor (200) can transmit GoF information related to GoF generation to the encoder (201).
- GoF generation: This is the process of generating a reference structure for mesh data. If the number of vertices, number of texture coordinates, vertex connection information, and texture coordinate connection information of the mesh of the previous frame and the current mesh are all the same, the previous frame can be set as the reference frame. In other words, if only the vertex coordinate values differ between the current input mesh and the reference input mesh, inter-frame encoding can be performed. Otherwise, the frame is intra-frame encoded.
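The reference-frame rule above can be sketched as a comparison of two frames. The dictionary keys (`positions`, `uvs`, `faces`, `face_uvs`) and the function name `choose_coding_mode` are hypothetical; the sketch only encodes the four equality conditions stated in the text.

```python
def choose_coding_mode(prev, cur) -> str:
    """Return "inter" if the previous frame can serve as reference, else "intra".

    Each frame is a dict with keys "positions", "uvs", "faces", and "face_uvs"
    (hypothetical layout). Only vertex coordinate values may differ for "inter".
    """
    if prev is None:
        return "intra"  # the first frame has no reference
    same_topology = (
        len(prev["positions"]) == len(cur["positions"])  # number of vertices
        and len(prev["uvs"]) == len(cur["uvs"])          # number of texture coordinates
        and prev["faces"] == cur["faces"]                # vertex connection information
        and prev["face_uvs"] == cur["face_uvs"]          # texture coordinate connection information
    )
    return "inter" if same_topology else "intra"
```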
- Mesh decimation: The (voxelized) input mesh, the target triangle ratio (TTR), and the minimum triangle component (CCCount) are passed as input, and the simplified (decimated) mesh is obtained as output.
- connected triangle components smaller than the set minimum triangle component (CCCount) can be removed.
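The removal of small connected components can be sketched with a union-find structure. This sketch assumes that two triangles belong to the same component when they share a vertex and that CCCount is a minimum number of triangles per component; both are assumptions for illustration, and the function name `remove_small_components` is hypothetical.

```python
from collections import Counter


def remove_small_components(faces, min_triangles):
    """Drop connected components that have fewer than `min_triangles` triangles."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # The three vertices of a triangle always belong to the same component.
    for a, b, c in faces:
        union(a, b)
        union(b, c)

    roots = [find(f[0]) for f in faces]
    sizes = Counter(roots)  # number of triangles per component
    return [f for f, r in zip(faces, roots) if sizes[r] >= min_triangles]
```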
- UV parameterization: This is the process of mapping the 3D surface of a decimated mesh into the texture domain. Parameterization can be performed using the UVAtlas tool. This process generates mapping information, which indicates where each vertex of the decimated mesh is mapped to on a 2D image. This mapping information is expressed and stored as texture coordinates, and through this process, the final base mesh is created.
- OrthoAtlas technology generates texture coordinates using orthographic projection. OrthoAtlas technology sequentially generates patches and packs them. First, adjacent triangles are divided to generate Connected Components (CCs), and then the optimal CCs are merged using a cost function to generate patches. The cost function can measure the cost based on the degree of distortion that occurs when orthogonally projecting patches in each direction. By packing the patch that minimizes the cost function into the texture domain, the texture coordinates can be ultimately calculated. In the case of orthoAtlas technology, texture coordinates and texture connection information can be derived from the base mesh decoder without compressing them during the base mesh encoding process.
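The idea of a projection cost can be illustrated with a toy measure. The text does not give the actual OrthoAtlas cost function, so the following is only an assumed stand-in: the cost of projecting a triangle along an axis is taken as one minus the ratio of projected area to 3D area, and the function names `projection_cost` and `best_axis` are hypothetical.

```python
import numpy as np

AXES = np.eye(3)  # candidate projection directions: x, y, z (assumption)


def projection_cost(triangle, axis) -> float:
    """Toy distortion cost: 0 when the triangle faces the axis, 1 when edge-on."""
    a, b, c = np.asarray(triangle, dtype=float)
    normal = np.cross(b - a, c - a)        # length equals twice the 3D area
    area = np.linalg.norm(normal) / 2.0
    if area == 0.0:
        return 1.0                          # degenerate triangle
    projected_area = abs(normal @ axis) / 2.0
    return 1.0 - projected_area / area


def best_axis(triangle) -> int:
    """Index of the axis with the lowest toy projection cost."""
    return int(np.argmin([projection_cost(triangle, ax) for ax in AXES]))
```

Under this toy measure, a patch would be projected along the direction in which its triangles lose the least area.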
- Fitting subdivision surface: This is the process of performing subdivision on a simplified mesh.
- the subdivision method can be a user-defined method, such as the mid-edge method.
- the fitting process ensures that the input mesh and the subdivision mesh are similar to each other.
- User-defined subdivision methods, such as the mid-edge method (Figure 5), the Loop method, or the LS3 method, can be applied.
- Figure 5 illustrates a mid-edge subdivision method according to embodiments.
- Figure 5 illustrates the mid-edge method of the fitting subdivision surface described in Figure 4.
- an original mesh containing four vertices is subdivided to create a sub-mesh.
- a sub-mesh can be created by creating a new vertex midway between the edges between the vertices.
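One iteration of mid-edge subdivision on a triangle mesh can be sketched as follows. The function name `midpoint_subdivide` is hypothetical; the sketch inserts one new vertex at the midpoint of every edge, shares that vertex between adjacent triangles, and splits each triangle into four.

```python
def midpoint_subdivide(positions, faces):
    """Return (new_positions, new_faces) after one mid-edge subdivision step."""
    positions = [tuple(p) for p in positions]
    edge_mid = {}  # maps an undirected edge (i, j) to its midpoint vertex index

    def midpoint(i, j):
        key = (min(i, j), max(i, j))
        if key not in edge_mid:
            pi, pj = positions[i], positions[j]
            positions.append(tuple((a + b) / 2 for a, b in zip(pi, pj)))
            edge_mid[key] = len(positions) - 1
        return edge_mid[key]

    new_faces = []
    for a, b, c in faces:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        # Three corner triangles plus the central triangle.
        new_faces += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return positions, new_faces
```

For a four-vertex mesh made of two triangles, one step yields nine vertices and eight triangles, because the two triangles share one of the five edges.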
- the result of the fitting process is a fitted subdivided mesh.
- displacement is calculated using this result and a previously compressed and decoded base mesh (hereinafter referred to as a reconstructed base mesh). That is, the reconstructed base mesh is subdivided in the same way as in the fitting subdivision surface step.
- the difference in position of each vertex between this result and the fitted subdivided mesh is the displacement for each vertex. Since displacement represents a position difference in three-dimensional space, it is also expressed as a value in the (x, y, z) space of a Cartesian coordinate system.
- the (x, y, z) coordinate values can be converted to (normal, tangential, bi-tangential) coordinate values of the local coordinate system.
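The conversion to a local coordinate system can be sketched as a projection onto an orthonormal frame built from the vertex normal. How the tangent is chosen is not specified in the text, so the helper-axis construction below is an assumption, and the function names `local_frame`, `to_local`, and `to_global` are hypothetical.

```python
import numpy as np


def local_frame(normal):
    """Build an orthonormal (normal, tangent, bitangent) frame from a normal."""
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    # Pick a helper axis that is not nearly parallel to the normal (assumption).
    helper = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    t = np.cross(helper, n)
    t = t / np.linalg.norm(t)
    b = np.cross(n, t)
    return n, t, b


def to_local(displacement, normal):
    """Express an (x, y, z) displacement as (normal, tangential, bi-tangential)."""
    d = np.asarray(displacement, dtype=float)
    n, t, b = local_frame(normal)
    return np.array([d @ n, d @ t, d @ b])


def to_global(local, normal):
    """Inverse of to_local: rebuild the (x, y, z) displacement."""
    n, t, b = local_frame(normal)
    return local[0] * n + local[1] * t + local[2] * b
```

In this representation, a displacement that moves a vertex straight along its normal has only its first component non-zero.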
- Figure 6 shows a displacement generation process according to embodiments.
- FIG. 6 illustrates in detail the displacement calculation method of the fitting subdivision surface (300) as described in FIG. 5.
- An encoder and/or pre-processor may include 1) a subdivision unit, 2) a local coordinate system calculation unit, and 3) a displacement calculation unit.
- the subdivision unit may receive a reconstructed base mesh and generate a subdivided reconstructed base mesh.
- the local coordinate system calculation unit may receive a fitted subdivision mesh and a subdivided reconstructed base mesh, and transform a coordinate system of the mesh into a local coordinate system.
- the local coordinate system calculation operation may be optional.
- the displacement calculation unit may calculate a positional difference between the fitted subdivision mesh and the subdivided reconstructed base mesh. For example, a positional difference value between vertices of two input meshes may be generated. The vertex positional difference value becomes a displacement.
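- The displacement calculation and the optional conversion to a local coordinate system can be sketched as follows. The sign convention (fitted minus subdivided), the way the tangent is chosen, and all function names are assumptions made for illustration; the embodiments only state that a per-vertex position difference is produced and may be expressed in (normal, tangential, bi-tangential) coordinates.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)


def local_frame(normal):
    """Build an orthonormal (normal, tangent, bitangent) frame from a normal."""
    n = normalize(normal)
    # Assumption: pick a helper axis that is not parallel to the normal.
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    t = normalize(cross(helper, n))
    b = cross(n, t)
    return n, t, b


def displacements(fitted, subdivided, normals=None):
    """Per-vertex position difference, optionally in the local frame."""
    out = []
    for k, (f, s) in enumerate(zip(fitted, subdivided)):
        d = sub(f, s)  # Cartesian (x, y, z) displacement
        if normals is not None:
            n, t, b = local_frame(normals[k])
            d = (dot(d, n), dot(d, t), dot(d, b))
        out.append(d)
    return out
```

- Because the frame is orthonormal, the conversion preserves the length of each displacement, and a decoder can invert it by summing the three axes weighted by the local components.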
- the method and device for transmitting point cloud data can encode the point cloud as follows.
- the point cloud data (which may be referred to as a point cloud for short) according to the embodiments can refer to data including vertex coordinates and color information.
- the term "point cloud" includes mesh data, and in this document, point cloud and mesh data can be used interchangeably.
- the V-Mesh compression (reconstruction) method may include intra frame encoding (Fig. 6) and inter frame encoding (Fig. 7).
- intra-frame encoding or inter-frame encoding is performed.
- in intra-frame encoding, the data to be compressed may be a base mesh, displacement, attribute map, etc.
- in inter-frame encoding, the data to be compressed may be a displacement, attribute map, and a motion field between a reference base mesh and the current base mesh.
- Figure 7 shows a V-DMC encoding process according to embodiments.
- Fig. 7 details the encoding of Figs. 1 and 2.
- the preprocessor can receive an input mesh and perform the preprocessing described above.
- the preprocessing can generate a base mesh and/or a fitted subdivided mesh.
- the quantizer can quantize the base mesh and/or the fitted subdivided mesh.
- the static mesh encoder can encode the static mesh.
- the static mesh encoder can generate a bitstream including the encoded base mesh.
- the motion encoder can encode a motion vector for the base mesh based on inter-frame motion estimation and motion compensation for inter-prediction.
- the atlas encoder can encode an atlas for the vertices of the base mesh.
- the encoded base mesh can be reconstructed and inversely quantized through a dequantizer.
- the displacement calculation unit can receive the reconstructed mesh and generate displacement, which is a position difference, based on the fitted subdivided mesh.
- a lifting transform unit can receive displacement and generate lifting coefficients.
- a quantizer can quantize the lifting coefficients.
- an image packing unit can pack an image based on the quantized lifting coefficients.
- a video encoder can encode the packed image.
- inter-prediction can be applied to the quantized lifting coefficients, and the predicted lifting coefficients can be encoded according to an arithmetic encoding method.
- a mesh restoration unit restores a warped mesh using the restored displacement and the restored base mesh.
- Displacement data is restored, and the warped mesh is restored based on the restored displacement data and the restored base mesh, and provided to an attribute transfer unit.
- the attribute transfer unit receives an input mesh and/or an input attribute map, and generates an attribute map based on the restored warped mesh.
- a push-pull padding unit can pad data in the attribute map based on a push-pull method.
- the color space transform unit can transform the space of the color component, which is an attribute.
- the video encoder can encode the attribute.
- the multiplexer can multiplex the compressed base mesh, compressed displacement, and compressed attribute to generate a bitstream.
- the base mesh compression method can be divided into INTRA type, INTER type, and SKIP type depending on the base mesh type, and encoding can be performed in different ways for each. If the base mesh is INTRA type, it can be encoded using the static mesh encoding method. If the base mesh is INTER type, the motion field between the reference base mesh and the current base mesh can be encoded. If the current base mesh is SKIP type, the reference base mesh can be derived as the current base mesh.
- the decoded base mesh can be subdivided into a subdivided mesh through a subdivision process.
- Subdivision algorithms such as mid-point subdivision and loop subdivision can be used.
- Static Basemesh Encoding: When intra encoding is performed on the current basemesh, the base mesh generated in the preprocessing process can be encoded using static mesh compression technology after going through a quantization process. Static mesh compression applies MPEG EdgeBreaker (MEB) technology, and the vertex position information, mapping information (texture coordinates), vertex connection information, and normals of the base mesh are compressed.
- the edgebreaker algorithm sequentially traverses triangles according to a rule, mapping symbols based on the characteristics of each triangle, and then encoding those symbols.
- a technique for compressing vertex position information can encode the residual value, which is the difference between the current vertex and the predicted value, after obtaining the predicted value based on a prediction technique such as multiple parallelogram prediction.
- a technique for compressing mapping information can encode the residual value, which is the difference between the current mapping information (texture coordinates) and the predicted value, after obtaining the predicted value based on a prediction technique such as stretch.
- Techniques for compressing normals can encode residual values, which are the differences between the current normal and the predicted values, after obtaining predicted values based on prediction techniques such as delta coding, multiple parallelogram prediction, and cross product-based prediction.
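- Prediction-plus-residual coding of vertex positions can be sketched as follows. The parallelogram rule (a + b - c) is the standard form of this predictor; averaging several such predictions with floor division is an assumption made here to illustrate "multiple parallelogram prediction", and the function names are hypothetical.

```python
def parallelogram_predict(a, b, c):
    """Predict the vertex opposite c across the shared edge (a, b)."""
    return tuple(x + y - z for x, y, z in zip(a, b, c))


def multi_parallelogram_residual(current, neighbor_triangles):
    """Average all available parallelogram predictions, then take the residual.

    neighbor_triangles holds (a, b, c) triples of already coded, quantized
    (integer) vertex positions that share the edge (a, b) with the current
    triangle.
    """
    preds = [parallelogram_predict(a, b, c) for a, b, c in neighbor_triangles]
    n = len(preds)
    # Assumption: integer average with floor division as the rounding rule.
    pred = tuple(sum(p[k] for p in preds) // n for k in range(3))
    residual = tuple(x - p for x, p in zip(current, pred))
    return pred, residual


pred, residual = multi_parallelogram_residual(
    (10, 10, 0),
    [((10, 0, 0), (0, 10, 0), (0, 0, 0)),
     ((12, 0, 0), (0, 12, 0), (0, 0, 0))],
)
```

- Only the residual needs to be entropy coded; the decoder repeats the same prediction from already decoded vertices and adds the residual back.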
- Inter Basemesh Encoding can be performed when a one-to-one correspondence exists between the reference mesh and the current input mesh, and only the vertex position information differs.
- the difference between the vertices of the reference base mesh and the current base mesh, i.e., the motion field, can be calculated, and this information can be encoded.
- the reference base mesh is the result of quantizing the already decoded base mesh data and is determined by the reference frame index.
- the motion field can be encoded as is, or the predicted motion field can be calculated by averaging the motion fields of the reconstructed vertices among the vertices connected to the current vertex, and the residual motion field, which is the difference between the predicted motion field value and the motion field value of the current vertex, can be encoded. This value can be encoded using entropy coding.
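- The neighbor-averaging prediction of the motion field can be sketched on the encoder side as follows. The traversal order, the rounding rule, and the zero predictor for a vertex with no reconstructed neighbor are assumptions for illustration; `neighbors` is a hypothetical adjacency map derived from the connection information.

```python
def predict_motion(vertex, neighbors, reconstructed):
    """Average the motion vectors of already reconstructed connected vertices."""
    known = [reconstructed[n] for n in neighbors[vertex] if n in reconstructed]
    if not known:
        return (0, 0, 0)  # assumption: no predictor available, use zero
    return tuple(round(sum(m[k] for m in known) / len(known)) for k in range(3))


def encode_motion_residuals(motion, neighbors, order):
    """Return the residual motion field, visiting vertices in the given order."""
    reconstructed, residuals = {}, {}
    for v in order:
        pred = predict_motion(v, neighbors, reconstructed)
        residuals[v] = tuple(m - p for m, p in zip(motion[v], pred))
        # The residual is coded losslessly, so the reconstruction equals the input.
        reconstructed[v] = motion[v]
    return residuals


motion = {0: (2, 0, 0), 1: (2, 1, 0), 2: (3, 1, 0)}
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
residuals = encode_motion_residuals(motion, neighbors, [0, 1, 2])
```

- The residuals are typically smaller in magnitude than the raw motion vectors when neighboring vertices move together, which is what makes the subsequent entropy coding effective.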
- Displacement Encoding: After the base mesh is encoded, it is restored and dequantized to generate a reconstructed base mesh (Recon. base mesh). The reconstructed base mesh is then subdivided, and the displacement between the result and the fitted subdivided mesh is calculated.
- a data transform process such as the Wavelet transform can be applied to the displacement information, and Figure 7 shows the process of transforming the displacement information using the Lifting transform in V-Mesh.
- the transform coefficients generated through the transform process are quantized, and the quantized transform coefficients can be compressed using a video codec or through arithmetic encoding, depending on the compression method.
- when the transform coefficients are compressed through a video codec, they are packed into a 2D image as shown in Figure 8.
- the transform coefficients are organized into one block for each N^2 (N*N) unit, and each block can be packed in z-scan order.
- the horizontal number of blocks is fixed to N, but the vertical number of blocks can be determined according to the number of vertices of the subdivided base mesh.
- the transform coefficients can be packed by sorting them with Morton code.
- the packed images generate a displacement video for each GoF unit, and this displacement video can be encoded using an existing video compression codec.
- inter-frame prediction can be performed on the quantized displacement vector transform coefficients.
- the residual value, which is the difference between the current displacement vector transform coefficients and the reference displacement vector transform coefficients, can be encoded, and information about the reference target can be encoded.
- the quantized displacement vector transform coefficients can be arithmetic-coded, and if it is an INTER type, the residual value can be arithmetic-coded.
- Arithmetic coding can be performed based on Context Adaptive Binary Arithmetic Coding (CABAC).
- the CABAC process can first binarize the displacement vector data and map it to a bin string.
- the bin string can be a binarized output of 0 and 1, and each 0 or 1 can be a bin.
- Each bin can be arithmetic-coded using context information selected from a context model, and a process of updating the probability can be performed.
- Figure 8 shows a lifting conversion process for displacement according to embodiments.
- Figure 9 illustrates a process of packing transformation coefficients according to embodiments into a 2D image.
- Figures 8-9 show the process of converting the displacement of the encoding process of Figure 7 and the process of packing the conversion coefficients, respectively.
- the encoding method according to the embodiments includes displacement encoding.
- a reconstructed base mesh is generated through restoration and dequantization, and the displacement between the result of performing subdivision on the reconstructed base mesh and the fitted subdivided mesh generated through the fitting subdivision surface can be calculated.
- a data transform process such as wavelet transform can be applied to the displacement information.
- Figure 8 shows the process of transforming displacement information using lifting transform in V-Mesh.
- the transform coefficients generated through the transform process are quantized and then packed into a 2D image as shown in Figure 9.
- the horizontal number of blocks is fixed to 16, but the vertical number of blocks can be determined according to the number of vertices of the subdivided base mesh.
- the transform coefficients can be packed by sorting them with Morton code within a block.
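- Packing coefficients into blocks in Morton (z-scan) order can be sketched as follows. The sketch assumes a power-of-two block size and places the blocks themselves row by row; both choices, like the names `morton_decode` and `pack_coefficients`, are illustrative assumptions. One scalar per coefficient is packed here for brevity.

```python
def morton_decode(index):
    """De-interleave the bits of index into (x, y); x takes the even bits."""
    x = y = 0
    bit = 0
    while index:
        x |= (index & 1) << bit
        index >>= 1
        y |= (index & 1) << bit
        index >>= 1
        bit += 1
    return x, y


def pack_coefficients(coeffs, block_size=16, blocks_per_row=16):
    """Pack a 1D coefficient list into a 2D image of block_size x block_size blocks.

    block_size must be a power of two so that Morton positions stay inside
    the block. The image height grows with the number of coefficients.
    """
    per_block = block_size * block_size
    num_blocks = -(-len(coeffs) // per_block)          # ceiling division
    rows_of_blocks = -(-num_blocks // blocks_per_row)  # ceiling division
    width = block_size * blocks_per_row
    height = block_size * rows_of_blocks
    image = [[0] * width for _ in range(height)]
    for i, c in enumerate(coeffs):
        block, inner = divmod(i, per_block)
        bx, by = block % blocks_per_row, block // blocks_per_row
        x, y = morton_decode(inner)
        image[by * block_size + y][bx * block_size + x] = c
    return image


image = pack_coefficients([1, 2, 3, 4, 5], block_size=2, blocks_per_row=2)
```

- With a fixed number of blocks per row, the image width is constant and only the height depends on the number of vertices of the subdivided base mesh, which matches the description above.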
- the packed images generate a displacement video for each GoF unit, and this displacement video can be encoded using an existing video compression codec.
- the base mesh (original) may include vertices and edges for LoD0.
- the first subdivision mesh generated by dividing the base mesh includes vertices generated by further dividing the edges of the base mesh.
- the first subdivision mesh includes vertices for LoD0 and vertices for LoD1.
- LoD1 includes the subdivided vertices and the vertices (LoD0) of the base mesh.
- the second subdivision mesh may be generated by dividing the first subdivision mesh.
- the second subdivision mesh includes LoD2.
- LoD2 includes the base mesh vertices (LoD0), LoD1 including the vertices additionally generated from LoD0, and the vertices additionally divided from LoD1.
- LoD is a level indicating the degree of detail (Level of Detail), and as the level index increases, the distance between vertices decreases and the level of detail increases.
- LoD N includes the vertices included in the previous LoD N-1 as they are.
- the mesh can be encoded based on a prediction and/or update method. Instead of encoding the information about the current LoD N as is, a residual value relative to the previous LoD N-1 can be generated, and the mesh can be encoded using the residual value to reduce the size of the bitstream.
- the prediction process means the operation of predicting the current vertex v using the previous vertices v1, v2. Since adjacent subdivision meshes have similar data, efficient encoding can be achieved by utilizing this property.
- the current vertex position information is predicted as the residual for the previous vertex position information, and the previous vertex position information is updated using the residual.
- the vertices have coefficients generated through the lifting transformation.
- the coefficients of the vertices related to the lifting transformation can be encoded by packing them into an image.
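- The prediction and update steps of a lifting transform over one LoD can be sketched as follows. The prediction weight of 1/2 and the update weight of 1/8 are assumptions chosen for illustration, as is the `parents` map that links each vertex added at LoD N to the two vertices v1, v2 of LoD N-1 on whose edge it was created. One scalar component per vertex is used for brevity.

```python
def lifting_forward(values, parents, update_weight=0.125):
    """One lifting level: predict fine vertices, then update coarse vertices."""
    out = list(values)
    # Prediction: replace each fine vertex by its residual against the
    # average of its two coarse parents.
    for v, (v1, v2) in parents.items():
        out[v] -= 0.5 * (out[v1] + out[v2])
    # Update: feed a fraction of the residual back into the parents.
    for v, (v1, v2) in parents.items():
        out[v1] += update_weight * out[v]
        out[v2] += update_weight * out[v]
    return out


def lifting_inverse(coeffs, parents, update_weight=0.125):
    """Undo the update step first, then undo the prediction step."""
    out = list(coeffs)
    for v, (v1, v2) in parents.items():
        out[v1] -= update_weight * out[v]
        out[v2] -= update_weight * out[v]
    for v, (v1, v2) in parents.items():
        out[v] += 0.5 * (out[v1] + out[v2])
    return out


# Vertices 0 and 1 belong to the coarser LoD; vertex 2 was added on edge (0, 1).
parents = {2: (0, 1)}
coeffs = lifting_forward([1.0, 3.0, 2.5], parents)
restored = lifting_inverse(coeffs, parents)
```

- For several LoDs, the forward transform would be applied from the finest level down to the coarsest, and the inverse in the opposite order.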
- Fig. 10 shows an attribute transfer process of a V-MESH compression method according to embodiments.
- Figure 10 shows the detailed operation of attribute transfer of the encoding of Figure 7.
- Information about the input mesh is compressed through base mesh encoding, motion field encoding, and displacement encoding.
- the compressed input mesh is restored through base mesh decoding (intra frame), motion field decoding (inter frame), and displacement video decoding, and the restored result, the reconstructed deformed mesh (hereinafter referred to as Recon. deformed mesh), is used to compress the input attribute map as shown in FIGS. 6 and 7.
- the reconstructed deformed mesh (Recon. deformed mesh) has vertex position information, texture coordinates, and corresponding connection information, but does not have color information corresponding to the texture coordinates. Therefore, as shown in Fig. 10, in the V-Mesh compression method, a new attribute map having color information corresponding to the texture coordinates of the reconstructed deformed mesh is created through the attribute transfer process.
- Attribute transfer first checks whether each point P(u, v) in the 2D texture domain belongs to a texture triangle of the reconstructed deformed mesh, and if it lies in a texture triangle T, calculates the barycentric coordinates (α, β, γ) of P(u, v) with respect to the triangle T. Then, using the 3D vertex positions of triangle T and (α, β, γ), the 3D coordinate M(x, y, z) of P(u, v) is calculated. Next, the coordinate M'(x', y', z') in the input mesh domain that corresponds to the position most similar to the calculated M(x, y, z), and the triangle T' that contains this point, are found.
- the barycentric coordinates (α', β', γ') of M'(x', y', z') in the triangle T' are calculated.
- the texture coordinates (u', v') are calculated, and the color information corresponding to these coordinates is found in the input attribute map.
- the color information found in this way is immediately assigned to the pixel location (u, v) of the new attribute map. If P(u, v) does not belong to any triangle, the pixel at that location in the new attribute map can be filled with a color value using a padding algorithm such as the push-pull algorithm.
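- The first two steps of attribute transfer, the point-in-triangle test in the texture domain and the mapping of P(u, v) to a 3D coordinate M, can be sketched as follows. The nearest-point search in the input mesh and the padding step are omitted, and the function names and the tolerance `eps` are assumptions for illustration.

```python
def barycentric_2d(p, a, b, c):
    """Barycentric coordinates (alpha, beta, gamma) of p in the 2D triangle abc.

    Returns None for a degenerate (zero-area) triangle.
    """
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0:
        return None
    alpha = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    beta = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return alpha, beta, 1.0 - alpha - beta


def inside(bary, eps=1e-9):
    """A point lies in the triangle when no weight is negative."""
    return bary is not None and all(w >= -eps for w in bary)


def interpolate(bary, va, vb, vc):
    """Weight the three vertex attributes (here 3D positions) by bary."""
    return tuple(bary[0] * x + bary[1] * y + bary[2] * z
                 for x, y, z in zip(va, vb, vc))


# Texture triangle T and the 3D positions of its three vertices.
uv_a, uv_b, uv_c = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)
bary = barycentric_2d((0.25, 0.25), uv_a, uv_b, uv_c)
M = interpolate(bary, (0, 0, 0), (4, 0, 0), (0, 4, 8))
```

- The same `interpolate` helper could be reused in the last step, where the barycentric coordinates of M' weight the texture coordinates of triangle T' to obtain (u', v').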
- the new attribute map generated through attribute transfer is grouped into GoF units to form an attribute map video, which is then compressed using a video codec.
- the decoding process of Fig. 1 can perform the reverse process of the corresponding encoding process of Fig. 1.
- the specific decoding process is as follows.
- Figure 11 illustrates a V-DMC decoding process according to embodiments.
- Fig. 11 shows the configuration and operation of a decoder of a receiving device such as Fig. 1.
- the input bitstream can be separated into a basemesh sub-stream, a displacement sub-stream, an attribute map sub-stream, and an atlas sub-stream.
- Atlas sub-streams can be decoded through Exp-Golomb coding, etc., and as a result, information necessary for decoding, such as tile information and patch information, can be obtained.
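- Exp-Golomb coding can be sketched as follows. The sketch shows the order-0 unsigned variant operating on a string of '0' and '1' characters; which variant and order the atlas sub-stream actually uses is not stated here, so this choice is an assumption.

```python
def exp_golomb_encode(value):
    """Order-0 unsigned Exp-Golomb: (len - 1) zeros, then value + 1 in binary."""
    bits = bin(value + 1)[2:]
    return "0" * (len(bits) - 1) + bits


def exp_golomb_decode(bitstring, pos=0):
    """Decode one codeword starting at pos; return (value, next position)."""
    zeros = 0
    while bitstring[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    value = int(bitstring[pos + zeros:end], 2) - 1
    return value, end


# Several values written back to back and read out again.
stream = "".join(exp_golomb_encode(v) for v in (0, 3, 1))
decoded, pos = [], 0
while pos < len(stream):
    value, pos = exp_golomb_decode(stream, pos)
    decoded.append(value)
```

- The count of leading zeros tells the decoder how many bits follow, so codewords can be concatenated without separators.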
- if the Basemesh sub-stream is of INTRA type according to the Basemesh type, it can be decoded through a static mesh decoder based on MEB (MPEG EdgeBreaker) technology, and as a result, the connection information, vertex geometry information, and vertex mapping information (texture coordinates) of the Basemesh can be restored.
- the decoder can derive mapping information (texture coordinates) and attribute information (texture) connection information using vertex coordinates.
- the process of deriving mapping information (texture coordinates) and connection information can generate mapping information (texture coordinates) and attribute information (texture) connection information by calculating the homography transform of each face and then projecting the vertices based on this.
- if the Basemesh type is INTER, motion information can be decoded through entropy decoding and inverse prediction.
- the reconstructed motion information is combined with the reference Basemesh, which has already been reconstructed and stored in the buffer, to create a Reconstructed Quantized Basemesh for the current frame.
- the reconstructed Basemesh can then undergo an inverse quantization process.
- if the displacement sub-stream was compressed through a video codec, according to the compression method used in encoding, it is decoded into a displacement video through the decoder of the video compression codec, and then the image unpacking process is performed.
- if the displacement sub-stream was compressed through arithmetic coding, the binarized syntax elements can be decoded from the displacement vector bitstream through arithmetic decoding: a contextual probability model (CPM) can be adaptively determined for each bin of the syntax elements, and the occurrence probability of the bin can be predicted through the CPM to perform arithmetic decoding.
- the binarized syntax elements can be decoded through inverse binarization.
- the quantized displacement vector transform coefficients can be derived from the decoded syntax elements. If the displacement information type is INTER, an inverse inter prediction process is performed using reference information for the quantized displacement coefficients (if inter prediction is performed).
- the quantized displacement coefficient is restored as displacement information for each vertex through inverse quantization, inverse transform, and coordinate system transformation processes.
- the restored base mesh and restored displacement information are combined to generate the final decoded mesh.
- the attribute map sub-stream is decoded through the decoder of the video compression codec used in encoding, and then restored to the final attribute map through processes such as color format conversion.
- the restored decoded mesh and decoded attribute map can be provided by the receiver as the final mesh data for use by the user.
- the atlas decoder decodes the atlas data within the bitstream.
- the motion decoder derives the motion field of the base mesh of the current frame through motion estimation and compensation based on the base mesh in the reference frame, if the mesh data in the bitstream is encoded based on inter prediction.
- the spatial decoder decodes the base mesh, if the mesh data in the bitstream is encoded based on intra prediction.
- the displacement data is decoded by applying arithmetic decoding or video decoding.
- the video decoder decodes the attribute data in the bitstream.
- the decoding method of Fig. 11 can follow the reverse process of the encoding method according to the embodiments.
- Figure 12 illustrates a V-DMC encoding process according to embodiments.
- Fig. 12 illustrates the configuration and operation of the encoder of the transmitting device of Figs. 1 and 2.
- Each component of Fig. 12 corresponds to hardware, software, a processor, and/or a combination thereof.
- Figure 12 shows the encoding process of V-Mesh technology.
- the mesh preprocessing unit receives the original mesh as input and generates a simplified mesh (decimated mesh). Simplification can be performed based on the target number of vertices or the target number of polygons that constitute the mesh. Parameterization can be performed on the simplified mesh to generate mapping information (texture coordinates) and attribute information (texture) connection information per vertex. Additionally, quantization of floating-point mesh information into fixed-point information can be performed. This result can be encoded as a base mesh through a static mesh encoding unit.
- the mesh preprocessing unit can perform mesh subdivision on the base mesh to generate additional vertices. Depending on the subdivision method, vertex connection information, texture coordinates, and texture coordinate connection information including the added vertices can be generated.
- the subdivided mesh can be fitted by adjusting the vertex positions to resemble the original mesh, thereby generating a fitted subdivided mesh.
- the base mesh generated through the mesh preprocessing unit can be intra encoded or inter encoded depending on the base mesh type. If the base mesh frame is intra encoded, it can be compressed through the static mesh encoding unit. In this case, encoding can be performed on the connection information, vertex geometry information, vertex texture information, normal information, etc. of the base mesh. If the base mesh frame is inter encoded, motion vector encoding is performed by a motion vector encoding unit, which can take the base mesh and the reference reconstructed base mesh as input, calculate the motion vector between the two meshes, and encode the value.
- the motion vector encoding unit can perform connection information-based prediction using the previously encoded/decoded motion vector as a predictor, and can encode the residual motion vector obtained by subtracting the predicted motion vector from the current motion vector.
- the base mesh bitstream generated through the base mesh encoding unit is transmitted to the multiplexing unit.
- the encoded base mesh bitstream can generate a restored base mesh through a base mesh restoration unit.
- the displacement vector calculation unit can perform mesh refinement on the restored base mesh.
- the displacement vector can be calculated as the difference in vertex positions between the refined restored base mesh and the fitted subdivision mesh generated in the preprocessing unit. As a result, as many displacement vectors as the number of vertices of the refined mesh can be calculated.
- the displacement vector calculation unit can convert the displacement vector calculated in the 3D Cartesian coordinate system into a local coordinate system based on the normal vector of each vertex.
- the displacement vector processing unit can transform the displacement vector for effective encoding.
- the transform can be performed by a lifting transform, a wavelet transform, etc. depending on the embodiment.
- quantization can be performed on the transformed displacement vector value, i.e., the transform coefficient. Different quantization parameters can be applied to each axis of the transform coefficient, and the quantization parameters can be derived according to the agreement of the encoder/decoder.
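- Per-axis quantization of the transform coefficients can be sketched as follows. Uniform quantization with one step size per axis is an assumption for illustration; how the step sizes are derived from the quantization parameters agreed between encoder and decoder is not specified here, so `steps` is a hypothetical input.

```python
def quantize(coeffs, steps):
    """Quantize each 3-component coefficient with its own per-axis step size."""
    return [tuple(int(round(c / s)) for c, s in zip(v, steps)) for v in coeffs]


def dequantize(levels, steps):
    """Inverse quantization: scale each integer level back by its step size."""
    return [tuple(q * s for q, s in zip(v, steps)) for v in levels]


# Hypothetical step sizes: coarser on the first axis, finer on the other two.
steps = (0.5, 0.25, 0.25)
levels = quantize([(1.0, -0.6, 0.26)], steps)
approx = dequantize(levels, steps)
```

- Using a separate step per axis lets an encoder spend more precision on the component that matters most, for example the normal component when displacements are expressed in the local coordinate system.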
- the quantized displacement vector transform coefficients calculated by the displacement vector processing unit can be encoded through a displacement vector video encoding unit or a displacement vector arithmetic encoding unit depending on the compression method.
- the displacement vector video encoding unit can pack displacement vector information that has undergone transformation and quantization into a 2D image.
- the packed 2D images can be bundled for each frame to generate a displacement vector video, and the displacement vector video can be generated for each GoF (Group of Frames) unit of the input mesh.
- the generated displacement vector video can be encoded using a video compression codec.
- the generated displacement vector video bitstream is transmitted to the multiplexing unit.
- the displacement vector arithmetic encoding unit can perform inter-screen prediction on the quantized displacement vector transform coefficients if the displacement vector type is INTER.
- the inter-screen prediction process may be a process of obtaining a residual value, which is the difference between the current transform coefficient and the reference transform coefficient.
- the displacement vector transform coefficient or the residual value can be encoded through an arithmetic encoding process.
- the mesh restoration unit restores the mesh from the displacement vector restored through the displacement vector restoration unit and the refined base mesh restored through the base mesh restoration unit, and the restored mesh has restored vertices, connection information between vertices, texture coordinates, and connection information between texture coordinates.
- the attribute information (texture map) of the original mesh can be regenerated as attribute information (texture map) for the restored mesh through the attribute information (texture map) video generation unit.
- the color information per vertex of the texture map of the original mesh can be assigned to the texture coordinates of the restored mesh.
- the regenerated texture maps for each frame can be bundled by GoF unit to generate a texture map video.
- the generated texture map video can be encoded using a video compression codec through a texture map video encoding unit.
- the texture map video bitstream generated through encoding is transmitted to a multiplexing unit.
- the atlas encoding unit can encode atlases, which are additional information required for mesh decoding and rendering.
- the generated atlas bitstream is transmitted to the multiplexing unit.
- the generated base mesh bitstream, displacement vector bitstream, texture map bitstream, and atlas bitstream can be multiplexed into a single bitstream and transmitted to a receiver via a transmitter.
- the generated base mesh bitstream, displacement vector bitstream, texture map bitstream, and atlas bitstream can be generated into a file with one or more track data or encapsulated into segments and transmitted to a receiver (decoder) via a transmitter.
- the data input unit can receive an original mesh and/or an original texture map ('attribute').
- the mesh preprocessing unit can simplify the original mesh to generate a base mesh and fit it to generate a refined mesh.
- the motion vector encoding unit can generate a motion vector (motion field) by referring to a reconstructed base mesh within a previously processed reference frame when the mesh encoding method is inter-prediction, and can encode it based on a motion estimation and compensation method.
- the static mesh encoding unit can encode the base mesh within the frame when the mesh encoding method is intra-prediction.
- the displacement vector calculation unit can calculate a displacement vector for a vertex from the fitted refined mesh based on the reconstructed base mesh.
- the displacement vector processing unit can process the displacement vector into a form suitable for encoding.
- the displacement vector can be encoded based on a video method or an arithmetic encoding method.
- the displacement vector can be reconstructed and provided to the mesh restoration unit together with the reconstructed base mesh.
- the attribute (texture map) video generation unit can generate a video for encoding the texture map using the original mesh and the texture map for the original mesh.
- the attribute is encoded based on the video method.
- the atlas is encoded by the atlas encoding unit.
- Figure 13 illustrates a V-DMC decoding process according to embodiments.
- Fig. 13 corresponds to the decoders of Figs. 1 to 3. Each component of Fig. 13 corresponds to hardware, software, a processor, and/or a combination thereof.
- the bitstream of the received Mesh is demultiplexed into a compressed base mesh bitstream, a displacement vector bitstream, an attribute information (texture map) bitstream, and an atlas bitstream after file/segment decapsulation.
- the motion vector decoding unit can perform decoding on the base mesh bitstream.
- the final motion vector can be reconstructed by adding the residual motion vector decoded from the bitstream to a predicted motion vector, which is derived using the previously decoded motion vectors as a predictor.
- the current base mesh can be reconstructed by adding the decoded motion vector to the reference base mesh.
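- The decoder-side reconstruction of an INTER base mesh can be sketched as follows. The predictor (a rounded average of the already decoded motion vectors of connected vertices), the traversal order, and the function name `decode_base_mesh` are assumptions for illustration; a real decoder must use exactly the predictor that the encoder used.

```python
def decode_base_mesh(reference, residuals, neighbors, order):
    """Rebuild motion vectors from residuals, then move the reference vertices."""
    motion = {}
    for v in order:
        known = [motion[n] for n in neighbors[v] if n in motion]
        if known:
            pred = tuple(round(sum(m[k] for m in known) / len(known))
                         for k in range(3))
        else:
            pred = (0, 0, 0)  # assumption: zero predictor for the first vertex
        # Final motion vector = predicted motion vector + decoded residual.
        motion[v] = tuple(p + r for p, r in zip(pred, residuals[v]))
    # Current base mesh = reference base mesh + decoded motion vector.
    return {v: tuple(x + m for x, m in zip(reference[v], motion[v]))
            for v in order}


reference = {0: (0, 0, 0), 1: (10, 0, 0), 2: (0, 10, 0)}
residuals = {0: (2, 0, 0), 1: (0, 1, 0), 2: (1, 1, 0)}
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
current = decode_base_mesh(reference, residuals, neighbors, [0, 1, 2])
```

- Connectivity is taken from the reference base mesh unchanged, which is why only vertex positions need to be updated in the INTER case.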
- the base mesh bitstream can be used to restore the connection information, vertex geometry information, texture coordinates, normal information, etc. of the base mesh through the static mesh decoder.
- the base mesh restoration unit can perform inverse quantization on the decoded base mesh to generate a restored base mesh.
- the displacement vector bitstream may be decoded using a video codec and then subjected to a depacking process. If encoded using arithmetic coding, arithmetic decoding may be performed using a displacement vector arithmetic decoding unit. If inter-screen prediction is performed, the current displacement vector transform coefficient may be generated by adding a residual value to the reference displacement vector transform coefficient through inter-screen prediction.
- the displacement vector restoration unit restores the displacement vector by performing inverse quantization and inverse transformation on the decoded displacement vector transform coefficients. If the restored displacement vector is a value in the local coordinate system, a process of inverse transformation to the Cartesian coordinate system can be performed.
- the mesh restoration unit can generate additional vertices by performing subdivision on the restored base mesh.
- Subdivision can generate vertex connection information, texture coordinates, and texture coordinate connection information, including the added vertices.
- the subdivided restored base mesh can be combined with the restored displacement vector to generate the final restored mesh.
- the texture map bitstream can be decoded as a video bitstream using a video codec in a texture map video decoding unit.
- the restored texture map contains color information for each vertex contained in the restored mesh, and the color value of each vertex can be obtained from the texture map using the texture coordinates of the corresponding vertex.
- the atlas bitstream can be decoded by the atlas decoder.
- the restored mesh and texture map are displayed to the user through a rendering process using a mesh data renderer, etc.
- the decoder receives an encoded bitstream and decodes the base mesh, displacement vectors, attributes, and atlas within the bitstream based on parameter information (which may be referred to as signaling information, metadata, etc.) contained within the bitstream.
- the decoding process may follow the reverse process of the encoding process.
- a mesh is reconstructed from the reconstructed base mesh and the reconstructed displacement mesh.
- the mesh can be rendered based on the reconstructed mesh and the reconstructed attributes.
- a point cloud data encoding device and method can encode mesh data and transmit a bitstream including the encoded mesh data.
- a point cloud data decoding device and method according to embodiments can receive a bitstream including mesh data and decode the mesh data.
- the point cloud data encoding/decoding method/device according to embodiments may be referred to as the method/device according to embodiments.
- the point cloud data encoding/decoding method/device according to embodiments may also be referred to as the mesh data encoding/decoding method/device according to embodiments.
- the term encoding/decoding method/device may be used in this document for short.
- V-Mesh: video-based dynamic mesh compression
- V-Mesh regenerates a texture map of an input original mesh into a texture map for a mesh restored during an encoding process, and then processes and compresses the regenerated texture map images as a video stream.
- a related sequence has been added to the MPEG V-DMC standard. Whereas mesh data previously had a single attribute information (texture map), the newly added images have data with multiple attribute information.
- FIG. 14 and FIG. 15 are drawings for explaining dynamic mesh data having multiple attribute information according to embodiments.
- the encoding method and device can encode dynamic mesh data having a plurality of attribute information as shown in Figs. 14 and 15, and generate syntax elements (which can be referred to as parameters, metadata, signaling information, etc.) related to the plurality of attribute information according to the embodiments.
- the decoding method and device can decode dynamic mesh data having a plurality of attribute information, as shown in FIGS. 14 and 15, based on syntax elements related to the plurality of attribute information.
- the embodiments aim to solve the low compression ratio, high complexity, and large number of codec instances that arise when existing methods use V-DMC to compress dynamic mesh data carrying more attribute information than general point cloud/mesh data sets.
- V-DMC (V-Mesh) is a method for compressing 3D dynamic mesh data based on existing 2D video codecs.
- a method capable of achieving effective compression performance is provided when compressing data with multiple attribute information using V-DMC.
- the embodiments relate to a method for generating attribute video streams within V-DMC, signaling for the video stream generation method, and a method for processing the video streams at transmitters and receivers.
- the embodiments propose an efficient way of using V-DMC: representative attribute information is selected from among a plurality of attribute information, the corresponding image and residuals are obtained, and these are compressed and used.
- dynamic mesh data may include multiple attribute information (texture maps) mapped to a single texture coordinate.
- T1 to T5 denote the multiple attribute information (texture maps).
- the multiple attribute information may simply be concatenated in a single row. In this case, the amount of data may be large and the compression ratio may be low.
- a frame related to mesh data may include an object (e.g., a person or a building).
- the multiple textures may be referred to as texture 1, texture 2, texture 3, texture 4, texture 5, etc.
- information regarding texture 1 to texture 5 may all be required.
- FIG. 16 is a diagram for explaining a method for compressing attribute information of mesh data having a plurality of attribute information according to embodiments.
- the encoding method and device can encode multi-attribute information, as in Fig. 16, based on the multi-attribute information of Figs. 14 and 15.
- the decoding method and device can decode dynamic mesh data having the multi-attribute information of FIGS. 14 and 15 that was encoded using the method of FIG. 16, based on syntax elements related to the multi-attribute information.
- the embodiments first select a representative texture video to serve as a reference; the user can designate it arbitrarily. Once the representative texture video is selected, a residual texture video is generated for each remaining texture video by calculating its per-pixel color difference with respect to the representative texture video. As a result, N texture videos yield one representative texture video carrying the original texture information and N - 1 residual texture videos. Each of these can be compressed and transmitted using a video codec, or they can be concatenated and transmitted as a single piece of data.
- the original texture video may include multi-attribute information.
- the multi-attribute information may be arranged in the order of texture video #1 to texture video #N.
- a representative video may be set to encode the texture video.
- the first texture video, texture video #1 may be set as the representative video.
- a residual texture video #2 can be generated as the residual between texture video #1 and texture video #2. In the same way, residual texture video #3 is the residual between texture video #1 and texture video #3, residual texture video #4 is the residual between texture video #1 and texture video #4, and so on up to residual texture video #N, the residual between texture video #1 and texture video #N.
- texture video #1, set as the representative video, can be encoded as is, that is, as texture video #1.
- texture video #2 to texture video #N can be encoded by generating a residual texture video which is a residual with texture video #1.
- the encoded multi-attribute information can include texture video #1 and residual texture video #2 to residual texture video #N.
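The residual generation described above can be sketched as follows. This is a minimal illustration, not the codec's actual implementation: the function name `make_residuals`, the flat pixel-list representation, and the sign convention (texture minus representative, chosen so that adding the residual back to the representative restores the texture) are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B)
Texture = List[Pixel]          # one texture map flattened to a pixel list


def make_residuals(textures: List[Texture], rep_index: int = 0):
    """Split N textures into one representative and N - 1 residual textures.

    Each residual holds the per-pixel, per-channel difference
    (texture - representative), stored as signed integers.
    """
    rep = textures[rep_index]
    residuals: Dict[int, Texture] = {}
    for i, tex in enumerate(textures):
        if i == rep_index:
            continue  # the representative keeps its original colour values
        if len(tex) != len(rep):
            raise ValueError("all texture maps must have the same resolution")
        residuals[i] = [
            tuple(c - r for c, r in zip(px, rep_px))
            for px, rep_px in zip(tex, rep)
        ]
    return rep, residuals


textures = [
    [(100, 120, 130), (10, 20, 30)],   # texture video #1 (representative)
    [(102, 118, 130), (12, 20, 25)],   # texture video #2
    [(90, 120, 140), (10, 25, 30)],    # texture video #3
]
representative, residuals = make_residuals(textures)
```

When the texture maps are similar, the residual values cluster around zero, which is what makes them cheaper to compress than the original colour values.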
- at the decoder, each bitstream is decoded individually or in combined form, and each texture is then restored by combining its residual with the representative texture video used as the reference, as in Equation 1 below.
- Recon_texture map[i] denotes the reconstructed texture map having index i.
- texture video #1 denotes the texture video set as the representative texture video; depending on the embodiment, a texture video having a different index may be set as the representative.
- residual_data[i] denotes the residual data having index i.
- the residual data may include a pixel-to-pixel residual value between the texture map and the representative texture map.
- the texture map and the texture video may be used interchangeably.
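Equation 1 itself is not reproduced in this text. Based on the definitions above, and on the later statement that an attribute is restored by adding the representative attribute information and the residual attribute information, it presumably takes the following form. This is a reconstruction under that assumption, not the original equation:

```latex
\text{Recon\_texture map}[i] = \text{texture video \#1} + \text{residual\_data}[i], \qquad i = 2, \dots, N
```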
- FIG. 17 is a diagram for explaining a method of selecting a representative texture map by calculating block-by-block correlations between texture maps according to embodiments.
- the encoding method and device can encode multi-attribute information as in Fig. 16 based on the multi-attribute information of Figs. 14 to 15 and the representative texture map setting method of Fig. 17.
- the decoding method and device can decode dynamic mesh data having the multi-attribute information of FIGS. 14 and 15 that was encoded according to FIG. 16 using a representative texture map set according to FIG. 17, based on syntax elements related to the multi-attribute information.
- a method for selecting a representative texture video (or map) to serve as a reference according to embodiments is described.
- the first texture video (or map) can be selected.
- the texture video can be reconstructed by combining the decoded residual images and the first image, without having to store and transmit information about a separate representative texture video (or map).
- a representative texture video (or map) among N texture videos (or maps) can be selected through a separate process.
- a block-level correlation between texture maps can be calculated to select a representative texture map.
- matching and/or correlation between images can be calculated for each pixel unit or specific block size.
- Information regarding the representative texture map determined through the calculation can be additionally signaled and utilized for restoration during the decoding process.
- the data of the texture map can be created as a data array based on RGB values or YUV values, and the correlation between each texture map can be calculated as in Equation 2 below.
- X and Y can mean all or part of the data of the texture map input for comparison, and x' and y' can mean the average of each data for normalization.
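Equation 2 is not reproduced in this text. Given that X and Y are the compared data and x' and y' are their averages used for normalization, one form consistent with this description is the normalized (Pearson) correlation coefficient. This is an assumption, and the original equation may differ:

```latex
r(X, Y) = \frac{\sum_{i} (X_i - x')(Y_i - y')}{\sqrt{\sum_{i} (X_i - x')^2}\,\sqrt{\sum_{i} (Y_i - y')^2}}
```

Here i runs over the compared samples or blocks.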
- a cost function such as Equation 3 can be defined, and the texture map that minimizes its value can be selected as the representative texture map.
- the multi-attribute information may include sequentially connected texture maps 1 to N.
- the encoding/decoding method and device may set the first texture map among the multi-attribute information as a representative texture map.
- information about the first texture map set as the representative texture map may be preset and thus may not require separate signaling.
- the encoding/decoding method and device can determine one representative texture map from among a plurality of texture maps based on the correlation between the textures. For example, when calculating the correlation between texture 1 and texture 2 using Equation 2, X may mean texture 1, Y may mean texture 2, and i may mean a block index of texture 1 or texture 2.
- the encoding/decoding method and device may determine a representative texture map among a plurality of texture maps using a cost function.
- the cost function may be an objective function of an optimization algorithm in a machine learning algorithm.
- a texture map that minimizes the cost function value may be determined as a representative texture map.
- Information about the determined representative texture map may be signaled as a dominant texture map (dominant_texture_map) value.
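The selection procedure can be illustrated with the sketch below. Because Equations 2 and 3 are not reproduced in this text, both the correlation measure (a normalized Pearson correlation) and the cost (the sum of 1 minus the correlation against every other map) are assumptions chosen for illustration; the function names are hypothetical as well. Only the overall idea, picking the map that minimizes a cost and signaling its index as dominant_texture_map, comes from the description.

```python
import math
from typing import List, Sequence


def correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Normalized (Pearson) correlation between two equally sized data arrays."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x)) * math.sqrt(
        sum((b - mean_y) ** 2 for b in y)
    )
    return num / den if den else 0.0


def select_dominant_texture_map(maps: List[Sequence[float]]) -> int:
    """Return the index of the map with the lowest (assumed) cost.

    Assumed cost: sum over all other maps of (1 - correlation), so the map
    that correlates best with the rest is chosen as the representative.
    """
    def cost(k: int) -> float:
        return sum(1.0 - correlation(maps[k], m)
                   for j, m in enumerate(maps) if j != k)

    return min(range(len(maps)), key=cost)


maps = [
    [10, 20, 30, 40],   # texture map 0
    [11, 21, 29, 41],   # texture map 1, close to map 0
    [40, 10, 35, 5],    # texture map 2, weakly related to the others
]
dominant_texture_map = select_dominant_texture_map(maps)
```

In practice the arrays would hold the RGB or YUV samples of a whole map or of individual blocks, and the cost could instead estimate the bit cost of the resulting residuals.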
- the encoding method and device can encode mesh data and generate related syntax elements as shown in Figs. 14 to 16.
- the encoding method and device according to the embodiments can generate and transmit a bitstream including encoded mesh data and syntax elements.
- the decoding method and device can decode mesh data in a bitstream based on syntax elements included in the bitstream, as in FIG. 17 and FIG. 19.
- Signaling information (which may be referred to as parameters, metadata, etc.) according to the embodiments may be encoded by a metadata encoding unit (which may also be referred to as a metadata encoder, etc.) in a point cloud data transmission device according to the embodiments and transmitted in a bitstream.
- signaling information (which may be referred to as parameters, metadata, etc.) according to the embodiments may be decoded by a metadata decoding unit (which may also be referred to as a metadata decoder, etc.) in a point cloud data receiving device according to the embodiments and provided to the decoding process of the point cloud data.
- a transmitter may encode point cloud data to generate a bitstream.
- a bitstream according to embodiments may include a V3C unit.
- a receiver can receive the bitstream transmitted by the transmitter, decode it, and restore the point cloud data.
| v3c_unit( numBytesInV3CUnit ) { | Descriptor |
|---|---|
| v3c_unit_header( ) | |
| v3c_unit_payload( numBytesInV3CUnit - 4 ) | |
| } | |
- vuh_unit_type represents the V3C unit type specified as follows.
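| vuh_unit_type | Identifier | V3C unit type | Description |
|---|---|---|---|
| 0 | V3C_VPS | V3C parameter set | V3C level parameters |
| 1 | V3C_AD | Atlas data | Atlas information |
| 2 | V3C_OVD | Occupancy video data | Occupancy information |
| 3 | V3C_GVD | Geometry video data | Geometry information |
| 4 | V3C_AVD | Attribute video data | Attribute information |
| 5...31 | V3C_RSVD | Reserved | - |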
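As a small illustration of how a parser might interpret vuh_unit_type, the sketch below maps a value to the identifiers listed above. It assumes V3C_VPS takes the value 0, as the numbering of the remaining entries implies; the class and function names are hypothetical.

```python
from enum import IntEnum


class V3CUnitType(IntEnum):
    V3C_VPS = 0   # V3C parameter set (value 0 assumed from the numbering)
    V3C_AD = 1    # atlas data
    V3C_OVD = 2   # occupancy video data
    V3C_GVD = 3   # geometry video data
    V3C_AVD = 4   # attribute video data


def classify_unit_type(vuh_unit_type: int) -> str:
    """Map a vuh_unit_type value to its identifier; 5..31 are reserved."""
    if not 0 <= vuh_unit_type <= 31:
        raise ValueError("vuh_unit_type must be in the range 0 to 31")
    if vuh_unit_type >= 5:
        return "V3C_RSVD"
    return V3CUnitType(vuh_unit_type).name
```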
- vuh_v3c_parameter_set_id represents the value of vps_v3c_parameter_set_id for the active V3C VPS.
- the value of vuh_v3c_parameter_set_id ranges from 0 to 15.
- vuh_atlas_id represents the ID of the atlas corresponding to the current V3C unit.
- the value of vuh_atlas_id ranges from 0 to 63.
- vuh_attribute_index represents the index of the attribute data stored in the Attribute Video Data unit.
- the value of vuh_attribute_index ranges from 0 to (ai_attribute_count[vuh_atlas_id] - 1).
- vuh_attribute_partition_index represents the index of an attribute dimension group contained in an Attribute Video Data unit.
- the value of vuh_attribute_partition_index ranges from 0 to ai_attribute_dimension_partitions_minus1[ vuh_atlas_id ][ vuh_attribute_index ].
- vuh_map_index indicates the map index of the current geometry or attribute stream. If not present, the map index of the current geometry or attribute stream is derived based on the stream type and the operations set for the geometry and attribute video streams, respectively. If vuh_map_index is present, its value ranges from 0 to vps_map_count_minus1[vuh_atlas_id].
- If vuh_auxiliary_video_flag is equal to 1, it indicates that the associated geometry or attribute video data unit is a RAW and/or EOM code point video. If vuh_auxiliary_video_flag is equal to 0, it indicates that the associated geometry or attribute video data unit may contain RAW and/or EOM code points. If vuh_auxiliary_video_flag is not present, its value is inferred to be equal to 0.
- If vuh_reserved_zero_12bit is present, it is set to 0 in bitstreams conforming to this version of the document. Other values of vuh_reserved_zero_12bit are reserved for future use in ISO/IEC. Decoders should ignore the value of vuh_reserved_zero_12bit.
- If vuh_reserved_zero_17bit is present, it is set to 0 in bitstreams conforming to this version of the document. Other values of vuh_reserved_zero_17bit are reserved for future use in ISO/IEC. Decoders should ignore the value of vuh_reserved_zero_17bit.
- If vuh_reserved_zero_27bit is present, it is set to 0 in bitstreams conforming to this version of the document. Other values of vuh_reserved_zero_27bit are reserved for future use in ISO/IEC. Decoders should ignore the value of vuh_reserved_zero_27bit.
- If the value of the attribute residual texture video flag (vuh_attribute_residual_texture_video_flag) is 1, it indicates that the attribute video data unit is for residual texture video data. If the value of the flag is 0, it indicates that the attribute video data unit is for video with original attribute data.
- the proposed attribute video type can be distinguished by using ai_attribute_type_id, which is signaled as attribute information syntax among the existing V3C parameter set syntax.
- ai_attribute_type_id[ j ][ i ] indicates the attribute type of attribute video data having the ith index for the atlas with ID j, and the types that it can indicate are as shown in the table showing the V3C attribute types below.
- the V3C unit header may include a unit type (vuh_unit_type), and when the unit type (vuh_unit_type) has a value of 4, the V3C unit type may indicate attribute information. When the V3C unit type indicates attribute information, the V3C unit header (v3c_unit_header) may include an attribute index (vuh_attribute_index), an attribute partition index (vuh_attribute_partition_index), a map index (vuh_map_index), an auxiliary video flag (vuh_auxiliary_video_flag), and an attribute residual texture video flag (vuh_attribute_residual_texture_video_flag).
- if the value of the attribute residual texture video flag (vuh_attribute_residual_texture_video_flag) is 1, it can indicate that the attribute video data unit is for residual texture video data.
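The header fields for an attribute video data unit and their stated value ranges can be summarized in a sketch like the following. The class name, the `validate` helper, and its parameters (standing in for ai_attribute_count[vuh_atlas_id], ai_attribute_dimension_partitions_minus1 and vps_map_count_minus1) are hypothetical. No bit widths are modeled, since the text does not say how the added flag is packed into the header.

```python
from dataclasses import dataclass


@dataclass
class AttributeVideoUnitHeader:
    """Header fields of a V3C unit whose vuh_unit_type is 4 (V3C_AVD)."""
    vuh_v3c_parameter_set_id: int
    vuh_atlas_id: int
    vuh_attribute_index: int
    vuh_attribute_partition_index: int
    vuh_map_index: int
    vuh_auxiliary_video_flag: int
    vuh_attribute_residual_texture_video_flag: int

    def validate(self, attribute_count: int, partitions_minus1: int,
                 map_count_minus1: int) -> None:
        """Check each field against the value range stated in the text."""
        checks = [
            (self.vuh_v3c_parameter_set_id, 0, 15),
            (self.vuh_atlas_id, 0, 63),
            (self.vuh_attribute_index, 0, attribute_count - 1),
            (self.vuh_attribute_partition_index, 0, partitions_minus1),
            (self.vuh_map_index, 0, map_count_minus1),
            (self.vuh_auxiliary_video_flag, 0, 1),
            (self.vuh_attribute_residual_texture_video_flag, 0, 1),
        ]
        for value, low, high in checks:
            if not low <= value <= high:
                raise ValueError(f"value {value} outside [{low}, {high}]")

    def carries_residual_texture(self) -> bool:
        """True when the unit holds residual texture video data."""
        return self.vuh_attribute_residual_texture_video_flag == 1


# Example: second attribute of atlas 0, flagged as residual texture video.
header = AttributeVideoUnitHeader(0, 0, 1, 0, 0, 0, 1)
header.validate(attribute_count=5, partitions_minus1=0, map_count_minus1=0)
```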
- ai_attribute_count[j] indicates the number of attributes associated with the atlas with ID j. ai_attribute_count[j] is between 0 and 127.
- ai_attribute_type_id[ j ][ i ] represents the attribute type of the attribute video data unit with index i for the atlas with ID j.
- the table below describes the list of supported attributes and their relationship to ai_attribute_type_id.
- ATTR_TEXTURE represents an attribute that contains texture information for a volumetric frame. For example, it could represent an attribute that contains RGB (red, green, blue) color information.
- ATTR_MATERIAL_ID represents an attribute containing supplementary information indicating the material type of a point in the volumetric frame.
- the material type can be used as an indicator to identify the characteristics of an object or point within the volumetric frame. Interpreting the values of these attribute frame types is beyond the scope of this document.
- ATTR_TRANSPARENCY represents an attribute that contains transparency information associated with each point in the volumetric frame.
- ATTR_REFLECTANCE represents an attribute containing reflectance information associated with each point in the volumetric frame.
- ATTR_NORMAL represents an attribute that contains unit vector information associated with each point in the volumetric frame.
- the unit vector represents the normal direction to the surface at a point (i.e., the direction the point is facing).
- Attribute frames with this attribute type must have ai_attribute_dimension_minus1 equal to 2.
- Each channel of an attribute frame with this attribute type must contain one component of a unit vector (x, y, z), where the first component contains the x-coordinate, the second component contains the y-coordinate, and the third component contains the z-coordinate.
- the attribute residual texture represents an attribute that contains residual texture video information, which is composed of pixel-by-pixel color difference information of two different texture videos.
- the residual texture video information is composed of vectors (Red_diff, Green_diff, Blue_diff), where each value represents the difference in the red component value, the difference in the green component value, and the difference in the blue component value of the two texture videos, all of which have integer value types.
- ATTR_UNSPECIFIED indicates an attribute whose values have no specified meaning and will not be given a specified meaning in the future as part of this document.
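The ATTR_NORMAL constraints described above (three channels carrying the x, y, and z components of a unit vector) can be checked with a sketch like this one. It assumes the decoded samples have already been mapped to floating-point components, a step the text does not describe; the function name and the tolerance value are hypothetical.

```python
import math
from typing import Sequence


def is_valid_normal_sample(sample: Sequence[float],
                           ai_attribute_dimension_minus1: int,
                           tolerance: float = 1e-3) -> bool:
    """Check an ATTR_NORMAL sample: three channels (x, y, z) of unit length."""
    if ai_attribute_dimension_minus1 != 2 or len(sample) != 3:
        return False  # a normal attribute must have exactly three channels
    x, y, z = sample
    return abs(math.sqrt(x * x + y * y + z * z) - 1.0) <= tolerance
```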
- ai_attribute_codec_id[ j ][ i ] represents the identifier of the codec used to compress the attribute video data with index i for the atlas with ID j.
- the codec indicated by ai_attribute_codec_id[ j ][ i ] may be identified by a component codec mapping SEI message or by means outside of this document.
- When ai_auxiliary_attribute_codec_id[ j ][ i ] is present, it indicates the identifier of the codec used to compress the attribute video data of attribute i, whose RAW and/or EOM code points are encoded in the auxiliary video stream of the atlas with ID j. ai_auxiliary_attribute_codec_id[ j ][ i ] is in the range 0 to 255. This codec may be identified by a component codec mapping SEI message, or by means outside of this document.
- If ai_auxiliary_attribute_codec_id[ j ][ i ] is not present, its value is inferred to be equal to ai_attribute_codec_id[ j ][ i ].
- If ai_attribute_map_absolute_coding_persistence_flag[ j ][ i ] is equal to 1, it indicates that all attribute maps corresponding to the atlas with ID j for the attribute with index i will be coded without any form of map prediction. If ai_attribute_map_absolute_coding_persistence_flag[ j ][ i ] is equal to 0, it indicates that the attribute map with index i will use the same map prediction method as that used for the geometries in the atlas with ID j. If ai_attribute_map_absolute_coding_persistence_flag[ j ][ i ] is not present, its value is inferred to be equal to 1.
- ai_attribute_dimension_minus1[ j ][ i ] plus 1 represents the total number of dimensions (i.e., number of channels) of the attribute with index i in the atlas with ID j.
- ai_attribute_dimension_minus1[ j ][ i ] is an integer ranging from 0 to 63.
- ai_attribute_dimension_partitions_minus1[ j ][ i ] plus 1 represents the number of partition groups into which the attribute channel with index i in the atlas with ID j should be grouped.
- ai_attribute_dimension_partitions_minus1[ j ][ i ] must be in the range 0 to 63, inclusive.
- ai_attribute_partition_channels_minus1[ k ][ i ][ j ] plus 1 represents the number of channels assigned to the dimension partition group with index j of the attribute with index i for the atlas with ID k.
- ai_attribute_partition_channels_minus1[ k ][ i ][ j ] must be in the range of 0 to ai_attribute_dimension_minus1 [ k ][ i ] for all dimension partition groups.
- ai_attribute_2d_bit_depth_minus1[ j ][ i ] plus 1 indicates the nominal 2D bit depth to which all attribute videos with attribute index i in the atlas with ID j will be converted.
- ai_attribute_2d_bit_depth_minus1[ j ][ i ] must be in the range 0 to 31, inclusive.
- ai_attribute_MSB_align_flag[ j ][ i ] indicates how the decoded attribute video samples associated with the atlas with ID j and index i are converted to samples of the nominal attribute bit depth.
- a V3C parameter set (v3c_parameter_set) can contain attribute information (attribute_information), and the attribute information can contain an attribute type ID (ai_attribute_type_id[ j ][ i ]).
- the attribute type ID can indicate whether the attribute type is Texture, Material ID, Transparency, Reflectance, Normals, or Residual texture, depending on its value.
- the attribute residual texture indicates that the attribute contains residual texture video information, which is composed of pixel-by-pixel color difference information of two texture videos with different attributes.
- the residual texture video information includes the difference in red component values, the difference in green component values, and the difference in blue component values of the two texture videos.
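A decoder could branch on the attribute type along the lines of the sketch below. The table of attribute types is not reproduced in this text, so all numeric values here are assumptions: 0 to 4 and 15 follow the usual V3C attribute type numbering, while the name ATTR_RESIDUAL_TEXTURE and its value 5 are purely hypothetical placeholders for the residual texture type proposed in the embodiments.

```python
from enum import IntEnum


class V3CAttributeType(IntEnum):
    ATTR_TEXTURE = 0
    ATTR_MATERIAL_ID = 1
    ATTR_TRANSPARENCY = 2
    ATTR_REFLECTANCE = 3
    ATTR_NORMAL = 4
    ATTR_RESIDUAL_TEXTURE = 5   # HYPOTHETICAL name and value, not from the text
    ATTR_UNSPECIFIED = 15


def needs_representative(ai_attribute_type_id: int) -> bool:
    """True when the attribute holds residuals and therefore cannot be
    displayed until it is added to the representative texture."""
    return ai_attribute_type_id == V3CAttributeType.ATTR_RESIDUAL_TEXTURE
```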
- Fig. 18 shows an encoding method according to embodiments.
- the encoding method according to the embodiments may include a step of encoding a base mesh of mesh data (S1810), and/or a step of encoding a displacement of mesh data (S1820), and/or a step of encoding an attribute of mesh data (S1830).
- the step of encoding attributes of mesh data includes encoding attributes related to multi-attribute information, and the multi-attribute information may be data in which a plurality of attribute data related to a bounding box for one object is concatenated.
- encoding an attribute regarding the multi-attribute information may include calculating a pixel-by-pixel color difference from representative attribute information among the multi-attribute information to obtain residual attribute information.
- the bitstream includes a flag indicating whether the attribute is related to multiple attribute information associated with one object of mesh data. A first value of the flag may indicate that the attribute includes residual attribute information for the multiple attribute information, and a second value of the flag may indicate that the attribute does not include residual attribute information for the multiple attribute information.
- the bitstream includes parameter set information, and the parameter set information includes information indicating a type of the attribute in the bitstream.
- a first value of the type of the attribute may indicate that the attribute includes a residual derived between the multiple attribute information.
- the encoded multi-attribute information may include one representative attribute information and N-1 residual attribute information.
- the residual attribute information may be generated by calculating the pixel-by-pixel color difference with the representative attribute information.
- the residual attribute information may also be referred to as residual attribute data or residual data.
- the process of selecting representative attribute information can be performed by selecting the first attribute information among multiple attribute information as the representative attribute information, or by calculating the correlation between each of the multiple attribute information at the pixel or block level.
- the representative attribute information can be selected using a cost function.
- each attribute information can be divided into block or pixel units, the matching correlation of each attribute information can be calculated based on a specific pixel or block size, and the attribute information that allows for the most efficient compression can be selected as the representative attribute information. In this case, information regarding the selected attribute information can be additionally signaled.
- referring also to FIG. 1 or FIG. 2, the encoding method can be performed by an encoding device.
- the encoding device includes a memory; and at least one processor connected to the memory; and the at least one processor can be configured to: encode a base mesh of mesh data; encode a displacement of the mesh data; and encode an attribute of the mesh data.
- Embodiments may include a computer-readable storage medium storing a bitstream generated by a method according to the encoding method of FIG. 18.
- Fig. 19 shows a decoding method according to embodiments.
- the decoding method may include a step of decoding a base mesh within a bitstream (S1910), and/or a step of decoding a displacement within a bitstream (S1920), and/or a step of decoding an attribute within a bitstream (S1930).
- the decoding method of Fig. 19 can be performed as the reverse process of the encoding method of Fig. 18.
- the step (S1930) of decoding an attribute in a bitstream includes decoding an attribute related to the multi-attribute information based on a flag indicating whether the attribute in the bitstream is related to multi-attribute information associated with one object of mesh data. A first value of the flag may indicate that the attribute includes residual attribute information for the multi-attribute information, and a second value of the flag may indicate that the attribute does not include residual attribute information for the multi-attribute information.
- multi-attribute information may be data in which multiple attribute data related to the bounding box of one object are concatenated.
- the bitstream may include parameter set information
- the parameter set information may include information indicating a type of the attribute within the bitstream.
- a first value of the type of the attribute may indicate that the attribute includes residual attribute data derived between the multiple attribute information.
- v3c_parameter_set may include information about an attribute type, and a first value of the information about the attribute type may indicate that the attribute includes residual attribute information.
- Residual attribute information can represent attribute information composed of pixel-by-pixel color difference information of two different attribute information.
- Residual attribute information is composed of a vector (Red_diff, Green_diff, Blue_diff), whose values represent the differences in the red, green, and blue component values of the two attribute information, and all of them can have integer value types.
- decoding an attribute regarding multi-attribute information may include restoring the attribute by adding representative attribute information among the multi-attribute information and residual attribute information with the representative attribute information.
- decoding an attribute regarding multi-attribute information may also include receiving information representing a representative attribute among the multi-attribute information, and restoring the attribute based on the information representing the representative attribute.
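The restoration step can be sketched as follows: each attribute is recovered by adding its residual to the representative attribute identified by the signaled index. The function name, the dictionary layout, and the clipping to the valid sample range are assumptions made for the example; the text specifies only the addition.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B)
Texture = List[Pixel]          # one texture map flattened to a pixel list


def reconstruct_textures(representative: Texture,
                         residuals: Dict[int, Texture],
                         dominant_texture_map: int = 0,
                         max_value: int = 255) -> Dict[int, Texture]:
    """Restore every texture map from the representative and its residuals.

    restored[i] = representative + residual[i], clipped to [0, max_value]
    (the clipping is an assumption guarding against codec rounding errors).
    """
    restored = {dominant_texture_map: list(representative)}
    for index, residual in residuals.items():
        restored[index] = [
            tuple(min(max(r + d, 0), max_value) for r, d in zip(rep_px, diff_px))
            for rep_px, diff_px in zip(representative, residual)
        ]
    return restored


representative = [(100, 120, 130), (10, 20, 30)]
residuals = {1: [(2, -2, 0), (2, 0, -5)], 2: [(-10, 0, 10), (0, 5, 0)]}
restored = reconstruct_textures(representative, residuals)
```

When the first texture map is always the representative, dominant_texture_map can stay at its default and needs no signaling, which matches the preset case described earlier.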
- referring also to FIG. 1 or FIG. 2, the decoding method can be performed by a decoding device.
- the decoding device includes a memory; and at least one processor connected to the memory; and the at least one processor can be configured to: decode a base mesh within a bitstream; decode a displacement within the bitstream; and decode an attribute within the bitstream.
- the embodiments propose a method that addresses the problems of existing compression methods for dynamic mesh data having a plurality of attribute information and achieves more efficient compression performance.
- the embodiments do not compress the original data of a plurality of attribute information, i.e., the original color values, as they are, but select representative attribute information (texture map) and calculate the pixel-by-pixel color value difference with it to regenerate residual texture data containing residual signal data.
- each drawing has been described separately, but it is also possible to design a new embodiment by combining the embodiments described in each drawing.
- designing a computer-readable recording medium on which a program for executing the previously described embodiments is recorded, as needed by a person skilled in the art, also falls within the scope of the embodiments.
- the devices and methods according to the embodiments are not limited to the configurations and methods of the embodiments described above, but the embodiments may be configured by selectively combining all or part of the embodiments so that various modifications can be made.
- the various components of the devices of the embodiments may be implemented by hardware, software, firmware, or a combination thereof.
- the various components of the embodiments may be implemented by a single chip, for example, a single hardware circuit.
- the components according to the embodiments may be implemented by separate chips.
- at least one of the components of the devices of the embodiments may be configured with one or more processors capable of executing one or more programs, and the one or more programs may perform, or include instructions for performing, one or more of the operations/methods according to the embodiments.
- the executable instructions for performing the methods/operations of the devices of the embodiments may be stored in non-transitory CRMs or other computer program products configured to be executed by one or more processors, or may be stored in transitory CRMs or other computer program products configured to be executed by one or more processors.
- the memory according to the embodiments may be used as a concept including not only volatile memory (e.g., RAM, etc.), but also non-volatile memory, flash memory, PROM, etc. Additionally, it may include implementations in the form of carrier waves, such as transmissions via the Internet.
- processor-readable recording media may be distributed across network-connected computer systems, allowing processor-readable code to be stored and executed in a distributed manner.
- the operations according to the embodiments described in this document may be performed by a transceiver device including a memory and/or a processor according to the embodiments.
- the memory may store programs for processing/controlling the operations according to the embodiments, and the processor may control various operations described in this document.
- the processor may be referred to as a controller, etc.
- the operations according to the embodiments may be performed by firmware, software, and/or a combination thereof, and the firmware, software, and/or a combination thereof may be stored in the processor or in the memory.
- the transmitting/receiving device may include a transmitting/receiving unit for transmitting and receiving media data, a memory for storing instructions (program code, algorithm, flowchart, and/or data) for a process according to the embodiments, and a processor for controlling the operations of the transmitting/receiving device.
- the processor may be referred to as a controller or the like, and may correspond to, for example, hardware, software, and/or a combination thereof.
- the operations according to the above-described embodiments may be performed by the processor.
- the processor may be implemented as an encoder/decoder or the like for the operations of the above-described embodiments.
- the embodiments may be applied in whole or in part to a point cloud data transmission and reception device and system.
- Embodiments may include modifications/changes, which do not depart from the scope of the claims and their equivalents.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computer Graphics (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
A decoding method according to embodiments may comprise the steps of: decoding a base mesh in a bitstream; decoding a displacement in the bitstream; and decoding an attribute in the bitstream. An encoding method according to embodiments may comprise the steps of: encoding a base mesh of mesh data; encoding a displacement of the mesh data; and encoding an attribute of the mesh data.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR10-2024-0082896 | 2024-06-25 | | |
| KR20240082896 | 2024-06-25 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2026005485A1 (fr) | 2026-01-02 |
Family
ID=98222368
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/KR2025/008911 Pending WO2026005485A1 (fr) | 2024-06-25 | 2025-06-25 | Mesh data encoding device, mesh data encoding method, mesh data decoding device, and mesh data decoding method |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2026005485A1 (fr) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2022186675A1 (fr) * | 2021-03-05 | 2022-09-09 | 엘지전자 주식회사 | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |
| WO2024063811A1 (fr) * | 2022-09-22 | 2024-03-28 | Tencent America LLC | Merging of multiple attribute maps |
| WO2024085654A1 (fr) * | 2022-10-19 | 2024-04-25 | Samsung Electronics Co., Ltd. | Packing of displacement data in video frames for dynamic mesh coding |
| US20240153150A1 (en) * | 2022-10-26 | 2024-05-09 | Apple Inc. | Mesh Compression Texture Coordinate Signaling and Decoding |
| WO2024123039A1 (fr) * | 2022-12-05 | 2024-06-13 | 엘지전자 주식회사 | 3D data transmission apparatus, 3D data transmission method, 3D data reception apparatus, and 3D data reception method |
- 2025-06-25: WO application PCT/KR2025/008911 (WO2026005485A1), active, pending
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2020190114A1 (fr) | | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |
| WO2020189895A1 (fr) | | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |
| WO2024063544A1 (fr) | | 3D data transmission device, 3D data transmission method, 3D data reception device, and 3D data reception method |
| WO2020190090A1 (fr) | | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |
| WO2020189982A1 (fr) | | Device and method for processing point cloud data |
| WO2024049197A1 (fr) | | 3D data transmission device, 3D data transmission method, 3D data reception device, and 3D data reception method |
| WO2021025392A1 (fr) | | Device and method for processing point cloud data |
| WO2022050688A1 (fr) | | Three-dimensional data transmission device, three-dimensional data transmission method, three-dimensional data reception device, and three-dimensional data reception method |
| WO2024043659A1 (fr) | | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |
| WO2024185940A1 (fr) | | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |
| WO2021101066A1 (fr) | | Image coding method based on entry point-related information in a video or image coding system |
| WO2024191192A1 (fr) | | Mesh data transmission device, mesh data transmission method, mesh data reception device, and mesh data reception method |
| WO2026005485A1 (fr) | | Mesh data encoding device, mesh data encoding method, mesh data decoding device, and mesh data decoding method |
| WO2023132605A1 (fr) | | Transmission device for point cloud data, method performed by the transmission device, reception device for point cloud data, and method performed by the reception device |
| WO2025048566A1 (fr) | | Mesh data transmission device, mesh data transmission method, mesh data reception device, and mesh data reception method |
| WO2025048473A1 (fr) | | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |
| WO2021118076A1 (fr) | | Image coding method based on entry point-related information in a video or image coding system |
| WO2021002562A1 (fr) | | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |
| WO2025220986A1 (fr) | | Mesh data transmission device, mesh data transmission method, mesh data reception device, and mesh data reception method |
| WO2025216593A1 (fr) | | Mesh data encoding device, mesh data encoding method, mesh data decoding device, and mesh data decoding method |
| WO2025009868A1 (fr) | | Device and method for transmitting and receiving mesh data |
| WO2026014975A1 (fr) | | Mesh data encoding device, mesh data encoding method, mesh data decoding device, and mesh data decoding method |
| WO2026010463A1 (fr) | | Mesh data transmission device, mesh data transmission method, mesh data reception device, and mesh data reception method |
| WO2026005484A1 (fr) | | Mesh data transmission device, mesh data transmission method, mesh data reception device, and mesh data reception method |
| WO2026084533A1 (fr) | | Mesh data transmission device, mesh data transmission method, mesh data reception device, and mesh data reception method |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 25827055; Country of ref document: EP; Kind code of ref document: A1 |