WO2019198521A1 - Image processing apparatus and method - Google Patents
Image processing apparatus and method
- Publication number
- WO2019198521A1 PCT/JP2019/013535
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- unit
- information
- projection direction
- decoding
- image processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/46—Embedding additional information in the video signal during the compression process
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/08—Projecting images onto non-planar surfaces, e.g. geodetic screens
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/174—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a slice, e.g. a line of blocks or a group of blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/184—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being bits, e.g. of the compressed video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
- H04N19/37—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability with arrangements for assigning different transmission priorities to video input data or to video coded data
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
Definitions
- The present disclosure relates to an image processing apparatus and method, and more particularly to an image processing apparatus and method that make it possible to decode encoded data of 3D data more easily.
- Conventionally, one method for encoding 3D data representing a three-dimensional structure, such as a point cloud, has been encoding using voxels, such as an Octree (see, for example, Non-Patent Document 1).
- The present disclosure has been made in view of such a situation, and makes it possible to decode encoded data of 3D data more easily.
- An image processing apparatus includes a bitstream generation unit that generates a bitstream including projection direction information, which is information related to the projection direction of position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, and encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane.
- An image processing method generates a bitstream including projection direction information, which is information related to the projection direction of position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, and encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane.
- An image processing apparatus includes a decoding unit that, based on projection direction information, which is information related to the projection direction of position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, decodes a bitstream including encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane.
- An image processing method decodes, based on projection direction information, which is information related to the projection direction of position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, a bitstream including encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane.
- An image processing apparatus includes a packing unit that arranges and packs a geometry image, obtained by projecting position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, in an encoding unit corresponding to the projection direction of the position information onto the two-dimensional plane.
- An image processing method arranges and packs a geometry image, obtained by projecting position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, in an encoding unit corresponding to the projection direction of the position information onto the two-dimensional plane.
- An image processing apparatus includes a decoding unit that decodes, from the encoding unit of a bitstream that corresponds to the projection direction of position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, the geometry image projected onto the two-dimensional plane in that projection direction.
- An image processing method decodes, from the encoding unit of a bitstream that corresponds to the projection direction of position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, the geometry image projected onto the two-dimensional plane in that projection direction.
- A bitstream is generated that includes projection direction information, which is information related to the projection direction of position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, and encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane.
- Based on projection direction information, which is information related to the projection direction of position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, a bitstream including encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane is decoded.
- A geometry image obtained by projecting position information of 3D data representing a three-dimensional structure onto a two-dimensional plane is packed by being arranged in an encoding unit corresponding to the projection direction of the position information onto the two-dimensional plane.
- An image can be processed.
- In particular, encoded data of 3D data can be decoded more easily.
- FIG. 20 is a block diagram illustrating a main configuration example of a computer.
- <Video-based approach> <Documents that support technical contents and technical terms>
- the scope disclosed in the present technology includes not only the contents described in the embodiments but also the contents described in the following non-patent documents that are known at the time of filing.
- Non-Patent Document 1: (above)
- Non-Patent Document 2: TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU (International Telecommunication Union), "Advanced video coding for generic audiovisual services", H.264, 04/2017
- Non-Patent Document 3: TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU (International Telecommunication Union), "High efficiency video coding", H.265, 12/2016
- Non-Patent Document 4: Jianle Chen, Elena Alshina, Gary J.
- <Point cloud> Conventionally, there have been data such as point clouds, which represent a three-dimensional structure by the position information and attribute information of a group of points, and meshes, which are composed of vertices, edges, and faces and define a three-dimensional shape using a polygonal representation.
- For example, in the case of a point cloud, a three-dimensional structure such as that shown in A of FIG. 1 is expressed as a set of a large number of points (a point group) as shown in B of FIG. 1. That is, point cloud data consists of the position information and attribute information (for example, color) of each point in the point group. The data structure is therefore relatively simple, and an arbitrary three-dimensional structure can be expressed with sufficient accuracy by using a sufficiently large number of points.
- In the video-based approach, an input point cloud is divided into a plurality of segmentations (also referred to as regions), and each region is projected onto a two-dimensional plane.
- The data for each position of the point cloud (that is, the data of each point) is composed of position information (Geometry, also referred to as Depth) and attribute information (Texture) as described above, and each of these is projected onto the two-dimensional plane.
- Each segmentation (also referred to as a patch) projected onto the two-dimensional plane is arranged in a two-dimensional image.
- The two-dimensional image is then encoded by an encoding method for two-dimensional plane images, such as AVC (Advanced Video Coding) or HEVC (High Efficiency Video Coding).
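As a rough illustration of the projection step described above, the following Python sketch projects the points of one region along a coordinate axis and keeps the nearest depth value for each pixel. The function name `project_to_depth_patch`, the dictionary-based image, and the rule of keeping the smallest depth are assumptions made for this example, not details taken from the disclosure.

```python
def project_to_depth_patch(points, axis=2):
    """Project integer (x, y, z) points along `axis` onto the plane
    spanned by the other two axes, keeping the nearest depth per pixel."""
    u_axis, v_axis = [a for a in range(3) if a != axis]
    depth = {}
    for p in points:
        pixel = (p[u_axis], p[v_axis])
        d = p[axis]
        # Assumption: a smaller coordinate along the axis is nearer to the plane.
        if pixel not in depth or d < depth[pixel]:
            depth[pixel] = d
    return depth


# Hypothetical region with two points that fall on the same pixel.
region = [(0, 0, 5), (0, 0, 2), (1, 0, 7)]
patch = project_to_depth_patch(region)
print(patch)
```

In this sketch the point at depth 5 is hidden behind the point at depth 2, so it is lost from the patch. Projecting the same region along another axis keeps all three points.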
- <Transmission of projection direction information> Therefore, a bitstream is generated that includes projection direction information, which is information related to the projection direction of the position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, and encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane.
- For example, an image processing apparatus is provided with a bitstream generation unit that generates a bitstream including projection direction information, which is information related to the projection direction of position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, and encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane.
- Because the projection direction can thus be set arbitrarily, a patch with a more appropriate projection direction, chosen for example according to the view direction, can be generated and encoded.
- As a result, the encoded data of the 3D data can be decoded more easily on the decoding side. For example, the occurrence of occlusion and an increase in the number of patches can be suppressed, and so an increase in the load of decoding processing (for example, processing amount, processing time, and amount of data to be processed) can be suppressed.
- Also, based on projection direction information, which is information related to the projection direction of the position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, a bitstream including encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane is decoded.
- For example, an image processing apparatus is provided with a decoding unit that decodes, based on the projection direction information, a bitstream including encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane.
- Also, a geometry image obtained by projecting position information of 3D data representing a three-dimensional structure onto a two-dimensional plane is arranged and packed in an encoding unit corresponding to the projection direction of the position information onto the two-dimensional plane.
- For example, an image processing apparatus is provided with a packing unit that arranges and packs a geometry image, obtained by projecting position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, in an encoding unit corresponding to the projection direction of the position information onto the two-dimensional plane.
- Also, a geometry image projected onto the two-dimensional plane in a given projection direction is decoded from the encoding unit of the bitstream that corresponds to that projection direction of the position information of 3D data representing a three-dimensional structure.
- For example, an image processing apparatus is provided with a decoding unit that decodes, from the encoding unit of the bitstream that corresponds to the projection direction of position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, the geometry image projected onto the two-dimensional plane in that projection direction.
- In this way, encoded data of 3D data can be decoded more easily, and an increase in decoding processing load (for example, processing amount, processing time, and amount of data to be processed) can be suppressed.
- The projection direction indicates the angle at which 3D data (for example, a point cloud) is projected onto a two-dimensional plane in the video-based approach, that is, the direction and position (distance) of the two-dimensional plane as viewed from the 3D data.
- For example, the projection direction (direction and position) may be expressed using spherical coordinates (r, θ, φ).
- This projection direction may be one of the same orthogonal coordinate directions as before (0 degrees, 90 degrees, 180 degrees, 270 degrees), or may be a new direction and position other than these orthogonal coordinate directions.
- In the conventional case, the projection directions are only the orthogonal coordinate directions, as shown in the table of FIG. 4B, and are predetermined (cannot be set).
- By contrast, a projection direction other than the orthogonal coordinate directions can be set here, as shown for example in the table of FIG. 4C.
- Projection direction information, which is information about the projection direction, is generated and transmitted.
- The projection direction information is information indicating the correspondence between an identifier and a projection direction. That is, the projection direction information is information for assigning an identifier to each projection direction that has been set.
- FIG. 4C shows an example of the projection direction information.
- In this example, a projection direction (direction spherical coordinate φ, direction spherical coordinate θ, distance r) is associated with each projection direction identifier (projection direction index). Note that the same projection direction may be associated with a plurality of identifiers.
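The identifier-to-direction table described above can be sketched in Python as follows. This is only an illustrative sketch: the class name, the angle convention (φ as azimuth, θ as polar angle measured from the +z axis), and all table values are assumptions for the example and are not taken from FIG. 4C.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectionDirection:
    phi_deg: float    # azimuth angle in degrees (assumed convention)
    theta_deg: float  # polar angle from the +z axis in degrees (assumed convention)
    r: float          # distance of the projection plane from the 3D data


# Hypothetical projection direction information: index -> direction.
PROJECTION_DIRECTIONS = {
    0: ProjectionDirection(0.0, 90.0, 1.0),
    1: ProjectionDirection(90.0, 90.0, 1.0),
    2: ProjectionDirection(45.0, 60.0, 2.0),
    3: ProjectionDirection(45.0, 60.0, 2.0),  # same direction, second identifier
}


def to_cartesian(direction):
    """Convert (phi, theta, r) to an (x, y, z) position of the projection plane."""
    phi = math.radians(direction.phi_deg)
    theta = math.radians(direction.theta_deg)
    return (
        direction.r * math.sin(theta) * math.cos(phi),
        direction.r * math.sin(theta) * math.sin(phi),
        direction.r * math.cos(theta),
    )


print(to_cartesian(PROJECTION_DIRECTIONS[0]))
```

Indexes 2 and 3 deliberately share one direction, mirroring the note that the same projection direction may be associated with a plurality of identifiers.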
- For example, a direction or position that is likely to become the view direction can be set as a projection direction, or conversely, a direction or position that is unlikely to become the view direction can be excluded from the projection directions. In this way, the projection directions can be set so as to suppress the occurrence of occlusion, and a reduction in reproducibility can be suppressed.
- Also, because the projection directions can be set so as to suppress an increase in patch division, an increase in the amount of smoothing processing can also be suppressed.
- A decoding order (decoding priority) may also be set for each projection direction.
- In that case, a number indicating the decoding order (decoding priority) is assigned to the identifier (projection direction index) of each projection direction. Note that the same decoding order may be assigned to a plurality of identifiers.
- More diverse decoding methods can be realized by controlling the decoding order of each patch using such information. For example, patches in more important projection directions can be decoded earlier.
- A decoding method also becomes possible in which, with reference to the decoding order, patches are decoded preferentially starting from those whose projection direction is close to the requested view direction. As a result, the image in the requested view direction can be restored (displayed) more quickly.
- A decoding method is also possible in which decoding of patches in unimportant projection directions is omitted depending on the load situation or the like.
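A minimal sketch of how a decoder might use such a decoding order is shown below. The table values, the function name `schedule_patches`, and the convention that a smaller number is decoded earlier are all assumptions for illustration.

```python
# Hypothetical attribute information: projection direction index -> decoding order.
# Assumption: a smaller number means the direction is decoded earlier.
DECODE_ORDER = {0: 1, 1: 3, 2: 2, 3: 1}


def schedule_patches(patches, decode_order, max_order=None):
    """Return patches sorted by the decoding order of their projection direction.

    `patches` is a list of (patch_id, projection_index) pairs. If `max_order`
    is given, patches whose decoding order is larger are skipped, which models
    omitting unimportant directions under load.
    """
    kept = [
        p for p in patches
        if max_order is None or decode_order[p[1]] <= max_order
    ]
    # sorted() is stable, so patches with equal priority keep their input order.
    return sorted(kept, key=lambda p: decode_order[p[1]])


patches = [("a", 1), ("b", 0), ("c", 2), ("d", 3)]
print(schedule_patches(patches, DECODE_ORDER))
print(schedule_patches(patches, DECODE_ORDER, max_order=2))
```

Indexes 0 and 3 share decoding order 1 here, reflecting the note that the same decoding order may be assigned to a plurality of identifiers.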
- semantic information indicating characteristics (meaning) in the three-dimensional structure may be included as attribute information.
- a projection direction for projecting a portion having a predetermined meaning in the three-dimensional structure may be set, and semantic information indicating the meaning may be added as attribute information in association with the projection direction.
- For example, a dedicated projection direction for projecting a person's “face” may be set, and semantic information such as “Face” may be added to that projection direction.
- For example, the semantic information “Face1” is added to the projection directions whose projection direction indexes are “10” and “11”.
- In this case, the patches relating to “Face1” can be decoded by decoding the patches in those projection directions.
- a plurality of attribute information may be added to one projection direction index.
- a plurality of pieces of semantic information may be added to one projection direction index.
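The selection of patches by semantic information can be sketched as a simple lookup. In the Python sketch below, the entries for indexes 10 and 11 follow the “Face1” example above, while index 9, the labels “Body” and “Head”, and the function names are hypothetical.

```python
# Attribute information: projection direction index -> set of semantic labels.
# Indexes 10 and 11 follow the "Face1" example; everything else is hypothetical.
SEMANTICS = {
    9: {"Body"},
    10: {"Face1"},
    11: {"Face1", "Head"},  # several labels on one projection direction index
}


def indexes_for(semantic, table=SEMANTICS):
    """Return the projection direction indexes tagged with `semantic`."""
    return sorted(i for i, labels in table.items() if semantic in labels)


def patches_for(semantic, patches, table=SEMANTICS):
    """Keep only the (patch_id, projection_index) pairs tagged with `semantic`."""
    wanted = set(indexes_for(semantic, table))
    return [p for p in patches if p[1] in wanted]


all_patches = [("p0", 9), ("p1", 10), ("p2", 11)]
print(indexes_for("Face1"))
print(patches_for("Face1", all_patches))
```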
- encoding unit information that is information regarding an encoding unit in which a patch is arranged for each projection direction is generated and transmitted.
- the encoding unit information is information indicating a correspondence relationship between the above-described identifier of the projection direction and the encoding unit in which the patch projected in the projection direction is arranged. That is, the coding unit information is information indicating the correspondence between each projection direction and the coding unit to be used.
- An encoding unit is a data unit that can be encoded and decoded independently.
- a specific data unit of the encoding unit is not particularly limited, and may be a slice, a tile, a picture, or the like, for example.
- An example of patch arrangement in the conventional case is shown in FIG. 5A.
- In this example, a plurality of patches 32 are arranged in a picture 31.
- In FIG. 5A, only one patch is given a reference numeral, but each of the figures in the picture 31 is a patch 32.
- The number shown in each patch 32 indicates its projection direction (that is, its projection direction index).
- Conventionally, each patch 32 is arranged in the picture 31 without considering the decoding order or the like. Therefore, as in the table shown in FIG. 5B, the same frame (frame index “0”) is assigned to every projection direction index (that is, the frame cannot be set).
- encoding units (frames, slices, tiles, etc.) in which patches are arranged can be set for each projection direction.
- For example, a frame index for identifying a frame, a slice index for identifying a slice, and a tile index for identifying a tile are associated with each projection direction index.
- For example, the patch with projection direction index “0” is arranged in the slice with slice index “1” of the frame with frame index “0”.
- Similarly, the patch with projection direction index “6” is arranged in the slice with slice index “0” of the frame with frame index “0”.
- A plurality of encoding units may be assigned to one projection direction, as for projection direction index “5” in the table of FIG. 6, or a plurality of projection directions may be assigned to the same encoding unit, as for projection direction indexes “1” and “2”.
- each patch can be arranged in a coding unit corresponding to the projection direction. For example, as shown in FIG. 7, each patch 32 in FIG. 5A can be divided into slices 51 to 57 for each projection index.
- decoding of patches in unnecessary projection directions can be omitted (that is, only some patches can be decoded) (partial decoding can be realized). It is also possible to preferentially decode important patches (control the decoding order of patches). That is, encoded data of 3D data can be more easily decoded, and an increase in decoding processing load (for example, processing amount, processing time, data amount to be processed, etc.) can be suppressed. Even when various decoding methods are realized, it is possible to decode a part of necessary data, so that an increase in the load of the decoding process can be suppressed.
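To make the partial decoding idea concrete, the following Python sketch looks up which encoding units have to be decoded for a set of wanted projection directions. The entries for indexes 0 and 6 follow the text; the slice numbers for indexes 1, 2, and 5, and the function name `units_to_decode`, are hypothetical and are not taken from FIG. 6.

```python
# Encoding unit information: projection direction index -> list of (frame, slice).
# Indexes 0 and 6 follow the text; the other slice numbers are hypothetical.
ENCODING_UNITS = {
    0: [(0, 1)],
    1: [(0, 2)],
    2: [(0, 2)],          # shares an encoding unit with index 1
    5: [(0, 5), (0, 6)],  # one projection direction spread over two units
    6: [(0, 0)],
}


def units_to_decode(wanted_indexes, table=ENCODING_UNITS):
    """Return the sorted, de-duplicated encoding units needed for the wanted
    projection directions; all other units can be skipped by the decoder."""
    units = set()
    for index in wanted_indexes:
        units.update(table[index])
    return sorted(units)


print(units_to_decode([1, 2]))
print(units_to_decode([0, 5]))
```

Because each encoding unit can be decoded independently, the decoder in this sketch only needs to touch the returned units.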
- patch placement control is performed using slices, but placement control may be performed using tiles instead of slices. Further, the arrangement control may be performed using both slices and tiles.
- the coding unit in such patch arrangement control may be hierarchized. That is, for example, the arrangement control of a plurality of hierarchies may be performed like a frame and a slice. For example, in the example of FIG. 6, patches may be arranged in frames other than the frame index “0”.
- the attribute information regarding decoding may include semantic information (for example, “Face”) indicating characteristics (meaning) in the three-dimensional structure. By doing so, it becomes possible to select a coding unit to be decoded based on the semantic information.
- each patch and its projection direction are associated with each other using the projection direction identifier described above. That is, an identifier of the projection direction is assigned to each patch. Thereby, various information such as the projection direction, attribute information, and encoding unit can be associated with each patch.
- Encoding of patches in projection directions that are unnecessary for creating a 2D image may be omitted.
- For example, a patch in a projection direction that projects the sole of a foot is clearly unlikely to be decoded. Therefore, by omitting the encoding of patches in such a projection direction, an increase in code amount can be suppressed, and a decrease in encoding efficiency can be suppressed.
- The method of setting such an unnecessary projection direction is arbitrary. For example, it may be set based on arbitrary information or the like.
- each patch is arranged in an encoding unit corresponding to the projection direction of the patch.
- each patch is arranged according to the coding unit information.
- The decoding order of the data (patches) for the default view direction may be given priority. Here, the default view direction is the initial value of the view direction that is set when the user does not specify the view direction to be displayed. In this way, the image in the default view direction can be restored (displayed) earlier.
- the quality setting may be controlled according to the display frequency of the patch.
- Each patch is arranged in an encoding unit controlled for each projection direction. Since each encoding unit can be encoded and decoded independently, the quality can be controlled for each encoding unit. That is, the quality of a patch can be controlled according to its projection direction. For example, when the displayed view directions are biased, the display frequency of patches is also biased: patches used to generate images in frequently displayed view directions are displayed more often. Naturally, improving the image quality of patches with a higher display frequency, rather than that of patches with a lower display frequency, improves subjective image quality for a given code amount. In other words, by setting the encoding quality according to such a bias, a decrease in encoding efficiency can be suppressed.
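One possible way to turn display frequency into a per-unit quality setting is sketched below in Python. The disclosure does not say how quality is set. Using a quantization parameter (QP, where a lower value gives higher quality in AVC and HEVC), the linear mapping, the default values 22 and 37, and the function name `assign_qp` are all assumptions made for this example.

```python
def assign_qp(display_frequency, qp_best=22, qp_worst=37):
    """Map each encoding unit's display frequency to a quantization parameter.

    The most frequently displayed unit gets `qp_best` (highest quality), the
    least frequently displayed unit gets `qp_worst`, with linear interpolation
    in between. Lower QP means higher quality.
    """
    lowest = min(display_frequency.values())
    highest = max(display_frequency.values())
    span = highest - lowest
    qps = {}
    for unit, freq in display_frequency.items():
        # If every unit is displayed equally often, treat all as top priority.
        weight = 1.0 if span == 0 else (freq - lowest) / span
        qps[unit] = round(qp_worst - weight * (qp_worst - qp_best))
    return qps


# Hypothetical display counts per slice.
print(assign_qp({"slice0": 100, "slice1": 60, "slice2": 0}))
```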
- the decoding method can be controlled based on the above-described projection direction information, coding unit information, and the like.
- the decoding may be performed in the order of projection directions close to the designated view direction.
- By referring to the projection direction information, the projection direction of each patch can be grasped more easily, so such decoding control can be realized more easily.
- partial decoding may be performed according to the view direction. Since the encoding unit in which each patch is arranged is controlled for each projection direction, a part of patches can be decoded (partial decoding can be realized) based on the projection direction. Also, by referring to the encoding unit information, it is possible to more easily grasp the encoding unit in which the patch in the desired projection direction is arranged. Furthermore, the projection direction of each patch can be more easily grasped by referring to the projection direction information. Therefore, such decoding control can be realized more easily.
- decoding of a patch in the projection direction opposite to the View direction may be omitted.
- A patch in the projection direction opposite to the view direction does not contribute to generation of an image in the view direction. Therefore, by using partial decoding according to the view direction to omit decoding of the patches in the projection direction opposite to the view direction, decoding of unnecessary information can be omitted. That is, an increase in decoding load can be suppressed.
- the view direction is set as indicated by a thick arrow with respect to four projection directions id0 to id3.
- The decoding order of patches in each projection direction may be, for example, ascending order of the inner product of each projection direction and the view direction.
- In that case, the patches are decoded in the order of id0, id3, id1, and id2.
- Alternatively, the decoding order of patches in each projection direction may be as follows.
- 1. Patches whose inner product is negative are decoded, in ascending order of absolute value. 2. Patches whose inner product is 0 are decoded. 3. Patches whose inner product is positive are decoded, in ascending order of absolute value.
- In that case, the patches are decoded in the order of id3, id0, id1, and id2.
- Alternatively, patches may be decoded in ascending order of the inner product, with decoding omitted for patches whose inner product is not negative. In that case, the patches are decoded in the order of id0 and id3.
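The three orderings above can be checked with a short Python sketch. The vectors below are hypothetical values chosen so that the results match the orders stated in the text (the actual directions of the figure are not reproduced here), and the function names are illustrative.

```python
def dot(a, b):
    """Inner product of two vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def order_ascending(directions, view):
    """Ascending order of the inner product with the view direction."""
    return sorted(directions, key=lambda i: dot(directions[i], view))


def order_by_sign(directions, view):
    """Negative inner products first (small absolute value first),
    then zero, then positive (small absolute value first)."""
    p = {i: dot(d, view) for i, d in directions.items()}
    negative = sorted((i for i in p if p[i] < 0), key=lambda i: abs(p[i]))
    zero = [i for i in p if p[i] == 0]
    positive = sorted((i for i in p if p[i] > 0), key=lambda i: abs(p[i]))
    return negative + zero + positive


def order_negative_only(directions, view):
    """Ascending order, skipping directions whose inner product is not negative."""
    return [i for i in order_ascending(directions, view)
            if dot(directions[i], view) < 0]


# Hypothetical 2D projection directions and view direction.
DIRECTIONS = {"id0": (1, 0), "id1": (0, 1), "id2": (-1, 0), "id3": (0, -1)}
VIEW = (-4, 3)

print(order_ascending(DIRECTIONS, VIEW))      # ['id0', 'id3', 'id1', 'id2']
print(order_by_sign(DIRECTIONS, VIEW))        # ['id3', 'id0', 'id1', 'id2']
print(order_negative_only(DIRECTIONS, VIEW))  # ['id0', 'id3']
```

With these vectors the inner products are -4, 3, 4, and -3 for id0 to id3, so only id0 and id3 are decoded by the third rule.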
- The 3D data from which the patches described above are generated may be position information (Geometry) indicating the position of each point, or may be attribute information (Texture), such as color information, added to the position information.
- FIG. 9 is a block diagram illustrating an example of a configuration of an encoding device that is an aspect of an image processing device to which the present technology is applied.
- The encoding device 100 shown in FIG. 9 is a device that projects 3D data such as a point cloud onto a two-dimensional plane and encodes it using an encoding method for two-dimensional images (an encoding device to which the video-based approach is applied).
- Note that FIG. 9 shows the main components, such as processing units and data flows, and does not necessarily show everything. That is, in the encoding device 100, there may be a processing unit that is not shown as a block in FIG. 9, or there may be a process or data flow that is not shown as an arrow or the like in FIG. 9. The same applies to the other diagrams illustrating the processing units and the like in the encoding device 100.
- As shown in FIG. 9, the encoding device 100 includes a patch decomposition unit 111, a metadata generation unit 112, a packing unit 113, an auxiliary patch information compressing unit 114, a video encoding unit 115, a video encoding unit 116, an OMap encoding unit 117, and a multiplexer 118.
- The patch decomposition unit 111 performs processing related to decomposition of 3D data. For example, the patch decomposition unit 111 acquires the 3D data (for example, a point cloud) representing a three-dimensional structure and the information on the view direction (View Info) that are input to the encoding device 100. The patch decomposition unit 111 also decomposes the acquired 3D data into a plurality of segmentations, and projects the 3D data onto a two-dimensional plane for each segmentation to generate patches. At that time, the patch decomposition unit 111 acquires projection direction information from the metadata generation unit 112, and assigns an identifier (projection direction index) to each patch based on that information.
- the patch decomposition unit 111 supplies information about each generated patch to the packing unit 113. Further, the patch decomposition unit 111 supplies auxiliary patch information, which is information relating to the decomposition, to the auxiliary patch information compression unit 114. Further, the patch decomposition unit 111 supplies information related to the projection direction and the like used when generating the patches to the metadata generation unit 112.
- the metadata generation unit 112 performs processing related to generation of metadata. For example, the metadata generation unit 112 acquires information regarding the projection direction and the like supplied from the patch decomposition unit 111. The metadata generation unit 112 generates projection direction information and encoding unit information based on the information. The metadata generation unit 112 supplies the generated projection direction information to, for example, the patch decomposition unit 111, the packing unit 113, the video encoding unit 115, the video encoding unit 116, the OMap encoding unit 117, and the multiplexer 118. Also, the metadata generation unit 112 supplies the generated encoding unit information to, for example, the packing unit 113, the video encoding unit 115, the video encoding unit 116, the OMap encoding unit 117, and the multiplexer 118.
- the packing unit 113 performs processing related to data packing. For example, the packing unit 113 acquires the two-dimensional plane data (patches) onto which the 3D data is projected for each region, supplied from the patch decomposition unit 111. The packing unit 113 packs each acquired patch as a video frame. For example, the packing unit 113 arranges patches of position information (Geometry) indicating the positions of points in a two-dimensional image, arranges patches of attribute information (Texture) such as color information added to the position information in another two-dimensional image, and packs each of these two-dimensional images as a video frame.
- the packing unit 113 performs packing based on the projection direction information and the coding unit information supplied from the metadata generation unit 112. That is, as described above, the packing unit 113 controls the encoding unit (frame, slice, tile, etc.) in which each patch is arranged according to the projection direction. That is, the packing unit 113 arranges each patch in an encoding unit corresponding to the projection direction.
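- the arrangement rule described here can be sketched as follows. This is only an illustration: the dictionary layout, the names `coding_unit_info`, `patches`, and `arrange_patches`, and the slice labels are assumptions introduced for the example and do not come from the specification.

```python
from collections import defaultdict

# Hypothetical coding unit information: projection direction index -> coding unit
# (for example, a slice). The values are illustrative only.
coding_unit_info = {0: "slice0", 1: "slice0", 2: "slice1", 3: "slice2"}

# Hypothetical patches, each carrying the projection direction index assigned to it.
patches = [
    {"patch_id": 10, "direction_index": 0},
    {"patch_id": 11, "direction_index": 2},
    {"patch_id": 12, "direction_index": 1},
    {"patch_id": 13, "direction_index": 3},
    {"patch_id": 14, "direction_index": 2},
]


def arrange_patches(patches, coding_unit_info):
    """Group patch ids by the coding unit that corresponds to their projection direction."""
    layout = defaultdict(list)
    for patch in patches:
        unit = coding_unit_info[patch["direction_index"]]
        layout[unit].append(patch["patch_id"])
    return dict(layout)


layout = arrange_patches(patches, coding_unit_info)
```

- in this sketch, patches that share a coding unit end up together, so a decoder that only needs one projection direction could limit itself to the corresponding unit.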
- the packing unit 113 also generates an occupancy map (Occupancy Map) indicating the presence or absence of data at each position, and performs a Dilation process.
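- as a rough illustration of these two operations, the sketch below builds a binary occupancy map and runs one dilation pass over a small image. The specification only names a "Dilation process"; the neighbour-averaging fill used here, and the function names `build_occupancy_map` and `dilate_once`, are assumptions made for the example.

```python
def build_occupancy_map(width, height, occupied_pixels):
    """Return a height x width grid with 1 where a patch pixel exists and 0 elsewhere."""
    omap = [[0] * width for _ in range(height)]
    for x, y in occupied_pixels:
        omap[y][x] = 1
    return omap


def dilate_once(image, omap):
    """Fill each empty pixel with the mean of its occupied 4-neighbours (one pass).

    Assumption: this simple neighbour fill stands in for the Dilation process,
    whose details are not given in the text.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if omap[y][x]:
                continue  # occupied pixels keep their value
            neighbours = [
                image[ny][nx]
                for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1))
                if 0 <= nx < width and 0 <= ny < height and omap[ny][nx]
            ]
            if neighbours:
                out[y][x] = sum(neighbours) / len(neighbours)
    return out


omap = build_occupancy_map(3, 2, [(0, 0), (1, 0)])
geometry = [[10, 20, 0], [0, 0, 0]]
dilated = dilate_once(geometry, omap)
```

- filling empty pixels with values close to their occupied neighbours tends to smooth patch borders, which generally helps a two-dimensional video encoder.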
- the packing unit 113 supplies various processed data to a subsequent processing unit. For example, the packing unit 113 supplies a video frame of position information (Geometry) to the video encoding unit 115. For example, the packing unit 113 supplies a video frame of attribute information (Texture) to the video encoding unit 116. Further, for example, the packing unit 113 supplies an occupancy map to the OMap encoding unit 117. In addition, the packing unit 113 supplies control information regarding the packing to the multiplexer 118.
- the auxiliary patch information compression unit 114 performs processing related to compression of auxiliary patch information. For example, the auxiliary patch information compression unit 114 acquires data supplied from the patch decomposition unit 111. The auxiliary patch information compression unit 114 encodes (compresses) auxiliary patch information included in the acquired data. The auxiliary patch information compression unit 114 supplies the encoded data of the obtained auxiliary patch information to the multiplexer 118.
- the video encoding unit 115 performs processing related to encoding of a video frame of position information (Geometry). For example, the video encoding unit 115 acquires a video frame of position information (Geometry) supplied from the packing unit 113. Further, the video encoding unit 115 encodes the acquired video frame of the position information (Geometry) by an arbitrary two-dimensional image encoding method such as AVC or HEVC. The video encoding unit 115 supplies encoded data (encoded data of a video frame of position information (Geometry)) obtained by the encoding to the multiplexer 118.
- the video encoding unit 115 may perform encoding quality control based on the projection direction information and the coding unit information supplied from the metadata generation unit 112. For example, the video encoding unit 115 may control the quality (for example, the quantization parameter) of a video frame in accordance with the display frequency of the patches included in the video frame.
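- one possible form of such quality control is sketched below. The linear mapping, the QP bounds of 22 and 40, and the names `choose_qp` and `display_frequency` are assumptions chosen for illustration; the text does not specify how the quantization parameter is derived.

```python
def choose_qp(display_frequency, qp_min=22, qp_max=40):
    """Map a display frequency in [0, 1] to a quantization parameter.

    Frequently displayed coding units get a lower QP (higher quality).
    The linear mapping and the QP bounds are illustrative assumptions.
    """
    f = min(max(display_frequency, 0.0), 1.0)  # clamp to [0, 1]
    return round(qp_max - f * (qp_max - qp_min))


# Hypothetical display frequency per coding unit.
display_frequency = {"slice0": 0.9, "slice1": 0.5, "slice2": 0.0}
qp_per_unit = {unit: choose_qp(f) for unit, f in display_frequency.items()}
```

- with this mapping, a coding unit whose patches are displayed often is encoded at higher quality than one whose patches are rarely displayed.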
- the video encoding unit 116 performs processing related to encoding of a video frame of attribute information (Texture). For example, the video encoding unit 116 acquires a video frame of attribute information (Texture) supplied from the packing unit 113. Further, the video encoding unit 116 encodes the acquired video frame of the attribute information (Texture) by an arbitrary two-dimensional image encoding method such as AVC or HEVC. The video encoding unit 116 supplies the encoded data (the encoded data of the video frame of the attribute information (Texture)) obtained by the encoding to the multiplexer 118.
- the video encoding unit 116 may also perform encoding quality control based on the projection direction information and the coding unit information supplied from the metadata generation unit 112.
- the OMap encoding unit 117 performs processing related to encoding of an occupancy map indicating the presence / absence of data for each position. For example, the OMap encoding unit 117 acquires a video frame of an occupancy map supplied from the packing unit 113. The OMap encoding unit 117 encodes the acquired occupancy map by an arbitrary encoding method such as arithmetic encoding. The OMap encoding unit 117 supplies the encoded data of the occupancy map obtained by the encoding to the multiplexer 118.
- the OMap encoding unit 117 may also perform encoding quality control based on the projection direction information and the coding unit information supplied from the metadata generation unit 112, as in the case of the video encoding unit 115.
- the multiplexer 118 performs processing related to multiplexing. For example, the multiplexer 118 acquires encoded data of auxiliary patch information supplied from the auxiliary patch information compression unit 114. Further, the multiplexer 118 acquires control information related to packing supplied from the packing unit 113. Further, the multiplexer 118 acquires encoded data of a video frame of position information (Geometry) supplied from the video encoding unit 115. Further, the multiplexer 118 acquires encoded data of a video frame of attribute information (Texture) supplied from the video encoding unit 116. Further, the multiplexer 118 acquires encoded data of an occupancy map supplied from the OMap encoding unit 117.
- the multiplexer 118 acquires the projection direction information and the coding unit information supplied from the metadata generation unit 112.
- the multiplexer 118 multiplexes the acquired information and generates a bitstream.
- the multiplexer 118 outputs the generated bit stream to the outside of the encoding device 100.
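- the multiplexing step can be pictured with a toy container format. The layout below (a one-byte name length, the name, a four-byte payload length, then the payload) is a hypothetical format invented for this sketch; the actual bitstream syntax is not described in this passage. The `demultiplex` function is included only to show that the streams can be recovered.

```python
import struct


def multiplex(streams):
    """Concatenate named byte streams as (name length, name, payload length, payload)."""
    out = bytearray()
    for name, payload in streams.items():
        name_bytes = name.encode("utf-8")
        out += struct.pack(">B", len(name_bytes)) + name_bytes
        out += struct.pack(">I", len(payload)) + payload
    return bytes(out)


def demultiplex(bitstream):
    """Inverse of multiplex: split the byte string back into named streams."""
    streams = {}
    pos = 0
    while pos < len(bitstream):
        (name_len,) = struct.unpack_from(">B", bitstream, pos)
        pos += 1
        name = bitstream[pos:pos + name_len].decode("utf-8")
        pos += name_len
        (payload_len,) = struct.unpack_from(">I", bitstream, pos)
        pos += 4
        streams[name] = bitstream[pos:pos + payload_len]
        pos += payload_len
    return streams


# Hypothetical payloads standing in for the encoded data and the metadata.
streams = {"geometry": b"\x01\x02", "metadata": b"meta"}
bitstream = multiplex(streams)
```

- keeping the metadata as its own named stream lets a receiver read the projection direction information and coding unit information before touching any video data.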
- FIG. 10 is a block diagram illustrating a main configuration example of the patch decomposition unit 111 of FIG.
- the patch decomposition unit 111 in this case includes a normal estimation unit 151, a segmentation initial setting unit 152, a segmentation update unit 153, a projection direction setting unit 154, a two-dimensional projection unit 155, and an index assigning unit 156.
- the normal estimation unit 151 performs processing related to estimation of the normal of the surface of the 3D data. For example, the normal estimation unit 151 acquires input 3D data (Point Cloud). Further, the normal estimation unit 151 estimates the normal of the surface of the object represented by the acquired 3D data. For example, the normal estimation unit 151 estimates a normal by constructing a kd-tree, searching for a neighborhood, and calculating an optimal approximate tangent plane. The normal estimation unit 151 supplies the normal estimation result together with other data to the segmentation initial setting unit 152.
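- a minimal sketch of tangent-plane normal estimation is given below. It makes several substitutions that should be read as assumptions: a brute-force neighbour search replaces the kd-tree, the tangent plane is fitted by taking the covariance eigenvector with the smallest eigenvalue, and that eigenvector is found by power iteration on a shifted matrix. All function names are hypothetical.

```python
import math


def k_nearest(points, query, k):
    """Brute-force k nearest neighbours (a kd-tree would replace this for large clouds)."""
    return sorted(points, key=lambda p: math.dist(p, query))[:k]


def covariance(neighbours):
    """3x3 covariance matrix of the neighbour positions."""
    n = len(neighbours)
    mean = [sum(p[i] for p in neighbours) / n for i in range(3)]
    cov = [[0.0] * 3 for _ in range(3)]
    for p in neighbours:
        d = [p[i] - mean[i] for i in range(3)]
        for i in range(3):
            for j in range(3):
                cov[i][j] += d[i] * d[j] / n
    return cov


def smallest_eigenvector(cov, iterations=100):
    """Power iteration on (trace * I - cov).

    The dominant eigenvector of the shifted matrix is the eigenvector of cov
    with the smallest eigenvalue, which is taken as the tangent-plane normal.
    """
    trace = cov[0][0] + cov[1][1] + cov[2][2]
    shifted = [[(trace if i == j else 0.0) - cov[i][j] for j in range(3)] for i in range(3)]
    v = [1.0, 1.0, 1.0]
    for _ in range(iterations):
        w = [sum(shifted[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break  # degenerate neighbourhood (all points identical)
        v = [c / norm for c in w]
    return v


def estimate_normal(points, query, k=8):
    """Estimate the surface normal at query from its k nearest neighbours."""
    return smallest_eigenvector(covariance(k_nearest(points, query, k)))


# Points sampled on the plane z = 0; the estimated normal should lie along the z axis.
plane = [(float(x), float(y), 0.0) for x in range(4) for y in range(4)]
normal = estimate_normal(plane, (1.0, 1.0, 0.0))
```

- the sign of the estimated normal is ambiguous in this sketch; orienting it consistently (for example, toward a viewpoint) would be a separate step.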
- the segmentation initial setting unit 152 performs processing related to the initial setting of the segmentation. For example, the segmentation initial setting unit 152 acquires data supplied from the normal estimation unit 151. In addition, the segmentation initial setting unit 152 classifies each surface of the 3D data according to the six-axis direction components of its normal as estimated by the normal estimation unit 151. The segmentation initial setting unit 152 supplies the classification result to the segmentation update unit 153 together with other data.
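- the six-axis classification can be illustrated by choosing, for each normal, the axis direction with the largest inner product. The axis numbering in `AXES` and the function name `classify_normal` are assumptions made for this sketch.

```python
# Hypothetical numbering of the six axis directions (+x, -x, +y, -y, +z, -z).
AXES = {
    0: (1.0, 0.0, 0.0),
    1: (-1.0, 0.0, 0.0),
    2: (0.0, 1.0, 0.0),
    3: (0.0, -1.0, 0.0),
    4: (0.0, 0.0, 1.0),
    5: (0.0, 0.0, -1.0),
}


def classify_normal(normal):
    """Return the index of the axis direction that best matches the normal."""
    return max(AXES, key=lambda idx: sum(n * a for n, a in zip(normal, AXES[idx])))


# Example: label a few normals with their initial segmentation class.
normals = [(0.9, 0.1, 0.0), (0.1, 0.2, -0.9), (0.0, -1.0, 0.0)]
labels = [classify_normal(n) for n in normals]
```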
- the segmentation update unit 153 performs processing related to segmentation update. For example, the segmentation update unit 153 acquires data supplied from the segmentation initial setting unit 152. Also, the segmentation update unit 153 merges regions that are too small in the initial segmentation set by the segmentation initial setting unit 152 so that each region becomes sufficiently large. The segmentation update unit 153 supplies the information on the updated segmentation to the projection direction setting unit 154 together with other information.
- the projection direction setting unit 154 performs processing related to the setting of the projection direction. For example, the projection direction setting unit 154 acquires data (including information on the updated segmentation) supplied from the segmentation update unit 153. In addition, the projection direction setting unit 154 acquires View Info that is information related to the View direction. The projection direction setting unit 154 sets the projection direction of each segmentation based on the information. For example, the projection direction setting unit 154 sets the projection direction so as to suppress the occurrence of occlusion based on the normal line of each segmentation, the assumed view direction, and the like. Further, for example, the projection direction setting unit 154 sets the projection direction so as to suppress an increase in the number of patches to be generated based on the normal line of each segmentation, the assumed view direction, and the like.
- the projection direction setting unit 154 supplies information about the set projection direction and the like to the metadata generation unit 112. In addition, the projection direction setting unit 154 supplies information regarding the projection direction and the like to the two-dimensional projection unit 155 together with other information such as information regarding the updated segmentation.
- the two-dimensional projection unit 155 performs processing related to two-dimensional projection of 3D data.
- the two-dimensional projection unit 155 acquires data supplied from the projection direction setting unit 154.
- the two-dimensional projection unit 155 projects each segmentation onto a two-dimensional plane in the projection direction, and generates a patch.
- the two-dimensional projection unit 155 generates a patch of position information (Geometry) and attribute information (Texture).
- the two-dimensional projection unit 155 supplies the generated position information (Geometry) patch and attribute information (Texture) patch to the index assigning unit 156 together with other data.
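- projection of a segmentation along an arbitrary projection direction can be sketched as an orthographic projection. The choice of in-plane basis vectors and the helper names below are assumptions for illustration; the text does not say how the plane coordinates are defined.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def normalize(v):
    """Scale v to unit length (assumes a non-zero vector)."""
    length = math.sqrt(dot(v, v))
    return tuple(c / length for c in v)


def projection_basis(direction):
    """Return the unit projection direction d and two unit vectors (u, v) spanning the plane."""
    d = normalize(direction)
    # Pick a helper axis that is not parallel to d.
    helper = (1.0, 0.0, 0.0) if abs(d[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = normalize(cross(helper, d))
    v = cross(d, u)
    return d, u, v


def project_points(points, direction):
    """Return (u, v, depth) for each point under orthographic projection along direction."""
    d, u, v = projection_basis(direction)
    return [(dot(p, u), dot(p, v), dot(p, d)) for p in points]


# Example: project one point along the +z direction.
projected = project_points([(2.0, 3.0, 5.0)], (0.0, 0.0, 1.0))
```

- in this sketch the first two values would index a pixel of the patch and the third value is the depth that a position information (Geometry) patch would store.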
- the index assigning unit 156 performs processing related to the assignment of the projection direction index. For example, the index assigning unit 156 acquires data supplied from the two-dimensional projection unit 155. Further, the index assigning unit 156 acquires the projection direction information supplied from the metadata generation unit 112. Further, the index assigning unit 156 assigns a projection direction index corresponding to the projection direction to each patch based on the acquired projection direction information. The index assigning unit 156 supplies the processed data (position information (Geometry) patch to which the projection direction index is assigned, attribute information (Texture) patch, etc.) to the packing unit 113.
- FIG. 11 is a block diagram illustrating a main configuration example of the metadata generation unit 112 in FIG. 9. As illustrated in FIG. 11, the metadata generation unit 112 includes a projection direction information generation unit 171 and a coding unit information generation unit 172.
- the projection direction information generation unit 171 performs processing related to generation of projection direction information. For example, the projection direction information generation unit 171 acquires information regarding the projection direction and the like supplied from the patch decomposition unit 111. Further, the projection direction information generation unit 171 sets a projection direction based on the information, attaches a projection direction index, and generates projection direction information. Further, the projection direction information generation unit 171 adds attribute information to each projection direction index as necessary.
- the projection direction information generation unit 171 supplies the generated projection direction information to the encoding unit information generation unit 172.
- the projection direction information generation unit 171 also supplies the projection direction information to the patch decomposition unit 111, the packing unit 113, the video encoding unit 115, the video encoding unit 116, the OMap encoding unit 117, and the multiplexer 118.
- the coding unit information generation unit 172 performs processing related to generation of coding unit information. For example, the encoding unit information generation unit 172 acquires the projection direction information supplied from the projection direction information generation unit 171. Also, the coding unit information generation unit 172 sets a coding unit in which patches in each projection direction are arranged, and generates coding unit information by associating information indicating the coding unit with a projection direction index.
- the encoding unit information generation unit 172 supplies the generated encoding unit information to the packing unit 113, the video encoding unit 115, the video encoding unit 116, the OMap encoding unit 117, and the multiplexer 118.
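- the two kinds of metadata can be pictured as small tables keyed by projection direction index. In the sketch below, the `ProjectionDirection` class, the round-robin assignment of directions to slices, and the slice labels are assumptions for illustration only; the text does not state how coding units are chosen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectionDirection:
    index: int            # projection direction index
    vector: tuple         # projection direction as an (x, y, z) vector
    attribute: str = ""   # optional attribute information for this index


def generate_projection_direction_info(directions):
    """Attach a projection direction index to each projection direction."""
    return [ProjectionDirection(i, tuple(v)) for i, v in enumerate(directions)]


def generate_coding_unit_info(direction_info, units_available):
    """Associate each projection direction index with a coding unit.

    Round-robin assignment to slices is an assumption made for this sketch.
    """
    return {d.index: f"slice{d.index % units_available}" for d in direction_info}


directions = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
direction_info = generate_projection_direction_info(directions)
coding_unit_info = generate_coding_unit_info(direction_info, units_available=2)
```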
- FIG. 12 is a block diagram illustrating an example of a configuration of a decoding device that is an aspect of an image processing device to which the present technology is applied.
- the decoding device 200 shown in FIG. 12 is a device that decodes encoded data, obtained by projecting 3D data such as a point cloud onto a two-dimensional plane and encoding it, using a decoding method for two-dimensional images, and projects the result into a three-dimensional space (a decoding device to which the video-based approach is applied).
- FIG. 12 shows only the main components, such as processing units and data flows, and does not necessarily show everything. That is, in the decoding apparatus 200, there may be a processing unit that is not shown as a block in FIG. 12, or there may be a process or data flow that is not shown as an arrow or the like in FIG. This is the same in other diagrams explaining the processing unit and the like in the decoding device 200.
- the decoding device 200 includes a demultiplexer 211, a metadata processing unit 212, an auxiliary patch information decoding unit 213, a video decoding unit 214, a video decoding unit 215, an OMap decoding unit 216, an unpacking unit 217, and a 3D reconstruction unit 218.
- the demultiplexer 211 performs processing related to data demultiplexing. For example, the demultiplexer 211 acquires a bit stream input to the decoding device 200. This bit stream is supplied from the encoding device 100, for example. The demultiplexer 211 demultiplexes this bit stream, extracts the encoded data of the auxiliary patch information, and supplies it to the auxiliary patch information decoding unit 213. Further, the demultiplexer 211 extracts encoded data of a video frame of position information (Geometry) from the bit stream by demultiplexing, and supplies it to the video decoding unit 214.
- the demultiplexer 211 extracts the encoded data of the video frame of the attribute information (Texture) from the bit stream by demultiplexing, and supplies it to the video decoding unit 215. Also, the demultiplexer 211 extracts encoded data of the occupancy map from the bit stream by demultiplexing, and supplies it to the OMap decoding unit 216.
- the demultiplexer 211 extracts metadata such as projection direction information and encoding unit information from the bit stream by demultiplexing, and supplies the metadata to the metadata processing unit 212.
- the metadata processing unit 212 performs processing related to decoding control based on metadata. For example, the metadata processing unit 212 acquires metadata (projection direction information, encoding unit information, etc.) supplied from the demultiplexer 211. In addition, the metadata processing unit 212 acquires a view direction designation (view) by a user or the like. The metadata processing unit 212 controls decoding by the video decoding unit 214, the video decoding unit 215, and the OMap decoding unit 216 based on the metadata, the view direction, and the like. For example, the decoding order of patches and the range of patches to be decoded (partial decoding) are controlled.
- the auxiliary patch information decoding unit 213 performs processing related to decoding of encoded data of auxiliary patch information. For example, the auxiliary patch information decoding unit 213 acquires encoded data of auxiliary patch information supplied from the demultiplexer 211. The auxiliary patch information decoding unit 213 decodes encoded data of auxiliary patch information included in the acquired data. The auxiliary patch information decoding unit 213 supplies auxiliary patch information obtained by the decoding to the 3D reconstruction unit 218.
- the video decoding unit 214 performs processing related to decoding of encoded data of a video frame of position information (Geometry). For example, the video decoding unit 214 acquires encoded data of a video frame of position information (Geometry) supplied from the demultiplexer 211. In addition, the video decoding unit 214 is controlled by the metadata processing unit 212.
- the video decoding unit 214 decodes the encoded data acquired from the demultiplexer 211 under the control of the metadata processing unit 212, and obtains a video frame of position information (Geometry). For example, the video decoding unit 214 decodes the encoded data of the coding unit that is specified by the metadata processing unit 212 and that corresponds to the projection direction of the position information (Geometry) onto the two-dimensional plane. For example, the video decoding unit 214 decodes the encoded data of the coding units in the decoding range specified by the metadata processing unit 212. For example, the video decoding unit 214 decodes the encoded data of each coding unit in the decoding order specified by the metadata processing unit 212. The video decoding unit 214 supplies the position information (Geometry) data of the decoded coding units to the unpacking unit 217.
- the video decoding unit 215 performs processing related to decoding of the encoded data of the video frame of the attribute information (Texture). For example, the video decoding unit 215 acquires encoded data of a video frame of attribute information (Texture) supplied from the demultiplexer 211. In addition, the video decoding unit 215 is controlled by the metadata processing unit 212.
- the video decoding unit 215 decodes the encoded data acquired from the demultiplexer 211 under the control of the metadata processing unit 212 to obtain a video frame of attribute information (Texture). For example, the video decoding unit 215 decodes the encoded data of the coding units in the decoding range specified by the metadata processing unit 212. For example, the video decoding unit 215 decodes the encoded data of each coding unit in the decoding order specified by the metadata processing unit 212. The video decoding unit 215 supplies the attribute information (Texture) data of the decoded coding units to the unpacking unit 217.
- the OMap decoding unit 216 performs processing related to decoding of encoded data of the occupancy map. For example, the OMap decoding unit 216 acquires encoded data of an occupancy map supplied from the demultiplexer 211. In addition, the OMap decoding unit 216 is controlled by the metadata processing unit 212.
- the OMap decoding unit 216 decodes the encoded data acquired from the demultiplexer 211 under the control of the metadata processing unit 212, and obtains an occupancy map. For example, the OMap decoding unit 216 decodes the encoded data of the encoding unit in the decoding range specified by the metadata processing unit 212. For example, the OMap decoding unit 216 decodes the encoded data of each encoding unit in the decoding order specified by the metadata processing unit 212. The OMap decoding unit 216 supplies the decoded occupancy map data of the coding unit to the unpacking unit 217.
- the unpacking unit 217 performs processing related to unpacking. For example, the unpacking unit 217 acquires the video frame (coding unit data) of the position information (Geometry) from the video decoding unit 214, the video frame (coding unit data) of the attribute information (Texture) from the video decoding unit 215, and the occupancy map (coding unit data) from the OMap decoding unit 216. The unpacking unit 217 unpacks the position information video frame and the attribute information video frame.
- the unpacking unit 217 supplies the various data obtained by unpacking, such as position information (Geometry) data (patches, etc.), attribute information (Texture) data (patches, etc.), and occupancy map data, to the 3D reconstruction unit 218.
- the 3D reconstruction unit 218 performs processing related to reconstruction of 3D data.
- the 3D reconstruction unit 218 reconstructs 3D data (Point Cloud) based on the auxiliary patch information supplied from the auxiliary patch information decoding unit 213 and on the position information (Geometry) data, attribute information (Texture) data, and occupancy map data supplied from the unpacking unit 217.
- the 3D reconstruction unit 218 outputs the 3D data obtained by such processing to the outside of the decoding device 200.
- the 3D data is supplied to a display unit to display the image, recorded on a recording medium, or supplied to another device via communication, for example.
- FIG. 13 is a block diagram illustrating a main configuration example of the metadata processing unit 212 of FIG.
- the metadata processing unit 212 includes an inner product calculation unit 251, a decoding range setting unit 252, a decoding order setting unit 253, and a decoding control unit 254.
- the inner product calculation unit 251 performs processing related to calculation of the inner product. For example, the inner product calculation unit 251 acquires the projection direction information supplied from the demultiplexer 211. Also, the inner product calculation unit 251 acquires the designation of the view direction input by the user or the like. The inner product calculation unit 251 calculates the inner product of a vector indicating the projection direction and a vector indicating the view direction. The inner product calculation unit 251 supplies the inner product calculation result to the decoding range setting unit 252.
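- the inner product calculation can be sketched as follows. Normalizing both vectors first, and the names `unit_vector` and `inner_products`, are assumptions for this example; the text only states that the inner product of the two vectors is calculated.

```python
import math


def unit_vector(v):
    """Scale v to unit length (assumes a non-zero vector)."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def inner_products(projection_directions, view_direction):
    """Inner product of each projection direction vector with the view direction vector."""
    view = unit_vector(view_direction)
    return {
        index: sum(a * b for a, b in zip(unit_vector(vector), view))
        for index, vector in projection_directions.items()
    }


# Hypothetical projection direction information: index -> direction vector.
projection_directions = {0: (1.0, 0.0, 0.0), 1: (0.0, 0.0, -2.0), 2: (0.0, 0.0, 1.0)}
products = inner_products(projection_directions, (0.0, 0.0, 1.0))
```

- because both vectors are normalized here, each result is the cosine of the angle between the projection direction and the view direction.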
- the decoding range setting unit 252 performs processing related to setting of the decoding range. For example, the decoding range setting unit 252 acquires the inner product result supplied from the inner product calculation unit 251. Further, the decoding range setting unit 252 acquires the coding unit information supplied from the demultiplexer 211. Also, the decoding range setting unit 252 sets a decoding range (coding unit to be decoded) based on the result of the inner product and the coding unit information. That is, the decoding range setting unit 252 controls whether or not to perform partial decoding, and further controls the decoding range when performing partial decoding. The decoding range setting unit 252 supplies the decoding order setting unit 253 with the result of the inner product, the coding unit information, and information regarding the set decoding range.
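- a decoding range could be derived from the inner products as in the sketch below. The sign convention is an assumption made for illustration: a negative inner product is taken to mean that patches of that projection direction face the viewer, so their coding units are decoded. The threshold parameter and the function name are likewise hypothetical.

```python
def set_decoding_range(inner_product, coding_unit_info, threshold=0.0):
    """Return the coding units that contain a projection direction with inner product below threshold.

    Assumption: a smaller (more negative) inner product means the patches of
    that projection direction are more relevant to the current view.
    """
    units = {
        coding_unit_info[index]
        for index, value in inner_product.items()
        if value < threshold
    }
    return sorted(units)


# Hypothetical inputs: inner product per projection direction index, and the
# coding unit associated with each index.
inner_product = {0: -0.3, 1: 1.0, 2: -0.9, 3: 0.0}
coding_unit_info = {0: "slice0", 1: "slice1", 2: "slice2", 3: "slice2"}

partial_range = set_decoding_range(inner_product, coding_unit_info)
full_range = set_decoding_range(inner_product, coding_unit_info, threshold=2.0)
```

- choosing a threshold above the largest possible inner product selects every coding unit, which corresponds to not performing partial decoding.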
- the decoding order setting unit 253 performs processing related to setting of the decoding order. For example, the decoding order setting unit 253 acquires information supplied from the decoding range setting unit 252. In addition, the decoding order setting unit 253 acquires the projection direction information supplied from the demultiplexer 211. Further, based on the result of the inner product, the projection direction information, the coding unit information, the information about the decoding range, and the like, the decoding order setting unit 253 sets the decoding order of the coding units to be decoded in accordance with the decoding order setting, the decoding range setting, and the view direction. The decoding order setting unit 253 supplies the decoding control unit 254 with the inner product result, the coding unit information, the information about the decoding range, and the information about the set decoding order.
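- one way to order the coding units is to sort them by the inner product of their most relevant projection direction. The ordering rule below (most negative inner product first) and the function name `set_decoding_order` are assumptions for illustration; the text does not fix a particular rule.

```python
def set_decoding_order(inner_product, coding_unit_info, decoding_range):
    """Order the coding units in the decoding range by their smallest inner product.

    Assumption: coding units whose projection direction has the most negative
    inner product with the view direction are decoded first.
    """
    best = {}
    for index, value in inner_product.items():
        unit = coding_unit_info[index]
        if unit in decoding_range:
            best[unit] = min(value, best.get(unit, value))
    return sorted(best, key=lambda unit: best[unit])


# Hypothetical inputs, including a decoding range chosen beforehand.
inner_product = {0: -0.3, 1: 1.0, 2: -0.9, 3: 0.0}
coding_unit_info = {0: "slice0", 1: "slice1", 2: "slice2", 3: "slice2"}
decoding_range = ["slice0", "slice2"]

decoding_order = set_decoding_order(inner_product, coding_unit_info, decoding_range)
```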
- the decoding control unit 254 performs processing related to decoding control. For example, the decoding control unit 254 acquires information supplied from the decoding order setting unit 253. Also, the decoding control unit 254 controls decoding by the video decoding unit 214, the video decoding unit 215, and the OMap decoding unit 216 based on the information. For example, the decoding control unit 254 designates a coding unit that is a target of decoding performed by these processing units. Further, for example, the decoding control unit 254 controls the specification of the decoding range of decoding performed by these processing units, the decoding order, and the like.
- the projection direction can be arbitrarily set, so that the occurrence of occlusion can be suppressed and the increase in the number of patches can be suppressed.
- the encoded data of the 3D data can be more easily decoded by various decoding methods such as selecting and decoding a patch having a more appropriate projection direction according to the view direction or the like. As a result, an increase in the load of the decoding process can be suppressed.
- the patch decomposing unit 111 of the encoding apparatus 100 projects 3D data onto a two-dimensional plane and decomposes it into patches in step S101.
- In step S102, the metadata generation unit 112 generates projection direction information and coding unit information as metadata.
- In step S103, the patch decomposition unit 111 (index assigning unit 156) assigns (associates) a projection direction index to each patch according to the projection direction information generated in step S102.
- In step S104, the auxiliary patch information compression unit 114 compresses the auxiliary patch information obtained by the process in step S101.
- In step S105, the packing unit 113 performs packing according to the projection direction information and the coding unit information generated in step S102. That is, the packing unit 113 arranges each patch in the coding unit (slice or the like) of the frame image that corresponds to its projection direction, and packs the result as a video frame. In addition, the packing unit 113 generates an occupancy map corresponding to the video frame.
- In step S106, the video encoding unit 115 encodes the geometry video frame, which is the video frame of the position information obtained by the process in step S105, using an encoding method for two-dimensional images.
- In step S107, the video encoding unit 116 encodes the color video frame, which is the video frame of the attribute information obtained by the process in step S105, using an encoding method for two-dimensional images.
- In step S108, the OMap encoding unit 117 encodes the occupancy map obtained by the process in step S105 using a predetermined encoding method.
- the video encoding unit 115 to the OMap encoding unit 117 may each control quality based on the projection direction information, the coding unit information, and the like in the encoding they perform (the encoding in steps S106 to S108).
- In step S109, the multiplexer 118 multiplexes the various information generated as described above (for example, the encoded data generated in steps S106 to S108 and the metadata (projection direction information and coding unit information) generated in step S102), and generates a bit stream including the information.
- In step S110, the multiplexer 118 outputs the bit stream generated by the process in step S109 to the outside of the encoding apparatus 100.
- When the process of step S110 ends, the encoding process ends.
- When the patch decomposition process starts, the normal estimation unit 151 estimates the normal of each surface of the 3D data in step S131.
- In step S132, the segmentation initial setting unit 152 performs the initial setting of segmentation.
- In step S133, the segmentation updating unit 153 updates, as necessary, the segmentation in the initial state set in step S132.
- In step S134, the projection direction setting unit 154 sets the projection direction of each segmentation based on, for example, View Info.
- In step S135, the projection direction setting unit 154 supplies information on the projection direction set in step S134 to the metadata generation unit 112.
- In step S136, the two-dimensional projection unit 155 projects each segmentation of the 3D data onto a two-dimensional plane in the projection direction set in step S134.
- When the process of step S136 ends, the patch decomposition process ends, and the process returns to FIG. 14.
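The specification states only that the projection direction is set based on, for example, View Info; it does not give a selection rule. As one possible stand-in, the sketch below picks, for a segment, the candidate direction best aligned with the segment's estimated normal. The function name, the candidate set, and the alignment criterion are all assumptions made for illustration.

```python
def choose_projection_direction(normal, candidates):
    """Return the id of the candidate direction with the largest inner product
    with `normal`.

    normal: (x, y, z) estimated normal of a segment.
    candidates: dict mapping direction id -> (x, y, z) unit vector (assumed form).
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    return max(candidates, key=lambda i: dot(normal, candidates[i]))


# Hypothetical candidate directions along the three coordinate axes.
candidates = {0: (1.0, 0.0, 0.0), 1: (0.0, 1.0, 0.0), 2: (0.0, 0.0, 1.0)}
segment_normal = (0.1, 0.2, 0.97)
chosen = choose_projection_direction(segment_normal, candidates)
```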
- When the metadata generation process starts, the projection direction information generation unit 171 generates, in step S151, projection direction information based on the information on the projection direction supplied from the patch decomposition unit 111 (projection direction setting unit 154).
- In step S152, the coding unit information generation unit 172 sets the coding unit in which each patch is to be arranged, based on the projection direction information generated in step S151, and generates coding unit information.
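The two metadata tables produced in steps S151 and S152 can be pictured with a minimal sketch. The class and function names are hypothetical, and the rule "one coding unit per projection direction index" is an assumption chosen for simplicity; the specification leaves the mapping open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectionDirection:
    phi: float    # direction spherical coordinate phi
    theta: float  # direction spherical coordinate theta
    r: float      # distance


def generate_metadata(directions):
    """Build projection direction information and coding unit information.

    Returns two dicts keyed by projection direction index:
    index -> ProjectionDirection, and index -> coding unit number.
    """
    # S151: assign an identifier (index) to each projection direction.
    projection_direction_info = dict(enumerate(directions))
    # S152 (assumed rule): direction i is placed in coding unit i.
    coding_unit_info = {i: i for i in projection_direction_info}
    return projection_direction_info, coding_unit_info


info, units = generate_metadata(
    [ProjectionDirection(0.0, 90.0, 1.0), ProjectionDirection(90.0, 90.0, 1.0)]
)
```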
- By executing each process as described above, the projection direction can be set arbitrarily. For example, patches can be generated and encoded with a more appropriate projection direction according to the View direction or the like. As a result, the encoded data of the 3D data can be decoded more easily on the decoding side. For example, the occurrence of occlusion and an increase in the number of patches can both be suppressed, which in turn suppresses an increase in the load of the decoding process.
- In addition, the decoding order of patches can be controlled easily during decoding, and partial decoding, in which only some of the patches are decoded, can be realized. That is, the encoded data of the 3D data can be decoded more easily on the decoding side, and an increase in the load of the decoding process can be suppressed.
- When the decoding process starts, the demultiplexer 211 of the decoding device 200 demultiplexes the bitstream in step S201.
- In step S202, the auxiliary patch information decoding unit 213 decodes the auxiliary patch information extracted from the bitstream in step S201.
- In step S203, the metadata processing unit 212 controls decoding in accordance with the metadata (projection direction information and coding unit information) extracted from the bitstream in step S201.
- In step S204, the video decoding unit 214 decodes the encoded data of the geometry video frame (the video frame of the position information) extracted from the bitstream in step S201, in accordance with the decoding control in step S203.
- In step S205, the video decoding unit 215 decodes the encoded data of the color video frame (the video frame of the attribute information) extracted from the bitstream in step S201, in accordance with the decoding control in step S203.
- In step S206, the OMap decoding unit 216 decodes the encoded data of the occupancy map extracted from the bitstream in step S201, in accordance with the decoding control in step S203.
- In step S207, the unpacking unit 217 unpacks the geometry video frame and the color video frame and extracts the patches.
- In step S208, the 3D reconstruction unit 218 reconstructs 3D data such as a point cloud based on the auxiliary patch information obtained in step S202, the patches obtained in step S207, and the like.
- When the process of step S208 ends, the decoding process ends.
- In step S222, the decoding range setting unit 252 specifies, based on the inner product results calculated in step S221 and the coding unit information, the projection directions that satisfy the condition (the coding units corresponding to those projection directions), that is, the projection directions to decode (the coding units to decode).
- In step S223, the decoding order setting unit 253 specifies, based on the projection direction information, the decoding order of the coding units that satisfy the condition, as specified in step S222.
- In step S224, the decoding control unit 254 controls the video decoding unit 214, the video decoding unit 215, and the OMap decoding unit 216 so that the coding units that satisfy the condition, specified in step S222, are decoded in the decoding order specified in step S223. In accordance with this decoding control, the processes of steps S204 to S206 (FIG. 17) are executed.
- When the decoding control has been set up in step S224 as described above, the metadata processing ends, and the process returns to FIG. 17.
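The metadata processing of steps S221 to S223 can be sketched as follows. The selection condition (inner product at or below a threshold) and the ordering (ascending inner product) are assumptions, as is the sign convention for the view vector; the specification says only that a condition is checked and an order is set. Directions are given here as Cartesian vectors, which is also an assumption.

```python
def plan_decoding(view, projection_direction_info, coding_unit_info, threshold=0.0):
    """Return the coding units to decode, in decoding order.

    view: (x, y, z) view direction vector.
    projection_direction_info: dict id -> (x, y, z) projection direction vector.
    coding_unit_info: dict id -> coding unit label.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # S221: inner product between the view vector and each projection direction.
    products = {i: dot(view, d) for i, d in projection_direction_info.items()}
    # S222 (assumed condition): keep directions whose inner product <= threshold.
    selected = [i for i, p in products.items() if p <= threshold]
    # S223 (assumed order): ascending inner product.
    selected.sort(key=lambda i: products[i])
    return [coding_unit_info[i] for i in selected]


# Hypothetical setup: the viewer looks along +z.
directions = {0: (0.0, 0.0, -1.0), 1: (1.0, 0.0, 0.0), 2: (0.0, 0.0, 1.0)}
units = {0: "slice0", 1: "slice1", 2: "slice2"}
order = plan_decoding((0.0, 0.0, 1.0), directions, units)
```

In this example the slice for direction 2 is never decoded, which is the kind of partial decoding the text describes.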
- Control information related to the present technology described in each of the above embodiments may be transmitted from the encoding side to the decoding side. For example, control information (for example, enabled_flag) that controls whether application of the present technology described above is permitted (or prohibited) may be transmitted.
- Also, for example, control information that specifies a range in which application of the present technology described above is permitted (or prohibited) may be transmitted. Such a range may be, for example, an upper limit or a lower limit of the block size, or both, or a slice, picture, sequence, component, view, layer, or the like.
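One way to picture such control information is a small record holding the enable flag and an optional block-size range. Only `enabled_flag` is named in the text; the class name, the other field names, and the `applies_to` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ControlInfo:
    enabled_flag: bool                    # named in the text
    min_block_size: Optional[int] = None  # hypothetical lower limit
    max_block_size: Optional[int] = None  # hypothetical upper limit

    def applies_to(self, block_size: int) -> bool:
        """Return True if the technique may be applied to a block of this size."""
        if not self.enabled_flag:
            return False
        if self.min_block_size is not None and block_size < self.min_block_size:
            return False
        if self.max_block_size is not None and block_size > self.max_block_size:
            return False
        return True


ctrl = ControlInfo(enabled_flag=True, min_block_size=8, max_block_size=64)
```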
- <Computer> The series of processes described above can be executed by hardware or by software.
- When the series of processes is executed by software, a program constituting the software is installed in a computer.
- Here, the computer includes a computer incorporated in dedicated hardware and, for example, a general-purpose personal computer that can execute various functions when various programs are installed.
- FIG. 19 is a block diagram showing an example of the hardware configuration of a computer that executes the above-described series of processes by means of a program.
- A CPU (Central Processing Unit) 901, a ROM (Read Only Memory) 902, and a RAM (Random Access Memory) 903 are connected to one another via a bus 904.
- An input/output interface 910 is also connected to the bus 904.
- An input unit 911, an output unit 912, a storage unit 913, a communication unit 914, and a drive 915 are connected to the input/output interface 910.
- The input unit 911 includes, for example, a keyboard, a mouse, a microphone, a touch panel, and an input terminal.
- The output unit 912 includes, for example, a display, a speaker, and an output terminal.
- The storage unit 913 includes, for example, a hard disk, a RAM disk, and a nonvolatile memory.
- The communication unit 914 includes, for example, a network interface.
- The drive 915 drives a removable medium 921 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
- In the computer configured in this way, the CPU 901 loads, for example, the program stored in the storage unit 913 into the RAM 903 via the input/output interface 910 and the bus 904 and executes it, whereby the series of processes described above is performed.
- The RAM 903 also stores, as appropriate, data that the CPU 901 needs in order to execute various processes.
- The program executed by the computer can be provided by being recorded on, for example, the removable medium 921 as a package medium or the like.
- In that case, the program can be installed in the storage unit 913 via the input/output interface 910 by attaching the removable medium 921 to the drive 915.
- This program can also be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting. In that case, the program can be received by the communication unit 914 and installed in the storage unit 913.
- Alternatively, this program can be installed in the ROM 902 or the storage unit 913 in advance.
- The encoding device 100 and the decoding device 200 have been described above as application examples of the present technology. However, the present technology can be applied to any configuration.
- For example, the present technology can be applied to various electronic devices, such as transmitters and receivers (for example, television receivers and mobile phones) used in satellite broadcasting, cable broadcasting such as cable TV, distribution over the Internet, and distribution to terminals via cellular communication, or apparatuses (for example, hard disk recorders and cameras) that record images on media such as optical disks, magnetic disks, and flash memories and reproduce images from those storage media.
- Also, for example, the present technology can be implemented as a partial configuration of an apparatus, such as a processor (for example, a video processor) as a system LSI (Large Scale Integration) or the like, a module (for example, a video module) using a plurality of processors or the like, a unit (for example, a video unit) using a plurality of modules or the like, or a set (for example, a video set) in which other functions are further added to a unit.
- The present technology can also be applied to a network system composed of a plurality of devices.
- For example, the present technology may be implemented as cloud computing, in which processing is shared and performed jointly by a plurality of devices via a network.
- For example, the present technology may be implemented in a cloud service that provides services related to images (moving images) to arbitrary terminals such as computers, audio visual (AV) devices, portable information processing terminals, and Internet of Things (IoT) devices.
- In this specification, a system means a set of a plurality of constituent elements (devices, modules (parts), and the like), and it does not matter whether all the constituent elements are in the same housing. Accordingly, a plurality of devices housed in separate housings and connected via a network, and a single device in which a plurality of modules are housed in one housing, are both systems.
- Systems, devices, processing units, and the like to which the present technology is applied can be used in any field, for example traffic, medical care, crime prevention, agriculture, livestock farming, mining, beauty care, factories, home appliances, weather, and nature monitoring. Their use is also arbitrary.
- In this specification, a “flag” is information for identifying a plurality of states, and includes not only information used to identify the two states true (1) and false (0) but also information that can identify three or more states. Therefore, the value that a “flag” can take may be, for example, one of the two values 1/0, or one of three or more values. That is, the number of bits constituting a “flag” is arbitrary and may be one bit or a plurality of bits.
- Identification information (including flags) may be included in the bitstream as it is, or as difference information relative to certain reference information. In this specification, therefore, “flag” and “identification information” encompass not only that information itself but also difference information relative to the reference information.
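The two points above, a flag of arbitrary bit width and identification information carried as a difference from a reference, can be sketched as follows. All function names are hypothetical, and the bit-string representation is only one possible encoding.

```python
def pack_flag(state, num_bits):
    """Represent a flag state as a fixed-width bit string of num_bits bits."""
    if not 0 <= state < (1 << num_bits):
        raise ValueError("state does not fit in the given number of bits")
    return format(state, f"0{num_bits}b")


def to_differences(ids, reference):
    """Express identification values as differences from a reference value."""
    return [i - reference for i in ids]


def from_differences(diffs, reference):
    """Recover identification values from differences and the reference value."""
    return [d + reference for d in diffs]


two_state = pack_flag(1, 1)   # a 1-bit true/false flag
four_state = pack_flag(2, 2)  # a 2-bit flag distinguishing four states
diffs = to_differences([10, 12, 11], reference=10)
restored = from_differences(diffs, reference=10)
```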
- Various kinds of information (metadata and the like) related to the encoded data may be transmitted or recorded in any form as long as they are associated with the encoded data.
- Here, the term “associate” means, for example, making one piece of data available (linkable) when another piece of data is processed. That is, pieces of data associated with each other may be combined into a single piece of data or may remain separate pieces of data. For example, information associated with encoded data (an image) may be transmitted over a transmission path different from that of the encoded data (image). Also, for example, information associated with encoded data (an image) may be recorded on a recording medium different from that of the encoded data (image), or in a different recording area of the same recording medium.
- Note that this “association” may apply to part of the data rather than to the whole. For example, an image and the information corresponding to that image may be associated with each other in any unit, such as a plurality of frames, one frame, or a portion of a frame.
- For example, a configuration described as one device (or processing unit) may be divided and configured as a plurality of devices (or processing units).
- Conversely, configurations described above as a plurality of devices (or processing units) may be combined into a single device (or processing unit).
- A configuration other than those described above may be added to the configuration of each device (or each processing unit).
- Part of the configuration of one device (or processing unit) may be included in the configuration of another device (or another processing unit).
- The above-described program may be executed in any device.
- In that case, the device only needs to have the necessary functions (functional blocks and the like) and be able to obtain the necessary information.
- Each step of one flowchart may be executed by one device or shared among a plurality of devices.
- When one step includes a plurality of processes, those processes may be executed by one device or shared among a plurality of devices.
- In other words, a plurality of processes included in one step can be executed as the processes of a plurality of steps.
- Conversely, processes described as a plurality of steps can be executed collectively as one step.
- In a program executed by a computer, the processes of the steps describing the program may be executed in time series in the order described in this specification, in parallel, or individually at a necessary timing such as when a call is made. That is, as long as no contradiction arises, the processes of the steps may be executed in an order different from the order described above. Furthermore, the processes of the steps describing this program may be executed in parallel with, or in combination with, the processes of other programs.
- Each of the plural technologies related to the present technology can be implemented independently as long as no contradiction arises.
- Any plurality of the present technologies can also be implemented in combination.
- For example, part or all of the present technology described in any of the embodiments can be implemented in combination with part or all of the present technology described in another embodiment.
- Part or all of any of the present technologies described above can also be implemented in combination with other technologies not described above.
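- The present technology can also take the following configurations.
- (1) An image processing apparatus including a bitstream generation unit that generates a bitstream including projection direction information, which is information on the direction in which position information of 3D data representing a three-dimensional structure is projected onto a two-dimensional plane, and encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane.
- (4) The image processing apparatus according to (3), wherein the attribute information includes information regarding a decoding order.
- (6) The image processing apparatus according to (1), further including a projection direction information generation unit that generates the projection direction information, wherein the bitstream generation unit generates a bitstream including the projection direction information generated by the projection direction information generation unit and the encoded data.
- (7) The image processing apparatus according to (1), further including a packing unit that arranges and packs the geometry image in an image according to the projection direction information, and an encoding unit that encodes the image in which the geometry image has been packed by the packing unit and generates the encoded data, wherein the bitstream generation unit generates a bitstream including the projection direction information and the encoded data generated by the encoding unit.
- (9) An image processing apparatus including a decoding unit that decodes, based on projection direction information, which is information on the direction in which position information of 3D data representing a three-dimensional structure is projected onto a two-dimensional plane, a bitstream including encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane.
- (15) The image processing apparatus according to (12), further including a coding unit information generation unit that generates the coding unit information, wherein the packing unit packs the geometry image based on the coding unit information generated by the coding unit information generation unit.
- (16) The image processing apparatus according to (12), further including an encoding unit that encodes the image in which the geometry image has been packed by the packing unit and generates encoded data, and a bitstream generation unit that generates a bitstream including the coding unit information and the encoded data generated by the encoding unit.
- (17) The image processing apparatus according to (11), wherein the coding unit is a slice, a tile, or a picture.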
- 100 encoding device, 111 patch decomposition unit, 112 metadata generation unit, 113 packing unit, 114 auxiliary patch information compression unit, 115 video encoding unit, 116 video encoding unit, 117 OMap encoding unit, 118 multiplexer, 151 normal estimation unit, 152 segmentation initial setting unit, 153 segmentation updating unit, 154 projection direction setting unit, 155 two-dimensional projection unit, 156 indexing unit, 171 projection direction information generation unit, 172 coding unit information generation unit, 200 decoding device, 211 demultiplexer, 212 metadata processing unit, 213 auxiliary patch information decoding unit, 214 video decoding unit, 215 video decoding unit, 216 OMap decoding unit, 217 unpacking unit, 218 3D reconstruction unit, 251 inner product calculation unit, 252 decoding range setting unit, 253 decoding order setting unit, 254 decoding control unit
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
- Image Generation (AREA)
Abstract
Description
1. Video-based approach
2. First embodiment (control for each projection direction)
3. Supplementary notes
<Documents and the like supporting technical content and technical terms>
The scope disclosed by the present technology includes not only the content described in the embodiments but also the content described in the following non-patent documents, which were publicly known at the time of filing.
Non-Patent Document 2: TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU (International Telecommunication Union), "Advanced video coding for generic audiovisual services", H.264, 04/2017
Non-Patent Document 3: TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU (International Telecommunication Union), "High efficiency video coding", H.265, 12/2016
Non-Patent Document 4: Jianle Chen, Elena Alshina, Gary J. Sullivan, Jens-Rainer, Jill Boyce, "Algorithm Description of Joint Exploration Test Model 4", JVET-G1001_v1, Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11 7th Meeting: Torino, IT, 13-21 July 2017
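Conventionally, there have been data such as point clouds, which represent a three-dimensional structure by the position information, attribute information, and the like of a group of points, and meshes, which consist of vertices, edges, and faces and define a three-dimensional shape using a polygonal representation.
A video-based approach has been proposed in which the position information and the color information of such a point cloud are each projected onto a two-dimensional plane for each small region and encoded by an encoding method for two-dimensional images.
Therefore, a bitstream is generated that includes projection direction information, which is information on the direction in which position information of 3D data representing a three-dimensional structure is projected onto a two-dimensional plane, and encoded data of a geometry image obtained by projecting that position information onto that two-dimensional plane.
In addition, a geometry image obtained by projecting position information of 3D data representing a three-dimensional structure onto a two-dimensional plane is arranged and packed in the coding unit of an image that corresponds to the direction in which that position information is projected onto that two-dimensional plane.
The present technology relating to the video-based approach described above will now be explained. As shown in the table of FIG. 3, in the present technology, settings relating to the projection direction are made for patches.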
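The projection direction indicates the angle at which 3D data (for example, a point cloud) is projected onto a two-dimensional plane in the video-based approach, that is, the direction and position (distance) of the two-dimensional plane as seen from the 3D data. For example, as shown in A of FIG. 4, the projection direction (direction and position) may be expressed using spherical coordinates (r, θ, φ).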
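For calculations such as inner products, a projection direction given in spherical coordinates (r, θ, φ) can be converted to a Cartesian vector. The sketch below assumes the common physics convention, with θ the polar angle measured from the z axis and φ the azimuth in the xy-plane, both in radians; the specification does not state which convention FIG. 4 uses, and the function name is hypothetical.

```python
import math


def spherical_to_cartesian(r, theta, phi):
    """Convert spherical coordinates to (x, y, z).

    Assumed convention: theta is the polar angle from the +z axis and phi is
    the azimuth from the +x axis in the xy-plane, both in radians.
    """
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return (x, y, z)


along_x = spherical_to_cartesian(1.0, math.pi / 2, 0.0)
along_z = spherical_to_cartesian(1.0, 0.0, 0.0)
```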
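As a setting relating to the projection direction, for example, projection direction information, which is information on the projection direction, is generated and transmitted. The projection direction information indicates the correspondence between identifiers and projection directions. In other words, it assigns an identifier to each projection direction that is set. C of FIG. 4 shows an example of the projection direction information. In the projection direction information shown in C of FIG. 4, a projection direction (direction spherical coordinate φ, direction spherical coordinate θ, distance r) is associated with the identifier of each projection direction (projection direction index). Note that the same projection direction may be associated with a plurality of identifiers.
Attribute information relating to decoding may be added to this projection direction information as additional information. The content of the attribute information is arbitrary, as is the number of pieces of attribute information added.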
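As a setting relating to the projection direction, for example, coding unit information, which is information on the coding unit in which patches are arranged for each projection direction, is generated and transmitted. The coding unit information indicates the correspondence between the projection direction identifiers described above and the coding units in which the patches projected in those projection directions are arranged. In other words, it indicates the correspondence between each projection direction and the coding unit used.
In the encoding of 3D data, each patch is linked to its projection direction using the projection direction identifier described above. That is, a projection direction identifier is assigned to each patch. This makes it possible to associate various kinds of information, such as the projection direction, attribute information, and coding unit, with each patch.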
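The way a single projection direction identifier links a patch to its direction, attribute information, and coding unit can be illustrated with two lookup tables. The table contents, the attribute label "Face", and the function name are hypothetical examples rather than values taken from the specification.

```python
def describe_patch(direction_id, projection_direction_info, coding_unit_info):
    """Collect everything associated with a patch through its direction id."""
    entry = projection_direction_info[direction_id]
    return {
        "direction": entry["direction"],      # (phi, theta, r)
        "attribute": entry.get("attribute"),  # optional attribute information
        "coding_unit": coding_unit_info[direction_id],
    }


# Hypothetical tables keyed by projection direction index.
projection_direction_info = {
    0: {"direction": (0.0, 90.0, 1.0), "attribute": "Face"},
    1: {"direction": (180.0, 90.0, 1.0)},
}
coding_unit_info = {0: "slice0", 1: "slice1"}

front = describe_patch(0, projection_direction_info, coding_unit_info)
back = describe_patch(1, projection_direction_info, coding_unit_info)
```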
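In the decoding of 3D data, the decoding method can be controlled based on the projection direction information, the coding unit information, and the like described above. For example, decoding may be performed in order of the projection directions closest to a specified View direction. Because the projection direction of each patch can be grasped more easily by referring to the projection direction information, such decoding control can be realized more easily.
2. Decode those whose inner product is 0.
3. Decode those whose inner product is positive, in ascending order of absolute value.
<Encoding device>
Next, a configuration that realizes each of the methods described above will be explained. FIG. 9 is a block diagram showing an example of the configuration of an encoding device, which is one aspect of an image processing apparatus to which the present technology is applied. The encoding device 100 shown in FIG. 9 is a device that projects 3D data such as a point cloud onto a two-dimensional plane and encodes it by an encoding method for two-dimensional images (an encoding device to which the video-based approach is applied).
FIG. 10 is a block diagram showing a main configuration example of the patch decomposition unit 111 of FIG. 9. As shown in FIG. 10, the patch decomposition unit 111 in this case includes a normal estimation unit 151, a segmentation initial setting unit 152, a segmentation updating unit 153, a projection direction setting unit 154, a two-dimensional projection unit 155, and an indexing unit 156.
FIG. 11 is a block diagram showing a main configuration example of the metadata generation unit 112 of FIG. 9. As shown in FIG. 11, the metadata generation unit 112 includes a projection direction information generation unit 171 and a coding unit information generation unit 172.
FIG. 12 is a block diagram showing an example of the configuration of a decoding device, which is one aspect of an image processing apparatus to which the present technology is applied. The decoding device 200 shown in FIG. 12 is a device that decodes, by a decoding method for two-dimensional images, encoded data obtained by projecting 3D data such as a point cloud onto a two-dimensional plane and encoding it, and projects the result into three-dimensional space (a decoding device to which the video-based approach is applied).
FIG. 13 is a block diagram showing a main configuration example of the metadata processing unit 212 of FIG. 12. As shown in FIG. 13, the metadata processing unit 212 includes an inner product calculation unit 251, a decoding range setting unit 252, a decoding order setting unit 253, and a decoding control unit 254.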
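Next, an example of the flow of the encoding process executed by the encoding device 100 will be described with reference to the flowchart of FIG. 14.
Next, an example of the flow of the patch decomposition process executed in step S101 of FIG. 14 will be described with reference to the flowchart of FIG. 15.
Next, an example of the flow of the metadata generation process executed in step S102 of FIG. 14 will be described with reference to the flowchart of FIG. 16.
Next, an example of the flow of the decoding process executed by the decoding device 200 will be described with reference to the flowchart of FIG. 17.
Next, an example of the flow of the metadata processing executed in step S203 of FIG. 17 will be described with reference to the flowchart of FIG. 18.
<Control information>
Control information related to the present technology described in each of the above embodiments may be transmitted from the encoding side to the decoding side. For example, control information (for example, enabled_flag) that controls whether application of the present technology described above is permitted (or prohibited) may be transmitted. Also, for example, control information that specifies a range in which application of the present technology described above is permitted (or prohibited) may be transmitted; such a range may be, for example, an upper limit or a lower limit of the block size, or both, or a slice, picture, sequence, component, view, layer, or the like.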
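The series of processes described above can be executed by hardware or by software. When the series of processes is executed by software, a program constituting the software is installed in a computer. Here, the computer includes a computer incorporated in dedicated hardware and, for example, a general-purpose personal computer that can execute various functions when various programs are installed.
The above describes the case where the present technology is applied to the encoding and decoding of point cloud data, but the present technology is not limited to these examples and can be applied to the encoding and decoding of 3D data of any standard. That is, as long as they do not contradict the present technology described above, the specifications of various processes such as encoding and decoding schemes, and of various data such as 3D data and metadata, are arbitrary. Some of the processes and specifications described above may also be omitted as long as doing so does not contradict the present technology.
Systems, devices, processing units, and the like to which the present technology is applied can be used in any field, for example traffic, medical care, crime prevention, agriculture, livestock farming, mining, beauty care, factories, home appliances, weather, and nature monitoring. Their use is also arbitrary.
In this specification, a "flag" is information for identifying a plurality of states, and includes not only information used to identify the two states true (1) and false (0) but also information that can identify three or more states. Therefore, the value that a "flag" can take may be, for example, one of the two values 1/0, or one of three or more values. That is, the number of bits constituting a "flag" is arbitrary and may be one bit or a plurality of bits. Identification information (including flags) may be included in the bitstream as it is, or as difference information relative to certain reference information. In this specification, therefore, "flag" and "identification information" encompass not only that information itself but also difference information relative to the reference information.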
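(1) An image processing apparatus including a bitstream generation unit that generates a bitstream including projection direction information, which is information on the direction in which position information of 3D data representing a three-dimensional structure is projected onto a two-dimensional plane, and encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane.
(2) The image processing apparatus according to (1), wherein the projection direction information includes information on the correspondence between an identifier and the direction and position used when projecting the position information onto the two-dimensional plane.
(3) The image processing apparatus according to (2), wherein the projection direction information further includes attribute information relating to decoding.
(4) The image processing apparatus according to (3), wherein the attribute information includes information on a decoding order.
(5) The image processing apparatus according to (3), wherein the attribute information includes semantic information indicating a characteristic in the three-dimensional structure.
(6) The image processing apparatus according to (1), further including a projection direction information generation unit that generates the projection direction information, wherein the bitstream generation unit generates a bitstream including the projection direction information generated by the projection direction information generation unit and the encoded data.
(7) The image processing apparatus according to (1), further including a packing unit that arranges and packs the geometry image in an image according to the projection direction information, and an encoding unit that encodes the image in which the geometry image has been packed by the packing unit and generates the encoded data, wherein the bitstream generation unit generates a bitstream including the projection direction information and the encoded data generated by the encoding unit.
(8) An image processing method including generating a bitstream including projection direction information, which is information on the direction in which position information of 3D data representing a three-dimensional structure is projected onto a two-dimensional plane, and encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane.
(9) An image processing apparatus including a decoding unit that decodes, based on projection direction information, which is information on the direction in which position information of 3D data representing a three-dimensional structure is projected onto a two-dimensional plane, a bitstream including encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane.
(10) An image processing method including decoding, based on projection direction information, which is information on the direction in which position information of 3D data representing a three-dimensional structure is projected onto a two-dimensional plane, a bitstream including encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane.
(11) An image processing apparatus including a packing unit that arranges and packs a geometry image, obtained by projecting position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, in the coding unit of an image that corresponds to the direction in which the position information is projected onto the two-dimensional plane.
(12) The image processing apparatus according to (11), wherein the packing unit packs the geometry image based on coding unit information indicating the correspondence between an identifier of the projection direction and information indicating the coding unit in which the geometry image is arranged.
(13) The image processing apparatus according to (12), wherein the coding unit information further includes attribute information relating to decoding.
(14) The image processing apparatus according to (13), wherein the attribute information includes semantic information indicating a characteristic in the three-dimensional structure.
(15) The image processing apparatus according to (12), further including a coding unit information generation unit that generates the coding unit information, wherein the packing unit packs the geometry image based on the coding unit information generated by the coding unit information generation unit.
(16) The image processing apparatus according to (12), further including an encoding unit that encodes the image in which the geometry image has been packed by the packing unit and generates encoded data, and a bitstream generation unit that generates a bitstream including the coding unit information and the encoded data generated by the encoding unit.
(17) The image processing apparatus according to (11), wherein the coding unit is a slice, a tile, or a picture.
(18) An image processing method including arranging and packing a geometry image, obtained by projecting position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, in the coding unit of an image that corresponds to the direction in which the position information is projected onto the two-dimensional plane.
(19) An image processing apparatus including a decoding unit that decodes, from the coding unit of a bitstream that corresponds to the direction in which position information of 3D data representing a three-dimensional structure is projected onto a two-dimensional plane, a geometry image projected onto the two-dimensional plane in that projection direction.
(20) An image processing method including decoding, from the coding unit of a bitstream that corresponds to the direction in which position information of 3D data representing a three-dimensional structure is projected onto a two-dimensional plane, a geometry image projected onto the two-dimensional plane in that projection direction.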
Claims (20)
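- 1. An image processing apparatus comprising a bitstream generation unit that generates a bitstream including projection direction information, which is information on the direction in which position information of 3D data representing a three-dimensional structure is projected onto a two-dimensional plane, and encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane.
- 2. The image processing apparatus according to claim 1, wherein the projection direction information includes information on the correspondence between an identifier and the direction and position used when projecting the position information onto the two-dimensional plane.
- 3. The image processing apparatus according to claim 2, wherein the projection direction information further includes attribute information relating to decoding.
- 4. The image processing apparatus according to claim 3, wherein the attribute information includes information on a decoding order.
- 5. The image processing apparatus according to claim 3, wherein the attribute information includes semantic information indicating a characteristic in the three-dimensional structure.
- 6. The image processing apparatus according to claim 1, further comprising a projection direction information generation unit that generates the projection direction information, wherein the bitstream generation unit generates a bitstream including the projection direction information generated by the projection direction information generation unit and the encoded data.
- 7. The image processing apparatus according to claim 1, further comprising a packing unit that arranges and packs the geometry image in an image according to the projection direction information, and an encoding unit that encodes the image in which the geometry image has been packed by the packing unit and generates the encoded data, wherein the bitstream generation unit generates a bitstream including the projection direction information and the encoded data generated by the encoding unit.
- 8. An image processing method comprising generating a bitstream including projection direction information, which is information on the direction in which position information of 3D data representing a three-dimensional structure is projected onto a two-dimensional plane, and encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane.
- 9. An image processing apparatus comprising a decoding unit that decodes, based on projection direction information, which is information on the direction in which position information of 3D data representing a three-dimensional structure is projected onto a two-dimensional plane, a bitstream including encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane.
- 10. An image processing method comprising decoding, based on projection direction information, which is information on the direction in which position information of 3D data representing a three-dimensional structure is projected onto a two-dimensional plane, a bitstream including encoded data of a geometry image obtained by projecting the position information onto the two-dimensional plane.
- 11. An image processing apparatus comprising a packing unit that arranges and packs a geometry image, obtained by projecting position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, in the coding unit of an image that corresponds to the direction in which the position information is projected onto the two-dimensional plane.
- 12. The image processing apparatus according to claim 11, wherein the packing unit packs the geometry image based on coding unit information indicating the correspondence between an identifier of the projection direction and information indicating the coding unit in which the geometry image is arranged.
- 13. The image processing apparatus according to claim 12, wherein the coding unit information further includes attribute information relating to decoding.
- 14. The image processing apparatus according to claim 13, wherein the attribute information includes semantic information indicating a characteristic in the three-dimensional structure.
- 15. The image processing apparatus according to claim 12, further comprising a coding unit information generation unit that generates the coding unit information, wherein the packing unit packs the geometry image based on the coding unit information generated by the coding unit information generation unit.
- 16. The image processing apparatus according to claim 12, further comprising an encoding unit that encodes the image in which the geometry image has been packed by the packing unit and generates encoded data, and a bitstream generation unit that generates a bitstream including the coding unit information and the encoded data generated by the encoding unit.
- 17. The image processing apparatus according to claim 11, wherein the coding unit is a slice, a tile, or a picture.
- 18. An image processing method comprising arranging and packing a geometry image, obtained by projecting position information of 3D data representing a three-dimensional structure onto a two-dimensional plane, in the coding unit of an image that corresponds to the direction in which the position information is projected onto the two-dimensional plane.
- 19. An image processing apparatus comprising a decoding unit that decodes, from the coding unit of a bitstream that corresponds to the direction in which position information of 3D data representing a three-dimensional structure is projected onto a two-dimensional plane, a geometry image projected onto the two-dimensional plane in that projection direction.
- 20. An image processing method comprising decoding, from the coding unit of a bitstream that corresponds to the direction in which position information of 3D data representing a three-dimensional structure is projected onto a two-dimensional plane, a geometry image projected onto the two-dimensional plane in that projection direction.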
Priority Applications (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201980024400.7A CN111937388A (zh) | 2018-04-11 | 2019-03-28 | 图像处理装置和方法 |
| US16/981,722 US11310518B2 (en) | 2018-04-11 | 2019-03-28 | Image processing apparatus and method |
| EP19784358.4A EP3780613A4 (en) | 2018-04-11 | 2019-03-28 | IMAGE PROCESSING DEVICE AND METHOD |
| KR1020207027270A KR20200140256A (ko) | 2018-04-11 | 2019-03-28 | 화상 처리 장치 및 방법 |
| JP2020513193A JPWO2019198521A1 (ja) | 2018-04-11 | 2019-03-28 | 画像処理装置および方法 |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2018076225 | 2018-04-11 | ||
| JP2018-076225 | 2018-04-11 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2019198521A1 true WO2019198521A1 (ja) | 2019-10-17 |
Family
ID=68163656
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2019/013535 Ceased WO2019198521A1 (ja) | 2018-04-11 | 2019-03-28 | 画像処理装置および方法 |
Country Status (7)
| Country | Link |
|---|---|
| US (1) | US11310518B2 (ja) |
| EP (1) | EP3780613A4 (ja) |
| JP (1) | JPWO2019198521A1 (ja) |
| KR (1) | KR20200140256A (ja) |
| CN (1) | CN111937388A (ja) |
| TW (1) | TW201946449A (ja) |
| WO (1) | WO2019198521A1 (ja) |
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2021210513A1 (ja) * | 2020-04-13 | 2021-10-21 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ | 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置 |
| JP2021529482A (ja) * | 2018-06-30 | 2021-10-28 | 華為技術有限公司Huawei Technologies Co.,Ltd. | 点群符号化方法、点群復号化方法、符号器、及び復号器 |
| WO2021261516A1 (ja) * | 2020-06-23 | 2021-12-30 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ | 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置 |
| JP2023508271A (ja) * | 2020-01-07 | 2023-03-02 | エルジー エレクトロニクス インコーポレイティド | ポイントクラウドデータ送信装置、ポイントクラウドデータ送信方法、ポイントクラウドデータ受信装置及びポイントクラウドデータ受信方法 |
| JP2024016955A (ja) * | 2022-07-27 | 2024-02-08 | 日本放送協会 | 符号化装置、ストリーム合成装置、復号装置、およびプログラム |
| JP2024138070A (ja) * | 2019-06-28 | 2024-10-07 | エルジー エレクトロニクス インコーポレイティド | ポイントクラウドデータ処理装置及び方法 |
Families Citing this family (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111566703B (zh) * | 2018-01-17 | 2023-10-20 | 索尼公司 | 图像处理装置和方法 |
| JP7100523B2 (ja) * | 2018-07-27 | 2022-07-13 | 京セラ株式会社 | 表示装置、表示システムおよび移動体 |
| WO2020036384A1 (en) * | 2018-08-12 | 2020-02-20 | Lg Electronics Inc. | An apparatus for transmitting a video, a method for transmitting a video, an apparatus for receiving a video, and a method for receiving a video |
| US11956478B2 (en) * | 2019-01-09 | 2024-04-09 | Tencent America LLC | Method and apparatus for point cloud chunking for improved patch packing and coding efficiency |
| WO2020230710A1 (ja) * | 2019-05-10 | 2020-11-19 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ | 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置 |
| US11575935B2 (en) * | 2019-06-14 | 2023-02-07 | Electronics And Telecommunications Research Institute | Video encoding method and video decoding method |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2019055963A1 (en) * | 2017-09-18 | 2019-03-21 | Apple Inc. | COMPRESSION OF CLOUD OF POINTS |
Family Cites Families (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2018022940A1 (en) * | 2016-07-27 | 2018-02-01 | Align Technology, Inc. | Intraoral scanner with dental diagnostics capabilities |
| US11514613B2 (en) * | 2017-03-16 | 2022-11-29 | Samsung Electronics Co., Ltd. | Point cloud and mesh compression using image/video codecs |
| EP3642800B1 (en) * | 2017-07-10 | 2025-02-26 | Samsung Electronics Co., Ltd. | Point cloud and mesh compression using image/video codecs |
| US11405643B2 (en) * | 2017-08-15 | 2022-08-02 | Nokia Technologies Oy | Sequential encoding and decoding of volumetric video |
| TWI815842B (zh) * | 2018-01-16 | 2023-09-21 | 日商索尼股份有限公司 | 影像處理裝置及方法 |
| CN110049323B (zh) * | 2018-01-17 | 2021-09-07 | 华为技术有限公司 | 编码方法、解码方法和装置 |
| US10887574B2 (en) * | 2018-07-31 | 2021-01-05 | Intel Corporation | Selective packing of patches for immersive video |
| US12010350B2 (en) * | 2019-03-22 | 2024-06-11 | Lg Electronics Inc. | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |
| WO2020197086A1 (ko) * | 2019-03-25 | 2020-10-01 | 엘지전자 주식회사 | 포인트 클라우드 데이터 송신 장치, 포인트 클라우드 데이터 송신 방법, 포인트 클라우드 데이터 수신 장치 및 포인트 클라우드 데이터 수신 방법 |
-
2019
- 2019-03-28 EP EP19784358.4A patent/EP3780613A4/en not_active Withdrawn
- 2019-03-28 KR KR1020207027270A patent/KR20200140256A/ko not_active Withdrawn
- 2019-03-28 US US16/981,722 patent/US11310518B2/en active Active
- 2019-03-28 CN CN201980024400.7A patent/CN111937388A/zh not_active Withdrawn
- 2019-03-28 JP JP2020513193A patent/JPWO2019198521A1/ja active Pending
- 2019-03-28 WO PCT/JP2019/013535 patent/WO2019198521A1/ja not_active Ceased
- 2019-04-01 TW TW108111474A patent/TW201946449A/zh unknown
Patent Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2019055963A1 (en) * | 2017-09-18 | 2019-03-21 | Apple Inc. | COMPRESSION OF CLOUD OF POINTS |
Non-Patent Citations (7)
| Title |
|---|
| "Advanced video coding for generic audiovisual services", TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU (INTERNATIONAL TELECOMMUNICATION UNION, April 2017 (2017-04-01) |
| "High efficiency video coding", TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU (INTERNATIONAL TELECOMMUNICATION UNION, December 2016 (2016-12-01) |
| JIANLE CHENELENA ALSHINAGARY J. SULLIVANJENS-RAINER, JILL BOYCE: "Algorithm Description of Joint Exploration Test Model 4", JVET-G1001 V1, JOINT VIDEO EXPLORATION TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 7TH MEETING, 13 July 2017 (2017-07-13) |
| MARIUS PREDA, POINT CLOUD COMPRESSION IN MPEG, 28 October 2017 (2017-10-28), XP055643992, Retrieved from the Internet <URL:https://mpeg.chiariglione.org/sites/default/files/events/05_MP20%20PPC%20Preda%202017.pdf> [retrieved on 20190510] * |
| R. MEKURIAK. BLOMP. CESAR, DESIGN, IMPLEMENTATION AND EVALUATION OF A POINT CLOUD CODEC FOR TELE-IMMERSIVE VIDEO |
| See also references of EP3780613A4 |
| TILO OCHOTTA ET AL., IMAGE-BASED SURFACE COMPRESSION, 2008, XP055534929, Retrieved from the Internet <URL:http://kops.uni-konstanz.de/bitstream/handle/123456789/3009/Ochotta-opus-117979.pdf> [retrieved on 20190510] * |
Cited By (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11704837B2 (en) | 2018-06-30 | 2023-07-18 | Huawei Technologies Co., Ltd. | Point cloud encoding method, point cloud decoding method, encoder, and decoder |
| US11328450B2 (en) | 2018-06-30 | 2022-05-10 | Huawei Technologies Co., Ltd. | Point cloud encoding method, point cloud decoding method, encoder, and decoder |
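| JP2021529482A (ja) * | 2018-06-30 | 2021-10-28 | 華為技術有限公司 (Huawei Technologies Co., Ltd.) | Point cloud encoding method, point cloud decoding method, encoder, and decoder |
| JP7057453B2 (ja) | 2018-06-30 | 2022-04-19 | 華為技術有限公司 (Huawei Technologies Co., Ltd.) | Point cloud encoding method, point cloud decoding method, encoder, and decoder |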
| US12464138B2 (en) | 2019-06-28 | 2025-11-04 | Lg Electronics Inc. | Apparatus and method for processing point cloud data |
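| JP2024138070A (ja) * | 2019-06-28 | 2024-10-07 | LG Electronics Inc. | Point cloud data processing apparatus and method |
| JP7798974B2 (ja) | 2019-06-28 | 2026-01-14 | LG Electronics Inc. | Point cloud data processing apparatus and method |
| JP2023508271A (ja) * | 2020-01-07 | 2023-03-02 | LG Electronics Inc. | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |
| JP7448660B2 (ja) | 2020-01-07 | 2024-03-12 | LG Electronics Inc. | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |
| JP2024063123A (ja) * | 2020-01-07 | 2024-05-10 | LG Electronics Inc. | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |
| JP7753419B2 (ja) | 2020-01-07 | 2025-10-14 | LG Electronics Inc. | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |
| WO2021210513A1 (ja) * | 2020-04-13 | 2021-10-21 | Panasonic Intellectual Property Corporation of America | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device |
| WO2021261516A1 (ja) * | 2020-06-23 | 2021-12-30 | Panasonic Intellectual Property Corporation of America | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device |
| JP2024016955A (ja) * | 2022-07-27 | 2024-02-08 | 日本放送協会 (Japan Broadcasting Corporation) | Encoding device, stream synthesis device, decoding device, and program |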
Also Published As
| Publication number | Publication date |
|---|---|
| EP3780613A1 (en) | 2021-02-17 |
| KR20200140256A (ko) | 2020-12-15 |
| US11310518B2 (en) | 2022-04-19 |
| CN111937388A (zh) | 2020-11-13 |
| US20210250600A1 (en) | 2021-08-12 |
| TW201946449A (zh) | 2019-12-01 |
| EP3780613A4 (en) | 2021-05-19 |
| JPWO2019198521A1 (ja) | 2021-05-13 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2019198521A1 (ja) | | Image processing device and method |
| JP7552828B2 (ja) | | Image processing device and method |
| TWI815842B (zh) | | Image processing device and method |
| US11699248B2 (en) | | Image processing apparatus and method |
| US20210027505A1 (en) | | Image processing apparatus and method |
| US11399189B2 (en) | | Image processing apparatus and method |
| JP7331852B2 (ja) | | Image processing device and method |
| US11356690B2 (en) | | Image processing apparatus and method |
| JP7396302B2 (ja) | | Image processing device and method |
| HK40029724A (en) | | Image processing device and method |
| HK40029724B (zh) | | Image processing device and method |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 19784358; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2020513193; Country of ref document: JP; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 20207027270; Country of ref document: KR; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2019784358; Country of ref document: EP; Effective date: 20201111 |