WO2024205274A2 - Image encoding/decoding method and device, and recording medium storing bitstreams
- Publication number
- WO2024205274A2 (PCT/KR2024/003974)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- current block
- information
- transformation
- transform
- block
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/12—Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
- H04N19/122—Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/129—Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/13—Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/18—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/91—Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
Definitions
- the present invention relates to a video encoding/decoding method, a device, and a recording medium storing a bitstream. Specifically, the present invention relates to a video encoding/decoding method, a device, and a recording medium storing a bitstream that uses a method of performing residual coding by using information of a valid area where non-zero transform coefficients exist.
- In the existing residual coding technique, a fixed-size N x N square area is used when selecting a context model for an arbitrary syntax element in residual coding. As a result, the residual characteristics of the current block are not properly reflected, so the optimal context model cannot be selected and the coding efficiency is reduced.
- the purpose of the present invention is to provide a video encoding/decoding method and device with improved encoding/decoding efficiency.
- the present invention aims to provide a recording medium storing a bitstream generated by an image encoding method or device according to the present invention.
- the present invention aims to provide a residual coding method performed based on a valid region reflecting the characteristics of the transform coefficients of the current block in order to solve the problem of the residual coding.
- a video decoding method includes a step of determining a transform type of a current block, a step of determining context information on transform coefficient information of the current block based on the transform type, and a step of entropy decoding the transform coefficient information based on the context information, wherein the context information can be determined based on whether the transform type of the current block is a non-separable transform.
- the transform coefficient information is determined based on transform coefficient information of a neighboring transform coefficient adjacent to a current transform coefficient, the neighboring transform coefficient is located within a valid area of the current block, and the valid area of the current block is determined based on whether a transform type of the current block is a non-separable transform.
- the current block may be characterized by being composed of a zero area and a valid area.
- the neighboring transform coefficients are determined based on the scan direction for transforming the current block.
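- As a rough illustration of the neighbour-based context derivation described above, the sketch below counts non-zero neighbouring coefficients and skips any neighbour that falls outside the valid area. The five-position template (suited to a reverse diagonal scan), and the names `context_index`, `valid_w` and `valid_h`, are assumptions made for illustration; the source does not specify them.

```python
# Hypothetical neighbour template: offsets (dx, dy) to the right of and below
# the current coefficient, i.e. positions already coded in a reverse scan.
TEMPLATE = [(1, 0), (2, 0), (0, 1), (0, 2), (1, 1)]

def context_index(coeffs, x, y, valid_w, valid_h):
    """Count non-zero template neighbours that lie inside the valid area."""
    count = 0
    for dx, dy in TEMPLATE:
        nx, ny = x + dx, y + dy
        # Neighbours outside the valid area belong to the zero area: skip them.
        if nx < valid_w and ny < valid_h and coeffs[ny][nx] != 0:
            count += 1
    return count

block = [
    [9, 4, 1, 0],
    [3, 2, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
ctx_full = context_index(block, 0, 0, valid_w=4, valid_h=4)   # whole block valid
ctx_small = context_index(block, 0, 0, valid_w=2, valid_h=2)  # 2x2 valid area
```

- With a smaller valid area, fewer neighbours contribute, so the same coefficient position can map to a different context.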
- the transform coefficient information may be binarized based on the valid area.
- the transform coefficient information may include information indicating a prefix for a position value of the last non-zero coefficient in the current block, and information indicating a suffix for a position value of the last non-zero coefficient in the current block.
- the length of information indicating a prefix for the position value of the last non-zero coefficient in the current block can be set to a maximum value determined based on the valid area.
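- One way to picture a prefix length bounded by the valid area is sketched below. The `2 * log2(size) - 1` rule is borrowed from HEVC/VVC-style last-position coding and is an assumption here, as are the function names; the source only states that the maximum is determined from the valid area.

```python
def max_last_pos_prefix(valid_size):
    """Largest prefix value for a last-position coordinate in [0, valid_size).

    Assumes valid_size is a power of two and applies the
    (2 * log2(size) - 1) rule used in HEVC/VVC-style last-position coding.
    """
    if valid_size < 1 or valid_size & (valid_size - 1):
        raise ValueError("valid_size must be a positive power of two")
    log2_size = valid_size.bit_length() - 1
    return max(2 * log2_size - 1, 0)

def truncated_unary(value, c_max):
    """Truncated-unary bins: `value` ones, then a zero unless value == c_max."""
    bins = [1] * value
    if value < c_max:
        bins.append(0)
    return bins
```

- Under these assumptions, a 32-wide block whose valid area is only 8 wide would need at most 5 prefix bins instead of 9, which reduces the number of context-coded bins.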
- the transform coefficient information can indicate information about each coefficient group divided from the current block.
- the transform coefficient information may indicate whether a coefficient group includes a non-zero transform coefficient, and may be encoded only for a coefficient group included within the valid area.
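- The restriction of coefficient-group flags to the valid area can be sketched as follows. The 4x4 coefficient-group size and the helper name `cgs_to_signal` are assumptions for illustration only.

```python
def cgs_to_signal(block_w, block_h, valid_w, valid_h, cg_size=4):
    """List (cg_x, cg_y) indices of coefficient groups inside the valid area.

    Only these groups would carry a "contains a non-zero coefficient" flag;
    groups in the zero area are inferred to be all zero.
    """
    signalled = []
    for y in range(0, block_h, cg_size):
        for x in range(0, block_w, cg_size):
            if x < valid_w and y < valid_h:
                signalled.append((x // cg_size, y // cg_size))
    return signalled

groups = cgs_to_signal(16, 16, valid_w=8, valid_h=8)
```

- For a 16x16 block with an 8x8 valid area, only 4 of the 16 coefficient groups would be signalled in this sketch.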
- a video encoding method includes a step of determining a transform type of a current block, a step of determining context information for transform coefficient information of the current block based on the transform type, and a step of entropy encoding the transform coefficient information based on the context information, wherein the context information can be determined based on whether the transform type of the current block is a non-separable transform.
- a non-transitory computer-readable recording medium can store a bitstream generated by a video encoding method, including the steps of determining a transform type of a current block, determining context information about transform coefficient information of the current block based on the transform type, and entropy encoding the transform coefficient information based on the context information, wherein the context information is determined based on whether the transform type of the current block is a non-separable transform.
- a bitstream transmission method can transmit a bitstream generated by a video encoding method, the video encoding method comprising a step of determining a transform type of a current block, a step of determining context information about transform coefficient information of the current block based on the transform type, and a step of entropy encoding the transform coefficient information based on the context information, wherein the context information is determined based on whether the transform type of the current block is a non-separable transform.
- a video encoding/decoding method and device with improved encoding/decoding efficiency can be provided.
- a residual coding method performed based on a valid region reflecting the characteristics of the transform coefficients of the current block can be provided.
- the number of context encoding bins used for residual coding can be reduced.
- Figure 1 is a block diagram showing the configuration according to one embodiment of an encoding device to which the present invention is applied.
- FIG. 2 is a block diagram showing the configuration of one embodiment of a decoding device to which the present invention is applied.
- FIG. 3 is a diagram schematically showing a video coding system to which the present invention can be applied.
- FIG. 4 is a diagram for explaining a zeroing method in non-separable transformation according to one embodiment of the present invention.
- FIG. 5 is a drawing for explaining a valid area derived from a transformation result for a current block according to an embodiment of the present invention.
- Figure 6 is a diagram illustrating one embodiment of a CG divided from a transformation block.
- FIG. 7 is a diagram for explaining a method of encoding a flag indicating the presence or absence of a non-zero coefficient in a CG according to one embodiment of the present invention.
- FIG. 8 is a diagram illustrating one embodiment of a method for determining a context model based on a valid area according to one embodiment of the present invention.
- FIG. 9 is a diagram illustrating one embodiment of a method for determining a context model based on a valid area according to one embodiment of the present invention.
- Figure 10 is a flowchart illustrating an image decoding method according to an embodiment of the present invention.
- FIG. 11 is a drawing exemplarily showing a content streaming system to which an embodiment according to the present invention can be applied.
- first, second, etc. may be used to describe various components, but the components should not be limited by the terms. The terms are only used for the purpose of distinguishing one component from another.
- the first component may be referred to as the second component, and similarly, the second component may also be referred to as the first component.
- the term and/or includes a combination of a plurality of related described items or any item among a plurality of related described items.
- the components shown in the embodiments of the present invention are depicted independently to indicate different characteristic functions; this does not mean that each component is formed as a separate hardware or software unit. That is, each component is listed as a separate component for convenience of explanation. At least two components may be combined into a single component, or one component may be divided into multiple components that together perform a function, and such integrated and separated embodiments are also included in the scope of the present invention as long as they do not deviate from the essence of the present invention.
- the terminology used in the present invention is only used to describe specific embodiments and is not intended to limit the present invention.
- the singular expression includes the plural expression unless the context clearly indicates otherwise.
- some components of the present invention are not essential components that perform essential functions in the present invention and may be optional components only for improving performance.
- the present invention may be implemented by including only essential components for implementing the essence of the present invention excluding components only used for improving performance, and a structure including only essential components excluding optional components only used for improving performance is also included in the scope of the present invention.
- the term "at least one" can mean one of a number greater than or equal to 1, such as 1, 2, 3, and 4.
- the term "a plurality of" can mean one of a number greater than or equal to 2, such as 2, 3, and 4.
- image may mean one picture constituting a video, and may also represent the video itself.
- encoding and/or decoding of an image may mean "encoding and/or decoding of a video," and may also mean "encoding and/or decoding of one of the images constituting the video."
- the target image may be an encoding target image that is a target of encoding and/or a decoding target image that is a target of decoding.
- the target image may be an input image input to an encoding device and may be an input image input to a decoding device.
- the target image may have the same meaning as the current image.
- encoder and image encoding device may be used interchangeably and have the same meaning.
- decoder and image decoding device may be used interchangeably and have the same meaning.
- image may be used with the same meaning and may be used interchangeably.
- target block may be an encoding target block that is a target of encoding and/or a decoding target block that is a target of decoding.
- target block may be a current block that is a target of current encoding and/or decoding.
- target block and current block may be used with the same meaning and may be used interchangeably.
- a coding tree unit may be composed of one luma component (Y) coding tree block (CTB) and two chroma component (Cb, Cr) coding tree blocks related to it.
- sample may represent a basic unit constituting a block.
- Figure 1 is a block diagram showing the configuration according to one embodiment of an encoding device to which the present invention is applied.
- the encoding device (100) may be an encoder, a video encoding device, or an image encoding device.
- the video may include one or more images.
- the encoding device (100) may sequentially encode one or more images.
- an encoding device (100) may include an image segmentation unit (110), an intra prediction unit (120), a motion prediction unit (121), a motion compensation unit (122), a switch (115), a subtractor (113), a transformation unit (130), a quantization unit (140), an entropy encoding unit (150), an inverse quantization unit (160), an inverse transformation unit (170), an adder (117), a filter unit (180), and a reference picture buffer (190).
- the encoding device (100) can generate a bitstream including encoded information through encoding an input image, and output the generated bitstream.
- the generated bitstream can be stored in a computer-readable recording medium, or can be streamed through a wired/wireless transmission medium.
- the image segmentation unit (110) can segment the input video into various forms to increase the efficiency of video encoding/decoding. That is, the input video is composed of multiple pictures, and one picture can be hierarchically segmented and processed for compression efficiency, parallel processing, etc. For example, one picture can be segmented into one or multiple tiles or slices, and then segmented again into multiple CTUs (Coding Tree Units). Alternatively, one picture can be segmented into multiple sub-pictures defined as groups of rectangular slices, and each sub-picture can be segmented into tiles/slices. Here, the sub-pictures can be used to support partially independent encoding/decoding and transmission of the picture.
- a brick can be utilized as a basic unit of intra-picture parallel processing.
- one CTU can be recursively split into a quad tree (QT: Quadtree), and the terminal node of the split can be defined as a CU (Coding Unit).
- the CU can be split into a PU (Prediction Unit), which is a prediction unit, and a TU (Transform Unit), which is a transformation unit, to perform prediction and transformation. Meanwhile, the CU itself can be used as a prediction unit and/or a transformation unit.
- each CTU can be recursively split into not only a QT but also a multi-type tree (MTT: Multi-Type Tree).
- Splitting of a CTU into a multi-type tree can start from the terminal node of a QT, and the MTT can be composed of a BT (Binary Tree) and a TT (Ternary Tree).
- the MTT structure can be distinguished into vertical binary split mode (SPLIT_BT_VER), horizontal binary split mode (SPLIT_BT_HOR), vertical ternary split mode (SPLIT_TT_VER), and horizontal ternary split mode (SPLIT_TT_HOR).
- the minimum block size (MinQTSize) of the quad tree of the luma block during splitting can be set to 16x16.
- the maximum block size (MaxBtSize) of the binary tree can be set to 128x128, and the maximum block size (MaxTtSize) of the ternary tree can be set to 64x64.
- the minimum block size (MinBtSize) of the binary tree and the minimum block size (MinTtSize) of the ternary tree can be set to 4x4.
- the maximum depth (MaxMttDepth) of the multi-type tree can be set to 4.
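- To make the effect of these limits concrete, the sketch below lists which multi-type-tree splits a block could use. The parameter values come from the text above, but the checking rules are a deliberate simplification and the names `PartitionLimits` and `allowed_mtt_splits` are hypothetical; a real codec applies further restrictions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PartitionLimits:
    min_qt_size: int = 16    # MinQTSize
    max_bt_size: int = 128   # MaxBtSize
    max_tt_size: int = 64    # MaxTtSize
    min_bt_size: int = 4     # MinBtSize
    min_tt_size: int = 4     # MinTtSize
    max_mtt_depth: int = 4   # MaxMttDepth

def allowed_mtt_splits(width, height, mtt_depth, limits=PartitionLimits()):
    """Return the split modes permitted under the simplified size/depth checks."""
    if mtt_depth >= limits.max_mtt_depth:
        return []
    splits = []
    if max(width, height) <= limits.max_bt_size:
        if width // 2 >= limits.min_bt_size:
            splits.append("SPLIT_BT_VER")
        if height // 2 >= limits.min_bt_size:
            splits.append("SPLIT_BT_HOR")
    if max(width, height) <= limits.max_tt_size:
        # A ternary split produces 1/4, 1/2, 1/4 parts; check the smallest.
        if width // 4 >= limits.min_tt_size:
            splits.append("SPLIT_TT_VER")
        if height // 4 >= limits.min_tt_size:
            splits.append("SPLIT_TT_HOR")
    return splits
```

- For example, a 128x128 block at depth 0 exceeds MaxTtSize, so only the two binary splits remain in this sketch.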
- a dual tree that uses different CTU split structures for luma and chrominance components can be applied to improve the encoding efficiency of the I slice.
- the luminance and chrominance CTBs (Coding Tree Blocks) within the CTU can be split into a single tree sharing the coding tree structure.
- the encoding device (100) may perform encoding on the input image in the intra mode and/or the inter mode.
- the encoding device (100) may perform encoding on the input image in a third mode (e.g., IBC mode, Palette mode, etc.) other than the intra mode and the inter mode.
- the third mode may be classified as the intra mode or the inter mode for convenience of explanation. In the present invention, the third mode will be classified and described separately only when a specific explanation is required.
- when the intra mode is used as the prediction mode, the switch (115) can be switched to intra, and when the inter mode is used as the prediction mode, the switch (115) can be switched to inter.
- the intra mode can mean an intra-screen prediction mode
- the inter mode can mean an inter-screen prediction mode.
- the encoding device (100) can generate a prediction block for an input block of an input image.
- the encoding device (100) can encode a residual block using a residual of the input block and the prediction block.
- the input image can be referred to as a current image which is a current encoding target.
- the input block can be referred to as a current block which is a current encoding target or an encoding target block.
- the intra prediction unit (120) can use samples of blocks already encoded/decoded around the current block as reference samples.
- the intra prediction unit (120) can perform spatial prediction on the current block using the reference samples, and can generate prediction samples for the input block through spatial prediction.
- intra prediction can mean prediction within the screen.
- non-directional prediction modes such as DC mode and Planar mode and directional prediction modes (e.g., 65 directions) can be applied.
- the intra prediction method can be expressed as an intra prediction mode or an intra-screen prediction mode.
- the motion prediction unit (121) can search the reference image for the area that best matches the input block during the motion prediction process, and can derive a motion vector using the area found. Here, a search range can be used as the area to be searched.
- the reference image can be stored in the reference picture buffer (190).
- once encoding/decoding of the reference image has been processed, it can be stored in the reference picture buffer (190).
- the motion compensation unit (122) can generate a prediction block for the current block by performing motion compensation using a motion vector.
- inter prediction can mean inter-screen prediction or motion compensation.
- the above motion prediction unit (121) and motion compensation unit (122) can generate a prediction block by applying an interpolation filter to a portion of an area within a reference image when the value of a motion vector does not have an integer value.
- the AFFINE mode of sub-PU based prediction, the SbTMVP (Subblock-based Temporal Motion Vector Prediction) mode, the MMVD (Merge with MVD) mode, and the GPM (Geometric Partitioning Mode) of PU based prediction can be applied.
- HMVP (History-based MVP)
- PAMVP (Pairwise Average MVP)
- CIIP (Combined Intra/Inter Prediction)
- AMVR (Adaptive Motion Vector Resolution)
- BDOF (Bi-Directional Optical Flow)
- BCW (Bi-prediction with CU-level Weights)
- LIC (Local Illumination Compensation)
- TM (Template Matching)
- OBMC (Overlapped Block Motion Compensation)
- AFFINE mode is a technology that is used in both AMVP and MERGE modes and also has high encoding efficiency. Since the conventional video coding standard performs MC (Motion Compensation) by considering only the parallel translation of the block, there was a disadvantage in that it could not properly compensate for motions that occur in reality, such as zoom in/out and rotation. To supplement this, a four-parameter affine motion model using two control point motion vectors (CPMV) and a six-parameter affine motion model using three control point motion vectors can be applied to inter prediction.
- CPMV is a vector representing an affine motion model of one of the upper left, upper right, and lower left of the current block.
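- The four- and six-parameter affine models mentioned above can be written out as a small sketch. It uses the commonly cited affine formulas with control points at the upper-left, upper-right and lower-left corners; floating-point arithmetic and the function name `affine_mv` are simplifications assumed here, not details taken from the source.

```python
def affine_mv(x, y, width, height, cpmvs):
    """Motion vector at sample position (x, y) inside a width x height block.

    cpmvs holds 2 control-point MVs (upper-left, upper-right) for the
    four-parameter model, or 3 (plus lower-left) for the six-parameter model.
    """
    (v0x, v0y), (v1x, v1y) = cpmvs[0], cpmvs[1]
    a = (v1x - v0x) / width
    b = (v1y - v0y) / width
    if len(cpmvs) == 2:
        # Four-parameter model: zoom and rotation share the same two terms.
        c, d = -b, a
    else:
        v2x, v2y = cpmvs[2]
        c = (v2x - v0x) / height
        d = (v2y - v0y) / height
    return (a * x + c * y + v0x, b * x + d * y + v0y)
```

- With identical control-point vectors the model reduces to plain translation, which is the case conventional block-based motion compensation already handles.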
- the subtractor (113) can generate a residual block using the difference between the input block and the predicted block.
- the residual block may also be referred to as a residual signal.
- the residual signal may mean the difference between the original signal and the predicted signal.
- the residual signal may be a signal generated by transforming, quantizing, or transforming and quantizing the difference between the original signal and the predicted signal.
- the residual block may be a residual signal in block units.
- the transform unit (130) can perform a transform on the residual block to generate a transform coefficient and output the generated transform coefficient.
- the transform coefficient can be a coefficient value generated by performing a transform on the residual block.
- the transform unit (130) can also skip the transform on the residual block.
- a quantized level can be generated by applying quantization to a transform coefficient or a residual signal.
- a quantized level may also be referred to as a transform coefficient.
- a 4x4 luminance residual block generated through intra-screen prediction can be transformed using a basis vector based on DST (Discrete Sine Transform), and a basis vector based on DCT (Discrete Cosine Transform) can be used to transform the remaining residual blocks.
- with the RQT (Residual Quad Tree), a transform block is divided in a quad-tree shape for one block. After each transform block divided through the RQT is transformed and quantized, a coded block flag (cbf) can be transmitted to increase encoding efficiency when all coefficients become 0.
- the Multiple Transform Selection (MTS) technique can be applied to selectively perform transformation using multiple transformation bases. In addition, instead of dividing the CU into TUs through the RQT, a function similar to TU division can be performed through the Sub-block Transform (SBT) technique. Specifically, the SBT is applied only to inter-screen prediction blocks, and unlike the RQT, the current block can be divided into 1/2 or 1/4 sizes in the vertical or horizontal direction, and then the transformation can be performed on only one of the blocks. For example, if it is divided vertically, the transformation can be performed on the leftmost or rightmost block, and if it is divided horizontally, the transformation can be performed on the topmost or bottommost block.
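- The SBT geometry described above can be sketched as a helper that returns the one sub-block that is actually transformed. The function name `sbt_region` and its boolean parameters are hypothetical and chosen only for illustration.

```python
def sbt_region(width, height, vertical, quarter, first):
    """Return (x, y, w, h) of the single sub-block that is transformed.

    vertical: the block is divided into left/right parts if True,
              into top/bottom parts otherwise.
    quarter:  the transformed part is 1/4 of the block if True, else 1/2.
    first:    pick the leftmost/topmost part if True,
              the rightmost/bottommost part otherwise.
    """
    divisor = 4 if quarter else 2
    if vertical:
        w = width // divisor
        x = 0 if first else width - w
        return (x, 0, w, height)
    h = height // divisor
    y = 0 if first else height - h
    return (0, y, width, h)
```

- The remainder of the block outside the returned rectangle carries no transformed residual in this scheme.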
- LFNST (Low Frequency Non-Separable Transform), a secondary transform technique that additionally transforms the residual signal converted to the frequency domain through DCT or DST, can be applied.
- LFNST additionally performs a transform on the low-frequency region of 4x4 or 8x8 in the upper left, so that the residual coefficients can be concentrated in the upper left.
- the quantization unit (140) can generate a quantized level by quantizing a transform coefficient or a residual signal according to a quantization parameter (QP), and can output the generated quantized level. At this time, the quantization unit (140) can quantize the transform coefficient using a quantization matrix.
- a quantizer using QP values of 0 to 51 can be used.
- alternatively, QP values of 0 to 63 can be used.
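- As a hedged illustration of how a QP value relates to quantization, the sketch below uses the widely quoted approximation in which the step size doubles for every increase of 6 in QP. That relation, the rounding rule and the function names are assumptions here; the source does not give a formula, and no quantization matrix is modelled.

```python
def quant_step(qp):
    """Approximate step size: 1.0 at QP 4, doubling every 6 QP values."""
    return 2.0 ** ((qp - 4) / 6.0)

def quantize(coeff, qp):
    """Uniform scalar quantization with rounding to the nearest level."""
    step = quant_step(qp)
    sign = -1 if coeff < 0 else 1
    return sign * int(abs(coeff) / step + 0.5)

def dequantize(level, qp):
    """Reconstruct a coefficient value from a quantized level."""
    return level * quant_step(qp)
```

- At QP 22 the step is 8, so a coefficient of 100 becomes level 13 and is reconstructed as 104; the difference is the quantization error.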
- DQ (Dependent Quantization) performs quantization using two quantizers (e.g., Q0 and Q1), and even without signaling information about the use of a specific quantizer, the quantizer to be used for the next transform coefficient can be selected based on the current state through a state transition model.
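- The state-driven quantizer selection can be sketched as below. The four-state table, driven by the parity of each quantized level, follows the form commonly described for VVC and is an assumption here; the source only says that a state transition model is used.

```python
# next_state = STATE_TRANSITION[state][parity of the current level]
STATE_TRANSITION = [(0, 2), (2, 0), (1, 3), (3, 1)]

def quantizer_for_state(state):
    """States 0 and 1 select Q0; states 2 and 3 select Q1."""
    return "Q0" if state < 2 else "Q1"

def quantizer_sequence(levels, state=0):
    """Return the quantizer used for each level, in coding order."""
    used = []
    for level in levels:
        used.append(quantizer_for_state(state))
        # The decoder can repeat this update, so no extra signaling is needed.
        state = STATE_TRANSITION[state][level & 1]
    return used

sequence = quantizer_sequence([1, 0, 2, 3])
```

- Because the next state depends only on the previous state and the parity of the level just coded, encoder and decoder stay in step without any explicit quantizer flag.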
- the entropy encoding unit (150) can generate a bitstream by performing entropy encoding according to a probability distribution on values produced by the quantization unit (140) or coding parameter values produced in the encoding process, and can output the bitstream.
- the entropy encoding unit (150) can perform entropy encoding on information about image samples and information for decoding the image. For example, information for decoding the image can include syntax elements, etc.
- the entropy encoding unit (150) can use an encoding method such as exponential Golomb, Context-Adaptive Variable Length Coding (CAVLC), and Context-Adaptive Binary Arithmetic Coding (CABAC) for entropy encoding.
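- Of the methods listed, exponential Golomb coding is simple enough to show in full. The sketch below implements the order-0 variant for non-negative integers on bit strings; the order and the string representation are choices made for illustration.

```python
def exp_golomb_encode(value):
    """Order-0 exponential-Golomb codeword of a non-negative integer."""
    if value < 0:
        raise ValueError("value must be non-negative")
    binary = bin(value + 1)[2:]              # binary form of value + 1
    return "0" * (len(binary) - 1) + binary  # prefix of zeros, then the number

def exp_golomb_decode(bits):
    """Decode one codeword from the start of `bits`; return (value, rest)."""
    zeros = 0
    while bits[zeros] == "0":
        zeros += 1
    end = 2 * zeros + 1
    return int(bits[zeros:end], 2) - 1, bits[end:]
```

- Small values get short codewords (0 maps to a single bit), which suits syntax elements whose small values are the most frequent.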
- the entropy encoding unit (150) can perform entropy encoding using a Variable Length Coding/Code (VLC) table.
- the entropy encoding unit (150) may derive a binarization method of a target symbol and a probability model of a target symbol/bin, and then perform arithmetic encoding using the derived binarization method, probability model, and context model.
- when applying CABAC, in order to reduce the size of the probability table stored in the decoding device, the table-based probability update method can be replaced with an update method that uses a simple formula.
- two different probability models can be used to obtain more accurate symbol probability values.
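- A formula-based update with two probability models can be sketched as follows. The shift-based update rule, the specific shift values (4 and 7), the 15-bit precision and the class name are all assumptions for illustration; the source states only that a simple formula and two models can be used.

```python
class TwoRateProbability:
    """Probability that the next bin is 1, tracked at two adaptation rates."""

    def __init__(self, fast_shift=4, slow_shift=7, precision=15):
        self.fast_shift = fast_shift
        self.slow_shift = slow_shift
        self.one = 1 << precision
        self.p_fast = self.one >> 1   # both estimates start at 0.5
        self.p_slow = self.one >> 1

    def update(self, bin_value):
        target = self.one if bin_value else 0
        # p += (target - p) >> shift: arithmetic only, no lookup table.
        self.p_fast += (target - self.p_fast) >> self.fast_shift
        self.p_slow += (target - self.p_slow) >> self.slow_shift

    def probability(self):
        """Average of the fast and slow estimates, as a float in [0, 1]."""
        return (self.p_fast + self.p_slow) / (2 * self.one)

model = TwoRateProbability()
model.update(1)
```

- The fast estimate reacts quickly to local statistics while the slow one is more stable; averaging them is one plausible way to combine the two.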
- the entropy encoding unit (150) can change a two-dimensional block form coefficient into a one-dimensional vector form through a transform coefficient scanning method to encode a transform coefficient level (quantized level).
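- The 2-D to 1-D conversion can be illustrated with a diagonal scan. The particular scan pattern (anti-diagonals walked from lower-left to upper-right, starting at the top-left corner) is an assumption; the source does not fix which scanning method is used.

```python
def diagonal_scan_order(width, height):
    """Positions (x, y) visited along anti-diagonals, starting at the top-left."""
    order = []
    for d in range(width + height - 1):
        # Walk each anti-diagonal from its lower-left end to its upper-right end.
        for y in range(min(d, height - 1), -1, -1):
            x = d - y
            if x < width:
                order.append((x, y))
    return order

def scan_block(block):
    """Flatten a 2-D coefficient block into a 1-D list using the diagonal scan."""
    height, width = len(block), len(block[0])
    return [block[y][x] for x, y in diagonal_scan_order(width, height)]

flat = scan_block([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
```

- Codecs typically traverse such an order in reverse, from the last non-zero coefficient back toward the top-left position.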
- Coding parameters may include information (flags, indexes, etc.) encoded in an encoding device (100) and signaled to a decoding device (200), such as syntax elements, as well as information derived during an encoding process or a decoding process, and may mean information necessary when encoding or decoding an image.
- signaling a flag or index may mean that the encoder entropy encodes the flag or index and includes it in the bitstream, and that the decoder entropy decodes the flag or index from the bitstream.
- the encoded current image can be used as a reference image for other images to be processed later. Therefore, the encoding device (100) can restore or decode the encoded current image again, and store the restored or decoded image as a reference image in the reference picture buffer (190).
- the quantized level can be dequantized in the dequantization unit (160) and inverse transformed in the inverse transform unit (170).
- the dequantized and/or inverse transformed coefficients can be combined with a prediction block through an adder (117), and a reconstructed block can be generated by combining the dequantized and/or inverse transformed coefficients and the prediction block.
- the dequantized and/or inverse transformed coefficients mean coefficients on which at least one of dequantization and inverse transformation has been performed, and may mean a reconstructed residual block.
- the dequantization unit (160) and the inverse transform unit (170) can perform the reverse of the processes of the quantization unit (140) and the transform unit (130).
- the restoration block may pass through a filter unit (180).
- the filter unit (180) may apply a deblocking filter, a sample adaptive offset (SAO), an adaptive loop filter (ALF), a bilateral filter (BIF), LMCS (Luma Mapping with Chroma Scaling), etc. as a filtering technique, in whole or in part, to the restoration sample, restoration block, or restoration image.
- the filter unit (180) may also be called an in-loop filter. In this case, the term in-loop filter may also be used to refer to the filters excluding LMCS.
- the deblocking filter can remove block distortion that occurs at the boundary between blocks.
- different filters can be applied depending on the required deblocking filtering strength.
- a sample adaptive offset can be used to add an appropriate offset value to the sample value to compensate for the encoding error.
- the sample adaptive offset can correct the offset from the original image on a sample basis for the image on which deblocking has been performed.
- a method can be used in which the samples included in the image are divided into a certain number of regions, the regions to be offset are determined, and the offset is applied to the regions, or the offset is applied by considering the edge information of each sample.
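The region-based variant of the sample adaptive offset described above can be sketched as follows. This is a minimal illustration, not the method of the invention: the choice of 32 equal-width intensity bands, the 8-bit sample depth, the clipping step, and the names `sao_band_offset` and `band_offsets` are all assumptions made for the example.

```python
def sao_band_offset(samples, band_offsets, bit_depth=8, num_bands=32):
    """Add a per-band offset to each sample and clip to the valid sample range.

    band_offsets maps a band index to its offset; bands not listed get no offset.
    num_bands is assumed to be a power of two (hypothetical default of 32).
    """
    shift = bit_depth - (num_bands.bit_length() - 1)  # bit_depth - log2(num_bands)
    max_val = (1 << bit_depth) - 1
    out = []
    for s in samples:
        band = s >> shift  # region (band) the sample belongs to
        out.append(min(max(s + band_offsets.get(band, 0), 0), max_val))
    return out


deblocked = [3, 17, 40, 41, 250]   # samples after deblocking
offsets = {2: 1, 5: -2}            # only bands 2 and 5 are selected for correction
corrected = sao_band_offset(deblocked, offsets)
print(corrected)  # [3, 18, 38, 39, 250]
```

Only the samples falling into the selected bands change; all other samples pass through unmodified.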
- A bilateral filter can also compensate for the offset from the original image on a sample-by-sample basis for the deblocked image.
- An adaptive loop filter can perform filtering based on a comparison value between a restored image and an original image. After dividing samples included in an image into a predetermined group, a filter to be applied to each group can be determined, and filtering can be performed differentially for each group. Information related to whether to apply an adaptive loop filter can be signaled for each coding unit (CU), and the shape and filter coefficients of the adaptive loop filter to be applied can vary for each block.
- CU coding unit
- LMCS Luma Mapping with Chroma Scaling
- LM luma mapping
- CS chroma scaling
- LMCS can be utilized as an HDR correction technique that reflects the characteristics of HDR (High Dynamic Range) images.
- the restored block or restored image that has passed through the filter unit (180) may be stored in the reference picture buffer (190).
- the restored block that has passed through the filter unit (180) may be a part of the reference image.
- the reference image may be a restored image composed of restored blocks that have passed through the filter unit (180).
- the stored reference image may be used for inter prediction or motion compensation thereafter.
- FIG. 2 is a block diagram showing the configuration of one embodiment of a decoding device to which the present invention is applied.
- the decoding device (200) may be a decoder, a video decoding device, or an image decoding device.
- the decoding device (200) may include an entropy decoding unit (210), an inverse quantization unit (220), an inverse transformation unit (230), an intra prediction unit (240), a motion compensation unit (250), an adder (201), a switch (203), a filter unit (260), and a reference picture buffer (270).
- the decoding device (200) can receive a bitstream output from the encoding device (100).
- the decoding device (200) can receive a bitstream stored in a computer-readable recording medium, or can receive a bitstream streamed through a wired/wireless transmission medium.
- the decoding device (200) can perform decoding on the bitstream in an intra mode or an inter mode.
- the decoding device (200) can generate a restored image or a decoded image through decoding, and can output the restored image or the decoded image.
- if the prediction mode used for decoding is intra mode, the switch (203) can be switched to intra. If the prediction mode used for decoding is inter mode, the switch (203) can be switched to inter.
- the decoding device (200) can obtain a reconstructed residual block by decoding the input bitstream and can generate a prediction block. When the reconstructed residual block and the prediction block are obtained, the decoding device (200) can generate a reconstructed block to be decoded by adding the reconstructed residual block and the prediction block.
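The reconstruction step described above, adding the reconstructed residual block to the prediction block, can be sketched as follows. The clipping to the sample range and the 8-bit depth are assumptions added for the example; the function name `reconstruct_block` is hypothetical.

```python
def reconstruct_block(residual, prediction, bit_depth=8):
    """Add a reconstructed residual block to a prediction block, sample by sample.

    Clipping to [0, 2^bit_depth - 1] is an assumption of this sketch.
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(r + p, 0), max_val) for r, p in zip(res_row, pred_row)]
        for res_row, pred_row in zip(residual, prediction)
    ]


prediction = [[100, 102], [250, 4]]
residual = [[-3, 5], [10, -9]]
reconstructed = reconstruct_block(residual, prediction)
print(reconstructed)  # [[97, 107], [255, 0]]
```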
- the decoding target block can be referred to as a current block.
- the entropy decoding unit (210) can generate symbols by performing entropy decoding according to a probability distribution for the bitstream.
- the generated symbols can include symbols in the form of quantized levels.
- the entropy decoding method can be the reverse process of the entropy encoding method described above.
- the entropy decoding unit (210) can change a one-dimensional vector-shaped coefficient into a two-dimensional block-shaped coefficient through a transform coefficient scanning method to decode a transform coefficient level (quantized level).
- the quantized level can be dequantized in the dequantization unit (220) and inverse transformed in the inverse transform unit (230).
- the quantized level can be generated as a restored residual block as a result of the dequantization and/or inverse transformation.
- the dequantization unit (220) can apply a quantization matrix to the quantized level.
- the dequantization unit (220) and the inverse transform unit (230) applied to the decoding device can apply the same technology as the dequantization unit (160) and the inverse transform unit (170) applied to the encoding device described above.
- the intra prediction unit (240) can generate a prediction block by performing spatial prediction on the current block using sample values of already decoded blocks surrounding the block to be decoded.
- the intra prediction unit (240) applied to the decoding device can apply the same technology as the intra prediction unit (120) applied to the encoding device described above.
- the motion compensation unit (250) can perform motion compensation using a motion vector and a reference image stored in the reference picture buffer (270) for the current block to generate a prediction block.
- the motion compensation unit (250) can apply an interpolation filter to a part of the reference image to generate a prediction block when the value of the motion vector does not have an integer value.
- the motion compensation unit (250) applied to the decoding device can apply the same technology as the motion compensation unit (122) applied to the encoding device described above.
- the adder (201) can add the restored residual block and the prediction block to generate a restored block.
- the filter unit (260) can apply at least one of an Inverse-LMCS, a deblocking filter, a sample adaptive offset, and an adaptive loop filter to the restored block or the restored image.
- the filter unit (260) applied to the decoding device can apply the same filtering technology as that applied to the filter unit (180) applied to the above-described encoding device.
- the filter unit (260) can output a restored image.
- the restored block or restored image can be stored in the reference picture buffer (270) and used for inter prediction.
- the restored block that has passed through the filter unit (260) can be a part of the reference image.
- the reference image can be a restored image composed of restored blocks that have passed through the filter unit (260).
- the stored reference image can be used for inter prediction or motion compensation thereafter.
- FIG. 3 is a diagram schematically showing a video coding system to which the present invention can be applied.
- a video coding system may include an encoding device (10) and a decoding device (20).
- the encoding device (10) may transmit encoded video and/or image information or data to the decoding device (20) in the form of a file or streaming through a digital storage medium or a network.
- An encoding device (10) may include a video source generating unit (11), an encoding unit (12), and a transmitting unit (13).
- a decoding device (20) may include a receiving unit (21), a decoding unit (22), and a rendering unit (23).
- the encoding unit (12) may be called a video/image encoding unit, and the decoding unit (22) may be called a video/image decoding unit.
- the transmitting unit (13) may be included in the encoding unit (12).
- the receiving unit (21) may be included in the decoding unit (22).
- the rendering unit (23) may include a display unit, and the display unit may be configured as a separate device or an external component.
- the video source generation unit (11) can obtain a video/image through a process of capturing, synthesizing, or generating a video/image.
- the video source generation unit (11) can include a video/image capture device and/or a video/image generation device.
- the video/image capture device can include, for example, one or more cameras, a video/image archive including previously captured video/image, etc.
- the video/image generation device can include, for example, a computer, a tablet, a smartphone, etc., and can generate a video/image (electronically).
- a virtual video/image can be generated through a computer, etc., and in this case, the video/image capture process can be replaced with a process of generating related data.
- the encoding unit (12) can encode the input video/image.
- the encoding unit (12) can perform a series of procedures such as prediction, transformation, and quantization for compression and encoding efficiency.
- the encoding unit (12) can output encoded data (encoded video/image information) in the form of a bitstream.
- the detailed configuration of the encoding unit (12) can also be configured in the same manner as the encoding device (100) of FIG. 1 described above.
- the transmission unit (13) can transmit encoded video/image information or data output in the form of a bitstream to the reception unit (21) of the decoding device (20) through a digital storage medium or a network in the form of a file or streaming.
- the digital storage medium can include various storage media such as USB, SD, CD, DVD, Blu-ray, HDD, SSD, etc.
- the transmission unit (13) can include an element for generating a media file through a predetermined file format and can include an element for transmission through a broadcasting/communication network.
- the reception unit (21) can extract/receive the bitstream from the storage medium or the network and transmit it to the decoding unit (22).
- the decoding unit (22) can decode video/image by performing a series of procedures such as inverse quantization, inverse transformation, and prediction corresponding to the operation of the encoding unit (12).
- the detailed configuration of the decoding unit (22) can also be configured in the same manner as the decoding device (200) of FIG. 2 described above.
- the rendering unit (23) can render the decoded video/image.
- the rendered video/image can be displayed through the display unit.
- Transform is a technology that converts a signal in the spatial domain into a signal in the frequency domain.
- high-resolution videos such as HD (high definition) videos or UHD (ultra-high definition) videos
- the latest video compression standards support transform for transform blocks with large sizes.
- the H.264/AVC standard supports transform only for transform blocks of sizes 4x4 and 8x8, but the HEVC standard supports transform for transform blocks of sizes from 4x4 to 32x32.
- the VVC standard supports transform for transform blocks of sizes up to 64x64.
- a method of zeroing the high frequency transform coefficients can be used when performing the transform.
- FIG. 4 is a diagram for explaining a zeroing method in non-separable transformation according to one embodiment of the present invention.
- the transform coefficients of the input block to be non-separably transformed can be scanned in a fixed direction and rearranged into a one-dimensional vector in the form of 1 x M.
- the fixed direction can be one of the row-major direction, the column-major direction, and the diagonal direction.
- M can be (TbW x TbH), which is the product of the width and the height of the input block of the transformation.
- M can be the product of the width and the height of a fixed region-of-interest (ROI) within the transformation block.
- ROI region-of-interest
- An M x N transform kernel can be applied to a rearranged 1 x M vector, where N can be a positive integer less than or equal to M. If M is larger than N, the high-frequency transform coefficients can be zeroed out. As a result of performing the transform, a one-dimensional vector in the form of 1 x N can be output.
- the 1-dimensional vector in the form of 1 x N generated as a result of the transformation can be rearranged into a 2-dimensional form by scanning in a predetermined direction of block units or CG (coefficient group) units.
- the predetermined direction can be one of the row-major direction, the column-major direction, and the diagonal direction.
- Figure 4 illustrates a case where zeroing is performed in a non-separable transform.
- a transform coefficient of 0 or non-zero may exist in a specific area of a block based on the (0, 0) position of the block.
- the transform coefficients of the remaining areas may all be 0.
- the area where a non-zero transform coefficient exists may be referred to as a valid area.
- the transform coefficients of the 8 x 8 transform block can be scanned in a predetermined direction and rearranged into a 1 x 64 vector. Then, a 64 x 32 transform kernel can be applied to the rearranged 1 x 64 vector.
- the 1 x 32 vector obtained as a result of performing the transform can be rearranged into a two-dimensional form by scanning in a predetermined direction.
- non-zero transform coefficients can exist only in 4 x 8 blocks, which are valid areas within the 8 x 8 transform block, and all transform coefficients located outside the valid area become 0.
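The 8 x 8 example above can be sketched in a few lines. This is only an illustration under stated assumptions: the row-major scan, the placeholder kernel (which simply keeps the first 32 inputs rather than being a trained transform kernel), the row-major placement inside the valid area, and all function names are hypothetical.

```python
def scan_row_major(block):
    """Rearrange a 2-D block into a 1 x M vector (row-major scan assumed)."""
    return [v for row in block for v in row]


def apply_kernel(vec, kernel):
    """Multiply a 1 x M vector by an M x N kernel, giving a 1 x N vector."""
    n_out = len(kernel[0])
    return [sum(vec[m] * kernel[m][n] for m in range(len(vec))) for n in range(n_out)]


def place_in_valid_area(coeffs, tb_w, tb_h, zo_w):
    """Write the N outputs into the left zo_w columns; everything else stays 0."""
    out = [[0] * tb_w for _ in range(tb_h)]
    for i, c in enumerate(coeffs):
        out[i // zo_w][i % zo_w] = c
    return out


TB_W = TB_H = 8
M, N = TB_W * TB_H, 32
block = [[r * TB_W + c + 1 for c in range(TB_W)] for r in range(TB_H)]
# Placeholder 64 x 32 kernel: NOT a real transform, used only to show the shapes.
kernel = [[1 if m == n else 0 for n in range(N)] for m in range(M)]

vec = scan_row_major(block)        # 1 x 64
coeffs = apply_kernel(vec, kernel) # 1 x 32, the remaining 32 outputs are zeroed out
out = place_in_valid_area(coeffs, TB_W, TB_H, zo_w=4)
```

After the last step, every position in columns 4 to 7 of `out` is 0, which corresponds to the zeroing region outside the 4 x 8 valid area.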
- FIG. 5 is a drawing for explaining a valid area derived from a transformation result for a current block according to an embodiment of the present invention.
- the size of the current block may be TbW x TbH.
- the current block may be divided into a region including N transform coefficients that are 0 or not 0 and a region including only 0 transform coefficients.
- the region including N transform coefficients that are 0 or not 0 may be referred to as a valid region.
- the region including only 0 transform coefficients may be referred to as a zeroing region.
- the size of the valid region may be ZoTbW x ZoTbH.
- encoding and/or decoding of transform coefficient information can be performed based on a valid area derived from a transform result for a current block.
- the transform coefficient information can include a syntax element indicating information about residual coding of the current block.
- binarization of transform coefficient information, encoding and/or decoding of transform coefficient information can be performed based on the valid area, and a context model for the transform coefficient information can be determined.
- the transform coefficient information can be binarized based on the valid area derived from the transform result of the current block.
- transform coefficient information is binarized using the size of a transform block or a fixed-size N x N square region. For example, in indicating the position of the last non-zero transform coefficient in a transform block, information indicating the x-coordinate position of the last non-zero transform coefficient and information indicating the y-coordinate position can be independently signaled.
- information indicating a prefix for the position of the last non-zero transform coefficient may be binarized using TU (truncated unary), and information indicating a suffix may be binarized using FLC (fixed length code).
- TU truncated unary
- FLC fixed length code
- the transform coefficient information can be binarized using the width/height of a predetermined region of size K x L determined based on various information.
- the values of K and L may be the same or different from each other.
- the values of width K and height L of a given region can be determined using the width ZoTbW and/or height ZoTbH of the effective region derived as a result of zeroing a separable transform or a non-separable transform.
- the values of ZoTbW and ZoTbH can be the same or different from each other.
- the maximum length of a codeword of information indicating a prefix for the x-coordinate position of the last non-zero transform coefficient can be determined according to Equation 1.
- the maximum length of a codeword of information indicating a prefix for the y-coordinate position of the last non-zero transform coefficient can be determined according to Equation 2.
- the current transform block is 8 x 8
- the size of the valid region of the current transform block with the zeroing of the non-separable transform applied is 4 x 8
- the position of the last non-zero transform coefficient is (3, 7)
- the information indicating the position of the last non-zero transform coefficient can be binarized to (111, 111111).
- when a separable transform is applied to a current block of size 32 x 32, the valid region may be a region of size 4 x 16. In the valid region, when the position of the last non-zero transform coefficient is (3, 15), the information indicating the position of the last non-zero transform coefficient may be binarized to (111, 1111111011).
- the method proposed in the present invention can set a non-square area as a valid area, thereby reducing the number of bins for binarizing the x-coordinate value 3 of the position of the last non-zero transform coefficient.
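The effect of the region size on the number of bins can be sketched as follows. Equations 1 and 2 are not reproduced in this text, so the sketch assumes the HEVC/VVC-style scheme in which the prefix is truncated-unary coded with a maximum value of `(log2(size) << 1) - 1` and the suffix is a fixed-length code; the function names are hypothetical.

```python
def last_pos_prefix_suffix(pos):
    """Split a last-coefficient coordinate into (prefix, suffix, suffix_len)."""
    if pos < 4:
        return pos, 0, 0
    msb = pos.bit_length() - 1                    # floor(log2(pos))
    prefix = 2 * msb + ((pos >> (msb - 1)) & 1)
    suffix_len = (prefix >> 1) - 1
    suffix = pos - ((2 + (prefix & 1)) << suffix_len)
    return prefix, suffix, suffix_len


def truncated_unary(value, c_max):
    """TU code: `value` ones, then a terminating zero unless value == c_max."""
    return "1" * value + ("" if value == c_max else "0")


def binarize_last_pos(pos, region_size):
    """Bin string for one coordinate, given the width or height of the region used."""
    log2_size = region_size.bit_length() - 1      # region_size assumed a power of two
    c_max = (log2_size << 1) - 1                  # assumed maximum prefix value
    prefix, suffix, suffix_len = last_pos_prefix_suffix(pos)
    bins = truncated_unary(prefix, c_max)
    if suffix_len:
        bins += format(suffix, "0{}b".format(suffix_len))
    return bins


print(binarize_last_pos(3, 4))   # 111     (region width 4)
print(binarize_last_pos(7, 8))   # 111111  (region height 8)
print(binarize_last_pos(3, 32))  # 1110    (block width 32)
```

Under these assumptions, coding the x-coordinate 3 against a width of 4 takes three bins, whereas coding it against a width of 32 takes four, because the terminating zero can only be dropped when the prefix reaches its maximum value.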
- the values of the width K of the predetermined region and the height L of the predetermined region may be positive integers greater than or equal to 1 predefined in the encoder and the decoder.
- K and L, which are arbitrary positive integers, may be determined according to one or more pieces of information from among the size of the block, the type of transformation applied to the current block, the size of the transformation kernel of the current block, the aspect ratio of the block, the quantization parameter (QP), the prediction mode of the block, and the like.
- the values of K and L may be the same or different from each other.
- the value of the width K of a predetermined region and the value of the height L of the predetermined region can be determined based on a pre-encoded syntax element.
- the pre-encoded syntax element can be at least one of a separable transformation or non-separable transformation related syntax element including a syntax element for information of a current block or a neighboring block, a syntax element indicating a type of applied transformation, a transformation index, and an index indicating transformation kernel information/transformation kernel list.
- the maximum length of a codeword for the x-coordinate prefix can be determined according to Equation 3.
- the maximum length of a codeword for the y-coordinate prefix can be determined according to Equation 4.
- the proposed method reduces the number of generated bins and improves coding efficiency by binarizing transform coefficient information using an arbitrary region where non-zero transform coefficients can be located with a high probability.
- the proposed method can improve the throughput of residual coding.
- K and L can be determined according to the width or/and height of the valid region after zeroing, a predetermined arbitrary positive integer, or an already encoded syntax element.
- the transform coefficient information of the current block can be encoded based on the valid area derived from the transform result of the current block.
- the transform coefficient information of the current block can be encoded in CG units divided from the current block.
- One embodiment of the CG divided from the current block can be as described below.
- Figure 6 is a diagram illustrating one embodiment of a CG divided from a transformation block.
- the size of the transform block can be 8 x 8. And, the transform block can be divided into four CGs. The size of each CG can be 4 x 4.
- a flag indicating the presence of non-zero transform coefficients in each CG can be encoded.
- the flag indicating the presence of non-zero transform coefficients can be sb_coded_flag.
- FIG. 7 is a diagram for explaining a method of encoding a flag indicating the presence or absence of a non-zero transform coefficient in a CG according to one embodiment of the present invention.
- when sb_coded_flag, which is information on the transform coefficients of the current block, is encoded using the width and/or height of the existing transform block as in the prior art, sb_coded_flag having values of '0', '0', '1', '1' in the opposite diagonal direction from the lower right of the transform block is encoded.
- in this case, information of the right CGs where non-zero transform coefficients are not located is also encoded, so unnecessary bits are used.
- when sb_coded_flag, which is information on the transform coefficients of the current block, is encoded using the width and/or height of a square area (e.g., 4 x 4) as in the prior art, sb_coded_flag having a value of '1' is encoded for the CG at the upper left.
- in this case, sb_coded_flag may not be coded for some CGs where a non-zero transform coefficient may exist with a high probability. Therefore, the residual information of those CGs is lost, which may significantly reduce the coding efficiency.
- information about the transform coefficients of the current block can be encoded using the width and/or the height of the K x L region, as illustrated in (c) of FIG. 7.
- the values of K and L can be determined using the width ZoTbW and/or the height ZoTbH of the effective region after zeroing of the separable transform or the non-separable transform.
- the values of ZoTbW and ZoTbH can be the same as or different from each other.
- the values of the width K of the predetermined region and the height L of the predetermined region may be positive integers greater than or equal to 1 predefined in the encoder and the decoder.
- K and L, which are arbitrary positive integers, may be determined according to one or more pieces of information from among the size of the block, the type of transformation applied to the current block, the size of the transformation kernel of the current block, the aspect ratio of the block, the quantization parameter (QP), the prediction mode of the block, and the like.
- the values of K and L may be the same or different from each other.
- the value of the width K of a predetermined region and the value of the height L of the predetermined region can be determined based on pre-encoded syntax elements.
- the pre-encoded syntax elements can be at least one of syntax elements related to a separable transformation or a non-separable transformation, including syntax elements for a current block and surrounding blocks, syntax elements indicating a type of applied transformation, a transformation index, and an index indicating transformation kernel information/transformation kernel list.
- the proposed method can reduce the number of generated bins and improve coding efficiency by encoding transform coefficient information using an arbitrary region where non-zero transform coefficients can be located.
- since the reduced bins are context-coded bins, the throughput of residual coding can be improved.
- for the current block, it is possible to encode not only information indicating whether there is at least one non-zero transform coefficient in the CG, but also other information about the transform coefficients, by using a given region instead of the current block. That is, instead of the width or/and height of the block, or the width or/and height of a predetermined square region, it is possible to encode information related to the transform coefficients using arbitrary positive integers K or/and L.
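The saving in sb_coded_flag bins can be illustrated with a small sketch. It assumes an 8 x 8 transform block with 4 x 4 CGs and a 4 x 8 valid region on the left, as in the examples above; the function names and the toy block contents are hypothetical, and inference rules that real codecs apply to individual CGs are ignored.

```python
def coefficient_groups(region_w, region_h, cg_size=4):
    """Top-left (x, y) positions of the CGs covering a region_w x region_h area."""
    return [(x, y)
            for y in range(0, region_h, cg_size)
            for x in range(0, region_w, cg_size)]


def sb_coded_flags(block, region_w, region_h, cg_size=4):
    """One flag per CG in the region: 1 if the CG holds a non-zero coefficient."""
    flags = {}
    for x0, y0 in coefficient_groups(region_w, region_h, cg_size):
        flags[(x0, y0)] = int(any(
            block[y][x] != 0
            for y in range(y0, y0 + cg_size)
            for x in range(x0, x0 + cg_size)))
    return flags


# Toy 8 x 8 block: non-zero coefficients only in the left 4 columns (valid region).
block = [[1 if x < 4 else 0 for x in range(8)] for y in range(8)]

whole_block = sb_coded_flags(block, 8, 8)  # flags over the full transform block
k_by_l = sb_coded_flags(block, 4, 8)       # flags over the K x L = 4 x 8 region only
print(len(whole_block), len(k_by_l))       # 4 2
```

With the whole block, two of the four flags are 0 and carry no useful information; restricting the flags to the K x L region leaves only the two CGs that can actually contain non-zero coefficients.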
- K and L can be determined according to the width or/and height of the effective region after zeroing, an arbitrary positive integer that is predetermined, or a syntax element that has already been encoded.
- a context model of the transform coefficient information can be determined based on a valid area derived from the transform result of the current block.
- a context model of transform coefficient information can be determined using information of a predetermined region of a K x L size determined based on various information.
- the values of K and L may be the same or different from each other.
- the value of the width K of a given region and the value of the height L of the given region can be determined by the width of the current transformation block and the height of the current transformation block.
- the values of the width K of a given region and the height L of the given region can be determined as the width of a valid region and the height of a valid region derived as a result of performing zeroing on the current transform block.
- the value of the width K of the predetermined region and the value of the height L of the predetermined region can be any positive number that is a power of 2.
- the arbitrary positive number can be determined as a positive integer that is a power of 2 predefined in the encoder and the decoder.
- the arbitrary positive number can be determined according to one or more pieces of information from among the size of the block, the type of transformation applied to the current block, the size of the transformation kernel of the current block, the aspect ratio of the block, the quantization parameter (QP), the prediction mode of the block, and the like.
- the values of K and L can be the same or different from each other.
- the value of the width K of a predetermined region and the value of the height L of the predetermined region can be determined based on already encoded syntax elements.
- the already encoded syntax elements can be at least one of separable transformation or non-separable transformation related syntax elements including syntax elements for information of a current block or a neighboring block, syntax elements indicating a type of applied transformation, a transformation index, an index indicating transformation kernel information and/or a transformation kernel list, etc.
- a context model for information of transform coefficients within a current block can be determined based on a valid area.
- a context model for information of transform coefficients can be determined using neighboring transform coefficients within the valid region of the current block. Specifically, in the process of encoding transform coefficient information, neighboring transform coefficients included in the valid region of the current block can be scanned. Then, the context model for information of the current transform coefficient can be determined based on information of a plurality of neighboring transform coefficients scanned before the current transform coefficient. That is, since a plurality of neighboring transform coefficients are selected by considering the scanning order, only transform coefficients valid for determining the context model can be selected.
- a method for determining a context model for transform coefficient information based on the valid area may be as described below.
- FIGS. 8 and 9 are diagrams illustrating an embodiment of a method for determining a context model based on a valid area according to one embodiment of the present invention.
- the current block may have a size of 8 x 8 and may include 64 transform coefficients.
- the valid region may be a 4x8 sized region on the left.
- the valid region may include non-zero transform coefficients.
- the values of transform coefficients located outside the valid region are 0.
- the 29th, 30th, 36th, 37th and 44th neighboring transform coefficients may be selected.
- the 29th, 30th and 37th neighboring transform coefficients are transform coefficients outside the valid region, and thus the values of the transform coefficients may be 0.
- neighboring transform coefficients 33, 34, 35, 41 and 42, which are neighboring transform coefficients scanned before transform coefficient 28, may be selected.
- the selected neighboring transform coefficients may all be transform coefficients within the valid region.
- the neighboring transform coefficients 37, 38, 44, 45, and 52 may be selected.
- the transform coefficients 37, 38, and 45 are transform coefficients outside the valid region, and thus the value of the transform coefficient may be 0.
- transform coefficients 44, 51, 52, 58 and 59, which are neighboring transform coefficients scanned before transform coefficient 36, may be selected.
- the selected transform coefficients may all be transform coefficients within the valid region.
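The problem with a fixed neighbor template can be checked numerically. The sketch below assumes that the coefficients of the 8 x 8 block are numbered from 1 in row-major order and that the fixed template covers the two positions to the right, the two positions below, and the lower-right diagonal; both assumptions are hypothetical and were chosen only because they reproduce the coefficient numbers quoted above. The scan-order-based selection of the proposed method depends on the scan shown in FIGS. 8 and 9 and is not reproduced here.

```python
# Fixed template as (dx, dy) offsets from the current coefficient (assumed layout).
TEMPLATE = [(1, 0), (2, 0), (0, 1), (1, 1), (0, 2)]


def template_neighbors(number, tb_w=8, tb_h=8):
    """Neighbor numbers (1-based, row-major) selected by the fixed template."""
    idx = number - 1
    x, y = idx % tb_w, idx // tb_w
    result = []
    for dx, dy in TEMPLATE:
        nx, ny = x + dx, y + dy
        if nx < tb_w and ny < tb_h:
            result.append(ny * tb_w + nx + 1)
    return result


def outside_valid_region(numbers, zo_w=4, zo_h=8, tb_w=8):
    """Subset of the given coefficient numbers lying outside the zo_w x zo_h region."""
    return [n for n in numbers
            if (n - 1) % tb_w >= zo_w or (n - 1) // tb_w >= zo_h]


for current in (28, 36):
    neighbors = template_neighbors(current)
    print(current, neighbors, outside_valid_region(neighbors))
# 28 [29, 30, 36, 37, 44] [29, 30, 37]
# 36 [37, 38, 44, 45, 52] [37, 38, 45]
```

In both cases three of the five template positions fall in the zeroing region and always contribute 0, which is why selecting neighbors inside the valid region gives the context model more useful information.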
- context information for transform coefficient information can be determined based on the size of the valid area as follows.
- log2K and log2L can represent values obtained by applying binary logarithm to K and L, which are width and height values of a given region.
- K and L can be determined by the methods described above.
- the context information for last_sig_coeff_x_prefix and last_sig_coeff_y_prefix, which are syntax elements indicating the location information of the last non-zero transform coefficient within a transform block, can be determined based on log2K and log2L, respectively.
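One way the context index of each prefix bin could depend on log2K or log2L is sketched below. The offset and shift formulas follow the HEVC-style luma derivation for the last-position prefix and are an assumption of this example; the text above does not state the exact formula, and the function name `last_prefix_ctx_inc` is hypothetical.

```python
def last_prefix_ctx_inc(bin_idx, log2_size):
    """Context index increment for one prefix bin (HEVC-style luma rule, assumed).

    log2_size would be log2K for last_sig_coeff_x_prefix
    and log2L for last_sig_coeff_y_prefix.
    """
    ctx_offset = 3 * (log2_size - 2) + ((log2_size - 1) >> 2)
    ctx_shift = (log2_size + 1) >> 2
    return (bin_idx >> ctx_shift) + ctx_offset


# Contexts of the first three x-prefix bins for K = 4 versus a block width of 32.
print([last_prefix_ctx_inc(b, 2) for b in range(3)])  # [0, 1, 2]
print([last_prefix_ctx_inc(b, 5) for b in range(3)])  # [10, 10, 11]
```

Under this assumption, using log2K = 2 instead of the block's log2 width of 5 selects a different set of context models, namely those associated with a 4-wide region.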
- a given region can be used to select a context model of not only information indicating the last non-zero transform coefficient position in the transform block, but also other information about the transform coefficients. That is, instead of the width or/and height of a predetermined square region, a context model of arbitrary information can be selected using K or/and L.
- K and L can be determined according to the width or/and height of the transform block, the width or/and height of the effective region after zeroing, any predetermined positive integer that is a power of 2, or already encoded syntax elements.
- a context model for transform coefficient information of a current block can be selected by considering an area reflecting the characteristics of the current block. Accordingly, coding efficiency can be improved.
- Fig. 10 is a flowchart illustrating an image decoding method according to an embodiment of the present invention.
- the image decoding method of Fig. 10 can be performed by an image decoding device.
- the transform type of the current block can be determined (S1010).
- Context information about the transformation coefficient information of the current block can be determined based on the transformation type (S1020).
- Transform coefficient information can be entropy decoded based on context information (S1030).
- context information can be determined based on whether the transformation type of the current block is a non-separable transformation.
- the context information is determined based on the transform coefficient information of neighboring transform coefficients adjacent to the current transform coefficient. The neighboring transform coefficients are located within the valid area of the current block, and the valid area of the current block can be determined based on whether the transform type of the current block is a non-separable transform.
- The current block may be composed of a zero region and a valid region.
- The valid region, in which non-zero transform coefficients are located, is as described in FIG. 4 and the related description.
- The neighboring transform coefficients can be determined based on the scan direction used for the transform coefficients of the current block.
- The method for determining context information for residual coding information is as described in FIGS. 8 to 9 and the related description.
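One way to read the neighbor rule above is sketched here: a context value is derived by counting non-zero neighbors, and any neighbor outside the valid region is skipped. The three-position template, the 4x4 region for non-separable transforms, and the 32-sample cap are all hypothetical choices made for illustration; the actual template and region sizes are defined in the figures referenced above, which are not reproduced here.

```python
def valid_region(width: int, height: int, non_separable: bool) -> tuple[int, int]:
    """Return (valid_w, valid_h) of the region that may hold non-zero coefficients.

    Hypothetical rule: a non-separable transform keeps only the top-left
    4x4 region; otherwise each dimension is capped at 32 samples.
    """
    limit = 4 if non_separable else 32
    return min(width, limit), min(height, limit)


def neighbor_ctx(coeffs, x: int, y: int, valid_w: int, valid_h: int) -> int:
    """Count non-zero neighbors of (x, y) that lie inside the valid region.

    coeffs is a list of rows, indexed as coeffs[y][x]. The template
    (right, below, diagonal) is an assumed choice for a diagonal scan.
    """
    template = [(1, 0), (0, 1), (1, 1)]
    count = 0
    for dx, dy in template:
        nx, ny = x + dx, y + dy
        # Neighbors in the zero region are never read.
        if nx < valid_w and ny < valid_h and coeffs[ny][nx] != 0:
            count += 1
    return count


block = [[0] * 8 for _ in range(8)]
block[3][4] = 5    # right neighbor of (3, 3)
block[4][3] = -2   # neighbor below (3, 3)

w, h = valid_region(8, 8, non_separable=True)
print(neighbor_ctx(block, 3, 3, w, h))
```

With the 4x4 valid region, both neighbors of position (3, 3) fall in the zero region and do not contribute to the count.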
- The transform coefficient information may be a syntax element binarized based on the valid region.
- The transform coefficient information may include information indicating a prefix of the position value of the last non-zero coefficient in the current block and information indicating a suffix of that position value.
- The length of the information indicating the prefix of the position value of the last non-zero coefficient in the current block can have a maximum value determined based on the valid region.
- The syntax elements binarized based on the valid region are as described in FIG. 5 and the related description.
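The prefix/suffix split of the last-coefficient position can be sketched as follows, using the HEVC/VVC-style binarization in which the prefix is a truncated-unary value and the suffix is a fixed-length value. Applying the maximum prefix value to the valid-region size instead of the block size is the assumption made here; the function names are hypothetical.

```python
def max_prefix_len(region_size: int) -> int:
    """Largest prefix value for a valid region of the given power-of-2 size."""
    log2_size = region_size.bit_length() - 1
    return (log2_size << 1) - 1


def split_last_pos(pos: int) -> tuple[int, int, int]:
    """Split a position into (prefix, suffix, suffix_bits)."""
    if pos < 4:
        return pos, 0, 0  # small positions need no suffix
    prefix = 4
    while True:
        bits = (prefix >> 1) - 1
        base = (1 << bits) * (2 + (prefix & 1))
        if pos < base + (1 << bits):
            return prefix, pos - base, bits
        prefix += 1


def join_last_pos(prefix: int, suffix: int) -> int:
    """Inverse of split_last_pos: rebuild the position value."""
    if prefix < 4:
        return prefix
    bits = (prefix >> 1) - 1
    return (1 << bits) * (2 + (prefix & 1)) + suffix


# The last possible position in a 16-wide valid region is 15.
print(split_last_pos(15), max_prefix_len(16))
```

Because the maximum prefix value grows with the logarithm of the region size, a smaller valid region shortens the longest possible prefix.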
- The transform coefficient information can indicate information about each coefficient group into which the current block is divided.
- The transform coefficient information indicates whether a coefficient group includes non-zero transform coefficients, and can be encoded only for coefficient groups included within the valid region.
- The syntax elements indicating information about each of the coefficient groups split from the current block are as described in FIGS. 6 to 7 and the related description.
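The restriction of coefficient-group flags to the valid region can be illustrated with a short sketch. It assumes 4x4 coefficient groups and a valid region anchored at the top-left corner of the block; both are assumptions for illustration, and the groups are listed in raster order rather than in any actual scan order.

```python
def coded_groups(width: int, height: int, valid_w: int, valid_h: int,
                 cg_size: int = 4) -> list[tuple[int, int]]:
    """Return (gx, gy) indices of coefficient groups that would carry a flag.

    A group is kept only if it lies entirely inside the valid region;
    groups in the zero region are inferred to hold no non-zero coefficients.
    """
    groups = []
    for gy in range(height // cg_size):
        for gx in range(width // cg_size):
            inside = ((gx + 1) * cg_size <= valid_w
                      and (gy + 1) * cg_size <= valid_h)
            if inside:
                groups.append((gx, gy))
    return groups


# A 16x16 block with an 8x8 valid region has 16 groups in total,
# of which only the four top-left groups are inside the valid region.
print(coded_groups(16, 16, 8, 8))
```

In this example, twelve of the sixteen group flags would not need to be signaled.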
- A bitstream can be generated by an image encoding method including the steps described in FIG. 10.
- The bitstream can be stored in a non-transitory computer-readable recording medium, and can also be transmitted (or streamed).
- FIG. 11 is a diagram showing an example of a content streaming system to which an embodiment of the present invention can be applied.
- A content streaming system to which an embodiment of the present invention is applied may broadly include an encoding server, a streaming server, a web server, a media storage, a user device, and a multimedia input device.
- The encoding server compresses content input from multimedia input devices such as smartphones, cameras, and CCTVs into digital data to generate a bitstream, and transmits the bitstream to the streaming server.
- When multimedia input devices such as smartphones, cameras, and CCTVs directly generate a bitstream, the encoding server may be omitted.
- The bitstream can be generated by an image encoding method and/or an image encoding device to which an embodiment of the present invention is applied, and the streaming server can temporarily store the bitstream while transmitting or receiving it.
- The streaming server transmits multimedia data to a user device based on a user request made via a web server, and the web server can act as an intermediary that informs the user of the available services.
- When the user requests a service from the web server, the web server forwards the request to the streaming server, and the streaming server can transmit multimedia data to the user.
- The content streaming system may include a separate control server, in which case the control server controls commands and responses between the devices within the content streaming system.
- The streaming server can receive content from a media storage and/or an encoding server. For example, when receiving content from the encoding server, the content can be received in real time. In this case, in order to provide a smooth streaming service, the streaming server can store the bitstream for a certain period of time.
- Examples of the user device may include mobile phones, smartphones, laptop computers, digital broadcasting terminals, personal digital assistants (PDAs), portable multimedia players (PMPs), navigation devices, slate PCs, tablet PCs, ultrabooks, wearable devices (e.g., smartwatches, smart glasses, HMDs), digital TVs, desktop computers, and digital signage.
- Each server within the content streaming system can be operated as a distributed server, in which case data received by each server can be processed in a distributed manner.
- An image can be encoded/decoded using at least one of the above embodiments or a combination thereof.
- The order in which the above embodiments are applied may differ between the encoding device and the decoding device. Alternatively, the order may be the same in the encoding device and the decoding device.
- The above embodiments can be performed for each of the luminance and chrominance signals, or can be performed identically for the luminance and chrominance signals.
- In the above embodiments, the methods are described based on flowcharts as a series of steps or units, but the present invention is not limited to the order of the steps, and some steps may occur in a different order from, or simultaneously with, other steps described above.
- The steps shown in the flowcharts are not exclusive; other steps may be included, or one or more steps in a flowchart may be deleted, without affecting the scope of the present invention.
- The above embodiments may be implemented in the form of program instructions that can be executed by various computer components and recorded on a computer-readable recording medium.
- The computer-readable recording medium may include program instructions, data files, data structures, and the like, alone or in combination.
- The program instructions recorded on the computer-readable recording medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the art of computer software.
- A bitstream generated by an encoding method according to the above embodiments can be stored in a non-transitory computer-readable recording medium.
- The bitstream stored in the non-transitory computer-readable recording medium can be decoded by a decoding method according to the above embodiments.
- Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specifically configured to store and execute program instructions, such as ROMs, RAMs, and flash memories.
- Examples of program instructions include not only machine language code generated by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.
- The hardware devices may be configured to operate as one or more software modules to perform processing according to the present invention, and vice versa.
- The present invention can be used in a device for encoding/decoding an image and in a recording medium storing a bitstream.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Discrete Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202480009014.1A CN120660348A (zh) | 2023-03-29 | 2024-03-28 | Image encoding/decoding method and device, and recording medium storing a bitstream |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR20230041228 | 2023-03-29 | | |
| KR10-2023-0041228 | 2023-03-29 | | |
| KR1020240042614A KR20240146602A (ko) | 2023-03-29 | 2024-03-28 | Image encoding/decoding method, device, and recording medium storing a bitstream |
| KR10-2024-0042614 | 2024-03-28 | | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| WO2024205274A2 (fr) | 2024-10-03 |
| WO2024205274A3 (fr) | 2025-06-19 |
Family
ID=92906438
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/KR2024/003974 (Ceased), WO2024205274A2 (fr) | 2023-03-29 | 2024-03-28 | Image encoding/decoding method and device, and recording medium storing bitstreams |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2024205274A2 (fr) |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9253481B2 (en) * | 2012-01-13 | 2016-02-02 | Qualcomm Incorporated | Determining contexts for coding transform coefficient data in video coding |
| US10440399B2 (en) * | 2015-11-13 | 2019-10-08 | Qualcomm Incorporated | Coding sign information of video data |
| WO2018174402A1 (fr) * | 2017-03-21 | 2018-09-27 | 엘지전자 주식회사 | Transform method in an image coding system and apparatus therefor |
| KR102509347B1 (ko) * | 2017-04-13 | 2023-03-14 | 엘지전자 주식회사 | Method and apparatus for encoding and decoding a video signal |
| US11589075B2 (en) * | 2018-10-01 | 2023-02-21 | Lg Electronics Inc. | Encoding/decoding method for video signal and device therefor |
- 2024-03-28: WO application PCT/KR2024/003974 (WO2024205274A2, fr); status: not active, Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| WO2024205274A3 (fr) | 2025-06-19 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24781264; Country of ref document: EP; Kind code of ref document: A2 |
| | WWE | Wipo information: entry into national phase | Ref document number: 202517064911; Country of ref document: IN |
| | WWE | Wipo information: entry into national phase | Ref document number: 202480009014.1; Country of ref document: CN |
| | WWP | Wipo information: published in national office | Ref document number: 202480009014.1; Country of ref document: CN |
| | WWP | Wipo information: published in national office | Ref document number: 202517064911; Country of ref document: IN |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 24781264; Country of ref document: EP; Kind code of ref document: A2 |