WO2024253427A1 - Image encoding/decoding method and device, and recording medium on which a bitstream is stored - Google Patents
- Publication number
- WO2024253427A1 (PCT/KR2024/007711)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- screen
- prediction mode
- prediction
- current block
- intra
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H—ELECTRICITY; H04—ELECTRIC COMMUNICATION TECHNIQUE; H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION; H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
- H04N19/11—Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
- H04N19/176—Adaptive coding characterised by the coding unit, the unit being an image region, e.g. a block or a macroblock
- H04N19/593—Predictive coding involving spatial prediction techniques
Definitions
- the present invention relates to a video encoding/decoding method, a device, and a recording medium storing a bitstream. Specifically, the present invention relates to a video encoding/decoding method and device using an intra prediction method, and a recording medium storing a bitstream.
- intra prediction is a technique for predicting the current block using previously reconstructed reference pixels of the current picture.
- a prediction block is generated from neighboring reference pixels surrounding the current block based on a predetermined non-directional mode or directional mode.
- Intra prediction may have lower prediction accuracy than inter prediction, which may limit encoding efficiency. Therefore, various methods for improving the prediction accuracy of intra prediction are being discussed.
- the purpose of the present invention is to provide a video encoding/decoding method and device with improved encoding/decoding efficiency.
- the present invention aims to provide a recording medium storing a bitstream generated by the image decoding method or device provided in the present invention.
- a video decoding method may include a step of determining an initial intra prediction mode of a current block, a step of determining one or more adjacent candidate modes adjacent to the initial intra prediction mode based on the initial intra prediction mode of the current block, a step of determining a final intra prediction mode of the current block among candidate intra prediction modes including the one or more adjacent candidate modes, and a step of predicting the current block based on the final intra prediction mode of the current block.
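The claimed decoding flow can be sketched as follows. This is a hypothetical illustration, not the patent's implementation: `refine_intra_mode`, the candidate offsets, and the toy cost function are all assumptions standing in for the adjacent-candidate derivation and template-based cost described later.

```python
# Hypothetical sketch of the claimed decoding flow: start from an initial intra
# prediction mode, form adjacent candidate modes around it, and select the
# final mode as the candidate with the smallest cost.

def refine_intra_mode(initial_mode, offsets, cost_fn):
    """Build candidate modes around initial_mode and return the lowest-cost one."""
    candidates = [initial_mode] + [initial_mode + d for d in offsets]
    return min(candidates, key=cost_fn)

# Toy cost function: pretend mode 20 matches the template best.
best = refine_intra_mode(18, offsets=[-2, -1, 1, 2], cost_fn=lambda m: abs(m - 20))
```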
- the method may further include a step of determining whether intra prediction refinement is performed on the current block. If intra prediction refinement is performed on the current block, the steps of determining one or more adjacent candidate modes based on the initial intra prediction mode of the current block, determining a final intra prediction mode among the candidate intra prediction modes including the one or more adjacent candidate modes, and predicting the current block based on the final intra prediction mode are performed; if intra prediction refinement is not performed on the current block, the current block is predicted based on the initial intra prediction mode of the current block.
- if the initial intra prediction mode is a non-directional intra prediction mode, it may be determined that intra prediction refinement is not performed on the current block.
- if the index of the initial intra prediction mode is not included in a predetermined range, it may be determined that intra prediction refinement is not performed on the current block.
- whether intra prediction refinement is performed on the current block may be determined based on at least one of a width, a height, and an aspect ratio of the current block.
- the initial intra prediction mode of the current block may be determined based on the intra prediction mode of a reference block.
- the one or more adjacent candidate modes may include a first adjacent candidate mode having a prediction direction between the prediction direction of a first adjacent intra prediction mode of the current block and the prediction direction of the initial intra prediction mode, and a second adjacent candidate mode having a prediction direction between the prediction direction of a second adjacent intra prediction mode of the current block and the prediction direction of the initial intra prediction mode, wherein the first adjacent intra prediction mode has an index smaller than the index of the initial intra prediction mode by a predetermined value, and the second adjacent intra prediction mode has an index larger than the index of the initial intra prediction mode by the predetermined value.
- the first adjacent candidate mode may be determined by evenly dividing the direction difference between the prediction direction of the first adjacent intra prediction mode of the current block and the prediction direction of the initial intra prediction mode according to the number of first adjacent candidate modes.
- the second adjacent candidate mode may be determined by evenly dividing the direction difference between the prediction direction of the second adjacent intra prediction mode of the current block and the prediction direction of the initial intra prediction mode according to the number of second adjacent candidate modes.
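The even-division rule above can be illustrated with a short sketch, assuming modes are represented by their prediction angles; the angle values and the helper name are illustrative, not the patent's notation.

```python
# Evenly divide the angular difference between the initial mode's direction and
# an adjacent mode's direction to obtain the adjacent candidate directions.

def adjacent_candidate_angles(initial_angle, neighbor_angle, num_candidates):
    """Return num_candidates directions evenly spaced between the two angles."""
    step = (neighbor_angle - initial_angle) / (num_candidates + 1)
    return [initial_angle + step * (i + 1) for i in range(num_candidates)]

# Initial direction 45 degrees, first adjacent mode at 40 degrees,
# two first adjacent candidates evenly spaced in between:
first = adjacent_candidate_angles(45.0, 40.0, 2)
```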
- if the initial intra prediction mode is the intra prediction mode with the smallest index among directional modes within a given range, the one or more adjacent candidate modes may include only the second adjacent candidate mode, and if the initial intra prediction mode is the intra prediction mode with the largest index among directional modes within the given range, the one or more adjacent candidate modes may include only the first adjacent candidate mode.
- a mode with the smallest distortion among the candidate intra prediction modes of the current block may be determined as the final intra prediction mode of the current block.
- the distortion of a candidate intra prediction mode may be determined based on a difference between a reconstructed sample included in a template adjacent to the current block and a prediction sample corresponding to the reconstructed sample.
- the prediction sample may be determined based on a template reference sample adjacent to the template and the prediction direction of the candidate intra prediction mode.
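The template-based distortion described above resembles template-based intra mode derivation (TIMD): each candidate mode predicts the template's reconstructed samples from template reference samples, and a sum of absolute differences ranks the candidates. The 1-D layout and the shift-based stand-in for a prediction direction below are toy assumptions.

```python
# Measure a candidate mode's distortion as the SAD between reconstructed
# template samples and their predictions from the template reference samples.

def template_sad(template, reference, shift):
    """Predict each template sample from the reference shifted by `shift`
    (a stand-in for a mode's prediction direction) and sum |differences|."""
    pred = [reference[i + shift] for i in range(len(template))]
    return sum(abs(t - p) for t, p in zip(template, pred))

def best_mode(template, reference, shifts):
    # The candidate (shift) with the smallest template distortion wins.
    return min(shifts, key=lambda s: template_sad(template, reference, s))

template = [10, 12, 14, 16]          # reconstructed samples adjacent to the block
reference = [9, 10, 12, 14, 16, 18]  # template reference samples
chosen = best_mode(template, reference, shifts=[0, 1, 2])
```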
- the image decoding method may further include a step of obtaining, from a bitstream, intra prediction mode refinement index information indicating the final intra prediction mode of the current block among the candidate intra prediction modes of the current block, wherein the final intra prediction mode of the current block is determined based on the intra prediction mode refinement index information.
- the intra prediction mode refinement index information may include intra prediction mode refinement sign information indicating in which direction, + or -, the final intra prediction mode lies from the initial intra prediction mode, and intra prediction mode refinement index difference information indicating the index difference between the final intra prediction mode and the initial intra prediction mode.
- the length of the codeword assigned to a candidate intra prediction mode may be determined based on the index difference between the initial intra prediction mode and the candidate intra prediction mode.
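One way to realize "shorter codewords for candidates closer to the initial mode" is a unary binarization of the index difference plus a sign bit; the exact binarization used by the invention may differ, so this is only an illustrative sketch.

```python
# Assign codewords whose length grows with the index difference from the
# initial intra prediction mode (unary magnitude + terminator + sign bit).

def codeword(initial_idx, candidate_idx):
    diff = candidate_idx - initial_idx
    if diff == 0:
        return "0"                       # the initial mode gets the shortest code
    sign = "0" if diff > 0 else "1"      # + / - direction from the initial mode
    return "1" * abs(diff) + "0" + sign  # unary magnitude, terminator, sign

# A candidate one index away costs fewer bits than one three indices away.
short, long = codeword(34, 35), codeword(34, 37)
```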
- a video encoding method may include the steps of determining an initial intra prediction mode of a current block, determining one or more adjacent candidate modes adjacent to the initial intra prediction mode based on the initial intra prediction mode of the current block, determining a final intra prediction mode of the current block among candidate intra prediction modes including the one or more adjacent candidate modes, and predicting the current block based on the final intra prediction mode of the current block.
- a non-transitory computer-readable recording medium stores a bitstream generated by the image encoding method.
- a transmission method transmits a bitstream generated by the image encoding method.
- the present invention proposes various embodiments of an intra prediction method based on refinement of the intra prediction mode on the decoding side.
- the present invention also proposes various embodiments of an intra prediction method based on refinement of the intra prediction mode on the encoding side.
- accordingly, the overall encoding efficiency can be improved.
- Figure 1 is a block diagram showing the configuration according to one embodiment of an encoding device to which the present invention is applied.
- FIG. 2 is a block diagram showing the configuration of one embodiment of a decoding device to which the present invention is applied.
- FIG. 3 is a diagram schematically showing a video coding system to which the present invention can be applied.
- FIG. 4 shows a template-based intra mode derivation (TIMD) method.
- Figures 5 and 6 illustrate one embodiment of a method for determining adjacent candidate modes of an initial intra prediction mode.
- FIGS. 7 and 8 illustrate one embodiment of a method for determining adjacent candidate modes when the initial intra prediction mode is the intra prediction mode with the smallest index among directional modes in a given range.
- Figures 9 and 10 illustrate one embodiment of a method for determining adjacent candidate modes when the initial intra prediction mode is the intra prediction mode with the largest index among directional modes in a given range.
- Fig. 11 illustrates an embodiment of an intra prediction method to which intra prediction mode refinement is applied.
- Figure 12 exemplarily illustrates a content streaming system to which an embodiment according to the present invention can be applied.
- first, second, etc. may be used to describe various components, but the components should not be limited by the terms. The terms are only used for the purpose of distinguishing one component from another.
- the first component may be referred to as the second component, and similarly, the second component may also be referred to as the first component.
- the term "and/or" includes any combination of a plurality of related described items or any one of the plurality of related described items.
- the components shown in the embodiments of the present invention are depicted independently to indicate different characteristic functions, and this does not mean that each component is formed as a separate hardware or software unit. That is, the components are listed separately for convenience of explanation; at least two of the components may be combined into a single component, or one component may be divided into multiple components that each perform part of its function. Such integrated and separate embodiments of the components are also included in the scope of the present invention as long as they do not deviate from the essence of the present invention.
- the terminology used in the present invention is only used to describe specific embodiments and is not intended to limit the present invention.
- the singular expression includes the plural expression unless the context clearly indicates otherwise.
- some components of the present invention are not essential components that perform essential functions in the present invention and may be optional components that merely enhance performance.
- the present invention may be implemented by including only essential components for implementing the essence of the present invention excluding components used only for enhancing performance, and a structure including only essential components excluding optional components used only for enhancing performance is also included in the scope of the present invention.
- the term "at least one" can mean one of a number greater than or equal to 1, such as 1, 2, 3, and 4.
- the term "a plurality of" can mean one of a number greater than or equal to 2, such as 2, 3, and 4.
- image may mean one picture constituting a video, and may also represent the video itself.
- "encoding and/or decoding of an image" may mean "encoding and/or decoding of a video," and may also mean "encoding and/or decoding of one of the images constituting the video."
- the target image may be an encoding target image that is a target of encoding and/or a decoding target image that is a target of decoding.
- the target image may be an input image input to an encoding device and may be an input image input to a decoding device.
- the target image may have the same meaning as the current image.
- the terms image and picture may be used with the same meaning and may be used interchangeably.
- target block may be an encoding target block that is a target of encoding and/or a decoding target block that is a target of decoding.
- target block may be a current block that is a target of current encoding and/or decoding.
- target block and current block may be used with the same meaning and may be used interchangeably.
- a coding tree unit may be composed of one luma component (Y) coding tree block (CTB) and two chroma component (Cb, Cr) coding tree blocks related to it.
- sample may represent a basic unit constituting a block.
- Figure 1 is a block diagram showing the configuration according to one embodiment of an encoding device to which the present invention is applied.
- the encoding device (100) may be an encoder, a video encoding device, or an image encoding device.
- the video may include one or more images.
- the encoding device (100) may sequentially encode one or more images.
- an encoding device (100) may include an image segmentation unit (110), an intra prediction unit (120), a motion prediction unit (121), a motion compensation unit (122), a switch (115), a subtractor (113), a transformation unit (130), a quantization unit (140), an entropy encoding unit (150), an inverse quantization unit (160), an inverse transformation unit (170), an adder (117), a filter unit (180), and a reference picture buffer (190).
- the encoding device (100) can generate a bitstream including encoded information through encoding an input image, and output the generated bitstream.
- the generated bitstream can be stored in a computer-readable recording medium, or can be streamed through a wired/wireless transmission medium.
- the image segmentation unit (110) can segment the input video into various forms to increase the efficiency of video encoding/decoding. That is, the input video is composed of multiple pictures, and one picture can be hierarchically segmented and processed for compression efficiency, parallel processing, etc. For example, one picture can be segmented into one or multiple tiles or slices, and then segmented again into multiple CTUs (Coding Tree Units). Alternatively, one picture can be segmented into multiple sub-pictures, defined as groups of rectangular slices, and each sub-picture can be segmented into the above tiles/slices. Here, sub-pictures can be utilized to support functions that partially and independently encode/decode and transmit a picture.
- multiple sub-pictures can be individually restored, they have the advantage of being easy to edit in applications that configure multi-channel input into one picture.
- tiles can be segmented horizontally to generate bricks.
- a brick can be utilized as a basic unit of intra-picture parallel processing.
- one CTU can be recursively split into a quad tree (QT), and the terminal node of the split can be defined as a CU (Coding Unit).
- the CU can be split into a PU (Prediction Unit) and a TU (Transform Unit), which are prediction units, and prediction and splitting can be performed. Meanwhile, the CU can be utilized as a prediction unit and/or a transform unit itself.
- each CTU can be recursively split into not only a quad tree (QT) but also a multi-type tree (MTT).
- Splitting of a CTU into a multi-type tree can start from the terminal node of a QT, and the MTT can be composed of a BT (Binary Tree) and a TT (Triple Tree).
- the MTT structure can be distinguished into vertical binary partition mode (SPLIT_BT_VER), horizontal binary partition mode (SPLIT_BT_HOR), vertical ternary partition mode (SPLIT_TT_VER), and horizontal ternary partition mode (SPLIT_TT_HOR).
- the minimum block size (MinQTSize) of the quad tree of the luma block during partitioning can be set to 16x16
- the maximum block size (MaxBtSize) of the binary tree can be set to 128x128, and the maximum block size (MaxTtSize) of the triple tree can be set to 64x64.
- the minimum block size (MinBtSize) of the binary tree and the minimum block size (MinTtSize) of the triple tree can be set to 4x4
- the maximum depth (MaxMttDepth) of the multi-type tree can be set to 4.
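The partition limits quoted above can be collected into a small validity check. This sketch encodes only the listed constraints (for square blocks, one size per dimension) and ignores the many additional conditions a real encoder applies.

```python
# Check whether a square block of a given size may be split, using the example
# limits from the text: MinQTSize 16, MaxBtSize 128, MaxTtSize 64,
# MinBtSize/MinTtSize 4, MaxMttDepth 4.

MIN_QT, MAX_BT, MAX_TT, MIN_BT, MIN_TT, MAX_MTT_DEPTH = 16, 128, 64, 4, 4, 4

def can_split(size, mode, mtt_depth=0):
    if mode == "QT":
        return size // 2 >= MIN_QT          # quad split halves each dimension
    if mtt_depth >= MAX_MTT_DEPTH:
        return False                        # MTT depth exhausted
    if mode == "BT":                        # binary split: halves one dimension
        return size <= MAX_BT and size // 2 >= MIN_BT
    if mode == "TT":                        # ternary split: smallest part is 1/4
        return size <= MAX_TT and size // 4 >= MIN_TT
    return False
```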
- a dual tree that uses different CTU partition structures of luma and chrominance components can be applied to improve the encoding efficiency of the I slice.
- conversely, the luma and chroma CTBs (Coding Tree Blocks) within a CTU can be split as a single tree sharing the same coding tree structure.
- the encoding device (100) may perform encoding on the input image in the intra mode and/or the inter mode.
- the encoding device (100) may perform encoding on the input image in a third mode (e.g., IBC mode, Palette mode, etc.) other than the intra mode and the inter mode.
- the third mode may be classified as the intra mode or the inter mode for convenience of explanation. In the present invention, the third mode will be classified and described separately only when a specific explanation is required.
- when the intra mode is used as the prediction mode, the switch (115) can be switched to intra, and when the inter mode is used as the prediction mode, the switch (115) can be switched to inter.
- the intra mode can mean the intra prediction mode, and the inter mode can mean the inter prediction mode.
- the encoding device (100) can generate a prediction block for an input block of an input image.
- the encoding device (100) can encode a residual block using a residual of the input block and the prediction block.
- the input image can be referred to as a current image which is a current encoding target.
- the input block can be referred to as a current block which is a current encoding target or an encoding target block.
- the intra prediction unit (120) can use samples of blocks already encoded/decoded around the current block as reference samples.
- the intra prediction unit (120) can perform spatial prediction on the current block using the reference sample, and can generate prediction samples for the input block through spatial prediction.
- intra prediction can mean prediction within the screen.
- non-directional prediction modes such as DC mode and Planar mode and directional prediction modes (e.g., 65 directions) can be applied.
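As a concrete example of a non-directional mode, DC mode can be sketched as filling the prediction block with the average of the neighboring reference samples; the exact averaging and rounding rules vary by codec, so this is only illustrative.

```python
# DC intra prediction sketch: fill an NxN prediction block with the average
# of the top and left reference samples.

def dc_predict(top_refs, left_refs, size):
    dc = (sum(top_refs) + sum(left_refs)) // (len(top_refs) + len(left_refs))
    return [[dc] * size for _ in range(size)]

pred = dc_predict([100, 102, 104, 106], [98, 100, 102, 104], 4)
```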
- the intra prediction method can be expressed as an intra prediction mode or an intra-picture prediction mode.
- the motion prediction unit (121) can search for an area that best matches the input block in a reference image during the motion prediction process, and can derive a motion vector using the searched area. At this time, a search area can be used as the area in which the search is performed.
- the reference image can be stored in the reference picture buffer (190).
- when encoding/decoding of the reference image has been processed, it can be stored in the reference picture buffer (190).
- the motion compensation unit (122) can generate a prediction block for the current block by performing motion compensation using a motion vector.
- inter prediction can mean inter-screen prediction or motion compensation.
- the above motion prediction unit (121) and motion compensation unit (122) can generate a prediction block by applying an interpolation filter to a portion of an area within a reference image when the value of a motion vector does not have an integer value.
- as inter prediction modes, the AFFINE mode and the SbTMVP (Subblock-based Temporal Motion Vector Prediction) mode of sub-PU based prediction, and the MMVD (Merge with MVD) mode and the GPM (Geometric Partitioning Mode) mode of PU based prediction can be applied.
- in addition, HMVP (History-based MVP), PAMVP (Position-wise Average MVP), CIIP (Combined Intra/Inter Prediction), AMVR (Adaptive Motion Vector Resolution), BDOF (Bi-Directional Optical Flow), BCW (Bi-prediction with CU Weights), LIC (Local Illumination Compensation), TM (Template Matching), and OBMC (Overlapped Block Motion Compensation) can be applied.
- the subtractor (113) can generate a residual block using the difference between the input block and the predicted block.
- the residual block may also be referred to as a residual signal.
- the residual signal may mean the difference between the original signal and the predicted signal.
- the residual signal may be a signal generated by transforming, quantizing, or transforming and quantizing the difference between the original signal and the predicted signal.
- the residual block may be a residual signal in block units.
- the transform unit (130) can perform a transform on the residual block to generate a transform coefficient and output the generated transform coefficient.
- the transform coefficient can be a coefficient value generated by performing a transform on the residual block.
- the transform unit (130) can also skip the transform on the residual block.
- a quantized level can be generated by applying quantization to a transform coefficient or a residual signal.
- a quantized level may also be referred to as a transform coefficient.
- a 4x4 luma residual block generated through intra prediction can be transformed using a basis vector based on DST (Discrete Sine Transform), and a basis vector based on DCT (Discrete Cosine Transform) can be used to transform the remaining residual blocks.
- a transform block can be divided into a quad tree shape for one block using RQT (Residual Quad Tree) technology, and after performing transformation and quantization on each transform block divided through RQT, a coded block flag (cbf) can be transmitted to increase encoding efficiency when all coefficients become 0.
- the Multiple Transform Selection (MTS) technique can be applied to perform transformation by selectively using multiple transformation bases. That is, instead of dividing the CU into TUs through the RQT, a function similar to TU division can be performed through the Sub-block Transform (SBT) technique.
- the SBT is applied only to inter-screen prediction blocks, and unlike the RQT, the current block can be divided into 1 ⁇ 2 or 1 ⁇ 4 sizes in the vertical or horizontal direction, and then the transformation can be performed on only one of the blocks. For example, if it is divided vertically, the transformation can be performed on the leftmost or rightmost block, and if it is divided horizontally, the transformation can be performed on the topmost or bottommost block.
- in addition, LFNST (Low Frequency Non-Separable Transform), a secondary transform technique that additionally transforms the residual signal converted to the frequency domain through DCT or DST, can be applied.
- LFNST additionally performs a transform on the 4x4 or 8x8 low-frequency region in the upper left, so that the residual coefficients can be concentrated in the upper left.
- the quantization unit (140) can generate a quantized level by quantizing a transform coefficient or a residual signal according to a quantization parameter (QP), and can output the generated quantized level. At this time, the quantization unit (140) can quantize the transform coefficient using a quantization matrix.
- a quantizer using QP values of 0 to 51 can be used; alternatively, QP values of 0 to 63 can be used.
- in addition, DQ (Dependent Quantization) can be applied. DQ performs quantization using two quantizers (e.g., Q0 and Q1), and even without signaling information about the use of a specific quantizer, the quantizer to be used for the next transform coefficient can be selected based on the current state through a state transition model.
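The state-transition idea can be sketched with a four-state machine in the style of VVC's dependent quantization; the transition table and parity rule below are assumptions for illustration, not necessarily the invention's exact design.

```python
# Dependent quantization sketch: the quantizer (Q0 or Q1) for each coefficient
# is chosen from the current state, and the state advances based on the parity
# of the coded level, so no per-coefficient quantizer signaling is needed.

NEXT_STATE = {  # (state, level parity) -> next state
    (0, 0): 0, (0, 1): 2,
    (1, 0): 0, (1, 1): 2,
    (2, 0): 1, (2, 1): 3,
    (3, 0): 1, (3, 1): 3,
}

def quantizer_sequence(levels):
    """Return which quantizer (Q0 or Q1) handles each coefficient level."""
    state, used = 0, []
    for lvl in levels:
        used.append("Q0" if state in (0, 1) else "Q1")
        state = NEXT_STATE[(state, lvl & 1)]
    return used

seq = quantizer_sequence([1, 0, 3, 2])
```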
- the entropy encoding unit (150) can generate a bitstream by performing entropy encoding according to a probability distribution on values produced by the quantization unit (140) or coding parameter values produced in the encoding process, and can output the bitstream.
- the entropy encoding unit (150) can perform entropy encoding on information about image samples and information for decoding the image. For example, information for decoding the image can include syntax elements, etc.
- the entropy encoding unit (150) can use an encoding method such as exponential Golomb, Context-Adaptive Variable Length Coding (CAVLC), or Context-Adaptive Binary Arithmetic Coding (CABAC) for entropy encoding.
- the entropy encoding unit (150) can perform entropy encoding using a Variable Length Coding/Code (VLC) table.
- the entropy encoding unit (150) may derive a binarization method of a target symbol and a probability model of a target symbol/bin, and then perform arithmetic encoding using the derived binarization method, probability model, and context model.
- when applying CABAC, in order to reduce the size of the probability table stored in the decoding device, the table-based probability update method can be changed to an update method using a simple formula.
- in addition, two different probability models can be used to obtain more accurate symbol probability values.
- the entropy encoding unit (150) can change a two-dimensional block form coefficient into a one-dimensional vector form through a transform coefficient scanning method to encode a transform coefficient level (quantized level).
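The 2-D-to-1-D conversion can be sketched with an up-right diagonal scan; the actual scan order depends on the codec and block configuration, so the pattern below is illustrative.

```python
# Serialize an NxN block of quantized levels into a 1-D vector by scanning
# along anti-diagonals from the top-left corner.

def diagonal_scan(block):
    n = len(block)
    out = []
    for d in range(2 * n - 1):          # anti-diagonal index
        for y in range(d + 1):
            x = d - y
            if x < n and y < n:
                out.append(block[y][x])
    return out

coeffs = diagonal_scan([[9, 5, 1],
                        [4, 2, 0],
                        [1, 0, 0]])
```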
- Coding parameters may include information (flags, indexes, etc.) encoded in an encoding device (100) and signaled to a decoding device (200), such as syntax elements, as well as information derived during an encoding process or a decoding process, and may mean information necessary when encoding or decoding an image.
- signaling a flag or index may mean that the encoder entropy encodes the flag or index and includes it in the bitstream, and that the decoder entropy decodes the flag or index from the bitstream.
- the encoded current image can be used as a reference image for other images to be processed later. Therefore, the encoding device (100) can restore or decode the encoded current image again, and store the restored or decoded image as a reference image in the reference picture buffer (190).
- the quantized level can be dequantized in the dequantization unit (160) and inverse transformed in the inverse transform unit (170).
- the dequantized and/or inverse transformed coefficients can be combined with a prediction block through an adder (117), and a reconstructed block can be generated by combining the dequantized and/or inverse transformed coefficients and the prediction block.
- the dequantized and/or inverse transformed coefficients mean coefficients on which at least one of dequantization and inverse transformation has been performed, and may mean a reconstructed residual block.
- the dequantization unit (160) and the inverse transform unit (170) can perform the reverse processes of the quantization unit (140) and the transform unit (130), respectively.
- the restoration block may pass through a filter unit (180).
- the filter unit (180) may apply a deblocking filter, a sample adaptive offset (SAO), an adaptive loop filter (ALF), a bilateral filter (BIF), LMCS (Luma Mapping with Chroma Scaling), etc. as a filtering technique, in whole or in part, to the restoration sample, restoration block, or restoration image.
- the filter unit (180) may also be called an in-loop filter. In this case, the term in-loop filter may also be used to refer to the filters excluding LMCS.
- the deblocking filter can remove block distortion that occurs at the boundary between blocks.
- different filters can be applied depending on the required deblocking filtering strength.
- a sample adaptive offset can be used to add an appropriate offset value to the sample value to compensate for the encoding error.
- the sample adaptive offset can correct the offset from the original image on a sample basis for the image on which deblocking has been performed.
- a method can be used in which the samples included in the image are divided into a certain number of regions, and then the region to be offset is determined and the offset is applied to the region, or a method can be used in which the offset is applied by considering the edge information of each sample.
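The band-offset flavor of SAO described above can be sketched as follows. This is a minimal illustration, not the codec's actual implementation: the band count, offset values, and the function name `sao_band_offset` are assumptions chosen for the example, assuming 8-bit samples.

```python
def sao_band_offset(samples, offsets, num_bands=32, max_val=255):
    """Add a per-band offset to each sample: the sample value range is
    divided into `num_bands` regions, and samples falling in a band
    with a signalled offset are corrected by that offset."""
    band_size = (max_val + 1) // num_bands
    out = []
    for s in samples:
        band = s // band_size          # which region this sample falls in
        out.append(s + offsets.get(band, 0))
    return out

# Offsets signalled for bands 3 and 4 only (sample values 24-39):
print(sao_band_offset([10, 25, 35, 200], {3: 2, 4: -1}))
# -> [10, 27, 34, 200]
```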
- a bilateral filter can also compensate for the offset from the original image on a sample-by-sample basis for the deblocked image.
- An adaptive loop filter can perform filtering based on a comparison value between a restored image and an original image. After dividing samples included in an image into a predetermined group, a filter to be applied to each group can be determined, and filtering can be performed differentially for each group. Information related to whether to apply an adaptive loop filter can be signaled for each coding unit (CU), and the shape and filter coefficients of the adaptive loop filter to be applied can vary for each block.
- CU coding unit
- LMCS Luma Mapping with Chroma Scaling
- LM luma mapping
- CS chroma scaling
- LMCS can be utilized as an HDR correction technique that reflects the characteristics of HDR (High Dynamic Range) images.
- the restored block or restored image that has passed through the filter unit (180) may be stored in the reference picture buffer (190).
- the restored block that has passed through the filter unit (180) may be a part of the reference image.
- the reference image may be a restored image composed of restored blocks that have passed through the filter unit (180).
- the stored reference image may be used for inter-screen prediction or motion compensation thereafter.
- FIG. 2 is a block diagram showing the configuration of one embodiment of a decoding device to which the present invention is applied.
- the decoding device (200) may be a decoder, a video decoding device, or an image decoding device.
- the decoding device (200) may include an entropy decoding unit (210), an inverse quantization unit (220), an inverse transformation unit (230), an intra prediction unit (240), a motion compensation unit (250), an adder (201), a switch (203), a filter unit (260), and a reference picture buffer (270).
- the decoding device (200) can receive a bitstream output from the encoding device (100).
- the decoding device (200) can receive a bitstream stored in a computer-readable recording medium, or can receive a bitstream streamed through a wired/wireless transmission medium.
- the decoding device (200) can perform decoding on the bitstream in an intra mode or an inter mode.
- the decoding device (200) can generate a restored image or a decoded image through decoding, and can output the restored image or the decoded image.
- if the prediction mode used for decoding is intra mode, the switch (203) can be switched to intra. If the prediction mode used for decoding is inter mode, the switch (203) can be switched to inter.
- the decoding device (200) can obtain a reconstructed residual block by decoding the input bitstream and can generate a prediction block. When the reconstructed residual block and the prediction block are obtained, the decoding device (200) can generate a reconstructed block to be decoded by adding the reconstructed residual block and the prediction block.
- the decoding target block can be referred to as a current block.
- the entropy decoding unit (210) can generate symbols by performing entropy decoding according to a probability distribution for the bitstream.
- the generated symbols can include symbols in the form of quantized levels.
- the entropy decoding method can be the reverse process of the entropy encoding method described above.
- the entropy decoding unit (210) can change a one-dimensional vector-shaped coefficient into a two-dimensional block-shaped coefficient through a transform coefficient scanning method to decode a transform coefficient level (quantized level).
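The rearrangement of a one-dimensional coefficient vector into a two-dimensional block can be sketched as below. This is an illustration using one simple diagonal scan order; real codecs define several scan patterns, and the function name and ordering here are assumptions for the example.

```python
def inverse_diagonal_scan(coeffs, size):
    """Place a 1-D coefficient vector into a size x size block
    following a simple diagonal scan order (illustrative only)."""
    # Positions ordered by anti-diagonal, then by row within a diagonal.
    order = sorted(
        ((r, c) for r in range(size) for c in range(size)),
        key=lambda rc: (rc[0] + rc[1], rc[0]),
    )
    block = [[0] * size for _ in range(size)]
    for value, (r, c) in zip(coeffs, order):
        block[r][c] = value
    return block

# A 2x2 example: vector [1, 2, 3, 4] fills (0,0), (0,1), (1,0), (1,1).
print(inverse_diagonal_scan([1, 2, 3, 4], 2))   # [[1, 2], [3, 4]]
```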
- the quantized level can be dequantized in the dequantization unit (220) and inverse transformed in the inverse transform unit (230).
- a restored residual block can be generated from the quantized level as a result of the dequantization and/or inverse transformation.
- the dequantization unit (220) can apply a quantization matrix to the quantized level.
- the dequantization unit (220) and the inverse transform unit (230) applied to the decoding device can apply the same technology as the dequantization unit (160) and the inverse transform unit (170) applied to the encoding device described above.
- the intra prediction unit (240) can generate a prediction block by performing spatial prediction on the current block using sample values of already decoded blocks surrounding the block to be decoded.
- the intra prediction unit (240) applied to the decoding device can apply the same technology as the intra prediction unit (120) applied to the encoding device described above.
- the motion compensation unit (250) can perform motion compensation using a motion vector and a reference image stored in the reference picture buffer (270) for the current block to generate a prediction block.
- the motion compensation unit (250) can apply an interpolation filter to a part of the reference image to generate a prediction block when the value of the motion vector does not have an integer value.
- the motion compensation unit (250) applied to the decoding device can apply the same technology as the motion compensation unit (122) applied to the encoding device described above.
- the adder (201) can add the restored residual block and the prediction block to generate a restored block.
- the filter unit (260) can apply at least one of an Inverse-LMCS, a deblocking filter, a sample adaptive offset, and an adaptive loop filter to the restored block or the restored image.
- the filter unit (260) applied to the decoding device can apply the same filtering technology as that applied to the filter unit (180) applied to the encoding device described above.
- the filter unit (260) can output a restored image.
- the restored block or restored image can be stored in the reference picture buffer (270) and used for inter prediction.
- the restored block that has passed through the filter unit (260) can be a part of the reference image.
- the reference image can be a restored image composed of restored blocks that have passed through the filter unit (260).
- the stored reference image can be used for inter-screen prediction or motion compensation thereafter.
- FIG. 3 is a diagram schematically showing a video coding system to which the present invention can be applied.
- a video coding system may include an encoding device (10) and a decoding device (20).
- the encoding device (10) may transmit encoded video and/or image information or data to the decoding device (20) in the form of a file or streaming through a digital storage medium or a network.
- An encoding device (10) may include a video source generating unit (11), an encoding unit (12), and a transmitting unit (13).
- a decoding device (20) may include a receiving unit (21), a decoding unit (22), and a rendering unit (23).
- the encoding unit (12) may be called a video/image encoding unit, and the decoding unit (22) may be called a video/image decoding unit.
- the transmitting unit (13) may be included in the encoding unit (12).
- the receiving unit (21) may be included in the decoding unit (22).
- the rendering unit (23) may include a display unit, and the display unit may be configured as a separate device or an external component.
- the video source generation unit (11) can obtain a video/image through a process of capturing, synthesizing, or generating a video/image.
- the video source generation unit (11) can include a video/image capture device and/or a video/image generation device.
- the video/image capture device can include, for example, one or more cameras, a video/image archive including previously captured video/image, etc.
- the video/image generation device can include, for example, a computer, a tablet, a smartphone, etc., and can (electronically) generate a video/image.
- a virtual video/image can be generated through a computer, etc., and in this case, the video/image capture process can be replaced with a process of generating related data.
- the encoding unit (12) can encode the input video/image.
- the encoding unit (12) can perform a series of procedures such as prediction, transformation, and quantization for compression and encoding efficiency.
- the encoding unit (12) can output encoded data (encoded video/image information) in the form of a bitstream.
- the detailed configuration of the encoding unit (12) can also be configured in the same manner as the encoding device (100) of FIG. 1 described above.
- the transmission unit (13) can transmit encoded video/image information or data output in the form of a bitstream to the reception unit (21) of the decoding device (20) through a digital storage medium or a network in the form of a file or streaming.
- the digital storage medium can include various storage media such as USB, SD, CD, DVD, Blu-ray, HDD, SSD, etc.
- the transmission unit (13) can include an element for generating a media file through a predetermined file format and can include an element for transmission through a broadcasting/communication network.
- the reception unit (21) can extract/receive the bitstream from the storage medium or the network and transmit it to the decoding unit (22).
- the decoding unit (22) can decode video/image by performing a series of procedures such as inverse quantization, inverse transformation, and prediction corresponding to the operation of the encoding unit (12).
- the detailed configuration of the decoding unit (22) can also be configured in the same manner as the decoding device (200) of FIG. 2 described above.
- the rendering unit (23) can render the decoded video/image.
- the rendered video/image can be displayed through the display unit.
- the present disclosure describes a method for improving an intra-prediction mode by correcting an intra-prediction mode determined in intra-prediction, in order to improve the prediction accuracy of intra-prediction.
- a decoder may select a candidate intra-prediction mode having the highest prediction accuracy among a plurality of candidate intra-prediction modes based on a template.
- the decoder may select one of the plurality of candidate intra-prediction modes based on information transmitted from an encoder.
- the prediction accuracy of intra-prediction can be improved.
- DIMR decoder side intra prediction mode refinement
- FIG. 4 shows a template-based intra mode derivation (TIMD) method.
- Fig. 4 shows a template (402) of a current block (400) of a size MxN and reference pixels (404) of the template used for generating a prediction value of pixels of the template (402).
- the template (402) of the current block includes at least one of a left area and an upper area of the current block (400) and is restored prior to the current block (400).
- the left area consists of a left template of size L1xN
- the upper area consists of an upper template of size MxL2.
- M, N, L1, and L2 are arbitrary positive integers.
- the suitability for each candidate mode in the MPM (most probable mode) list of the current block (400) is calculated.
- a prediction value of the template is generated from a reference pixel of the template based on the directionality of the candidate mode.
- the sum of absolute transformed differences (SATD) between the prediction value and the restored value of the pixel of the template is calculated.
- the candidate mode with the smallest sum of absolute transformed differences is selected as the intra-screen prediction mode.
- alternatively, the intra-screen prediction mode can be selected from among all intra-screen prediction modes, rather than only from the MPM list.
- alternatively, the intra-screen prediction mode can be selected from among arbitrary intra-screen prediction modes.
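The TIMD selection loop described above can be sketched as follows. This is a simplified illustration: SAD is used in place of SATD (which would additionally apply a Hadamard transform before summing), and `predict_template` stands in for directional prediction of the template pixels from their reference pixels. All names are illustrative, not taken from the specification.

```python
def select_timd_mode(candidate_modes, template_recon, predict_template):
    """Return the candidate mode whose template prediction best matches
    the already-reconstructed template samples."""
    best_mode, best_cost = None, float("inf")
    for mode in candidate_modes:
        pred = predict_template(mode)
        # Cost between predicted and reconstructed template samples
        # (SAD here; the text uses SATD).
        cost = sum(abs(p - r) for p, r in zip(pred, template_recon))
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode

# Toy usage: three candidate modes, each mapped to a fixed prediction.
recon = [10, 20, 30, 40]
preds = {18: [12, 22, 28, 41], 34: [10, 20, 30, 40], 50: [0, 0, 0, 0]}
best = select_timd_mode([18, 34, 50], recon, lambda m: preds[m])
# Mode 34 reproduces the template exactly, so it is selected.
```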
- an initial intra prediction mode is determined. Then, the initial intra prediction mode is corrected by a template-based intra prediction mode derivation method. Specifically, among candidate intra prediction modes including adjacent candidate modes of the initial intra prediction mode, an optimal candidate mode is selected by the template-based intra prediction mode derivation method. Then, the optimal candidate mode is determined as the final intra prediction mode.
- DIMR decoder side intra prediction mode refinement
- the candidate intra-screen prediction modes may include the initial intra-screen prediction mode and the adjacent candidate modes of the initial intra-screen prediction mode. This is because, without information acquired on a block-by-block basis, the decoder must determine whether the initial intra-screen prediction mode has lower distortion than all adjacent candidate modes.
- alternatively, the candidate intra-screen prediction modes may include only the adjacent candidate modes of the initial intra-screen prediction mode. This is because, if the encoder determines that the initial intra-screen prediction mode of the current block has the lowest distortion, the block-by-block syntax element encoded by the encoder indicates that decoding-side intra-screen prediction mode improvement is unnecessary.
- the adjacent candidate modes of the initial intra-screen prediction mode can have prediction directions between the prediction direction of the first adjacent intra-screen prediction mode, which has an index 1 less than that of the initial intra-screen prediction mode, and the prediction direction of the initial intra-screen prediction mode.
- the adjacent candidate modes of the initial intra-screen prediction mode can include S intra-screen prediction modes whose prediction directions are between the prediction direction of the first adjacent intra-screen prediction mode and the prediction direction of the initial intra-screen prediction mode.
- for example, S intra-screen prediction modes having a prediction direction between the prediction direction of mode 33, which is the first adjacent intra-screen prediction mode, and the prediction direction of mode 34, which is the initial intra-screen prediction mode, can be included as the adjacent candidate modes.
- the S is an arbitrary positive integer greater than or equal to 1.
- the adjacent candidate modes of the initial intra-screen prediction mode can have prediction directions between the prediction direction of the second adjacent intra-screen prediction mode, which has an index 1 greater than that of the initial intra-screen prediction mode, and the prediction direction of the initial intra-screen prediction mode.
- the adjacent candidate modes of the initial intra-screen prediction mode can include S intra-screen prediction modes whose prediction directions are between the prediction direction of the second adjacent intra-screen prediction mode and the prediction direction of the initial intra-screen prediction mode.
- for example, S intra-screen prediction modes having a prediction direction between the prediction direction of mode 34, which is the initial intra-screen prediction mode, and the prediction direction of mode 35, which is the second adjacent intra-screen prediction mode, can be included as the adjacent candidate modes.
- the S is an arbitrary positive integer greater than or equal to 1.
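The derivation of S equally spaced candidates between the initial mode and each of its first/second adjacent modes (index -1 and +1) can be sketched as below. Fractional mode indices are represented with `Fraction` purely for illustration, and the function name is an assumption.

```python
from fractions import Fraction

def adjacent_candidates(initial_mode, s):
    """Return S equally spaced candidates on each side of initial_mode,
    between it and the modes at index -1 and +1."""
    step = Fraction(1, s + 1)
    lower = [initial_mode - step * k for k in range(s, 0, -1)]
    upper = [initial_mode + step * k for k in range(1, s + 1)]
    return lower + upper

# S = 1 reproduces the example of Fig. 5: modes 63-1/2 and 63+1/2.
print(adjacent_candidates(63, 1))   # [Fraction(125, 2), Fraction(127, 2)]
# S = 2 reproduces Fig. 6: modes 63-2/3, 63-1/3, 63+1/3, 63+2/3.
print(adjacent_candidates(63, 2))
```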
- Fig. 5 illustrates one embodiment of a method for determining adjacent candidate modes of an initial intra-screen prediction mode.
- the initial intra-screen prediction mode is mode 63
- the first adjacent intra-screen prediction mode and the second adjacent intra-screen prediction mode are mode 62 and mode 64.
- the adjacent candidate modes may include mode 63-1/2, whose prediction direction is between those of mode 62 and mode 63, and mode 63+1/2, whose prediction direction is between those of mode 63 and mode 64.
- an intra-screen prediction mode having a smallest distortion value among the 63rd mode, the 63-1/2 mode, and the 63+1/2 mode may be determined as the intra-screen prediction mode of the current block.
- Fig. 6 illustrates another embodiment of a method for determining adjacent candidate modes of an initial intra-screen prediction mode.
- the initial intra-screen prediction mode is mode 63
- the first adjacent intra-screen prediction mode and the second adjacent intra-screen prediction mode are mode 62 and mode 64.
- the adjacent candidate modes may include modes 63-1/3 and 63-2/3, whose prediction directions are between those of mode 62 and mode 63, and modes 63+1/3 and 63+2/3, whose prediction directions are between those of mode 63 and mode 64.
- an intra-screen prediction mode having a smallest distortion value among the 63 mode, the 63-1/3 mode, the 63-2/3 mode, the 63+1/3 mode, and the 63+2/3 mode may be determined as the intra-screen prediction mode of the current block.
- the number of adjacent candidate modes in Fig. 5 is 2 in total, and the number of adjacent candidate modes in Fig. 6 is 4 in total.
- the number of adjacent candidate modes can be increased considering the computational burden of the decoder.
- if the adjacent candidate mode is selected based on adjacent candidate mode information encoded by the encoder, instead of being selected based on distortion in the decoder, the computational burden of the decoder for calculating the distortion of the adjacent candidate modes does not increase even if the number of adjacent candidate modes increases. Therefore, if the adjacent candidate mode is selected based on the adjacent candidate mode information, more adjacent candidate modes can be used to improve the intra-screen prediction mode.
- Fig. 7 illustrates one embodiment of a method for determining adjacent candidate modes of an initial intra-screen prediction mode when the initial intra-screen prediction mode is the intra-screen prediction mode with the smallest index among directional modes in a given range.
- the initial intra-screen prediction mode with the smallest index among directional modes in a given range is mode 2.
- the adjacent candidate modes may include a 2+1/2 mode whose prediction direction is between the prediction direction of the 2nd mode and the 3rd mode. And based on the template-based intra-screen mode derivation method, the intra-screen prediction mode having the smallest distortion value among the 2nd mode and the 2+1/2 mode may be determined as the intra-screen prediction mode of the current block.
- Fig. 8 illustrates another embodiment of a method for determining adjacent candidate modes of an initial intra-screen prediction mode when the initial intra-screen prediction mode is the intra-screen prediction mode with the smallest index among directional modes in a given range.
- the initial intra-screen prediction mode with the smallest index among directional modes in a given range is mode 2.
- the adjacent candidate modes may include mode 2+1/3 and mode 2+2/3, whose prediction directions are between the prediction directions of mode 2 and mode 3. And based on the template-based intra-screen mode derivation method, an intra-screen prediction mode having the smallest distortion value among mode 2, mode 2+1/3, and mode 2+2/3 may be determined as the intra-screen prediction mode of the current block.
- the smallest index among the directional modes is depicted as mode 2, but this is only an example, and the smallest index among the directional modes may be a different value.
- the number of adjacent candidate modes may be increased considering the computational burden of the decoder.
- if the adjacent candidate mode is selected based on adjacent candidate mode information encoded by the encoder, the computational burden of the decoder decreases, and more adjacent candidate modes can be used to improve the intra-screen prediction mode.
- Fig. 9 illustrates one embodiment of a method for determining adjacent candidate modes of an initial intra-screen prediction mode when the initial intra-screen prediction mode is the intra-screen prediction mode with the largest index among directional modes in a given range.
- the initial intra-screen prediction mode with the largest index among directional modes in a given range is mode 66.
- the adjacent candidate modes may include the 66-1/2 mode, whose prediction direction is between the prediction direction of the 65th mode and the prediction direction of the 66th mode. And based on the template-based intra-screen mode derivation method, the intra-screen prediction mode having the smallest distortion value among the 66-1/2 mode and the 66th mode may be determined as the intra-screen prediction mode of the current block.
- Fig. 10 illustrates another embodiment of a method for determining adjacent candidate modes of an initial intra-screen prediction mode when the initial intra-screen prediction mode is the intra-screen prediction mode with the largest index among directional modes in a given range.
- the initial intra-screen prediction mode with the largest index among directional modes in a given range is mode 66.
- the adjacent candidate modes may include the 66-1/3 mode and the 66-2/3 mode, the prediction directions of which are between the prediction directions of the 65 mode and the 66 mode. Then, based on the template-based intra-screen mode derivation method, the intra-screen prediction mode having the smallest distortion value among the 66-1/3 mode, the 66-2/3 mode, and the 66 mode may be determined as the intra-screen prediction mode of the current block.
- the largest index among the directional modes is depicted as mode 66, but this is only an example, and the largest index among the directional modes may be determined to a different value in consideration of the number of prediction modes within the entire screen.
- the number of adjacent candidate modes may be increased in consideration of the computational burden of the decoder.
- if the adjacent candidate mode is selected based on adjacent candidate mode information encoded by the encoder, the computational burden of the decoder decreases, and more adjacent candidate modes can be used to improve the intra-screen prediction mode.
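The boundary cases of Figs. 7 through 10 can be combined into one candidate-list sketch: when the initial mode is the smallest directional mode, only upward fractional candidates are generated, and when it is the largest, only downward ones. The bounds 2 and 66 follow the figures but are kept configurable; all names and the `Fraction` representation are illustrative.

```python
from fractions import Fraction

def bounded_candidates(initial_mode, s, min_dir=2, max_dir=66):
    """S fractional candidates per available side, clipped at the
    smallest/largest directional mode of the given range."""
    step = Fraction(1, s + 1)
    cands = []
    if initial_mode > min_dir:   # room on the - (lower-index) side
        cands += [initial_mode - step * k for k in range(s, 0, -1)]
    if initial_mode < max_dir:   # room on the + (higher-index) side
        cands += [initial_mode + step * k for k in range(1, s + 1)]
    return cands

# Fig. 7: initial mode 2, S = 1 -> only mode 2+1/2 is added.
print(bounded_candidates(2, 1))    # [Fraction(5, 2)]
# Fig. 9: initial mode 66, S = 1 -> only mode 66-1/2 is added.
print(bounded_candidates(66, 1))   # [Fraction(131, 2)]
```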
- in the above examples, one or two intra-screen prediction modes are included as adjacent candidate modes in each index direction of the initial intra-screen prediction mode.
- alternatively, S intra-screen prediction modes may be included as adjacent candidate modes in each index direction of the initial intra-screen prediction mode.
- S is any positive integer greater than or equal to 3.
- in the above examples, the adjacent candidate modes are determined by equally dividing the difference in prediction direction between the initial intra-screen prediction mode and the adjacent intra-screen prediction mode.
- alternatively, the adjacent candidate modes may be determined by unequally dividing the difference in prediction direction.
- alternatively, the adjacent candidate modes may be determined based on an arbitrary direction adjacent to the direction indicated by the initial intra-screen prediction mode.
- the above equal division may mean equal division of the angular difference between the prediction directions of the initial intra-screen prediction mode and the adjacent intra-screen prediction mode.
- alternatively, the above equal division may mean equal division of the positional difference between the reference positions indicated by the prediction directions of the initial intra-screen prediction mode and the adjacent intra-screen prediction mode.
- in the above examples, the adjacent candidate mode is determined based on the adjacent intra-screen prediction mode immediately adjacent to the initial intra-screen prediction mode.
- alternatively, the adjacent candidate mode may be determined based on an adjacent intra-screen prediction mode having an index difference of A from the initial intra-screen prediction mode. For example, if the initial intra-screen prediction mode is mode 32 and A is 2, the first adjacent intra-screen prediction mode may be determined as mode 30, and the second adjacent intra-screen prediction mode may be determined as mode 34.
- in this case, multiple intra-screen prediction modes having index values between 30 and 34 may be included in the adjacent candidate modes.
- A is any positive integer greater than or equal to 1.
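The wider-neighborhood variant, with first and second adjacent modes at an index distance A from the initial mode, can be sketched as below. Here the integer modes strictly between the two adjacent modes (excluding the initial mode itself) become candidates; this inclusion rule and the function name are assumptions for illustration.

```python
def wide_candidates(initial_mode, a):
    """Candidates between the adjacent modes at index distance A,
    excluding the initial mode itself (illustrative rule)."""
    first, second = initial_mode - a, initial_mode + a
    return [m for m in range(first + 1, second) if m != initial_mode]

# Example from the text: initial mode 32, A = 2 -> adjacent modes 30
# and 34; modes with indices between them become candidates.
print(wide_candidates(32, 2))   # [31, 33]
```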
- whether to apply the decoding-side intra-screen prediction mode improvement method can be determined based on the following conditions.
- in one embodiment, the decoding-side intra-screen prediction mode improvement method may not be applied to the current block.
- whether to apply the decoding-side intra-screen prediction mode improvement method may be determined depending on the size of the current block. For example, if the width of the current block is less than or equal to M luminance samples (M is an arbitrary positive integer greater than or equal to 1), the decoding-side intra-screen prediction mode improvement method may not be applied to the current block.
- the width of the current block may be measured by the number of luminance samples included in the current block.
- similarly, if the height of the current block is less than or equal to K luminance samples (K is an arbitrary positive integer greater than or equal to 1), the decoding-side intra-screen prediction mode improvement method may not be applied to the current block.
- if the aspect ratio (width/height or height/width) of the current block falls within a predetermined range, the decoding-side intra-screen prediction mode improvement method may not be applied to the current block.
- depending on other conditions, the decoding-side intra-screen prediction mode improvement method may not be applied to the current block.
- the T value and the U value may be determined according to the number of prediction modes within the entire screen.
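The size-based applicability conditions above can be gathered into one predicate, sketched below. The concrete thresholds (`min_w`, `min_h`, `max_aspect`) are assumptions for illustration; the text leaves them as M, K, and a predetermined range.

```python
def refinement_applicable(width, height, min_w=8, min_h=8, max_aspect=4):
    """Hedged sketch: decide whether decoder-side intra prediction mode
    refinement may be applied to a block of the given size."""
    if width <= min_w or height <= min_h:
        return False                      # block too small
    aspect = max(width / height, height / width)
    if aspect >= max_aspect:
        return False                      # aspect ratio in excluded range
    return True

print(refinement_applicable(16, 16))  # True
print(refinement_applicable(4, 64))   # False (narrow and elongated)
```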
- when generating the MPM list of the current block or determining the direct mode of the chrominance component, the intra-screen prediction mode of the reference block is referred to.
- in this case, the initial intra-screen prediction mode of the reference block, instead of its final intra-screen prediction mode, can be used for generating the MPM list and determining the direct mode of the chrominance component.
- the computational burden on the decoder side may increase when calculating the distortion of the adjacent candidate modes. Therefore, instead of the decoder calculating the distortion of the adjacent candidate modes, the encoder can encode intra prediction mode refinement information for the adjacent candidate mode applied to the current block. The decoder can then determine the optimal adjacent candidate mode based on the intra prediction mode refinement information; this is referred to as encoder-side intra prediction mode refinement (EIMR).
- EIMR encoder-side intra prediction mode refinement
- the intra-screen prediction mode improvement information may include intra-screen prediction mode improvement application information indicating whether intra-screen prediction mode improvement is applied and intra-screen prediction mode improvement index information indicating an optimal adjacent candidate mode among adjacent candidate modes.
- if the intra-screen prediction mode improvement application information indicates that intra-screen prediction mode improvement is applied to the current block or an upper unit (coding tree block, slice, tile, picture, etc.) of the current block, the optimal candidate intra-screen prediction mode may be selected as the final intra-screen prediction mode of the current block based on the intra-screen prediction mode improvement index information.
- if the intra-screen prediction mode improvement application information indicates that intra-screen prediction mode improvement is not applied to the current block or an upper unit of the current block, the initial intra-screen prediction mode may be selected as the final intra-screen prediction mode of the current block.
- in one embodiment, the initial intra-screen prediction mode is also included among the candidate intra-screen prediction modes.
- in another embodiment, the initial intra-screen prediction mode is excluded from the candidate intra-screen prediction modes.
- the above intra-screen prediction mode improvement application information can be encoded as a 1-bit codeword.
- the intra-screen prediction mode improvement index information represents the optimal candidate intra-screen prediction mode among the candidate intra-screen prediction modes.
- the optimal candidate intra-screen prediction mode is mapped to a predetermined index in consideration of the number of adjacent candidate modes. For example, when there are four adjacent candidate modes, the optimal adjacent candidate mode is mapped to one of values 0 to 3.
- the intra-screen prediction mode improvement index information is set to represent the mapped value.
- the intra-screen prediction mode improvement index information can be encoded based on a fixed length code (FLC), a truncated unary code (TU), a signed Exp-Golomb code, or the like.
- the above-described intra-screen prediction mode improvement index information can be encoded by considering the occurrence frequency of adjacent candidate modes. Specifically, a short codeword can be matched to an adjacent candidate mode with a high occurrence frequency, and a long codeword can be matched to an adjacent candidate mode with a low occurrence frequency.
- the occurrence frequency of adjacent candidate modes can differ depending on their index difference with respect to the initial intra-screen prediction mode. For example, the occurrence frequency of the adjacent candidate mode having the smallest absolute index difference with respect to the initial intra-screen prediction mode can be the highest. Accordingly, a short codeword can be matched to the adjacent candidate mode having the smallest absolute index difference with respect to the initial intra-screen prediction mode.
- similarly, a long codeword can be matched to the adjacent candidate mode having the largest absolute index difference with respect to the initial intra-screen prediction mode.
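The frequency-based codeword matching above can be sketched as follows, assuming (as the text suggests) that frequency decreases with the absolute index difference from the initial mode. A truncated unary code is used as one concrete choice among those the text names; all function names are illustrative.

```python
def truncated_unary(value, max_value):
    """TU codeword: `value` ones followed by a terminating zero; the
    last symbol (value == max_value) omits the terminator."""
    if value == max_value:
        return "1" * value
    return "1" * value + "0"

def assign_codewords(index_diffs):
    """Rank candidates by |index difference| and assign TU codewords,
    so the closest (most frequent) candidate gets the shortest one."""
    ranked = sorted(index_diffs, key=abs)
    n = len(ranked)
    return {d: truncated_unary(i, n - 1) for i, d in enumerate(ranked)}

# Four adjacent candidates at index differences -2, -1, +1, +2:
print(assign_codewords([-2, -1, 1, 2]))
# -> {-1: '0', 1: '10', -2: '110', 2: '111'}
```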
- the intra-screen prediction mode improvement index information may include intra-screen prediction mode improvement sign information and intra-screen prediction mode improvement index difference information.
- the intra-screen prediction mode improvement sign information indicates which direction the optimal candidate intra-screen prediction mode is in among the + direction and the - direction from the initial intra-screen prediction mode.
- the intra-screen prediction mode improvement index difference information indicates the index difference between the optimal candidate intra-screen prediction mode and the initial intra-screen prediction mode.
- the above intra-screen prediction mode improvement sign information can be encoded as a 1-bit codeword. If the initial intra-screen prediction mode is the intra-screen prediction mode with the smallest index or the intra-screen prediction mode with the largest index among the directional modes of a predetermined range, the intra-screen prediction mode improvement sign information is not encoded, and only the intra-screen prediction mode improvement index difference information can be encoded.
- if there is only one adjacent candidate mode in each direction, the intra-screen prediction mode improvement index difference information may not be encoded. If there are two or more adjacent candidate modes in each direction, the intra-screen prediction mode improvement index difference information may be encoded based on a fixed length code (FLC), a truncated unary code (TU), a signed Exp-Golomb code, etc.
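The decomposition into sign information and index difference information can be sketched as follows; the field layout below (a 0/1 sign flag plus a magnitude) is an illustrative assumption rather than the document's normative syntax.

```python
# Hypothetical sketch: split the intra-screen prediction mode
# improvement into 1-bit sign information and an index-difference
# magnitude, then recombine at the decoder side.

def encode_improvement(initial_mode: int, final_mode: int) -> tuple[int, int]:
    delta = final_mode - initial_mode
    sign = 0 if delta >= 0 else 1      # 0: + direction, 1: - direction
    return sign, abs(delta)

def decode_improvement(initial_mode: int, sign: int, diff: int) -> int:
    return initial_mode + diff if sign == 0 else initial_mode - diff

# Round trip: initial mode 34 improved to mode 32.
sign, diff = encode_improvement(34, 32)
assert decode_improvement(34, sign, diff) == 32
```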
- the above intra-screen prediction mode improvement index difference information can be encoded by considering the occurrence frequency of the adjacent candidate modes.
- the occurrence frequency of an adjacent candidate mode can differ depending on its index difference with respect to the initial intra-screen prediction mode. For example, the occurrence frequency of the adjacent candidate mode having the smallest absolute value of index difference with respect to the initial intra-screen prediction mode can be the highest. Accordingly, a short codeword can be matched to the index of the adjacent candidate mode having the smallest absolute value of index difference with respect to the initial intra-screen prediction mode. In addition, a long codeword can be matched to the index of the adjacent candidate mode having the largest absolute value of index difference with respect to the initial intra-screen prediction mode.
- Fig. 11 illustrates an embodiment of an intra-screen prediction method to which intra-screen prediction mode improvement is applied.
- in step 1102, an initial intra-screen prediction mode of a current block is determined.
- the initial intra-screen prediction mode of the current block can be determined with reference to an intra-screen prediction mode of a reference block. If intra-screen prediction improvement is applied to a reference block of the current block, the initial intra-screen prediction mode of the current block can be determined based on the initial intra-screen prediction mode of the reference block.
- the reference block can be a block referenced when determining an MPM (Most Probable Mode) or DM (Direct Mode) of the current block.
- the reference block can be a spatially adjacent block of the current block.
- in determining the DM of the current block of the chroma component, the reference block can be a corresponding block of the luma component at a position corresponding to the current block.
- in step 1104, based on the initial intra-screen prediction mode of the current block, one or more adjacent candidate modes adjacent to the initial intra-screen prediction mode are determined.
- the one or more adjacent candidate modes may include a first adjacent candidate mode having a prediction direction between the prediction direction of a first adjacent intra-screen prediction mode of the current block and the prediction direction of the initial intra-screen prediction mode, and a second adjacent candidate mode having a prediction direction between the prediction direction of a second adjacent intra-screen prediction mode of the current block and the prediction direction of the initial intra-screen prediction mode.
- the first adjacent intra-screen prediction mode has an index that is smaller than the index of the initial intra-screen prediction mode by a predetermined value.
- the second adjacent intra-screen prediction mode has an index that is larger than the index of the initial intra-screen prediction mode by a predetermined value.
- the first adjacent candidate modes are determined by evenly dividing the directional difference between the prediction direction of the first adjacent intra-screen prediction mode of the current block and the prediction direction of the initial intra-screen prediction mode according to the number of first adjacent candidate modes. That is, the directional differences between the first adjacent candidate modes, and between a first adjacent candidate mode and the initial intra-screen prediction mode, are made equal.
- the second adjacent candidate modes are determined by evenly dividing the directional difference between the prediction direction of the second adjacent intra-screen prediction mode of the current block and the prediction direction of the initial intra-screen prediction mode according to the number of second adjacent candidate modes. That is, the directional differences between the second adjacent candidate modes, and between a second adjacent candidate mode and the initial intra-screen prediction mode, are made equal.
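The even division of the directional difference described above can be sketched as follows; the angles and candidate counts are illustrative assumptions, not values from a normative prediction-angle table.

```python
# Hypothetical sketch: place n adjacent candidate directions strictly
# between the initial intra-screen prediction direction and a
# neighboring mode's direction so that all gaps are equal.

def adjacent_candidates(initial_angle: float, neighbor_angle: float, n: int) -> list[float]:
    """Return n candidate directions between initial_angle and
    neighbor_angle; candidate-to-candidate and candidate-to-initial
    gaps are all equal to the total difference divided by (n + 1)."""
    step = (neighbor_angle - initial_angle) / (n + 1)
    return [initial_angle + step * (i + 1) for i in range(n)]

# Two first adjacent candidates between an assumed initial direction
# (45.0 degrees) and the first adjacent mode's direction (40.0 degrees):
# the 5-degree difference splits into three equal gaps of 5/3 degrees.
cands = adjacent_candidates(45.0, 40.0, 2)
```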
- if the initial intra-screen prediction mode is the intra-screen prediction mode with the smallest index among the directional modes in a given range, the one or more adjacent candidate modes may include only the second adjacent candidate mode. Conversely, if the initial intra-screen prediction mode is the intra-screen prediction mode with the largest index among the directional modes in the given range, the one or more adjacent candidate modes may include only the first adjacent candidate mode.
- in step 1106, a final intra-screen prediction mode of the current block is determined among the candidate intra-screen prediction modes including the one or more adjacent candidate modes.
- a mode having the smallest distortion among the candidate intra-screen prediction modes of the current block may be determined as the final intra-screen prediction mode of the current block.
- the distortion of a candidate intra-screen prediction mode may be determined based on the difference between a reconstructed sample included in a template adjacent to the current block and a prediction sample corresponding to the reconstructed sample.
- the prediction sample may be determined based on a template reference sample adjacent to the template and the prediction direction of the candidate intra-screen prediction mode.
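A minimal sketch of the template-based selection described above, with a toy 1-D predictor standing in for a real angular intra predictor; the sample values, mode indices, and prediction model are all illustrative assumptions.

```python
# Hypothetical sketch: for each candidate mode, predict the template
# from template reference samples and measure SAD against the
# reconstructed template; the smallest-distortion mode wins.

def sad(a, b):
    """Sum of absolute differences between two sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))

def select_final_mode(recon_template, template_refs, candidate_modes, predict):
    """predict(mode, template_refs) -> predicted template samples."""
    best_mode, best_cost = None, float("inf")
    for mode in candidate_modes:
        cost = sad(recon_template, predict(mode, template_refs))
        if cost < best_cost:           # smallest distortion wins
            best_mode, best_cost = mode, cost
    return best_mode

# Toy predictor: each mode shifts the reference samples by (mode - 34).
refs = [10, 20, 30, 40, 50, 60]
predict = lambda mode, r: r[mode - 34 + 1 : mode - 34 + 5]
recon = [20, 30, 40, 50]               # exactly matches mode 34's shift
assert select_final_mode(recon, refs, [33, 34, 35], predict) == 34
```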
- intra-screen prediction mode improvement index information indicating the final intra-screen prediction mode of the current block among the candidate intra-screen prediction modes of the current block can be obtained from a bitstream. Then, the final intra-screen prediction mode of the current block can be determined based on the intra-screen prediction mode improvement index information.
- the intra-screen prediction mode improvement index information may include intra-screen prediction mode improvement sign information indicating whether the final intra-screen prediction mode lies in the + direction or the - direction from the initial intra-screen prediction mode, and intra-screen prediction mode improvement index difference information indicating the index difference between the final intra-screen prediction mode and the initial intra-screen prediction mode. Accordingly, the final intra-screen prediction mode of the current block can be derived according to the sign indicated by the intra-screen prediction mode improvement sign information and the index difference indicated by the intra-screen prediction mode improvement index difference information.
- a codeword of a predetermined length is assigned to each symbol of the index difference indicated by the intra-screen prediction mode improvement index difference information.
- the length of the codeword assigned to a candidate intra-screen prediction mode can be determined based on the index difference between the initial intra-screen prediction mode and the candidate intra-screen prediction mode. For example, a short codeword can be assigned to a symbol of a small index difference.
- in step 1108, the current block is predicted based on the final intra-screen prediction mode of the current block.
- the video decoding method may further include a step of determining whether intra-screen prediction improvement is performed on the current block. If intra-screen prediction improvement is performed on the current block, steps 1104 to 1108 may be performed. If intra-screen prediction improvement is not performed on the current block, steps 1104 to 1108 may be omitted, and the current block may be predicted based on the initial intra-screen prediction mode of the current block.
- if the initial intra-screen prediction mode is a non-directional intra-screen prediction mode, it may be determined that intra-screen prediction improvement is not to be performed on the current block.
- if the index of the initial intra-screen prediction mode is not within a given range, it may be determined that intra-screen prediction improvement is not to be performed on the current block.
- whether intra-screen prediction improvement is performed on the current block may be determined based on at least one of the width, height, area, and aspect ratio of the current block.
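A sketch of such a geometry-based gate follows; the thresholds (minimum area 64, maximum aspect ratio 4) are illustrative assumptions, not values specified by this document.

```python
# Hypothetical sketch: decide whether intra-screen prediction
# improvement is applied based on the current block's geometry.

def improvement_enabled(width: int, height: int,
                        min_area: int = 64, max_aspect: int = 4) -> bool:
    area = width * height
    aspect = max(width, height) / min(width, height)
    # Enable only for blocks that are large enough and not too elongated.
    return area >= min_area and aspect <= max_aspect

assert improvement_enabled(8, 8)        # 64 samples, square: enabled
assert not improvement_enabled(4, 4)    # area below threshold
assert not improvement_enabled(64, 4)   # aspect ratio 16: too elongated
```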
- the current block can be encoded or decoded.
- the intra-screen prediction mode improvement index information is encoded in the encoder according to the prediction result.
- the intra-screen prediction mode improvement index information is decoded in the decoder and can be used to derive the final intra-screen prediction mode.
- a bitstream generated by the encoder according to the prediction method performed in steps 1102 to 1108 can be stored in a recording medium or transmitted outside the encoder.
- FIG. 12 is a drawing exemplarily showing a content streaming system to which an embodiment according to the present invention can be applied.
- a content streaming system to which an embodiment of the present invention is applied may largely include an encoding server, a streaming server, a web server, a media storage, a user device, and a multimedia input device.
- the encoding server compresses content input from multimedia input devices such as smartphones, cameras, and CCTVs into digital data to generate a bitstream, and transmits it to the streaming server.
- when multimedia input devices such as smartphones, cameras, and CCTVs directly generate a bitstream, the encoding server may be omitted.
- the above bitstream can be generated by an image encoding method and/or an image encoding device to which an embodiment of the present invention is applied, and the streaming server can temporarily store the bitstream during the process of transmitting or receiving the bitstream.
- the above streaming server transmits multimedia data to a user device based on a user request via a web server, and the web server can act as an intermediary that informs the user of any available services.
- when a user requests a desired service from the web server, the web server transmits the request to the streaming server, and the streaming server can transmit multimedia data to the user.
- the content streaming system may include a separate control server, and in this case, the control server may perform a role of controlling commands/responses between each device within the content streaming system.
- the above streaming server can receive content from a media storage and/or an encoding server. For example, when receiving content from the encoding server, the content can be received in real time. In this case, in order to provide a smooth streaming service, the streaming server can store the bitstream for a certain period of time.
- Examples of the user devices may include mobile phones, smart phones, laptop computers, digital broadcasting terminals, personal digital assistants (PDAs), portable multimedia players (PMPs), navigation devices, slate PCs, tablet PCs, ultrabooks, wearable devices (e.g., smartwatches, smart glasses, HMDs (head mounted displays)), digital TVs, desktop computers, digital signage, etc.
- Each server within the above content streaming system can be operated as a distributed server, in which case data received from each server can be processed in a distributed manner.
- an image can be encoded/decoded using at least one of the above embodiments or a combination thereof.
- the order in which the above embodiments are applied may be different in the encoding device and the decoding device. Alternatively, the order in which the above embodiments are applied may be the same in the encoding device and the decoding device.
- the above embodiments can be performed for each of the luminance and chrominance signals, or the above embodiments can be performed identically for the luminance and chrominance signals.
- the methods are described based on flowcharts as a series of steps or units, but the present invention is not limited to the order of the steps; some steps may occur in a different order from, or simultaneously with, other steps described above.
- the steps shown in the flowchart are not exclusive, and other steps may be included, or one or more steps in the flowchart may be deleted without affecting the scope of the present invention.
- the above embodiments may be implemented in the form of program commands that can be executed through various computer components and recorded on a computer-readable recording medium.
- the computer-readable recording medium may include program commands, data files, data structures, etc., alone or in combination.
- the program commands recorded on the computer-readable recording medium may be those specifically designed and configured for the present invention or may be those known to and available to those skilled in the art of computer software.
- a bitstream generated by an encoding method according to the above embodiment can be stored in a non-transitory computer-readable recording medium.
- the bitstream stored in the non-transitory computer-readable recording medium can be decoded by a decoding method according to the above embodiment.
- examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes, optical recording media such as CD-ROMs and DVDs, magneto-optical media such as floptical disks, and hardware devices specifically configured to store and execute program instructions such as ROMs, RAMs, and flash memories.
- Examples of program instructions include not only machine language codes generated by a compiler, but also high-level language codes that can be executed by a computer using an interpreter, etc.
- the hardware devices may be configured to operate as one or more software modules to perform the processing according to the present invention, and vice versa.
- the present invention can be used in a device for encoding/decoding an image and a recording medium storing a bitstream.
Abstract
The present invention relates to an image decoding method comprising the steps of: determining an initial intra prediction mode of a current block; determining, based on the initial intra prediction mode of the current block, one or more adjacent candidate modes adjacent to the initial intra prediction mode; determining a final intra prediction mode of the current block from among candidate intra prediction modes including the one or more adjacent candidate modes; and predicting the current block based on the final intra prediction mode of the current block.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202480025049.4A CN121128165A (zh) | 2023-06-08 | 2024-06-05 | 图像编码/解码的方法和装置以及用于存储比特流的记录介质 |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR10-2023-0073307 | 2023-06-08 | ||
| KR20230073307 | 2023-06-08 | ||
| KR1020240073540A KR20240174498A (ko) | 2023-06-08 | 2024-06-05 | 영상 부호화/복호화 방법, 장치 및 비트스트림을 저장한 기록 매체 |
| KR10-2024-0073540 | 2024-06-05 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024253427A1 true WO2024253427A1 (fr) | 2024-12-12 |
Family
ID=93796150
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/KR2024/007711 Pending WO2024253427A1 (fr) | 2023-06-08 | 2024-06-05 | Procédé et dispositif de codage/décodage d'image, et support d'enregistrement sur lequel un flux binaire est enregistré |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2024253427A1 (fr) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR101585565B1 (ko) * | 2011-06-17 | 2016-01-14 | 미디어텍 인크. | 인트라 예측 모드의 코딩을 위한 방법 및 장치 |
| US20190166370A1 (en) * | 2016-05-06 | 2019-05-30 | Vid Scale, Inc. | Method and system for decoder-side intra mode derivation for block-based video coding |
| KR20200007044A (ko) * | 2017-06-22 | 2020-01-21 | 후아웨이 테크놀러지 컴퍼니 리미티드 | 인트라-프레임 예측 방법 및 장치 |
| KR20220159464A (ko) * | 2021-04-26 | 2022-12-02 | 텐센트 아메리카 엘엘씨 | 디코더 측 인트라 모드 도출 |
| KR20230058166A (ko) * | 2021-08-02 | 2023-05-02 | 텐센트 아메리카 엘엘씨 | 개선된 인트라 예측을 위한 방법 및 장치 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24819578. Country of ref document: EP. Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |