WO2025005664A1 - Image encoding/decoding method and recording medium for storing a bitstream - Google Patents
- Publication number
- WO2025005664A1 (application PCT/KR2024/008916)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- block
- prediction
- motion vector
- picture
- luma
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/186—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/577—Motion compensation with bidirectional frame interpolation, i.e. using B-pictures
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
Definitions
- the present disclosure relates to a method and device for processing a video signal.
- HD High Definition
- UHD Ultra High Definition
- inter-picture prediction technology that predicts pixel values included in the current picture from pictures before or after the current picture
- intra-picture prediction technology that predicts pixel values included in the current picture using pixel information in the current picture
- entropy coding technology that assigns short codes to values with high frequency of appearance and long codes to values with low frequency of appearance, etc.
- the present disclosure aims to provide a method for predicting a chroma block through color component-specific prediction even when a luma block is encoded with inter prediction, and a device therefor.
- the present disclosure aims to provide a method for deriving prediction parameters based on previously restored luma reference blocks and chroma reference blocks and a device therefor.
- a video decoding method may include a step of deriving a first reference block of a luma block that is in the same position as a chroma block; a step of deriving a second reference block of the chroma block; a step of deriving a prediction parameter based on the first reference block and the second reference block; and a step of applying the prediction parameter to the luma block to obtain a prediction block for the chroma block.
- a video encoding method may include a step of deriving a first reference block of a luma block that is in the same position as a chroma block; a step of deriving a second reference block of the chroma block; a step of deriving a prediction parameter based on the first reference block and the second reference block; and a step of applying the prediction parameter to the luma block to obtain a prediction block for the chroma block.
- when bidirectional prediction is applied to the luma block, the first reference block may be obtained by weighting the L0 reference block and the L1 reference block of the luma block, and the second reference block may be obtained by weighting the L0 reference block and the L1 reference block of the chroma block.
- the first reference block and the second reference block can be determined based on the POC (Picture Order Count) of the L0 reference picture and the L1 reference picture of the luma block.
- each of the first reference block and the second reference block may represent an L0 reference block of the luma block and an L0 reference block of the chroma block
- each of the first reference block and the second reference block may represent an L1 reference block of the luma block and an L1 reference block of the chroma block.
- the first reference block may be obtained by weighting the L0 reference block and the L1 reference block of the luma block
- the second reference block may be obtained by weighting the L0 reference block and the L1 reference block of the chroma block.
- the first reference block and the second reference block are derived from at least one of a reference picture in the L0 direction or a reference picture in the L1 direction, and at least one of the L0 direction and the L1 direction can be selected based on prediction direction information decoded from a bitstream.
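The weighting of an L0 and an L1 reference block described above can be sketched as follows. This is a minimal illustration only: the function name `blend_reference_blocks`, the integer weights, and the round-to-nearest rule are assumptions for the example and are not taken from the disclosure.

```python
from typing import List

Block = List[List[int]]


def blend_reference_blocks(ref_l0: Block, ref_l1: Block, w0: int = 1, w1: int = 1) -> Block:
    """Weighted average of two equally sized reference blocks (hypothetical helper).

    With w0 == w1 this is a plain average; rounding to nearest is an assumption.
    """
    total = w0 + w1
    return [
        [(w0 * a + w1 * b + total // 2) // total for a, b in zip(row0, row1)]
        for row0, row1 in zip(ref_l0, ref_l1)
    ]
```

The same helper would be applied once to the luma L0/L1 reference blocks to obtain the first reference block and once to the chroma L0/L1 reference blocks to obtain the second reference block.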
- the prediction parameters may include weights and offsets.
- the prediction sample of the chroma block can be derived by multiplying the reconstructed sample at the corresponding position in the luma block by the weight and adding the offset to the result.
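The weight-and-offset model above can be sketched as follows. The text here does not say how the parameters are computed from the two reference blocks, so the least-squares fit below is an assumption, as are the function names, the 8-bit clipping, and the premise that the luma blocks have already been brought to the chroma resolution.

```python
from typing import List, Tuple

Block = List[List[int]]


def derive_linear_params(luma_ref: Block, chroma_ref: Block) -> Tuple[float, float]:
    """Fit chroma_ref ~ weight * luma_ref + offset by least squares (assumed derivation)."""
    xs = [v for row in luma_ref for v in row]
    ys = [v for row in chroma_ref for v in row]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # Flat luma reference: fall back to predicting the mean chroma value.
        return 0.0, mean_y
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    weight = cov / var_x
    return weight, mean_y - weight * mean_x


def predict_chroma(luma_rec: Block, weight: float, offset: float, bit_depth: int = 8) -> Block:
    """Apply pred = weight * luma + offset per sample, clipped to the sample range."""
    max_val = (1 << bit_depth) - 1
    return [
        [min(max_val, max(0, round(weight * v + offset))) for v in row]
        for row in luma_rec
    ]
```

Because both reference blocks are already reconstructed, a decoder could repeat the same fit without any parameters being signaled, which is the overhead reduction the disclosure points to.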
- the prediction parameter may include filter coefficients for a convolutional filter.
- the prediction sample of the chroma block can be derived by inputting, into the convolution filter, the reconstructed sample at the corresponding position in the luma block and at least one neighboring sample adjacent to that reconstructed sample.
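The convolution-filter variant can be sketched as below. The cross-shaped support (center, above, below, left, right), the optional bias term, and the edge replication at block borders are assumptions chosen for illustration; the filter shape actually used is the one shown in the referenced figures, and the function name is hypothetical.

```python
from typing import List, Sequence

Block = List[List[int]]


def predict_chroma_conv(luma_rec: Block, coeffs: Sequence[float],
                        bias: float = 0.0, bit_depth: int = 8) -> Block:
    """Filter each luma sample and its four neighbors to predict a chroma sample.

    coeffs is (center, up, down, left, right); this tap layout is an assumption.
    """
    h, w = len(luma_rec), len(luma_rec[0])
    max_val = (1 << bit_depth) - 1

    def at(y: int, x: int) -> int:
        # Replicate border samples when a neighbor falls outside the block.
        return luma_rec[min(h - 1, max(0, y))][min(w - 1, max(0, x))]

    c_center, c_up, c_down, c_left, c_right = coeffs
    pred = []
    for y in range(h):
        row = []
        for x in range(w):
            val = (c_center * at(y, x) + c_up * at(y - 1, x) + c_down * at(y + 1, x)
                   + c_left * at(y, x - 1) + c_right * at(y, x + 1) + bias)
            row.append(min(max_val, max(0, round(val))))
        pred.append(row)
    return pred
```

Setting every coefficient except the center to zero reduces this to the weight-and-offset model, which is one way to view the two parameter types as candidates of the same family.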
- the type of the prediction parameter is determined as one of a plurality of prediction parameter type candidates, and the plurality of prediction parameter type candidates may include a first prediction parameter candidate including a weight and an offset and a second prediction parameter candidate including filter coefficients of a convolution filter.
- when the intra block copy mode is applied to the luma block, the first reference block may be derived from a current luma picture including the luma block, and the second reference block may be derived from a current chroma picture including the chroma block.
- the prediction parameter can be derived for each sub-block within the chroma block.
- a computer-readable recording medium storing a bitstream generated by an image encoding method can be provided.
- signaling overhead can be reduced by deriving prediction parameters based on previously restored luma reference blocks and chroma reference blocks.
- FIG. 1 is a block diagram illustrating an image encoding device according to an embodiment of the present disclosure.
- FIG. 2 is a block diagram illustrating an image decoding device according to an embodiment of the present disclosure.
- Figure 3 is a diagram schematically illustrating the process of performing inter prediction in an encoder and decoder.
- Figure 4 shows an example in which motion estimation is performed.
- Figures 5 and 6 illustrate examples in which a prediction block of a current block is generated based on motion information generated through motion estimation.
- Figure 7 shows the locations referenced to derive motion vector prediction values.
- Figure 8 is a diagram for explaining a template-based motion estimation method.
- Figure 9 shows examples of template configurations.
- Figure 10 is a diagram for explaining a motion estimation method based on a bilateral matching method.
- Figure 11 is a diagram for explaining a motion estimation method based on a unilateral matching method.
- Figures 12 and 13 illustrate examples in which prediction blocks are generated according to the precision of a motion vector.
- Figure 14 shows an example in which motion compensation based on the translational model and the zooming model is performed for the current block.
- Figure 15 shows an example in which motion compensation based on translational models and rotational models is performed for the current block.
- Figures 16 and 17 show examples of generating a prediction block for a current block using control point motion vectors.
- Figure 18 shows an example of generating a prediction block for the current block using three control point motion vectors.
- Figure 19 shows an example in which a motion vector is derived in sub-block units.
- Figures 20 and 21 illustrate examples in which motion vectors are derived for each sub-block within the current block when SbTMVP is applied.
- Figures 22 and 23 are diagrams showing examples in which prediction blocks are derived according to motion vector precision.
- Figures 24 and 25 are diagrams for explaining the process of encoding and decoding a motion vector difference value, respectively, when the AMVR method is applied.
- Figure 26 shows a flow chart of a color component prediction method based on prediction parameters.
- Figures 27 and 28 illustrate the operation of the encoder/decoder according to a color component prediction method based on prediction parameters.
- Figure 29 illustrates an example of predicting a chroma block by selecting one of multiple prediction parameter candidates.
- Figure 30 shows an example of deriving prediction parameters for chroma components.
- Figure 31 shows the sub-sampled locations.
- Figure 32 shows an example of deriving prediction parameters using a convolution filter.
- Figure 33 shows the form of a convolution filter.
- Figure 34 is a diagram for explaining an example in which a color component-specific prediction method based on prediction parameters is performed on a sub-block basis.
- first, second, etc. may be used to describe various components, but the components should not be limited by the terms. The terms are only used to distinguish one component from another.
- the first component could be referred to as the second component, and similarly, the second component could also be referred to as the first component.
- the term and/or includes any combination of a plurality of related described items or any item among a plurality of related described items.
- FIG. 1 is a block diagram illustrating an image encoding device according to an embodiment of the present disclosure.
- a video encoding device may include a picture splitting unit (110), a prediction unit (120, 125), a transformation unit (130), a quantization unit (135), a reordering unit (160), an entropy encoding unit (165), an inverse quantization unit (140), an inverse transformation unit (145), a filter unit (150), and a memory (155).
- each component shown in FIG. 1 is independently illustrated to indicate different characteristic functions in the image encoding device, and does not mean that each component is composed of separate hardware or a single software configuration unit. That is, each component is listed and included as a separate component for convenience of explanation, and at least two components among each component may be combined to form a single component, or one component may be divided into multiple components to perform a function, and such integrated and separated embodiments of each component are also included in the scope of the present disclosure as long as they do not deviate from the essence of the present disclosure.
- some components may not be essential components that perform essential functions in the present disclosure, but may be optional components that are merely used to improve performance.
- the present disclosure may be implemented by including only essential components for implementing the essence of the present disclosure, excluding components that are merely used to improve performance, and a structure that includes only essential components, excluding optional components that are merely used to improve performance, is also included in the scope of the present disclosure.
- the picture splitting unit (110) can split an input picture into at least one processing unit.
- the processing unit may be a prediction unit (PU), a transform unit (TU), or a coding unit (CU).
- the picture splitting unit (110) can split one picture into a combination of multiple coding units, prediction units, and transform units, and select one combination of coding units, prediction units, and transform units based on a predetermined criterion (e.g., a cost function) to encode the picture.
- a picture can be split into multiple coding units.
- a recursive tree structure such as a quad tree, a ternary tree, or a binary tree can be used.
- with one picture or the largest coding unit as the root, a coding unit that is split into other coding units has as many child nodes as the number of coding units it is split into.
- a coding unit that cannot be split any further according to a certain restriction becomes a leaf node. For example, assuming that a quad tree split is applied to a coding unit, a coding unit can be split into at most four different coding units.
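The recursive quad tree split described above can be sketched as follows. The `should_split` callback stands in for the encoder's decision (for example a cost comparison) and, like the function name and the `(x, y, size)` tuple layout, is a hypothetical choice for this example; ternary and binary splits are omitted.

```python
from typing import Callable, List, Tuple

Leaf = Tuple[int, int, int]  # (x, y, size) of a square coding unit


def quad_tree_split(x: int, y: int, size: int, min_size: int,
                    should_split: Callable[[int, int, int], bool]) -> List[Leaf]:
    """Return the leaf coding units of a quad tree rooted at (x, y, size)."""
    # A unit that has reached the minimum size, or that the caller chooses
    # not to split, becomes a leaf node.
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]
    half = size // 2
    leaves: List[Leaf] = []
    for dy in (0, half):
        for dx in (0, half):
            # Each split produces exactly four child nodes.
            leaves.extend(quad_tree_split(x + dx, y + dy, half, min_size, should_split))
    return leaves
```

For instance, `quad_tree_split(0, 0, 64, 16, lambda x, y, s: x == 0 and y == 0)` keeps splitting only the top-left unit, giving four 16x16 leaves and three 32x32 leaves.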
- the term coding unit may refer to a unit on which encoding is performed or to a unit on which decoding is performed.
- a prediction unit may be divided into at least one square or rectangular shape of the same size within one coding unit, or may be divided such that one prediction unit among the divided prediction units within one coding unit has a different shape and/or size from another prediction unit.
- the transform unit and the prediction unit can be set to be the same.
- the coding unit can be divided into multiple transform units, and intra prediction can then be performed for each transform unit.
- the coding unit can be divided in the horizontal direction or the vertical direction.
- the number of transform units generated by dividing the coding unit can be 2 or 4, depending on the size of the coding unit.
- the prediction unit (120, 125) may include an inter-prediction unit (120) that performs inter-prediction and an intra-prediction unit (125) that performs intra-prediction. It may be determined whether to use inter-prediction or intra-prediction for an encoding unit, and specific information (e.g., intra-prediction mode, motion vector, reference picture, etc.) according to each prediction method may be determined. At this time, the processing unit where the prediction is performed and the processing unit where the prediction method and specific contents are determined may be different. For example, the prediction method and prediction mode, etc. are determined in the encoding unit, and the prediction may be performed in the prediction unit or the transformation unit.
- the residual value (residual block) between the generated prediction block and the original block may be input to the transformation unit (130).
- the prediction mode information, motion vector information, etc. used for the prediction may be encoded together with the residual value in the entropy encoding unit (165) and transmitted to the decoding device.
- the inter prediction unit (120) may predict a prediction unit based on information of at least one of the picture preceding or the picture following the current picture, and in some cases may predict a prediction unit based on information of an already-encoded region within the current picture.
- the inter prediction unit (120) may include a reference picture interpolation unit, a motion prediction unit, and a motion compensation unit.
- the reference picture interpolation unit can receive reference picture information from the memory (155) and generate sub-integer pixel information from the reference picture.
- for luma pixels, a DCT-based 8-tap interpolation filter with different filter coefficients can be used to generate sub-integer pixel information in units of 1/4 pixel.
- for chroma signals, a DCT-based 4-tap interpolation filter with different filter coefficients can be used to generate sub-integer pixel information in units of 1/8 pixel.
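As an illustration of the 8-tap filter mentioned above, the sketch below interpolates one half-sample position in a row of pixels. The disclosure does not list filter coefficients; the taps used here are the half-sample luma taps of HEVC, borrowed as an assumption, and the border replication and function name are likewise illustrative.

```python
from typing import Sequence

# Half-sample luma taps as used in HEVC (they sum to 64); assumed here,
# since the text above does not give coefficient values.
HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def interpolate_half_pel(row: Sequence[int], i: int, bit_depth: int = 8) -> int:
    """Interpolate the sample halfway between row[i] and row[i + 1]."""
    n = len(row)
    acc = 0
    for k, tap in enumerate(HALF_PEL_TAPS):
        # The filter support covers positions i-3 .. i+4; replicate at borders.
        pos = min(n - 1, max(0, i - 3 + k))
        acc += tap * row[pos]
    value = (acc + 32) >> 6  # divide by 64 with rounding
    return min((1 << bit_depth) - 1, max(0, value))
```

Quarter-sample positions would use a different, asymmetric tap set, which is presumably what "different filter coefficients" refers to.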
- the intra prediction unit (125) can generate a prediction block based on reference pixel information, which is pixel information within the current picture.
- the reference pixel information can be derived from one selected from among a plurality of reference pixel lines.
- the Nth reference pixel line among the plurality of reference pixel lines can include left pixels having an x-axis difference of N from the upper left pixel within the current block and upper pixels having a y-axis difference of N from the upper left pixel.
- the number of reference pixel lines that the current block can select can be 1, 2, 3, or 4.
- when a neighboring block of the current prediction unit is a block on which inter prediction has been performed, so that a reference pixel is a pixel on which inter prediction has been performed, the reference pixel included in that block can be replaced with reference pixel information of a neighboring block on which intra prediction has been performed. That is, if a reference pixel is unavailable, the unavailable reference pixel information can be replaced with information on at least one of the available reference pixels.
- intra prediction for the prediction unit can be performed based on the pixels to the left of, above and to the left of, and above the prediction unit.
- the intra prediction method can generate a prediction block after applying a smoothing filter to reference pixels according to the prediction mode. Whether to apply the smoothing filter can be determined depending on the selected reference pixel line.
- the intra prediction mode of the current prediction unit can be predicted from the intra prediction modes of prediction units neighboring the current prediction unit.
- when the prediction mode of the current prediction unit is predicted using mode information predicted from neighboring prediction units: if the intra prediction modes of the current prediction unit and a neighboring prediction unit are the same, predetermined flag information can be used to signal that the two prediction modes are the same; if they differ, entropy encoding can be performed to encode the prediction mode information of the current block.
- a residual block can be generated that includes residual value information, i.e., the difference between the prediction unit generated by the prediction unit (120, 125) and the original block of that prediction unit.
- the generated residual block can be input to the transformation unit (130).
- the residual block, which includes the residual value information between the original block and the prediction unit generated by the prediction unit (120, 125), can be transformed using a transform method such as DCT (Discrete Cosine Transform), DST (Discrete Sine Transform), or KLT (Karhunen-Loève Transform). Whether to apply DCT, DST, or KLT to transform the residual block can be determined based on at least one of the size of the transform unit, the shape of the transform unit, the prediction mode of the prediction unit, or the intra prediction mode information of the prediction unit.
- DCT Discrete Cosine Transform
- DST Discrete Sine Transform
- KLT Karhunen-Loève Transform
- the quantization unit (135) can quantize the values converted to the frequency domain in the transformation unit (130).
- the quantization coefficients can vary depending on the block or the importance of the image.
- the values produced by the quantization unit (135) can be provided to the inverse quantization unit (140) and the reordering unit (160).
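The round trip between the quantization unit (135) and the inverse quantization unit (140) can be sketched with a plain uniform quantizer. The step size, the round-to-nearest rule, and the function names are assumptions for illustration; real codecs derive the step from a quantization parameter and may add scaling lists.

```python
def quantize(coeff: int, step: int) -> int:
    """Map a transform coefficient to a quantization level (uniform, round to nearest)."""
    sign = -1 if coeff < 0 else 1
    return sign * ((abs(coeff) + step // 2) // step)


def dequantize(level: int, step: int) -> int:
    """Reconstruct an approximate coefficient from a quantization level."""
    return level * step
```

A larger step gives smaller levels that are cheaper to entropy-code, at the price of a larger reconstruction error; with a step of 8 a coefficient of 100 comes back as 104, and with a step of 32 it comes back as 96.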
- the reordering unit (160) can reorder the coefficient values of the quantized residual values.
- the reordering unit (160) can change two-dimensional block-shaped coefficients into a one-dimensional vector form through a coefficient scanning method.
- the reordering unit (160) can change the two-dimensional block-shaped coefficients into a one-dimensional vector form by scanning from the DC coefficient to the coefficients of the high-frequency region using a zig-zag scan method.
- a vertical scan that scans the two-dimensional block-shaped coefficients in the column direction, a horizontal scan that scans them in the row direction, or a diagonal scan that scans them in the diagonal direction may be used instead of the zig-zag scan. That is, which of the zig-zag scan, the vertical scan, the horizontal scan, or the diagonal scan is used can be determined depending on the size of the transform unit and the intra prediction mode.
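The zig-zag scan described above can be sketched as follows for a square block. The traversal order is the conventional one (start at the DC coefficient, then alternate direction along each anti-diagonal); the function name is hypothetical.

```python
from typing import List


def zigzag_scan(block: List[List[int]]) -> List[int]:
    """Flatten a square coefficient block into a 1-D list in zig-zag order."""
    n = len(block)
    out: List[int] = []
    for s in range(2 * n - 1):
        # All (row, col) positions on the anti-diagonal row + col == s,
        # listed from the top-right end to the bottom-left end.
        coords = [(y, s - y) for y in range(n) if 0 <= s - y < n]
        # Even diagonals are walked from bottom-left to top-right instead.
        if s % 2 == 0:
            coords.reverse()
        out.extend(block[y][x] for y, x in coords)
    return out
```

Because quantized high-frequency coefficients are often zero, this ordering tends to group the zeros at the end of the vector, which helps the entropy coder.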
- the entropy encoding unit (165) can perform entropy encoding based on the values produced by the reordering unit (160). Entropy encoding can use various encoding methods such as, for example, Exponential Golomb, CAVLC (Context-Adaptive Variable Length Coding), and CABAC (Context-Adaptive Binary Arithmetic Coding).
- the entropy encoding unit (165) can encode various information such as residual value coefficient information of an encoding unit, block type information, prediction mode information, division unit information, prediction unit information, transmission unit information, motion vector information, reference frame information, block interpolation information, and filtering information from the reordering unit (160) and the prediction unit (120, 125).
- the entropy encoding unit (165) can entropy encode the coefficient values of the encoding unit input from the reordering unit (160).
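Of the entropy coding methods named above, Exponential Golomb is the simplest to show. The sketch below implements the order-0 code for unsigned integers using bit strings for readability; the function names and the string representation are illustrative choices, not part of the disclosure.

```python
from typing import Tuple


def exp_golomb_encode(value: int) -> str:
    """Order-0 Exp-Golomb code of a non-negative integer, as a string of '0'/'1'."""
    if value < 0:
        raise ValueError("value must be non-negative")
    bits = bin(value + 1)[2:]               # binary form of value + 1
    return "0" * (len(bits) - 1) + bits     # prefix with len - 1 zeros


def exp_golomb_decode(bitstring: str, pos: int = 0) -> Tuple[int, int]:
    """Decode one codeword starting at pos; return (value, position after it)."""
    zeros = 0
    while bitstring[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bitstring[pos + zeros:end], 2) - 1, end
```

Small values get short codewords (0 becomes `1`, 1 becomes `010`, 3 becomes `00100`), which matches the principle of assigning short codes to frequently occurring values.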
- the inverse quantization unit (140) inversely quantizes the values quantized in the quantization unit (135), and the inverse transformation unit (145) inversely transforms the values transformed in the transformation unit (130).
- the residual values generated in the inverse quantization unit (140) and the inverse transformation unit (145) can be combined with the prediction unit predicted through the motion estimation unit, motion compensation unit, and intra prediction unit included in the prediction unit (120, 125) to generate a reconstructed block.
- the filter unit (150) may include at least one of a deblocking filter, an offset correction unit, and an ALF (Adaptive Loop Filter).
- ALF Adaptive Loop Filter
- a deblocking filter can remove block distortion caused by boundaries between blocks in a restored picture.
- a strong filter or a weak filter can be applied depending on the required deblocking filtering strength.
- when vertical filtering and horizontal filtering are performed, the horizontal filtering and the vertical filtering can be processed in parallel.
- the offset correction unit can correct the offset from the original image on a pixel basis for the image on which deblocking has been performed.
- a method can be used in which the pixels included in the image are divided into a certain number of regions, the regions to be offset are determined, and the offset is applied to the regions, or a method can be used in which the offset is applied by considering the edge information of each pixel.
- Adaptive Loop Filtering (ALF) can be performed based on a comparison between the filtered restored image and the original image. After the pixels in the image are divided into predetermined groups, one filter to be applied to each group is determined, and filtering can be performed differently for each group. Information on whether to apply ALF can be transmitted for the luma signal per coding unit (CU), and the shape and filter coefficients of the ALF filter to be applied can differ for each block. In addition, an ALF filter of the same (fixed) shape can be applied regardless of the characteristics of the target block.
- ALF Adaptive Loop Filtering
- the memory (155) can store a restored block or picture produced through the filter unit (150), and the stored restored block or picture can be provided to the prediction unit (120, 125) when inter prediction is performed.
- FIG. 2 is a block diagram illustrating an image decoding device according to an embodiment of the present disclosure.
- the image decoding device (200) may include an entropy decoding unit (210), a reordering unit (215), an inverse quantization unit (220), an inverse transformation unit (225), a prediction unit (230, 235), a filter unit (240), and a memory (245).
- the entropy decoding unit (210) can perform entropy decoding in a procedure opposite to that of performing entropy encoding in the entropy encoding unit of the video encoding device.
- various methods such as Exponential Golomb, CAVLC (Context-Adaptive Variable Length Coding), and CABAC (Context-Adaptive Binary Arithmetic Coding) can be applied in response to the method performed in the video encoding device.
- the entropy decoding unit (210) can decode information related to intra prediction and inter prediction performed in the encoding device.
- the inverse quantization unit (220) can perform inverse quantization based on the quantization parameters provided from the encoding device and the coefficient values of the rearranged block.
- the inverse transform unit (225) can apply, to the quantization result produced by the image encoding device, the inverse of the transform performed by the transform unit, i.e., inverse DCT, inverse DST, or inverse KLT for DCT, DST, or KLT, respectively.
- the inverse transform can be performed based on the transmission unit determined by the image encoding device.
- a transform technique (e.g., DCT, DST, KLT) can be selectively applied according to multiple pieces of information, such as the prediction method, the size and shape of the current block, the prediction mode, and the intra prediction direction.
- the prediction unit (230, 235) can generate a prediction block based on prediction block generation related information provided from the entropy decoding unit (210) and previously decoded block or picture information provided from the memory (245).
- intra prediction for the prediction unit is performed based on the pixels to the left of, above and to the left of, and above the prediction unit.
- intra prediction can be performed using reference pixels based on the transform unit.
- intra prediction using NxN partitioning can be used only for the minimum coding unit.
- the prediction unit (230, 235) may include a prediction unit determination unit, an inter prediction unit, and an intra prediction unit.
- the prediction unit determination unit may receive various information input from the entropy decoding unit (210), such as prediction unit information, prediction mode information of an intra prediction method, and motion prediction-related information of an inter prediction method, and may identify the prediction unit within the current coding unit and determine whether the prediction unit performs inter prediction or intra prediction.
- the inter prediction unit (230) may perform inter prediction for the current prediction unit, using the information necessary for inter prediction of the current prediction unit provided by the video encoding device, based on information included in at least one of a previous picture or a subsequent picture of the current picture including the current prediction unit. Alternatively, inter prediction may be performed based on information of an already-restored region of the current picture including the current prediction unit.
- based on the coding unit, it can be determined which of skip mode, merge mode, AMVP mode, and intra block copy mode is the motion prediction method of the prediction unit included in that coding unit.
- the intra prediction unit (235) can generate a prediction block based on pixel information within the current picture. If the prediction unit is one on which intra prediction has been performed, intra prediction can be performed based on the intra prediction mode information of the prediction unit provided by the image encoding device.
- the intra prediction unit (235) can include an AIS (Adaptive Intra Smoothing) filter, a reference pixel interpolation unit, and a DC filter.
- the AIS filter is a part that performs filtering on the reference pixels of the current block, and can determine whether to apply the filter and apply it according to the prediction mode of the current prediction unit.
- the AIS filter can be performed on the reference pixels of the current block using the prediction mode and AIS filter information of the prediction unit provided by the image encoding device. If the prediction mode of the current block is a mode that does not perform AIS filtering, the AIS filter may not be applied.
- the reference pixel interpolation unit can generate reference pixels at fractional (sub-integer) pixel positions by interpolating the reference pixels when the prediction mode of the prediction unit is a mode that performs intra-screen prediction based on interpolated reference pixel values.
- if the prediction mode of the current prediction unit is a prediction mode that generates a prediction block without interpolating the reference pixel, the reference pixel may not be interpolated.
- the DC filter can generate a prediction block through filtering when the prediction mode of the current block is the DC mode.
- the restored block or picture may be provided to a filter unit (240).
- the filter unit (240) may include a deblocking filter, an offset correction unit, and an ALF.
- a deblocking filter of a video decoding device can receive information related to a deblocking filter provided from a video encoding device and perform deblocking filtering on a corresponding block in the video decoding device.
- the offset correction unit can perform offset correction on the restored image based on information such as the type of offset correction applied to the image during encoding and the offset value.
- ALF can be applied to an encoding unit based on ALF application information provided from an encoding device, ALF coefficient information, etc. This ALF information can be provided by being included in a specific parameter set.
- the memory (245) can store a restored picture or block so that it can be used as a reference picture or reference block, and can also provide the restored picture to an output unit.
- the term coding unit is used as an encoding unit, but it may also be a unit that performs not only encoding but also decoding.
- the current block represents a block to be encoded/decoded, and may represent a coding tree block (or coding tree unit), an encoding block (or encoding unit), a transform block (or transform unit), a prediction block (or prediction unit), or a block to which an in-loop filter is applied, depending on the encoding/decoding step.
- a 'unit' represents a basic unit for performing a specific encoding/decoding process
- a 'block' may represent a pixel array of a predetermined size.
- 'block' and 'unit' may be used with the same meaning.
- an encoding block (coding block) and an encoding unit (coding unit) may be understood to have the same meaning.
- Inter prediction can be performed on a block-by-block basis.
- a prediction block of the current block can be generated from a reference picture using the motion information of the current block.
- the motion information can include at least one of a motion vector, a reference picture index, and a prediction direction.
- Figure 3 is a diagram schematically illustrating the process of performing inter prediction in an encoder and decoder.
- motion information for the current block can be obtained (S310).
- the motion information can include at least one of a motion vector, a reference picture index, or a weight applied to the prediction block.
- motion information for at least one of the L0 direction or the L1 direction can be obtained.
- motion information of the current block can be derived through motion estimation, and the derived motion information can be encoded and signaled to the decoder.
- encoding/decoding of motion information can be based on a motion information merging mode, a motion vector prediction mode, a motion estimation method based on a template, or a bilateral matching method, which will be described later.
- motion information of the current block can be derived based on the information transmitted from the encoder.
- the motion information of the current block can be derived at the decoder in the same way as at the encoder. This method can be called decoder-side motion estimation.
- a prediction block for the current block can be obtained based on the derived motion information (S320). For example, a reference block spaced apart by a motion vector from the position of the current block in the reference picture can be set as the prediction block of the current block.
- the motion information of the current block can be generated through motion estimation.
- Figure 4 shows an example in which motion estimation is performed.
- a search range for motion estimation can be set from the same position as the reference point of the current block in the reference picture.
- the reference point can be the position of the upper left sample of the current block.
- a rectangle of size (w0+w1) by (h0+h1) is set as the search range around the reference point.
- w0, w1, h0, and h1 may have the same values.
- at least one of w0, w1, h0, and h1 may be set to have a different value from the other.
- the sizes of w0, w1, h0, and h1 may be determined so as not to exceed a Coding Tree Unit (CTU) boundary, a slice boundary, a tile boundary, or a picture boundary.
- reference blocks having the same size as the current block can be set, and then the cost for each reference block with respect to the current block can be measured.
- the cost can be calculated using the similarity between the two blocks.
- the cost can be calculated based on the sum of absolute differences (SAD) between the original samples in the current block and the original samples (or reconstructed samples) in the reference block. The smaller the sum, the lower the cost.
- the cost of each reference block is compared, and the reference block with the optimal cost can be set as the prediction block of the current block.
- the distance between the current block and the reference block can be set as a motion vector.
- the x-coordinate difference and the y-coordinate difference between the current block and the reference block can be set as the motion vector.
- the index of the picture containing the reference block identified through motion estimation is set as the reference picture index.
- the prediction direction can be set based on whether the reference picture belongs to the L0 reference picture list or the L1 reference picture list.
- motion estimation can be performed for each of the L0 direction and the L1 direction. If prediction is performed for both the L0 direction and the L1 direction, motion information in the L0 direction and motion information in the L1 direction can be generated, respectively.
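The search described above can be sketched as an exhaustive block-matching loop. This is only an illustration under stated assumptions: pictures are plain lists of rows, `full_search` and `sad` are hypothetical names, w0/w1 are taken as the left/right extents and h0/h1 as the up/down extents of the search range, and the range is clipped at the picture boundary only.

```python
def sad(cur, ref, rx, ry):
    """Sum of absolute differences between block `cur` and the same-sized
    block of `ref` whose top-left sample is at (rx, ry)."""
    return sum(
        abs(cur[j][i] - ref[ry + j][rx + i])
        for j in range(len(cur))
        for i in range(len(cur[0]))
    )


def full_search(cur, ref, x0, y0, w0, w1, h0, h1):
    """Return ((dx, dy), cost) for the lowest-SAD reference block.

    (x0, y0) is the reference point (top-left sample of the current block).
    Assumption: the search range extends w0 left, w1 right, h0 up and h1 down.
    """
    bh, bw = len(cur), len(cur[0])
    best = None
    for dy in range(-h0, h1 + 1):
        for dx in range(-w0, w1 + 1):
            rx, ry = x0 + dx, y0 + dy
            # Skip candidates whose reference block leaves the picture.
            if rx < 0 or ry < 0 or rx + bw > len(ref[0]) or ry + bh > len(ref):
                continue
            cost = sad(cur, ref, rx, ry)
            if best is None or cost < best[1]:
                # The coordinate difference (dx, dy) is the motion vector.
                best = ((dx, dy), cost)
    return best
```

Running the same loop once against an L0 reference picture and once against an L1 reference picture would yield the two sets of motion information used for bidirectional prediction.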
- Figures 5 and 6 illustrate examples in which a prediction block of a current block is generated based on motion information generated through motion estimation.
- Figure 5 shows an example of generating a prediction block by unidirectional (i.e., L0 direction) prediction
- Figure 6 shows an example of generating a prediction block by bidirectional (i.e., L0 and L1 direction) prediction.
- a prediction block of the current block is generated using one motion information.
- the motion information may include an L0 motion vector, an L0 reference picture index, and prediction direction information pointing to the L0 direction.
- two pieces of motion information are used to generate a prediction block.
- a reference block in the L0 direction specified based on motion information for the L0 direction (L0 motion information) can be set as an L0 prediction block
- a reference block in the L1 direction specified based on motion information for the L1 direction (L1 motion information) can be generated as an L1 prediction block.
- the L0 prediction block and the L1 prediction block can be weighted and combined to generate a prediction block of the current block.
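As a minimal sketch of the weighted combination, assuming integer weights and round-to-nearest integer arithmetic (neither is specified in the text, and `bi_predict` is a hypothetical name):

```python
def bi_predict(pred_l0, pred_l1, w0=1, w1=1):
    """Combine L0 and L1 prediction blocks sample by sample.

    Assumption: weights are positive integers and the result is rounded
    to the nearest integer; equal weights give a plain average.
    """
    total = w0 + w1
    return [
        [(w0 * a + w1 * b + total // 2) // total for a, b in zip(row0, row1)]
        for row0, row1 in zip(pred_l0, pred_l1)
    ]
```

With the default weights, `bi_predict([[100, 50]], [[110, 51]])` averages each pair of co-located samples.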
- the L0 reference picture is illustrated as existing in the previous direction of the current picture (i.e., having a POC value smaller than that of the current picture), and the L1 reference picture is illustrated as existing in the subsequent direction of the current picture (i.e., having a POC value larger than that of the current picture).
- the L0 reference picture may exist in the subsequent direction of the current picture, or the L1 reference picture may exist in the previous direction of the current picture.
- both the L0 reference picture and the L1 reference picture may exist in the previous direction of the current picture, or both may exist in the subsequent direction of the current picture.
- bidirectional prediction may be performed using the L0 reference picture existing in the subsequent direction of the current picture and the L1 reference picture existing in the previous direction of the current picture.
- the motion information of the block on which inter prediction is performed can be stored in memory.
- the motion information can be stored in units of samples.
- the motion information of the block to which a specific sample belongs can be stored as the motion information of the specific sample.
- the stored motion information can be used to derive the motion information of the neighboring block to be encoded/decoded in the future.
- information encoding a residual sample corresponding to a difference between a sample of a current block (i.e., an original sample) and a prediction sample, and motion information required to generate a prediction block can be signaled to the decoder.
- information about the signaled difference can be decoded to derive a difference sample, and a prediction sample in a prediction block generated using motion information can be added to the difference sample to generate a reconstructed sample.
- one of a plurality of inter prediction modes may be selected.
- the plurality of inter prediction modes may include a motion information merging mode and a motion vector prediction mode.
- the motion vector prediction mode is a mode that signals by encoding the difference between a motion vector and a motion vector prediction value.
- the motion vector prediction value can be derived based on motion information of a neighboring block or neighboring sample adjacent to the current block.
- Figure 7 shows the locations referenced to derive motion vector prediction values.
- the current block is assumed to have a size of 4x4.
- 'LB' represents a sample included in the leftmost column and the bottommost row in the current block.
- 'RT' represents a sample included in the rightmost column and the topmost row in the current block.
- A0 to A4 represent samples neighboring to the left of the current block, and B0 to B5 represent samples neighboring to the top of the current block.
- A1 represents a sample neighboring to the left of LB, and B1 represents a sample neighboring to the top of RT.
- Col indicates the location of a sample neighboring the lower right of the current block in a co-located picture.
- a co-located picture is a picture different from the current picture, and information for specifying the co-located picture (e.g., a co-located picture index) can be explicitly encoded and signaled in the bitstream.
- a reference picture having a predefined reference picture index can be set as the co-located picture.
- the motion vector prediction value of the current block can be derived from at least one motion vector prediction candidate included in a motion vector prediction list.
- the number of motion vector prediction candidates that can be inserted into the motion vector prediction list (i.e., the size of the list) may be predefined in the encoder and decoder.
- the maximum number of motion vector prediction candidates may be 2.
- a motion vector stored at the location of a neighboring sample adjacent to the current block or a scaled motion vector derived by scaling the motion vector can be inserted into the motion vector prediction list as a motion vector prediction candidate.
- the neighboring samples adjacent to the current block can be scanned in a predefined order to derive the motion vector prediction candidate.
- the first found available motion vector can be inserted into the motion vector prediction list as a motion vector prediction candidate.
- the motion vector prediction candidate can be derived based on the available vector found first. Specifically, the available motion vector that is found first can be scaled, and then the scaled motion vector can be inserted as a motion vector prediction candidate into the motion vector prediction list.
- the scaling can be performed based on the output order difference between the current picture and the reference picture (i.e., the POC difference) and the output order difference between the current picture and the reference picture of the neighboring sample (i.e., the POC difference).
- in the order of B0 to B5, it can be checked whether a motion vector is stored at each position. Then, according to the above scanning order, the first found available motion vector can be inserted into the motion vector prediction list as a motion vector prediction candidate.
- the motion vector prediction candidate can be derived based on the available vector found first. Specifically, the available motion vector that is found first can be scaled, and then the scaled motion vector can be inserted as a motion vector prediction candidate into the motion vector prediction list.
- the scaling can be performed based on the output order difference between the current picture and the reference picture (i.e., the POC difference) and the output order difference between the current picture and the reference picture of the neighboring sample (i.e., the POC difference).
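The POC-based scaling can be illustrated as follows. This is a sketch, not the normative procedure: the function name `scale_mv` and the round-half-away-from-zero rule are assumptions, and real codecs typically use fixed-point arithmetic with clipping.

```python
def scale_mv(mv, poc_cur, poc_ref_cur, poc_ref_nb):
    """Scale a neighbouring motion vector to the current block's reference.

    tb: POC difference between the current picture and the reference
        picture of the current block.
    td: POC difference between the current picture and the reference
        picture of the neighbouring sample.
    """
    tb = poc_cur - poc_ref_cur
    td = poc_cur - poc_ref_nb
    if td == 0:
        raise ValueError("neighbouring reference picture equals current picture")

    def scale(component):
        num = component * tb
        q, r = divmod(abs(num), abs(td))
        if 2 * r >= abs(td):  # round half away from zero (assumption)
            q += 1
        return q if (num >= 0) == (td > 0) else -q

    return (scale(mv[0]), scale(mv[1]))
```

For example, a neighbouring vector that spans a POC distance of 2 is doubled when the current block refers to a picture at a POC distance of 4.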
- a motion vector prediction candidate can be derived from a sample adjacent to the left of the current block, and a motion vector prediction candidate can be derived from a sample adjacent to the top of the current block.
- the motion vector prediction candidate derived from the left sample may be inserted into the motion vector prediction list before the motion vector prediction candidate derived from the upper sample.
- the index assigned to the motion vector prediction candidate derived from the left sample may have a smaller value than the index assigned to the motion vector prediction candidate derived from the upper sample.
- the motion vector prediction candidates derived from the upper samples can also be inserted into the motion vector prediction list before the motion vector prediction candidates derived from the left samples.
- a motion vector prediction candidate having the highest encoding efficiency can be set as a motion vector predictor (MVP) of a current block.
- index information indicating a motion vector prediction candidate set as a motion vector predictor of a current block among a plurality of motion vector prediction candidates can be encoded and signaled to a decoder.
- the index information can be a 1-bit flag (e.g., an MVP flag).
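- the difference between the motion vector of the current block and the motion vector prediction value is referred to as the motion vector difference (MVD).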
- the decoder can construct a motion vector prediction list in the same manner as the encoder. In addition, it can decode index information from a bitstream and select one of a plurality of motion vector prediction candidates based on the decoded index information. The selected motion vector prediction candidate can be set as the motion vector prediction value of the current block.
- the motion vector difference can be decoded from the bitstream. Afterwards, the motion vector prediction value and the motion vector difference can be added to derive the motion vector of the current block.
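The decoder-side steps above can be sketched as follows. The names are hypothetical, `stored_mvs` is an assumed mapping from position labels (A0 to A4, B0 to B5) to stored motion vectors, and the left scan order, duplicate pruning and zero-vector padding are assumptions added to keep the list at its predefined size of 2. Scaling is omitted for brevity.

```python
MAX_MVP_CANDIDATES = 2  # list size predefined in encoder and decoder


def first_available(positions, stored_mvs):
    """Return the first motion vector found in scan order, or None."""
    for pos in positions:
        mv = stored_mvs.get(pos)
        if mv is not None:
            return mv
    return None


def build_mvp_list(stored_mvs):
    """Left candidate first, then top candidate."""
    left = first_available([f"A{i}" for i in range(5)], stored_mvs)
    top = first_available([f"B{i}" for i in range(6)], stored_mvs)
    candidates = []
    for mv in (left, top):
        if mv is not None and mv not in candidates:  # pruning is an assumption
            candidates.append(mv)
    while len(candidates) < MAX_MVP_CANDIDATES:  # zero padding is an assumption
        candidates.append((0, 0))
    return candidates


def decode_mv(stored_mvs, mvp_flag, mvd):
    """Motion vector = selected predictor + decoded difference."""
    mvp = build_mvp_list(stored_mvs)[mvp_flag]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])
```

Because the encoder and decoder build the same list, only the 1-bit flag and the MVD need to be carried in the bitstream.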
- a motion vector prediction list can be generated for each of the L0 direction and the L1 direction. That is, the motion vector prediction list can be composed of motion vectors in the same direction. Accordingly, the motion vector of the current block and the motion vector prediction candidates included in the motion vector prediction list have the same direction.
- reference picture index and prediction direction information can be explicitly encoded and signaled to the decoder.
- a reference picture index for specifying a reference picture from which motion information of the current block is derived among the multiple reference pictures can be explicitly encoded and signaled to the decoder.
- if the reference picture list contains only one reference picture, encoding/decoding of the reference picture index may be omitted.
- the prediction direction information may be an index pointing to one of L0 unidirectional prediction, L1 unidirectional prediction, or bidirectional prediction.
- an L0 flag indicating whether prediction is performed in the L0 direction and an L1 flag indicating whether prediction is performed in the L1 direction may be encoded and signaled, respectively.
- Motion information merge mode is a mode in which the motion information of the current block is set to be the same as the motion information of the neighboring block.
- motion information can be encoded/decoded using the motion information merge list.
- Motion information merging candidates can be derived based on motion information of neighboring blocks or neighboring samples adjacent to the current block. For example, after defining a location to be referenced around the current block, it can be checked whether motion information exists at the defined reference location. If motion information exists at the defined reference location, the motion information at that location can be inserted into the motion information merging list as a motion information merging candidate.
- the predefined reference positions may include at least one of A0, A1, B0, B1, B5, and Col.
- the motion information merging candidates may be derived in the order of A1, B1, B0, A0, B5, and Col.
- the motion information of the motion information merging candidate with the optimal cost can be set as the motion information of the current block.
- index information (e.g., a merge index) indicating the motion information merging candidate selected from among a plurality of motion information merging candidates can be encoded and transmitted to the decoder.
- a motion information merge list can be constructed in the same manner as in the encoder. Then, a motion information merge candidate can be selected based on a merge index decoded from a bitstream. The motion information of the selected motion information merge candidate can be set as the motion information of the current block.
- the motion information merge list is composed of a single list regardless of the prediction direction. That is, the motion information merge candidates included in the motion information merge list may have only L0 motion information or L1 motion information, or may have bidirectional motion information (i.e., L0 motion information and L1 motion information).
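A minimal sketch of merge list construction and selection, assuming motion information is represented as a dictionary with optional "L0" and "L1" entries (a hypothetical representation) and that duplicate candidates are pruned (also an assumption):

```python
MERGE_SCAN_ORDER = ("A1", "B1", "B0", "A0", "B5", "Col")


def build_merge_list(motion_at):
    """Collect available motion information in the predefined order.

    `motion_at` maps a reference position to its motion information,
    e.g. {"L0": ((mvx, mvy), ref_idx)}; positions without motion
    information are simply absent from the mapping.
    """
    merge_list = []
    for pos in MERGE_SCAN_ORDER:
        info = motion_at.get(pos)
        if info is not None and info not in merge_list:
            merge_list.append(info)
    return merge_list


def decode_merge(motion_at, merge_idx):
    """The current block inherits the selected candidate unchanged."""
    return build_merge_list(motion_at)[merge_idx]
```

Unlike the motion vector prediction list, a single list holds unidirectional and bidirectional candidates side by side.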
- the motion information of the current block can also be derived by using the restoration sample area around the current block.
- the restoration sample area used to derive the motion information of the current block can also be called a template.
- Figure 8 is a diagram for explaining a template-based motion estimation method.
- the prediction block of the current block is determined based on the cost between the current block and the reference block within the search range.
- motion estimation for the current block can be performed based on the cost between the template neighboring the current block (hereinafter referred to as the current template) and the reference template having the same size and shape as the current template.
- the cost can be calculated based on the sum of absolute differences (SAD) between the restored samples in the current template and the restored samples in the reference template. The smaller the sum, the lower the cost.
- the reference block neighboring the reference template can be set as the predicted block of the current block.
- the motion information of the current block can be set.
- the decoder itself can perform motion estimation in the same manner as the encoder. Accordingly, when deriving motion information using a template, there is no need to encode and signal the motion information other than information indicating whether or not the template is used.
- the current template may include at least one of an area adjacent to the top of the current block or an area adjacent to the left of the current block, wherein the area adjacent to the top may include at least one row, and the area adjacent to the left may include at least one column.
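Template-based estimation can be sketched as below. The function names, the default of one row and one column, and the square search window are assumptions; the current block is also assumed not to touch the top or left picture edge, so that the template exists.

```python
def template_positions(x0, y0, bw, bh, rows=1, cols=1):
    """Coordinates of the template: `rows` rows above and `cols` columns
    to the left of the block whose top-left sample is (x0, y0)."""
    top = [(x0 + i, y0 - r) for r in range(1, rows + 1) for i in range(bw)]
    left = [(x0 - c, y0 + j) for c in range(1, cols + 1) for j in range(bh)]
    return top + left


def template_search(cur_rec, ref, x0, y0, bw, bh, search=2):
    """Return ((dx, dy), cost) minimising the SAD between the current
    template (restored samples of the current picture) and the reference
    template displaced by (dx, dy) in the reference picture."""
    tmpl = template_positions(x0, y0, bw, bh)
    height, width = len(ref), len(ref[0])
    # Every template sample and the bottom-right block sample must stay
    # inside the reference picture after displacement.
    must_fit = tmpl + [(x0 + bw - 1, y0 + bh - 1)]
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if any(not (0 <= x + dx < width and 0 <= y + dy < height)
                   for x, y in must_fit):
                continue
            cost = sum(abs(cur_rec[y][x] - ref[y + dy][x + dx]) for x, y in tmpl)
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best
```

Only already-restored samples are read, which is why the decoder can repeat the same search without receiving a motion vector.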
- Figure 9 shows examples of template configurations.
- the current template can be constructed following one of the examples illustrated in FIG. 9.
- the template may be composed of only the area adjacent to the left of the current block, or only the area adjacent to the top of the current block.
- the size and/or shape of the current template may be predefined in the encoder and decoder.
- one of the multiple template candidates may be adaptively selected based on at least one of the size, shape, or position of the current block. For example, if the current block is adjacent to the upper boundary of the CTU, the current template may be composed of only the region adjacent to the left of the current block.
- Motion estimation based on a template can be performed on each of the reference pictures stored in the reference picture list. Alternatively, motion estimation can be performed on only some of the reference pictures. For example, motion estimation can be performed only on reference pictures having a reference picture index of 0, or motion estimation can be performed only on reference pictures having a reference picture index smaller than a threshold or reference pictures having a POC difference from the current picture smaller than a threshold.
- the reference picture index can be explicitly encoded and signaled, and then motion estimation can be performed only for the reference picture pointed to by the reference picture index.
- motion estimation can be performed targeting reference pictures of neighboring blocks corresponding to the current template. For example, if the template is composed of a left neighboring region and an upper neighboring region, at least one reference picture can be selected using at least one of the reference picture index of the left neighboring block or the reference picture index of the upper neighboring block. Thereafter, motion estimation can be performed targeting at least one selected reference picture.
- Information indicating whether motion estimation based on template is applied can be encoded and signaled to a decoder.
- the information can be a 1-bit flag. For example, if the flag is true (1), it indicates that motion estimation based on template is applied in the L0 direction and the L1 direction of the current block. On the other hand, if the flag is false (0), it indicates that motion estimation based on template is not applied. In this case, motion information of the current block can be derived based on the motion information merging mode or the motion vector prediction mode.
- template-based motion estimation can be applied only when it is determined that neither the motion information merging mode nor the motion vector prediction mode is applied to the current block. For example, when the first flag indicating whether the motion information merging mode is applied and the second flag indicating whether the motion vector prediction mode is applied are both 0, template-based motion estimation can be performed.
- template-based motion estimation can be applied to one of the L0 direction and the L1 direction, while another mode (e.g., motion information merging mode or motion vector prediction mode) can be applied to the other.
- the prediction block of the current block can be generated based on a weighted sum operation of the L0 prediction block and the L1 prediction block.
- a motion estimation method based on a template may be inserted as a motion information merging candidate in a motion information merging mode or a motion vector prediction candidate in a motion vector prediction mode.
- whether or not to apply a motion estimation method based on a template may be determined based on whether the selected motion information merging candidate or the selected motion vector prediction candidate indicates a motion estimation method based on a template.
- Figure 10 is a diagram for explaining a motion estimation method based on a bilateral matching method.
- the bilateral matching method can be performed only when the temporal order (i.e., POC) of the current picture exists between the temporal order of the L0 reference picture and the temporal order of the L1 reference picture.
- a search range can be set for each of the L0 reference picture and the L1 reference picture.
- an L0 reference picture index for identifying the L0 reference picture and an L1 reference picture index for identifying the L1 reference picture can be encoded and signaled, respectively.
- only the L0 reference picture index may be encoded and signaled, and an L1 reference picture may be selected based on a distance between the current picture and the L0 reference picture (hereinafter referred to as the L0 POC difference).
- an L1 reference picture included in the L1 reference picture list whose absolute distance from the current picture (hereinafter referred to as the L1 POC difference) is equal to the absolute value of the L0 POC difference may be selected. If there is no L1 reference picture having an L1 POC difference equal to the L0 POC difference, the L1 reference picture whose L1 POC difference is closest to the L0 POC difference may be selected from among the L1 reference pictures.
- among the L1 reference pictures, only those that have a different temporal direction from the L0 reference picture can be used for bilateral matching. For example, if the POC of the L0 reference picture is smaller than that of the current picture, one of the L1 reference pictures that has a larger POC than that of the current picture can be selected.
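The implicit L1 selection can be illustrated as follows. Combining the two rules (closest POC distance, opposite temporal direction) in one function is an assumption, as are the name `select_l1_ref` and the tie-break toward the lowest index.

```python
def select_l1_ref(poc_cur, poc_l0, l1_pocs):
    """Return the index in the L1 list of the picture to pair with the
    signalled L0 reference picture, or None if no picture qualifies.

    `l1_pocs` lists the POC of each L1 reference picture in index order.
    """
    d0 = poc_l0 - poc_cur
    # Keep only L1 pictures on the opposite temporal side of the current picture.
    candidates = [
        (idx, poc) for idx, poc in enumerate(l1_pocs) if (poc - poc_cur) * d0 < 0
    ]
    if not candidates:
        return None
    # Prefer an equal absolute POC difference, otherwise the closest one.
    idx, _ = min(candidates, key=lambda c: abs(abs(c[1] - poc_cur) - abs(d0)))
    return idx
```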
- a bilateral matching method may be performed using the L0 reference picture having the closest distance to the current picture among the L0 reference pictures and the L1 reference picture having the closest distance to the current picture among the L1 reference pictures.
- a bilateral matching method may be performed using an L0 reference picture (e.g., index 0) assigned with a predefined index in the L0 reference picture list and an L1 reference picture (e.g., index 0) assigned with a predefined index in the L1 reference picture list.
- the LX (X is 0 or 1) reference picture is selected based on an explicitly signaled reference picture index, and the L
- the L0 and/or L1 reference pictures can be selected based on the motion information of the neighboring blocks of the current block.
- the L0 and/or L1 reference pictures to be used for bilateral matching can be selected using the reference picture index of the left or upper neighboring block of the current block.
- the search range can be set within a predetermined range from a collocated block in a reference picture.
- the search range can be set based on the initial motion information.
- the initial motion information can be derived from the neighboring blocks of the current block.
- the motion information of the left neighboring block or the upper neighboring block of the current block can be set as the initial motion information of the current block.
- the L0 motion vector and the L1 motion vector are set to opposite directions. That is, the components of the L0 motion vector and the L1 motion vector have opposite signs.
- the size of the LX motion vector can be proportional to the distance between the current picture and the LX reference picture (i.e., the POC difference).
- a reference block belonging to the search range of the L0 reference picture is referred to as an L0 reference block.
- a reference block belonging to the search range of the L1 reference picture is referred to as an L1 reference block.
- when an L0 reference block located at a position (x, y) away from the current block is selected, an L1 reference block located at a position (-Dx, -Dy) away from the current block can be selected.
- D can be determined by the ratio of the distance between the L1 reference picture and the current picture to the distance between the current picture and the L0 reference picture.
- the absolute value of the distance between the current picture (T) and the L0 reference picture (T-1) and the absolute value of the distance between the current picture (T) and the L1 reference picture (T+1) are equal to each other. Accordingly, in the illustrated example, the L0 motion vector (x0, y0) and the L1 motion vector (x1, y1) have equal magnitudes but opposite directions. If an L1 reference picture with a POC of (T+2) were used, the L1 motion vector (x1, y1) would be set to (-2*x0, -2*y0).
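The relationship between the two motion vectors can be written as a short sketch. The name `mirror_mv` is hypothetical, and exact fractions are used only to avoid committing to a rounding rule that the text does not specify.

```python
from fractions import Fraction


def mirror_mv(mv_l0, poc_cur, poc_l0, poc_l1):
    """Derive the L1 motion vector from the L0 motion vector.

    D is the ratio of the L1 POC distance to the L0 POC distance, so the
    L1 vector points in the opposite direction and is D times as long.
    """
    d0 = poc_cur - poc_l0
    d1 = poc_l1 - poc_cur
    if d0 * d1 <= 0:
        raise ValueError("current picture must lie between the two references")
    ratio = Fraction(d1, d0)
    return (-ratio * mv_l0[0], -ratio * mv_l0[1])
```

With references at T-1 and T+1 the vector is simply negated; with references at T-1 and T+2 it is negated and doubled, matching the example above.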
- the L0 reference block and the L1 reference block can be set as the L0 prediction block and the L1 prediction block of the current block, respectively. Thereafter, the final prediction block of the current block can be generated through a weighted sum operation of the L0 reference block and the L1 reference block.
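A compact sketch of the whole bilateral search follows. It assumes equal POC distances on both sides (D = 1), measures the cost as the SAD between the L0 and L1 reference blocks (the usual choice for bilateral matching, stated here as an assumption), and averages the two blocks with equal weights. All names are hypothetical.

```python
def block(pic, x, y, bw, bh):
    """Extract the bw x bh block whose top-left sample is (x, y)."""
    return [row[x:x + bw] for row in pic[y:y + bh]]


def bilateral_search(ref_l0, ref_l1, x0, y0, bw, bh, search=2):
    """Return (mv_l0, mv_l1, prediction) for the best mirrored pair.

    Both pictures are assumed to have the same size, and the current
    block is assumed to lie inside the picture, so the zero displacement
    is always a valid candidate.
    """
    height, width = len(ref_l0), len(ref_l0[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            xa, ya = x0 + dx, y0 + dy  # L0 reference block position
            xb, yb = x0 - dx, y0 - dy  # mirrored L1 reference block position
            if not all(0 <= x <= width - bw for x in (xa, xb)):
                continue
            if not all(0 <= y <= height - bh for y in (ya, yb)):
                continue
            b0 = block(ref_l0, xa, ya, bw, bh)
            b1 = block(ref_l1, xb, yb, bw, bh)
            cost = sum(abs(a - b) for r0, r1 in zip(b0, b1) for a, b in zip(r0, r1))
            if best is None or cost < best[0]:
                best = (cost, (dx, dy), (-dx, -dy), b0, b1)
    _, mv_l0, mv_l1, b0, b1 = best
    # Equal-weight combination of the two reference blocks, rounded to nearest.
    pred = [[(a + b + 1) // 2 for a, b in zip(r0, r1)] for r0, r1 in zip(b0, b1)]
    return mv_l0, mv_l1, pred
```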
- the decoder can perform motion estimation in the same manner as the encoder. Accordingly, information indicating whether the bilateral motion matching method is applied can be explicitly encoded/decoded, while encoding/decoding of motion information such as a motion vector can be omitted. As described above, at least one of the L0 reference picture index or the L1 reference picture index can be explicitly encoded/decoded.
- Information indicating whether the bilateral matching method is applied may be a 1-bit flag. For example, if the flag is true (e.g., 1), it may indicate that the bilateral matching method is applied to the current block. If the flag is false (e.g., 0), it may indicate that the bilateral matching method is not applied to the current block. In this case, the motion information merging mode or the motion vector prediction mode may be applied to the current block.
- the bilateral matching method may be applied only when it is determined that neither the motion information merging mode nor the motion vector prediction mode is applied to the current block.
- the bilateral matching method may be applied when both the first flag indicating whether the motion information merging mode is applied and the second flag indicating whether the motion vector prediction mode is applied are 0.
- the bilateral matching method may be inserted as a motion information merging candidate in the motion information merging mode or a motion vector prediction candidate in the motion vector prediction mode.
- whether the bilateral matching method is applied may be determined based on whether the selected motion information merging candidate or the selected motion vector prediction candidate indicates the bilateral matching method.
- the temporal order of the current picture must exist between the temporal order of the L0 reference picture and the temporal order of the L1 reference picture.
- a unidirectional matching method may also be applied to generate a prediction block of the current block.
- two reference pictures having a temporal order (i.e., POC)
- both reference pictures may be derived from the L0 reference picture list or the L1 reference picture list.
- one of the two reference pictures may be derived from the L0 reference picture list and the other may be derived from the L1 reference picture list.
- the one-way matching method can be performed based on two reference pictures having a POC smaller than that of the current picture (i.e., forward reference pictures) or two reference pictures having a POC larger than that of the current picture (i.e., backward reference pictures).
- In Fig. 11, motion estimation based on the unidirectional matching method is illustrated as being performed based on a first reference picture (T-1) and a second reference picture (T-2), each having a POC smaller than that of the current picture (T).
- a first reference picture index for identifying the first reference picture and a second reference picture index for identifying the second reference picture may be encoded and signaled, respectively.
- a reference picture having a smaller POC difference from the current picture among the two reference pictures used in the unidirectional matching method may be set as the first reference picture.
- only reference pictures in the reference picture list that have a larger POC difference from the current picture than the first reference picture may be set as the second reference picture.
- the second reference picture index may be set to point to an index of one of the rearranged reference pictures after rearranging reference pictures having the same temporal direction as the first reference picture and having a larger POC difference from the current picture than the first reference picture.
- a reference picture having a larger POC difference from the current picture among the two reference pictures may be set as the first reference picture.
- the second reference picture index may be set to point to an index of one of the rearranged reference pictures after rearranging the reference pictures having the same temporal direction as the first reference picture and having a smaller POC difference from the current picture than the first reference picture.
- the unidirectional matching method may be performed using a reference picture to which a predefined index is assigned in the reference picture list and another reference picture having the same temporal direction as that reference picture.
- a reference picture having an index of 0 in the reference picture list may be set as the first reference picture, and a reference picture having the smallest index among reference pictures having the same temporal direction as the first reference picture in the reference picture list may be selected as the second reference picture.
- Both the first reference picture and the second reference picture can be selected from the L0 reference picture list or the L1 reference picture list.
- two L0 reference pictures are illustrated as being used in the unidirectional matching method.
- the first reference picture may be selected from the L0 reference picture list and the second reference picture may be selected from the L1 reference picture list.
- Information indicating whether the first reference picture and/or the second reference picture belongs to the L0 reference picture list or the L1 reference picture list may be additionally encoded/decoded.
- unidirectional matching can be performed using whichever of the L0 reference picture list and the L1 reference picture list is set as the default.
- two reference pictures can be selected from whichever of the L0 reference picture list and the L1 reference picture list has the larger number of reference pictures.
- a search range within the first reference picture and the second reference picture can be set.
- the search range can be set within a predetermined range from a collocated block in a reference picture.
- the search range can be set based on the initial motion information.
- the initial motion information can be derived from the neighboring blocks of the current block.
- the motion information of the left neighboring block or the upper neighboring block of the current block can be set as the initial motion information of the current block.
- the magnitude of the motion vector should be set to increase in proportion to the distance between the current picture and the reference picture.
- the second reference block should be spaced apart from the current block by (Dx, Dy).
- D can be determined by the ratio of the distance between the current picture and the first reference picture and the distance between the current picture and the second reference picture.
- the distance between the current picture and the first reference picture (i.e., the POC difference)
- the distance between the current picture and the second reference picture (i.e., the POC difference)
- the first motion vector for the first reference block in the first reference picture is (x0, y0)
- the second motion vector (x1, y1) for the second reference block in the second reference picture can be set to (2x0, 2y0).
- the first reference block and the second reference block having the optimal cost can be set as the first prediction block and the second prediction block of the current block, respectively. Thereafter, the final prediction block of the current block can be generated through a weighted sum operation of the first prediction block and the second prediction block.
- the decoder can perform motion estimation in the same manner as the encoder. Accordingly, information indicating whether the unidirectional matching method is applied can be explicitly encoded/decoded, while encoding/decoding of motion information such as a motion vector can be omitted. As described above, at least one of the first reference picture index or the second reference picture index can be explicitly encoded/decoded.
- information indicating whether the unidirectional matching method is applied may be explicitly encoded/decoded, and if the unidirectional matching method is applied, the first motion vector or the second motion vector may be explicitly encoded and signaled. If the first motion vector is signaled, the second motion vector may be derived based on the POC difference between the current picture and the first reference picture and the POC difference between the current picture and the second reference picture. If the second motion vector is signaled, the first motion vector may be derived based on the same two POC differences. In this case, the encoder may explicitly encode the smaller of the first motion vector and the second motion vector.
- Information indicating whether the unidirectional matching method is applied may be a 1-bit flag. For example, if the flag is true (e.g., 1), it may indicate that the unidirectional matching method is applied to the current block. If the flag is false (e.g., 0), it may indicate that the unidirectional matching method is not applied to the current block; in that case, a motion information merging mode or a motion vector prediction mode may be applied to the current block instead.
- the unidirectional matching method may be applied only when it is determined that neither the motion information merging mode nor the motion vector prediction mode is applied to the current block.
- the unidirectional matching method may be applied when the first flag indicating whether the motion information merging mode is applied and the second flag indicating whether the motion vector prediction mode is applied are both 0.
- the unidirectional matching method may be inserted as a motion information merging candidate in the motion information merging mode or a motion vector prediction candidate in the motion vector prediction mode.
- whether the unidirectional matching method is applied may be determined based on whether the selected motion information merging candidate or the selected motion vector prediction candidate indicates the unidirectional matching method.
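The distance-proportional derivation of the second motion vector and the decoder-side search described above for unidirectional matching can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the disclosure: the names `scale_mv`, `sad`, `inside`, and `one_way_search` are hypothetical, the cost is assumed to be a sum of absolute differences (SAD) between the two reference blocks, the search visits integer positions only, and floor division is used for rounding.

```python
def scale_mv(mv, dist_from, dist_to):
    """Scale a motion vector in proportion to POC distance.

    mv points to a picture at POC distance dist_from from the current
    picture; the result points to a picture at distance dist_to.
    Floor division is an assumption; a real codec defines its own rounding.
    """
    if dist_from == 0:
        raise ValueError("reference picture must differ from the current picture")
    return (mv[0] * dist_to // dist_from, mv[1] * dist_to // dist_from)


def sad(pic_a, ax, ay, pic_b, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(pic_a[ay + j][ax + i] - pic_b[by + j][bx + i])
        for j in range(size)
        for i in range(size)
    )


def inside(pic, x, y, size):
    """True if the size x size block at (x, y) lies fully inside pic."""
    return 0 <= x and 0 <= y and x + size <= len(pic[0]) and y + size <= len(pic)


def one_way_search(ref1, ref2, x, y, size, d1, d2, search_range):
    """Return (cost, mv1, mv2) with the lowest SAD between the two reference blocks."""
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            mv1 = (dx, dy)
            # When d2 == 2 * d1, (x0, y0) becomes (2 * x0, 2 * y0).
            mv2 = scale_mv(mv1, d1, d2)
            if not (inside(ref1, x + mv1[0], y + mv1[1], size)
                    and inside(ref2, x + mv2[0], y + mv2[1], size)):
                continue  # skip candidates that leave either picture
            cost = sad(ref1, x + mv1[0], y + mv1[1],
                       ref2, x + mv2[0], y + mv2[1], size)
            if best is None or cost < best[0]:
                best = (cost, mv1, mv2)
    return best
```

With reference pictures at T-1 and T-2 (`d1 = 1`, `d2 = 2`), a first motion vector (x0, y0) maps to (2x0, 2y0), which matches the example given in the text. Because the decoder can repeat the same search, only the flag and, optionally, the reference picture indices would need to be signaled.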
- the position of each pixel in a picture is specified as an integer.
- the movement of an object between pictures may not be expressible as an integer position.
- Figures 12 and 13 illustrate examples in which prediction blocks are generated according to the precision of a motion vector.
- Figure 12 shows the location of the current block within the current picture
- Figure 13 shows an example in which a prediction block is obtained according to a motion vector.
- (a) of FIG. 13 shows an example in which the motion vector precision is in integer pixel units
- (b) and (c) of FIG. 13 show examples in which the motion vector precision is in 1/2 pixel units and 1/4 pixel units, respectively.
- the motion vector precision can also be set in smaller units than those shown.
- the motion vector precision can be set in 1/8 pixel units, 1/16 pixel units, or 1/32 pixel units.
- a reference block composed of integer position samples can be set as the prediction block of the current block, as in the example illustrated in (a) of Fig. 13.
- a reference block composed of fractional position samples can be set as a prediction block of the current block.
- the fractional position samples in the reference block can be generated by interpolating integer position samples.
- the interpolation filter can have a size of 4 taps or 8 taps.
- fractional position samples can be generated via linear interpolation using only integer position samples adjacent to the fractional position.
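The linear-interpolation case described above can be sketched as follows. This is only an illustration: the names `clamp`, `sample_at`, and `predict_block` are hypothetical, bilinear weighting of the four surrounding integer samples is assumed, picture edges are handled by simple clamping, and the 4-tap or 8-tap filters mentioned in the text are not modeled.

```python
import math


def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def sample_at(picture, x, y):
    """Bilinear sample of picture (a list of rows) at a possibly fractional (x, y)."""
    height, width = len(picture), len(picture[0])
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    # Clamp the integer positions to the picture (simple edge padding, an assumption).
    xa, xb = clamp(x0, 0, width - 1), clamp(x0 + 1, 0, width - 1)
    ya, yb = clamp(y0, 0, height - 1), clamp(y0 + 1, 0, height - 1)
    top = (1 - fx) * picture[ya][xa] + fx * picture[ya][xb]
    bottom = (1 - fx) * picture[yb][xa] + fx * picture[yb][xb]
    return (1 - fy) * top + fy * bottom


def predict_block(picture, x, y, mv, precision_denominator, width, height):
    """Fetch a width x height prediction block for an MV stored in
    1/precision_denominator pixel units (e.g. 4 for quarter-pixel)."""
    off_x = mv[0] / precision_denominator
    off_y = mv[1] / precision_denominator
    return [
        [sample_at(picture, x + i + off_x, y + j + off_y) for i in range(width)]
        for j in range(height)
    ]
```

When the motion vector is a multiple of the denominator, every sample position is an integer and the block is copied directly, which corresponds to the integer-precision case.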
- Information indicating the motion vector precision of the current block can be encoded and signaled. For example, after assigning a different index to each of a plurality of motion vector precision candidates, the index of the motion vector precision candidate corresponding to the motion vector precision of the current block can be encoded and signaled.
- the number and/or types of available motion vector precision candidates can be determined based on at least one of the size of the current block, the shape of the current block, the reference picture, or the motion compensation model.
- the motion compensation model can include at least one of a translation model, a zooming model, or a rotation model.
- a motion compensation model in which at least one of a zooming model or a rotation model is combined with a translation model may be referred to as an affine model.
- An index pointing to one of the motion vector precision candidates available for the current block may be encoded.
- the maximum number of bits required to encode the index may be determined.
- By adjusting the precision of the motion vector, the motion vector can be searched more precisely, and thus the prediction accuracy for the current block can be improved.
- motion vectors expressed in fractional positions can be scaled up to integers and encoded.
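Two of the points above lend themselves to a short sketch: storing a fractional motion vector as an integer count of precision units, and bounding the number of bits needed for a candidate index. The names `mv_to_integer_units` and `max_index_bits` are hypothetical, and a fixed-length index code is assumed; the text does not specify the binarization.

```python
from fractions import Fraction


def mv_to_integer_units(mv_pixels, precision):
    """Express a fractional-pixel MV component as an integer count of precision units.

    Example: 1.25 pixels at quarter-pixel precision is stored as 5.
    """
    scaled = Fraction(mv_pixels) / Fraction(precision)
    if scaled.denominator != 1:
        raise ValueError("motion vector is not a multiple of the given precision")
    return scaled.numerator


def max_index_bits(num_candidates):
    """Bits needed to address num_candidates entries with a fixed-length code (assumed)."""
    if num_candidates < 1:
        raise ValueError("need at least one candidate")
    return (num_candidates - 1).bit_length()
```

Under this assumption, a block with three available precision candidates needs at most 2 bits for the index, while a block with a single candidate needs none.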
- Compensation for the motion of the object may be performed based on at least one of a translational model for compensating for linear motion of the object (e.g., motion in the horizontal and/or vertical direction), a zooming model for compensating for change in size of the object, and a rotational model for compensating for rotational motion of the object.
- zooming may represent size enlargement or size reduction.
- Figure 14 shows an example in which motion compensation based on the translational model and the zooming model is performed for the current block.
- the current block is assumed to have a size of 4x4, as illustrated in Fig. 12.
- variable ⁇ represents a size adjustment parameter.
- the size of the reference block can be derived by multiplying the size of the current block by the size adjustment parameter.
- a size adjustment parameter less than 1 indicates that the reference block is smaller than the current block, and a size adjustment parameter greater than 1 indicates that the reference block is larger than the current block.
- Figures 14 (a) and (b) show examples in which the size adjustment parameter is less than 1, and Figure 14 (c) shows an example in which the size adjustment parameter is greater than 1.
- the upper left position of the reference block can be specified. Specifically, a position spaced apart by the motion vector from a position corresponding to the upper left sample of the current block in the reference picture can be set as the upper left position of the reference block. Then, according to the size adjustment parameter, a reference block whose width and height are the width and height of the current block multiplied by the size adjustment parameter, respectively, can be set.
- the fractional position samples in the reference block can be generated by interpolating the integer position samples.
- a reference block derived by the motion vector and the size adjustment parameter can be set as a prediction block of the current block.
- information about the size adjustment parameter may be encoded and signaled. Specifically, a different index may be assigned to each of a plurality of size adjustment parameter candidates, and an index specifying the size adjustment parameter candidate applied to the current block may be encoded and signaled.
- the size adjustment parameter of the current block can be derived based on the size adjustment parameters of neighboring blocks. For example, the size adjustment parameter of a neighboring block at a predefined position can be set as the size adjustment parameter of the current block.
- the size adjustment parameter of the first searched available neighboring block can be set as the size adjustment parameter of the current block.
- the size adjustment parameter of the neighboring block may be set as the size adjustment parameter candidate.
- a plurality of neighboring blocks may be sequentially searched to generate a size adjustment parameter candidate list including a plurality of size adjustment parameter candidates.
- One of the plurality of size adjustment parameter candidates included in the size adjustment parameter candidate list may be set as the size adjustment parameter of the current block.
- an index indicating a candidate that is identical to the size adjustment parameter of the current block among the plurality of size adjustment parameter candidates may be encoded and signaled.
- the neighboring blocks used to derive the size adjustment parameters of the current block may include at least one of an upper neighboring block, a left neighboring block, an upper-left neighboring block, an upper-right neighboring block, or a lower-left neighboring block.
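The zooming model described above can be sketched as follows. Everything here is an assumption made for illustration: the names `reference_block_size`, `zoom_reference_positions`, and `derive_scale_from_neighbors` are hypothetical, each prediction sample is assumed to map to the reference block by stepping `scale` pixels per sample from its upper left position, and a default of 1.0 is assumed when no neighboring block is available.

```python
def reference_block_size(width, height, scale):
    """Width and height of the reference block: current block size times scale."""
    return (width * scale, height * scale)


def zoom_reference_positions(block_x, block_y, mv, width, height, scale):
    """Reference-picture position for each prediction sample of the current block.

    The upper left of the reference block is the current block's upper left
    displaced by mv; stepping by `scale` per sample is an assumed mapping.
    """
    ref_x = block_x + mv[0]
    ref_y = block_y + mv[1]
    return [
        [(ref_x + scale * i, ref_y + scale * j) for i in range(width)]
        for j in range(height)
    ]


def derive_scale_from_neighbors(neighbor_scales, default=1.0):
    """Return the parameter of the first available neighbor (None = unavailable)."""
    for scale in neighbor_scales:
        if scale is not None:
            return scale
    return default  # assumed fallback: no zooming
```

Positions that fall between integer samples would then be filled by interpolation, as the text notes for fractional position samples.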
- Figure 15 shows an example in which motion compensation based on translational models and rotational models is performed for the current block.
- the current block is assumed to have a size of 4x4, as illustrated in Fig. 12.
- the position of a temporary block in a reference picture can be specified based on the motion vector of the current block. Specifically, a block position that takes a position spaced apart by the motion vector from the position corresponding to the upper left sample of the current block in the reference picture as the upper left sample can be specified.
- the temporary block can be rotated, as in the example shown in (b) of Fig. 15.
- the block at the rotated position is set as a reference block, and the reference block can be set as a prediction block of the current block.
- a rotation matrix can be used to rotate a temporary block specified by a motion vector. That is, a prediction sample for the current block can be set to a sample at a position obtained by applying a rotation matrix to a sample position within the temporary block.
- Mathematical expression 1 represents the rotation matrix
- (pos_x, pos_y) represents the position of a sample within a temporary block. That is, (pos_x, pos_y) can be derived by adding a motion vector to the position of a prediction target sample within the current block.
- the sample value at the (pos_x', pos_y') position in the reference picture can be set as the value of the prediction sample for the position of the prediction target sample. If the (pos_x', pos_y') position is a fractional position, the sample at the corresponding position can be generated by interpolating integer position samples.
- information indicating the rotation angle θ can be encoded and signaled. For example, after assigning a different index to each of a plurality of rotation angle candidates, the index of the rotation angle candidate corresponding to the rotation angle of the current block can be encoded and signaled.
- the rotation angle of the current block can be derived based on the rotation angle of the neighboring block.
- the rotation angle of the neighboring block at a predefined position can be set to the rotation angle of the current block.
- the rotation angle of the first searched available neighboring block can be set to the rotation angle of the current block.
- the rotation angle of the neighboring block may be set as the rotation angle candidate.
- a plurality of neighboring blocks may be sequentially searched to generate a rotation angle candidate list including a plurality of rotation angle candidates.
- One of the plurality of rotation angle candidates included in the rotation angle candidate list may be set as the rotation angle of the current block.
- an index indicating a candidate having the same rotation angle as the current block among the plurality of rotation angle candidates may be encoded and signaled.
- the neighboring block used to derive the rotation angle of the current block may include at least one of an upper neighboring block, a left neighboring block, an upper-left neighboring block, an upper-right neighboring block, or a lower-left neighboring block.
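The rotational model described above can be sketched as follows. Mathematical expression 1 is not reproduced in this text, so the sketch assumes the standard two-dimensional rotation matrix; the rotation center (the upper left of the temporary block), the sign convention, and the names `rotate_position` and `rotated_sample_positions` are likewise assumptions for illustration only.

```python
import math


def rotate_position(pos, theta, center=(0.0, 0.0)):
    """Rotate pos by theta radians about center using the standard 2-D rotation matrix."""
    dx, dy = pos[0] - center[0], pos[1] - center[1]
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return (
        center[0] + cos_t * dx - sin_t * dy,
        center[1] + sin_t * dx + cos_t * dy,
    )


def rotated_sample_positions(block_x, block_y, mv, width, height, theta):
    """Position (pos_x', pos_y') in the reference picture for each prediction sample.

    (pos_x, pos_y) is the sample position in the current block plus mv, i.e. a
    position inside the temporary block; rotating about the temporary block's
    upper left position is an assumption.
    """
    origin = (block_x + mv[0], block_y + mv[1])
    return [
        [rotate_position((origin[0] + i, origin[1] + j), theta, origin)
         for i in range(width)]
        for j in range(height)
    ]
```

With `theta = 0` the positions reduce to those of the purely translational model; non-integer results would be filled by interpolation, as the text describes.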
- the motion vector precision for the current block or the number and/or types of motion vector precision candidates available for the current block may be determined differently depending on the motion compensation model.
- the number and/or type of motion vector precision candidates available for the current block may differ between cases where only a translational model is applied and cases where at least one of a zooming model or a rotational model is applied.
- candidates of 1/4 pixel unit or more may be available for the current block.
- candidates of 1/16 pixel unit or more may be available for the current block.
- the motion vector precision of the current block may be set to 1/4 pixel units.
- the motion vector precision of the current block may be set to 1/16 pixel units.
- the available motion vector precisions or available motion vector precision candidates for each motion compensation model may be pre-stored.
- information indicating the available motion vector precisions or available motion vector precision candidates for each motion compensation model may be encoded and signaled through the upper header.
- Motion compensation based on an affine model, in which a zooming model and/or a rotation model is added to a translational model, can be performed by using the motion vectors of control points.
- the control point may correspond to a corner of the current block.
- at least one of the motion vector of the upper left corner, the motion vector of the upper right corner, or the motion vector of the lower left corner can be used.
- control point motion vector: the motion vector of a control point.
- Figures 16 and 17 show examples of generating a prediction block for a current block using control point motion vectors.
- the current block is assumed to have a size of 4x4, as illustrated in Fig. 12.
- a prediction block for the current block is derived by a motion vector of a first control point corresponding to the upper left corner of the current block (a first control point motion vector, A) and a motion vector of a second control point corresponding to the upper right corner of the current block (a second control point motion vector, B).
- a prediction block of the current block may also be derived by additionally utilizing the motion vector of the lower left corner, or by utilizing the motion vector of the lower left corner instead of that of the upper right corner.
- Figure 18 shows an example of generating a prediction block for the current block using three control point motion vectors.
- a prediction block for the current block is derived by a motion vector of a first control point corresponding to the upper left corner of the current block (first control point motion vector, A), a motion vector of a second control point corresponding to the upper right corner of the current block (second control point motion vector, B), and a motion vector of a third control point corresponding to the lower left corner of the current block (third control point motion vector, C).
- translational, zooming, and rotational motion compensation for the current block can be performed using two control point motion vectors or three control point motion vectors.
- Information indicating the number of control point motion vectors can be encoded and signaled.
- the information can be signaled on a block-by-block basis.
- the information can indicate whether two or three control point motion vectors are used in the current block.
- the number of control point motion vectors can be adaptively determined based on at least one of the size or shape of the current block.
- the number of control point motion vectors for the current block can be set equal to the number of control point motion vectors of the neighboring blocks.
- Mathematical expression 2 represents a formula for deriving a motion vector for each sample using two control point motion vectors.
- (mv_x, mv_y) represents a motion vector at the (x, y) position within the current block.
- (mv_Ax, mv_Ay) represents a first control point motion vector (A)
- (mv_Bx, mv_By) represents a second control point motion vector (B).
- W represents the width of the current block.
- a motion vector per sample can be derived by the following mathematical expression 3.
- (mv_Cx, mv_Cy) represents the third control point motion vector (C).
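Mathematical expressions 2 and 3 are not reproduced in this text. The sketch below therefore uses the commonly used 4-parameter and 6-parameter affine forms, which are consistent with the symbols defined above but may differ in detail from the expressions of the disclosure. The function names and the block height parameter `height` are assumptions.

```python
def affine_mv_2cp(x, y, mv_a, mv_b, width):
    """Per-sample MV from two control points (upper left A, upper right B).

    Commonly used 4-parameter form (translation + zoom + rotation).
    """
    a = (mv_b[0] - mv_a[0]) / width
    b = (mv_b[1] - mv_a[1]) / width
    return (a * x - b * y + mv_a[0], b * x + a * y + mv_a[1])


def affine_mv_3cp(x, y, mv_a, mv_b, mv_c, width, height):
    """Per-sample MV from three control points (A, B, and lower left C).

    Commonly used 6-parameter form; `height` is the block height (assumed symbol).
    """
    mv_x = ((mv_b[0] - mv_a[0]) / width * x
            + (mv_c[0] - mv_a[0]) / height * y + mv_a[0])
    mv_y = ((mv_b[1] - mv_a[1]) / width * x
            + (mv_c[1] - mv_a[1]) / height * y + mv_a[1])
    return (mv_x, mv_y)
```

Both functions reproduce the control point motion vectors at the corresponding corners: position (0, 0) yields A, position (W, 0) yields B, and, in the three-point case, position (0, H) yields C.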
- motion compensation can be performed for each sample, as in the example illustrated in Fig. 17.
- a reference sample indicated by the motion vector of the prediction target sample can be set as a prediction sample for the prediction target sample.
- integer position samples can be interpolated to generate fractional position samples, and the generated fractional position samples can be set as prediction samples for the prediction target sample.
- the precision of the motion vector for each sample may be different.
- the motion vector for the first prediction target sample may be derived in units of 1/2 pixels, while the motion vector for the second prediction target sample may be derived in units of 1/4 pixels.
- the fractional position sample can be generated according to the motion vector precision for each of the prediction target samples.
- the motion vector of the prediction target sample can be adjusted according to the reference motion vector precision, and then the prediction sample for the prediction target sample can be derived based on the adjusted motion vector. For example, if the reference motion vector precision is 1/2, the motion vector for the second prediction target sample, which was derived in units of 1/4 pixels, can be adjusted to units of 1/2 pixels.
- the reference motion vector precision can be determined on a block-by-block basis. Alternatively, the precision of control point motion vectors can be set as the reference motion vector precision. Alternatively, in the encoder and decoder, the reference motion vector precision can be predefined.
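Adjusting a per-sample motion vector to the reference motion vector precision can be sketched as follows. The names `align_mv_component` and `align_mv` are hypothetical, and rounding to the nearest multiple with ties rounded upward is an assumption; the text does not state a rounding rule.

```python
from fractions import Fraction
import math


def align_mv_component(value, reference_precision):
    """Round one MV component (in pixels) to the nearest multiple of reference_precision.

    Ties are rounded upward, which is an assumed rule.
    """
    value = Fraction(value)
    step = Fraction(reference_precision)
    return math.floor(value / step + Fraction(1, 2)) * step


def align_mv(mv, reference_precision):
    """Apply align_mv_component to every component of a motion vector."""
    return tuple(align_mv_component(c, reference_precision) for c in mv)
```

For instance, a motion vector derived in quarter-pixel units, such as (3/4, 1/2), becomes (1, 1/2) when the reference precision is 1/2 pixel under this rounding rule.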
- motion vectors can be derived on a sub-block basis.
- Figure 19 shows an example in which a motion vector is derived in sub-block units.
- a sub-block may be predefined in the encoder and decoder.
- a sub-block may be a square block of size 2x2 or 4x4.
- the size and/or shape of the sub-block may be adaptively determined based on the size and/or shape of the current block. For example, if the current block is square, the sub-block may also be square. On the other hand, if the current block is non-square, the sub-block may also be non-square.
- information on at least one of the division method or division shape of the current block may be explicitly encoded and signaled.
- information on at least one of the size of a sub-block, the shape of a sub-block, the position of a division line dividing the current block, or the number of division lines may be explicitly encoded and signaled.
- the information may be encoded and signaled on a block-by-block basis, or may be encoded and signaled via an upper header.
- a motion vector of a sub-block can be derived using coordinates of a predefined position within a sub-block.
- the predefined position can be one of the positions of the upper left sample, the upper right sample, the lower left sample, the lower right sample, or the center position within a sub-block.
- the motion vector of the sub-block can be derived.
- motion vectors can be derived for each sub-block based on the affine motion model.
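Deriving one motion vector per sub-block from the affine model can be sketched as follows. The sketch assumes the center of each sub-block as the predefined position, square sub-blocks, two control point motion vectors, and the commonly used 4-parameter affine form; the name `subblock_affine_mvs` is hypothetical.

```python
def subblock_affine_mvs(block_w, block_h, sub_size, mv_a, mv_b):
    """Map each sub-block's upper left offset to its affine motion vector.

    The MV is evaluated at the sub-block center (an assumed choice among the
    predefined positions) using control points A (upper left) and B (upper right).
    """
    a = (mv_b[0] - mv_a[0]) / block_w
    b = (mv_b[1] - mv_a[1]) / block_w
    mvs = {}
    for sy in range(0, block_h, sub_size):
        for sx in range(0, block_w, sub_size):
            cx = sx + sub_size / 2
            cy = sy + sub_size / 2
            mvs[(sx, sy)] = (a * cx - b * cy + mv_a[0], b * cx + a * cy + mv_a[1])
    return mvs
```

Compared with per-sample derivation, this evaluates the model once per sub-block, so every sample in a sub-block shares the same motion vector.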
- deriving motion vectors in sub-block units using collocated pictures can be called SbTMVP (Sub-block Temporal Motion Vector Prediction).
- a collocated picture may be one of the reference pictures included in a reference picture list. For example, a picture having an index of 0 in the reference picture list may be selected as a collocated picture.
- information indicating the index of a reference picture to be set as a collocated picture in the reference picture list can be explicitly encoded and signaled.
- Figures 20 and 21 illustrate examples in which motion vectors are derived for each sub-block within the current block when SbTMVP is applied.
- the size and/or shape of the sub-block may be predefined in the encoder and decoder.
- the size and/or shape of the sub-block may be adaptively determined based on the size and/or shape of the current block. For example, if at least one of the width or the height of the current block is greater than the threshold, the size of the sub-block may be set to 8x8. Otherwise, the size of the sub-block may be set to 4x4.
- information indicating the size and/or shape of the sub-block may be explicitly encoded and signaled.
- an initial motion vector of a current block can be derived.
- the initial motion vector can be derived based on at least one of a motion vector prediction list or a motion information merge list. For example, an index indicating one of the motion vector prediction candidates included in the motion vector prediction list can be encoded and signaled.
- the initial motion vector can be derived by adding a motion vector differential value to the motion vector prediction candidate indicated by the index. Meanwhile, the motion vector differential value can also be explicitly encoded and signaled.
- encoding of the index may be omitted, and a motion vector prediction candidate having a predefined index in the motion vector prediction list may be set as a prediction value for the initial motion vector.
- the motion vector prediction candidate having a predefined index may be a motion vector prediction candidate having an index of 0 or a motion vector prediction candidate having the largest index.
- an index indicating one of the motion information merging candidates included in the motion information merging list may be encoded and signaled.
- the initial motion vector may be set to be the same as the motion vector of the motion information merging candidate indicated by the index.
- the initial motion vector may be derived based on a motion information merge candidate having a predefined index in the motion information merge list.
- the motion information merge candidate having a predefined index may be a motion information merge candidate having an index of 0 or a motion information merge candidate having the largest index.
- the initial motion vector can be derived using the motion vector of a neighboring block at a predefined position.
- the neighboring block at the predefined position can be a left neighboring block or an upper neighboring block.
- the motion vector of a neighboring block at a predefined position can be set as a predicted value of the initial motion vector, and the initial motion vector can be derived by adding a difference value to the predicted value.
- the motion vector of a neighboring block at a predefined position can be set as the initial motion vector.
- the initial motion vector can be derived using a template-based motion estimation method (i.e., template matching method) or bilateral matching.
- the precision of the initial motion vector may be predefined in the encoder and decoder.
- the precision of the initial motion vector may be fixed in integer pixel units.
- information indicating the precision of the initial motion vector can be explicitly encoded and signaled.
- the information can be an index indicating one of a plurality of motion vector precision candidates.
- the motion vector prediction candidates can be derived based on the motion vector precision of the initial motion vector. That is, after adjusting a motion vector prediction candidate according to the motion vector precision of the initial motion vector, the adjusted motion vector prediction candidate can be inserted into the motion vector prediction list.
- the motion information merge candidates can be derived based on the motion vector precision of the initial motion vector. That is, after adjusting a motion information merge candidate according to the motion vector precision of the initial motion vector, the adjusted motion information merge candidate can be inserted into the motion information merge list.
- the initial motion vector may not be derived from the motion information merge candidate.
- an index indicating one of the multiple candidates may be encoded and signaled.
- the initial motion vector may be derived from the candidate having the smallest index or the largest index among the multiple candidates.
- one of the motion information in the L0 direction and the motion information in the L1 direction can be selected according to a preset priority, and an initial motion vector can be derived from the selected motion information.
- the priority may be determined based on at least one of the magnitude of the motion vector of the motion information merging candidate, the index of the reference picture of the motion information merging candidate, or whether the reference picture of the motion information merging candidate is identical to the collocated picture.
- it can be set to always derive the initial motion vector based on the motion information in the L0 direction.
- motion estimation can be performed according to the precision of the initial motion vector. For example, when the precision of the initial motion vector is in integer pixel units, motion estimation based on template matching can also be performed only at integer positions.
- motion estimation can be performed according to the precision of the initial motion vector.
- a motion vector for the L0 direction (L0 motion vector) and a motion vector for the L1 direction (L1 motion vector) are derived.
- one of the L0 motion vector and the L1 motion vector can be set as the initial motion vector according to the preset priority.
- it can be set to always derive the initial motion vector based on the motion information in the L0 direction.
- information indicating which of the L0 motion vector and the L1 motion vector is set as the initial motion vector may be encoded and signaled.
- the position of the collocated block within the collocated picture can be determined using the initial motion vector. For example, a block at a position spaced apart by the initial motion vector from a position corresponding to the current block within the reference picture can be set as the collocated block. At this time, the position of the collocated block can be determined based on a predefined position within the current block.
- the predefined position can be an upper left position, an upper right position, a lower left position, a lower right position, or a center position.
- the collocated block can be divided into a plurality of collocated sub-blocks. Then, the motion vector of each of the collocated sub-blocks in the collocated block can be set as the motion vector of each of the sub-blocks in the current block.
- the positions of the collocated sub-blocks corresponding to each of the sub-blocks in the current block in the collocated picture can be determined using the initial motion vector.
- the position of the collocated sub-block can be derived based on a predefined position in the sub-block.
- the predefined position can be an upper left position, an upper right position, a lower left position, a lower right position, or a center position.
- the motion vector of the collocated sub-block corresponding to the sub-block can be set as the motion vector of the sub-block.
- the motion vector stored at a position corresponding to a predefined position within the sub-block within the collocated sub-block can be set as the motion vector of the sub-block.
- a predefined motion vector can be set as the motion vector of the sub-block.
- the predefined motion vector can be a zero vector (i.e., (0, 0)) or an initial motion vector.
- the motion vector of the sub-block may be derived from another location within the collocated sub-block.
- if a position corresponding to a predefined position within a collocated sub-block is encoded with intra prediction, there is no motion vector at that position.
- for example, the predefined position may be a central position (e.g., c10 in Fig. 21).
- in that case, the motion vector of the sub-block cannot be derived from the predefined position.
- the motion vector of the sub-block can be derived based on the motion vector stored at a different location from the center location. Specifically, the motion vector of the sub-block can be derived from the motion vector stored at a location adjacent to the center location (e.g., the top adjacent location c6, the left adjacent location c9, or the top left adjacent location c5).
- the samples within the collocated sub-block can be searched according to the scan order, and the first available motion vector found can be set as the motion vector of the sub-block.
- the scan order can be horizontal scan, vertical scan, diagonal scan, or raster scan.
- the motion vector of the sub-block may be set as the motion vector of the collocated block.
- a motion vector stored at a position corresponding to a predefined position within the current block in the collocated block may be set as the motion vector of the sub-block.
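The sub-block derivation with its fallbacks can be sketched in Python. This is a minimal sketch under stated assumptions: the collocated picture's motion field is modelled as a dictionary from sample coordinates to motion vectors, intra-coded positions are simply absent, the initial motion vector is in integer-sample units, and the probe order (center, top, left, top-left) follows the c10, c6, c9, c5 example. The names `col_mvs` and `derive_subblock_mv` are hypothetical.

```python
from typing import Dict, Tuple

MV = Tuple[int, int]
Pos = Tuple[int, int]

def derive_subblock_mv(col_mvs: Dict[Pos, MV], sub_pos: Pos, sub_size: int,
                       init_mv: MV, default_mv: MV = (0, 0)) -> MV:
    """Return the motion vector for one sub-block of the current block."""
    # Center of the sub-block, displaced by the initial motion vector,
    # gives the probe position inside the collocated picture.
    cx = sub_pos[0] + sub_size // 2 + init_mv[0]
    cy = sub_pos[1] + sub_size // 2 + init_mv[1]
    # Try the center first, then the top, left and top-left neighbours.
    for dx, dy in ((0, 0), (0, -1), (-1, 0), (-1, -1)):
        mv = col_mvs.get((cx + dx, cy + dy))
        if mv is not None:
            return mv
    # Nothing available (e.g. all probed positions are intra coded):
    # fall back to a predefined vector such as (0, 0).
    return default_mv
```

Passing the initial motion vector as `default_mv` gives the other predefined fallback mentioned in the text.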
- motion vectors can be derived for each sub-block using the affine motion model or SbTMVP.
- motion compensation can be performed for each sub-block based on the motion vectors of each sub-block.
- by performing motion compensation for each of the sub-blocks, a prediction block for the current block can be obtained. That is, the prediction block can be composed of prediction samples of each of the sub-blocks.
- the motion vector precision can be adjusted.
- the position of each sample in a picture is defined as an integer position.
- the position where the motion is reflected may be a fractional position rather than an integer position.
- Figures 22 and 23 are diagrams showing examples in which prediction blocks are derived according to motion vector precision.
- Figure 22 shows the location of the current block within the current picture
- Figure 23 shows the location of the reference block according to the motion vector precision.
- the motion vector of the current block can be defined as the distance from a sample corresponding to the upper left position of the current block in the reference picture to a sample corresponding to the upper left position of the reference block in the reference picture.
- Figure 23 (a) illustrates a case where the motion vector precision of the current block is an integer pel
- Figure 23 (b) illustrates a case where the motion vector precision of the current block is 1/2 pel
- Figure 23 (c) illustrates a case where the motion vector precision of the current block is 1/4 pel.
- in the illustrated examples, motion vectors are expressed up to 1/4-pel precision, but motion vectors can also be expressed more precisely, such as at 1/8, 1/16, or 1/32 pel.
- information for indicating the motion vector precision of the current block may be encoded and signaled.
- the information may be an index identifying one of the motion vector precision candidates.
- a different index may be assigned to each of the motion vector precision candidates, and the information may indicate an index of a motion vector precision candidate applied to the current block.
- the samples at fractional positions can be generated using the samples at integer positions and an interpolation filter.
- a motion vector with fractional components can be scaled up to an integer and encoded/decoded.
- the motion vector (MV), the motion vector predictor (MVP) and the motion vector differential (MVD) can be encoded/decoded as integer values through integerization.
- the motion vector, the motion vector predictor and/or the motion vector differential can be integerized based on the motion vector precision.
- integerization can be performed by multiplying the motion vector difference MVD by N.
- the motion vector difference MVD is (4/16, 8/16)
- the motion vector difference MVD can be integerized by multiplying by 16. That is, the integerized motion vector difference MVD can be expressed as (4, 8).
- the actual MVD can be derived from the integerized MVD. For example, if the motion vector precision is 1/N, the integerized MVD can be divided by N to derive the actual MVD. For example, if the integerized MVD is (4, 8) and the motion vector precision is 1/8, the actual MVD can be (4/8, 8/8). Alternatively, if the integerized MVD is (4, 8) and the motion vector precision is 1/4, the actual MVD can be (4/4, 8/4).
- the representation range of the integerized MVD may be different. For example, assume that the motion vector difference MVD is (4/16, 8/16) (i.e., (1/4, 2/4)). When the motion vector precision is 1/16, the integerized MVD is derived as (4, 8). On the other hand, when the motion vector precision is 1/4, the integerized MVD is derived as (1, 2).
- the integerized MVD value can be reduced from (4, 8) to (1, 2).
- the number of bits required to encode/decode the integerized motion vector difference MVD may be different. Accordingly, when encoding/decoding the motion vector difference MVD, the motion vector precision that minimizes the number of bins can be selected. Then, based on the selected motion vector precision, the motion vector difference MVD can be integerized, and the integerized motion vector difference MVD can be encoded/decoded. In addition, information about the motion vector precision can be additionally encoded/decoded.
- the actual MVD can be reconstructed from the decoded MVD based on the motion vector precision. Then, the motion vector MV can be derived by combining the reconstructed MVD and the motion vector prediction value MVP.
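The integerization arithmetic above can be checked with a short Python sketch using exact fractions. The function names `integerize` and `reconstruct` are hypothetical and serve only to reproduce the worked numbers from the text.

```python
from fractions import Fraction

def integerize(mvd, n):
    """Scale a fractional MVD to integers for a motion vector precision of 1/n."""
    scaled = [c * n for c in mvd]
    if any(s.denominator != 1 for s in scaled):
        raise ValueError("MVD is not representable at precision 1/%d" % n)
    return tuple(int(s) for s in scaled)

def reconstruct(int_mvd, n):
    """Recover the actual MVD from an integerized MVD at precision 1/n."""
    return tuple(Fraction(c, n) for c in int_mvd)

mvd = (Fraction(4, 16), Fraction(8, 16))   # i.e. (1/4, 2/4)
print(integerize(mvd, 16))  # (4, 8)
print(integerize(mvd, 4))   # (1, 2)
```

The same MVD therefore needs smaller integers, and typically fewer bits, at the coarser 1/4 precision than at 1/16.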
- the method of adjusting the value of the motion vector differential value MVD to be encoded/decoded based on the motion vector precision is called the Adaptive Motion Vector Resolution (AMVR) method.
- Figures 24 and 25 are diagrams for explaining the process of encoding and decoding a motion vector difference value, respectively, when the AMVR method is applied.
- the motion vector and the motion vector difference are expressed in units of 1/16 before integerization is performed, and 1/16 is referred to as the original motion vector precision.
- the motion vector difference MVD can be derived by subtracting the motion vector prediction value MVP from the motion vector MV (S2410).
- the motion vector difference MVD can be composed of a horizontal component (i.e., x-axis component) and a vertical component (i.e., y-axis component).
- if the motion vector difference is 0, that is, if both the horizontal component and the vertical component are 0, the value of the motion vector difference MVD to be encoded becomes 0 regardless of the motion vector precision. Therefore, if the motion vector difference MVD is 0, encoding of AMVR-related information can be omitted (S2420).
- the motion vector precision can be determined (S2430). Meanwhile, the motion vector precision can be encoded as AMVR-related information.
- Information related to AMVR may include at least one of a flag (e.g., amvr_flag) indicating whether the AMVR method is applied to the current block, and an index (e.g., amvr_prec_idx) indicating one of a plurality of motion precision candidates when the AMVR method is applied.
- the motion vector precision can be set to the default value.
- amvr_flag can be encoded as a value of 0.
- the default value can be 1, 1/2, 1/4, 1/8 or 1/16.
- an index indicating one of the plurality of motion vector precision candidates i.e., amvr_prec_idx
- amvr_flag may be encoded with a value of 1
- amvr_prec_idx may be encoded with a value from 0 to (n-1).
- n represents the number of motion vector precision candidates.
- the plurality of motion vector precision candidates may include at least one of 4, 2, 1, 1/2, 1/4, 1/8, or 1/16.
- the default value may be excluded from the plurality of motion vector precision candidates indicated by the index. That is, when the motion vector precision of the current block is the default value, amvr_flag is encoded and signaled with a value of 0, and encoding of amvr_prec_idx may be omitted.
- the optimal motion vector precision can be determined by performing RDO (Rate Distortion Optimization) for each combination of amvr_flag and amvr_prec_idx. That is, the combination with the optimal cost can be selected by performing RDO for the following cases.
- amvr_flag is 1 and amvr_prec_idx is 0
- amvr_flag is 1 and amvr_prec_idx is 1
- amvr_flag is 1 and amvr_prec_idx is 2
- a variable for scaling the motion vector difference i.e., a scaling parameter
- Table 1 illustrates the values of the variable amvrshift according to the motion vector precision.
- the motion vector precision can be expressed as shown in the following mathematical expression 4.
- the variable amvrshift can be determined according to the value of amvr_prec_idx. For example, when amvr_prec_idx is 1, the variable amvrshift is set to 4. This indicates that the motion vector precision is 1 according to mathematical expression 4.
- the motion vector difference value MVD can be scaled down and encoded using the variable amvrshift according to the motion vector precision.
- mathematical expression 5 shows an example in which a scale down operation is performed on the motion vector difference value MVD.
- MVD_x represents the horizontal component of the motion vector difference
- MVD_y represents the vertical component of the motion vector difference
- MVD'_x and MVD'_y represent the results of performing the scale down operation.
- the encoder can encode motion vector difference and AMVR information with changed precision (S2440).
- the motion vector difference MVD can be decoded (S2510).
- if the motion vector difference is 0, decoding of AMVR-related information is omitted, and the motion vector MV of the current block can be set to be the same as the motion vector prediction value (S2520).
- a variable amvrshift can be derived for scaling the motion vector difference.
- the variable amvrshift can be derived based on amvr_flag and/or amvr_prec_idx.
- the decoded MVD can be scaled up to obtain a motion vector difference MVD restored to the original precision (S2540).
- Mathematical expression 6 shows an example in which a scale-up operation is applied to the decoded MVD.
- MVD' represents the decoded motion vector difference.
- MVD represents the motion vector difference restored to the original precision, i.e., 1/16, through a scale-up operation.
- the motion vector MV can be obtained by combining the motion vector difference MVD restored to the original precision and the motion vector prediction MVP.
- the decoder can derive the motion vector MV by combining the motion vector prediction value MVP and the motion vector differential value MVD.
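Table 1 and mathematical expressions 4 to 6 are not reproduced in this text, so the following Python sketch rests on assumptions. It assumes the precision equals 2^amvrshift / 16 pel, which matches the one stated data point (amvrshift 4 gives a precision of 1). It also assumes the scale-down and scale-up operations are a right shift and a left shift by amvrshift. All function names are hypothetical.

```python
from fractions import Fraction

ORIGINAL_UNITS_PER_PEL = 16  # MVs are held in 1/16-pel units before AMVR

def precision_from_shift(amvr_shift):
    """Assumed form of expression 4: precision = 2**amvrshift / 16 pel."""
    return Fraction(1 << amvr_shift, ORIGINAL_UNITS_PER_PEL)

def scale_down(mvd, amvr_shift):
    """Encoder side (assumed expression 5): drop the low-order bits."""
    step = 1 << amvr_shift
    if any(c % step for c in mvd):
        raise ValueError("MVD is not aligned to the selected precision")
    return tuple(c >> amvr_shift for c in mvd)

def scale_up(coded_mvd, amvr_shift):
    """Decoder side (assumed expression 6): restore 1/16-pel units."""
    return tuple(c << amvr_shift for c in coded_mvd)

def decode_mv(mvp, coded_mvd, amvr_shift):
    """Combine the predictor with the MVD restored to the original precision."""
    mvd = scale_up(coded_mvd, amvr_shift)
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])
```

For example, an MVD of (32, -16) in 1/16-pel units is coded as (2, -1) at integer-pel precision and restored exactly by the decoder.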
- a color picture may be composed of multiple channels.
- a color picture may be composed of a Y picture, a Cb picture, and a Cr picture.
- Y represents a luma (or luminance) component
- Cb and Cr represent chroma (or chrominance) components.
- Chroma formats can indicate the size of a chroma picture relative to a luma picture. For example, a 4:4:4 format indicates that the size of a luma picture is the same as that of a chroma picture. A 4:2:0 format indicates that the width and height of a chroma picture are each half the width and height of a luma picture.
- a block within a luma picture is referred to as a luma block
- a block within a chroma picture is referred to as a chroma block
- a chroma block may represent at least one of a Cb component block or a Cr component block.
- the motion information encoding/decoding method described above can be applied.
- the motion information of the chroma block can be derived based on the motion information of the luma block. For example, if the sizes of the chroma picture and the luma picture are different depending on the chroma format, the motion information of the chroma block can be derived by scaling the motion vector of the luma block existing at the same position as the chroma block. If the chroma format is 4:2:0, the width and height of the chroma picture are each half the size of the width and height of the luma picture. Accordingly, the x-axis component and the y-axis component of the motion vector of the luma block, each reduced by 1/2 (i.e., shifted to the right by 1), can be set as the motion vector of the chroma block.
- the motion information of the luma block can be used as the motion information of the chroma block without performing scaling.
- a reference block within the reference picture can be specified, and the specified reference block can be set as a prediction block of the chroma block.
- the prediction direction of the chroma block can also be set to be the same as that of the luma block. For example, if bi-prediction is used for the luma block, bi-prediction can also be applied to the chroma block, and if uni-directional prediction is used for the luma block, uni-directional prediction can also be applied to the chroma block.
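A minimal Python sketch of deriving the chroma motion vector from the luma motion vector, covering only the two chroma formats named in the text. The function name `chroma_mv_from_luma` and the string labels for the formats are assumptions made for illustration.

```python
def chroma_mv_from_luma(luma_mv, chroma_format):
    """Derive a chroma motion vector from the co-located luma motion vector."""
    if chroma_format == "4:4:4":
        # Luma and chroma pictures have the same size: reuse the vector as-is.
        return luma_mv
    if chroma_format == "4:2:0":
        # Chroma is half width and half height: halve both components.
        # Python's >> floors, so negative odd components round toward -infinity.
        return (luma_mv[0] >> 1, luma_mv[1] >> 1)
    raise ValueError("unsupported chroma format: %s" % chroma_format)
```

With this sketch, a luma vector of (8, -6) becomes (4, -3) for 4:2:0 and stays (8, -6) for 4:4:4.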
- a chroma block can also be predicted in a different way than described above. Specifically, after deriving a prediction parameter, a chroma block can be predicted from a luma block corresponding to the chroma block. Predicting a chroma block from a restored luma block can be referred to as a color component prediction method based on a prediction parameter. Hereinafter, the color component prediction method based on a prediction parameter will be described in detail. Meanwhile, the prediction parameter can be derived in the same way in each of an encoder and a decoder.
- Figure 26 shows a flow chart of a color component prediction method based on prediction parameters.
- a reference block of a luma block in a reference picture (hereinafter, referred to as a luma reference block) can be derived (S2610).
- a block at a position spaced apart by a motion vector of the luma block from the position of the luma block in the reference picture can be set as a luma reference block.
- the reference picture represents a previously reconstructed luma picture.
- the luma reference block can be set as a prediction block of the luma block.
- the prediction block of the luma block (hereinafter referred to as the luma prediction block) and the reference block of the luma block can be replaced with each other.
- the luma prediction block can be replaced with the reference block of the luma block, or the reference block of the luma block can be replaced with the luma prediction block.
- a reference block of a chroma block in a reference picture (hereinafter, referred to as a chroma reference block) can be derived (S2620).
- a block at the same position as a luma reference block in the reference picture can be derived as a chroma reference block.
- motion information of a chroma block can be derived from a luma block, and then a chroma reference block can be set based on the derived motion information.
- a block at a position spaced apart from a position of a chroma block in the reference picture by a motion vector of the chroma block can be set as a chroma reference block.
- the reference picture represents a previously reconstructed chroma picture.
- prediction parameters can be derived based on the correlation between the luma prediction block and the chroma reference block (S2630).
- the prediction parameters can include at least one of a weight and an offset.
- a prediction sample of a chroma block can be obtained from a restored luma block (hereinafter, luma restored block) based on the derived prediction parameters (S2640).
- Figures 27 and 28 illustrate the operation of the encoder/decoder according to a color component prediction method based on prediction parameters.
- Figure 27 shows an example of a case where bidirectional prediction is applied to a luma block
- Figure 28 shows an example of a case where unidirectional prediction is applied to a luma block.
- a reference block in the L0 direction can be obtained based on the L0 motion information of the luma block
- a reference block in the L1 direction can be obtained based on the L1 motion information of the luma block.
- Each of the L0 reference block and the L1 reference block can be set as an L0 prediction block and an L1 prediction block, respectively.
- a prediction block of the luma block (i.e., a luma prediction block) can be obtained based on an average or weighted sum operation of the prediction block in the L0 direction and the prediction block in the L1 direction.
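The average or weighted sum of the L0 and L1 prediction blocks can be sketched in Python as below. The integer weights and the rounding offset are assumptions for this sketch; the text does not specify them.

```python
def bi_predict(pred_l0, pred_l1, w0=1, w1=1):
    """Blend L0 and L1 prediction blocks sample by sample.

    With w0 == w1 this is a rounded average; other weights give a
    weighted sum normalized by (w0 + w1).
    """
    total = w0 + w1
    return [[(w0 * a + w1 * b + total // 2) // total
             for a, b in zip(row0, row1)]
            for row0, row1 in zip(pred_l0, pred_l1)]
```

The same helper would apply to the chroma reference blocks when a weighted Cb or Cr reference block is formed.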
- reference blocks can be derived for each of the L0 direction and the L1 direction.
- a Cb component block at the same position as the L0 reference block of a luma block can be set as an L0 reference block for the Cb block
- a Cb component block at the same position as the L1 reference block of a luma block can be set as an L1 reference block for the Cb component block.
- the motion information of the Cb block can be derived based on the motion information of the luma block.
- the motion vector of the luma block can be directly set to the motion vector of the Cb block, or the motion vector of the luma block can be scaled to derive the motion vector of the Cb block.
- a reference block in the L0 direction can be obtained based on the L0 motion information of the Cb block
- a reference block in the L1 direction can be obtained based on the L1 motion information of the Cb block.
- a weighted reference block for the Cb component (hereinafter, referred to as a weighted Cb reference block) can be obtained based on an average or weighted sum operation of the reference block in the L0 direction and the reference block in the L1 direction.
- a weighted reference block for the Cr component (hereinafter referred to as a weighted Cr reference block) can be obtained in the same manner as for the Cb block.
- a first prediction parameter for the Cb component can be derived using the luma prediction block and the weighted Cb reference block.
- a second prediction parameter for the Cr component can be derived using the luma prediction block and the weighted Cr reference block.
- a prediction block of the Cb block can be obtained.
- the luma restoration block can be obtained by adding a residual block of the luma component to the luma prediction block.
- a prediction block of the Cr block can be obtained.
- by adding the residual block of the Cb component to the prediction block of the Cb block, the Cb block can be restored, and by adding the residual block of the Cr component to the prediction block of the Cr block, the Cr block can be restored.
- the weighted sum process can be omitted.
- a reference block in the L0 direction can be obtained based on the L0 motion information of the luma block, and the L0 reference block can be set as a prediction block of the luma block.
- the prediction parameter can be derived based on the correlation between the luma prediction block (i.e., the L0 reference block of the luma component) and the L0 reference block of the chroma block.
- the first prediction parameter can be derived based on the luma prediction block and the L0 reference block of the Cb block
- the second prediction parameter can be derived based on the luma prediction block and the L0 reference block of the Cr block.
- prediction parameters can be derived based on the correlation between the luma prediction block (i.e., the L1 reference block of the luma component) and the L1 reference block of the chroma block.
- prediction parameters may be derived using reference blocks for either the L0 direction or the L1 direction. That is, prediction parameters may be derived using a luma reference block in the L0 direction and a chroma reference block in the L0 direction, or prediction parameters may be derived using a luma reference block in the L1 direction and a chroma reference block in the L1 direction.
- L0 direction or the L1 direction may be predefined in the encoder and decoder.
- one of the L0 direction and the L1 direction may be selected by comparing the distances of the reference picture in the L0 direction (i.e., the L0 reference picture) and the reference picture in the L1 direction (i.e., the L1 reference picture) with the current picture, respectively.
- the distance represents the POC (Picture Order Count) difference between the two pictures.
- the prediction parameters may be derived using reference blocks in the direction in which the distance to the current picture is closer among the L0 reference picture and the L1 reference picture.
- the prediction parameters may be derived using the luma reference block in the L0 direction and the chroma reference block in the L0 direction.
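Choosing the direction by POC distance can be sketched in Python as follows. The tie-break toward L0 is an assumption made for this sketch, and the function name is hypothetical.

```python
def select_direction(poc_cur, poc_l0, poc_l1):
    """Pick the list whose reference picture is closer in POC to the current picture."""
    dist_l0 = abs(poc_cur - poc_l0)
    dist_l1 = abs(poc_cur - poc_l1)
    # Ties fall back to L0 (an assumption made for this sketch).
    return "L0" if dist_l0 <= dist_l1 else "L1"
```

For a current picture with POC 8, an L0 reference at POC 4 and an L1 reference at POC 9, the L1 reference blocks would be used.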
- prediction parameters can be derived using the weighted reference blocks.
- prediction parameters can be derived using only reference blocks derived from the reference picture designated as the Col picture.
- prediction parameters can be derived using reference blocks in the same direction as the Col picture.
- when bidirectional prediction is applied to a luma block, a chroma block can also be predicted by selecting one of multiple prediction parameter candidates.
- Figure 29 illustrates an example of predicting a chroma block by selecting one of multiple prediction parameter candidates.
- the plurality of prediction parameter candidates may include at least one of a first prediction parameter candidate derived using reference blocks in the L0 direction, a second prediction parameter candidate derived using reference blocks in the L1 direction, or a third prediction parameter candidate derived using a weighted predicted reference block.
- the first prediction parameter candidate may be derived based on a correlation between an L0 reference block of a luma block and an L0 reference block of a chroma block
- the second prediction parameter candidate may be derived based on a correlation between an L1 reference block of the luma block and an L1 reference block of the chroma block.
- the third prediction parameter candidate may be derived based on a correlation between a result of weighting the L0 reference block and the L1 reference block of the luma block and a result of weighting the L0 reference block and the L1 reference block of the chroma block.
- an optimal prediction parameter candidate can be selected from among a plurality of prediction parameter candidates, and a chroma block can be predicted based on the selected prediction parameter candidate.
- index information indicating an optimal prediction parameter candidate from among a plurality of prediction parameter candidates can be encoded and signaled to a decoder.
- the index information may be encoded and signaled for each of the Cb component and the Cr component. That is, the optimal prediction parameter for the Cb component may be determined based on the index information decoded for the Cb component among a plurality of prediction parameter candidates for the Cb component, and the optimal prediction parameter for the Cr component may be determined based on the index information decoded for the Cr component among a plurality of prediction parameter candidates for the Cr component.
- a single index information may be encoded and signaled. For example, if the index information points to the L0 direction, prediction parameters for both the Cb and Cr components may be derived based on reference blocks in the L0 direction.
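An encoder-side selection among prediction parameter candidates might look like the following Python sketch. The text does not state the selection criterion; this sketch assumes a sum of absolute differences (SAD) against the original chroma block, and assumes each candidate is a (weight, offset) pair. All names are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def select_candidate(candidates, rec_l, orig_c):
    """Return (index, params) of the (weight, offset) pair with the lowest SAD."""
    best_idx, best_cost = 0, None
    for idx, (alpha, beta) in enumerate(candidates):
        # Predict chroma from the reconstructed luma with this candidate.
        pred = [[alpha * v + beta for v in row] for row in rec_l]
        cost = sad(pred, orig_c)
        if best_cost is None or cost < best_cost:
            best_idx, best_cost = idx, cost
    return best_idx, candidates[best_idx]
```

The returned index corresponds to the index information that would be signaled to the decoder, which then only needs to derive the one indicated candidate.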
- the prediction parameters can be derived using chroma reference blocks for either the L0 direction or the L1 direction of the chroma block.
- the prediction parameters can be derived using a weighted reference block of the luma component (i.e., a block derived by weighting the L0 reference block and the L1 reference block) and a chroma reference block in the L0 direction, or the prediction parameters can be derived using a weighted reference block of the luma component and a chroma reference block in the L1 direction.
- whether to use the chroma reference block in the L0 direction or the chroma reference block in the L1 direction can be determined by a preset condition.
- information indicating the prediction direction of the chroma block can be encoded and signaled.
- the prediction direction can indicate L0 unidirectional prediction, L1 unidirectional prediction, or bidirectional prediction.
- at least one of the chroma reference block in the L0 direction or the chroma reference block in the L1 direction can be selected.
- a prediction parameter may be derived by a correlation between a luma prediction block and a chroma reference block, and for the other of the Cb component and the Cr component, differential information with respect to the prediction parameter may be encoded and signaled.
- the prediction parameter can be derived based on the correlation between the luma prediction block and the reference block of the Cb component.
- the differential information between the prediction parameter of the Cr component and the prediction parameter of the Cb component can be encoded and signaled.
- the prediction parameter can be derived based on the correlation between the luma prediction block and the reference block of the Cb component. From the bitstream, the differential information between the prediction parameter of the Cr component and the prediction parameter of the Cb component is decoded, and then the differential value is added to the prediction parameter of the Cb component, thereby deriving the prediction parameter for the Cr component.
- the differential information may include at least one of a difference between weights or a difference between offsets.
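The differential signaling between the Cb and Cr prediction parameters can be sketched in Python as below. The `LinearParams` type and both function names are hypothetical, and floating-point values are used for readability where a codec would more likely use fixed-point integers.

```python
from typing import NamedTuple

class LinearParams(NamedTuple):
    weight: float
    offset: float

def cr_delta(cb, cr):
    """Encoder side: differential information to signal for the Cr component."""
    return LinearParams(cr.weight - cb.weight, cr.offset - cb.offset)

def cr_params_from_cb(cb, delta):
    """Decoder side: add the decoded differences to the Cb parameters."""
    return LinearParams(cb.weight + delta.weight, cb.offset + delta.offset)
```

Only the Cb parameters are derived from the reference blocks here; the Cr parameters follow from them and the decoded differences.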
- Figure 30 shows an example of deriving prediction parameters for chroma components.
- in this example, it is assumed that the chroma format is 4:4:4 and that the luma prediction block and the reference block for the chroma component (i.e., the chroma reference block) are each 4x4 in size.
- the chroma component can represent the Cb component or the Cr component.
- the difference (i.e., error (E)) between samples in a luma prediction block and samples in a chroma reference block can be defined as in the following mathematical expression (7).
- T represents a block
- (i, j) represents the coordinate of a sample within the block.
- RefC represents a sample value within a chroma reference block
- PredL represents a prediction sample value within a luma prediction block.
- if the chroma format is not 4:4:4, PredL can be obtained by applying a down-sampling filter to prediction samples within the luma prediction block.
- mathematical expression 7 is partially differentiated with respect to the weight α and the offset β as in mathematical expressions 8 and 9, respectively, and the weight α and the offset β for which the result of the partial differentiation becomes 0 can be derived.
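Mathematical expressions 7 to 9 are not reproduced in this text. As a hedged reconstruction, assuming the error is the usual sum of squared differences over the block T, the derivation would read as follows. Here |T| denotes the number of samples in T, all sums run over (i, j) in T, and the closed forms are the standard least-squares solution rather than text taken from the application.

```latex
\begin{align}
E &= \sum_{(i,j)\in T}\bigl(\mathrm{RefC}[i][j]
     - (\alpha\,\mathrm{PredL}[i][j] + \beta)\bigr)^{2} \\
\frac{\partial E}{\partial \alpha} &=
  -2\sum_{(i,j)\in T}\mathrm{PredL}[i][j]\,
  \bigl(\mathrm{RefC}[i][j] - \alpha\,\mathrm{PredL}[i][j] - \beta\bigr) = 0 \\
\frac{\partial E}{\partial \beta} &=
  -2\sum_{(i,j)\in T}
  \bigl(\mathrm{RefC}[i][j] - \alpha\,\mathrm{PredL}[i][j] - \beta\bigr) = 0 \\
\alpha &= \frac{|T|\sum \mathrm{PredL}\,\mathrm{RefC}
          - \sum \mathrm{PredL}\,\sum \mathrm{RefC}}
         {|T|\sum \mathrm{PredL}^{2} - \bigl(\sum \mathrm{PredL}\bigr)^{2}} \\
\beta &= \frac{\sum \mathrm{RefC} - \alpha\sum \mathrm{PredL}}{|T|}
\end{align}
```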
- the derived prediction parameters can be applied to the restored luma block to obtain a prediction block for the chroma block.
- Mathematical expression 10 shows an example of deriving a prediction block for a chroma block.
- PredC represents a prediction sample of a chroma component
- recL represents a reconstructed sample in a reconstructed luma block.
- if the chroma format is not 4:4:4, recL can be obtained by applying a down-sampling filter to the reconstructed samples in the luma reconstructed block.
- a prediction sample of a chroma block can be obtained by multiplying a luma restoration sample at the same location as a location to be predicted within a chroma block by a weight α and adding an offset β to the result.
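The weight and offset derivation and their application to the reconstructed luma block can be sketched in Python. This sketch assumes a 4:4:4 format (or already down-sampled luma), a sum-of-squared-differences error, and floating-point arithmetic; the function names and the flat-block fallback are assumptions, not part of the described method.

```python
def derive_linear_params(pred_l, ref_c):
    """Least-squares weight and offset mapping luma prediction to chroma reference."""
    xs = [v for row in pred_l for v in row]
    ys = [v for row in ref_c for v in row]
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sxx - sx * sx
    if denom == 0:
        # Flat luma block: no slope can be estimated, so use an offset only
        # (a fallback chosen for this sketch).
        return 0.0, sy / n
    alpha = (n * sxy - sx * sy) / denom
    beta = (sy - alpha * sx) / n
    return alpha, beta

def predict_chroma(rec_l, alpha, beta):
    """Apply the weight and offset to each reconstructed luma sample."""
    return [[alpha * v + beta for v in row] for row in rec_l]
```

Because the parameters come only from the luma prediction block and the chroma reference block, an encoder and a decoder running the same derivation obtain the same weight and offset without signaling them.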
- prediction parameters can be derived using only samples at sub-sampled locations.
- Figure 31 shows the sub-sampled locations.
- a prediction block for a chroma block can be obtained by using only samples at sub-sampled locations within a luma prediction block and a chroma reference block.
- Which of the sub-sampling location candidates illustrated in Fig. 31 is to be used may be predefined in the encoder and decoder.
- index information indicating one of the multiple sub-sampling location candidates may be encoded and signaled.
- multiple prediction parameter candidates can be derived based on multiple sub-sampling location candidates.
- a prediction parameter candidate can be derived from each of the candidates illustrated in (a) to (d) of FIG. 31.
- an optimal prediction parameter among a plurality of prediction parameter candidates can be determined, and index information indicating the optimal prediction parameter among the plurality of prediction parameter candidates can be encoded and signaled.
- the subsampling locations can be determined adaptively.
- subsampling may be performed only on luma prediction blocks.
- the prediction parameters are exemplified as including a weight α and an offset β.
- multiple filter coefficients may be defined as prediction parameters.
- filter coefficients of a convolutional filter that minimizes the difference between a luma prediction block and a chroma reference block may be defined as prediction parameters.
- Figure 32 shows an example of deriving prediction parameters using a convolution filter.
- Figure 33 shows the form of a convolution filter.
- in Fig. 32, an example of deriving prediction parameters using the 5-tap convolution filter illustrated in Fig. 33 is shown.
- C represents a luma prediction sample located at the center of the filter
- N, W, S, and E represent samples around the luma prediction sample.
- N may represent a sample neighboring the upper side of the luma prediction sample C, i.e., a sample at position [i, j-1].
- S may represent a sample neighboring the lower side of the luma prediction sample C, i.e., a sample at position [i, j+1].
- W may represent a sample neighboring the left side of the luma prediction sample C, i.e., a sample at position [i-1, j].
- E may represent a sample neighboring the right side of the luma prediction sample C, i.e., a sample at position [i+1, j].
- the luma prediction block can be downsampled. That is, C can represent a downsampled luma prediction sample. Additionally, N, W, S, and E can represent samples adjacent to C within the downsampled luma prediction block.
- the sample input to the convolution filter may be a restoration sample around the luma block. That is, [i, j] may represent the coordinate within the luma block.
- the samples input to the convolution filter may be reconstructed samples around the reference block of the luma block in the reference picture. That is, [i, j] may represent the coordinates of the reference block of the luma block in the reference picture.
- an output value of the convolution filter can be obtained, and filter coefficients that minimize the difference (i.e., error (E)) between the output value of the convolution filter and the corresponding sample value in the chroma reference block can be derived.
- mathematical expression 11 shows an example of deriving filter coefficients.
- in mathematical expression 11, w0 to w4 represent weights applied to C, N, S, E, and W, respectively.
- RefC represents a sample within a chroma reference block.
- B may be a value derived based on the bit depth of the picture.
- Equation 12 shows an example of deriving the variable B.
- D represents the bit depth.
- for a bit depth of 10, B can be set to 512, which is the middle value of the range that can be expressed by 10 bits.
- for a bit depth of 8, B can be set to 128, which is the middle value of the range that can be expressed by 8 bits.
- variable B can be set to the mean of prediction samples within the luma prediction block.
- variable B can be set to the mean of samples input to the convolution filter.
- information representing the value of variable B can be explicitly encoded and signaled.
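The options for the variable B listed above can be sketched as follows. Equation 12 itself is not reproduced in this text, so the shift-based form below is an assumption chosen only because it matches the two stated values (512 for 10 bits, 128 for 8 bits). The function names and the rounding of the mean are hypothetical.

```python
def bias_from_bit_depth(bit_depth):
    """Middle value of the range expressible with `bit_depth` bits.

    Assumed form: B = 1 << (D - 1). This reproduces the values given in
    the text (D = 10 -> 512, D = 8 -> 128) but is not quoted from it.
    """
    return 1 << (bit_depth - 1)


def bias_from_mean(samples):
    """Alternative: B as the rounded integer mean of a 2D block of samples.

    The block may be the luma prediction block or the set of samples fed
    to the convolution filter; the rounding rule is an assumption.
    """
    flat = [s for row in samples for s in row]
    return (sum(flat) + len(flat) // 2) // len(flat)


if __name__ == "__main__":
    print(bias_from_bit_depth(10), bias_from_bit_depth(8))
    print(bias_from_mean([[1, 2], [3, 4]]))
```

When B is signaled explicitly instead, neither function is needed; the decoder would simply read the value from the bitstream.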
- Equation 11 can be partially differentiated with respect to each of the filter coefficients (i.e., w0 to w5), and filter coefficients for which the result of the partial differentiation becomes 0 can be derived.
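Setting the partial derivatives of a squared error to zero leads to a linear system (the normal equations). The sketch below illustrates that idea under several assumptions that are not stated in the text: the error is a sum of squared differences, w5 multiplies B, samples are indexed as `block[j][i]` with `i` horizontal and `j` vertical, only interior positions of the reference blocks are used, and the luma and chroma reference blocks have the same size. All function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular (for example,
    when the reference samples are flat).
    """
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def filter_inputs(luma, i, j, bias):
    """Return [C, N, S, E, W, B] for the sample at column i, row j."""
    return [luma[j][i], luma[j - 1][i], luma[j + 1][i],
            luma[j][i + 1], luma[j][i - 1], bias]


def derive_filter_coefficients(luma_ref, chroma_ref, bias):
    """Least-squares fit of w0..w5 from co-located reference blocks."""
    rows, targets = [], []
    for j in range(1, len(luma_ref) - 1):
        for i in range(1, len(luma_ref[0]) - 1):
            rows.append(filter_inputs(luma_ref, i, j, bias))
            targets.append(chroma_ref[j][i])
    n = 6
    # Normal equations: (A^T A) w = A^T y
    ata = [[sum(r[p] * r[q] for r in rows) for q in range(n)]
           for p in range(n)]
    aty = [sum(r[p] * t for r, t in zip(rows, targets)) for p in range(n)]
    return solve_linear_system(ata, aty)
```

A practical codec would more likely use fixed-point arithmetic and a more robust solver; floating point is used here only to keep the sketch short.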
- a convolution filter can be applied to the luma block to obtain a prediction sample of the chroma block.
- the prediction sample of the chroma block can be derived by the following Equation 13.
- PredC represents a prediction sample of a chroma block.
- C' represents a reconstructed sample (i.e., recL[i][j]) at the same location as the chroma prediction sample in the luma block.
- N', S', E', and W' represent samples adjacent to C'. For example, N' may represent an upper adjacent sample of C', S' may represent a lower adjacent sample of C', E' may represent a right adjacent sample of C', and W' may represent a left adjacent sample of C'.
- the luma block can be downsampled. That is, C' can represent a downsampled reconstructed luma sample. Additionally, N', W', S' and E' can represent samples adjacent to C' within the downsampled luma block.
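Applying the derived filter to the reconstructed luma block can be sketched as below. Equation 13 is not reproduced in this text, so the weighted sum used here (w0 to w4 applied to C', N', S', E', W' and w5 applied to B) is an assumption based on the symbol definitions above. The one-sample padding around the luma block, the rounding, and the clipping to the bit-depth range are also assumptions, and the function name is hypothetical.

```python
def predict_chroma_block(luma_padded, coeffs, bias, bit_depth):
    """Predict a chroma block from a reconstructed (or downsampled) luma block.

    `luma_padded` is the luma block with a one-sample border on every side,
    so that N', S', E' and W' exist for every position. Samples are indexed
    as luma_padded[j][i] with i horizontal and j vertical.
    """
    max_value = (1 << bit_depth) - 1
    height = len(luma_padded) - 2
    width = len(luma_padded[0]) - 2
    pred = []
    for j in range(1, height + 1):
        row = []
        for i in range(1, width + 1):
            value = (coeffs[0] * luma_padded[j][i]        # C'
                     + coeffs[1] * luma_padded[j - 1][i]  # N'
                     + coeffs[2] * luma_padded[j + 1][i]  # S'
                     + coeffs[3] * luma_padded[j][i + 1]  # E'
                     + coeffs[4] * luma_padded[j][i - 1]  # W'
                     + coeffs[5] * bias)                  # B
            row.append(max(0, min(max_value, int(round(value)))))
        pred.append(row)
    return pred
```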
- prediction parameters and chroma prediction samples may be derived using a 1D shape, square or rectangular filter.
- the prediction parameters including the weight α and the offset β may be referred to as linear prediction parameters, and the prediction parameters including the filter coefficients of the convolution filter (e.g., w0 to w5) may be referred to as convolution prediction parameters.
- one of the linear prediction parameter and the convolution prediction parameter may be selected based on at least one of the chroma format, the size of the luma/chroma block, the bit depth, the mean value of the reconstructed samples in the luma block, or the slice type.
- the chroma format is 4:4:4
- the chroma block can be predicted using the linear prediction parameter.
- the chroma format is 4:2:2 or 4:2:0
- the chroma block can be predicted using the convolution prediction parameter.
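The chroma-format rule in this example can be written as a small selector. This is only a sketch of the one example given; the string labels and the function name are hypothetical, and the other listed criteria (block size, bit depth, mean luma value, slice type) are not modeled.

```python
def select_prediction_parameters(chroma_format):
    """Choose the parameter type from the chroma format, per the example."""
    if chroma_format == "4:4:4":
        return "linear"
    if chroma_format in ("4:2:2", "4:2:0"):
        return "convolution"
    raise ValueError("unsupported chroma format: " + chroma_format)
```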
- the linear prediction parameters and the convolution prediction parameters may be combined to obtain the prediction block of the chroma block.
- a first prediction block for the chroma block may be obtained based on the linear prediction parameters
- a second prediction block for the chroma block may be obtained based on the convolution prediction parameters. Thereafter, the first prediction block and the second prediction block may be averaged or weighted to derive the final prediction block of the chroma block.
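Combining the two prediction blocks can be sketched as an integer weighted average. The weights, the rounding offset, and the function name are assumptions; with equal weights the function reduces to a plain average.

```python
def blend_predictions(first, second, weight_first=1, weight_second=1):
    """Weighted average of two equally sized prediction blocks.

    `first` would come from the linear prediction parameters and `second`
    from the convolution prediction parameters. Integer arithmetic with a
    rounding offset is used, which is an assumption of this sketch.
    """
    total = weight_first + weight_second
    return [
        [(weight_first * a + weight_second * b + total // 2) // total
         for a, b in zip(row_first, row_second)]
        for row_first, row_second in zip(first, second)
    ]
```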
- color component prediction based on prediction parameters can be performed on a sub-block basis.
- Figure 34 is a diagram for explaining an example in which color component prediction based on prediction parameters is performed on a sub-block basis.
- a chroma block can be divided into multiple sub-blocks, and prediction parameters can be derived independently for each chroma sub-block.
- a reference block for the first luma sub-block can be determined based on motion information of the first luma sub-block within the luma block.
- the reference block for the first luma sub-block can be a prediction block of the first luma sub-block.
- a reference block can be determined for the first chroma sub-block within a chroma block.
- the reference block of the first chroma sub-block can be a block at the same location as the reference block of the first luma sub-block within the reference picture.
- motion information of the first chroma sub-block can be derived from motion information of the first luma sub-block, and a reference block of the first chroma sub-block can be derived based on the motion information of the first chroma sub-block.
- a first prediction parameter for the first chroma sub-block can be derived using the reference block of the first luma sub-block and the reference block of the first chroma sub-block.
- the second to fourth prediction parameters can be derived for the second to fourth chroma sub-blocks.
- a prediction block of the chroma sub-block can be derived. For example, by applying a first prediction parameter to a first luma sub-block in a luma block, a prediction block for a first chroma sub-block in the chroma block can be obtained. In addition, by applying a second prediction parameter to a second luma sub-block in the luma block, a prediction block for a second chroma sub-block in the chroma block can be obtained.
- a prediction block for a third chroma sub-block can be obtained by applying a third prediction parameter to a third luma sub-block
- a prediction block for a fourth chroma sub-block can be obtained by applying a fourth prediction parameter to a fourth luma sub-block.
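The sub-block procedure above can be sketched as follows. To stay short, the sketch makes several assumptions that are not in the text: 4:4:4 sampling (luma and chroma blocks of equal size), linear prediction parameters (α, β) fitted by least squares, and reference blocks that have already been fetched per sub-block and laid out at the sub-block's own position. All function names are hypothetical.

```python
def derive_linear_parameters(luma_ref, chroma_ref):
    """Least-squares fit of chroma = alpha * luma + beta."""
    xs = [v for row in luma_ref for v in row]
    ys = [v for row in chroma_ref for v in row]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    var = sum((x - mean_x) ** 2 for x in xs)
    if var == 0:  # flat luma reference: fall back to the chroma mean
        return 0.0, mean_y
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = cov / var
    return alpha, mean_y - alpha * mean_x


def crop(block, x0, y0, width, height):
    """Extract a width x height sub-block whose top-left corner is (x0, y0)."""
    return [row[x0:x0 + width] for row in block[y0:y0 + height]]


def predict_chroma_by_sub_blocks(luma, luma_ref, chroma_ref, sub_w, sub_h):
    """Derive parameters per sub-block and apply them to the luma sub-block."""
    height, width = len(luma), len(luma[0])
    pred = [[0] * width for _ in range(height)]
    for y0 in range(0, height, sub_h):
        for x0 in range(0, width, sub_w):
            alpha, beta = derive_linear_parameters(
                crop(luma_ref, x0, y0, sub_w, sub_h),
                crop(chroma_ref, x0, y0, sub_w, sub_h))
            for y in range(y0, min(y0 + sub_h, height)):
                for x in range(x0, min(x0 + sub_w, width)):
                    pred[y][x] = int(round(alpha * luma[y][x] + beta))
    return pred
```

With a 4x4 block and 2x2 sub-blocks this produces four independent parameter sets, matching the first to fourth prediction parameters described above.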
- the size of a sub-block may be predefined in the encoder and decoder.
- information indicating the size of the sub-block can be encoded and signaled via the upper header.
- whether color component prediction based on prediction parameters is performed on a sub-block basis can be determined based on at least one of the size of a luma/chroma block, the number of samples included in a sub-block generated when dividing the block, or a chroma format.
- whether color component prediction based on prediction parameters is performed on a sub-block basis can be determined based on whether the inter prediction of the luma block is performed on a sub-block basis. For example, if the luma block is encoded/decoded based on an affine model or SbTMVP, color component prediction based on prediction parameters can be performed on a sub-block basis. On the other hand, if the luma block is encoded/decoded based on a translational motion model, color component prediction can be performed by deriving the prediction parameters at the block level.
- prediction parameters can also be derived at the block level.
- if a luma sub-block corresponding to a chroma sub-block is not encoded with inter prediction, or no motion vector is stored for the luma sub-block, it may not be possible to derive prediction parameters for the chroma sub-block.
- in this case, prediction parameters can be derived at the block level, and the derived block-level prediction parameters can be used as the prediction parameters of the chroma sub-blocks.
- the prediction parameters at the block level can be derived based on the correlation between the reference block of the luma block and the reference block of the chroma block.
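The choice between sub-block-level and block-level parameters described above can be sketched as a small decision function. The data structure, the mode labels, and the function name are hypothetical; only the decision logic follows the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class LumaSubBlock:
    """Minimal stand-in for the coding information of a luma sub-block."""
    is_inter: bool
    motion_vector: Optional[Tuple[int, int]]


def parameter_level(luma_mode, luma_sub_block):
    """Return "sub-block" or "block" for one chroma sub-block.

    Sub-block-level parameters are used only when the luma block uses
    sub-block-based inter prediction (affine or SbTMVP) and the
    corresponding luma sub-block is inter coded with a stored motion
    vector. Otherwise block-level parameters are used as a fallback.
    """
    if luma_mode not in ("affine", "sbtmvp"):
        return "block"
    if not luma_sub_block.is_inter or luma_sub_block.motion_vector is None:
        return "block"
    return "sub-block"
```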
- Information indicating whether color component prediction based on prediction parameters is applied to a chroma block may be encoded and signaled.
- the information may be a 1-bit flag.
- color component prediction based on prediction parameters may be allowed only when certain conditions are satisfied.
- the certain conditions may be determined based on at least one of the number of transformed and quantized coefficients, the values of the transformed and quantized coefficients, the number of samples in a luma block, the values of the samples in a luma block, the chroma format, or whether bidirectional prediction is performed.
- the transformed and quantized coefficients may also be referred to as residual coefficients.
- color component prediction based on prediction parameters can be performed.
- color component prediction based on prediction parameters can be applied only if the number of non-zero transformed and quantized coefficients in the luma block is greater than or equal to a threshold.
- color component prediction based on prediction parameters can be performed only if all samples within the luma block have non-zero values.
- color component prediction based on prediction parameters can be performed only if the values of all residual samples within the luma block are not 0.
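The coefficient-count and non-zero-sample conditions above can be sketched as simple checks. The function names are hypothetical, and the threshold is left as a parameter because the text does not give a value.

```python
def count_nonzero(values):
    """Number of non-zero entries in a 2D block."""
    return sum(1 for row in values for v in row if v != 0)


def allowed_by_coefficient_count(luma_coefficients, threshold):
    """True if the luma block has at least `threshold` non-zero
    transformed and quantized (residual) coefficients."""
    return count_nonzero(luma_coefficients) >= threshold


def allowed_by_nonzero_values(luma_values):
    """True if every value in the block is non-zero.

    The same check applies to luma samples or to luma residual samples,
    depending on which condition is in use.
    """
    return all(v != 0 for row in luma_values for v in row)
```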
- color component prediction based on prediction parameters can be performed only when at least one of AMVP mode, merge mode, template matching or bilateral matching is applied to the luma block.
- prediction parameters for a chroma sub-block can be derived only if all residual samples in the luma sub-block have non-zero values. If at least one residual sample in the luma sub-block is zero, or if all residual samples are zero, it may not be possible to derive prediction parameters for the chroma sub-block.
- prediction parameters derived at the block level or prediction parameters of a neighboring chroma sub-block can be set as the prediction parameters of the chroma sub-block.
- color component prediction based on prediction parameters can also be applied when the intra block copy (IBC) mode is applied to the luma block.
- the reference block of the luma block and the reference block of the chroma block may exist in the previously reconstructed area of the current picture.
- the prediction parameters of a chroma block can be derived based on the correlation between a reference block in the current luma picture and a reference block in the current chroma picture.
- the reference region for deriving prediction parameters may be set differently based on the encoding mode of the corresponding luma block.
- when a luma block is encoded with inter prediction, a reference block of the luma block can be derived from a reference picture based on motion information of the luma block, and a reference block of the chroma block can be derived from a reference picture based on motion information of the chroma block. Thereafter, a prediction parameter can be derived based on a correlation between the reference block of the luma block and the reference block of the chroma block.
- the prediction parameters can be derived based on the correlation between the templates adjacent to the luma block and the templates adjacent to the chroma block.
- each of the components (e.g., units, modules, etc.) constituting the block diagram in the above-described disclosure may be implemented as a hardware device or software, or a plurality of components may be combined to be implemented as a single hardware device or software.
- the hardware device may include at least one of a processor for performing a calculation, a memory for storing data, a transmitter for transmitting data, and a receiver for receiving data.
- the above-described disclosure may be implemented in the form of program commands that can be executed through various computer components and recorded on a computer-readable recording medium.
- the computer-readable recording medium may include program commands, data files, data structures, etc., singly or in combination.
- a computer-readable recording medium storing a bitstream generated by the above-described encoding method.
- the bitstream can be transmitted by an encoding device, and a decoding device can receive the bitstream and decode an image.
- Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specifically configured to store and execute program instructions, such as ROMs, RAMs, and flash memories.
- the hardware devices may be configured to operate as one or more software modules to perform processing according to the present disclosure, and vice versa.
- the present disclosure may be applied to a computing or electronic device capable of encoding/decoding a video signal.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
An image decoding method according to the present disclosure may comprise the steps of: deriving a first reference block of a luma block that is at the same position as a chroma block; deriving a second reference block of the chroma block; deriving a prediction parameter on the basis of the first reference block and the second reference block; and applying the prediction parameter to the luma block to obtain a prediction block for the chroma block.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202480033627.9A CN121359442A (zh) | 2023-06-27 | 2024-06-26 | Image encoding/decoding method and recording medium for storing a bitstream |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR20230083018 | 2023-06-27 | ||
| KR10-2023-0083018 | 2023-06-27 | ||
| KR1020240082641A KR20250000891A (ko) | 2023-06-27 | 2024-06-25 | Image encoding/decoding method and recording medium storing a bitstream |
| KR10-2024-0082641 | 2024-06-25 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025005664A1 (fr) | 2025-01-02 |
Family
ID=93939283
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/KR2024/008916 Ceased WO2025005664A1 (fr) | 2023-06-27 | 2024-06-26 | Image encoding/decoding method and recording medium for storing a bitstream |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN121359442A (fr) |
| WO (1) | WO2025005664A1 (fr) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20220038738A1 (en) * | 2018-01-24 | 2022-02-03 | Vid Scale, Inc. | Generalized bi-prediction for video coding with reduced coding complexity |
| KR102398997B1 (ko) * | 2011-06-21 | 2022-05-17 | 한국전자통신연구원 | Inter prediction method and apparatus therefor |
| KR20230070198A (ko) * | 2017-06-09 | 2023-05-22 | 한국전자통신연구원 | Image encoding/decoding method and apparatus, and recording medium storing a bitstream |
| KR20230093063A (ko) * | 2018-10-04 | 2023-06-26 | 엘지전자 주식회사 | CCLM-based intra prediction method and apparatus therefor |
-
2024
- 2024-06-26 WO PCT/KR2024/008916 patent/WO2025005664A1/fr not_active Ceased
- 2024-06-26 CN CN202480033627.9A patent/CN121359442A/zh active Pending
Non-Patent Citations (1)
| Title |
|---|
| P. ASTOLA (NOKIA), J. LAINEMA (NOKIA): "AHG12: Cross-component residual model (CCRM) for inter prediction", 30. JVET MEETING; 20230421 - 20230428; ANTALYA; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), 21 April 2023 (2023-04-21), XP030308741 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN121359442A (zh) | 2026-01-16 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24832442; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |