WO2024253465A1 - Image encoding/decoding method and apparatus, and recording medium for storing a bitstream - Google Patents


Publication number
WO2024253465A1
Authority
WO
WIPO (PCT)
Prior art keywords
block
prediction
current
prediction block
weight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/KR2024/007817
Other languages
English (en)
Korean (ko)
Inventor
최정아
허진
박승욱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hyundai Motor Co
Kia Corp
Original Assignee
Hyundai Motor Co
Kia Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hyundai Motor Co, Kia Corp
Priority to CN202480027475.1A (published as CN121058240A)
Priority claimed from KR1020240074453A (published as KR20240174507A)
Publication of WO2024253465A1
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • the present invention relates to a video encoding/decoding method, a device, and a recording medium storing a bitstream. Specifically, the present invention relates to a video encoding/decoding method, a device, and a recording medium storing a bitstream based on an improved intra-block copy (IBC)-based prediction method.
  • IBC: intra-block copy
  • intra block copy is a technique that searches an already restored area of the current picture for a prediction block for the current coding unit (CU) block (the current block).
  • Intra block copy prediction has high prediction accuracy for screen contents with repeated similar shapes.
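
The search idea behind intra block copy can be sketched in a few lines. This is a minimal illustration, not the method claimed in this document: the function names, the exhaustive search, and the rule that only rows fully above the current block count as "already restored" are all assumptions made for the example.

```python
def sad(picture, x0, y0, x1, y1, w, h):
    """Sum of absolute differences between two w x h blocks of one picture."""
    return sum(
        abs(picture[y0 + j][x0 + i] - picture[y1 + j][x1 + i])
        for j in range(h)
        for i in range(w)
    )


def ibc_search(picture, cx, cy, w, h):
    """Exhaustively search the restored area for the best matching block.

    Simplifying assumption: only blocks lying entirely above the current
    block (rows < cy) are treated as restored. Real codecs define the
    reference area more carefully.
    Returns (block_vector, cost) with block_vector = (dx, dy).
    """
    best_bv, best_cost = None, None
    width = len(picture[0])
    for ry in range(0, cy - h + 1):
        for rx in range(0, width - w + 1):
            cost = sad(picture, cx, cy, rx, ry, w, h)
            if best_cost is None or cost < best_cost:
                best_bv, best_cost = (rx - cx, ry - cy), cost
    return best_bv, best_cost


# Screen-content style picture: the 2x2 pattern at the top-left repeats
# at position (2, 2), which plays the role of the current block.
picture = [
    [1, 2, 9, 9, 5, 6],
    [3, 4, 9, 9, 7, 8],
    [0, 0, 1, 2, 0, 0],
    [0, 0, 3, 4, 0, 0],
]
bv, cost = ibc_search(picture, 2, 2, 2, 2)
```

The returned block vector points from the current block to the matching block inside the same picture, which is what distinguishes it from a motion vector into a reference picture.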
  • encoding efficiency can be improved by applying intra block copy prediction. Therefore, various tools for intra block copy prediction are being discussed in order to improve encoding efficiency when only intra prediction is applicable.
  • the purpose of the present invention is to provide a video encoding/decoding method and device with improved encoding/decoding efficiency.
  • the present invention aims to provide a recording medium storing a bitstream generated by an image encoding method or device according to the present invention.
  • the present invention aims to provide a prediction method for solving the problems of the existing binary prediction intra block copy-based prediction as described above.
  • a video decoding method includes the steps of generating a first prediction block of a current block based on a first block vector, generating a second prediction block of the current block based on a second block vector, and generating a final prediction block of the current block by weighting the first prediction block and the second prediction block, wherein the first prediction block is generated based on a first matching block in a current picture indicated by the first block vector, and the second prediction block is generated based on a second matching block in the current picture indicated by the second block vector.
  • the first block vector is derived by performing block vector prediction
  • the second block vector is determined based on an intra block copy merge candidate list and an index indicating one merge candidate.
  • the first block vector and the second block vector may be determined based on an intra block copy merge candidate list, a first merge index and a second merge index indicating different merge candidates.
  • the final prediction block of the current block is generated by adding the first prediction block to which the first weight is applied and the second prediction block to which the second weight is applied.
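
The weighted combination of two intra block copy predictions can be sketched as follows. This is an illustrative sketch only: the helper names, the integer weights that sum to 8, and the rounding shift are assumptions chosen for the example and are not taken from this document.

```python
def copy_block(picture, x, y, bv, w, h):
    """Return the w x h matching block that block vector bv = (dx, dy) points to."""
    dx, dy = bv
    return [[picture[y + dy + j][x + dx + i] for i in range(w)] for j in range(h)]


def blend(pred1, pred2, w1, w2, shift=3):
    """Integer weighted sum: (w1 * p1 + w2 * p2 + rounding offset) >> shift.

    Assumes w1 + w2 == 1 << shift so the result stays in the sample range.
    """
    assert w1 + w2 == 1 << shift
    offset = 1 << (shift - 1)
    return [
        [(w1 * a + w2 * b + offset) >> shift for a, b in zip(r1, r2)]
        for r1, r2 in zip(pred1, pred2)
    ]


def bi_ibc_predict(picture, x, y, w, h, bv1, bv2, w1=4, w2=4):
    """Final prediction from two matching blocks in the current picture."""
    p1 = copy_block(picture, x, y, bv1, w, h)
    p2 = copy_block(picture, x, y, bv2, w, h)
    return blend(p1, p2, w1, w2)


# Current block: 2x1 samples at (2, 1); both matching blocks lie in row 0.
picture = [
    [10, 20, 30, 40],
    [50, 60, 0, 0],
]
equal = bi_ibc_predict(picture, 2, 1, 2, 1, (-2, -1), (0, -1))
skewed = bi_ibc_predict(picture, 2, 1, 2, 1, (-2, -1), (0, -1), w1=6, w2=2)
```

Using integer weights with a right shift avoids floating-point arithmetic, which is the usual choice when encoder and decoder must produce bit-identical predictions.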
  • the first weight and the second weight are determined as preset values.
  • the first weight and the second weight may be determined based on an index indicating one weight candidate among a plurality of preset weight candidates.
  • the codeword assigned to each of the plurality of preset weight candidates is determined based on the usage frequency of each of the plurality of weight candidates.
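
One way to assign codewords by usage frequency is sketched below. The truncated unary binarization, the function name, and the example weight pairs and counts are assumptions made for illustration; the document does not specify the codeword structure.

```python
def assign_codewords(usage_counts):
    """Map each weight candidate to a truncated unary codeword.

    usage_counts: dict {candidate: count}. More frequently used candidates
    receive shorter codewords. Ties keep the dict's insertion order because
    sorted() is stable. The last candidate needs no terminating "0".
    """
    ranked = sorted(usage_counts, key=lambda c: -usage_counts[c])
    last = len(ranked) - 1
    return {
        cand: "1" * rank + ("" if rank == last else "0")
        for rank, cand in enumerate(ranked)
    }


# Hypothetical (first weight, second weight) candidates and usage counts.
counts = {(5, 3): 30, (4, 4): 50, (6, 2): 5, (3, 5): 15}
codewords = assign_codewords(counts)
```

With this ordering the most frequent candidate costs one bin, so the average index cost drops when the usage distribution is skewed.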
  • the first weight and the second weight are determined based on a first index indicating one weight candidate set from among a plurality of weight candidate sets and a second index indicating one weight candidate from among a plurality of weight candidates included in one weight candidate set.
  • the first index indicates a set of weight candidates used in at least one group among a picture group, a slice group, and a block group
  • the second index indicates a weight candidate used in at least one unit among a picture, a slice, and a block.
  • the at least one unit is a unit of a lower layer corresponding to the at least one group.
  • the first weight and the second weight may be determined based on a first distortion value, which is a distortion value between an adjacent current template of the current block and a first reference template adjacent to the first matching block, and a second distortion value, which is a distortion value between an adjacent current template of the current block and a second reference template adjacent to the second matching block.
  • the first distortion value and the second distortion value are measured using one of a SAD (sum of absolute differences) measurement method and a SSE (sum of square error) measurement method.
  • the first weight and the second weight are determined based on a look-up table corresponding to the first distortion value and the second distortion value.
  • the shape of the current template may be determined as one of a first template shape including samples adjacent to the left side of the current block, a second template shape including samples adjacent to the top of the current block, and a third template shape including samples adjacent to the left side of the current block and samples adjacent to the top of the current block, and the shapes of the first reference template and the second reference template may be characterized in that they correspond to the shape of the current template.
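
The template-based distortion measurement can be sketched as follows. The one-sample template thickness, the function names, and in particular the mapping from distortions to weights are assumptions: the document refers to a look-up table whose contents are not given here, so a simple inverse-proportional rule stands in for it.

```python
def template_positions(x, y, w, h, shape):
    """Sample positions of a one-sample-thick template around block (x, y, w, h).

    shape is "left", "top", or "left_top".
    """
    pos = []
    if shape in ("left", "left_top"):
        pos += [(x - 1, y + j) for j in range(h)]
    if shape in ("top", "left_top"):
        pos += [(x + i, y - 1) for i in range(w)]
    return pos


def template_distortion(picture, x, y, w, h, bv, shape="left_top", metric="sad"):
    """SAD or SSE between the current template and the reference template
    found by shifting every template position by block vector bv."""
    dx, dy = bv
    total = 0
    for px, py in template_positions(x, y, w, h, shape):
        diff = picture[py][px] - picture[py + dy][px + dx]
        total += abs(diff) if metric == "sad" else diff * diff
    return total


def weights_from_distortions(d1, d2, total=8):
    """Hypothetical stand-in for the look-up table: the prediction with the
    lower template distortion receives the larger weight."""
    if d1 + d2 == 0:
        return total // 2, total // 2
    w1 = (total * d2 + (d1 + d2) // 2) // (d1 + d2)
    return w1, total - w1


picture = [
    [5, 5, 5, 9, 9, 9],
    [5, 1, 2, 7, 1, 2],
]
# Current block: 2x1 at (4, 1); block vector (-3, 0) points to the block at (1, 1).
d_sad = template_distortion(picture, 4, 1, 2, 1, (-3, 0))
d_sse = template_distortion(picture, 4, 1, 2, 1, (-3, 0), metric="sse")
```

Because the templates consist of already restored samples, a decoder can repeat the same measurement and arrive at the same weights without extra signaling.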
  • the final prediction block of the current block may be generated by weighting the first prediction block and the second prediction block based on a comparison result between a threshold value derived based on the first distortion value and the second distortion value.
  • the second distortion value may be a value smaller than the first distortion value, and when the second distortion value is smaller than or equal to the threshold value, the final prediction block of the current block may be generated by weighting the first prediction block and the second prediction block.
  • the second distortion value may be a value greater than the first distortion value, and when the second distortion value is less than the threshold value, the final prediction block of the current block may be generated by weighting the first prediction block and the second prediction block.
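
The threshold test can be illustrated with a short sketch. Several details are assumptions made only for this example: the threshold is taken as a fixed multiple of the smaller distortion, blending uses equal weights, and the fallback when the test fails is the lower-distortion prediction alone.

```python
def predict_with_threshold(pred1, pred2, d1, d2, scale=2):
    """Blend only when the worse candidate is still close to the better one.

    Assumptions: threshold = scale * min(d1, d2); equal weights when blending;
    otherwise return a copy of the lower-distortion prediction.
    """
    better, worse = (pred1, pred2) if d1 <= d2 else (pred2, pred1)
    threshold = scale * min(d1, d2)
    if max(d1, d2) < threshold:
        return [
            [(a + b + 1) >> 1 for a, b in zip(r1, r2)]
            for r1, r2 in zip(better, worse)
        ]
    return [row[:] for row in better]


pred1 = [[10, 20]]
pred2 = [[30, 41]]
blended = predict_with_threshold(pred1, pred2, d1=10, d2=15)
single = predict_with_threshold(pred1, pred2, d1=10, d2=25)
```

The intent is that a second prediction with a much higher template distortion is unlikely to improve the weighted result, so it is left out.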
  • a video encoding method includes the steps of generating a first prediction block of a current block based on a first block vector, generating a second prediction block of the current block based on a second block vector, and generating a final prediction block of the current block by weighting the first prediction block and the second prediction block, wherein the first prediction block is generated based on a first matching block in a current picture indicated by the first block vector, and the second prediction block is generated based on a second matching block in the current picture indicated by the second block vector.
  • a non-transitory computer-readable recording medium can store a bitstream generated by a video encoding method, comprising the steps of: generating a first prediction block of a current block based on a first block vector; generating a second prediction block of the current block based on a second block vector; and generating a final prediction block of the current block by weighting the first prediction block and the second prediction block, wherein the first prediction block is generated based on a first matching block in the current picture indicated by the first block vector, and the second prediction block is generated based on a second matching block in the current picture indicated by the second block vector.
  • a transmission method comprises the steps of transmitting the bitstream, generating a first prediction block of a current block based on a first block vector, generating a second prediction block of the current block based on a second block vector, and generating a final prediction block of the current block by weighting the first prediction block and the second prediction block, wherein the first prediction block is generated based on a first matching block in a current picture indicated by the first block vector, and the second prediction block is generated based on a second matching block in the current picture indicated by the second block vector.
  • a video encoding/decoding method and device with improved encoding/decoding efficiency can be provided.
  • a binary prediction intra block copy-based prediction method using weighted sum can be provided.
  • a method can be provided for judging the efficiency of a binary prediction intra block copy-based prediction method based on distortion values of different prediction signals, and determining whether to apply the binary prediction intra block copy-based prediction method based on the judgment result.
  • FIG. 1 is a block diagram showing the configuration according to one embodiment of an encoding device to which the present invention is applied.
  • FIG. 2 is a block diagram showing the configuration of one embodiment of a decoding device to which the present invention is applied.
  • FIG. 3 is a diagram schematically showing a video coding system to which the present invention can be applied.
  • FIG. 4 is a diagram for explaining an intra block copy-based prediction method according to an embodiment of the present invention.
  • FIG. 5 is a diagram for explaining a binary prediction intra block copy-based prediction method according to one embodiment of the present invention.
  • FIG. 6 is a diagram for explaining a binary prediction intra block copy-based prediction method according to another embodiment of the present invention.
  • FIG. 7 is a diagram for explaining a plurality of template shape candidates used in a process of deriving distortion values for predicted values according to one embodiment of the present invention.
  • FIG. 8 is a flowchart illustrating an intra block copy prediction method according to an embodiment of the present invention.
  • FIG. 9 is a drawing exemplarily showing a content streaming system to which an embodiment according to the present invention can be applied.
  • first, second, etc. may be used to describe various components, but the components should not be limited by the terms. The terms are only used for the purpose of distinguishing one component from another.
  • the first component may be referred to as the second component, and similarly, the second component may also be referred to as the first component.
  • the term and/or includes a combination of a plurality of related described items or any item among a plurality of related described items.
  • each component shown in the embodiments of the present invention are independently depicted to indicate different characteristic functions, and do not mean that each component is formed as a separate hardware or software configuration unit. That is, each component is listed and included as a separate component for convenience of explanation, and at least two components among each component may be combined to form a single component, or one component may be divided into multiple components to perform a function, and such integrated embodiments and separate embodiments of each component are also included in the scope of the present invention as long as they do not deviate from the essence of the present invention.
  • the terminology used in the present invention is only used to describe specific embodiments and is not intended to limit the present invention.
  • the singular expression includes the plural expression unless the context clearly indicates otherwise.
  • some components of the present invention are not essential components that perform essential functions in the present invention and may be optional components that merely enhance performance.
  • the present invention may be implemented by including only essential components for implementing the essence of the present invention excluding components used only for enhancing performance, and a structure including only essential components excluding optional components used only for enhancing performance is also included in the scope of the present invention.
  • the term "at least one" can mean one of a number greater than or equal to 1, such as 1, 2, 3, and 4.
  • the term "a plurality of" can mean one of a number greater than or equal to 2, such as 2, 3, and 4.
  • video may mean one picture constituting a video, and may also represent the video itself.
  • encoding and/or decoding of a video may mean "encoding and/or decoding of the video itself," and may also mean "encoding and/or decoding of one of the images constituting the video."
  • the target image may be an encoding target image that is a target of encoding and/or a decoding target image that is a target of decoding.
  • the target image may be an input image input to an encoding device and may be an input image input to a decoding device.
  • the target image may have the same meaning as the current image.
  • encoder and image encoding device may be used interchangeably and have the same meaning.
  • decoder and image decoding device may be used interchangeably and have the same meaning.
  • image may be used with the same meaning and may be used interchangeably.
  • target block may be an encoding target block that is a target of encoding and/or a decoding target block that is a target of decoding.
  • target block may be a current block that is a target of current encoding and/or decoding.
  • target block and current block may be used with the same meaning and may be used interchangeably.
  • a coding tree unit may be composed of one luma component (Y) coding tree block (CTB) and two chroma component (Cb, Cr) coding tree blocks related to it.
  • sample may represent a basic unit constituting a block.
  • FIG. 1 is a block diagram showing the configuration according to one embodiment of an encoding device to which the present invention is applied.
  • the encoding device (100) may be an encoder, a video encoding device, or an image encoding device.
  • the video may include one or more images.
  • the encoding device (100) may sequentially encode one or more images.
  • an encoding device (100) may include an image segmentation unit (110), an intra prediction unit (120), a motion prediction unit (121), a motion compensation unit (122), a switch (115), a subtractor (113), a transformation unit (130), a quantization unit (140), an entropy encoding unit (150), an inverse quantization unit (160), an inverse transformation unit (170), an adder (117), a filter unit (180), and a reference picture buffer (190).
  • the encoding device (100) can generate a bitstream including encoded information through encoding an input image, and output the generated bitstream.
  • the generated bitstream can be stored in a computer-readable recording medium, or can be streamed through a wired/wireless transmission medium.
  • the image segmentation unit (110) can segment the input video into various forms to increase the efficiency of video encoding/decoding. That is, the input video is composed of multiple pictures, and one picture can be hierarchically segmented and processed for compression efficiency, parallel processing, etc. For example, one picture can be segmented into one or multiple tiles or slices, and then segmented again into multiple CTUs (Coding Tree Units). Alternatively, one picture can be segmented into multiple sub-pictures defined as groups of rectangular slices, and each sub-picture can be segmented into the tiles/slices. Here, the sub-pictures can be utilized to support the function of partially independently encoding/decoding and transmitting the picture.
  • because multiple sub-pictures can be individually restored, they have the advantage of being easy to edit in applications that configure multi-channel input into one picture.
  • tiles can be segmented horizontally to generate bricks.
  • a brick can be utilized as a basic unit of intra-picture parallel processing.
  • one CTU can be recursively split into a quad tree (QT: Quadtree), and the terminal node of the split can be defined as a CU (Coding Unit).
  • the CU can be split into a prediction unit (PU) and a transformation unit (TU) to perform prediction and transformation. Meanwhile, the CU can be utilized as a prediction unit and/or a transformation unit itself.
  • each CTU can be recursively split into not only a quad tree (QT) but also a multi-type tree (MTT: Multi-Type Tree).
  • MTT: Multi-Type Tree
  • Splitting of a CTU into a multi-type tree can start from the terminal node of a QT, and the MTT can be composed of a BT (Binary Tree) and a TT (Ternary Tree).
  • the MTT structure can be distinguished into vertical binary split mode (SPLIT_BT_VER), horizontal binary split mode (SPLIT_BT_HOR), vertical ternary split mode (SPLIT_TT_VER), and horizontal ternary split mode (SPLIT_TT_HOR).
  • the minimum block size (MinQTSize) of the quad tree of the luma block during splitting can be set to 16x16.
  • the maximum block size (MaxBtSize) of the binary tree can be set to 128x128, and the maximum block size (MaxTtSize) of the ternary tree can be set to 64x64.
  • the minimum block size (MinBtSize) of the binary tree and the minimum block size (MinTtSize) of the ternary tree can be set to 4x4.
  • the maximum depth (MaxMttDepth) of the multi-type tree can be set to 4.
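
The split modes and size limits above can be made concrete with a short sketch. The split-mode names follow the text, except `SPLIT_QT`, which is a hypothetical label added here for the quad-tree split. The 1:2:1 ratio of the three-way split and the reading of the minimum sizes as limits on the resulting child blocks are assumptions of this example.

```python
MAX_BT_SIZE, MAX_TT_SIZE = 128, 64
MIN_BT_SIZE, MIN_TT_SIZE = 4, 4
MAX_MTT_DEPTH = 4


def split_block(x, y, w, h, mode):
    """Return the child rectangles (x, y, w, h) produced by one split mode."""
    if mode == "SPLIT_QT":  # hypothetical name for the quad-tree split
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if mode == "SPLIT_BT_VER":
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if mode == "SPLIT_BT_HOR":
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    if mode == "SPLIT_TT_VER":  # assumed 1:2:1 ratio
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    if mode == "SPLIT_TT_HOR":  # assumed 1:2:1 ratio
        q = h // 4
        return [(x, y, w, q), (x, y + q, w, 2 * q), (x, y + 3 * q, w, q)]
    raise ValueError(mode)


def allowed_mtt_splits(w, h, mtt_depth):
    """List the multi-type tree splits permitted by the size and depth limits
    (simplified interpretation; real codecs apply further restrictions)."""
    modes = []
    if mtt_depth >= MAX_MTT_DEPTH:
        return modes
    if max(w, h) <= MAX_BT_SIZE:
        if w // 2 >= MIN_BT_SIZE:
            modes.append("SPLIT_BT_VER")
        if h // 2 >= MIN_BT_SIZE:
            modes.append("SPLIT_BT_HOR")
    if max(w, h) <= MAX_TT_SIZE:
        if w // 4 >= MIN_TT_SIZE:
            modes.append("SPLIT_TT_VER")
        if h // 4 >= MIN_TT_SIZE:
            modes.append("SPLIT_TT_HOR")
    return modes


children = split_block(0, 0, 16, 16, "SPLIT_TT_VER")
```

An encoder would typically evaluate each permitted split recursively and keep the partitioning with the lowest rate-distortion cost.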
  • a dual tree that uses different CTU split structures for luma and chrominance components can be applied to improve the encoding efficiency of the I slice.
  • the luminance and chrominance CTBs (Coding Tree Blocks) within the CTU can be split into a single tree sharing the coding tree structure.
  • the encoding device (100) may perform encoding on the input image in the intra mode and/or the inter mode.
  • the encoding device (100) may perform encoding on the input image in a third mode (e.g., IBC mode, Palette mode, etc.) other than the intra mode and the inter mode.
  • the third mode may be classified as the intra mode or the inter mode for convenience of explanation. In the present invention, the third mode will be classified and described separately only when a specific explanation is required.
  • when the intra mode is used as the prediction mode, the switch (115) can be switched to intra, and when the inter mode is used as the prediction mode, the switch (115) can be switched to inter.
  • the intra mode can mean an intra-screen prediction mode
  • the inter mode can mean an inter-screen prediction mode.
  • the encoding device (100) can generate a prediction block for an input block of an input image.
  • the encoding device (100) can encode a residual block using a residual of the input block and the prediction block.
  • the input image can be referred to as a current image which is a current encoding target.
  • the input block can be referred to as a current block which is a current encoding target or an encoding target block.
  • the intra prediction unit (120) can use samples of blocks already encoded/decoded around the current block as reference samples.
  • the intra prediction unit (120) can perform spatial prediction on the current block using the reference sample, and can generate prediction samples for the input block through spatial prediction.
  • intra prediction can mean prediction within the screen.
  • non-directional prediction modes such as DC mode and Planar mode and directional prediction modes (e.g., 65 directions) can be applied.
  • the intra prediction method can be expressed as an intra prediction mode or an intra-screen prediction mode.
  • the motion prediction unit (121) can search the reference image for the area that best matches the input block during the motion prediction process, and can derive a motion vector using the area found. In this case, a search area can be used as the area to be searched.
  • the reference image can be stored in the reference picture buffer (190).
  • when encoding/decoding of the reference image has been processed, the reference image can be stored in the reference picture buffer (190).
  • the motion compensation unit (122) can generate a prediction block for the current block by performing motion compensation using a motion vector.
  • inter prediction can mean inter-screen prediction or motion compensation.
  • the above motion prediction unit (121) and motion compensation unit (122) can generate a prediction block by applying an interpolation filter to a portion of an area within a reference image when the value of a motion vector does not have an integer value.
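
How a prediction sample is obtained at a fractional position can be sketched with bilinear interpolation. This is a deliberate simplification: practical codecs use longer separable filters, and the quarter-sample vector precision, the function name, and the edge clamping are assumptions of this example.

```python
def interpolate_sample(ref, x4, y4):
    """Bilinear sample at quarter-sample position (x4 / 4, y4 / 4) in ref.

    x4 and y4 are non-negative coordinates in quarter-sample units.
    """
    xi, yi = x4 >> 2, y4 >> 2          # integer part of the position
    fx, fy = x4 & 3, y4 & 3            # fractional part in quarter samples
    x1 = min(xi + 1, len(ref[0]) - 1)  # clamp at the picture edge
    y1 = min(yi + 1, len(ref) - 1)
    top = (4 - fx) * ref[yi][xi] + fx * ref[yi][x1]
    bot = (4 - fx) * ref[y1][xi] + fx * ref[y1][x1]
    return ((4 - fy) * top + fy * bot + 8) >> 4  # divide by 16 with rounding


ref = [
    [0, 40],
    [80, 120],
]
half_x = interpolate_sample(ref, 2, 0)   # halfway between 0 and 40
center = interpolate_sample(ref, 2, 2)   # centre of the four samples
```

When both fractional parts are zero the function returns the stored integer sample unchanged, so integer-valued vectors need no filtering.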
  • the AFFINE mode and the SbTMVP (Subblock-based Temporal Motion Vector Prediction) mode of sub-PU based prediction, and the MMVD (Merge with MVD) mode and the GPM (Geometric Partitioning Mode) mode of PU based prediction, can be applied.
  • SbTMVP: Subblock-based Temporal Motion Vector Prediction
  • MMVD: Merge with MVD
  • GPM: Geometric Partitioning Mode
  • HMVP: History-based MVP
  • PAMVP: Pairwise Average MVP
  • CIIP: Combined Intra/Inter Prediction
  • AMVR: Adaptive Motion Vector Resolution
  • BDOF: Bi-Directional Optical Flow
  • BCW: Bi-Prediction with CU Weights
  • LIC: Local Illumination Compensation
  • TM: Template Matching
  • OBMC: Overlapped Block Motion Compensation
  • AFFINE mode is a technology that is used in both AMVP and MERGE modes and also has high encoding efficiency. Since the conventional video coding standard performs MC (Motion Compensation) by considering only the parallel translation of the block, there was a disadvantage in that it could not properly compensate for motions that occur in reality, such as zoom in/out and rotation. To supplement this, a four-parameter affine motion model using two control point motion vectors (CPMV) and a six-parameter affine motion model using three control point motion vectors can be applied to inter prediction.
  • CPMV is a vector representing an affine motion model of one of the upper left, upper right, and lower left of the current block.
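
The four-parameter affine model can be written out as a short sketch. It uses the commonly cited form of the model with control points at the upper-left and upper-right corners; the function name and the use of floating-point arithmetic are assumptions for readability, since real codecs work in fixed point and per sub-block.

```python
def affine_mv_4param(cpmv0, cpmv1, width, x, y):
    """Motion vector at position (x, y) inside a block of the given width.

    cpmv0: control point motion vector at the upper-left corner (0, 0).
    cpmv1: control point motion vector at the upper-right corner (width, 0).
    """
    a = (cpmv1[0] - cpmv0[0]) / width  # scaling term
    b = (cpmv1[1] - cpmv0[1]) / width  # rotation term
    return (a * x - b * y + cpmv0[0], b * x + a * y + cpmv0[1])


# Identical control points reduce the model to plain translation.
translation = affine_mv_4param((3, -1), (3, -1), 16, 5, 7)
# Differing vertical components introduce rotation.
rotated = affine_mv_4param((0, 0), (0, 4), 16, 0, 16)
```

The model reproduces each control point motion vector at its own corner, which is a quick sanity check for any implementation.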
  • the subtractor (113) can generate a residual block using the difference between the input block and the predicted block.
  • the residual block may also be referred to as a residual signal.
  • the residual signal may mean the difference between the original signal and the predicted signal.
  • the residual signal may be a signal generated by transforming, quantizing, or transforming and quantizing the difference between the original signal and the predicted signal.
  • the residual block may be a residual signal in block units.
  • the transform unit (130) can perform a transform on the residual block to generate a transform coefficient and output the generated transform coefficient.
  • the transform coefficient can be a coefficient value generated by performing a transform on the residual block.
  • the transform unit (130) can also skip the transform on the residual block.
  • a quantized level can be generated by applying quantization to a transform coefficient or a residual signal.
  • a quantized level may also be referred to as a transform coefficient.
  • a 4x4 luminance residual block generated through intra-screen prediction can be transformed using a basis vector based on DST (Discrete Sine Transform), and a basis vector based on DCT (Discrete Cosine Transform) can be used to transform the remaining residual blocks.
  • a transform block can be divided into a quad tree shape for one block using RQT (Residual Quad Tree) technology, and after performing transformation and quantization on each transform block divided through RQT, a coded block flag (cbf) can be transmitted to increase encoding efficiency when all coefficients become 0.
  • RQT: Residual Quad Tree
  • the Multiple Transform Selection (MTS) technique can be applied to perform transformation by selectively using multiple transformation bases. That is, instead of dividing the CU into TUs through the RQT, a function similar to TU division can be performed through the Sub-block Transform (SBT) technique.
  • SBT: Sub-block Transform
  • the SBT is applied only to inter-screen prediction blocks, and unlike the RQT, the current block can be divided into 1/2 or 1/4 sizes in the vertical or horizontal direction, and then the transformation can be performed on only one of the blocks. For example, if it is divided vertically, the transformation can be performed on the leftmost or rightmost block, and if it is divided horizontally, the transformation can be performed on the topmost or bottommost block.
  • LFNST (Low Frequency Non-Separable Transform), a secondary transform technique that additionally transforms the residual signal converted to the frequency domain through DCT or DST, can be applied.
  • LFNST additionally performs a transform on the low-frequency region of 4x4 or 8x8 in the upper left, so that the residual coefficients can be concentrated in the upper left.
  • the quantization unit (140) can generate a quantized level by quantizing a transform coefficient or a residual signal according to a quantization parameter (QP), and can output the generated quantized level. At this time, the quantization unit (140) can quantize the transform coefficient using a quantization matrix.
  • QP quantization parameter
  • A quantizer using QP values of 0 to 51 can be used.
  • Alternatively, QP values of 0 to 63 can be used.
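  • To make the role of the quantization parameter concrete, the sketch below quantizes and dequantizes a single coefficient. The step-size relationship Qstep = 2^((QP - 4) / 6) is an assumption borrowed from common HEVC/VVC-style designs and is not stated in this description; quantization matrices and rounding offsets are omitted.

```python
def q_step(qp):
    """Quantization step size (assumed relationship: doubles every 6 QP)."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeff, qp):
    """Map a transform coefficient to a quantized level (round to nearest)."""
    sign = -1 if coeff < 0 else 1
    return sign * int(abs(coeff) / q_step(qp) + 0.5)


def dequantize(level, qp):
    """Reconstruct an approximate coefficient from a quantized level."""
    return level * q_step(qp)


level = quantize(100, 22)          # the step size is 8 at QP 22
print(level, dequantize(level, 22))
```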
  • DQ Dependent Quantization
  • DQ performs quantization using two quantizers (e.g., Q0 and Q1), and even without signaling information about the use of a specific quantizer, the quantizer to be used for the next transform coefficient can be selected based on the current state through a state transition model.
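  • The state-driven quantizer selection can be sketched as follows. The four-state transition table and the use of the level parity as the transition input are assumptions modeled on VVC-style dependent quantization; the description above only states that a state transition model is used.

```python
# Assumed transition table: next_state = STATE_TRANSITION[state][parity of level].
STATE_TRANSITION = ((0, 2), (2, 0), (1, 3), (3, 1))


def quantizer_sequence(levels):
    """Return which quantizer (Q0 or Q1) applies to each coefficient in scan order.

    No quantizer choice is signaled: it follows from the current state,
    which is updated from the parity of each already-coded level.
    """
    state = 0
    used = []
    for level in levels:
        used.append("Q0" if state < 2 else "Q1")  # states 0,1 -> Q0; states 2,3 -> Q1
        state = STATE_TRANSITION[state][abs(level) & 1]
    return used


print(quantizer_sequence([1, 0, 3, 2]))
```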
  • the entropy encoding unit (150) can generate a bitstream by performing entropy encoding according to a probability distribution on values produced by the quantization unit (140) or coding parameter values produced in the encoding process, and can output the bitstream.
  • the entropy encoding unit (150) can perform entropy encoding on information about image samples and information for decoding the image. For example, information for decoding the image can include syntax elements, etc.
  • the entropy encoding unit (150) can use an encoding method such as exponential Golomb, Context-Adaptive Variable Length Coding (CAVLC), or Context-Adaptive Binary Arithmetic Coding (CABAC) for entropy encoding.
  • CAVLC Context-Adaptive Variable Length Coding
  • CABAC Context-Adaptive Binary Arithmetic Coding
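  • Of the entropy coding methods listed above, exponential Golomb coding is the simplest to illustrate. The following is a minimal sketch of the order-0 code for non-negative integers, with bit strings represented as Python strings for readability; the function names are chosen for this example only.

```python
def exp_golomb_encode(value):
    """Order-0 exponential Golomb code word for a non-negative integer."""
    if value < 0:
        raise ValueError("value must be non-negative")
    code = bin(value + 1)[2:]               # binary representation of value + 1
    return "0" * (len(code) - 1) + code     # prefix of leading zeros, then the code


def exp_golomb_decode(bits):
    """Decode one order-0 exponential Golomb code word from the start of bits."""
    zeros = 0
    while bits[zeros] == "0":
        zeros += 1
    return int(bits[zeros:2 * zeros + 1], 2) - 1


print([exp_golomb_encode(v) for v in range(5)])
```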
  • the entropy encoding unit (150) can perform entropy encoding using a Variable Length Coding/Code (VLC) table.
  • VLC Variable Length Coding/Code
  • the entropy encoding unit (150) may derive a binarization method of a target symbol and a probability model of a target symbol/bin, and then perform arithmetic encoding using the derived binarization method, probability model, and context model.
  • When applying CABAC, in order to reduce the size of the probability table stored in the decoding device, the table-based probability update method can be replaced with an update method using a simple formula.
  • two different probability models can be used to obtain more accurate symbol probability values.
  • the entropy encoding unit (150) can change a two-dimensional block form coefficient into a one-dimensional vector form through a transform coefficient scanning method to encode a transform coefficient level (quantized level).
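  • The conversion between a two-dimensional coefficient block and a one-dimensional vector can be sketched as below. The diagonal scan order is an assumption chosen for illustration; the description does not fix a particular scanning pattern.

```python
def diagonal_scan_order(width, height):
    """List (row, col) positions along anti-diagonals, bottom-left to top-right."""
    order = []
    for d in range(width + height - 1):
        for row in range(height - 1, -1, -1):
            col = d - row
            if 0 <= col < width:
                order.append((row, col))
    return order


def block_to_vector(block):
    """Encoder side: flatten a 2D block into a 1D vector in scan order."""
    height, width = len(block), len(block[0])
    return [block[r][c] for r, c in diagonal_scan_order(width, height)]


def vector_to_block(vector, width, height):
    """Decoder side: rebuild the 2D block from the 1D vector."""
    block = [[0] * width for _ in range(height)]
    for value, (r, c) in zip(vector, diagonal_scan_order(width, height)):
        block[r][c] = value
    return block


print(block_to_vector([[1, 2], [3, 4]]))
```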
  • Coding parameters may include information (flags, indexes, etc.) encoded in an encoding device (100) and signaled to a decoding device (200), such as syntax elements, as well as information derived during an encoding process or a decoding process, and may mean information necessary when encoding or decoding an image.
  • signaling a flag or index may mean that the encoder entropy encodes the flag or index and includes it in the bitstream, and that the decoder entropy decodes the flag or index from the bitstream.
  • the encoded current image can be used as a reference image for other images to be processed later. Therefore, the encoding device (100) can restore or decode the encoded current image again, and store the restored or decoded image as a reference image in the reference picture buffer (190).
  • the quantized level can be dequantized in the dequantization unit (160) and inverse transformed in the inverse transform unit (170).
  • The dequantized and/or inverse transformed coefficients can be added to the prediction block through an adder (117) to generate a reconstructed block.
  • the dequantized and/or inverse transformed coefficients mean coefficients on which at least one of dequantization and inverse transformation has been performed, and may mean a reconstructed residual block.
  • The dequantization unit (160) and the inverse transform unit (170) can perform the reverse processes of the quantization unit (140) and the transform unit (130), respectively.
  • The restored block may pass through a filter unit (180).
  • The filter unit (180) may apply, in whole or in part, filtering techniques such as a deblocking filter, a sample adaptive offset (SAO), an adaptive loop filter (ALF), a bilateral filter (BIF), and LMCS (Luma Mapping with Chroma Scaling) to a restored sample, a restored block, or a restored image.
  • The filter unit (180) may also be called an in-loop filter. The term in-loop filter is also used to refer to the filters excluding LMCS.
  • the deblocking filter can remove block distortion that occurs at the boundary between blocks.
  • different filters can be applied depending on the required deblocking filtering strength.
  • a sample adaptive offset can be used to add an appropriate offset value to the sample value to compensate for the encoding error.
  • the sample adaptive offset can correct the offset from the original image on a sample basis for the image on which deblocking has been performed.
  • a method can be used in which the samples included in the image are divided into a certain number of regions, and then the region to be offset is determined and the offset is applied to the region, or a method can be used in which the offset is applied by considering the edge information of each sample.
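  • The region-based variant of the sample adaptive offset can be sketched as a band offset: the sample value range is divided into equal bands, and an offset is added to samples that fall in selected bands. The number of bands (32) and the dictionary-based interface are assumptions made for this example.

```python
def sao_band_offset(samples, band_offsets, bit_depth=8, num_bands=32):
    """Add a per-band offset to each sample and clip to the valid range.

    band_offsets maps a band index to its offset; bands that are not
    listed are left unchanged.
    """
    max_val = (1 << bit_depth) - 1
    out = []
    for s in samples:
        band = (s * num_bands) >> bit_depth      # index of the band containing s
        out.append(min(max(s + band_offsets.get(band, 0), 0), max_val))
    return out


print(sao_band_offset([0, 8, 17, 255], {1: 2, 2: -3, 31: 4}))
```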
  • The bilateral filter can also compensate for the offset from the original image on a sample-by-sample basis for the deblocked image.
  • An adaptive loop filter can perform filtering based on a comparison value between a restored image and an original image. After dividing samples included in an image into a predetermined group, a filter to be applied to each group can be determined, and filtering can be performed differentially for each group. Information related to whether to apply an adaptive loop filter can be signaled for each coding unit (CU), and the shape and filter coefficients of the adaptive loop filter to be applied can vary for each block.
  • CU coding unit
  • LMCS Luma Mapping with Chroma Scaling
  • LM luma mapping
  • CS chroma scaling
  • LMCS can be utilized as an HDR correction technique that reflects the characteristics of HDR (High Dynamic Range) images.
  • the restored block or restored image that has passed through the filter unit (180) may be stored in the reference picture buffer (190).
  • the restored block that has passed through the filter unit (180) may be a part of the reference image.
  • the reference image may be a restored image composed of restored blocks that have passed through the filter unit (180).
  • The stored reference image may subsequently be used for inter prediction or motion compensation.
  • FIG. 2 is a block diagram showing the configuration of one embodiment of a decoding device to which the present invention is applied.
  • the decoding device (200) may be a decoder, a video decoding device, or an image decoding device.
  • the decoding device (200) may include an entropy decoding unit (210), an inverse quantization unit (220), an inverse transformation unit (230), an intra prediction unit (240), a motion compensation unit (250), an adder (201), a switch (203), a filter unit (260), and a reference picture buffer (270).
  • the decoding device (200) can receive a bitstream output from the encoding device (100).
  • the decoding device (200) can receive a bitstream stored in a computer-readable recording medium, or can receive a bitstream streamed through a wired/wireless transmission medium.
  • the decoding device (200) can perform decoding on the bitstream in an intra mode or an inter mode.
  • the decoding device (200) can generate a restored image or a decoded image through decoding, and can output the restored image or the decoded image.
  • If the prediction mode used for decoding is intra mode, the switch (203) can be switched to intra. If the prediction mode used for decoding is inter mode, the switch (203) can be switched to inter.
  • the decoding device (200) can obtain a reconstructed residual block by decoding the input bitstream and can generate a prediction block. When the reconstructed residual block and the prediction block are obtained, the decoding device (200) can generate a reconstructed block to be decoded by adding the reconstructed residual block and the prediction block.
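  • The addition of the reconstructed residual block and the prediction block can be sketched as follows. Clipping the result to the sample range of the given bit depth is an assumption of this sketch.

```python
def reconstruct_block(prediction, residual, bit_depth=8):
    """Add residual to prediction sample by sample and clip to [0, 2^bit_depth - 1]."""
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]


print(reconstruct_block([[250, 10], [100, 100]], [[10, -20], [3, -3]]))
```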
  • the decoding target block can be referred to as a current block.
  • the entropy decoding unit (210) can generate symbols by performing entropy decoding according to a probability distribution for the bitstream.
  • the generated symbols can include symbols in the form of quantized levels.
  • the entropy decoding method can be the reverse process of the entropy encoding method described above.
  • the entropy decoding unit (210) can change a one-dimensional vector-shaped coefficient into a two-dimensional block-shaped coefficient through a transform coefficient scanning method to decode a transform coefficient level (quantized level).
  • The quantized level can be dequantized in the dequantization unit (220) and inverse transformed in the inverse transform unit (230).
  • A restored residual block can be generated from the quantized level as a result of the dequantization and/or inverse transformation.
  • the dequantization unit (220) can apply a quantization matrix to the quantized level.
  • The dequantization unit (220) and the inverse transform unit (230) applied to the decoding device can apply the same technology as the dequantization unit (160) and the inverse transform unit (170) applied to the encoding device described above.
  • the intra prediction unit (240) can generate a prediction block by performing spatial prediction on the current block using sample values of already decoded blocks surrounding the block to be decoded.
  • the intra prediction unit (240) applied to the decoding device can apply the same technology as the intra prediction unit (120) applied to the encoding device described above.
  • the motion compensation unit (250) can perform motion compensation using a motion vector and a reference image stored in the reference picture buffer (270) for the current block to generate a prediction block.
  • The motion compensation unit (250) can apply an interpolation filter to a part of the reference image to generate a prediction block when the motion vector does not have an integer value.
  • the motion compensation unit (250) applied to the decoding device can apply the same technology as the motion compensation unit (122) applied to the encoding device described above.
  • the adder (201) can add the restored residual block and the prediction block to generate a restored block.
  • the filter unit (260) can apply at least one of an Inverse-LMCS, a deblocking filter, a sample adaptive offset, and an adaptive loop filter to the restored block or the restored image.
  • the filter unit (260) applied to the decoding device can apply the same filtering technology as that applied to the filter unit (180) applied to the encoding device described above.
  • the filter unit (260) can output a restored image.
  • the restored block or restored image can be stored in the reference picture buffer (270) and used for inter prediction.
  • the restored block that has passed through the filter unit (260) can be a part of the reference image.
  • the reference image can be a restored image composed of restored blocks that have passed through the filter unit (260).
  • The stored reference image can subsequently be used for inter prediction or motion compensation.
  • FIG. 3 is a diagram schematically showing a video coding system to which the present invention can be applied.
  • a video coding system may include an encoding device (10) and a decoding device (20).
  • the encoding device (10) may transmit encoded video and/or image information or data to the decoding device (20) in the form of a file or streaming through a digital storage medium or a network.
  • An encoding device (10) may include a video source generating unit (11), an encoding unit (12), and a transmitting unit (13).
  • a decoding device (20) may include a receiving unit (21), a decoding unit (22), and a rendering unit (23).
  • the encoding unit (12) may be called a video/image encoding unit, and the decoding unit (22) may be called a video/image decoding unit.
  • the transmitting unit (13) may be included in the encoding unit (12).
  • the receiving unit (21) may be included in the decoding unit (22).
  • the rendering unit (23) may include a display unit, and the display unit may be configured as a separate device or an external component.
  • the video source generation unit (11) can obtain a video/image through a process of capturing, synthesizing, or generating a video/image.
  • the video source generation unit (11) can include a video/image capture device and/or a video/image generation device.
  • the video/image capture device can include, for example, one or more cameras, a video/image archive including previously captured video/image, etc.
  • the video/image generation device can include, for example, a computer, a tablet, a smartphone, etc., and can (electronically) generate a video/image.
  • a virtual video/image can be generated through a computer, etc., and in this case, the video/image capture process can be replaced with a process of generating related data.
  • the encoding unit (12) can encode the input video/image.
  • the encoding unit (12) can perform a series of procedures such as prediction, transformation, and quantization for compression and encoding efficiency.
  • the encoding unit (12) can output encoded data (encoded video/image information) in the form of a bitstream.
  • the detailed configuration of the encoding unit (12) can also be configured in the same manner as the encoding device (100) of FIG. 1 described above.
  • the transmission unit (13) can transmit encoded video/image information or data output in the form of a bitstream to the reception unit (21) of the decoding device (20) through a digital storage medium or a network in the form of a file or streaming.
  • the digital storage medium can include various storage media such as USB, SD, CD, DVD, Blu-ray, HDD, SSD, etc.
  • the transmission unit (13) can include an element for generating a media file through a predetermined file format and can include an element for transmission through a broadcasting/communication network.
  • the reception unit (21) can extract/receive the bitstream from the storage medium or network and transmit it to the decoding unit (22).
  • the decoding unit (22) can decode video/image by performing a series of procedures such as inverse quantization, inverse transformation, and prediction corresponding to the operation of the encoding unit (12).
  • the detailed configuration of the decoding unit (22) can also be configured in the same manner as the decoding device (200) of FIG. 2 described above.
  • The rendering unit (23) can render the decoded video/image.
  • the rendered video/image can be displayed through the display unit.
  • the intra block copy (IBC)-based prediction method is a method of searching for an optimal prediction block in the reconstructed area of the current picture using a block vector and copying it to generate a prediction block of the current block.
  • FIG. 4 is a diagram for explaining an intra block copy-based prediction method according to an embodiment of the present invention.
  • a matching block (440) can be searched within a predefined search range (R1, R2, R3, R4) of a reconstructed area (430) of a current picture (400) based on a block vector (420) of a current block (410).
  • a prediction block of the current block (410) can be derived based on the matching block (440). Accordingly, when an intra block copy-based prediction method is applied to the current block (410), the encoder can transmit information about the intra block copy-based prediction method and information related to the block vector (420) to the decoder.
  • the predefined search ranges R1, R2, R3, and R4 may be defined as the current CTU (Coding Tree Unit) including the current block (410), the upper left CTU located at the upper left of the current CTU among the areas within the pre-set range from the current CTU, the upper CTU located at the top of the current CTU among the areas within the pre-set range from the current CTU, and the left CTU located to the left of the current CTU among the areas within the pre-set range from the current CTU, respectively.
  • the area within the pre-set range may be an area set as one of the current tiles, slices, and pictures.
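  • The copy step of the uni-predictive case can be sketched as follows, with the picture held as a list of rows. The function name and argument layout are hypothetical. The sketch only checks that the matching block lies inside the picture; verifying that it lies inside the reconstructed area and the predefined search range (R1 to R4) is left out.

```python
def ibc_predict(picture, x, y, width, height, block_vector):
    """Copy the block that the block vector points to within the same picture.

    (x, y) is the top-left sample of the current block and block_vector is
    (bv_x, bv_y), the displacement to the matching block.
    """
    bv_x, bv_y = block_vector
    rx, ry = x + bv_x, y + bv_y
    if rx < 0 or ry < 0 or rx + width > len(picture[0]) or ry + height > len(picture):
        raise ValueError("block vector points outside the picture")
    return [row[rx:rx + width] for row in picture[ry:ry + height]]


picture = [[r * 10 + c for c in range(6)] for r in range(6)]
print(ibc_predict(picture, 4, 4, 2, 2, (-3, -2)))
```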
  • the intra block copy-based prediction method described in FIG. 4 is a uni-predictive intra block copy (IBC) that generates a prediction value using one block vector (420).
  • the intra block copy-based prediction method may be a bi-predictive intra block copy (bi-predictive IBC) that uses two block vectors.
  • the bi-predictive intra block copy-based prediction method may be as described below.
  • FIG. 5 is a diagram for explaining a bi-predictive intra block copy-based prediction method according to one embodiment of the present invention.
  • A first matching block (541) and a second matching block (542) can be searched within a predefined search range (R1, R2, R3, R4) of a reconstructed region (530) of a current picture (500) based on two different block vectors of a current block (510), namely a first block vector (BV_1, 521) and a second block vector (BV_2, 522). Then, the first prediction block of the current block (510) can be derived based on the first matching block (541), and the second prediction block can be derived based on the second matching block (542). Then, the final prediction block for the current block (510) can be derived through a weighted sum of the first prediction block and the second prediction block. That is, the bi-predictive intra block copy-based prediction method can be a bi-prediction method that uses two block vectors (521, 522).
  • The two block vectors (521, 522) used in the bi-predictive intra block copy-based prediction method can be derived in various ways.
  • the two block vectors (521, 522) can be derived from the IBC block vector prediction (BVP) method and the IBC merge mode.
  • BVP IBC block vector prediction
  • an index for the IBC block vector prediction mode and an index for the IBC merge candidate can be signaled from the encoder to the decoder.
  • the two block vectors (521, 522) can be derived from the IBC merge candidate list by utilizing different IBC merge indices.
  • the merge indices indicating the two block vectors (521, 522) can be signaled from the encoder to the decoder.
  • The bi-predictive intra block copy-based prediction method can improve coding efficiency compared to the uni-predictive intra block copy-based prediction method. However, if the prediction value is generated through a simple average of two prediction signals, the prediction accuracy may be reduced.
  • a final prediction block can be generated by weighting and adding each of the generated prediction blocks.
  • the final prediction block can be generated according to the following mathematical formula.
  • P_BV1 may mean a prediction value generated using the first block vector BV_1
  • P_BV2 may mean a prediction value generated from the second block vector BV_2
  • W_1 may mean a weight value applied to P_BV1
  • W_2 may mean a weight value applied to P_BV2.
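  • Based on the symbol definitions above, the weighted sum that produces the final prediction block can be written as follows. The name P_final for the final prediction block is an assumption introduced here for notation only.

```latex
\[
P_{\mathrm{final}} = W_1 \cdot P_{BV1} + W_2 \cdot P_{BV2}
\]
```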
  • The weight values W_1 and W_2 may be set to arbitrary values agreed upon in advance between the encoder and decoder. In this case, the weight values W_1 and W_2 may not be signaled separately.
  • The sum of the weight values W_1 and W_2 is 1, and each of the weight values W_1 and W_2 is a real number greater than or equal to 0.
  • An index indicating one of N preset weight candidates may be transmitted, and the weight values W_1 and W_2 may be determined based on the parsed index.
  • N is any positive integer.
  • The index may indicate the weight value W_1 applied to P_BV1 and the weight value W_2 applied to P_BV2.
  • The sum of the weight values W_1 and W_2 is 1.
  • Candidates for weight values assigned to different prediction blocks and indices and codewords assigned to each of the candidates for weight values can be as described in the table below.
  • Table 1 shows an embodiment of an index and a codeword assigned to each of the candidates for a weight value.
  • The embodiment of Table 1 is only an embodiment of the present disclosure, and the number N of weight value candidates (N is an integer with N ≥ 2), the codeword for the weight value candidates, and the weight value for each of the weight value candidates can be arbitrarily determined.
  • a codeword for transmitting and/or parsing an index corresponding to the weight value candidates can be generated using one of any codeword allocation methods including a fixed length code (FLC), a unary code, a truncated unary code, and a truncated binary code.
  • FLC fixed length code
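  • Two of the codeword allocation methods named above can be sketched as follows, with codewords shown as bit strings. The bit polarity of the truncated unary code (a run of 1s terminated by a 0) is an assumption; the opposite convention is equally possible.

```python
def truncated_unary(index, max_index):
    """Truncated unary codeword: index ones, then a terminating zero
    that is omitted for the last index."""
    if not 0 <= index <= max_index:
        raise ValueError("index out of range")
    return "1" * index + ("0" if index < max_index else "")


def fixed_length_code(index, num_candidates):
    """Fixed length codeword wide enough to address num_candidates entries."""
    if not 0 <= index < num_candidates:
        raise ValueError("index out of range")
    bits = max(1, (num_candidates - 1).bit_length())
    return format(index, "0{}b".format(bits))


# Codewords for N = 4 weight candidates.
print([truncated_unary(i, 3) for i in range(4)])
print([fixed_length_code(i, 4) for i in range(4)])
```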
  • The weight values W_1 and W_2 can be determined based on K predefined categories and the different weight value candidates (W_1, W_2) included in each of the categories.
  • an index indicating a category used among the K categories can be transmitted and parsed.
  • an index indicating a weight value used among a set of weight values included in the category can be transmitted and parsed.
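  • The two-level selection (category index first, then weight index within the category) can be sketched as a nested lookup. The table contents below are purely hypothetical values chosen so that each pair sums to 1; they are not taken from this description.

```python
# Hypothetical table: K = 2 categories, each holding (W_1, W_2) candidates.
WEIGHT_CATEGORIES = (
    ((0.5, 0.5), (0.75, 0.25), (0.25, 0.75)),
    ((0.625, 0.375), (0.375, 0.625)),
)


def select_weights(category_index, weight_index, categories=WEIGHT_CATEGORIES):
    """Return (W_1, W_2) selected by the two parsed indices."""
    return categories[category_index][weight_index]


print(select_weights(1, 0))
```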
  • The weight values W_1 and W_2 can be determined using the distortion of each prediction value.
  • A bi-predictive intra block copy prediction method based on the distortion of each prediction value can be as described below.
  • FIG. 6 is a diagram for explaining a bi-predictive intra block copy-based prediction method according to another embodiment of the present invention.
  • a first matching block (641), which is the most similar block to the current block (610), can be searched within a predefined search range (R1, R2, R3, R4) of a reconstructed area (630) of a current picture (600).
  • A block vector indicating the first matching block (641) can be referred to as a first block vector (BV_1, 621).
  • A second matching block (642), which is the second most similar block to the current block (610), can be searched within a predefined search range (R1, R2, R3, R4) of a reconstructed area.
  • A block vector indicating the second matching block (642) can be referred to as a second block vector (BV_2, 622).
  • the first prediction block of the current block (610) can be derived based on the first matching block (641), and the second prediction block can be derived based on the second matching block (642). And, the final prediction block for the current block (610) can be derived through a weighted sum of the first prediction block and the second prediction block.
  • the distortion of each of the first prediction block and the second prediction block derived based on the intra block copy can be determined based on the difference between the current template (650) adjacent to the current block (610) and the reference template (661, 662) adjacent to the matching block indicated by each of the block vectors (621, 622).
  • the distortion value of the prediction signal derived based on the first block vector (621) may mean the distortion value between the current template (650) adjacent to the current block (610) and the first reference template (661), which is a reference template adjacent to the first matching block (641).
  • the distortion value of the prediction signal derived based on the second block vector (622) may mean the distortion value between the current template (650) adjacent to the current block (610) and the second reference template (662), which is a reference template adjacent to the second matching block (642).
  • the weight value applied to the prediction values obtained as a result of the intra block copy can be calculated based on the distortion value of the prediction signal derived based on the first block vector (621) and the distortion value of the prediction signal derived based on the second block vector (622), according to the mathematical formula below.
  • D_BV1 may mean a distortion value of a prediction signal derived based on the first block vector (621)
  • D_BV2 may mean a distortion value of a prediction signal derived based on the second block vector (622).
  • The weight value W_1 of the prediction signal derived based on the first block vector (621) may be calculated using D_BV2
  • The weight value W_2 of the prediction signal derived based on the second block vector (622) may be calculated using D_BV1.
  • That is, the weight value of each signal can be calculated using the distortion value of the other signal.
  • the distortion of each signal can be calculated using various correlation measurement methods such as SAD (sum of absolute differences), SSE (sum of square error), MSE (mean squared error), SSD (sum of squared differences), and SATD (sum of absolute transformed differences).
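  • The distortion measures and the cross-use of distortion values for the weights can be sketched as below. The normalization W_1 = D_BV2 / (D_BV1 + D_BV2) and W_2 = D_BV1 / (D_BV1 + D_BV2) is an assumption consistent with the statement that each weight is computed from the other signal's distortion; the exact formula is not reproduced in this text. SSE and SSD denote the same computation here, and SATD is omitted.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def ssd(a, b):
    """Sum of squared differences (also called sum of squared errors)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mse(a, b):
    """Mean squared error."""
    return ssd(a, b) / len(a)


def distortion_weights(d_bv1, d_bv2):
    """Assumed rule: the prediction with the smaller distortion gets the larger weight."""
    total = d_bv1 + d_bv2
    if total == 0:
        return 0.5, 0.5                      # both templates match perfectly
    return d_bv2 / total, d_bv1 / total      # (W_1, W_2)


current = [10, 12, 14, 16]                   # samples of the current template
ref1 = [10, 13, 14, 15]                      # first reference template
ref2 = [13, 12, 17, 16]                      # second reference template
print(distortion_weights(sad(current, ref1), sad(current, ref2)))
```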
  • In Equation 2, all floating-point operations can be changed to integer operations using a look-up table (LUT). That is, instead of floating-point operations, the weight values W_1 and W_2 can be approximated using only integer multiplication, addition, and shift operations. For example, the weight values can be derived using the LUT of the Cross-Component Linear Model (CCLM).
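  • An integer-only approximation of this kind can be sketched as follows. The reciprocal table, the 6-bit weight precision, and the normalization step are all assumptions made for illustration; this is not the actual CCLM table. Weights are expressed in units of 1/64, and the ratio D_BV2 / (D_BV1 + D_BV2) is an assumed form of the weight formula.

```python
WEIGHT_SHIFT = 6                     # weights are integers in units of 1/64
RECIP_SHIFT = 12                     # precision of the reciprocal table
# RECIP_LUT[i] ~ 2^12 / i for i in 1..63, built once ahead of time.
RECIP_LUT = [0] + [((1 << RECIP_SHIFT) + i // 2) // i for i in range(1, 64)]


def integer_weights(d_bv1, d_bv2):
    """Approximate (W_1, W_2) * 64 with multiply, add and shift only."""
    total = d_bv1 + d_bv2
    if total == 0:
        half = 1 << (WEIGHT_SHIFT - 1)
        return half, half
    norm = max(0, total.bit_length() - 6)    # scale so the denominator fits the LUT
    denom = total >> norm
    numer = d_bv2 >> norm
    rounding = 1 << (RECIP_SHIFT - WEIGHT_SHIFT - 1)
    w1 = (numer * RECIP_LUT[denom] + rounding) >> (RECIP_SHIFT - WEIGHT_SHIFT)
    w1 = min(w1, 1 << WEIGHT_SHIFT)
    return w1, (1 << WEIGHT_SHIFT) - w1


def blend(p1, p2, w1, w2):
    """Weighted sum of two prediction samples with rounding."""
    return (w1 * p1 + w2 * p2 + (1 << (WEIGHT_SHIFT - 1))) >> WEIGHT_SHIFT


w1, w2 = integer_weights(10, 30)
print(w1, w2, blend(100, 20, w1, w2))
```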
  • CCLM Cross-Component Linear Model
  • If the distortion value of the prediction signal derived based on the first block vector (621) and the distortion value of the prediction signal derived based on the second block vector (622) are similar, it may not be meaningful to generate the prediction signal using the weighted sum. In that case, the bi-predictive intra block copy-based prediction method may not be applied.
  • the distortion of each prediction signal can be calculated using various correlation measurement methods such as SAD, SSE, MSE, SSD, and SATD.
  • the similarity of the distortion value can be derived according to the following mathematical formula.
  • D_min may mean the smaller value among the distortion value of the prediction signal derived based on the first block vector (621) and the distortion value of the prediction signal derived based on the second block vector (622).
  • Threshold_1 means a threshold value. Accordingly, if the smaller distortion value among the distortion values of the two prediction signals is less than the threshold value, the two prediction signals may be determined to be similar.
  • The threshold value may be derived according to the following mathematical formula.
  • D_max may mean the larger value among the distortion value of the prediction signal derived based on the first block vector (621) and the distortion value of the prediction signal derived based on the second block vector (622).
  • The remaining parameter in the threshold formula may be an arbitrary positive real number.
  • The similarity judgment method based on Equations 3 and 4 is only an example of judging the similarity of prediction signals, and the similarity judgment method is not limited to the present embodiment.
  • Likewise, if the two distortion values are not similar, the bi-predictive intra block copy-based prediction method may not be applied.
  • the distortion of each prediction signal can be calculated using various correlation measurement methods such as SAD, SSE, MSE, SSD, and SATD.
  • the dissimilarity of the distortion value can be derived according to the mathematical formula below.
  • D_max may mean the larger value among the distortion value of the prediction signal derived based on the first block vector (621) and the distortion value of the prediction signal derived based on the second block vector (622).
  • Threshold_2 means a threshold value. Accordingly, if the larger distortion value among the distortion values of the two prediction signals is greater than the threshold value, the two prediction signals may be determined to be not similar.
  • The threshold value may be derived according to the following mathematical formula.
  • D_min may mean the smaller value among the distortion value of the prediction signal derived based on the first block vector (621) and the distortion value of the prediction signal derived based on the second block vector (622).
  • The remaining parameter in the threshold formula may be an arbitrary positive real number.
  • The dissimilarity judgment method based on Equations 5 and 6 is only an example of judging the dissimilarity of prediction signals, and the dissimilarity judgment method is not limited to the present embodiment.
  • the L-shaped current template (650) around the current block (610) and the L-shaped reference templates (661, 662) around each of the matching blocks (641, 642) can be utilized.
  • A template having a shape other than an L-shape can be used. That is, a distortion value for a matching block can be derived using a current template having a shape other than an L-shape and reference templates having a shape other than an L-shape. In other words, when deriving distortion for prediction values, the distortion value can be derived by considering multiple template shape candidates rather than one template shape.
  • FIG. 7 is a diagram for explaining a plurality of template shape candidates used in a process of deriving distortion values for predicted values according to one embodiment of the present invention.
  • a template having one shape among a plurality of template shape candidates having different shapes can be used.
  • the template shape candidate may include peripheral samples on the left and top of the block, as in (a) of Fig. 7.
  • the template shape candidate including peripheral samples on the left and top of the block may be referred to as an L-shaped template.
  • the size of the L-shaped template may be expressed as (w x L2) + (L1 x h) + (L1 x L2).
  • w and h represent the width and height of the block, and the values of L1 and L2 may be any positive numbers.
  • the template shape candidate may include a left-side surrounding sample of the block, as in (b) of Fig. 7.
  • the template shape candidate including a left-side surrounding sample of the block may be referred to as a left template.
  • the size of the left template may be represented as L3 x h, where h represents the height of the block, and the value of L3 may be any positive number.
  • the template shape candidate may include a peripheral sample at the top of the block, as in (c) of Fig. 7.
  • the template shape candidate including a peripheral sample at the top of the block may be referred to as an upper template.
  • the size of the upper template may be represented as w x L4, where w represents the width of the block, and the value of L4 may be any positive number.
  • A distortion value can be derived for each of the matching blocks by using a current template adjacent to the current block and reference templates adjacent to each of the matching blocks, where the reference templates have the same shape as the current template and the shape is one of an L-shaped template, a left template, and an upper template.
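  • The three template shapes and the template-based distortion can be sketched as follows. The parameters `left_cols` and `top_rows` play the role of the template thicknesses (L1 or L3 for the left part, L2 or L4 for the upper part); these parameter names, the function names, and the use of SAD are assumptions for this example. The caller is expected to ensure that all template samples lie inside the reconstructed area.

```python
def template_positions(x, y, w, h, shape, left_cols=1, top_rows=1):
    """Sample positions (col, row) of the template around the block at (x, y).

    shape is "L" (left + top + corner), "left", or "top".
    """
    if shape not in ("L", "left", "top"):
        raise ValueError("unknown template shape")
    pos = []
    if shape in ("L", "top"):
        x_start = x - left_cols if shape == "L" else x   # the L shape includes the corner
        pos += [(cx, cy) for cy in range(y - top_rows, y) for cx in range(x_start, x + w)]
    if shape in ("L", "left"):
        pos += [(cx, cy) for cy in range(y, y + h) for cx in range(x - left_cols, x)]
    return pos


def template_sad(picture, x, y, w, h, block_vector, shape, left_cols=1, top_rows=1):
    """SAD between the current template and the reference template
    located at the same offsets around the matching block."""
    bv_x, bv_y = block_vector
    cur = template_positions(x, y, w, h, shape, left_cols, top_rows)
    return sum(abs(picture[cy][cx] - picture[cy + bv_y][cx + bv_x]) for cx, cy in cur)


# L-shaped template size: (w x L2) + (L1 x h) + (L1 x L2) with w=4, h=2, L1=2, L2=3.
print(len(template_positions(8, 8, 4, 2, "L", left_cols=2, top_rows=3)))
```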
  • samples around blocks of various shapes predefined in the encoder and decoder can be used.
  • the shape of the template is not limited to that described in the present invention.
  • The template used to derive distortion for the prediction values may be an L-shaped template based on information such as the size and position of the current block, a template of Fig. 7, or a template of a shape implicitly determined from among various forms of templates using samples around blocks predefined in the encoder and decoder.
  • FIG. 8 is a flowchart illustrating an intra block copy prediction method according to an embodiment of the present invention.
  • the image decoding method of Fig. 8 can be performed by an image decoding device.
  • The image decoding device can generate a first prediction block of the current block based on the first block vector (S810). Specifically, the first prediction block can be generated based on the first matching block in the current picture indicated by the first block vector.
  • The image decoding device can generate a second prediction block of the current block based on the second block vector (S820). Specifically, the second prediction block can be generated based on the second matching block in the current picture indicated by the second block vector.
  • The image decoding device can generate a final prediction block of the current block through a weighted sum of the first prediction block and the second prediction block (S830).
  • the first block vector can be derived by performing block vector prediction, and the second block vector can be determined based on the intra block copy merge candidate list and an index indicating one merge candidate.
  • the first block vector and the second block vector can be determined based on the intra block copy merge candidate list, the first merge index and the second merge index pointing to different merge candidates.
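The two ways of obtaining the block vectors can be sketched as below. The text does not spell out the details of block vector prediction, so modelling it as a predictor plus a signalled difference is an assumption, and all function names are hypothetical.

```python
def bv_from_prediction(predictor, difference):
    """Block vector prediction, assumed here to be predictor + difference."""
    return (predictor[0] + difference[0], predictor[1] + difference[1])


def bv_from_merge(merge_candidates, merge_index):
    """Pick one block vector from the intra block copy merge candidate list."""
    return merge_candidates[merge_index]


def two_merge_bvs(merge_candidates, first_index, second_index):
    """Both block vectors come from the merge list; the two merge indices
    must point to different merge candidates."""
    if first_index == second_index:
        raise ValueError("merge indices must point to different candidates")
    return (bv_from_merge(merge_candidates, first_index),
            bv_from_merge(merge_candidates, second_index))


# Usage with a hypothetical merge candidate list.
merge_list = [(-16, 0), (0, -8), (-8, -8)]
first_bv = bv_from_prediction((-16, 0), (1, -2))   # variant 1: prediction
second_bv = bv_from_merge(merge_list, 1)           # variant 1: merge index
bv_a, bv_b = two_merge_bvs(merge_list, 0, 2)       # variant 2: two merge indices
```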
  • the final prediction block of the current block can be generated by adding the first prediction block to which the first weight is applied and the second prediction block to which the second weight is applied.
  • the first weight and the second weight can be determined as preset values.
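Steps S810 to S830 with preset weights can be sketched as follows. The integer weights that sum to 8 and the rounding right shift are assumptions chosen for illustration; the text does not specify the weight precision. The names `copy_block` and `weighted_sum` are hypothetical.

```python
def copy_block(picture, x0, y0, w, h, block_vector):
    """Prediction block: the w x h matching block that block_vector points
    to inside the current picture (a list of sample rows)."""
    bvx, bvy = block_vector
    return [[picture[y0 + bvy + j][x0 + bvx + i] for i in range(w)]
            for j in range(h)]


def weighted_sum(pred1, pred2, w1, w2, shift=3):
    """Final prediction = (w1 * pred1 + w2 * pred2 + rounding) >> shift.

    Assumption: integer weights with w1 + w2 == 1 << shift.
    """
    if w1 + w2 != 1 << shift:
        raise ValueError("weights must sum to 1 << shift")
    offset = 1 << (shift - 1)
    return [[(w1 * a + w2 * b + offset) >> shift for a, b in zip(r1, r2)]
            for r1, r2 in zip(pred1, pred2)]


# Usage: two block vectors, preset equal weights.
picture = [[x + 10 * y for x in range(8)] for y in range(8)]
p1 = copy_block(picture, 4, 4, 2, 2, (-3, -2))   # S810
p2 = copy_block(picture, 4, 4, 2, 2, (-4, 0))    # S820
final = weighted_sum(p1, p2, 4, 4)               # S830
```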
  • the first weight and the second weight can be determined based on an index indicating one weight candidate among a plurality of preset weight candidates.
  • the first weight and the second weight can be determined based on a first index indicating one weight candidate set from among a plurality of weight candidate sets and a second index indicating one weight candidate from among a plurality of weight candidates included in one weight candidate set.
  • the first index may indicate a set of weight candidates used in at least one of a picture group, a slice group, and a block group.
  • the second index may indicate a weight candidate used in at least one unit of a picture, a slice, and a block.
  • the at least one unit may be a lower-layer unit belonging to the at least one group (for example, a picture within a picture group).
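The two-level signalling can be sketched as a simple nested lookup. The candidate sets and their values below are hypothetical placeholders, as are the names `WEIGHT_CANDIDATE_SETS` and `select_weights`. The first index would be shared by a group (for example, a slice group), and the second index would vary per lower-layer unit (for example, per block).

```python
# Hypothetical weight candidate sets; each pair is (first weight, second weight)
# in units of 1/8.
WEIGHT_CANDIDATE_SETS = [
    [(4, 4), (5, 3), (3, 5)],
    [(4, 4), (6, 2), (2, 6), (7, 1)],
]


def select_weights(set_index, candidate_index, sets=WEIGHT_CANDIDATE_SETS):
    """Resolve the weight pair from a group-level set index and a
    unit-level candidate index."""
    candidate_set = sets[set_index]        # chosen once per group
    return candidate_set[candidate_index]  # chosen per lower-layer unit


# Usage: one set index for a group, different candidate indices per block.
group_set_index = 1
weights_per_block = [select_weights(group_set_index, idx) for idx in (0, 3, 1)]
```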
  • the first weight can be determined based on a first distortion value and a second distortion value. The first distortion value is the distortion between a current template adjacent to the current block and a first reference template adjacent to the first matching block, and the second distortion value is the distortion between the current template and a second reference template adjacent to the second matching block.
  • the first distortion value and the second distortion value can be measured using any one of measures such as SAD (sum of absolute differences), SSE (sum of squared errors), MSE (mean squared error), SSD (sum of squared differences), and SATD (sum of absolute transformed differences).
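The listed measures can be sketched in plain Python for two equally sized sample arrays given as lists of rows. SSE and SSD compute the same quantity. The SATD variant shown here uses an unnormalized Hadamard transform on square blocks whose side is a power of two, which is an assumption since the text does not name the transform.

```python
def _diff(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sad(a, b):
    return sum(abs(v) for row in _diff(a, b) for v in row)


def sse(a, b):
    return sum(v * v for row in _diff(a, b) for v in row)


ssd = sse  # sum of squared differences is the same quantity as SSE


def mse(a, b):
    count = sum(len(row) for row in a)
    return sse(a, b) / count


def _hadamard(n):
    # Sylvester construction; the resulting matrix is symmetric.
    h = [[1]]
    while len(h) < n:
        h = [r + r for r in h] + [r + [-v for v in r] for r in h]
    return h


def _matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def satd(a, b):
    """Sum of absolute values of H * (a - b) * H (assumed Hadamard SATD)."""
    n = len(a)
    if n == 0 or n & (n - 1) or any(len(row) != n for row in a):
        raise ValueError("block must be square with power-of-two size")
    h = _hadamard(n)
    transformed = _matmul(_matmul(h, _diff(a, b)), h)
    return sum(abs(v) for row in transformed for v in row)
```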
  • the first weight and the second weight can be determined based on a look-up table corresponding to the first distortion value and the second distortion value.
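A distortion-driven look-up can be sketched as follows. The table contents and the way the two distortion values are quantized into a table index are hypothetical; the text only states that a look-up table corresponding to the two distortion values is used. The sketch gives the prediction with the lower template distortion the larger weight.

```python
# Hypothetical look-up table of (first weight, second weight) in units of 1/8.
WEIGHT_LUT = [(7, 1), (6, 2), (5, 3), (4, 4), (3, 5), (2, 6), (1, 7)]


def weights_from_distortion(d1, d2, lut=WEIGHT_LUT):
    """Map the two template distortions to a weight pair.

    The index grows with the share of d1 in d1 + d2, so a larger first
    distortion selects a smaller first weight (assumed mapping).
    """
    total = d1 + d2
    if total == 0:
        return lut[len(lut) // 2]  # both templates match perfectly
    index = min((d1 * len(lut)) // total, len(lut) - 1)
    return lut[index]
```

With this mapping, equal distortions give equal weights, while a first distortion of zero gives the first prediction block the largest weight in the table.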
  • the shape of the current template can be determined as one of a first template shape including samples adjacent to the left side of the current block, a second template shape including samples adjacent to the top side of the current block, and a third template shape including samples adjacent to the left side of the current block and samples adjacent to the top side of the current block.
  • the shapes of the first reference template and the second reference template can correspond to the shape of the current template.
  • the final prediction block of the current block can be generated by weighting the first prediction block and the second prediction block based on the result of comparing the second distortion value with a threshold value derived from the first distortion value.
  • when the second distortion value is smaller than the first distortion value and is smaller than or equal to the threshold value, the final prediction block of the current block can be generated by weighting the first prediction block and the second prediction block.
  • when the second distortion value is greater than the first distortion value and is less than the threshold value, the final prediction block of the current block can be generated by weighting the first prediction block and the second prediction block.
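The threshold test can be sketched as below, assuming the threshold is derived from the first distortion value. Several details are assumptions not given in the text: the scale factor in `derive_threshold`, the treatment of equal distortion values, the equal-weight blend, and the fallback to the prediction block with the smaller distortion when the test fails. All names are hypothetical.

```python
def derive_threshold(d1, num=5, den=4):
    """Hypothetical threshold: the first distortion scaled by num/den."""
    return d1 * num // den


def should_blend(d1, d2, threshold):
    """Apply the two comparison rules stated in the text."""
    if d2 < d1:
        return d2 <= threshold
    if d2 > d1:
        return d2 < threshold
    return True  # equal distortions: not covered by the text (assumption)


def final_prediction(pred1, pred2, d1, d2):
    if should_blend(d1, d2, derive_threshold(d1)):
        # Equal-weight blend with rounding (assumed weights).
        return [[(a + b + 1) >> 1 for a, b in zip(r1, r2)]
                for r1, r2 in zip(pred1, pred2)]
    # Assumed fallback: keep the prediction with the smaller distortion.
    return pred1 if d1 <= d2 else pred2
```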
  • a bitstream can be generated by an image encoding method including the steps described in FIG. 8.
  • the bitstream can be stored in a non-transitory computer-readable recording medium, and can also be transmitted (or streamed).
  • FIG. 9 is a drawing exemplarily showing a content streaming system to which an embodiment according to the present invention can be applied.
  • a content streaming system to which an embodiment of the present invention is applied may broadly include an encoding server, a streaming server, a web server, a media storage, a user device, and a multimedia input device.
  • the encoding server compresses content input from multimedia input devices such as smartphones, cameras, CCTVs, etc. into digital data to generate a bitstream and transmits it to the streaming server.
  • if multimedia input devices such as smartphones, cameras, CCTVs, etc. directly generate a bitstream, the encoding server may be omitted.
  • the above bitstream can be generated by an image encoding method and/or an image encoding device to which an embodiment of the present invention is applied, and the streaming server can temporarily store the bitstream during the process of transmitting or receiving the bitstream.
  • the streaming server transmits multimedia data to a user device based on a user request made via the web server, and the web server can act as an intermediary that informs the user of the available services.
  • the web server forwards the user's request to the streaming server, and the streaming server can then transmit the multimedia data to the user.
  • the content streaming system may include a separate control server, and in this case, the control server may perform a role of controlling commands/responses between each device within the content streaming system.
  • the above streaming server can receive content from a media storage and/or an encoding server. For example, when receiving content from the encoding server, the content can be received in real time. In this case, in order to provide a smooth streaming service, the streaming server can store the bitstream for a certain period of time.
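The temporary storage on the streaming server can be pictured with a small time-windowed buffer. This is only an illustrative sketch: the class name `BitstreamBuffer`, the timestamp-based retention rule, and the chunk granularity are all assumptions, since the text says only that the bitstream is stored for a certain period of time.

```python
from collections import deque


class BitstreamBuffer:
    """Keeps bitstream chunks received during the last retention_seconds."""

    def __init__(self, retention_seconds):
        self.retention = retention_seconds
        self._chunks = deque()  # (timestamp, chunk) in arrival order

    def push(self, timestamp, chunk):
        """Store a chunk and drop chunks older than the retention window."""
        self._chunks.append((timestamp, chunk))
        while self._chunks and timestamp - self._chunks[0][0] > self.retention:
            self._chunks.popleft()

    def chunks(self):
        """Chunks currently available for transmission to user devices."""
        return [chunk for _, chunk in self._chunks]
```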
  • Examples of the user devices may include mobile phones, smart phones, laptop computers, digital broadcasting terminals, personal digital assistants (PDAs), portable multimedia players (PMPs), navigation devices, slate PCs, tablet PCs, ultrabooks, wearable devices (e.g., smartwatches, smart glasses, HMDs (head mounted displays)), digital TVs, desktop computers, digital signage, etc.
  • Each server within the content streaming system can be operated as a distributed server, in which case data received by each server can be processed in a distributed manner.
  • an image can be encoded/decoded using at least one of the above embodiments or a combination thereof.
  • the order in which the above embodiments are applied may be different in the encoding device and the decoding device. Alternatively, the order in which the above embodiments are applied may be the same in the encoding device and the decoding device.
  • the above embodiments can be performed separately for each of the luminance and chrominance signals, or can be performed identically for the luminance and chrominance signals.
  • the methods are described based on the flowchart as a series of steps or units, but the present invention is not limited to the order of the steps, and some steps may occur in a different order or simultaneously with other steps described above.
  • the steps shown in the flowchart are not exclusive, and other steps may be included, or one or more steps in the flowchart may be deleted without affecting the scope of the present invention.
  • the above embodiments may be implemented in the form of program instructions that can be executed through various computer components and recorded on a computer-readable recording medium.
  • the computer-readable recording medium may include program instructions, data files, data structures, etc., alone or in combination.
  • the program instructions recorded on the computer-readable recording medium may be those specifically designed and configured for the present invention or may be those known to and available to those skilled in the art of computer software.
  • a bitstream generated by an encoding method according to the above embodiment can be stored in a non-transitory computer-readable recording medium.
  • the bitstream stored in the non-transitory computer-readable recording medium can be decoded by a decoding method according to the above embodiment.
  • examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes, optical recording media such as CD-ROMs and DVDs, magneto-optical media such as floptical disks, and hardware devices specifically configured to store and execute program instructions such as ROMs, RAMs, and flash memories.
  • Examples of program instructions include not only machine language codes generated by a compiler, but also high-level language codes that can be executed by a computer using an interpreter, etc.
  • the hardware devices may be configured to operate as one or more software modules to perform the processing according to the present invention, and vice versa.
  • the present invention can be used in a device for encoding/decoding an image and a recording medium storing a bitstream.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to an image encoding/decoding method and apparatus, a recording medium for storing a bitstream, and a transmission method. The image decoding method comprises the steps of: generating a first prediction block of the current block on the basis of a first block vector; generating a second prediction block of the current block on the basis of a second block vector; and generating a final prediction block of the current block by weighted summation of the first prediction block and the second prediction block, wherein the first prediction block can be generated on the basis of a first matching block in the current picture indicated by the first block vector, and the second prediction block can be generated on the basis of a second matching block in the current picture indicated by the second block vector.
PCT/KR2024/007817 2023-06-08 2024-06-07 Image encoding/decoding method and apparatus, and recording medium for storing a bitstream Pending WO2024253465A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202480027475.1A CN121058240A (zh) 2023-06-08 2024-06-07 Image encoding/decoding method and apparatus, and recording medium for storing a bitstream

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR10-2023-0073463 2023-06-08
KR20230073463 2023-06-08
KR10-2024-0074453 2024-06-07
KR1020240074453A KR20240174507A (ko) 2023-06-08 2024-06-07 Image encoding/decoding method and apparatus, and recording medium storing a bitstream

Publications (1)

Publication Number Publication Date
WO2024253465A1 true WO2024253465A1 (fr) 2024-12-12

Family

ID=93796256

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2024/007817 Pending WO2024253465A1 (fr) 2023-06-08 2024-06-07 Image encoding/decoding method and apparatus, and recording medium for storing a bitstream

Country Status (1)

Country Link
WO (1) WO2024253465A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20200007044A (ko) * 2017-06-22 2020-01-21 Huawei Technologies Co., Ltd. Intra-frame prediction method and apparatus
KR20220043240A (ko) * 2016-05-13 2022-04-05 Vid Scale, Inc. Generalized multi-hypothesis prediction system and method for video coding
WO2023046127A1 (fr) * 2021-09-25 2023-03-30 Beijing Bytedance Network Technology Co., Ltd. Method, apparatus and medium for video processing
US20230135166A1 (en) * 2021-10-28 2023-05-04 Tencent America LLC Intrabc using wedgelet partitioning
KR20230080497A (ko) * 2015-06-08 2023-06-07 Vid Scale, Inc. Intra block copy mode for screen content coding


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24819616

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: CN2024800274751

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE