WO2026014976A1 - Image encoding/decoding method and device, and recording medium on which a bitstream is stored - Google Patents
Image encoding/decoding method and device, and recording medium on which a bitstream is stored
- Publication number
- WO2026014976A1 (application PCT/KR2025/010166)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- transformation
- block
- current block
- mode
- transform
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/157—Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/18—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/593—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
Definitions
- the present invention relates to a video encoding/decoding method and device, and a recording medium storing a bitstream.
- HD High Definition
- UHD Ultra High Definition
- there are various technologies for image compression, such as inter prediction technology, which predicts pixel values included in the current picture from pictures before or after the current picture; intra prediction technology, which predicts pixel values included in the current picture using pixel information within the current picture; and entropy encoding technology, which assigns short codes to values with a high frequency of appearance and long codes to values with a low frequency of appearance. These technologies can be used to effectively compress image data for transmission or storage.
- the present disclosure seeks to provide a method and device for performing transformation using non-separable transformation.
- the present disclosure provides a method and apparatus for performing a transformation using a reduced-dimensional non-separable transform kernel.
- the present disclosure seeks to provide a method and device for determining/signaling a non-separable transform kernel based on encoding parameters.
- the present disclosure provides a method and apparatus for determining a transform set for non-separable or separable transform based on multiple intra prediction modes.
- a video decoding method and device can derive transform coefficients of a current block from a bitstream, derive residual samples of the current block based on an inverse transform of the transform coefficients of the current block, and reconstruct the current block based on the residual samples of the current block.
- the residual samples can be derived based on any one of a plurality of transform kernel candidates belonging to a transform set of the current block.
- the transform set of the current block can be determined as any one of a basic transform set or an alternative transform set.
- the basic transform set may be selected based on an intra prediction mode that specifies reference samples for intra prediction of the current block.
- the alternative transform set may be selected based on a Derived Intra Prediction Mode (DIPM), which is derived based on a prediction block or surrounding samples of the current block.
- DIPM Derived Intra Prediction Mode
- the basic transform set may be selected based on the mode having the largest accumulated intensity value among the DIMD (Decoder-side Intra Mode Derivation) modes derived through the DIMD method.
- the alternative transform set may be selected based on the mode having the second largest accumulated intensity value among the DIMD modes.
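The selection between a basic and an alternative transform set described above can be sketched as follows. This is only an illustrative sketch: the function names, the `use_alternative` switch, and the mode-to-set mapping are hypothetical and are not taken from the disclosure, which does not specify how a mode maps to a set index.

```python
from typing import Optional


def dimd_ranked_modes(histogram: dict[int, int]) -> list[int]:
    """Sort DIMD candidate modes by accumulated intensity, largest first."""
    return sorted(histogram, key=lambda mode: histogram[mode], reverse=True)


def select_transform_set(intra_mode: int,
                         derived_mode: Optional[int],
                         use_alternative: bool,
                         mode_to_set: dict[int, int]) -> int:
    """Return a transform-set index for the current block.

    The basic set follows the intra prediction mode; the alternative set
    follows a derived mode (e.g. a DIPM) when one is available.
    """
    if use_alternative and derived_mode is not None:
        return mode_to_set[derived_mode]
    return mode_to_set[intra_mode]


# Hypothetical mapping from 67 intra modes to 4 transform sets.
MODE_TO_SET = {mode: mode // 17 for mode in range(67)}

# Hypothetical DIMD histogram: mode -> accumulated intensity.
ranked = dimd_ranked_modes({18: 120, 50: 340, 34: 75})
basic_set = MODE_TO_SET[ranked[0]]        # mode with the largest intensity
alternative_set = MODE_TO_SET[ranked[1]]  # mode with the second largest
```

In this sketch the basic and alternative sets come from the first and second entries of the ranked DIMD list, mirroring the two bullets above.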
- the number of transform kernel candidates applicable to the current block among a plurality of transform kernel candidates belonging to the transform set may be determined based on at least one of the size of the current block, the slice type for the current block, the prediction mode of the current block, or whether the transform set of the current block is an alternative transform set.
- when the product of the width and the height of the current block is less than a first value, the number of transform kernel candidates applicable to the current block may be 0. When the product of the width and the height of the current block is greater than or equal to the first value, the number of transform kernel candidates applicable to the current block may be 1, 2, or 3.
- the first value may be 16, 32, 64, or 128.
- when the product of the width and the height of the current block is less than a first value, the number of transform kernel candidates applicable to the current block may be 0, 1, or 2. When the product of the width and the height of the current block is greater than or equal to the first value, the number of transform kernel candidates applicable to the current block may be 3.
- the first value may be 256 or 512.
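The area-based rule above amounts to a single threshold comparison. The sketch below assumes one particular pairing of counts (0 below the threshold, 3 at or above it) and a threshold of 64; the disclosure lists several permissible values, so the defaults here are illustrative choices and the function name is hypothetical.

```python
def num_kernel_candidates(width: int, height: int,
                          first_value: int = 64,
                          small_count: int = 0,
                          large_count: int = 3) -> int:
    """Number of transform kernel candidates applicable to a block.

    Blocks whose area (width * height) is below `first_value` get
    `small_count` candidates; all other blocks get `large_count`.
    """
    area = width * height
    return small_count if area < first_value else large_count


# A 4x4 block (area 16) falls below the threshold; an 8x8 block does not.
small = num_kernel_candidates(4, 4)
large = num_kernel_candidates(8, 8)
```

The second variant in the text (threshold 256 or 512, with 0, 1, or 2 candidates for small blocks) corresponds to calling the same function with, for example, `first_value=256, small_count=2`.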
- the number of transform kernel candidates applicable to the current block among a plurality of transform kernel candidates belonging to the transform set can be determined based on whether a default section sequence is applied.
- the default section sequence may be a section sequence in which the number of applicable transform kernel candidates for all block areas is 0 or 3.
- whether the default section sequence is applied can be determined based on a flag signaled through the bitstream.
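One possible reading of the section sequence is a list of block-area intervals, each carrying a candidate count, with a flag choosing between a default list and a custom one. The sketch below follows that reading; the data layout, the names, and the custom boundaries are assumptions made for illustration and are not defined in the disclosure.

```python
INF = float("inf")

# Assumed default: every block area maps to 3 candidates
# (the text also allows a default in which every area maps to 0).
DEFAULT_SECTIONS = [(INF, 3)]

# Hypothetical custom sequence: (exclusive upper bound on area, count).
CUSTOM_SECTIONS = [(64, 0), (256, 2), (INF, 3)]


def candidates_for_area(area: int, use_default_flag: bool,
                        custom=CUSTOM_SECTIONS) -> int:
    """Look up the candidate count for a block area.

    `use_default_flag` stands in for the flag signaled in the bitstream.
    """
    sections = DEFAULT_SECTIONS if use_default_flag else custom
    for upper_bound, count in sections:
        if area < upper_bound:
            return count
    return sections[-1][1]
```

With the flag set, every area returns the same count; with it cleared, the count depends on which interval the area falls into.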
- the video encoding method and device can derive residual samples of a current block, derive transform coefficients of the current block based on a transform of the residual samples of the current block, and encode the transform coefficients of the current block.
- the transform coefficients can be derived based on any one of a plurality of transform kernel candidates belonging to a transform set of the current block.
- the transform set of the current block can be determined as any one of a basic transform set or an alternative transform set.
- a computer-readable digital storage medium having encoded video/image information stored thereon, which causes a decoding device according to the present disclosure to perform a video decoding method, is provided.
- a computer-readable digital storage medium storing video/image information generated by a video encoding method according to the present disclosure is provided.
- a method and device for transmitting video/image information generated by a video encoding method according to the present disclosure are provided.
- the present disclosure can improve the performance of transformation by utilizing non-separable transformation.
- the present disclosure can improve the performance of transformation by performing transformation using a non-separable transformation kernel of reduced dimensionality.
- the present disclosure can improve encoding efficiency by effectively determining and/or signaling a non-separable transform kernel based on encoding parameters.
- by constructing substantially non-overlapping transform set candidates, not only can transform performance be improved, but a transform set index can also be signaled efficiently.
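To make the idea of a reduced-dimensional non-separable transform concrete, the sketch below flattens a block into a vector and multiplies it by the first R rows of an N x N kernel. The kernel here is a random orthonormal matrix used purely as a stand-in; the disclosure does not give kernel values, and all function names are hypothetical.

```python
import numpy as np


def make_orthonormal_kernel(n: int, seed: int = 0) -> np.ndarray:
    """Build a stand-in N x N kernel whose rows are orthonormal."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q.T


def forward_nonseparable(block: np.ndarray, kernel: np.ndarray,
                         r: int) -> np.ndarray:
    """Flatten the block and keep only the first r transform outputs."""
    x = block.reshape(-1)
    return kernel[:r] @ x


def inverse_nonseparable(coeffs: np.ndarray, kernel: np.ndarray,
                         shape: tuple) -> np.ndarray:
    """Map r coefficients back to a block using the transposed kernel rows."""
    r = coeffs.shape[0]
    return (kernel[:r].T @ coeffs).reshape(shape)


block = np.arange(16, dtype=np.float64).reshape(4, 4)
kernel = make_orthonormal_kernel(16)
reduced = forward_nonseparable(block, kernel, 8)   # 16 inputs -> 8 coefficients
approx = inverse_nonseparable(reduced, kernel, (4, 4))
```

Unlike a separable transform, which applies one 1-D transform along rows and another along columns, the kernel acts on all samples of the block at once. Keeping R < N rows cuts the number of multiplications and coefficients at the cost of an approximate reconstruction; with R = N the orthonormal kernel reconstructs the block exactly.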
- FIG. 1 illustrates a video/image coding system according to the present disclosure.
- FIG. 2 is a schematic block diagram of an encoding device to which an embodiment of the present disclosure can be applied and in which encoding of a video/image signal is performed.
- FIG. 3 is a schematic block diagram of a decoding device to which an embodiment of the present disclosure can be applied and in which decoding of a video/image signal is performed.
- FIG. 4 illustrates an image decoding method performed by a decoding device (300) as an embodiment according to the present disclosure.
- FIG. 5 exemplarily shows an intra prediction mode and its prediction direction according to the present disclosure.
- FIG. 6 is a flowchart illustrating a method for deriving a DIMD mode according to the present disclosure.
- FIG. 7 illustrates a filter for deriving a DIMD mode according to the present disclosure.
- FIG. 8 illustrates a schematic configuration of a decoding device (300) that performs an image decoding method according to the present disclosure.
- FIG. 9 illustrates an image encoding method performed by an encoding device (200) as an embodiment according to the present disclosure.
- FIG. 10 illustrates a schematic configuration of an encoding device (200) that performs an image encoding method according to the present disclosure.
- FIG. 11 illustrates an example of a content streaming system to which embodiments of the present disclosure can be applied.
- although terms such as first and second may be used to describe various components, these components should not be limited by these terms. These terms are used solely to distinguish one component from another. For example, without departing from the scope of the present disclosure, a first component could be referred to as a “second component,” and similarly, a second component could also be referred to as a “first component.”
- the term “and/or” includes a combination of multiple related items described herein or any of multiple related items described herein.
- the present disclosure relates to video/image coding.
- the methods/embodiments disclosed in this specification can be applied to methods disclosed in the versatile video coding (VVC) standard.
- the methods/embodiments disclosed in this specification can be applied to methods disclosed in the essential video coding (EVC) standard, the AOMedia Video 1 (AV1) standard, the second generation of audio video coding standard (AVS2), or the next generation of video/image coding standards (e.g., H.267 or H.268).
- VVC versatile video coding
- EVC essential video coding
- AV1 AOMedia Video 1
- AVS2 second generation of audio video coding standard
- next generation of video/image coding standards e.g., H.267 or H.268.
- a video may refer to a set of images over time.
- a picture generally refers to a unit representing one image at a specific time point, and a slice/tile is a unit that constitutes part of a picture in coding.
- a slice/tile may include one or more coding tree units (CTUs).
- CTUs coding tree units
- a picture may be composed of one or more slices/tiles.
- a tile is a rectangular area consisting of multiple CTUs within a specific tile column and a specific tile row of a picture.
- a tile column is a rectangular area of CTUs that has a height equal to the height of the picture and a width specified by the syntax requirements of the picture parameter set.
- a tile row is a rectangular area of CTUs that has a height specified by the picture parameter set and a width equal to the width of the picture.
- CTUs within a tile are arranged consecutively according to the CTU raster scan, while tiles within a picture may be arranged consecutively according to the tile raster scan.
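The two-level scan described above (tiles in raster order across the picture, CTUs in raster order inside each tile) can be sketched as follows. The function name and the representation of the tile grid as lists of column widths and row heights in CTU units are assumptions made for illustration.

```python
def ctu_scan_order(pic_w_ctus: int, pic_h_ctus: int,
                   col_widths: list, row_heights: list) -> list:
    """Return picture-raster CTU addresses in tile scan order.

    col_widths / row_heights give the tile grid in CTU units.
    """
    if sum(col_widths) != pic_w_ctus or sum(row_heights) != pic_h_ctus:
        raise ValueError("tile grid does not cover the picture")
    order = []
    y0 = 0
    for tile_h in row_heights:            # tile rows, top to bottom
        x0 = 0
        for tile_w in col_widths:         # tiles within a row, left to right
            for y in range(y0, y0 + tile_h):      # CTU raster scan in tile
                for x in range(x0, x0 + tile_w):
                    order.append(y * pic_w_ctus + x)
            x0 += tile_w
        y0 += tile_h
    return order


# A picture of 4x2 CTUs split into two 2x2 tiles.
order = ctu_scan_order(4, 2, [2, 2], [2])  # -> [0, 1, 4, 5, 2, 3, 6, 7]
```

All CTUs of the left tile are visited before any CTU of the right tile, which is what distinguishes the tile scan from a plain picture-wide raster scan.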
- a slice may contain an integer number of complete tiles or an integer number of contiguous complete CTU rows within a picture, which may be exclusively contained within a single NAL unit. Meanwhile, a picture may be divided into two or more subpictures.
- a subpicture may be a rectangular region of one or more slices within a picture.
- a pixel, or pel, can refer to the smallest unit that constitutes a picture (or image). Additionally, the term “sample” can be used as a counterpart to a pixel.
- a sample can generally represent a pixel or a pixel value, and can represent only the pixel/pixel value of the luma component, or only the pixel/pixel value of the chroma component.
- a unit may represent a basic unit of image processing.
- a unit may include at least one of a specific region of a picture and information related to the region.
- One unit may include one luma block and two chroma (e.g., Cb, Cr) blocks.
- the term “unit” may be used interchangeably with terms such as “block” or “area.”
- an MxN block may include a set (or array) of samples (or sample array) or transform coefficients consisting of M columns and N rows.
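Because an MxN block is defined as M columns by N rows, its array representation in a row-major library has shape (N, M), which is easy to get backwards. A minimal sketch, assuming NumPy as the array library and a hypothetical helper name:

```python
import numpy as np


def make_block(m_cols: int, n_rows: int) -> np.ndarray:
    """Allocate an MxN block: M columns (width) by N rows (height)."""
    return np.zeros((n_rows, m_cols), dtype=np.int32)


block = make_block(8, 4)     # an 8x4 block: 8 columns, 4 rows
height, width = block.shape  # NumPy reports (rows, columns)
```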
- “A or B” can mean “only A,” “only B,” or “both A and B.” In other words, as used herein, “A or B” can be interpreted as “A and/or B.” For example, as used herein, “A, B or C” can mean “only A,” “only B,” “only C,” or “any combination of A, B and C.”
- a slash (/) or a comma can mean “and/or.”
- A/B can mean “A and/or B.”
- A/B can mean “only A,” “only B,” or “both A and B.”
- A, B, C can mean “A, B, or C.”
- “at least one of A and B” may mean “only A,” “only B,” or “both A and B.” Additionally, in this specification, the expressions “at least one of A or B” or “at least one of A and/or B” may be interpreted identically to “at least one of A and B.”
- “at least one of A, B and C” can mean “only A,” “only B,” “only C,” or “any combination of A, B and C.” Additionally, “at least one of A, B or C” or “at least one of A, B and/or C” can mean “at least one of A, B and C.”
- parentheses used herein may mean “for example.” Specifically, when “prediction (intra-prediction)” is indicated, “intra-prediction” may be suggested as an example of “prediction.” In other words, “prediction” in this specification is not limited to “intra-prediction,” and “intra-prediction” may be suggested as an example of “prediction.” Furthermore, even when “prediction (i.e., intra-prediction)” is indicated, “intra-prediction” may be suggested as an example of “prediction.”
- FIG. 1 illustrates a video/image coding system according to the present disclosure.
- a video/image coding system may include a first device (source device) and a second device (receiving device).
- a source device can transmit encoded video/image information or data to a receiving device via a digital storage medium or a network in the form of a file or streaming.
- the source device may include a video source, an encoding device, and a transmitter.
- the receiving device may include a receiver, a decoding device, and a renderer.
- the encoding device may be referred to as a video/image encoding device, and the decoding device may be referred to as a video/image decoding device.
- the transmitter may be included in the encoding device.
- the receiver may be included in the decoding device.
- the renderer may include a display unit, and the display unit may be configured as a separate device or an external component.
- a video source may obtain video/images through a process of capturing, synthesizing, or generating video/images.
- the video source may include a video/image capture device and/or a video/image generation device.
- the video/image capture device may include one or more cameras, a video/image archive containing previously captured video/images, etc.
- the video/image generation device may include a computer, a tablet, a smartphone, etc., and may (electronically) generate video/images.
- a virtual video/image may be generated through a computer, etc., in which case the video/image capture process may be replaced by a process of generating related data.
- An encoding device can encode input video/images.
- the encoding device can perform a series of procedures, such as prediction, transformation, and quantization, to improve compression and coding efficiency.
- the encoded data (encoded video/image information) can be output in the form of a bitstream.
- the transmitter can transmit encoded video/image information or data output in the form of a bitstream to the receiver of the receiving device via a digital storage medium or network in the form of a file or streaming.
- the digital storage medium can include various storage media such as USB, SD, CD, DVD, Blu-ray, HDD, SSD, etc.
- the transmitter can include an element for generating a media file via a predetermined file format and an element for transmission via a broadcasting/communication network.
- the receiver can receive/extract the bitstream and transmit it to the decoding device.
- the decoding device can decode the video/image by performing a series of procedures such as inverse quantization, inverse transformation, and prediction corresponding to the operation of the encoding device.
- the renderer can render decoded video/images.
- the rendered video/images can be displayed through the display unit.
- FIG. 2 is a schematic block diagram of an encoding device to which an embodiment of the present disclosure can be applied and in which encoding of a video/image signal is performed.
- the encoding device (200) may be configured to include an image segmentation unit (210), a prediction unit (220), a residual processing unit (230), an entropy encoding unit (240), an addition unit (250), a filtering unit (260), and a memory (270).
- the prediction unit (220) may include an inter prediction unit (221) and an intra prediction unit (222).
- the residual processing unit (230) may include a transform unit (232), a quantization unit (233), a dequantization unit (234), and an inverse transform unit (235).
- the residual processing unit (230) may further include a subtractor (231).
- the addition unit (250) may be called a reconstructor or a reconstructed block generator.
- the image segmentation unit (210), the prediction unit (220), the residual processing unit (230), the entropy encoding unit (240), the addition unit (250), and the filtering unit (260) described above may be configured by one or more hardware components (e.g., an encoding device chipset or processor) according to an embodiment.
- the memory (270) may include a decoded picture buffer (DPB) and may be configured by a digital storage medium.
- the hardware component may further include the memory (270) as an internal/external component.
- the image segmentation unit (210) can segment an input image (or picture, frame) input to the encoding device (200) into one or more processing units.
- the processing unit may be called a coding unit (CU).
- the coding unit may be recursively segmented from a coding tree unit (CTU) or a largest coding unit (LCU) according to a QTBTTT (Quad-tree binary-tree ternary-tree) structure.
- a single coding unit may be split into multiple coding units of deeper depth based on a quad-tree structure, a binary tree structure, and/or a ternary tree structure.
- the quad-tree structure may be applied first, and the binary tree structure and/or the ternary tree structure may be applied later.
- alternatively, the binary tree structure may be applied before the quad-tree structure.
- the coding procedure according to the present specification may be performed based on the final coding unit that is no longer split.
- the largest coding unit may be used directly as the final coding unit, or, if necessary, the coding unit may be recursively split into coding units of lower depths, and the coding unit with the optimal size may be used as the final coding unit.
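The recursive QTBTTT partitioning described above can be sketched as a small recursion over rectangles. The split names, the `decide` callback, and the 1:2:1 ratio used for ternary splits (as in VVC) are assumptions for illustration; a real encoder would choose splits by rate-distortion cost and a decoder would read them from the bitstream.

```python
def split_block(x, y, w, h, mode):
    """Split one rectangle (x, y, w, h) according to a split mode."""
    if mode == "QT":
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if mode == "BT_VER":
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if mode == "BT_HOR":
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    if mode == "TT_VER":  # assumed 1:2:1 split of the width
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    if mode == "TT_HOR":  # assumed 1:2:1 split of the height
        q = h // 4
        return [(x, y, w, q), (x, y + q, w, 2 * q), (x, y + 3 * q, w, q)]
    raise ValueError(f"unknown split mode: {mode}")


def partition(x, y, w, h, decide):
    """Recursively split until decide() returns None; return the leaf CUs."""
    mode = decide(x, y, w, h)
    if mode is None:
        return [(x, y, w, h)]
    leaves = []
    for sub in split_block(x, y, w, h, mode):
        leaves.extend(partition(*sub, decide))
    return leaves


# Hypothetical decision rule: quad-split the 128x128 CTU once, then stop.
leaves = partition(0, 0, 128, 128,
                   lambda x, y, w, h: "QT" if w == 128 else None)
```

The leaves returned by `partition` play the role of the final coding units that are no longer split.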
- the coding procedure may include procedures such as prediction, transformation, and restoration, which will be described later.
- the processing unit may further include a prediction unit (PU) or a transform unit (TU).
- the prediction unit and the transform unit may each be split or partitioned from the final coding unit described above.
- the prediction unit may be a unit of sample prediction.
- the transform unit may be a unit for deriving a transform coefficient and/or a unit for deriving a residual signal from a transform coefficient.
- an MxN block can represent a set of samples or transform coefficients consisting of M columns and N rows.
- a sample can generally represent a pixel or a pixel value, and can represent only the pixel/pixel value of the luma component, or only the pixel/pixel value of the chroma component.
- a sample can be used as a term corresponding to a pixel or pel in a picture (or image).
- the encoding device (200) can generate a residual signal (residual block, residual sample array) by subtracting a prediction signal (prediction block, prediction sample array) output from the inter prediction unit (221) or the intra prediction unit (222) from an input video signal (original block, original sample array), and the generated residual signal is transmitted to the transform unit (232).
- a unit that subtracts a prediction signal (prediction block, prediction sample array) from an input video signal (original block, original sample array) within the encoding device (200) may be called a subtraction unit (231).
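The subtraction performed by the subtraction unit, and the matching addition used later for reconstruction, can be sketched as follows. The function names are hypothetical, and the clipping to the sample bit depth in `reconstruct` is an assumption about typical codec behavior rather than something stated in this passage.

```python
import numpy as np


def residual_block(original: np.ndarray, prediction: np.ndarray) -> np.ndarray:
    """Residual = original - prediction, in a signed type to allow negatives."""
    return original.astype(np.int32) - prediction.astype(np.int32)


def reconstruct(prediction: np.ndarray, residual: np.ndarray,
                bit_depth: int = 8) -> np.ndarray:
    """Reconstruction = prediction + residual, clipped to the sample range."""
    total = prediction.astype(np.int32) + residual
    return np.clip(total, 0, (1 << bit_depth) - 1)


original = np.array([[10, 200], [0, 255]], dtype=np.uint8)
prediction = np.array([[12, 190], [5, 250]], dtype=np.uint8)
residual = residual_block(original, prediction)
restored = reconstruct(prediction, residual)
```

Casting to a signed type before subtracting matters here: subtracting unsigned 8-bit arrays directly would wrap around instead of producing negative residuals.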
- the prediction unit (220) can perform a prediction on a block to be processed (hereinafter, referred to as a current block) and generate a predicted block including prediction samples for the current block.
- the prediction unit (220) can determine whether intra prediction or inter prediction is applied on a current block or CU basis.
- the prediction unit (220) can generate various information related to prediction, such as prediction mode information, as described later in the description of each prediction mode, and transmit the information to the entropy encoding unit (240).
- the information related to prediction can be encoded by the entropy encoding unit (240) and output in the form of a bitstream.
- the intra prediction unit (222) can predict the current block by referring to samples in the current picture.
- the referenced samples may be located in the neighborhood of the current block, or may be located a certain distance away from the current block, depending on the prediction mode.
- the prediction modes may include one or more non-directional modes and multiple directional modes.
- the non-directional mode may include at least one of a DC mode or a planar mode.
- the directional mode may include 33 directional modes or 65 directional modes depending on the degree of detail in the prediction direction. However, this is only an example, and a greater or lesser number of directional modes may be used depending on the settings.
- the intra prediction unit (222) may also determine the prediction mode applied to the current block by using the prediction mode applied to the neighboring blocks.
- the inter prediction unit (221) can derive a prediction block for the current block based on a reference block (reference sample array) specified by a motion vector on a reference picture.
- the motion information can be predicted in units of blocks, subblocks, or samples based on the correlation of motion information between neighboring blocks and the current block.
- the motion information can include a motion vector and a reference picture index.
- the motion information can further include inter prediction direction information (L0 prediction, L1 prediction, Bi prediction, etc.).
- the neighboring block can include a spatial neighboring block existing in the current picture and a temporal neighboring block existing in the reference picture.
- the reference picture including the reference block and the reference picture including the temporal neighboring block may be the same or different.
- the above temporal neighboring blocks may be called collocated reference blocks, collocated CUs (colCUs), etc., and the reference pictures including the temporal neighboring blocks may be called collocated pictures (colPic).
- the inter prediction unit (221) may construct a motion information candidate list based on the neighboring blocks, and generate information indicating which candidate is used to derive the motion vector and/or reference picture index of the current block. Inter prediction may be performed based on various prediction modes, and for example, in the case of skip mode and merge mode, the inter prediction unit (221) may use the motion information of the neighboring blocks as the motion information of the current block.
- in the case of skip mode, unlike merge mode, a residual signal may not be transmitted.
- MVP motion vector prediction
- the prediction unit (220) can generate a prediction signal based on various prediction methods described below.
- the prediction unit can apply intra prediction or inter prediction for prediction of a single block, and can also apply intra prediction and inter prediction simultaneously. This can be called combined inter and intra prediction (CIIP) mode.
- the prediction unit can perform prediction of a block based on an intra block copy (IBC) prediction mode or a palette mode.
- the IBC prediction mode or palette mode can be used for image/video coding of content such as games, for example, screen content coding (SCC).
- IBC basically performs prediction within the current picture, but can be performed similarly to inter prediction in that it derives a reference block within the current picture. That is, IBC can utilize at least one of the inter prediction techniques described herein.
- Palette mode can be viewed as an example of intra coding or intra prediction.
- sample values within a picture can be signaled based on information about the palette table and palette index.
- the prediction signal generated through the prediction unit (220) can be used to generate a restoration signal or a residual signal.
- the transform unit (232) can apply a transform technique to the residual signal to generate transform coefficients.
- the transform technique can include at least one of a Discrete Cosine Transform (DCT), a Discrete Sine Transform (DST), a Karhunen-Loeve Transform (KLT), a Graph-Based Transform (GBT), or a Conditionally Non-linear Transform (CNT).
- GBT refers to a transform obtained from a graph when the relationship information between pixels is expressed as a graph.
- CNT refers to a transform obtained based on generating a prediction signal using all previously restored pixels.
- the transform process can be applied to a pixel block having a square size and the same size, or can be applied to a block of a non
- the quantization unit (233) quantizes the transform coefficients and transmits them to the entropy encoding unit (240), and the entropy encoding unit (240) can encode the quantized signal (information about the quantized transform coefficients) and output it as a bitstream.
- the information about the quantized transform coefficients can be called residual information.
- the quantization unit (233) can rearrange the quantized transform coefficients in a block form into a one-dimensional vector form based on the coefficient scan order, and can also generate information about the quantized transform coefficients based on the quantized transform coefficients in the one-dimensional vector form.
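The rearrangement of a coefficient block into a one-dimensional vector, and the inverse rearrangement a decoder would perform, can be sketched as follows. This is a minimal illustration only: the text does not fix a particular coefficient scan order, so the diagonal scan used here and the function names `diagonal_scan_order`, `block_to_vector`, and `vector_to_block` are assumptions made for the example.

```python
def diagonal_scan_order(height, width):
    """Return (row, col) positions visited anti-diagonal by anti-diagonal.

    Assumption: a simple diagonal scan starting at the top-left position.
    """
    order = []
    for diagonal in range(height + width - 1):
        for row in range(height):
            col = diagonal - row
            if 0 <= col < width:
                order.append((row, col))
    return order


def block_to_vector(block, order):
    """Encoder side: read a 2D block into a 1D vector along the scan order."""
    return [block[row][col] for row, col in order]


def vector_to_block(vector, order, height, width):
    """Decoder side: place the 1D vector back into a 2D block."""
    block = [[0] * width for _ in range(height)]
    for value, (row, col) in zip(vector, order):
        block[row][col] = value
    return block
```

Because both sides derive the same `order`, the decoder can undo the encoder's rearrangement without any extra signaling.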
- the entropy encoding unit (240) can perform various encoding methods such as exponential Golomb, context-adaptive variable length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), etc.
- the entropy encoding unit (240) can also encode information necessary for video/image restoration (e.g., values of syntax elements, etc.) together or separately from quantized transform coefficients.
- Encoded information can be transmitted or stored in the form of a bitstream in units of network abstraction layer (NAL) units.
- the video/image information may further include information on various parameter sets, such as an adaptation parameter set (APS), a picture parameter set (PPS), a sequence parameter set (SPS), or a video parameter set (VPS).
- the video/image information may further include general constraint information.
- information and/or syntax elements transmitted/signaled from an encoding device to a decoding device may be included in the video/image information.
- the video/image information may be encoded through the above-described encoding procedure and included in the bitstream.
- the bitstream may be transmitted via a network or stored in a digital storage medium.
- the network may include a broadcasting network and/or a communication network
- the digital storage medium may include various storage media, such as a USB, SD, CD, DVD, Blu-ray, HDD, or SSD.
- a transmitting unit (not shown) that transmits the signal output from the entropy encoding unit (240) and/or a storing unit (not shown) that stores it may be configured as an internal/external element of the encoding device (200), or the transmitting unit may be included in the entropy encoding unit (240).
- the quantized transform coefficients output from the quantization unit (233) can be used to generate a prediction signal. For example, by applying inverse quantization and inverse transformation to the quantized transform coefficients through the inverse quantization unit (234) and the inverse transform unit (235), a residual signal (residual block or residual samples) can be reconstructed.
- the addition unit (250) can generate a reconstructed signal (reconstructed picture, reconstructed block, reconstructed sample array) by adding the reconstructed residual signal to the prediction signal output from the inter prediction unit (221) or the intra prediction unit (222). When there is no residual for the block to be processed, such as when skip mode is applied, the predicted block can be used as a reconstructed block.
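The behavior of the addition unit can be sketched as a sample-wise sum of prediction and residual, with the prediction used directly when no residual exists (for example, in skip mode). This is an illustrative sketch: the clipping to the valid sample range and the `bit_depth` parameter are assumptions not stated in the text, and `reconstruct_block` is a hypothetical name.

```python
def reconstruct_block(prediction, residual=None, bit_depth=8):
    """Add the residual to the prediction, sample by sample.

    Assumption: results are clipped to [0, 2**bit_depth - 1].
    If residual is None (e.g. skip mode), the prediction is returned as is.
    """
    if residual is None:
        return [row[:] for row in prediction]
    max_value = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_value) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]
```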
- the addition unit (250) may be called a reconstructor or a reconstructed block generation unit.
- the generated restoration signal can be used for intra prediction of the next processing target block within the current picture, and can also be used for inter prediction of the next picture after filtering as described below.
- LMCS luma mapping with chroma scaling
- the filtering unit (260) can improve subjective/objective picture quality by applying filtering to the restoration signal.
- the filtering unit (260) can apply various filtering methods to the restoration picture to generate a modified restoration picture, and store the modified restoration picture in the memory (270), specifically, in the DPB of the memory (270).
- the various filtering methods can include deblocking filtering, sample adaptive offset, adaptive loop filter, bilateral filter, etc.
- the filtering unit (260) can generate various information regarding filtering and transmit it to the entropy encoding unit (240).
- the information regarding filtering can be encoded by the entropy encoding unit (240) and output in the form of a bitstream.
- the modified restored picture transmitted to the memory (270) can be used as a reference picture in the inter prediction unit (221).
- the encoding device can avoid prediction mismatch between the encoding device (200) and the decoding device, and can also improve encoding efficiency.
- the DPB of the memory (270) can store the modified restored picture to be used as a reference picture in the inter prediction unit (221).
- the memory (270) can store motion information of a block from which motion information is derived (or encoded) within the current picture and/or motion information of blocks within a picture that has already been restored.
- the stored motion information can be transferred to the inter prediction unit (221) to be used as motion information of a spatial neighboring block or motion information of a temporal neighboring block.
- the memory (270) can store restored samples of restored blocks within the current picture and transfer them to the intra prediction unit (222).
- FIG. 3 is a schematic block diagram of a decoding device to which an embodiment of the present disclosure can be applied and in which decoding of a video/image signal is performed.
- the decoding device (300) may be configured to include an entropy decoder (310), a residual processor (320), a predictor (330), an adder (340), a filter (350), and a memory (360).
- the predictor (330) may include an inter-prediction unit (332) and an intra-prediction unit (331).
- the residual processor (320) may include a dequantizer (321) and an inverse transformer (322).
- the entropy decoding unit (310), residual processing unit (320), prediction unit (330), addition unit (340), and filtering unit (350) described above may be configured by a single hardware component (e.g., a decoding device chipset or processor) depending on the embodiment.
- the memory (360) may include a decoded picture buffer (DPB) and may be configured by a digital storage medium.
- the hardware component may further include the memory (360) as an internal/external component.
- the decoding device (300) can restore the image in a manner corresponding to the process by which the video/image information was processed in the encoding device of FIG. 2.
- the decoding device (300) can derive units/blocks based on block division-related information obtained from the bitstream.
- the decoding device (300) can perform decoding using a processing unit applied in the encoding device.
- the processing unit of decoding may be a coding unit, and the coding unit may be divided from a coding tree unit or a maximum coding unit according to a quad tree structure, a binary tree structure, and/or a ternary tree structure.
- One or more transform units may be derived from the coding unit. Then, the restored image signal decoded and output through the decoding device (300) can be reproduced through a reproduction device.
- the decoding device (300) can receive a signal output from the encoding device of FIG. 2 in the form of a bitstream, and the received signal can be decoded through the entropy decoding unit (310).
- the entropy decoding unit (310) can parse the bitstream to derive information (e.g., video/image information) necessary for image restoration (or picture restoration).
- the video/image information may further include information on various parameter sets, such as an adaptation parameter set (APS), a picture parameter set (PPS), a sequence parameter set (SPS), or a video parameter set (VPS).
- the video/image information may further include general constraint information.
- the decoding device can decode the picture further based on the information on the parameter set and/or the general constraint information.
- the signaled/received information and/or syntax elements described later in this specification can be decoded through the decoding procedure and obtained from the bitstream.
- the entropy decoding unit (310) can decode information in a bitstream based on a coding method such as exponential Golomb coding, CAVLC, or CABAC, and output the values of syntax elements required for image restoration and the quantized values of transform coefficients for residuals.
- the CABAC entropy decoding method receives a bin corresponding to each syntax element in the bitstream and determines a context model using information on the syntax element to be decoded, decoding information of the neighboring blocks and the block to be decoded, or information on symbols/bins decoded in a previous step. It then predicts the occurrence probability of the bin according to the determined context model and performs arithmetic decoding of the bin to generate a symbol corresponding to the value of each syntax element.
- after determining the context model, the CABAC entropy decoding method can update the context model using information on the decoded symbol/bin, for use with the next symbol/bin.
- among the information decoded by the entropy decoding unit (310), information regarding prediction is provided to the prediction unit (inter-prediction unit (332) and intra-prediction unit (331)), and the residual values on which entropy decoding has been performed by the entropy decoding unit (310), i.e., the quantized transform coefficients and related parameter information, can be input to the residual processing unit (320).
- the residual processing unit (320) can derive a residual signal (residual block, residual samples, residual sample array).
- information regarding filtering can be provided to the filtering unit (350).
- a receiving unit that receives a signal output from an encoding device may be further configured as an internal/external element of a decoding device (300), or the receiving unit may be a component of an entropy decoding unit (310).
- a decoding device may be called a video/image/picture decoding device, and the decoding device may be divided into an information decoding device (video/image/picture information decoding device) and a sample decoding device (video/image/picture sample decoding device).
- the information decoding device may include the entropy decoding unit (310), and the sample decoding device may include at least one of the inverse quantization unit (321), the inverse transformation unit (322), the addition unit (340), the filtering unit (350), the memory (360), the inter prediction unit (332), and the intra prediction unit (331).
- the inverse quantization unit (321) can inverse quantize the quantized transform coefficients and output the transform coefficients.
- the inverse quantization unit (321) can rearrange the quantized transform coefficients into a two-dimensional block form. In this case, the rearrangement can be performed based on the coefficient scanning order performed in the encoding device.
- the inverse quantization unit (321) can perform inverse quantization on the quantized transform coefficients using quantization parameters (e.g., quantization step size information) and obtain transform coefficients.
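Inverse quantization driven by a quantization parameter can be sketched as scaling each quantized level by a step size. The text only says that quantization parameters (e.g., step size information) are used; the specific relation below, in which the step size doubles every 6 QP units, is an assumption borrowed from common block-based codecs, and real decoders typically use integer scaling tables instead of floating point. The names `quantization_step` and `dequantize` are hypothetical.

```python
def quantization_step(qp):
    """Assumed QP-to-step-size relation: the step doubles every 6 QP units."""
    return 2.0 ** ((qp - 4) / 6.0)


def dequantize(levels, qp):
    """Scale every quantized level in a 2D block by the step size for qp."""
    step = quantization_step(qp)
    return [[level * step for level in row] for row in levels]
```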
- the transform coefficients are inversely transformed to obtain a residual signal (residual block, residual sample array).
- the prediction unit (330) can perform a prediction on the current block and generate a predicted block including prediction samples for the current block.
- the prediction unit (330) can determine whether intra-prediction or inter-prediction is applied to the current block based on the information regarding the prediction output from the entropy decoding unit (310), and can determine a specific intra/inter-prediction mode.
- the prediction unit (330) can generate a prediction signal based on various prediction methods described below.
- the prediction unit (330) can apply intra prediction or inter prediction for prediction of a single block, and can also apply intra prediction and inter prediction simultaneously. This can be called combined inter and intra prediction (CIIP) mode.
- the prediction unit can be based on an intra block copy (IBC) prediction mode or a palette mode for prediction of a block.
- the IBC prediction mode or palette mode can be used for coding content images/videos such as games, for example, in screen content coding (SCC).
- IBC basically performs prediction within the current picture, but can be performed similarly to inter prediction in that it derives a reference block within the current picture. That is, IBC can utilize at least one of the inter prediction techniques described herein.
- Palette mode can be viewed as an example of intra coding or intra prediction. When palette mode is applied, information about the palette table and palette index may be included and signaled in the video/image information.
- the intra prediction unit (331) can predict the current block by referring to samples within the current picture.
- the referenced samples may be located in the neighborhood of the current block, or may be located a certain distance away from the current block, depending on the prediction mode.
- the prediction modes may include one or more non-directional modes and multiple directional modes.
- the intra prediction unit (331) may also determine the prediction mode applied to the current block by using the prediction mode applied to the neighboring blocks.
- the inter prediction unit (332) can derive a prediction block for the current block based on a reference block (reference sample array) specified by a motion vector on a reference picture.
- the motion information can be predicted in units of blocks, subblocks, or samples based on the correlation of the motion information between the neighboring blocks and the current block.
- the motion information can include a motion vector and a reference picture index.
- the motion information can further include inter prediction direction information (L0 prediction, L1 prediction, Bi prediction, etc.).
- the neighboring blocks can include spatial neighboring blocks existing in the current picture and temporal neighboring blocks existing in the reference picture.
- the inter prediction unit (332) can construct a motion information candidate list based on the neighboring blocks, and derive the motion vector and/or reference picture index of the current block based on the received candidate selection information.
- Inter prediction can be performed based on various prediction modes, and information about the prediction can include information indicating an inter prediction mode for the current block.
- the addition unit (340) can generate a restoration signal (restored picture, restoration block, restoration sample array) by adding the acquired residual signal to the prediction signal (prediction block, prediction sample array) output from the prediction unit (including the inter-prediction unit (332) and/or intra-prediction unit (331)).
- the prediction block can be used as the restoration block.
- the addition unit (340) may be referred to as a restoration unit or restoration block generation unit.
- the generated restoration signal may be used for intra prediction of the next processing target block within the current picture, may be output after filtering as described below, or may be used for inter prediction of the next picture.
- LMCS luma mapping with chroma scaling
- the filtering unit (350) can improve subjective/objective image quality by applying filtering to the restored signal.
- the filtering unit (350) can apply various filtering methods to the restored picture to generate a modified restored picture, and transmit the modified restored picture to the memory (360), specifically, to the DPB of the memory (360).
- the various filtering methods can include deblocking filtering, sample adaptive offset, adaptive loop filter, bilateral filter, etc.
- the (modified) reconstructed picture stored in the DPB of the memory (360) can be used as a reference picture in the inter prediction unit (332).
- the memory (360) can store motion information of a block from which motion information is derived (or decoded) in the current picture and/or motion information of blocks in a picture that has already been reconstructed.
- the stored motion information can be transferred to the inter prediction unit (332) to be used as motion information of a spatial neighboring block or motion information of a temporal neighboring block.
- the memory (360) can store reconstructed samples of reconstructed blocks in the current picture and transfer them to the intra prediction unit (331).
- the embodiments described in the filtering unit (260), the inter prediction unit (221), and the intra prediction unit (222) of the encoding device (200) can be applied to the filtering unit (350), the inter prediction unit (332), and the intra prediction unit (331) of the decoding device (300) in the same or corresponding manner, respectively.
- FIG. 4 illustrates an image decoding method performed by a decoding device as an embodiment according to the present disclosure.
- the transform coefficients of the current block can be derived from the bitstream (S400). That is, the bitstream can include residual information of the current block, and the transform coefficients of the current block can be derived by decoding the residual information.
- residual samples of the current block can be derived by performing at least one of dequantization or inverse-transform on the transform coefficients of the current block (S410).
- the inverse transform may be performed based on at least one of DCT-2, DST-7, or DCT-8.
- DCT-2, DST-7, DCT-8, etc. may be referred to as a transform type, a transform kernel, or a transform core.
- the inverse transform may mean a separable transform.
- the present disclosure is not limited thereto, and the inverse transform may also mean a non-separable transform, or may be a concept that includes both a separable transform and a non-separable transform.
- the inverse transform in the present disclosure means a primary transform, but is not limited thereto, and may be applied to a secondary transform in an identical or similar form.
- DCT-2 and a non-separable transform may be used, or a non-separable transform may be used in addition to at least one of DCT-2, DST-7, or DCT-8, or a non-separable transform may replace the transform kernel of one or more of DCT-2, DST-7, or DCT-8.
- a non-separable transform can replace or be added to one or more of the five transform kernel candidates.
- the notation (transform1, transform2) indicates that transform1 is applied in the horizontal direction and transform2 is applied in the vertical direction. If a non-separable transform replaces some of the transform kernel candidates, the remaining transform kernel candidates except (DCT-2, DCT-2) and (DST-7, DST-7) can be replaced with the non-separable transform.
- the above transform kernel candidates are only examples, and other types of DCT and/or DST may be included, and a transform skip may be included as a transform kernel candidate.
- a non-separable transformation can refer to a transformation or inverse transformation based on a non-separable transformation matrix. That is, unlike a separable transformation that performs vertical and horizontal transformations independently by separating the vertical and horizontal transformations, a non-separable transformation can perform horizontal and vertical transformations simultaneously.
- the input data X to the non-separable transformation is as follows:
- vector X' can be expressed as follows.
- the non-separable transformation can be performed as in the following mathematical expression 3: F = T · X'
- F represents a transformation coefficient vector
- T represents a 16x16 non-separable transformation matrix
- · represents the multiplication of a matrix and a vector.
- a 16x1 transform coefficient vector F can be derived through the above mathematical expression 3, and the F can be reconstructed into 4x4 blocks according to a predetermined scan order.
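The matrix-vector form of the non-separable transformation for a 4x4 block can be sketched as below. The sketch makes several assumptions that are not taken from the text: the input is flattened in row-major order, the inverse uses the transpose of T (valid only if T is orthonormal), and the demonstration matrix is built as the Kronecker product of two 4-point DCT-2 matrices purely so that a concrete orthonormal 16x16 matrix is available. A trained non-separable kernel would generally not factor this way. All function names are hypothetical.

```python
import math


def dct2_matrix(n):
    """Orthonormal n-point DCT-2 matrix; each row is one basis function."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                     for j in range(n)])
    return rows


def kron(a, b):
    """Kronecker product of two square matrices (demo kernel only)."""
    n, m = len(a), len(b)
    return [[a[i // m][k // m] * b[i % m][k % m] for k in range(n * m)]
            for i in range(n * m)]


def matvec(t, x):
    """Multiply matrix t by column vector x."""
    return [sum(t_ij * x_j for t_ij, x_j in zip(row, x)) for row in t]


def forward_nonseparable(block, t):
    """F = T . X', with X' read from the block in row-major order (assumed)."""
    x = [value for row in block for value in row]
    return matvec(t, x)


def inverse_nonseparable(f, t, height, width):
    """X' = T^T . F (assumes T is orthonormal), reshaped in row-major order."""
    t_transposed = [list(col) for col in zip(*t)]
    x = matvec(t_transposed, f)
    return [x[r * width:(r + 1) * width] for r in range(height)]


# Demo kernel: a 16x16 orthonormal matrix for 4x4 blocks.
T_DEMO = kron(dct2_matrix(4), dct2_matrix(4))
```

With `T_DEMO`, a constant block concentrates all of its energy in the first coefficient, which is the kind of energy compaction a transform kernel is chosen for.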
- the scan order can be a horizontal scan, a vertical scan, a diagonal scan, a z-scan, a raster scan, or a predefined scan.
- the non-separable transform set and/or transform kernel for the above non-separable transform can be variously configured based on at least one of a prediction mode (e.g., intra mode, inter mode, etc.), the width, height, or number of pixels of the current block, the position of a sub-block within the current block, explicitly signaled syntax elements, statistical characteristics of surrounding samples, whether a secondary transform is used, or a quantization parameter (QP).
- pre-defined intra prediction modes are grouped to correspond to n sets of non-separable transformations, and each set of non-separable transformations may include k transform kernel candidates.
- n and k may be arbitrary constants according to rules (conditions) defined identically for the encoding device and the decoding device.
- the number of non-separable transformation sets and/or the number of transformation kernel candidates included in the non-separable transformation sets may be configured differently depending on the width and/or height of the current block.
- n1 non-separable transformation sets and k1 transformation kernel candidates may be configured.
- n2 non-separable transformation sets and k2 transformation kernel candidates may be configured differently depending on the product of the width and the height of the current block.
- n3 non-separable transformation sets and k3 transformation kernel candidates may be configured; otherwise, n4 non-separable transformation sets and k4 transformation kernel candidates may be configured. That is, since the degree of change in the statistical characteristics of the residual signal varies depending on the block size, the number of non-separable transformation sets and transformation kernel candidates can be configured differently to reflect this.
- the statistical characteristics of the residual signal may be different for each sub-block, and therefore the number of non-separable transform sets and transform kernel candidates may be configured differently.
- n5 non-separable transform sets and k5 transform kernel candidates may be configured for the upper left 4x4 sub-block
- n6 non-separable transform sets and k6 transform kernel candidates may be configured for the other 4x4 sub-blocks.
- the number of non-separable transformation sets and transformation kernel candidates can be configured differently.
- syntax element information indicating any one of a plurality of non-separable transformation configurations can be used. For example, if three kinds of non-separable transformation configurations are supported (i.e., n7 non-separable transformation sets and k7 transformation kernel candidates, n8 non-separable transformation sets and k8 transformation kernel candidates, n9 non-separable transformation sets and k9 transformation kernel candidates), the corresponding syntax element can have values of 0, 1, and 2, and the non-separable transformation configuration applied to the current block can be determined according to the value of the signaled syntax element.
- the number of non-separable transformation sets and transformation kernel candidates can be configured differently. For example, if the secondary transformation is not applied, a non-separable transformation configuration including n10 non-separable transformation sets and k10 transformation kernel candidates can be applied. If the secondary transformation is applied, a non-separable transformation configuration including n11 non-separable transformation sets and k11 transformation kernel candidates can be applied.
- non-separable transform configurations can be applied. For example, when the QP value is small, a non-separable transform configuration including n12 non-separable transform sets and k12 transform kernel candidates can be applied. On the other hand, when the QP value is large, a non-separable transform configuration including n13 non-separable transform sets and k13 transform kernel candidates can be applied. If the QP value is less than or equal to a threshold (e.g., 32), the case is classified as having a small QP value, and otherwise, the case is classified as having a large QP value. Alternatively, the range of QP values can be divided into three or more ranges, and a different non-separable transform configuration can be applied to each range.
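The QP-dependent choice of a configuration can be sketched as a lookup over sorted thresholds, which covers both the two-range case (threshold 32) and the variant with three or more ranges. The configuration labels below are placeholders standing in for the unspecified (n, k) pairs, the second threshold value 22 used in the usage note is invented for illustration, and `select_configuration` is a hypothetical name.

```python
import bisect


def select_configuration(qp, thresholds, configurations):
    """Pick the configuration whose QP range contains qp.

    thresholds must be sorted ascending; range i covers QP values that are
    less than or equal to thresholds[i] and greater than thresholds[i - 1].
    The last configuration covers every QP above the last threshold.
    """
    if len(configurations) != len(thresholds) + 1:
        raise ValueError("need exactly one more configuration than thresholds")
    return configurations[bisect.bisect_left(thresholds, qp)]


# Placeholder labels; the actual numbers of sets and kernels are not given.
TWO_RANGE = ["n12 sets / k12 kernels", "n13 sets / k13 kernels"]
```

For example, `select_configuration(32, [32], TWO_RANGE)` returns the small-QP configuration because a QP equal to the threshold counts as small, and a call such as `select_configuration(27, [22, 32], [...])` with three entries would pick the middle range.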
- the block can be divided into multiple sub-blocks and a non-separable transform corresponding to the width and height of the sub-blocks can be used.
- a non-separable transform corresponding to the width and height of the sub-blocks can be used.
- the 4x8 block can be divided into two 4x4 sub-blocks and a 4x4 block-based non-separable transform can be used for each 4x4 sub-block.
- the block can be divided into two 8x8 sub-blocks and an 8x8 block-based non-separable transform can be used.
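Dividing a block into equally sized sub-blocks, each of which would then receive its own non-separable transform, can be sketched as follows. The function name `split_into_subblocks` and the raster ordering of the returned sub-blocks are assumptions for illustration.

```python
def split_into_subblocks(block, sub_height, sub_width):
    """Split a 2D block into sub-blocks, returned in raster order.

    Assumption: the block dimensions are exact multiples of the sub-block
    dimensions; otherwise a ValueError is raised.
    """
    height, width = len(block), len(block[0])
    if height % sub_height or width % sub_width:
        raise ValueError("block size must be a multiple of the sub-block size")
    subblocks = []
    for top in range(0, height, sub_height):
        for left in range(0, width, sub_width):
            subblocks.append([row[left:left + sub_width]
                              for row in block[top:top + sub_height]])
    return subblocks
```

A block with 4 rows and 8 columns split with `sub_height=4, sub_width=4` yields two 4x4 sub-blocks, matching the 4x8 example in the text.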
- the above non-separable transform set can be determined based on the intra prediction mode of the current block and a mapping table.
- the mapping table can define a mapping relationship between pre-defined intra prediction modes and non-separable transform sets.
- the pre-defined intra prediction modes can include two non-directional modes and 65 directional modes.
- the size of the transform kernel of a non-separable transform is larger than that of a separable transform. This means that the computational complexity required for the transform process is high and the memory required for storing the transform kernel is large. Meanwhile, a separable transform can only consider statistical characteristics existing in the horizontal and/or vertical directions, but a non-separable transform can simultaneously consider statistical characteristics in a two-dimensional space including the horizontal and vertical directions, thereby providing better compression efficiency.
- the non-directional mode may include the planar mode (number 0) and the DC mode (number 1), and the directional mode may include the intra prediction modes (numbers 2 to 66).
- the present disclosure may also be applied to cases where the number of pre-defined intra prediction modes is different.
- the pre-defined intra prediction modes may further include intra prediction modes from -14 to -1 and intra prediction modes from 67 to 80.
- FIG. 5 exemplarily illustrates intra prediction modes and their prediction directions according to the present disclosure.
- modes -14 to -1 and 2 to 33 are symmetrical to modes 35 to 80 in prediction direction with respect to mode 34.
- modes 10 and 58 are symmetrical with respect to the direction corresponding to mode 34
- mode -1 is symmetrical with mode 67. Therefore, for vertical modes that are symmetrical with respect to horizontal modes with respect to mode 34, input data can be transposed and used. Transposing input data means that rows in the input data MxN of a 2D block become columns and columns become rows to form NxM data.
- the 16 data that make up the 4x4 block can be appropriately arranged to form a 16x1 one-dimensional vector for non-separable transformation.
- the one-dimensional vector can be formed in row-major order or column-major order.
- the residual samples resulting from the non-separable transformation can be arranged in the above order to form a two-dimensional block.
- the input vector can be constructed according to column-major order.
- although mode 34 can be considered neither a horizontal mode nor a vertical mode, it is classified as belonging to the horizontal mode in the present disclosure. That is, for modes -14 to -1 and 2 to 33, the input data alignment method for the horizontal mode, i.e., row-major order, can be used, and the input data can be transposed and used for the vertical mode that is symmetrical around mode 34.
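The handling of horizontal and vertical modes described above can be sketched as a small helper that returns the mode used for selecting a transform set together with the flattened input vector. The mode pairs named in the text (10 and 58, -1 and 67) are consistent with the relation 68 - P for modes 2 to 66 and 66 - Q for the wide-angle modes -14 to -1 and 67 to 80; applying those two formulas to these exact ranges is an inference made for this sketch, and the function names are hypothetical.

```python
def symmetric_mode(mode):
    """Return the mode mirrored about mode 34 (inferred range assignment)."""
    if 2 <= mode <= 66:
        return 68 - mode
    if -14 <= mode <= -1 or 67 <= mode <= 80:
        return 66 - mode
    return mode  # planar (0) and DC (1) have no direction to mirror


def is_vertical(mode):
    """Modes 35 to 80 are treated as vertical; mode 34 counts as horizontal."""
    return 35 <= mode <= 80


def prepare_input(block, mode):
    """Return (mode used for set selection, 1D input vector).

    Horizontal modes: block read in row-major order.
    Vertical modes: block transposed first, which equals a column-major read,
    and the mirrored horizontal mode is used for the lookup.
    """
    if is_vertical(mode):
        vector = [value for column in zip(*block) for value in column]
        return symmetric_mode(mode), vector
    return mode, [value for row in block for value in row]
```

Under these assumptions only the horizontal half of the directional modes needs its own entries in a mapping table, since every vertical mode is redirected to its mirror.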
- the symmetry between block shapes that are in a transpose relationship with each other, i.e., the symmetry between the KxL block and the LxK block, can also be utilized.
- a symmetry relationship exists between a KxL block predicted by the P mode and an LxK block predicted by the (68-P) mode.
- a symmetry relationship exists between a KxL block predicted by the Q mode and an LxK block predicted by the (66-Q) mode.
- the same transform kernel can be applied to the KxL block and the LxK block.
- the non-separable transform set can be derived through a mapping table corresponding to the KxL block based on the (68-P) mode instead of the P mode applied to the LxK block.
- the non-separable transform set can be derived through a mapping table corresponding to the KxL block based on the (66-Q) mode instead of the Q mode applied to the LxK block.
- the non-separable transformation set can be selected based on mode 2 instead of mode 66.
- the input data can be read in a pre-determined order (e.g., row-major order or column-major order) to form a one-dimensional vector and then the corresponding non-separable transformation can be applied.
- the input data can be read in the transposed order to form a one-dimensional vector and then the corresponding non-separable transformation can be applied. That is, if the KxL block is read in row-major order, the LxK block can be read in column-major order. Conversely, if the KxL block is read in column-major order, the LxK block can be read in row-major order.
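The equivalence behind the transposed read order can be checked with a short sketch: reading a block in column-major order produces the same vector as transposing the block and reading the result in row-major order. The helper names are hypothetical.

```python
def transpose(block):
    """Rows become columns and columns become rows."""
    return [list(column) for column in zip(*block)]


def read_row_major(block):
    """Flatten a block row by row."""
    return [value for row in block for value in row]


def read_column_major(block):
    """Flatten a block column by column."""
    return [block[r][c] for c in range(len(block[0])) for r in range(len(block))]
```

This is why a kernel designed for one block shape read in row-major order can be reused for the transposed shape simply by switching the read order, without storing a second kernel.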
- a non-separable transformation set can be determined based on mode 34, and the input data can be read in a predetermined order to form a one-dimensional vector to perform the corresponding non-separable transformation.
- if mode 34 is applied to an LxK block, a non-separable transformation set can likewise be determined based on mode 34, but the input data can be read in a transposed order to form a one-dimensional vector on which the corresponding non-separable transformation is performed.
- the non-separable transformation can be performed based on an LxK block by utilizing the symmetry described above for a KxL block in the same manner.
- the reference block can be restricted to a block whose width is greater than its height.
- the use of symmetry can be disabled for non-square blocks.
- non-square blocks can use a different number of non-separable transformation sets and/or transformation kernel candidates than square blocks, and can select a non-separable transformation set using a different mapping table than square blocks.
- an example of a mapping table for selecting a non-separable transformation set is as follows.
- Table 1 shows an example of assigning non-separable transform sets according to intra prediction modes when there are five non-separable transform sets.
- the value of predModeIntra indicates the value of the intra prediction mode considering WAIP
- TrSetIdx is an index indicating a specific non-separable transform set.
- from Table 1, it can be confirmed that the same non-separable transform set is applied to modes located in symmetrical directions according to the intra prediction mode.
- Table 1 is only an example using five non-separable transform sets and does not limit the total number of non-separable transform sets for non-separable transforms.
- non-separable transform may not be applied to WAIP for compression performance.
- a non-separable transform set corresponding to an adjacent intra prediction mode may be shared.
- the above non-separable transform set may include multiple transform kernel candidates, and any one of the multiple transform kernel candidates may be selectively used.
- an index signaled through a bitstream may be used.
- any one of the multiple transform kernel candidates may be implicitly determined based on context information of the current block.
- the context information may mean the size of the current block or whether a non-separable transform is applied to a neighboring block.
- the size of the current block may be defined by the width, the height, the maximum/minimum values of the width and the height, the sum of the width and the height, or the product of the width and the height.
- inverse transformation can be divided into separable transformation and non-separable transformation.
- Separable transformation means performing transformations in the horizontal and vertical directions respectively for a two-dimensional block
- non-separable transformation can mean performing a single transformation on samples constituting the entire or a portion of a two-dimensional block.
- Each transformation set can contain one or more transformation kernel candidates.
- any one of (DST-7, DST-7), (DCT-8, DST-7), (DST-7, DCT-8), or (DCT-8, DCT-8) can be applied as a separable transform, and the four transform kernel candidates can be regarded as one transform set.
- (DCT-2, DCT-2) can be regarded as one transform set.
- a transform skip that does not apply a transform can also be regarded as one transform set, and (DCT-2, DCT-2) and the transform skip can be regarded as one transform set.
- a transform kernel may refer to one transform (e.g., DCT-2, DST-7) or a pair of two transforms (e.g., (DCT-2, DCT-2)).
- as a transform set, there may also be the aforementioned non-separable transform set.
- the non-separable transform applied as the primary transform may be denoted as a Non-Separable Primary Transform (NSPT).
- NSPT Non-Separable Primary Transform
- multiple non-separable transform sets may be configured, and each non-separable transform set may include one or more transform kernels as transform kernel candidates.
- one of the multiple non-separable transform sets is selected according to the intra prediction mode, and the multiple non-separable transform sets for NSPT may be denoted as an NSPT set list. This is as discussed above, and a detailed description thereof will be omitted here.
- a group of one or more transformation sets available to a current block can be formed from a plurality of pre-defined transformation sets.
- the group of one or more transformation sets can be formed by a predetermined area unit to which the current block belongs, and is hereinafter referred to as a collection.
- the predetermined area unit can be at least one of a picture, a slice, a coding tree unit row (CTU row), or a coding tree unit (CTU).
- let the transform set consisting of (DCT-2, DCT-2) be denoted $S_1$, and the transform set consisting of (DST-7, DST-7), (DCT-8, DST-7), (DST-7, DCT-8) and (DCT-8, DCT-8) be denoted $S_2$.
- the above-mentioned NSPT set list can include N non-separable transform sets, denoted $S_{3,1}, S_{3,2}, \ldots, S_{3,N}$, respectively.
- N can be 35, but is not limited thereto.
- the transformation kernel applicable to the current block may belong to any one of $S_1$, $S_2$, or $S_{3,13}$.
- the collection available to the current block can be denoted as $\{S_1, S_2, S_{3,13}\}$.
- since a collection according to the present disclosure is a group of one or more transform sets available to the current block, the collection may be configured differently depending on the context of the current block.
- a collection may be constructed based on the context of the current block, and at this time, a process of selecting one of a plurality of transformation sets belonging to the collection and selecting one of a plurality of transformation kernel candidates belonging to the selected transformation set may be performed.
- the selection of the transformation set and the transformation kernel candidate may be performed implicitly based on the context of the current block or based on an explicitly signaled index.
- the process of selecting one of a plurality of transformation sets belonging to the collection and the process of selecting one of a plurality of transformation kernel candidates belonging to the selected transformation set may be performed separately. For example, an index for selecting a transformation set may be first signaled, and based on this, one of the plurality of transformation sets belonging to the collection may be selected.
- an index indicating one of a plurality of transformation kernel candidates belonging to the transformation set may be signaled, and based on the signaled index, one of the transformation kernel candidates may be selected from the transformation set.
- the transformation kernel of the current block may be determined based on the selected transformation kernel candidate.
- the selection of any one of the transformation sets from the collection may be implicitly performed based on the context of the current block, and the selection of any one of the transformation kernel candidates from the selected transformation set may be performed based on a signaled index.
- the selection of any one of the transformation sets from the collection may be implicitly performed based on the context of the current block, and the selection of any one of the transformation kernel candidates from the selected transformation set may also be implicitly performed based on the context of the current block.
- the index for selecting the transformation set may not be signaled.
- the index for indicating the corresponding transformation kernel candidate may not be signaled.
- an index indicating any one of all transformation kernel candidates in the current collection may be signaled.
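The two signaling styles described above (a set index followed by a kernel candidate index, versus a single index over all candidates in the collection) can be sketched as follows. This is a minimal illustration, not the disclosed syntax: the function names, the string labels, and the number of candidates in the NSPT set are assumptions made for the example.

```python
# Hypothetical collection {S_1, S_2, S_3,13}; each set is a list of candidates.
S1 = ["(DCT-2, DCT-2)"]
S2 = ["(DST-7, DST-7)", "(DCT-8, DST-7)", "(DST-7, DCT-8)", "(DCT-8, DCT-8)"]
S3_13 = ["NSPT kernel 0", "NSPT kernel 1", "NSPT kernel 2"]  # candidate count assumed
COLLECTION = [S1, S2, S3_13]


def select_two_step(collection, set_idx, kernel_idx):
    """Pick a transform set first, then a kernel candidate inside that set."""
    return collection[set_idx][kernel_idx]


def select_joint(collection, joint_idx):
    """Pick a candidate with one index that runs over every set in order."""
    for transform_set in collection:
        if joint_idx < len(transform_set):
            return transform_set[joint_idx]
        joint_idx -= len(transform_set)
    raise IndexError("joint index exceeds the number of candidates in the collection")
```

With this ordering, `select_two_step(COLLECTION, 1, 2)` and `select_joint(COLLECTION, 3)` address the same candidate, which shows that the joint index is simply the two-step selection flattened over the collection.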
- all transformation sets in the collection can be rearranged based on priority. For example, in cases where small-length binary codes are assigned to small-value indices, such as truncated unary codes, it may be advantageous to assign small-value indices to transformation kernel candidates that are more advantageous for improving coding performance.
- priority shuffling
- different shuffling can be applied to each collection.
- only some of them can be selectively rearranged.
- the transformation kernel for the inverse transformation of the current block can be determined based on MTS (Multiple Transform Selection).
- the MTS according to the present disclosure may use at least one of DST-7, DCT-8, DCT-5, DST-4, DST-1, or IDT (identity transform) as a transform kernel.
- the MTS according to the present disclosure may further include a DCT-2 transform kernel.
- multiple MTS sets for MTS can be defined. Based on the size of the current block and/or the intra prediction mode, any one of the multiple MTS sets can be determined. For example, in determining any one MTS set, 16 transform block sizes can be considered, and for directional modes, the shape of the transform block and the symmetry between the intra prediction modes can be considered.
- the MTS set corresponding to mode 2 can be applied to modes -1 to -14 (or -15), and the MTS set corresponding to mode 66 can be applied to modes 67 to 80 (or 81).
- a separate MTS set can be allocated for the MIP (Matrix-based Intra Prediction) mode.
- the MTS set according to the transform block size and intra prediction mode can be allocated/defined as shown in Table 4 below.
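- Table 4 (MTS set index by transform block size and intra prediction mode range):

  | Width | Height | Modes [0, 1] | Modes [2, 12] | Modes [13, 23] | Modes [24, 34] | MIP |
  |---|---|---|---|---|---|---|
  | 4 | 4 | 0 | 1 | 2 | 3 | 4 |
  | 4 | 8 | 5 | 6 | 7 | 8 | 9 |
  | 4 | 16 | 10 | 11 | 12 | 13 | 14 |
  | 4 | 32 | 15 | 16 | 17 | 18 | 19 |
  | 8 | 4 | 20 | 21 | 22 | 23 | 24 |
  | 8 | 8 | 25 | 26 | 27 | 28 | 29 |
  | 8 | 16 | 30 | 31 | 32 | 33 | 34 |
  | 8 | 32 | 35 | 36 | 37 | 38 | 39 |
  | 16 | 4 | 40 | 41 | 42 | 43 | 44 |
  | 16 | 8 | 45 | 46 | 47 | 48 | 49 |
  | 16 | 16 | 50 | 51 | 52 | 53 | 54 |
  | 16 | 32 | 55 | 56 | 57 | 58 | 59 |
  | 32 | 4 | 60 | 61 | 62 | 63 | 64 |
  | 32 | 8 | 65 | 66 | 67 | 68 | 69 |
  | 32 | 16 | 70 | 71 | 72 | 73 | 74 |
  | 32 | 32 | 75 | 76 | 77 | 78 | 79 |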
- Table 4 shows the allocation of MTS sets according to 16 transform block sizes and intra prediction modes.
- the number of predefined MTS sets is 80, and the index indicating any one of the 80 MTS sets can range from 0 to 79, as shown in Table 4.
- Table 5 shows the transformation kernel candidates included in each MTS set discussed in Table 4.
- Each MTS set can be composed of six transformation kernel candidates.
- the transformation kernel candidate index has a value of any one of 0 to 5 and can indicate any one of the six transformation kernel candidates.
- each transformation kernel candidate can be a combination of a horizontal transformation kernel and a vertical transformation kernel for a separable transformation, and 25 transformation kernel candidates with indices of 0 to 24 can be defined.
- Table 6 is an example of the 25 transform kernel candidates discussed in Table 5. Specifically, the horizontal and vertical transforms of the transform kernel candidates are indicated as (horizontal transform, vertical transform). For each transform kernel candidate index, the horizontal/vertical transform when the intra prediction mode is less than 35 may be the opposite of the horizontal/vertical transform when the intra prediction mode is 35 or greater. When the value of the intra prediction mode is 35 or greater, a mode symmetrical around mode 34 may be derived, and an MTS set may be selected from Table 4 based on the mode. In addition, the symmetry of the block shape may be additionally considered. When the original transform block has a WxH size, the original transform block may be considered to have a HxW size by symmetrizing it, and an MTS set may be selected from Table 4.
- the value of the intra prediction mode may be the value of the modified intra prediction mode. That is, the WAIP mode values -14 (or -15) to -1 are modified to mode 2, the WAIP mode values 67 to 80 (or 81) are modified to mode 66, and for the remaining modes the original intra prediction mode value is used unchanged as the modified intra prediction mode value.
- since the extended modes for WAIP are also configured symmetrically around mode 34, the symmetry around mode 34 can be utilized for all directional modes, that is, for all modes except the Planar mode and the DC mode.
- an MTS set with an index of 72 can be selected, as defined in Table 4.
- the MTS set assigned to the MIP mode may be selected based on the size of the current block, without considering the symmetry of the block shape.
- the MTS set assigned to the MIP mode may be selected based on the symmetric block size, considering the symmetry of the block shape. For example, when the MIP mode is applied to an 8x16 block, the 8x16 block may be regarded as a symmetrical 16x8 block, and an MTS set with an index of 49 may be selected, as defined in Table 4.
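The lookup in Table 4, together with the mode and block-shape symmetry described above, can be sketched as follows. The function and parameter names are hypothetical, and the sketch assumes that mirroring a mode of 35 or greater around mode 34 is always accompanied by swapping width and height; the disclosure presents both as options. The index arithmetic simply reproduces the regular layout of Table 4 (five columns per block size, sizes ordered 4, 8, 16, 32).

```python
SIZES = (4, 8, 16, 32)  # supported transform block widths and heights


def fold_mode(mode):
    """Map WAIP modes to 2 or 66, then mirror modes >= 35 around mode 34."""
    if mode < 0:
        mode = 2
    elif mode > 66:
        mode = 66
    mirrored = mode >= 35
    if mirrored:
        mode = 68 - mode  # e.g. 50 -> 18, 66 -> 2
    return mode, mirrored


def mts_set_index(width, height, mode=None, is_mip=False, use_shape_symmetry=False):
    """Return the Table 4 MTS set index (0..79) for a block."""
    if is_mip:
        if use_shape_symmetry:  # hypothetical switch between the two MIP options
            width, height = height, width
        column = 4
    else:
        mode, mirrored = fold_mode(mode)
        if mirrored:  # assumption: shape symmetry is applied with mode symmetry
            width, height = height, width
        column = 0 if mode <= 1 else 1 if mode <= 12 else 2 if mode <= 23 else 3
    row = SIZES.index(width) * 4 + SIZES.index(height)
    return 5 * row + column
```

For the 8x16 MIP example above, `mts_set_index(8, 16, is_mip=True, use_shape_symmetry=True)` treats the block as 16x8 and yields 49, while the same call without shape symmetry yields 34.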
- the intra prediction mode may be regarded as the Planar mode.
- the MTS set assigned to the MIP mode may be selected based on the size of the current block, without considering the symmetry of the block shape.
- the MTS set assigned to the MIP mode may be selected based on the symmetrical block size, considering the symmetry of the block shape.
- a flag may be used to indicate whether the MIP mode is applied in transpose mode. If the MIP mode is applied to the current block of MxN and the flag indicates application of the transpose mode, the intra prediction mode is regarded as the Planar mode, and the current block of MxN may be regarded as an NxM block. That is, from Table 4, an MTS set corresponding to the block size of NxM and the Planar mode may be selected. As seen in Table 6, if the value of the intra prediction mode is 35 or greater, the horizontal and vertical transformations are swapped, but since the intra prediction mode of the current block is regarded as the Planar mode, the horizontal and vertical transformations of the transform kernel candidates may not be swapped.
- the intra prediction mode is not regarded as the Planar mode, and the current block of MxN may be regarded as an NxM block. That is, from Table 4, an MTS set corresponding to the block size of NxM and the MIP mode may be selected.
- a transformation kernel candidate selected by a transformation kernel candidate index may be set as the transformation kernel of the current block.
- at least one of the horizontal transformation or the vertical transformation of the selected transformation kernel candidate may be changed to another transformation kernel. For example, if the transformation kernel candidate index is 3 and the width and height of the current block are both 16 or less, at least one of the horizontal transformation or the vertical transformation of the transformation kernel candidate corresponding to the transformation kernel candidate index of 3 may be changed to another transformation kernel. At this time, the horizontal transformation and the vertical transformation may be changed independently of each other.
- the vertical transformation of the selected transformation kernel candidate may be changed to an identity transform (IDT). If the difference (or the absolute value of the difference) between the value of the intra prediction mode of the current block and the value of the vertical mode is less than or equal to a predetermined threshold, the horizontal transformation of the selected transformation kernel candidate may be changed to an identity transform (IDT).
- IDT identity transform
- the threshold value can be determined based on the width and height of the current block, as shown in Table 7 below.
- Table 7 defines thresholds according to the size of a transform block for changing the horizontal transformation and/or vertical transformation of a transform kernel candidate selected by a transform kernel candidate index to another transform kernel.
- the six transform kernel candidates constituting one MTS set can be distinguished by transform kernel candidate indices from 0 to 5 as defined in Table 5.
- the transform kernel candidate indices can be signaled through the bitstream.
- a flag indicating whether the MTS set is available/applied (MTS enabled flag or MTS flag) can be signaled, and the transform kernel candidate index can be signaled when the flag indicates the availability/applicability of the MTS set.
- the MTS flag can be composed of one bin, and one or more contexts for CABAC-based entropy coding (hereinafter referred to as CABAC context) can be allocated to the bin. For example, different CABAC contexts can be allocated for non-MIP mode and MIP mode, respectively.
- the number of transform kernel candidates available to the current block may be set differently.
- the sum of the absolute values of all or part of the transform coefficients in the current block may be considered.
- the sum of the absolute values of the transform coefficients is referred to as AbsSum. If AbsSum is less than or equal to T1, only one transform kernel candidate corresponding to a transform kernel candidate index of 0 may be available. If AbsSum is greater than T1 and less than or equal to T2, four transform kernel candidates corresponding to transform kernel candidate indices of 0 to 3 may be available. If AbsSum is greater than T2, six transform kernel candidates corresponding to transform kernel candidate indices of 0 to 5 may be available.
- T1 may be 6 and T2 may be 32, but this is only an example.
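The threshold rule above can be sketched as a small helper. The function names are hypothetical, the defaults use the example values T1 = 6 and T2 = 32, and summing over all coefficients is one of the options mentioned (a partial sum is also allowed by the text).

```python
def abs_sum(coefficients):
    """Sum of absolute values of the given transform coefficients."""
    return sum(abs(c) for c in coefficients)


def num_available_candidates(abs_sum_value, t1=6, t2=32):
    """Number of transform kernel candidates available to the current block."""
    if abs_sum_value <= t1:
        return 1  # only candidate index 0
    if abs_sum_value <= t2:
        return 4  # candidate indices 0 to 3
    return 6      # candidate indices 0 to 5
```

For example, a block whose coefficients are `[3, -2, 1, 0]` has an AbsSum of 6 and would therefore have a single available candidate under the example thresholds.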
- the transformation kernel candidate corresponding to the transformation kernel candidate index of 0 can be set as the transformation kernel of the current block without signaling the transformation kernel candidate index.
- AbsSum is greater than T1 and less than or equal to T2
- any one of the four transformation kernel candidates can be selected based on the transformation kernel candidate index with two bins. That is, the transformation kernel candidate indices of 0 to 3 can be signaled as 00, 01, 10, and 11, respectively.
- MSB Most Significant Bit
- LSB Least Significant Bit
- a CABAC context other than the CABAC context assigned for the MTS flag may be assigned to each bin.
- bypass coding may be applied without assigning a CABAC context to the two bins.
- AbsSum is greater than T2
- the transform kernel candidate index has a value from 0 to 5, so the transform kernel candidate index cannot be expressed with only two bins.
- the transform kernel candidate index may be expressed by assigning two or more bins, such as truncated binary coding.
- a CABAC context may be assigned, or bypass coding may be applied without assigning a CABAC context.
- a CABAC context may be assigned to some of the multiple bins (e.g., the first bin, or the first and second bins), and bypass coding may be applied to the remaining bins.
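As one possible binarization for the three cases above, truncated binary coding can be sketched as follows. The helper name is hypothetical, and using the same scheme for all three candidate counts is an assumption made for illustration; the text only names truncated binary coding as an example for the six-candidate case. The sketch covers binarization only, not the choice between CABAC context coding and bypass coding.

```python
def truncated_binary(value, n):
    """Truncated binary bin string for value in 0..n-1."""
    if not 0 <= value < n:
        raise ValueError("value out of range")
    k = n.bit_length() - 1        # floor(log2(n))
    u = (1 << (k + 1)) - n        # number of symbols that get the short k-bit code
    if value < u:
        return format(value, f"0{k}b") if k > 0 else ""
    return format(value + u, f"0{k + 1}b")
```

With one candidate the bin string is empty (nothing is signaled), with four candidates the strings are 00, 01, 10, 11, and with six candidates indices 0 and 1 use two bins while indices 2 to 5 use three bins.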
- the transformation kernel of the current block can be determined based on a transformation set including one or more transformation kernel candidates.
- the transformation kernel of the current block can be derived from any one or more transformation kernel candidates belonging to the transformation set.
- the process of determining a transformation kernel of the current block may include at least one of 1) determining a transformation set of the current block or 2) selecting one transformation kernel candidate from the transformation set of the current block.
- the process of determining the transformation set may be a process of selecting one of a plurality of transformation sets that are identically predefined for the encoding device and the decoding device.
- the process of determining the transformation set may be a process of configuring one or more transformation sets available to the current block from among a plurality of transformation sets that are identically predefined for the encoding device and the decoding device, and selecting one of the configured transformation sets.
- the process of determining the transformation set may be a process of configuring one transformation set based on a transformation kernel candidate available to the current block from among a plurality of transformation kernel candidates that are identically predefined for the encoding device and the decoding device.
- if the transformation set of the current block includes multiple transformation kernel candidates, a process of selecting one of the multiple transformation kernel candidates for the current block may be performed. However, if the transformation set of the current block includes only one transformation kernel candidate (i.e., the current block has only one transformation kernel candidate available), the transformation kernel of the current block may be set to that transformation kernel candidate.
- the transformation set according to the present disclosure may mean the (non-separable) transformation set in the aforementioned embodiment 1, or may mean the MTS set in the embodiment 2.
- the transformation set may be defined separately from the (non-separable) transformation set in the embodiment 1 or the MTS set in the embodiment 2.
- the transformation set may include one or more specific transformation kernels as transformation kernel candidates.
- One specific transformation kernel may be defined as a pair of a transformation kernel for horizontal transformation and a transformation kernel for vertical transformation, or may be defined as one transformation kernel that is equally applied to horizontal and vertical transformations.
- NSPT which is a non-separable transform applied as a primary transform
- NSPT can be applied to all or part of a transform block.
- residual samples existing in the region where NSPT is applied can be input as a one-dimensional vector of NSPT. That is, residual samples existing in all or part of a transform block (referred to as Region Of Interest, ROI, in the present disclosure) can be collected into a one-dimensional vector and configured as an input.
- the primary transform coefficients can be obtained.
- the backward NSPT to the primary transform coefficients
- the one-dimensional vector output can be obtained.
- the dimension of the matrix of the non-separable transform kernel for NSPT may be determined according to the size of the ROI.
- the transform kernel may be referred to as a transform type and a transform matrix
- the non-separable transform kernel for NSPT may be referred to as an NSPT kernel.
- the dimension of the corresponding transform matrix may be MN x MN.
- the dimension of the NSPT kernel may be 64 x 64.
- when applying NSPT to a residual generated by intra prediction, the NSPT kernel can be adaptively determined depending on the intra prediction mode. Since the statistical characteristics of the residual block may vary depending on the intra prediction mode, compression efficiency can be improved by adaptively determining the NSPT kernel depending on the intra prediction mode.
- the non-separable transform set can be determined based on the intra prediction mode of the current block and a mapping table.
- the mapping table can define a mapping relationship between pre-defined intra prediction modes and non-separable transform sets.
- the pre-defined intra prediction modes can include two non-directional modes and 65 directional modes.
- intra prediction modes can be grouped into intra prediction mode groups.
- One NSPT kernel or multiple NSPT kernels can be assigned to an intra prediction mode group.
- a non-separable transform set (NSPT set) including one or more NSPT kernels can be assigned to an intra prediction mode group.
- the non-separable transform set is mapped to an intra prediction mode, and one of the N NSPT kernels included in the non-separable transform set can be selected.
- an intra prediction group may include adjacent prediction modes (e.g., modes 17, 18, and 19).
- the intra prediction group may include modes that have symmetry.
- directional modes may be symmetrical around the diagonal mode of FIG. 5 described above (i.e., intra prediction mode 34).
- two modes that have symmetry may form a group (or pair).
- modes 18 and 50 may be included in the same group because they are symmetrical around mode 34.
- a process of transposing the 2D input block and then constructing a one-dimensional input vector may be added before applying the forward NSPT kernel.
- the one-dimensional input vector may be derived from the corresponding input block in the row-first order without transposing the 2D input block. If the intra prediction mode is greater than 34, the one-dimensional input vector can be constructed by first transposing the 2D input block and then reading the input block in the row-first order, or by leaving the 2D input block as is and reading the input block in the column-first order.
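The reading order described above can be sketched as follows. The function name is hypothetical, and the block is represented as a list of rows purely for illustration. Reading the untouched block column-first gives the same vector as transposing it and reading row-first, which is why the two options in the text are interchangeable.

```python
def nspt_input_vector(block, intra_mode):
    """Flatten a 2D residual block into the 1D NSPT input vector."""
    rows, cols = len(block), len(block[0])
    if intra_mode > 34:
        # Column-first read, equivalent to transposing and then reading row-first.
        return [block[r][c] for c in range(cols) for r in range(rows)]
    # Row-first read without transposing.
    return [value for row in block for value in row]


block = [[1, 2, 3],
         [4, 5, 6]]
row_first = nspt_input_vector(block, 18)   # [1, 2, 3, 4, 5, 6]
col_first = nspt_input_vector(block, 50)   # [1, 4, 2, 5, 3, 6]
```

Modes 18 and 50 are symmetrical around mode 34, so under this scheme both can share one kernel while differing only in the order in which the input samples are read.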
- Table 8 below illustrates an example of a mapping table for assigning NSPT sets according to intra prediction modes.
- a total of 35 NSPT sets, from 0 to 34, can be defined.
- the extended WAIP modes (i.e., modes -14 to -1 and modes 67 to 80 in FIG. 5)
- the extended WAIP mode can be assigned the NSPT set assigned to the nearest normal directional mode.
- the extended WAIP mode can be assigned NSPT set number 2.
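A mapping consistent with the statements above (35 NSPT sets, symmetry around mode 34, and extended WAIP modes sharing the set of the nearest regular directional mode) could look like the sketch below. Table 8 itself is not reproduced here, so the identity mapping for modes 0 to 34 is an assumption of this example, as is the function name.

```python
def nspt_set_index(pred_mode_intra):
    """Hypothetical NSPT set index (0..34) for an intra prediction mode."""
    mode = pred_mode_intra
    if mode < 0:        # extended WAIP modes below mode 2
        mode = 2
    elif mode > 66:     # extended WAIP modes above mode 66
        mode = 66
    if mode > 34:       # mirror around the diagonal mode 34
        mode = 68 - mode
    return mode         # assumption: set index equals the folded mode value
```

Under this assumed mapping, both groups of extended WAIP modes end up in set 2, because mode 66 mirrors onto mode 2.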
- An NSPT set may include one or more NSPT kernels (or kernel candidates). That is, an NSPT set may include N NSPT kernel candidates. For example, N may be set to a value greater than or equal to 1, such as 1, 2, 3, or 4. Among one or more NSPT kernels included in the NSPT set, a kernel applied to a current block may be signaled using an index. In the present disclosure, the index may be referred to as an NSPT index. For example, the NSPT index may have values of 0, 1, 2, ..., N - 1.
- the NSPT index value may be fixed to 0. In this case, the NSPT index may be inferred without being separately signaled.
- a flag indicating whether NSPT is applied may be signaled separately from the NSPT index. In the present disclosure, the flag may be referred to as an NSPT flag.
- if the NSPT flag value is 1, NSPT may be applied. If the NSPT flag value is 0, NSPT may not be applied. If the NSPT flag is not signaled, the NSPT flag value may be inferred to be 0. For example, if the NSPT flag value is 1, the NSPT index may be signaled. Based on the signaled NSPT index, one of the N kernel candidates included in the NSPT set selected by the intra prediction mode may be specified.
- the entropy coding method of the NSPT index can be defined in various ways considering the number (N) of NSPT kernels included in the NSPT set. For example, as a method of mapping values from 0 to N-1 to bin strings (i.e., a binarization method), truncated unary binarization, truncated binary binarization, and fixed-length binarization methods can be used.
- the candidates can be specified by two bins.
- the first, second, and third candidates can be binarized and signaled as 0, 10, and 11, respectively.
- the binarized bins can be coded using context coding or bypass coding.
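The truncated unary and fixed-length binarizations named above can be sketched as follows. The helper names are hypothetical; the truncated unary variant shown uses ones as the prefix and a terminating zero, which matches the 0, 10, 11 example for three candidates.

```python
def truncated_unary(value, c_max):
    """Truncated unary bin string for value in 0..c_max."""
    if not 0 <= value <= c_max:
        raise ValueError("value out of range")
    # 'value' ones, followed by a terminating zero unless value == c_max.
    return "1" * value + ("0" if value < c_max else "")


def fixed_length(value, num_bins):
    """Fixed-length bin string using num_bins bins."""
    if not 0 <= value < (1 << num_bins):
        raise ValueError("value out of range")
    return format(value, f"0{num_bins}b")


three_candidates = [truncated_unary(i, 2) for i in range(3)]
four_candidates = [fixed_length(i, 2) for i in range(4)]
```

Here `three_candidates` is `['0', '10', '11']` and `four_candidates` is `['00', '01', '10', '11']`; in both cases no candidate needs more than two bins.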
- a reduced primary transform (RPT) method using a transform kernel of a reduced dimension by a primary transform is described.
- RPT reduced primary transform
- samples belonging to a 2D residual block can be arranged (or rearranged) into a 1D vector in row-major (or column-major) order.
- a transformation matrix for NSPT can be multiplied by the arranged vector.
- the corresponding 2D residual block is an M x N block (M is the width, N is the height)
- the length of the rearranged 1D vector can be M*N. That is, the corresponding 2D residual block can also be represented as a column vector having dimensions M*N x 1.
- M*N can be conveniently denoted as MN.
- the dimension of the corresponding transformation matrix can be MN x MN.
- forward NSPT can operate by left-multiplying an MN x 1 vector by the corresponding MN x MN transformation matrix to obtain an MN x 1 transformation coefficient vector.
- r transform coefficients can be obtained by multiplying an r x MN matrix.
- r represents the number of rows of the transformation matrix
- MN represents the number of columns of the transformation matrix.
- the value of r can be set to be less than or equal to MN. That is, the existing forward NSPT transformation matrix includes MN rows, each row being a 1 x MN row vector, which is a transform basis vector of the corresponding NSPT transformation matrix.
- the corresponding transform coefficients can be obtained by multiplying each transform basis vector by an MN x 1 sample column vector.
- the conventional forward NSPT transformation matrix is composed of MN row vectors
- MN transformation coefficients (i.e., an MN x 1 transformation coefficient column vector)
- the transformation matrix can be composed of r transformation basis vectors instead of MN transformation basis vectors. Accordingly, when the forward RPT is applied, r transformation coefficients (i.e., r x 1 transformation coefficient column vectors) can be obtained instead of MN.
- the RPT kernel can be constructed by selecting r transform basis vectors, which are part of the transform basis vectors constituting the MN x MN forward NSPT kernel.
- the transform kernel can be referred to as a transform type and a transform matrix
- the non-separable transform kernel for RPT can be referred to as an RPT kernel. That is, when selecting r 1 x MN row vectors from the MN x MN forward NSPT kernel, it can be advantageous to select the most important transform basis vectors from the perspective of coding performance. Specifically, in terms of energy compaction through the transform, more energy can be concentrated in the transform coefficients that appear first when multiplying by the forward NSPT transform matrix.
- the transform basis vectors located on the upper side of the forward NSPT transform matrix can generate transform coefficients having larger energy.
- the r x MN forward RPT kernel can be constructed (or derived) by taking r from the upper side of the forward NSPT kernel.
- the RPT according to the present disclosure takes only a portion (i.e., r) of the transform coefficients obtained by applying the existing NSPT, which may result in a loss of some of the energy contained in the original signal. In other words, distortion may occur between the original signal and the reconstructed signal through this process. Nevertheless, by applying the RPT, only r transform coefficients are generated instead of MN, so the number of bits required to code the transform coefficients can be reduced. Therefore, in the case of a signal in which a large amount of energy is concentrated in a small number of transform coefficients (e.g., an image residual signal), the gain obtained by reducing the signaling bits can be significantly large, thereby improving the coding performance.
- r a portion of the transform coefficients obtained by applying the existing NSPT
- the backward NSPT may use, as its transformation matrix, the transpose of the forward NSPT kernel described above.
- the input data may be a transformation coefficient signal instead of a sample signal such as a residual signal.
- the forward NSPT transformation matrix is G and the sample signal rearranged into a 1D vector is x
- the transformation coefficient vector y obtained by left-multiplying x by the transformation matrix can be expressed as in Equation 4: $y = Gx$.
- x and y can be MN x 1 column vectors.
- G can have the form of an MN x MN matrix.
- the backward NSPT process can be expressed as Equation 5 using the same variables: $x = G^{T}y$.
- $G^{T}$ denotes the transpose matrix of G.
- the forward RPT operation and the backward RPT operation according to the present disclosure can also be expressed by the two mathematical expressions above.
- y is an r x 1 column vector instead of an MN x 1 column vector
- G is an r x MN matrix instead of an MN x MN matrix. That is, even if RPT is applied instead of NSPT, the dimension of the sample signal (e.g., image residual signal) does not change, which may mean that the original number of sample signals (i.e., MN sample signals) can be restored with only r transform coefficients through the backward RPT. In other words, the original MN sample signals can be restored by coding only r transform coefficients, fewer than MN, which may lead to an improvement in coding performance.
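The forward and backward operations for NSPT and RPT can be sketched with plain matrix-vector products. In this illustration an orthonormal DCT-II matrix stands in for the NSPT kernel, which is an assumption made only so that the example is self-contained; the kernels in the disclosure are not specified here. The RPT kernel is taken as the top r rows of the stand-in kernel, and all function names are hypothetical.

```python
import math


def orthonormal_dct2(n):
    """Stand-in n x n orthonormal kernel (DCT-II), used only for illustration."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def forward(kernel, x):
    """y = G x, where G has r rows and MN columns (r = MN for full NSPT)."""
    return [sum(g * v for g, v in zip(row, x)) for row in kernel]


def backward(kernel, y):
    """x' = G^T y, always producing MN samples."""
    n = len(kernel[0])
    return [sum(kernel[k][i] * y[k] for k in range(len(kernel))) for i in range(n)]


block = [[10, 11], [12, 13]]                  # M x N = 2 x 2 residual block
x = [v for row in block for v in row]         # row-major MN x 1 vector
G = orthonormal_dct2(len(x))                  # MN x MN stand-in kernel
G_rpt = G[:2]                                 # r x MN kernel: top r = 2 rows

y_full, y_rpt = forward(G, x), forward(G_rpt, x)
x_full, x_rpt = backward(G, y_full), backward(G_rpt, y_rpt)
```

The full kernel reconstructs `x` up to floating-point error, while the reduced kernel codes only two coefficients and still returns four samples, with a small reconstruction error that reflects the discarded coefficients.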
- an RPT structure that defines an r value by considering the statistical characteristics of a residual block, and derives a residual block of the size of an existing transform block from a residual block of a reduced size determined by the defined r value. If an additional transformation (i.e., a secondary transform) is applied to predict the statistical distribution of primary transform coefficients, quantized non-zero coefficients may be concentrated in a relatively low frequency region because a quantization process is applied to the primary transform coefficients. Accordingly, a reduced secondary transform for the statistical distribution of primary transform coefficients can define the statistical characteristics of the primary transform coefficients relatively simply by setting an r value for a given low frequency region.
- a secondary transform for the statistical distribution of primary transform coefficients can define the statistical characteristics of the primary transform coefficients relatively simply by setting an r value for a given low frequency region.
- the RPT according to the present disclosure is fundamentally different from the reduced secondary transform in that it is a technique for defining an r value by considering the statistical characteristics of samples within a residual block that have very different characteristics from the distribution of the primary transform coefficients.
- the RPT kernel which is a reduced-dimensional transformation matrix. In other words, a method for determining or defining the r value in the RPT is described below.
- memory usage can be considered as a measure of worst-case complexity.
- the allowed memory size per kernel can be set.
- each kernel coefficient (each element constituting the transform kernel is referred to as a kernel coefficient in this disclosure)
- the value of r can be set to q / (MN * p) or less.
- p is 1 byte for a forward RPT kernel for a 16x16 block
- the value of r can be set to 32 or less.
- memory usage and/or the number of multiplications per sample can be considered as measures of worst-case complexity. For example, if the maximum possible number of multiplications per sample is set to 16 for a 16x16 block and memory usage is set to 8KB or less per kernel (kernel coefficients are expressed as 1 byte), the value of r can be set to 16 or less.
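The bound on r described above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming q is the memory budget per kernel in bytes, p is the number of bytes per kernel coefficient, and the forward RPT costs r multiplications per sample (r x MN multiplications for MN samples). The function name `max_rpt_rows` is hypothetical and is not taken from the disclosure.

```python
def max_rpt_rows(width, height, mem_budget_bytes, bytes_per_coeff,
                 max_mults_per_sample=None):
    """Upper bound on r for an r x (width*height) forward RPT kernel."""
    num_samples = width * height  # MN
    # Memory constraint: r * MN * p <= q, hence r <= q / (MN * p).
    r_bound = mem_budget_bytes // (num_samples * bytes_per_coeff)
    if max_mults_per_sample is not None:
        # An r x MN kernel needs r * MN multiplications, i.e. r per sample.
        r_bound = min(r_bound, max_mults_per_sample)
    return r_bound


# 16x16 block, 8 KB per kernel, 1-byte coefficients.
print(max_rpt_rows(16, 16, 8 * 1024, 1))      # 32
# Same budget, plus at most 16 multiplications per sample.
print(max_rpt_rows(16, 16, 8 * 1024, 1, 16))  # 16
```

Both results match the values of 32 and 16 given in the text for a 16x16 block.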
- the r value constituting the RPT kernel may be determined by specific information.
- the r value constituting the RPT kernel may be determined based on a predefined encoding parameter.
- the r value may be determined based on the size of a block.
- the RPT kernel may be variably determined based on the size of the block.
- the block may be at least one of a coding block, a transform block, and a prediction block.
- the r value may be determined based on prediction information.
- the prediction information may include information regarding inter/intra prediction, intra prediction mode information, etc.
- the r value may be determined based on signaled information (values of syntax elements).
- the r value may be variably determined based on a quantization parameter value.
- a predefined fixed value may be used as the r value, and the predefined fixed value may be determined based on signaled information.
- r transform coefficients can be obtained.
- the obtained r transform coefficients can be arranged according to a predefined scan order of the transform coefficients (e.g., forward/backward zig-zag scan order, forward/backward horizontal scan order, forward/backward vertical scan order, forward/backward diagonal scan order, a scan order specified based on an intra prediction mode, etc.).
- when the transform coefficients obtained by applying the forward RPT are arranged according to such a scan order (e.g., a scan order per coefficient group (CG) unit can also be applied), if the value of r is smaller than MN, the M x N block cannot be completely filled with the r transform coefficients, and thus a blank space may be generated.
- CG scan order per coefficient group
- the above-described blank space can be predicted in the following manner in consideration of the characteristics of the residual signal.
- the values of empty spaces can be filled using the values of available surrounding pixels.
- the values in empty spaces can be filled based on the values of available surrounding pixels and the intra prediction mode.
- the values in empty spaces can be predicted by performing intra prediction based on the values of available surrounding pixels and the intra prediction mode.
- the values of empty spaces can be filled from available surrounding pixels using a predefined intra prediction mode (e.g., planar mode).
- a predefined intra prediction mode (e.g., planar mode)
- filling the empty space with 0 among the examples described above may be referred to as a zero-out process.
- the following embodiment may be applied. If a non-zero transform coefficient is detected (or parsed) in the empty space portion during parsing of the transform coefficients on the decoding device side, it may be considered (or inferred) that the RPT is not applied. In other words, if a non-zero transform coefficient exists in the predefined area representing the empty space, it may be considered that the RPT is not applied. In this case, signaling (or parsing) for a flag indicating whether the RPT is applied and/or an index designating one of a plurality of RPT kernel candidates may not be performed. As an example, if a non-zero transform coefficient exists in the predefined area representing the empty space, a predefined variable value may be updated, and it may be inferred that the RPT is not applied based on the updated variable value.
- whether or not to apply RPT may be determined based on the size and/or shape of a block.
- the RPT kernel may be variably determined based on the size and/or shape of the block. Since the r value may vary based on the size and/or shape of the block (i.e., for each M x N block), the empty space may vary based on the size and/or shape of the block. Accordingly, the area for checking whether a non-zero transform coefficient is detected may be defined differently based on the size and/or shape of the block. In other words, the zero-out area may be variably determined.
- when a 16x64 matrix is applied as a forward RPT matrix for an 8x8 block, the r value may be 16.
- when the CG is a 4x4 sub-block, only the upper left 4x4 block may be filled with non-zero RPT transform coefficients, and the remaining three empty 4x4 sub-blocks (i.e., the upper right, lower left, and lower right sub-blocks) may be filled with 0 values.
- if a non-zero transform coefficient is detected in the remaining three 4x4 sub-block areas during the decoding process, it may be considered that the RPT is not applied.
- a flag indicating whether to apply the RPT or an index specifying one of multiple RPT kernel candidates may not be signaled.
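The zero-out area and the decoder-side inference described above can be sketched as follows. This is only an illustration under stated assumptions: the block is stored as `block[y][x]`, coefficient groups are 4x4, and the set of CGs that may hold RPT coefficients is passed in explicitly as `filled_cgs` (CG coordinates in units of CGs). The helper names `zero_out` and `rpt_ruled_out` are hypothetical.

```python
def zero_out(block, filled_cgs, cg_size=4):
    """Return a copy of block with every coefficient outside filled_cgs set to 0."""
    return [[v if (x // cg_size, y // cg_size) in filled_cgs else 0
             for x, v in enumerate(row)]
            for y, row in enumerate(block)]


def rpt_ruled_out(block, filled_cgs, cg_size=4):
    """True if a non-zero coefficient lies in the zero-out (empty space) area."""
    return any(v != 0 and (x // cg_size, y // cg_size) not in filled_cgs
               for y, row in enumerate(block)
               for x, v in enumerate(row))


# 8x8 block with r = 16: only the upper-left 4x4 CG may carry coefficients.
filled = {(0, 0)}
dense = [[1] * 8 for _ in range(8)]
sparse = zero_out(dense, filled)

print(rpt_ruled_out(dense, filled))   # True: non-zero values outside the CG
print(rpt_ruled_out(sparse, filled))  # False: RPT remains possible
```

When `rpt_ruled_out` returns True, a decoder following this embodiment could skip parsing the RPT flag and kernel index. For a 16x8 block, `filled_cgs` would instead contain the upper-left CG and the CG below it, i.e. `{(0, 0), (0, 1)}`.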
- the RPT transform coefficients can be filled in the upper-left 4x4 sub-block and the 4x4 sub-block adjacent to the lower side of the upper-left sub-block.
- the area to be filled with zero as an empty space can be determined as the remaining area excluding the two 4x4 sub-blocks.
- the RPT kernel can be determined variably depending on the size and/or shape of the block, and as seen, the empty space can be determined differently for an 8x8 block and a 16x8 block.
- the flag and/or index associated with the RPT may not be signaled if it is detected that non-zero transform coefficients exist in CGs belonging to empty spaces. That is, the transform coefficients within each CG may be scanned in a specified order, and the transform coefficients within the CG may be scanned in the same manner for the next CG according to the scan order for the CG unit.
- whether to apply the RPT can be determined based on that information alone, which can reduce signaling overhead and related implementation complexity.
- RPT may not be applied. In this case, signaling for information related to RPT may be omitted.
- the flag indicating whether RPT is applied can be parsed after parsing (or signaling) the related transform coefficients to make a final determination as to whether RPT is applied.
- a forward secondary transform may be additionally applied to the transform coefficients generated through the application of RPT.
- a forward secondary transform may be additionally applied to an area where the generated transform coefficients are located within an M x N block.
- the area or a part of the area may be referred to as an ROI from the perspective of the forward secondary transform.
- the backward secondary transform may be applied first and then the backward RPT may be applied.
- an area or a part of the area where r transform coefficients generated by the application of the forward RPT are located may be set as an ROI and the forward secondary transform may be applied.
- the 16 generated transform coefficients may be located in the upper left 4x4 sub-block, and the sub-block area may be set as an ROI and the forward secondary transform may be applied to the ROI.
- the coefficient values of the RPT kernel can be adjusted considering operations such as integer arithmetic or fixed-point arithmetic. That is, the RPT kernel can be configured to perform a transformation through integer arithmetic (or fixed-point arithmetic) in a practical codec system by appropriately scaling the kernel coefficients belonging to the kernel, rather than a theoretical orthogonal transformation or non-orthogonal transformation (wherein, orthogonal transformation and non-orthogonal transformation refer to transformations in which the norm of each transformation basis vector is 1).
- orthogonal transformation and non-orthogonal transformation refer to transformations in which the norm of each transformation basis vector is 1.
- the scaling factor multiplied can be equally reflected when applying the RPT.
- a separable transformation or a non-separable transformation can be performed while keeping the processes other than the transformation (e.g., quantization and dequantization processes) unchanged.
- Integer coefficients of the RPT kernel can be obtained by multiplying the transformation basis vector by the scaling value described above.
- multiplying by the scaling value may include applying operations such as rounding, flooring, and ceiling to each kernel coefficient. That is, the integerized RPT kernel obtained through the above-described method is defined and can be used in the transformation/inverse transformation process.
- the maximum and minimum values can be obtained for all kernel coefficients, and thus the number of bits that can express all kernel coefficients can be obtained from the maximum and minimum values. For example, if the maximum value is less than or equal to 127 and the minimum value is greater than or equal to -128, all integer kernel coefficients can be expressed with 8 bits (especially, through 2's complement representation, etc.).
- all integer kernel coefficients can be represented with N bits if their maximum value is less than or equal to (2^(N-1) - 1) and their minimum value is greater than or equal to -2^(N-1). If the maximum value is greater than (2^(N-1) - 1) or the minimum value is less than -2^(N-1), then it may not be possible to represent all integer kernel coefficients with N bits. In this case, 1) all kernel coefficients can be additionally multiplied by a scaling factor to fit within the N-bit range, or 2) the number of bits required to represent the kernel coefficients can be increased (i.e., to N+1 bits or more).
- All kernel coefficients can be expressed as 8-bit, 9-bit, 10-bit, etc. using the method described above.
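The integerization and bit-width check described above can be sketched as below. This is an illustrative sketch, not the disclosed procedure: the rounding rule (round half up), the scale value of 128, and the helper names `integerize_kernel` and `bits_needed` are all assumptions chosen for the example.

```python
import math


def integerize_kernel(kernel, scale):
    """Scale each kernel coefficient and round it to the nearest integer."""
    # Round half up; flooring or ceiling are equally possible choices.
    return [[math.floor(c * scale + 0.5) for c in row] for row in kernel]


def bits_needed(int_kernel):
    """Smallest N such that -2^(N-1) <= min and max <= 2^(N-1) - 1."""
    lo = min(min(row) for row in int_kernel)
    hi = max(max(row) for row in int_kernel)
    n = 1
    while not (-(1 << (n - 1)) <= lo and hi <= (1 << (n - 1)) - 1):
        n += 1
    return n


# Toy 2x2 kernel whose basis vectors have unit norm before scaling.
unit_kernel = [[0.5 ** 0.5, 0.5 ** 0.5], [0.5 ** 0.5, -(0.5 ** 0.5)]]
int_kernel = integerize_kernel(unit_kernel, 128)

print(int_kernel)                   # [[91, 91], [91, -91]]
print(bits_needed(int_kernel))      # 8
print(bits_needed([[127, -128]]))   # 8
print(bits_needed([[128, 0]]))      # 9
```

If `bits_needed` exceeds the target width, either the scale can be reduced or a wider representation can be used, which corresponds to the two options listed in the text.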
- the scaling value of the kernel coefficients can be set differently for each block size or kernel, and the number of bits for expressing the kernel coefficients can be set differently.
- the aforementioned NSPT can be applied based on at least one of the size, tree type, or component type of the current block. For example, whether to apply NSPT can be determined based on at least one of the size, tree type, or component type of the current block.
- An NSPT index can be signaled based on at least one of the size, tree type, or component type of the current block.
- An NSPT set or NSPT kernel can be derived based on at least one of the size, tree type, or component type of the current block.
- Allowed transform block sizes pre-defined in a decoding device can be broadly divided into two groups.
- One of the two groups (hereinafter referred to as the first group) may refer to a set of block sizes to which NSPT can be applied.
- the first group may be composed of any one of the allowed transform block sizes, or may be composed of two or more block sizes among the allowed block sizes.
- a block size to which NSPT can be applied may be defined as a block size in which at least one of the width and the height is less than or equal to a predetermined threshold.
- a block size to which NSPT can be applied may be defined as a block size in which the product of the width and the height is less than or equal to a predetermined threshold.
- a block size to which NSPT can be applied may be defined as a block size in which the maximum value of the width and the height is less than or equal to a predetermined threshold.
- the threshold may be an integer of 4, 8, 16, 32, 64, 128, or higher.
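The three alternative definitions of an NSPT-eligible block size given above can be written as simple predicates. The sketch below is hedged: the function name `nspt_allowed` and the rule labels are hypothetical, and the threshold values in the usage lines are examples rather than values fixed by the disclosure.

```python
def nspt_allowed(width, height, threshold, rule):
    """Check whether a block size falls into the first group under one rule."""
    if rule == "min_side":
        # At least one of width and height is <= threshold.
        return min(width, height) <= threshold
    if rule == "area":
        # The product of width and height is <= threshold.
        return width * height <= threshold
    if rule == "max_side":
        # The larger of width and height is <= threshold.
        return max(width, height) <= threshold
    raise ValueError("unknown rule: " + rule)


print(nspt_allowed(4, 32, 4, "min_side"))   # True
print(nspt_allowed(16, 8, 64, "area"))      # False (128 > 64)
print(nspt_allowed(16, 8, 16, "max_side"))  # True
```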
- the other of the two groups above (hereinafter referred to as the second group) may refer to a set of block sizes to which NSPT is not applied.
- the aforementioned separable primary transformation may be applied to the block sizes belonging to the second group.
- a non-separable secondary transformation may be applied to all or some of the block sizes belonging to the second group.
- the reverse NSPT may be applied to the (dequantized) transform coefficients of the current block.
- the reverse separable primary transform may be applied to the (dequantized) transform coefficients of the current block.
- the reverse non-separable secondary transform (e.g., low frequency non-separable transform, LFNST)
- the reverse separable primary transform (e.g., DCT-2)
- the first group, which is a set of block sizes to which NSPT can be applied, can be defined as a set of 4x4, 4x8, 8x4, 8x8.
- the first group can be defined as a set of 4x8, 8x4, 8x8.
- the first group can be defined as a set of 4x8, 8x4.
- the first group can be defined as a set of 4x4, 4x8, 4x16, 8x4, 8x8, 16x4.
- the first group can be defined as a set of 4x8, 4x16, 8x4, 8x8, 16x4.
- the first group can be defined as a set of 4x8, 4x16, 8x4, 16x4.
- the first group can be defined as a set of 4x4, 4x8, 8x4, 8x8, 8x16, 16x8, 16x16.
- the first group can be defined as a set of 4x4, 4x8, 8x4, 8x8, 8x16, 16x8.
- the first group can be defined as a set of 4x8, 8x4, 8x8, 8x16, 16x8.
- the first group can be defined as a set of 4x4, 4x8, 8x4, 8x8, 8x16, 16x8.
- the first group can be defined as a set of 4x4, 4x8, 8x4, 8x8, 8x16, 16x32, 32x16, 32x32.
- the first group can be defined as the set of 4x4, 4x8, 8x4, 8x8, 8x16, 16x8, 16x16, 16x32, 32x16.
- the first group can be defined as the set of 4x8, 8x4, 8x8, 8x16, 16x8, 16x16, 16x32, 32x16.
- the first group can be defined as the set of 4x8, 8x4, 8x16, 16x8, 16x16, 16x32, 32x16.
- the first group can be defined as the set of 4x8, 8x4, 8x16, 16x8, 16x32, 32x16.
- the first group can be defined as a set of 4x4, 4x8, 4x16, 8x4, 16x4.
- the first group can be defined as a set of 4x4, 4x8, 4x16, 8x4, 8x8, 8x16, 16x4, 16x8.
- the first group can be defined as a set of 4x8, 4x16, 8x4, 8x8, 8x16, 16x4, 16x8.
- the first group can be defined as a set of 4x4, 4x8, 4x16, 8x4, 8x16, 16x4, 16x8.
- the first group can be defined as a set of 4x8, 4x16, 8x4, 8x16, 16x4, 16x8.
- the first group can be defined as a set of 4x4, 4x8, 4x16, 8x4, 8x8, 8x16, 16x4, 16x8, 16x16.
- the first group can be defined as a set of 4x8, 4x16, 8x4, 8x8, 8x16, 16x4, 16x8, 16x16.
- the first group can be defined as a set of 4x4, 4x8, 4x16, 8x4, 8x16, 16x4, 16x8, 16x16.
- the first group can be defined as the set of 4x8, 4x16, 8x4, 8x16, 16x4, 16x8, 16x16.
- the first group can be defined as the set of 4x4, 4x8, 4x16, 4x32, 8x4, 8x16, 8x32, 16x4, 16x8, 32x4, 32x8.
- the first group can be defined as the set of 4x8, 4x16, 4x32, 8x4, 8x16, 8x32, 16x4, 16x8, 32x4, 32x8.
- the first group can be defined as the set of 4x4, 4x8, 4x16, 4x32, 8x4, 8x8, 8x16, 8x32, 16x4, 16x8, 16x16, 32x4, 32x8.
- the first group can be defined as the set of 4x4, 4x8, 4x16, 4x32, 8x4, 8x8, 8x16, 8x32, 16x4, 16x8, 16x32, 32x4, 32x8, 32x16.
- the first group can be defined as the set of 4x4, 4x8, 4x16, 4x32, 8x4, 8x8, 8x16, 8x32, 16x4, 16x8, 16x16, 16x32, 32x4, 32x8, 32x16.
- the first group can be defined as the set of 4x8, 4x16, 4x32, 8x4, 8x16, 8x32, 16x4, 16x8, 16x32, 32x4, 32x8, 32x16.
- the first group can be defined as a set of 4x4, 4x8, 4x16, 4x32, 8x4, 8x8, 8x16, 8x32, 16x4, 16x8, 16x16, 16x32, 32x4, 32x8, 32x16, 32x32.
- an NSPT matrix (or NSPT kernel) having a predetermined dimension
- the NSPT matrix can be expressed as a matrix having a dimension of PxQ as an inverse transformation matrix
- the PxQ matrix represents a matrix in which the number of rows and the number of columns are P and Q, respectively.
- a 16x16 NSPT matrix may be applied for a 4x4 block.
- a 32x20 NSPT matrix may be applied for at least one of a 4x8 block or an 8x4 block.
- a 64x24 NSPT matrix may be applied for at least one of a 4x16 block or a 16x4 block.
- a 64x32 NSPT matrix may be applied for an 8x8 block.
- a 128x40 NSPT matrix may be applied.
- a 256x44 NSPT matrix may be applied for a 16x16 block.
- a 128x36, 128x38, or 128x40 NSPT matrix may be applied.
- a 256x48 NSPT matrix can be applied.
- a 512x52 or 512x54 NSPT matrix can be applied.
- an NSPT matrix of 128x36, 128x38, 128x40, or 128x(36-(4*n)) may be applied.
- an NSPT matrix of 128x(36-(4*n)) may be applied instead of the NSPT matrix of 128x36, 128x38, or 128x40.
- * denotes multiplication
- n may be an integer greater than or equal to 0.
- as the NSPT matrix for at least one of a 4x32 block or a 32x4 block, at least one of a 128x36 matrix, a 128x32 matrix, a 128x28 matrix, a 128x24 matrix, a 128x20 matrix, a 128x16 matrix, a 128x12 matrix, a 128x8 matrix, or a 128x4 matrix can be used.
- an NSPT matrix of 256x48 or 256x(48-(4*m)) may be applied.
- an NSPT matrix of 256x(48-(4*m)) may be applied instead of the NSPT matrix of 256x48.
- * denotes multiplication
- m may denote an integer greater than or equal to 0.
- as the NSPT matrix for at least one of an 8x32 block or a 32x8 block, at least one of a 256x48 matrix, a 256x44 matrix, a 256x40 matrix, a 256x36 matrix, a 256x32 matrix, a 256x28 matrix, a 256x24 matrix, a 256x20 matrix, a 256x16 matrix, a 256x12 matrix, a 256x8 matrix, or a 256x4 matrix can be used.
- the combination of the structure of the NSPT matrix for a 4x32 block and a 32x4 block and the structure of the NSPT matrix for an 8x32 block and a 32x8 block can be formed by a combination of the above-described matrices. That is, the combination of the structure of the NSPT matrix for a 4x32 block and a 32x4 block and the structure of the NSPT matrix for an 8x32 block and a 32x8 block can be defined by a combination of one or more 128x(36-(4*n)) NSPT matrices and one or more 256x(48-(4*m)) NSPT matrices.
- n can be one or more integers in the range of 0 to 8
- m can be one or more integers in the range of 0 to 11.
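The families 128x(36-(4*n)) and 256x(48-(4*m)) can be enumerated directly from the stated ranges of n and m. The short sketch below does this; the helper name `reduced_matrix_shapes` is hypothetical.

```python
def reduced_matrix_shapes(rows, max_cols, step=4):
    """List (rows, cols) for cols = max_cols - step*k while cols stays positive."""
    return [(rows, max_cols - step * k) for k in range(max_cols // step)]


# n = 0..8 for 4x32 / 32x4 blocks (P = 128).
shapes_128 = reduced_matrix_shapes(128, 36)
# m = 0..11 for 8x32 / 32x8 blocks (P = 256).
shapes_256 = reduced_matrix_shapes(256, 48)

print([q for _, q in shapes_128])  # [36, 32, 28, 24, 20, 16, 12, 8, 4]
print([q for _, q in shapes_256])  # [48, 44, 40, 36, 32, 28, 24, 20, 16, 12, 8, 4]
```

The two lists reproduce the nine 128xQ and twelve 256xQ candidates named above, so any pairing of one entry from each list is a combination in the sense described.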
- the NSPT matrix for at least one of a 4x32 block or a 32x4 block may be a 128x24 matrix
- the NSPT matrix for at least one of an 8x32 block or a 32x8 block may be a 256x36 matrix.
- the NSPT matrix for at least one of the 4x32 block or the 32x4 block may be a 128x24 matrix
- the NSPT matrix for at least one of the 8x32 block or the 32x8 block may be a 256x32 matrix.
- the NSPT matrix for at least one of the 4x32 block or the 32x4 block may be a 128x24 matrix
- the NSPT matrix for at least one of the 8x32 block or the 32x8 block may be a 256x28 matrix.
- the NSPT matrix for at least one of the 4x32 block or the 32x4 block may be a 128x24 matrix
- the NSPT matrix for at least one of the 8x32 block or the 32x8 block may be a 256x24 matrix.
- the NSPT matrix for at least one of the 4x32 block or the 32x4 block may be a 128x24 matrix
- the NSPT matrix for at least one of the 8x32 block or the 32x8 block may be a 256x20 matrix.
- the NSPT matrix for at least one of the 4x32 block or the 32x4 block may be a 128x24 matrix
- the NSPT matrix for at least one of the 8x32 block or the 32x8 block may be a 256x16 matrix.
- the NSPT matrix for at least one of the 4x32 block or the 32x4 block may be a 128x20 matrix
- the NSPT matrix for at least one of the 8x32 block or the 32x8 block may be a 256x36 matrix.
- the NSPT matrix for at least one of the 4x32 block or the 32x4 block may be a 128x20 matrix
- the NSPT matrix for at least one of the 8x32 block or the 32x8 block may be a 256x32 matrix.
- the NSPT matrix for at least one of the 4x32 block or the 32x4 block may be a 128x20 matrix
- the NSPT matrix for at least one of the 8x32 block or the 32x8 block may be a 256x28 matrix.
- the NSPT matrix for at least one of the 4x32 block or the 32x4 block may be a 128x20 matrix
- the NSPT matrix for at least one of the 8x32 block or the 32x8 block may be a 256x24 matrix.
- the NSPT matrix for at least one of the 4x32 block or the 32x4 block may be a 128x20 matrix
- the NSPT matrix for at least one of the 8x32 block or the 32x8 block may be a 256x20 matrix.
- the NSPT matrix for at least one of the 4x32 block or the 32x4 block may be a 128x20 matrix
- the NSPT matrix for at least one of the 8x32 block or the 32x8 block may be a 256x16 matrix.
- the NSPT matrix for at least one of the 4x32 block or the 32x4 block may be a 128x16 matrix
- the NSPT matrix for at least one of the 8x32 block or the 32x8 block may be a 256x36 matrix.
- the NSPT matrix for at least one of the 4x32 block or the 32x4 block may be a 128x16 matrix
- the NSPT matrix for at least one of the 8x32 block or the 32x8 block may be a 256x32 matrix.
- the NSPT matrix for at least one of the 4x32 block or the 32x4 block may be a 128x16 matrix
- the NSPT matrix for at least one of the 8x32 block or the 32x8 block may be a 256x28 matrix.
- the NSPT matrix for at least one of the 4x32 block or the 32x4 block may be a 128x16 matrix
- the NSPT matrix for at least one of the 8x32 block or the 32x8 block may be a 256x24 matrix.
- the NSPT matrix for at least one of the 4x32 block or the 32x4 block may be a 128x16 matrix
- the NSPT matrix for at least one of the 8x32 block or the 32x8 block may be a 256x20 matrix.
- the NSPT matrix for at least one of the 4x32 block or the 32x4 block may be a 128x16 matrix
- the NSPT matrix for at least one of the 8x32 block or the 32x8 block may be a 256x16 matrix.
- a Px1 output vector can be obtained by applying the PxQ matrix to a Qx1 input vector (i.e., (PxQ matrix) x (Qx1 input vector)).
- the Qx1 input vector may correspond to (inverse quantized) transform coefficients in the current block to which NSPT is applied.
- the value of Q may mean the number of transform coefficients to which NSPT is applied, and may be less than or equal to the product of the width and the height of the current block.
- the value of Q may be variably determined based on the size of the current block among the block sizes belonging to the first group described above. Alternatively, the value of Q may be set identically for the block sizes belonging to the first group.
- the Px1 output vector may correspond to a residual signal (or decoded residual samples). The value of P may be equal to the product of the width and the height of the current block.
- the forward NSPT matrix can be expressed as a QxP matrix, which is the transpose of the PxQ matrix.
- a Qx1 output vector can be obtained by applying the QxP matrix to a Px1 input vector (i.e., (QxP matrix) x (Px1 input vector)).
- the Px1 input vector can correspond to residual samples in the current block to which NSPT is applied.
- the value of P can be equal to the product of the width and the height of the current block.
- the Qx1 output vector can correspond to transform coefficients in the current block derived through NSPT.
- the value of Q can mean the number of transform coefficients output through NSPT, and can be less than or equal to the product of the width and the height of the current block.
- the value of Q can be variably determined based on the size of the current block among the block sizes belonging to the first group described above. Alternatively, the value of Q can be set identically for the block sizes belonging to the first group.
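The dimension bookkeeping for the forward (QxP) and inverse (PxQ) NSPT matrices can be illustrated with a toy example. The sketch below is an assumption-laden illustration: it uses a made-up 4x2 inverse kernel with orthonormal columns for a 2x2 block (P = 4, Q = 2), and the helper names `matvec` and `transpose` are hypothetical. Real kernels are trained and far larger.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]


# Toy inverse kernel: P x Q = 4 x 2, columns are orthonormal basis vectors.
inverse_kernel = [[0.5, 0.5], [0.5, -0.5], [0.5, 0.5], [0.5, -0.5]]
# Forward kernel: Q x P = 2 x 4, the transpose of the inverse kernel.
forward_kernel = transpose(inverse_kernel)

residual = [4.0, 2.0, 4.0, 2.0]                   # P x 1 input vector
coeffs = matvec(forward_kernel, residual)         # Q x 1 output vector
reconstructed = matvec(inverse_kernel, coeffs)    # P x 1 output vector

print(coeffs)         # [6.0, 2.0]
print(reconstructed)  # [4.0, 2.0, 4.0, 2.0]
```

The reconstruction is exact here only because the chosen residual lies in the span of the two basis vectors. In general, keeping Q < P coefficients gives an approximation.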
- NSPT can be applied to non-square blocks, such as MxN blocks and NxM blocks.
- NSPT can be applied to 4x8 blocks and 8x4 blocks.
- NSPT can be applied to 4x16 blocks and 16x4 blocks, or 8x16 blocks and 16x8 blocks, or 16x32 blocks and 32x16 blocks.
- LFNST can be composed of a small number of transform basis vectors.
- if a separable primary transform such as DCT-2 and a non-separable secondary transform such as LFNST are applied instead of NSPT to the corresponding block sizes, performance degradation may occur.
- LFNST performs a transpose operation on the input block only for the ROI region, utilizing symmetry.
- NSPT performs a transpose operation on the entire block, utilizing symmetry. Therefore, NSPT can train and apply the NSPT kernel using a more sophisticated symmetry method, which is expected to improve performance.
- a 32x64 transformation matrix can be applied instead of a 16x64 transformation matrix from the forward transformation perspective.
- the 16x64 transformation matrix can be constructed by sampling the upper 16 rows of the 32x64 transformation matrix.
- NSPT can be applied to the luma component of the current block, but NSPT cannot be applied to the chroma component of the current block. If the tree type of the current block is a dual tree, NSPT can be applied to the luma and chroma components of the current block.
- NSPT may be applied to the luma component of the current block and not applied to the chroma component of the current block, regardless of whether the tree type of the current block is a single tree.
- NSPT may be applied to the luma component and chroma component of the current block, regardless of whether the tree type of the current block is a single tree.
- the tree type of the current block is single tree
- NSPT is allowed for luma and chroma components
- if the size of the current block belongs to the first group, one NSPT index may be signaled, and the luma and chroma components of the current block may share the NSPT index.
- the NSPT index may be an index for selecting any one of the transformation kernel candidates for NSPT. If the sizes of the luma block and the chroma block of the current block belong to the first group, the transformation kernel candidate selected by the same NSPT index may be applied to the luma and chroma components.
- LFNST may not be applied to the chroma component of the current block, and a separate transformation may be applied.
- LFNST may be applied to the chroma component of the current block.
- the luma and chroma components may have highly correlated characteristics.
- applying NSPT only to the luma component or applying a transform kernel candidate selected by a single NSPT index to both luma and chroma components can reduce unnecessary signaling and improve compression efficiency.
- the luma and chroma components each have independent partitioning and encoding structures. In this case, signaling the NSPT index for each component can reflect the characteristics of each component and improve compression efficiency.
- the NSPT kernel for the above NSPT can be derived based on at least one of symmetry between intra prediction modes or symmetry between block shapes.
- the NSPT kernel can be derived as an NSPT kernel corresponding to at least one of a mode symmetrical to the intra prediction mode of the current block or a block shape symmetrical to the block shape of the current block.
- the NSPT kernel can be derived based on an NSPT set including one or more NSPT kernel candidates, wherein the NSPT set can be derived as an NSPT set corresponding to at least one of a mode symmetrical to the intra prediction mode of the current block or a block shape symmetrical to the block shape of the current block.
- Any one of the one or more NSPT kernel candidates belonging to the NSPT set can be set as the NSPT kernel of the current block.
- an NSPT index specifying any one of the one or more NSPT kernel candidates belonging to the NSPT set can be used.
- the NSPT index can be signaled through the bitstream or can be derived based on the aforementioned symmetry.
- There may be symmetry between at least two intra prediction modes among the intra prediction modes pre-defined in the decoding device. For convenience of explanation, the symmetry will be described centered on the upper left diagonal mode (i.e., mode 34). Referring to Fig. 5, there is symmetry between the directional modes. All modes have a prediction direction except for the Planar mode (number 0) and the DC mode (number 1). Modes 2 to 66 are named normal directional modes (which can be expressed as [2, 66]), and modes -14 to -1 (which can be expressed as [-14, -1]) and modes 67 to 80 (which can be expressed as [67, 80]) are named wide directional modes.
- the wide directional modes may include at least one of a mode having a value smaller than -14 or a mode having a value larger than 80.
- all modes except modes 0 and 1 are symmetrical around mode 34.
- mode x and mode (68 - x) are symmetrical
- mode x and mode (66 - x) are symmetrical between mode [-14, -1] and mode [67, 80].
- the same symmetry relationship can be established between mode [N, -1] and mode [67, 66 - N].
- N can be an integer less than or equal to -14.
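The mode symmetry around mode 34 described above can be expressed as a small mapping. The following sketch assumes the mode numbering given in the text ([2, 66] normal directional, [N, -1] and [67, 66 - N] wide directional). Returning modes 0 and 1 unchanged is an assumption made for the example, since those modes have no direction; the function name `symmetric_mode` and the parameter `lowest_wide` (standing for N) are hypothetical.

```python
def symmetric_mode(mode, lowest_wide=-14):
    """Return the intra mode that mirrors `mode` around mode 34."""
    if mode in (0, 1):
        # Planar and DC have no prediction direction (assumption: map to self).
        return mode
    if 2 <= mode <= 66:
        # Normal directional modes: x <-> 68 - x.
        return 68 - mode
    if lowest_wide <= mode <= -1 or 67 <= mode <= 66 - lowest_wide:
        # Wide directional modes: x <-> 66 - x.
        return 66 - mode
    raise ValueError("mode out of range: %d" % mode)


print(symmetric_mode(2))    # 66
print(symmetric_mode(34))   # 34
print(symmetric_mode(-1))   # 67
print(symmetric_mode(-14))  # 80
```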
- an MxN block and an NxM block can be defined as blocks that are symmetric to each other.
- M and N can be the same or different.
- an M1xN1 block and an M2xN2 block can be defined as blocks that are symmetric to each other.
- an M1xN1 block and an M2xN2 block can be defined as blocks that are symmetric to each other.
- modes that are symmetric to each other can share at least one of the NSPT set, NSPT index, or NSPT kernel. That is, at least one of the NSPT set, NSPT index, or NSPT kernel for any one of the symmetric modes can be applied equally to any other one of the symmetric modes.
- modes that are symmetric to each other can share a single NSPT kernel.
- the corresponding NSPT kernel can be applied to the input data, and for the other mode, a transpose operation can be applied to the input data and then the corresponding NSPT kernel can be applied.
- mode x belongs to the [2, 33] mode
- a one-dimensional vector (1D vector) can be constructed in column-first order for the input data MxM block, and the NSPT kernel can be applied to the 1D vector.
- the construction of the 1D vector in column-first order can be done by reading the input data column by column from the input data MxM block, obtaining M columns, and arranging them in order to construct the 1D vector.
- a 1D vector can be constructed in row-first order, and the same NSPT kernel can be applied to the 1D vector.
- the configuration of a 1D vector according to the row-major order may be to read the input data row by row from the MxM block of input data, obtain M rows, and sequentially arrange them to configure a 1D vector.
- a 1D vector may be configured according to the column-first order for the (66 - x) mode that is symmetric to the x-th mode, and the same NSPT kernel as the x-th mode may be applied to the 1D vector.
- the column-first order or the row-first order may be applied to the modes 0 and 1, and the column-first order or the row-first order may be applied to the mode 34.
- the row-first order may be applied to the intra prediction mode that belongs to the [2, 33] mode, and the column-first order may be applied to the mode that is symmetric to the intra prediction mode.
- row-major ordering can be applied, and for modes symmetric to this, column-major ordering can be applied.
- a non-square block with width and height M and N, respectively can be viewed as having a symmetry relationship with a non-square block with width and height N and M, respectively.
- the x-th mode of an MxN block belongs to the [N, -1] mode (N ≤ -14)
- the method of constructing a 1D vector from an input data block is as discussed above. That is, if a column-major order is applied to mode x, a row-major order can be applied to a mode that is symmetric to it. Alternatively, if a row-major order is applied to mode x, a column-major order can be applied to a mode that is symmetric to it. Specifically, if a column-major order is applied to mode x, input data can be read column-wise from an MxN block of input data to obtain M columns, which can be arranged sequentially to construct a 1D vector. Here, each column can have a length of N.
- in this case, for the mode symmetric to mode x, input data can be read row-wise from an MxN block of input data to obtain N rows, which can be arranged sequentially to construct a 1D vector.
- each row can have a length of M.
- conversely, if a row-major order is applied to mode x, input data can be read row-wise from an MxN block of input data to obtain N rows, which can be arranged sequentially to construct a 1D vector.
- each row can have a length of M.
- in this case, for the mode symmetric to mode x, input data can be read column by column from an MxN block of input data to obtain M columns, which can be arranged in order to form a 1D vector.
- each column can have a length of N.
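The two read orders described above can be sketched as follows. This is a minimal illustration, assuming the block is stored as a list of N rows with M samples each; the function names `flatten_row_major` and `flatten_column_major` and the sample values are hypothetical and are not taken from the disclosure.

```python
def flatten_row_major(block):
    """Read the block row by row: N rows of length M, concatenated."""
    return [sample for row in block for sample in row]


def flatten_column_major(block):
    """Read the block column by column: M columns of length N, concatenated."""
    height = len(block)
    width = len(block[0])
    return [block[y][x] for x in range(width) for y in range(height)]


# A block with width M = 4 and height N = 2 (illustrative sample values).
block = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
]
row_vector = flatten_row_major(block)
column_vector = flatten_column_major(block)

# Column-major reading equals row-major reading of the transposed block.
transposed = [list(column) for column in zip(*block)]
```

Because column-major reading of a block is the same as row-major reading of its transpose, a mode and its symmetric mode can feed the same kernel simply by switching the read order.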
- the NSPT set and/or NSPT kernel of the current block may be determined based on at least one of an intra prediction mode symmetric to mode x or a block size of NxM symmetric to the block size of MxN.
- the NSPT kernel may be set as an NSPT kernel for an NxM block, not an NSPT kernel for an MxN block. That is, if symmetry is utilized for the current block, the NSPT set and/or NSPT kernel for a block that has symmetry with the current block may be utilized in the same manner.
- a 1D vector may be constructed from an input data block according to a predetermined priority order, which may correspond to an input of the NSPT kernel.
- the symmetry may be restricted to be utilized only when the value of the intra prediction mode of the current block is greater than 34. That is, when the value of the intra prediction mode of the current block is greater than 34, a transpose operation may be applied when constructing a 1D vector from an input data block, and an NSPT set or NSPT kernel corresponding to a mode and/or block shape having symmetry with the current block may be utilized. Specifically, when the intra prediction mode of the current block belongs to the [N, -1] mode or the [2, 34] mode, symmetry may not be utilized for the current block. On the other hand, when the intra prediction mode of the current block belongs to the [35, 66] mode or the [67, 66 - N] mode, symmetry may be utilized for the current block.
- N may be an integer less than or equal to -14.
- the derivation of the NSPT set or NSPT kernel based on the above symmetry can be adaptively performed based on the size of the current block. For example, for 4x4 blocks and 8x8 blocks, the NSPT set or NSPT kernel may be derived based on symmetry, and for 4x8 blocks and 8x4 blocks, the NSPT set or NSPT kernel may not be derived based on symmetry.
- the number of available NSPT sets may vary. For example, if symmetry is utilized, the number of available NSPT sets may be 35, and if symmetry is not utilized, the number of available NSPT sets may be 67.
- Table 9 shows an example in which the NSPT set is determined using symmetry, and shows the mapping relationship between the intra prediction mode and the NSPT set when the number of available NSPT sets is 35.
- Table 9. Intra prediction mode (X) → NSPT set index: X < 0 → 2; 0 ≤ X ≤ 34 → X; 35 ≤ X ≤ 66 → 68 - X; X > 66 → 2
- if the value (X) of the intra prediction mode of the current block is less than 0, the NSPT set of the current block may be determined as an NSPT set having an NSPT set index of 2 among 35 NSPT sets. If the value (X) of the intra prediction mode of the current block is greater than or equal to 0 and less than or equal to 34, the NSPT set of the current block may be determined as an NSPT set having an NSPT set index of X among 35 NSPT sets.
- if the value (X) of the intra prediction mode of the current block is greater than or equal to 35 and less than or equal to 66, the NSPT set of the current block may be determined as an NSPT set having an NSPT set index of (68-X) among 35 NSPT sets. That is, the NSPT set of the current block may be identical to the NSPT set corresponding to the value (68-X) of the mode symmetrical to the intra prediction mode of the current block.
- if the value (X) of the intra prediction mode of the current block is greater than 66, the NSPT set of the current block may be determined as the NSPT set having the NSPT set index of 2 among the 35 NSPT sets. That is, the NSPT set of the current block may be identical to the NSPT set corresponding to the mode symmetrical to the intra prediction mode of the current block.
- Table 10 below shows an example in which the NSPT set is determined without using symmetry, and shows the mapping relationship between the intra prediction mode and the NSPT set when the number of available NSPT sets is 67.
- Table 10. Intra prediction mode (X) → NSPT set index: X < 0 → 2; 0 ≤ X ≤ 66 → X; X > 66 → 66
- if the value (X) of the intra prediction mode of the current block is less than 0, the NSPT set of the current block may be determined as an NSPT set having an NSPT set index of 2 among 67 NSPT sets. If the value (X) of the intra prediction mode of the current block is greater than or equal to 0 and less than or equal to 66, the NSPT set of the current block may be determined as an NSPT set having an NSPT set index of X among 67 NSPT sets. If the value (X) of the intra prediction mode of the current block is greater than 66, the NSPT set of the current block may be determined as an NSPT set having an NSPT set index of 66 among 67 NSPT sets.
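The two mappings of Table 9 and Table 10 can be expressed as small lookup functions. The sketch below follows the ranges stated in the tables; the function names are hypothetical, and the range of wide-angle mode values used in the usage note (-14 to 80) is an assumption made only for illustration.

```python
def nspt_set_index_with_symmetry(mode):
    """Table 9: 35 sets; modes above 34 reuse the set of the mirrored mode."""
    if mode < 0:
        return 2
    if mode <= 34:
        return mode
    if mode <= 66:
        return 68 - mode
    return 2


def nspt_set_index_without_symmetry(mode):
    """Table 10: 67 sets; modes outside [0, 66] are mapped to set 2 or set 66."""
    if mode < 0:
        return 2
    if mode <= 66:
        return mode
    return 66


# Number of distinct sets reached over an assumed mode range of -14..80.
sets_with_symmetry = {nspt_set_index_with_symmetry(m) for m in range(-14, 81)}
sets_without_symmetry = {nspt_set_index_without_symmetry(m) for m in range(-14, 81)}
```

With symmetry, mode 56 resolves to the same set index as mode 12, so only 35 distinct sets need to be stored instead of 67.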
- the memory required to store transformation kernels can be reduced while maintaining the performance required for transform application. For example, by utilizing symmetry to use 35 NSPT sets instead of 67, the memory required to store NSPT kernels can be significantly reduced.
- the number of available NSPT sets and/or the number of NSPT kernel candidates in an NSPT set may vary depending on the block size.
- the number of available NSPT sets for a 4x4 block may be 35
- the number of available NSPT sets for a 4x8 block and an 8x4 block may be 19, and the number of available NSPT sets for an 8x8 block may be 10.
- the NSPT set for a 4x4 block may consist of three NSPT kernel candidates
- the NSPT sets for a 4x8 block and an 8x4 block may consist of three or two NSPT kernel candidates
- the NSPT set for an 8x8 block may consist of one NSPT kernel candidate.
- as the block size increases, the size of the transform kernel may increase. Therefore, by reducing the number of available NSPT sets and/or the number of NSPT kernel candidates within an NSPT set, the memory required to store the transform kernel can be saved. Furthermore, as the block size increases, the residual signal characteristics within the block tend to become more generalized. Reducing the number of available NSPT sets and/or the number of NSPT kernel candidates within an NSPT set can therefore reflect these statistical characteristics, helping to maintain compression efficiency while reducing implementation complexity.
- a non-separable transform may be applied to the current block.
- the current block may be a block encoded using inter prediction.
- the method according to the present disclosure may be applied in the same/similar manner even when the current block is a block encoded using intra prediction.
- the non-separable transform may include at least one of the aforementioned LFNST or NSPT.
- a transform set for non-separable transformation may be selected based on the intra prediction mode used for intra prediction of the block.
- an intra prediction mode may be derived based on a decoder-side intra mode derivation (DIMD) method, and a transform set for non-separable transformation may be selected based on the derived intra prediction mode.
- a predetermined filter is applied to at least one of a prediction block or a surrounding area of a current block to derive a gradient value in a horizontal and/or vertical direction, and a specific intra prediction mode is derived based on the derived gradient value.
- a set of transformations to be applied to the current block can be selected based on the specific intra prediction mode.
- the intra prediction mode derived based on the DIMD method may be called a virtual intra prediction mode (VIPM) or a DIMD mode.
- an initial sample position may be set, and accumulated intensity values for all intra prediction modes may be initialized (S600).
- An intensity value for a current sample position may be calculated (S610).
- An intra prediction mode for which the calculated intensity value is to be accumulated may be selected (S620).
- the intensity value calculated in S610 may be added to the accumulated intensity value for the selected intra prediction mode (S630).
- the processes of S610 to S630 described above may be performed for each of all or some sample positions belonging to the current block until a next sample position does not exist. If a next sample position does not exist, an intra prediction mode having the largest accumulated intensity value may be selected (S640).
- the surrounding area is assumed to be a block with width and height W and H.
- the DIMD method can also be applied based on the predicted block of the current block, and hereinafter, "surrounding area" can be understood as "predicted block."
- a filter of size PxQ can be applied inside a WxH block.
- the PxQ filter can be a 2D filter or a 1D filter. However, there may be cases where the filter goes beyond the block boundary for samples adjacent to the block boundary.
- the filter can be applied only to the internal area of the WxH block excluding samples adjacent to the block boundary. Specifically, the filter can be applied only to sample positions belonging to the (W-P+1)x(H-Q+1) area, which is the internal area of the WxH block.
- for example, if the filter size is 3x3, a 3x3 2D filter can be applied only to sample positions belonging to the (W-2)x(H-2) area, which excludes a border of width 1 from the WxH block.
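The restriction to the internal area can be sketched as follows. This is an illustrative helper, assuming the filter window is anchored at its centre sample; the function name `interior_positions` and its parameters are hypothetical.

```python
def interior_positions(width, height, filter_w=3, filter_h=3):
    """List (x, y) positions where a filter_w x filter_h window fits in the block.

    Assumption: the window is anchored at its centre sample, so for a 3x3
    filter a border of one sample is skipped on every side.
    """
    half_w = filter_w // 2
    half_h = filter_h // 2
    return [
        (x, y)
        for y in range(half_h, height - (filter_h - 1 - half_h))
        for x in range(half_w, width - (filter_w - 1 - half_w))
    ]


# For an 8x4 block and a 3x3 filter the internal area is (8-2) x (4-2).
positions = interior_positions(8, 4)
```

The number of returned positions equals (W-P+1) x (H-Q+1), matching the size of the internal area stated above.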
- the 2D filter can be applied to a 3x3 area having the sample of the sample position (hereinafter referred to as the reference sample) as the center sample.
- the above reference sample and at least one surrounding sample adjacent to the reference sample may be input to the 2D filter.
- the surrounding sample may include a sample adjacent to at least one of the top, left, bottom, right, top left, top right, bottom left, or bottom right of the reference sample.
- a filter can be applied to each sample position belonging to the internal region to derive an intensity value of a specific intra prediction mode.
- the derived intensity value is added to a previously derived intensity value for the specific intra prediction mode, and through this process, a cumulative intensity value can be derived for the specific intra prediction mode.
- cumulative intensity values can be derived for all intra prediction modes (or directional modes excluding the non-directional mode).
- An intra prediction mode with the largest cumulative intensity value can be selected, and this can be set as an intra prediction mode derived based on the DIMD method (hereinafter, referred to as a DIMD mode).
- Two types of 3x3 filters can be used to derive the DIMD mode, and these are called filterY and filterX, respectively.
- the position where filterY and filterX are applied i.e., the position of the reference sample
- the positions of the samples related to the application of the two filters can be represented as A, B, C, D, E, F, G, H, and I.
- if the values obtained by applying filterY and filterX are denoted iDy and iDx, respectively, iDy and iDx can be calculated as follows.
- iDx = G + 2*H + I - A - 2*B - C
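The gradient computation at one reference sample can be sketched as follows. Two assumptions are made here and are not confirmed by the text: the positions A to I are taken to be laid out row by row in the 3x3 window (A, B, C on the top row, E at the centre, G, H, I on the bottom row), and iDy is taken to be the Sobel counterpart of the iDx expression, since the expression for iDy is not reproduced above. The function name `gradients_3x3` is hypothetical.

```python
def gradients_3x3(samples, x, y):
    """Return (iDy, iDx) for the 3x3 window centred on samples[y][x]."""
    a, b, c = samples[y - 1][x - 1], samples[y - 1][x], samples[y - 1][x + 1]
    d, f = samples[y][x - 1], samples[y][x + 1]
    g, h, i = samples[y + 1][x - 1], samples[y + 1][x], samples[y + 1][x + 1]

    # Expression given in the text: bottom row minus top row.
    i_dx = g + 2 * h + i - a - 2 * b - c
    # Assumption: right column minus left column (not given in the text).
    i_dy = c + 2 * f + i - a - 2 * d - g
    return i_dy, i_dx


# Sample values change only from row to row (illustrative data).
rows_change = [[0, 0, 0], [10, 10, 10], [20, 20, 20]]
# Sample values change only from column to column (illustrative data).
columns_change = [[0, 10, 20], [0, 10, 20], [0, 10, 20]]
```

Under these assumptions, a block whose samples are constant along each column yields iDx = 0, which agrees with the statement that iDx = 0 indicates no variation in the vertical direction.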
- the subsequent steps i.e., obtaining the intensity value, selecting a specific intra prediction mode, and adding the intensity value to the accumulated intensity value for the intra prediction mode
- the filter can be moved to the next sample position to be filtered.
- the function abs may be a function that calculates and returns the absolute value of an input value.
- the value of iAmp may be calculated as follows. This may correspond to step S610 of FIG. 6.
- iAngUneven, which is the value of the selected specific intra prediction mode, can be determined as in Equation 8 below. This may correspond to step S620 of FIG. 6 for the case where either iDx or iDy is 0.
- VER_IDX and HOR_IDX can represent vertical mode and horizontal mode, respectively.
- VER_IDX and HOR_IDX can correspond to mode 50 and mode 18 in FIG. 5, respectively.
- a value of iDx of 0 can mean that the amount of variation in the vertical direction is 0 or almost none. In other words, it can mean that the same sample values are considered to be present in the vertical direction, and thus prediction is good in the vertical direction.
- a value of iDy of 0 i.e., when the value of iDx is not 0
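The case in which one of the two gradients is zero can be sketched as follows. The mapping of iDx = 0 to the vertical mode follows the explanation above; the mapping of iDy = 0 to the horizontal mode is an assumption made by symmetry, and returning `None` when both gradients are zero is also an assumption, since Equation 8 itself is not reproduced here. The function name is hypothetical.

```python
HOR_IDX = 18  # horizontal mode
VER_IDX = 50  # vertical mode


def mode_for_zero_gradient(i_dy, i_dx):
    """Select a mode when at least one gradient is zero (illustrative only)."""
    if i_dx == 0 and i_dy == 0:
        # Assumption: no direction can be inferred, so nothing is accumulated.
        return None
    if i_dx == 0:
        # No variation in the vertical direction: vertical prediction fits.
        return VER_IDX
    if i_dy == 0:
        # Assumption by symmetry: horizontal prediction fits.
        return HOR_IDX
    raise ValueError("both gradients are non-zero; use the region-based rule")
```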
- iAngUneven, which is the value of the selected specific intra prediction mode, can be determined as follows. This may correspond to step S620 of FIG. 6 for the case where both iDx and iDy are non-zero.
- the values of intra prediction modes (especially, the values of directional modes) can be divided into the following four groups.
- the first group (region 0) can be composed of modes having a specific number of horizontal directions.
- the first group can be composed of modes having a value less than or equal to the horizontal mode.
- the modes having a horizontal direction are modes 2 to 34 (however, mode 34 may not be included in the modes having a horizontal direction), and the value of the horizontal mode is 18 (i.e., the horizontal mode is mode 18).
- if the specific number is N, the first group can be composed of modes {18, 18-1, 18-2, ..., 18-(N-1)}. If N is 17, the first group can be composed of modes {18, 17, 16, ..., 2}.
- the second group (region 1) can be composed of modes with a specific number of horizontal orientations.
- the second group can be composed of modes whose values are greater than or equal to the value of the horizontal mode. If the specific number is N, the second group can be composed of modes {18, 18+1, 18+2, ..., 18+(N-1)}. If N is 17, the second group can be composed of modes {18, 19, 20, ..., 34}.
- the third group (region 2) can be composed of modes having a specific number of vertical orientations.
- the third group can be composed of modes having a value less than or equal to the vertical mode.
- the modes having a vertical orientation are modes 34 to 66 (however, mode 34 may not be included in the modes having a vertical orientation), and the value of the vertical mode is 50 (i.e., the vertical mode is mode 50).
- if the specific number is N, the third group can be composed of modes {50, 50-1, 50-2, ..., 50-(N-1)}. If N is 17, the third group can be composed of modes {50, 49, 48, ..., 34}.
- the fourth group (region 3) can be composed of modes with a specific number of vertical orientations.
- the fourth group can be composed of modes whose values are greater than or equal to the vertical mode. If the specific number is N, the fourth group can be composed of modes {50, 50+1, 50+2, ..., 50+(N-1)}. If N is 17, the fourth group can be composed of modes {50, 51, 52, ..., 66}.
- an identifier indicating a specific group can be calculated as shown in Table 11 below.
- the identifiers corresponding to the first to fourth groups are 0, 1, 2, and 3, respectively.
- gtY can indicate whether the amount of change in the vertical direction is greater than the amount of change in the horizontal direction. That is, if absx is greater than absy, gtY can be derived as 1, otherwise gtY can be derived as 0.
- if gtY is 1, this indicates that the amount of change in the vertical direction is greater than the amount of change in the horizontal direction, and if gtY is 0, this indicates that the amount of change in the vertical direction is less than or equal to the amount of change in the horizontal direction. If the amount of change in the vertical direction is large, it may mean that the prediction in the vertical direction is less likely to be good. In this case, region 0 or region 1 can be selected.
- that is, mapXgrY1[signy][signx] can be selected as the identifier for the region. If the amount of change in the horizontal direction is large, it may mean that the prediction in the horizontal direction is less likely to be good. In this case, region 2 or region 3 can be selected. That is, mapXgrY0[signy][signx] can be selected as an identifier for the region.
- when the sign for iDx and the sign for iDy are the same, region 1 or region 2 can be selected, and when the sign for iDx and the sign for iDy are different, region 0 or region 3 can be selected.
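The region selection can be sketched as follows. Table 11 is not reproduced above, so the contents of the two lookup tables are reconstructed from the stated rules (gtY = 1 gives region 0 or 1, gtY = 0 gives region 2 or 3, equal signs give region 1 or 2, different signs give region 0 or 3) and should be read as an assumption. The sign encoding (1 for a negative value) is also an assumption; because both tables are symmetric, the opposite encoding would give the same result.

```python
# Reconstructed from the rules in the text (assumption, not a copy of Table 11).
MAP_X_GR_Y1 = [[1, 0], [0, 1]]  # used when gtY == 1
MAP_X_GR_Y0 = [[2, 3], [3, 2]]  # used when gtY == 0


def select_region(i_dy, i_dx):
    """Return the group identifier (0..3) for two non-zero gradients."""
    sign_x = 1 if i_dx < 0 else 0
    sign_y = 1 if i_dy < 0 else 0
    abs_x = abs(i_dx)
    abs_y = abs(i_dy)
    gt_y = 1 if abs_x > abs_y else 0
    table = MAP_X_GR_Y1 if gt_y else MAP_X_GR_Y0
    return table[sign_y][sign_x]
```

For example, `select_region(1, 5)` has equal signs and absx greater than absy, so it falls in region 1 (modes 18 to 34).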
- (1 << 16) means shifting 1 to the left by 16 bits, which represents 65536, that is, 2^16.
- the intFunc function can be a function that converts fRatioScaled, which is expressed as a decimal value, to an integer value. Operations such as rounding, ceiling, and flooring can be applied to convert to an integer value. The conversion can also be performed using a cast to an integer type such as int in C/C++.
- a division operation (/) may be required to obtain the ratio of absy and absx.
- a division operation may be expensive or difficult to implement. It is therefore often advantageous from an implementation perspective to approximate the division as a combination of several integer operations, and the equations for obtaining the ratio can be approximated with the equations in Table 13 below.
- the integerized ratio value can be determined through a process as shown in Table 13.
- the position to which the ratio value is closest in angTable can be determined as shown in Table 14 below.
- angTable can be composed of 17 entries. If each group described above consists of 17 modes, each intra prediction mode constituting each group can correspond to an entry. In Table 14, the angTable entry closest to the ratio value can be found, and the value of idx can be derived based on the index of angTable corresponding to that entry.
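The scaling of the ratio by (1 << 16) and the search for the closest angTable entry can be sketched as follows. Tables 12 to 14 are not reproduced above, so several points are assumptions: the table values below are evenly spaced placeholders rather than the actual angTable, the ratio is taken as the smaller gradient magnitude over the larger one so that it lies between 0 and 1, and flooring is used as the integer conversion. Integer floor division is used for brevity; the division-free approximation of Table 13 is not shown.

```python
# Hypothetical placeholder with 17 entries from 0 to 65536 (not the real table).
ANG_TABLE = [i * 4096 for i in range(17)]


def scaled_ratio(abs_y, abs_x):
    """Return floor(ratio * 65536), with ratio = smaller / larger magnitude.

    Both magnitudes are expected to be non-zero (the zero case is handled
    separately in the text).
    """
    smaller, larger = sorted((abs_y, abs_x))
    return (smaller << 16) // larger


def closest_entry(ratio_scaled, table=ANG_TABLE):
    """Return the index of the table entry nearest to ratio_scaled."""
    return min(range(len(table)), key=lambda i: abs(table[i] - ratio_scaled))


idx = closest_entry(scaled_ratio(1, 2))
```

With the placeholder table, a ratio of one half maps to the middle entry, and a ratio of one maps to the last entry (index 16).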
- iAngUneven can be assigned the value of the intra prediction mode for accumulating the intensity value for the current sample position within the current block to which the filter is applied.
- the region derived above can be an identifier indicating any one of the four groups.
- offsets[region] can indicate the starting value of the intra prediction mode for the group indicated by the region, and dirs[region] can indicate the direction in which the intra prediction mode increases or decreases.
- idx can indicate the position within angTable. Therefore, the value of iAngUneven can be determined using a formula as shown in Table 15.
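The combination of the starting value, the direction, and idx can be sketched as follows. Table 15 is not reproduced above; the formula below is an assumption that is consistent with the four groups described earlier (each group starts at the horizontal or vertical mode and runs over 17 modes in a decreasing or increasing direction). The constant and function names are hypothetical.

```python
HOR_IDX = 18  # horizontal mode
VER_IDX = 50  # vertical mode

# Starting mode and step direction for regions 0..3.
OFFSETS = [HOR_IDX, HOR_IDX, VER_IDX, VER_IDX]
DIRS = [-1, 1, -1, 1]


def mode_from_region(region, idx):
    """Return the intra prediction mode for a region and an angTable index."""
    return OFFSETS[region] + DIRS[region] * idx


# Modes reachable from each region when idx runs over 0..16.
reachable = [[mode_from_region(r, i) for i in range(17)] for r in range(4)]
```

With idx in the range 0 to 16, region 0 covers modes 18 down to 2, region 1 covers 18 to 34, region 2 covers 50 down to 34, and region 3 covers 50 to 66.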
- the process of adding the accumulated intensity values for a selected specific intra prediction mode can be performed as follows.
- the piHistogram array is an array that stores the accumulated intensity values for all intra prediction modes, and after being initialized to 0, the intensity values can be calculated by looping over the sample positions of the internal area within the current block to which the filter is applied. At this time, the intensity value calculated for the current sample position is added to the accumulated intensity value for the selected specific intra prediction mode.
- iAngUneven may store the value of the intra prediction mode selected for the current sample position to which the filter is applied
- iAmp may store the intensity value calculated for the corresponding sample position.
- piHistogram[iAngUneven] may store the accumulated intensity value for the intra prediction mode indicated by iAngUneven.
- Equation 9 may correspond to step S630 of FIG. 6.
- After looping over all sample locations within the internal region of the block to which the filter is applied, the piHistogram array stores the accumulated intensity values for all intra prediction modes. One or more modes with the largest accumulated intensity values can be selected. The selected mode may be referred to as the DIMD mode. This may correspond to step S640 in FIG. 6.
- NUM_LUMA_MODE can indicate the number of all intra prediction modes available in DIMD mode. For example, NUM_LUMA_MODE can be 67. According to Table 16, the value of the intra prediction mode with the largest accumulated intensity value can be assigned to the firstMode variable. Therefore, the intra prediction mode corresponding to the value of the finally determined firstMode variable can be set as the DIMD mode for the WxH block.
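The accumulation into piHistogram and the selection of firstMode can be sketched as follows. The input is assumed to be a list of (iAngUneven, iAmp) pairs already computed for each filtered sample position; breaking ties in favour of the lower mode value is an assumption, since Table 16 is not reproduced above.

```python
NUM_LUMA_MODE = 67


def accumulate(contributions):
    """Build the histogram of accumulated intensity values (S600 to S630)."""
    pi_histogram = [0] * NUM_LUMA_MODE  # initialised to 0
    for i_ang_uneven, i_amp in contributions:
        pi_histogram[i_ang_uneven] += i_amp
    return pi_histogram


def select_first_mode(pi_histogram):
    """Return the mode with the largest accumulated intensity value (S640)."""
    first_mode = 0
    for mode in range(1, len(pi_histogram)):
        # Strict comparison: on a tie the lower mode value is kept (assumption).
        if pi_histogram[mode] > pi_histogram[first_mode]:
            first_mode = mode
    return first_mode


# Illustrative (mode, intensity) pairs for four sample positions.
histogram = accumulate([(18, 5), (50, 3), (18, 2), (50, 6)])
dimd_mode = select_first_mode(histogram)
```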
- An encoding device and a decoding device may define a mapping table that defines a mapping relationship between intra prediction modes and transformation sets.
- a transformation set corresponding to a previously derived intra prediction mode may be selected from the mapping table without signaling information for selecting a transformation set.
- the selected transformation set may include a plurality of transformation kernel candidates.
- An index indicating one of the plurality of transformation kernel candidates may be signaled.
- the index here may be expressed as an NSPT index or an LFNST index depending on the type of non-separable transformation.
- the index may be expressed as a transformation index.
- the value of the index may fall in the range of 0 to N.
- An index of 0 indicates that a non-separable transformation is not applied, and an index of k may indicate the kth transformation kernel candidate (1 ≤ k ≤ N).
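The interpretation of the signaled index can be sketched as follows. The transformation set is modeled as a plain list of kernel candidates, and the function name `select_kernel` and the placeholder kernel labels are hypothetical.

```python
def select_kernel(transform_set, index):
    """Map a signaled index to a kernel candidate.

    Index 0 means that no non-separable transformation is applied; index k
    (1 <= k <= N) selects the k-th candidate of the set.
    """
    if index < 0 or index > len(transform_set):
        raise ValueError("index out of range for this transformation set")
    if index == 0:
        return None
    return transform_set[index - 1]


# A set with N = 3 kernel candidates (placeholder labels).
candidates = ["kernel_1", "kernel_2", "kernel_3"]
```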
- the transform set candidate may mean a transform set for a non-separable transform (e.g., a transform set for LFNST and/or a transform set for NSPT).
- the present invention is not limited thereto, and may also mean a transform set for a separable transform (e.g., a transform set for MTS).
- Each transform set candidate may include one or more transform kernel candidates. Any one of the one or more transform set candidates may be determined as the transform set of the current block. Any one of the one or more transform kernel candidates belonging to the transform set of the current block may be set as the transform kernel of the current block.
- candidate transformation sets mapped to them can be derived respectively.
- a transformation set candidate may be selected based on the intra prediction mode applied to the current block.
- the intra prediction mode may be a planar mode, a DC mode, or a directional mode.
- when a transformation set candidate is selected based on the intra prediction mode applied to intra prediction, the residual signal resulting from intra prediction may not exactly match the transformation kernel according to that intra prediction mode.
- the intra prediction mode applied to intra prediction is referred to as mode A.
- a transformation set candidate can be selected based on a mode other than mode A.
- modes other than mode A may include modes similar to mode A.
- if the mode value of mode A is M, at least one mode in the range of (M-d) to (M+d) may be considered a mode similar to mode A. Here, d may be an integer greater than or equal to 1.
- Modes other than mode A may include representative modes that do not overlap with mode A.
- at least one of the planar mode, DC mode, horizontal mode, vertical mode, right-upward diagonal mode, right-downward diagonal mode, or left-downward diagonal mode may be a representative mode that does not overlap with mode A.
- the right-upward, right-downward, and left-downward diagonal modes may be mode 2, mode 34, and mode 66, respectively.
- Modes other than mode A may include modes derived through a specific method, such as the DIMD method, which is as discussed in Example 4.
- the mode derived through the specific method may be named VIPM (Virtual Intra Prediction Mode) or DIPM (Derived Intra Prediction Mode).
- a prediction block can be generated based on a specific intra prediction mode, such as a planar mode, a DC mode, or a directional mode.
- a DIPM can be derived using the DIMD method, which takes the prediction block as input. That is, the DIMD method can derive accumulated amplitude values for intra prediction modes, and the top K intra prediction modes can be derived as DIPMs in descending order of the accumulated amplitude values.
- K can be an integer greater than or equal to 1.
- an intra-prediction mode may not be defined for the inter-block. However, as an exception, an intra-prediction mode may be defined for the GPM (Geometric Partitioning Mode) and the CIIP (Combined Inter-Intra Prediction) mode.
- the DIPM can be derived through the DIMD method that takes as input a prediction block generated through inter-prediction. That is, the accumulated amplitude value for the intra-prediction modes can be derived through the DIMD method, and the top K intra-prediction mode(s) in descending order of the accumulated amplitude value can be derived as the DIPM.
- K can be an integer greater than or equal to 1.
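The selection of the top K modes as DIPMs can be sketched as follows. The histogram of accumulated amplitude values is assumed to be available as a list indexed by mode value; breaking ties in favour of the lower mode value is an assumption, and the function name is hypothetical.

```python
def top_k_modes(histogram, k):
    """Return the k modes with the largest accumulated values, best first."""
    ranked = sorted(range(len(histogram)), key=lambda mode: (-histogram[mode], mode))
    return ranked[:k]


# Illustrative accumulated values for modes 0..3.
example_histogram = [0, 5, 3, 5]
dipm_list = top_k_modes(example_histogram, 2)
```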
- Intra prediction may be performed in a manner other than the conventional intra prediction modes such as planar mode, DC mode, or directional mode.
- intra prediction may be performed in a manner such as Decoder-side Intra Mode Derivation (DIMD), Template-based Intra Mode Derivation (TIMD), Matrix-based Intra Prediction (MIP), Intra Template Matching Prediction (IntraTMP), Extrapolation filter-based Intra Prediction (EIP), or Spatial Geometric Partitioning Mode (SGPM).
- each partition according to geometric partitioning may have a conventional intra prediction mode, but it may be difficult to specify a conventional intra prediction mode for the entire coding block or transform block.
- intra prediction modes such
- Transform sets specialized for blocks encoded with a special intra prediction mode can be separately defined.
- transform sets mapped to conventional intra prediction modes can be used in the same manner. If the current block is a block encoded with a special intra prediction mode, a conventional intra prediction mode corresponding to the block encoded with the special intra prediction mode can be derived.
- a transform set candidate can be selected from the transform sets based on the derived conventional intra prediction mode.
- the above-described derived conventional intra prediction mode may include a DIPM derived through a DIMD method.
- the DIMD method may be applied by inputting a prediction block generated based on a special intra prediction mode.
- the DIMD method mentioned as a method for deriving the DIPM is only an example, and the DIPM may be derived based on at least one of pre-restored available samples in a decoding device, a distribution of the available samples, intra prediction mode(s) of adjacent and/or non-adjacent neighboring block(s), or frequencies of intra prediction modes of neighboring blocks.
- the above derived conventional intra prediction mode may include at least one representative mode such as a planar mode, a DC mode, a horizontal mode, a vertical mode, a right-up diagonal mode, a right-down diagonal mode, and a left-down diagonal mode.
- the above-described derived conventional intra prediction mode may include an intra prediction mode derived during the process of applying a special intra prediction mode. For example, when SGPM is applied to the current block, at least one of an intra prediction mode corresponding to the direction of the partition boundary, an intra prediction mode most similar to the direction of the partition boundary, or an intra prediction mode applied to each partition may be derived. When DIMD or TIMD is applied to the current block, one or more intra prediction modes may be derived during the intra prediction process.
- the plurality of intra prediction modes derived for the current block may include at least one of an intra prediction mode used for intra prediction of the current block (i.e., mode A), a mode different from mode A, or a conventional intra prediction mode derived for the current block.
- a conventional intra prediction mode can be derived based on a prediction block generated through inter prediction.
- a DIPM can be derived through a DIMD method that inputs a prediction block generated through inter prediction, and the DIPM can be set as a conventional intra prediction mode.
- the DIMD method is only an example, and the DIPM can also be derived based on at least one of pre-restored available samples in a decoding device, a distribution of available samples, intra prediction mode(s) of adjacent and/or non-adjacent neighboring block(s), or frequencies of intra prediction modes of neighboring blocks.
- a transform set candidate can be selected from transform sets based on the derived conventional intra prediction mode.
- a conventional intra prediction mode can be derived based on the intra prediction mode derived during the prediction process.
- a transformation set candidate can be selected from transformation sets based on the derived conventional intra prediction mode. Specifically, in the case of GPM, at least one of an intra prediction mode corresponding to the direction of a partition boundary, an intra prediction mode most similar to the direction of the partition boundary, or an intra prediction mode applied to each partition can be set as the conventional intra prediction mode. In the case of the CIIP mode, an intra prediction mode used for intra prediction can be set as the conventional intra prediction mode.
- the plurality of intra prediction modes derived for the current block may include at least one of the conventional intra prediction modes derived through the above-described method.
- the transform sets mapped to the multiple intra prediction modes can be defined as KS_(1), KS_(2), ... , KS_(n-1), KS_(n).
- the value of n can be an integer greater than or equal to 2. If M_(1) to M_(n) are assumed to be different modes, KS_(1) to KS_(n) can be different transform sets, and all or part of KS_(1) to KS_(n) can overlap with each other.
- KS_(1) to KS_(n) can be set as transformation set candidates for the current block.
- the overlapping transformation sets can be removed from KS_(1) to KS_(n) and only the different transformation sets can be selected to form KSE_(1), KSE_(2), ..., KSE_(m-1), KSE_(m).
- KSE_(1) to KSE_(m) can be set as transformation set candidates for the current block.
- the value of m can be an integer greater than or equal to 1.
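The removal of overlapping transformation sets can be sketched as follows. Each transformation set is represented by its transformation set index, and the order of first appearance is preserved; both choices are assumptions made for illustration, and the function name is hypothetical.

```python
def unique_transform_sets(set_indices):
    """Reduce KS_(1)..KS_(n) to KSE_(1)..KSE_(m), keeping first occurrences."""
    seen = set()
    unique = []
    for set_index in set_indices:
        if set_index not in seen:
            seen.add(set_index)
            unique.append(set_index)
    return unique


# Illustrative set indices mapped to five derived modes.
ks = [12, 12, 0, 18, 0]
kse = unique_transform_sets(ks)
```

Here n is 5 and m is 3; if every derived mode mapped to the same set, m would be 1.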
- transformation set candidates can be derived based on different transformation sets.
- if the number of different transformation sets among the transformation sets corresponding to multiple intra prediction modes is two or more, two or more transformation set candidates can be derived.
- if the number of different transformation sets is one, one transformation set candidate can be derived.
- one or more transformation sets different from KS_(1) can be additionally derived through the findOtherKS function described below. In this case, KS_(1) and the additionally derived one or more transformation sets can be set as transformation set candidates for non-separable transformation of the current block.
- a transformation set index specifying one of the two or more transformation set candidates can be signaled through the bitstream. Based on the signaled transformation set index, a transformation set for non-separable transformation of the current block can be determined. However, when only one transformation set candidate is derived, the signaling of the transformation set index can be omitted, and the value of the transformation set index can be derived as 0.
- M_(1) may be set as the base mode.
- KS_(1) that is mapped to M_(1) may be derived.
- it may be checked, sequentially from M_(2) to M_(n), whether the transformation set mapped to each mode is identical to KS_(1).
- the identity check may be performed until a transformation set that is different from the pre-derived KS_(1) is found. If a transformation set (hereinafter referred to as KS_(k)) that is different from KS_(1) is found, KS_(1) and KS_(k) may be set as transformation set candidates for the current block.
- in this case, two transformation set candidates may be derived for the current block.
- if no transformation set different from KS_(1) is found, one transformation set candidate may be derived for the current block, and this one transformation set candidate may be KS_(1).
- a flag specifying one of two transformation set candidates derived for the current block can be signaled via the bitstream.
- a transformation set for a non-separable transformation of the current block can be determined based on the signaled flag. However, if, as a result of the identity check, no transformation set different from KS_(1) is found (or, if one transformation set candidate is derived for the current block), the signaling of the aforementioned flag can be omitted, and the derived one transformation set candidate can be set as the transformation set for the non-separable transformation of the current block.
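The base-mode procedure and the conditional flag can be sketched as follows. The mapping from mode to transformation set index uses the ranges of Table 9 for illustration, and `read_flag` is a hypothetical callback standing in for parsing the flag from the bitstream; neither name comes from the disclosure.

```python
def table9_set_index(mode):
    """Mode-to-set mapping following the ranges of Table 9."""
    if mode < 0 or mode > 66:
        return 2
    return mode if mode <= 34 else 68 - mode


def derive_candidates(modes, set_index_of):
    """Return [KS_(1)] or [KS_(1), KS_(k)] using M_(1) as the base mode."""
    base = set_index_of(modes[0])
    for mode in modes[1:]:
        other = set_index_of(mode)
        if other != base:  # first set that differs from KS_(1)
            return [base, other]
    return [base]


def choose_set(candidates, read_flag):
    """Pick the transformation set; the flag is read only with two candidates."""
    if len(candidates) == 1:
        return candidates[0]
    return candidates[read_flag()]


two_candidates = derive_candidates([12, 56, 18], table9_set_index)
one_candidate = derive_candidates([12, 56], table9_set_index)
```

Modes 12 and 56 share a set under Table 9, so the second call yields a single candidate and no flag needs to be signaled.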
- a transformation set different from KS_(1) can be additionally derived via the findOtherKS function described below.
- KS_(1) and the additionally derived transformation set can be set as transformation set candidates for the non-separable transformation of the current block.
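The identity-check derivation and the flag signaling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the normative process: the mode-to-set table `KS_OF_MODE`, the function names, and the `read_flag` callback are hypothetical, and identity is checked on the transformation set index only.

```python
from typing import Callable, Dict, List, Sequence


def derive_candidates(modes: Sequence[int], ks_of_mode: Dict[int, int]) -> List[int]:
    """Return [KS_1] or [KS_1, KS_k] for the modes M_1..M_n."""
    ks1 = ks_of_mode[modes[0]]      # M_1 is the base mode
    for mode in modes[1:]:          # scan M_2..M_n in order
        ks = ks_of_mode[mode]
        if ks != ks1:               # first set that differs from KS_1
            return [ks1, ks]
    return [ks1]                    # no different set was found


def select_transform_set(candidates: List[int], read_flag: Callable[[], int]) -> int:
    """Read the flag only when two candidates exist; otherwise infer the single one."""
    if len(candidates) == 1:
        return candidates[0]        # flag signaling is omitted
    return candidates[read_flag()]


# Hypothetical mapping: modes 12 and 56 share set 3, mode 30 maps to set 7.
KS_OF_MODE = {12: 3, 56: 3, 30: 7}
two = derive_candidates([12, 56, 30], KS_OF_MODE)
one = derive_candidates([12, 56], KS_OF_MODE)
```

With this table, `two` holds both set 3 and set 7, so a flag would be read; `one` holds only set 3, so the flag is never parsed.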
- Whether two transformation sets are identical can be determined based on whether their transformation set indices are identical.
- the transformation set (KS_(2)) mapped to M_(2) may be determined to be the same as the transformation set (KS_(1)) mapped to M_(1). In this case, the transformation set mapped to M_(2) may not be set as a transformation set candidate for the current block.
- the transformation set (KS_(2)) mapped to M_(2) may be determined to be different from the transformation set (KS_(1)) mapped to M_(1). In this case, KS_(2) may be set as a transformation set candidate for the current block together with KS_(1).
- the mapping relationship between an intra prediction mode and a transform set can be defined as in Table 8, Table 9, or Table 10.
- the two directional modes (i.e., mode X and mode (68-X))
- the two directional modes can be mapped to the same transform set.
- mode 12 and mode 56 are mapped to the same transform set. Therefore, it is possible to determine whether the two transform sets are the same based only on the transform set index (e.g., NSPT set index) pointing to a specific transform set.
- the transform set index (e.g., NSPT set index)
- In Equation 10, if the transformation set index K(q) mapped to mode q and the transformation set index K(r) mapped to mode r are the same, the value of redundantKS can be derived as true or 1, and the transformation set mapped to mode q and the transformation set mapped to mode r can be determined to be the same. Otherwise, the value of redundantKS can be derived as false or 0, and the two transformation sets can be determined to be different.
- whether the two transformation sets are identical may be determined based on at least one of the following: whether the transformation set indices of the two transformation sets are identical, or whether the input data is transposed (i.e., the transpose type).
- the transposition type may be determined based on a flag indicating whether the input data is transposed.
- the transformation set (KS_(2)) mapped to M_(2) may be determined to be the same as the transformation set (KS_(1)) mapped to M_(1). In this case, the transformation set mapped to M_(2) may not be set as a transformation set candidate for the current block.
- the transformation set (KS_(2)) mapped to M_(2) may be determined to be different from the transformation set (KS_(1)) mapped to M_(1).
- the transformation set (KS_(2)) mapped to M_(2) may be determined to be different from the transformation set (KS_(1)) mapped to M_(1), regardless of whether the transpose type corresponding to M_(2) is the same as the transpose type corresponding to M_(1). If KS_(2) is determined to be different from KS_(1), KS_(2) may be set as a transformation set candidate for the current block together with KS_(1).
- a transpose operation may be performed on the input data first and then the transform kernel may be applied.
- the transform kernels applied to mode 12 and mode 56 cannot be considered identical.
- the transpose operation is a type of shuffling, and can be considered equivalent to left-multiplying X by a permutation matrix (here, matrix P).
- X may be a one-dimensional vector obtained by arranging, in a specific order such as row-major order, the input data of the 2D block to which the forward non-separable transform is applied.
- Since the forward transform kernels for mode 12 and mode 56 are effectively G and GP, respectively, different transform kernels are applied to the two modes. Therefore, it cannot be said that the same transform set is mapped to mode 12 and mode 56. Moreover, if a non-separable transform is applied to a non-square block, such as an MxN block, even the underlying transform kernels applied to the two modes may differ. For the sake of explanation, assume that wide angle intra prediction (WAIP) is not applied. Since mode 56 of an MxN block uses the transform set mapped to mode 12 of an NxM block, a fundamentally different transform set may be applied than for mode 12 of the MxN block.
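The claim that "transpose the input, then apply kernel G" is the same as applying the kernel GP can be checked numerically. The sketch below assumes row-major vectorization and uses a small 2x3 block with an arbitrary integer matrix standing in for G (it is not a real transform kernel); all helper names are hypothetical.

```python
def transpose_permutation(rows, cols):
    """Permutation matrix P with P @ vec(B) == vec(B transposed), row-major."""
    n = rows * cols
    P = [[0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            src = r * cols + c   # position of B[r][c] in vec(B)
            dst = c * rows + r   # position of the same sample in vec(B^T)
            P[dst][src] = 1
    return P


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def matvec(A, x):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


rows, cols = 2, 3
block = [[1, 2, 3], [4, 5, 6]]
X = [v for row in block for v in row]                            # vec(B), row-major
X_t = [block[r][c] for c in range(cols) for r in range(rows)]    # vec(B^T), row-major

# Arbitrary 6x6 stand-in for the kernel G.
G = [[(i * 7 + j * 3) % 5 - 2 for j in range(6)] for i in range(6)]

P = transpose_permutation(rows, cols)
GP = matmul(G, P)
```

Here `matvec(G, X_t)` and `matvec(GP, X)` give the same coefficients, while `GP` itself differs from `G`, which is why the two modes cannot be treated as sharing one effective kernel.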
- WAIP: wide angle intra prediction
- Let the transformation set index mapped to mode x be K(x), and let the value representing the transpose type for mode x be TP(x).
- the variable redundantKS, which indicates whether the transformation sets mapped to mode q and mode r are identical, can be derived as in Equation 11.
- the operation "&&" is an AND operation that evaluates to true only when both the left and right sides are true, and false otherwise.
- In Equation 11, if K(q) mapped to mode q and K(r) mapped to mode r are the same, and TP(q) for mode q and TP(r) for mode r are the same, the value of redundantKS can be derived as true or 1. Therefore, the transformation set mapped to mode q and the transformation set mapped to mode r can be determined to be the same. Otherwise (for example, if K(q) and K(r) are different, or TP(q) and TP(r) are different), the value of redundantKS can be derived as false or 0. Therefore, the transformation set mapped to mode q and the transformation set mapped to mode r can be determined to be different.
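The two identity checks described above (index only, and index plus transpose type) can be written as a short sketch. The tables `K` and `TP` below are hypothetical stand-ins for the mapping tables; only the comparison logic follows the text.

```python
def redundant_ks_index_only(K, q, r):
    """Index-only check: the sets are identical if their set indices match."""
    return K[q] == K[r]


def redundant_ks_with_transpose(K, TP, q, r):
    """Index and transpose check: both the set index and the transpose type must match."""
    return K[q] == K[r] and TP[q] == TP[r]


# Hypothetical tables: modes 12 and 56 share set index 3,
# but only mode 56 transposes its input data.
K = {12: 3, 56: 3, 30: 7}
TP = {12: 0, 56: 1, 30: 0}

same_by_index = redundant_ks_index_only(K, 12, 56)
same_by_index_and_tp = redundant_ks_with_transpose(K, TP, 12, 56)
```

Under these assumed tables the index-only check treats modes 12 and 56 as redundant, whereas the check that also compares transpose types keeps them distinct.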
- the two modes may be modes prior to being converted to the WAIP mode. That is, M_(1) to M_(n) may be modes prior to being converted to the WAIP mode.
- the two modes may be modes converted to WAIP modes.
- mode x converted to a WAIP mode is expressed as WA(x)
- the variable redundantKS can be derived as in Equation 12 or 13.
- K(WA(q)) may represent a transformation set index mapped to mode q converted to the mode of WAIP
- K(WA(r)) may represent a transformation set index mapped to mode r converted to the mode of WAIP.
- In Equation 12, if K(WA(q)) mapped to mode WA(q) and K(WA(r)) mapped to mode WA(r) are identical, the value of redundantKS can be derived as true or 1. Therefore, the transformation set mapped to mode WA(q) and the transformation set mapped to mode WA(r) can be determined to be identical. Otherwise, the value of redundantKS can be derived as false or 0. Therefore, the transformation set mapped to mode WA(q) and the transformation set mapped to mode WA(r) can be determined to be different.
- In Equation 13, K(WA(q)) and K(WA(r)) are as discussed for Equation 12.
- TP(WA(q)) can represent a transpose type for mode WA(q)
- TP(WA(r)) can represent a transpose type for mode WA(r).
- In Equation 13, if K(WA(q)) mapped to mode WA(q) is the same as K(WA(r)) mapped to mode WA(r), and TP(WA(q)) for mode WA(q) is the same as TP(WA(r)) for mode WA(r), then the value of redundantKS can be derived as true or 1. Therefore, the transformation set mapped to mode WA(q) and the transformation set mapped to mode WA(r) can be determined to be the same.
- only one of mode q and mode r may be converted to the WAIP mode, with the other left unconverted.
- redundantKS can be derived as in Equation 14 or 15.
- In Equation 14, if the transformation set index K(WA(q)) mapped to mode WA(q) and the transformation set index K(r) mapped to mode r are the same, and the transpose type TP(WA(q)) for mode WA(q) and the transpose type TP(r) for mode r are the same, the value of redundantKS can be derived as true or 1. Therefore, the transformation set mapped to mode WA(q) and the transformation set mapped to mode r can be determined to be the same.
- Otherwise, the value of redundantKS can be derived as false or 0. Therefore, the transformation set mapped to mode WA(q) and the transformation set mapped to mode r can be determined to be different.
- In Equation 15, if the transformation set index K(q) mapped to mode q and the transformation set index K(WA(r)) mapped to mode WA(r) are the same, and the transpose type TP(q) for mode q and the transpose type TP(WA(r)) for mode WA(r) are the same, the value of redundantKS can be derived as true or 1. Therefore, the transformation set mapped to mode q and the transformation set mapped to mode WA(r) can be determined to be the same.
- Otherwise, the value of redundantKS can be derived as false or 0. Therefore, the transformation set mapped to mode q and the transformation set mapped to mode WA(r) can be determined to be different.
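The WAIP-aware variants described above differ only in which of the two modes is converted before the lookup and in whether the transpose type is compared. A single parameterized sketch can cover them. The conversion function `wa`, the tables, and the mode numbers below are hypothetical; a real WAIP remapping depends on the block shape and is not reproduced here.

```python
def redundant_ks(q, r, K, TP, wa, convert_q=False, convert_r=False, use_transpose=True):
    """Compare the sets mapped to modes q and r, optionally after WAIP conversion.

    convert_q / convert_r choose whether WA(q) / WA(r) replaces q / r.
    use_transpose chooses whether the transpose types must also match.
    """
    mq = wa(q) if convert_q else q
    mr = wa(r) if convert_r else r
    same = K[mq] == K[mr]
    if use_transpose:
        same = same and TP[mq] == TP[mr]
    return same


# Toy conversion: mode 2 is remapped to mode 67, all other modes are unchanged.
def wa_toy(mode):
    return {2: 67}.get(mode, mode)


K = {2: 1, 66: 1, 67: 4}     # hypothetical set indices
TP = {2: 0, 66: 1, 67: 1}    # hypothetical transpose types

no_conversion = redundant_ks(2, 66, K, TP, wa_toy, use_transpose=False)
both_converted = redundant_ks(2, 66, K, TP, wa_toy, True, True, use_transpose=False)
only_q_converted = redundant_ks(2, 66, K, TP, wa_toy, convert_q=True)
only_r_converted = redundant_ks(2, 66, K, TP, wa_toy, convert_r=True)
```

In this toy setup the unconverted, index-only comparison reports the sets as identical, while every variant that converts mode 2 or compares transpose types reports them as different, which illustrates why the encoder and decoder must agree on one derivation method.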
- redundantKS can be derived based on any one of the methods predefined identically in the encoding device and the decoding device.
- the predefined methods may include at least one of the redundantKS derivation methods according to the aforementioned Equations 10 to 15.
- one of the above-described methods can be selectively used to determine whether the transform sets mapped to the two modes are identical. For example, if SGPM is applied to the current block, the redundantKS derivation method according to Equation 12 can be used. If DIMD mode is applied to the current block, the redundantKS derivation method according to Equation 10 can be used.
- the transform set candidate(s) can be derived through the aforementioned method based on the conventional intra prediction mode and an additional DIPM.
- the additional DIPM can be derived through a DIMD method that takes as input a prediction block generated based on the conventional intra prediction mode.
- the additional DIPM can also be derived through a DIMD method that takes as input a pre-reconstructed neighboring region adjacent to the current block.
- the conventional intra prediction mode and the additional DIPM derived for the current block can correspond to M_(1) to M_(n).
- the conventional intra prediction mode can be assigned to M_(1)
- the additional DIPM can be assigned to M_(2) to M_(n), respectively.
- q and r can represent the mode before being converted to the WAIP mode, and mode can be set to DIMD, TIMD, SGPM, IntraTMP, EIP, MIP, or convIntra.
- convIntra can indicate that the normal intra prediction mode, not the special intra prediction mode, is applied to the current block.
- any one of the redundantKS derivation methods according to Equations 10 to 15 can be fixedly used regardless of the mode.
- the redundantKS derivation method according to Equation 11 may be used fixedly, or the redundantKS derivation method according to Equation 14 or 15 may be used fixedly.
- M_(1) to M_(n) may be modes that are not converted to the mode of WAIP.
- the variable found may be a variable indicating whether a transformation set different from the transformation set specified by M_(1) is found.
- primaryKS may represent the transformation set specified by M_(1)
- alternativeKS may represent the transformation set mapped to the intra prediction mode when an intra prediction mode having a different transformation set is first found while looping over M_(2) to M_(n).
- if the variable found is false, this may indicate that no intra prediction mode with a transformation set different from primaryKS was found.
- alternativeKS may be determined to be absent, and a transformation set different from the primaryKS may be found and assigned, as in the pseudocode above.
- a transformation set different from the primaryKS can be found using the findOtherKS function.
- the flag for selecting one of the two candidate transformation sets may not be signaled.
- the primaryKS may be set to the transformation set for the non-separable transformation of the current block.
- the findOtherKS function can be configured as follows.
- a specific transformation set can be defined as a default transformation set, so that the default transformation set can be returned.
- the default transformation set may be a transformation set that is mapped to a planar mode or a DC mode.
- the default transformation set may be a transformation set that is mapped to a down-right diagonal mode (e.g., mode 34).
- a transformation set that is not mapped to any of the intra prediction modes pre-defined in the encoding device and the decoding device may be defined as the default transformation set.
- a configuration that selects a transformation set that is mapped to the specific intra prediction mode and performs a transpose operation on input data based on a forward transformation may be defined as the default transformation set.
- the transformation set adjacent to primaryKS may mean a transformation set corresponding to a value obtained by adding or subtracting a predetermined offset from the transformation set index of primaryKS. For example, if the transformation set index of primaryKS is 5, a transformation set having a transformation set index of 3 or 4 may be selected as alternativeKS.
- a default transformation set that is not mapped to any of the intra prediction modes pre-defined in the encoding device and the decoding device can be separately defined.
- the default transformation set can be selected as the alternativeKS.
- a separate transformation set that is used only when the value of the variable found is false can be defined and used as the default transformation set.
- a transformation set mapped to the horizontal mode can be selected as alternativeKS. If primaryKS is a transformation set corresponding to the horizontal mode, a transformation set mapped to the vertical mode can be selected as alternativeKS. If primaryKS is a transformation set corresponding to the vertical mode, a transformation set mapped to the horizontal mode can be selected as alternativeKS. If primaryKS is a transformation set corresponding to the horizontal mode (or vertical mode), alternativeKS can be set based on any one of the configurations 1 to 4 described above. If primaryKS is a transformation set corresponding to a non-directional mode (e.g., planar mode, DC mode), alternativeKS can be set based on any one of the configurations 1 to 4 described above.
- a non-directional mode (e.g., planar mode, DC mode)
- the findOtherKS function may be defined by any one of the configurations 1 to 6 described above.
- the findOtherKS function may be defined by at least two of the configurations 1 to 6, in which case at least one of the two configurations may be selectively used.
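Some of the findOtherKS configurations described above, together with the loop over M_(2) to M_(n) that uses the variables found, primaryKS, and alternativeKS, can be sketched as follows. This is an illustrative sketch only: the function names, the set indices, and the choice to subtract the offset first are assumptions, not details confirmed by the text.

```python
def find_other_ks_default(primary_ks, default_ks):
    """Return a predefined default set (e.g., one mapped to planar or DC)."""
    return default_ks


def find_other_ks_adjacent(primary_ks, offset=1):
    """Return a neighboring set index: subtract the offset, or add it at the lower bound."""
    lower = primary_ks - offset
    return lower if lower >= 0 else primary_ks + offset


def find_other_ks_hv(primary_ks, ks_horizontal, ks_vertical):
    """Pick the horizontal-mode set, unless primaryKS already is that set."""
    return ks_vertical if primary_ks == ks_horizontal else ks_horizontal


def derive_primary_and_alternative(modes, ks_of_mode, find_other_ks):
    """Scan M_2..M_n for a set that differs from primaryKS; fall back to find_other_ks."""
    primary_ks = ks_of_mode[modes[0]]
    found = False
    alternative_ks = None
    for mode in modes[1:]:
        if ks_of_mode[mode] != primary_ks:
            alternative_ks = ks_of_mode[mode]
            found = True
            break
    if not found:
        alternative_ks = find_other_ks(primary_ks)
    return primary_ks, alternative_ks, found


# Hypothetical mapping: modes 12 and 56 share set 3, mode 40 maps to set 6.
KS_OF_MODE = {12: 3, 56: 3, 40: 6}
result_fallback = derive_primary_and_alternative([12, 56], KS_OF_MODE, find_other_ks_adjacent)
result_found = derive_primary_and_alternative([12, 56, 40], KS_OF_MODE, find_other_ks_adjacent)
```

Passing the fallback as a callable keeps the scan independent of whichever configuration is chosen, so several configurations can be swapped in selectively.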
- the transformation set refers to a transformation set for non-separable transformation, but this is only an example. If a separable transformation (e.g., MTS) is applied to the current block, the transformation set in this embodiment can be understood as being replaced with a transformation set for separable transformation.
- the transformation set for separable transformation can be composed of a plurality of different transformation kernel candidates.
- Each of the transformation kernel candidates can include at least one of a horizontal transformation kernel or a vertical transformation kernel.
- the number of transformation kernel candidates belonging to one transformation set can be 4, 6, or 8.
- a transform set may be selected based on the intra prediction mode of the current block.
- the transform set may be a transform set for non-separable transforms or a transform set for separable transforms.
- the selected transform set may include one or more transform kernel candidates.
- a residual sample of the current block may be generated based on one or more transform kernel candidates belonging to the transform set.
- a transformation set may be mapped for each intra prediction mode. If two or more intra prediction modes are derived for the current block, a transformation set may exist that is mapped to each intra prediction mode. In this case, the two or more transformation sets that are respectively mapped to the two or more intra prediction modes may be identical to each other. At least one of the two or more transformation sets that are respectively mapped to the two or more intra prediction modes may be different from the other one. The two or more transformation sets that are respectively mapped to the two or more intra prediction modes may also be different from each other.
- among the two or more transformation sets, one transformation set is called the main transformation set
- the other transformation set(s) are called alternative transformation set(s).
- the main transformation set and the alternative transformation set may be understood as being replaced by the first transformation set and the second transformation set, respectively.
- a transformation set selected based on an intra prediction mode applied to the prediction process of the current block may be set as a base transformation set.
- the intra prediction mode may refer to an intra prediction mode that specifies reference samples for intra prediction of the current block.
- a DIPM may be derived based on a prediction block of the current block or surrounding samples, and a transformation set selected based on the DIPM may be set as an alternative transformation set.
- the prediction block of the current block may be generated based on the intra prediction mode applied to the prediction process of the current block.
- the DIPM derivation method has been described above, and a detailed description thereof will be omitted here.
- a transformation set selected based on the mode with the largest accumulated intensity value among the DIMD modes derived through the aforementioned DIMD method may be set as the base transformation set.
- a transformation set selected based on the mode with the second largest accumulated intensity value among the DIMD modes may be set as the alternative transformation set.
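Selecting the base and alternative transformation sets from the DIMD modes with the largest and second largest accumulated intensity can be sketched as below. The histogram values, the mode-to-set table, and the function name are hypothetical, and the handling of a single-entry histogram (returning `None`) is an assumption made only for this example.

```python
def base_and_alternative_from_dimd(histogram, ks_of_mode):
    """Rank modes by accumulated intensity and map the top two to transformation sets.

    histogram: mode -> accumulated intensity value
    ks_of_mode: mode -> transformation set index
    """
    ranked = sorted(histogram, key=lambda m: histogram[m], reverse=True)
    base_ks = ks_of_mode[ranked[0]]
    alternative_ks = ks_of_mode[ranked[1]] if len(ranked) > 1 else None
    return base_ks, alternative_ks


# Hypothetical accumulated intensities for three candidate modes.
HISTOGRAM = {34: 120, 18: 300, 50: 80}
KS_OF_MODE = {34: 5, 18: 2, 50: 6}
base_ks, alternative_ks = base_and_alternative_from_dimd(HISTOGRAM, KS_OF_MODE)
```

With these values mode 18 has the largest intensity and mode 34 the second largest, so the sets mapped to those two modes become the base and alternative sets.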
- the base transformation set and/or the alternative transformation set may be determined by specific rules even if there is no corresponding intra prediction mode (or DIPM, DIMD mode).
- the alternative transformation set may be set to a transformation set mapped to a non-directional mode.
- the non-directional mode may be a planar mode or a DC mode.
- the method for determining the base and alternative transformation sets is not limited to the method described above, and the method for determining the base and alternative transformation sets may be separately defined according to various intra prediction methods (e.g., Matrix-based Intra Prediction (MIP), Spatial Geometric Partitioning Mode (SGPM), Intra Template Matching (IntraTMP)).
- MIP: Matrix-based Intra Prediction
- SGPM: Spatial Geometric Partitioning Mode
- IntraTMP: Intra Template Matching
- Which of the base transformation set and the alternative transformation set is applied can be determined implicitly or explicitly. For example, a flag indicating whether the alternative transformation set is applied can be explicitly signaled. If the flag has a first value, this may indicate that the base transformation set is applied. On the other hand, if the flag has a second value, this may indicate that the base transformation set is not applied and that the alternative transformation set is applied instead.
- a non-separable transformation can be performed based on a single transformation kernel candidate selected from among multiple transformation kernel candidates belonging to a transformation set.
- the transformation set may be a transformation set for a separable transformation, in which case each transformation kernel candidate belonging to the transformation set may be defined as a pair of a transformation kernel applied in the horizontal direction and a transformation kernel applied in the vertical direction.
- Any one of the multiple transformation kernel candidates may be determined implicitly or explicitly based on index information signaled through the bitstream.
- a non-separable transform can be applied to at least one of a block encoded based on an intra mode (hereinafter referred to as an intra block) or a block encoded based on an inter mode (hereinafter referred to as an inter block).
- an intra block: a block encoded based on an intra mode
- an inter block: a block encoded based on an inter mode
- I-slice may refer to a slice composed of only intra blocks
- P/B-slice may refer to a slice composed of intra blocks and inter blocks.
- I-slice may be a slice belonging to a picture coded only in intra mode.
- P/B-slice is a slice belonging to a picture coded in inter mode, but an intra block may exist in a P/B-slice.
- a non-separable transformation applied to an intra block is called intra-NST
- a non-separable transformation applied to an inter block is called inter-NST
- Case 1 is when an intra block belonging to an I-slice performs a non-separable transformation based on a base transformation set.
- Case 2 is when an intra block belonging to an I-slice performs a non-separable transformation based on an alternative transformation set.
- Case 3 is when an intra block belonging to a P/B-slice performs a non-separable transformation based on a base transformation set.
- Case 4 is when an intra block belonging to a P/B-slice performs a non-separable transformation based on an alternative transformation set.
- Case 5 is when an inter block belonging to a P/B-slice performs a non-separable transformation based on a base transformation set.
- Case 6 is when an inter block belonging to a P/B-slice performs a non-separable transformation based on an alternative transformation set.
- transformation set # (where # is an integer ranging from 1 to N).
- [B-(m)] can refer to the case of applying transformation set m, and can be denoted as a combination of [A-(n)]/[B-(m)], where n is an integer ranging from 1 to 3, and m is an integer ranging from 1 to N.
- the present disclosure assumes that there are only [B-(1)] and [B-(2)], but the present disclosure can be applied identically/similarly even when there are three or more transformation sets. If there are M applicable transformation sets, there may be overlapping transformation sets among the M transformation sets. Even if overlapping transformation sets exist, the conditions under which they are applied may differ. For example, even when the same transformation set is applied, cases where general intra prediction, such as directional mode, is applied may be treated differently from cases where special intra prediction, such as MIP, EIP, IntraTMP, or TIMD, is applied. In such cases, even if the transformation sets overlap, they can be configured to be treated differently, such as B_(p) and B_(q), where p and q are different.
- the method of applying the transformation kernel can be configured differently depending on the block size.
- all block sizes allowed by the encoding device and the decoding device can be divided into one or more sections.
- the one or more sections can include at least one of: a first section for block sizes smaller than a first size; a second section for block sizes greater than or equal to the first size and smaller than a second size; a third section for block sizes greater than or equal to the second size and smaller than a third size; ...; an L-th section for block sizes greater than or equal to the (L-1)th size and smaller than the L-th size; or an (L+1)th section for block sizes greater than or equal to the L-th size.
- the p-th section can mean a section in which the number of transformation kernel candidates applicable to the current block is p, and p can be an integer greater than or equal to 0 and less than or equal to L.
- all block sizes allowed by the encoding device and the decoding device can be divided into (L+1) sections based on the block area as follows.
- An interval [ a, b ) may mean an interval greater than or equal to a and less than b. If the total number of transformation kernel candidates belonging to a specific transformation set (e.g., a base transformation set, an alternative transformation set) is L, the (L+1) intervals may mean intervals having from 0 to L applicable transformation kernel candidates, respectively. Specifically, an interval [ s_(k), e_(k) ) may represent an interval to which only k transformation kernel candidates among the L transformation kernel candidates are applicable. Here, k may be an integer in the range from 0 to L. In other words, an interval [ s_(k), e_(k) ) may be configured so that the first through the k-th transformation kernel candidates among the L transformation kernel candidates are applicable.
- a specific transformation set (e.g., a base transformation set, an alternative transformation set)
- (L+1) intervals may mean intervals having from 0 to L transformation kernel candidates, respectively.
- the interval [ s_(0), e_(0) ) may be an interval to which no transformation kernel candidate is applied.
- s_(0) may represent the minimum block area to which the non-separable transformation is applicable, and in the present disclosure, the minimum block area is denoted as min_area. For example, if the non-separable transformation is applicable only to blocks in which both the width and the height are greater than or equal to 4, min_area may be 16, and s_(0) may be set to 16.
- the interval [ s_(L), e_(L) ) may be an interval to which L transformation kernel candidates are applicable.
- e_(L) may be set to a value greater than the maximum block area to which the non-separable transformation is applicable.
- max_area: the maximum block area
- e_(L) may be set to a value greater than max_area.
- e_(L) is set to the value of (max_area+1), and the value of (max_area+1) is denoted as max_area_plus_1.
- if s_(k) is equal to e_(k), the interval [ s_(k), e_(k) ) cannot contain any block area value, and thus the interval [ s_(k), e_(k) ) can be considered an empty interval. This may indicate that there are no blocks to which k transformation kernel candidates can be applied.
- when min_area and max_area are implicitly determined according to the combinations described above (or fixed regardless of the combination), the applicable transformation kernel candidates can be specified for every block area by setting the values of e_(0), s_(1), e_(1), s_(2), e_(2), ..., s_(L-1), e_(L-1), s_(L).
- s_(0) and e_(L) can be set to min_area and max_area_plus_1, respectively.
- min_area and max_area_plus_1 can be fixed values regardless of the combination. Alternatively, min_area and max_area_plus_1 can have different values depending on the combination.
- min_area and max_area_plus_1 can have different values depending on whether it is [A-(1)]/[B-(1)], [A-(1)]/[B-(2)], [A-(2)]/[B-(1)], [A-(2)]/[B-(2)], [A-(3)]/[B-(1)], or [A-(3)]/[B-(2)].
- the values of e_(0), s_(1), e_(1), s_(2), e_(2), ..., s_(L-1), e_(L-1), s_(L) can form a non-decreasing sequence.
- [ s_(0), e_(0) ), [ s_(1), e_(1) ), ... , [ s_(L-1), e_(L-1) ), [ s_(L), e_(L) ) are called an interval sequence.
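A lookup over such a sequence of half-open intervals can be sketched as follows. The boundary values (min_area of 16 and max_area_plus_1 of 1025) and the function names are hypothetical; the contiguity check, in which each e_(k) equals s_(k+1), is an assumption made so that every block area falls into exactly one interval.

```python
def num_applicable_candidates(area, intervals):
    """Return k such that s_k <= area < e_k, where intervals[k] = (s_k, e_k).

    Returns None when the area lies outside [min_area, max_area_plus_1).
    """
    for k, (s, e) in enumerate(intervals):
        if s <= area < e:
            return k
    return None


def is_valid_interval_sequence(intervals):
    """Check that boundaries are non-decreasing and neighboring intervals touch."""
    flat = [bound for interval in intervals for bound in interval]
    non_decreasing = all(a <= b for a, b in zip(flat, flat[1:]))
    contiguous = all(
        intervals[k][1] == intervals[k + 1][0] for k in range(len(intervals) - 1)
    )
    return non_decreasing and contiguous


# L = 3 candidates, so there are four intervals for 0, 1, 2, and 3 candidates.
RS = [(16, 128), (128, 256), (256, 512), (512, 1025)]

# Same layout, but the interval for k = 1 is empty (s_1 == e_1).
RS_EMPTY_K1 = [(16, 64), (64, 64), (64, 256), (256, 1025)]
```

For example, an 8x8 block (area 64) gets zero candidates under `RS`, but under `RS_EMPTY_K1` it skips the empty interval and lands directly in the interval for two candidates.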
- configuring a larger number of applicable transformation kernel candidates can be advantageous from a performance-complexity trade-off perspective.
- a larger block area increases the statistical diversity represented by the block data, so a small number of applicable transformation kernel candidates may not sufficiently decorrelate the block data. Therefore, configuring a larger number of applicable transformation kernel candidates for a larger block area can be advantageous for performance improvement.
- configuring a large number of applicable transformation kernel candidates for a small block area with low statistical diversity can increase the signaling cost of selecting a particular transformation kernel candidate relative to the decorrelation capability.
- Therefore, configuring a smaller number of applicable transformation kernel candidates for a smaller block area can be advantageous from a performance-complexity trade-off perspective.
- an interval sequence applied to each combination can be defined. At least one of the six combinations can have an interval sequence different from the other combinations. Some of the six combinations can have the same interval sequence.
- the same interval sequence (RS1) can be defined for the combinations [A-(1)]/[B-(1)] and [A-(2)]/[B-(1)]
- the same interval sequence (RS2) can be defined for the combinations [A-(1)]/[B-(2)] and [A-(2)]/[B-(2)].
- RS1 can be a different interval sequence from RS2.
- the same interval sequence can be defined for all six combinations.
- the six combinations can be grouped into multiple groups, and an interval sequence can be defined for each group. The same interval sequence can be applied to combinations belonging to the same group.
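Grouping the six combinations and sharing an interval sequence within each group can be sketched as a lookup table. The sketch assumes, following the order of Cases 1 to 6, that [A-(1)] is an intra block in an I-slice, [A-(2)] an intra block in a P/B-slice, [A-(3)] an inter block in a P/B-slice, [B-(1)] the base set, and [B-(2)] the alternative set. The interval values and the sequence `RS3` for inter blocks are hypothetical.

```python
def combination(slice_is_intra_only, block_is_intra, use_alternative_set):
    """Map block properties to an ([A-(n)], [B-(m)]) combination label."""
    if block_is_intra:
        a = "A1" if slice_is_intra_only else "A2"
    else:
        a = "A3"  # inter blocks only occur in P/B-slices
    b = "B2" if use_alternative_set else "B1"
    return a, b


# Hypothetical interval sequences; each tuple is (s_k, e_k) for k candidates.
RS1 = ((16, 128), (128, 256), (256, 512), (512, 1025))
RS2 = ((16, 64), (64, 256), (256, 512), (512, 1025))
RS3 = ((16, 32), (32, 256), (256, 512), (512, 1025))

# Combinations in the same group share one interval sequence.
INTERVAL_SEQUENCE = {
    ("A1", "B1"): RS1,
    ("A2", "B1"): RS1,
    ("A1", "B2"): RS2,
    ("A2", "B2"): RS2,
    ("A3", "B1"): RS3,
    ("A3", "B2"): RS3,
}

sequence = INTERVAL_SEQUENCE[combination(True, True, False)]
```

Defining the same sequence for all six combinations corresponds to mapping every key to one tuple, while fully separate sequences correspond to six distinct tuples.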
- the total number of transformation kernel candidates (i.e., L) that constitute the transformation set can be defined identically.
- each combination can have a unique value for L.
- the interval sequence of [ min_area, max_area_plus_1 ), [ max_area_plus_1, max_area_plus_1 ), ... , [ max_area_plus_1, max_area_plus_1 ) may be a configuration to which no transformation kernel is applied.
- the interval sequence of [ min_area, min_area ), [ min_area, min_area ), ... , [ min_area, max_area_plus_1 ) can be a configuration in which all transformation kernel candidates belonging to the transformation set are applicable to all block areas.
- p may be greater than or equal to 0 and less than or equal to L.
- p may be greater than or equal to 0 and less than or equal to L.
- each interval sequence can indicate that 0, 1, 2, and 3 transformation kernel candidates are applicable, respectively, to the four intervals belonging to the interval sequence, from left to right.
- the number of applicable transformation kernel candidates for blocks with a block area smaller than 128 may be 0. This may mean that no transformation kernel is applied to blocks with a block area smaller than 128.
- the number of applicable transformation kernel candidates for blocks with a block area greater than or equal to 128 and less than 256 may be 1.
- the number of applicable transformation kernel candidates for blocks with a block area greater than or equal to 256 and less than 512 may be 2.
- the number of applicable transformation kernel candidates for blocks with a block area greater than or equal to 512 may be 3.
- the number of applicable transformation kernel candidates for a block with a block area smaller than 64 may be 0.
- the number of applicable transformation kernel candidates for a block with a block area greater than or equal to 64 and less than 256 may be 1.
- the number of applicable transformation kernel candidates for a block with a block area greater than or equal to 256 and less than 512 may be 2.
- the number of applicable transformation kernel candidates for a block with a block area greater than or equal to 512 may be 3.
- the number of applicable transformation kernel candidates for a block with a block area less than 32 may be 0.
- the number of applicable transformation kernel candidates for a block with a block area greater than or equal to 32 and less than 256 may be 1.
- the number of applicable transformation kernel candidates for a block with a block area greater than or equal to 256 and less than 512 may be 2.
- the number of applicable transformation kernel candidates for a block with a block area greater than or equal to 512 may be 3.
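The three threshold layouts listed above (first boundary at 128, 64, or 32, followed by 256 and 512) can be evaluated with a binary search over the interval end points. The sketch assumes the block area is already within the range where the non-separable transformation is allowed; the names are hypothetical.

```python
from bisect import bisect_right


def candidates_from_thresholds(area, thresholds):
    """Return the number of applicable kernel candidates for a block area.

    thresholds holds e_0, e_1, ..., e_(L-1) in non-decreasing order, so the
    result k satisfies thresholds[k-1] <= area < thresholds[k].
    """
    return bisect_right(thresholds, area)


THRESHOLDS_128 = [128, 256, 512]
THRESHOLDS_64 = [64, 256, 512]
THRESHOLDS_32 = [32, 256, 512]

# An 8x8 block has area 64; a 4x8 block has area 32.
counts_8x8 = [candidates_from_thresholds(64, t)
              for t in (THRESHOLDS_128, THRESHOLDS_64, THRESHOLDS_32)]
counts_4x8 = [candidates_from_thresholds(32, t)
              for t in (THRESHOLDS_128, THRESHOLDS_64, THRESHOLDS_32)]
```

A repeated threshold, such as `[64, 256, 256]`, produces an empty interval, so that count (here 2) is never returned.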
- the number of applicable transformation kernel candidates is determined as one, two, or three depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is zero.
- the number of applicable transformation kernel candidates may be one.
- the number of applicable transformation kernel candidates may be two.
- the number of applicable transformation kernel candidates may be three.
- the number of applicable transformation kernel candidates is determined as 0, 1, or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 2.
- the number of applicable transformation kernel candidates may be 0.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as 0, 1, or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 2.
- the number of applicable transformation kernel candidates may be 0.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as 0, 1, or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 2.
- the number of applicable transformation kernel candidates may be 0.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as either 1 or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 0 or 2.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as 0, 1, or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 2.
- the number of applicable transformation kernel candidates may be 0.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as 0, 1, or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 2.
- the number of applicable transformation kernel candidates may be 0.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as 0, 1, or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 2.
- the number of applicable transformation kernel candidates may be 0.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as either 1 or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 0 or 2.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as 0, 1, or 2 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 3.
- the number of applicable transformation kernel candidates may be 0.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates is determined as 0, 1, or 2 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 3.
- the number of applicable transformation kernel candidates may be 0.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates is determined as 0, 1, or 2 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 3.
- the number of applicable transformation kernel candidates may be 0.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates is determined as either 1 or 2 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 0 or 3.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates is determined as 0, 1, or 2 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 3.
- the number of applicable transformation kernel candidates may be 0.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates is determined as 0, 1, or 2 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 3.
- the number of applicable transformation kernel candidates may be 0.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates is determined as 0, 1, or 2 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 3.
- the number of applicable transformation kernel candidates may be 0.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates is determined as either 1 or 2 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 0 or 3.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates is determined as either 0 or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 1 or 2.
- the number of applicable transformation kernel candidates may be 0. That is, a transformation kernel may not be applied to a block with a block area smaller than 256.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as either 0 or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 1 or 2.
- the number of applicable transformation kernel candidates may be 0. That is, no transformation kernel may be applied to blocks with a block area smaller than 128.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as either 0 or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 1 or 2.
- the number of applicable transformation kernel candidates may be 0. That is, no transformation kernel may be applied to a block with a block area smaller than 64.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as either 0 or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 1 or 2.
- the number of applicable transformation kernel candidates may be 0. That is, no transformation kernel may be applied to a block with a block area smaller than 32.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates for all block areas may be 0. That is, no transformation kernel may be applied to any block.
- the number of applicable transformation kernel candidates for all block areas can be 1.
- the number of applicable transformation kernel candidates for all block areas can be 2.
- the number of applicable transformation kernel candidates for all block areas can be 3.
- the number of applicable transformation kernel candidates is determined as either 2 or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 0 or 1.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as either 2 or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 0 or 1.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as either 2 or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 0 or 1.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as either 2 or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 0 or 1.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as either 0 or 1 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 2 or 3.
- the number of applicable transformation kernel candidates may be 0.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates is determined as either 0 or 1 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 2 or 3.
- the number of applicable transformation kernel candidates may be 0.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates is determined as either 0 or 1 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 2 or 3.
- the number of applicable transformation kernel candidates may be 0.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates is determined as either 0 or 1 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 2 or 3.
- the number of applicable transformation kernel candidates may be 0.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates is determined as either 0 or 2 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 1 or 3. For blocks with a block area smaller than 256, the number of applicable transformation kernel candidates may be 0. For blocks with a block area greater than or equal to 256, the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates is determined as either 0 or 2 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 1 or 3. For blocks with a block area smaller than 128, the number of applicable transformation kernel candidates may be 0. For blocks with a block area greater than or equal to 128, the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates is determined as either 0 or 2 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 1 or 3. For blocks with a block area smaller than 64, the number of applicable transformation kernel candidates may be 0. For blocks with a block area greater than or equal to 64, the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates is determined as either 0 or 2 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 1 or 3. For blocks with a block area smaller than 32, the number of applicable transformation kernel candidates may be 0. For blocks with a block area greater than or equal to 32, the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates is determined as 0, 2, or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 1.
- the number of applicable transformation kernel candidates may be 0.
- No transformation kernel may be applied to a block with a block area smaller than 128.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as 0, 2, or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 1.
- the number of applicable transformation kernel candidates may be 0 for a block with a block area smaller than 64. No transformation kernel may be applied to a block with a block area smaller than 64.
- the number of applicable transformation kernel candidates may be 2 for a block with a block area greater than or equal to 64 and less than 256.
- the number of applicable transformation kernel candidates may be 3 for a block with a block area greater than or equal to 256.
- the number of applicable transformation kernel candidates is determined as 0, 2, or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 1.
- the number of applicable transformation kernel candidates may be 0.
- No transformation kernel may be applied to a block with a block area smaller than 32.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as 0, 2, or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 1.
- the number of applicable transformation kernel candidates may be 0.
- No transformation kernel may be applied to a block with a block area smaller than 16.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as 0, 2, or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 1.
- the number of applicable transformation kernel candidates may be 0.
- No transformation kernel may be applied to a block with a block area smaller than 128.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as 0, 2, or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 1.
- the number of applicable transformation kernel candidates may be 0 for a block with a block area smaller than 64. No transformation kernel may be applied to a block with a block area smaller than 64.
- the number of applicable transformation kernel candidates may be 2 for a block with a block area greater than or equal to 64 and less than 512.
- the number of applicable transformation kernel candidates may be 3 for a block with a block area greater than or equal to 512.
- the number of applicable transformation kernel candidates is determined as 0, 2, or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 1.
- the number of applicable transformation kernel candidates may be 0.
- No transformation kernel may be applied to a block with a block area smaller than 32.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as 0, 2, or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 1.
- the number of applicable transformation kernel candidates may be 0.
- No transformation kernel may be applied to a block with a block area smaller than 16.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as either 2 or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 0 or 1.
- the number of applicable transformation kernel candidates may be 2.
- the number of applicable transformation kernel candidates may be 3.
- the number of applicable transformation kernel candidates is determined as 0, 1, or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 2.
- the number of applicable transformation kernel candidates may be 0.
- No transformation kernel may be applied to a block with a block area smaller than 16.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates may be 3.
- when min_area is 16, this may be the same as interval sequence 12.
- the number of applicable transformation kernel candidates can be determined as 0, 1, 2, or 3 depending on the block area.
- the number of applicable transformation kernel candidates can be 0.
- No transformation kernel may be applied to a block with a block area smaller than 16.
- the number of applicable transformation kernel candidates can be 1.
- the number of applicable transformation kernel candidates can be 2.
- the number of applicable transformation kernel candidates can be 3.
- when min_area is 16, this can be the same as interval sequence 4.
- the number of applicable transformation kernel candidates is determined as 0, 1, or 3 depending on the block area, and there may not be a case where the number of applicable transformation kernel candidates is 2.
- the number of applicable transformation kernel candidates may be 0.
- No transformation kernel may be applied to a block with a block area smaller than 16.
- the number of applicable transformation kernel candidates may be 1.
- the number of applicable transformation kernel candidates may be 3.
- when min_area is 16, this may be the same as interval sequence 8.
- any one of the interval sequences 1 to 40 can be defined. Various examples of interval sequences for the six combinations are described below.
- the current block is an intra block and the transformation set of the current block is determined as an alternative transformation set
- no transformation kernel is applied to the current block with a block area smaller than 256
- the number of applicable transformation kernel candidates can be three for the current block with a block area greater than or equal to 256.
- the number of applicable transformation kernel candidates for the current block can be three, regardless of the block area of the current block.
- the current block is an intra block belonging to an I-Slice and the transformation set of the current block is determined as an alternative transformation set
- no transformation kernel is applied to the current block with a block area smaller than 256
- the number of applicable transformation kernel candidates for the current block with a block area greater than or equal to 256 may be three.
- the number of applicable transformation kernel candidates for the current block may be three.
- the number of transformation kernel candidates applicable to the current block may be 1, regardless of the block area of the current block. In other combinations, the number of transformation kernel candidates applicable to the current block may be 3, regardless of the block area of the current block.
- the interval sequence applicable to the current block may be any one of the interval sequences 1 to 40 described above.
- the combination of [A-(1)]/[B-(2)] may have a different interval sequence from other combinations.
- the number of transformation kernel candidates applicable to the current block may be 3.
- the interval sequence applicable to the current block may be any one of the interval sequences 1 to 40 described above.
- the combinations of [A-(1)]/[B-(2)] and [A-(2)]/[B-(2)] may have different interval sequences from other combinations.
- the number of transformation kernel candidates applicable to the current block may be 3.
- the number of applicable transformation kernel candidates for the current block with a block area less than 256 may be 2, and the number of applicable transformation kernel candidates for the current block with a block area greater than or equal to 256 may be 3. In other combinations, the number of applicable transformation kernel candidates for the current block may be 3, regardless of the block area of the current block.
- the current block is an intra block belonging to an I-Slice and the transformation set of the current block is determined as an alternative transformation set
- no transformation kernel may be applied to a current block having a block area smaller than A
- the number of applicable transformation kernel candidates for a current block having a block area greater than or equal to A may be 1.
- the value of A may be 32, 64, 128, 256, 512, or a larger integer. If the current block is an intra block belonging to a P/B-Slice and an alternative transformation set is used, the number of applicable transformation kernel candidates for the current block may be 3, regardless of the block area of the current block.
- the current block is an intra block belonging to a P/B-Slice and the transformation set of the current block is determined as an alternative transformation set
- no transformation kernel may be applied to the current block having a block area smaller than B
- the number of applicable transformation kernel candidates may be 1 for the current block having a block area greater than or equal to B.
- the value of B may be 32, 64, 128, 256, 512, or a larger integer.
- the number of applicable transformation kernel candidates for the current block may be determined as either 2 or 3, depending on whether the block area of the current block is less than 256. For other combinations, the number of applicable transformation kernel candidates for the current block may be 3, regardless of the block area of the current block.
- the number of applicable transformation kernel candidates for the current block with a block area smaller than A may be 1, and the number of applicable transformation kernel candidates for the current block with a block area greater than or equal to A may be 3.
- the value of A may be 32, 64, 128, 256, 512, or a larger integer.
- the number of applicable transformation kernel candidates for the current block with a block area less than 256 may be 2, and the number of applicable transformation kernel candidates for the current block with a block area greater than or equal to 256 may be 3.
- the number of applicable transformation kernel candidates for the current block may be 3, regardless of the block area of the current block.
- the number of applicable transformation kernel candidates for the current block with a block area smaller than A may be 2, and the number of applicable transformation kernel candidates for the current block with a block area greater than or equal to A may be 3.
- the value of A may be 32, 64, 128, 256, 512, or a larger integer.
- the number of applicable transformation kernel candidates for the current block with a block area less than 256 may be 2, and the number of applicable transformation kernel candidates for the current block with a block area greater than or equal to 256 may be 3.
- the number of applicable transformation kernel candidates for the current block may be 3.
- the interval sequence for any one of [A-(1)]/[B-(1)], [A-(1)]/[B-(2)], [A-(2)]/[B-(1)], [A-(2)]/[B-(2)], [A-(3)]/[B-(1)], or [A-(3)]/[B-(2)] may be identical to the interval sequence for at least one of the others.
- the interval sequence for any one of [A-(1)]/[B-(1)], [A-(1)]/[B-(2)], [A-(2)]/[B-(1)], [A-(2)]/[B-(2)], [A-(3)]/[B-(1)], or [A-(3)]/[B-(2)] may be different from the interval sequence for at least one of the others.
- HLS (high-level syntax)
- a fixed configuration may mean that the configuration for a pre-defined interval sequence in the codec system does not change.
- An HLS-based configuration may mean that the interval sequence is configured based on syntax elements signaled in a high-level syntax table of the bitstream (e.g., SPS, PPS, PH, SH).
- the interval sequence for each of the six combinations may be configured based on either the fixed configuration or the HLS-based configuration. For example, all interval sequences for the six combinations may be based on the fixed configuration. Alternatively, all interval sequences for the six combinations may be based on the HLS-based configuration. Alternatively, some interval sequences for the six combinations may be based on the fixed configuration, and the rest may be based on the HLS-based configuration.
- the interval sequence can be set based on syntax elements as shown in Table 29 below.
- Table 29 assumes that the interval sequence consists of a total of 4 intervals.
- the syntax elements can be defined as sps_alt_lfnst_nspt_set_cfg_flag[ k ] and sps_alt_lfnst_nspt_set_range[ k ].
- k can have a value from 0 to (M-1).
- the syntax elements related to setting the interval sequence can be signaled through the sequence parameter set (SPS). However, this is only an example, and the syntax elements may instead be signaled through at least one of the picture parameter set (PPS), the picture header (PH), or the slice header (SH).
- sps_alt_lfnst_nspt_set_cfg_flag[ k ] can have a value of 0 or 1. If the value of sps_alt_lfnst_nspt_set_cfg_flag[ k ] is 0, the value of e_(k) of the k-th interval can be set to a value equal to max_area_plus_1. In this case, the subsequent intervals (i.e., intervals from the (k+1)-th interval) can be empty intervals with length 0.
- If the value of sps_alt_lfnst_nspt_set_cfg_flag[ k ] is 1, the values of e_(k) and s_(k+1) can be determined based on sps_alt_lfnst_nspt_set_range[ k ].
- the value of sps_alt_lfnst_nspt_set_range[ k ] can be a value in a logarithmic scale with a base of 2.
- the values of e_(k) and s_(k+1) can be set as in Equation 16 below.
- log2(x) is a function that returns the base-2 logarithm of x. The log2 operation does not have to be performed each time Equation 16 is evaluated.
- the value of log2(e_(k)) can be calculated by accumulating the values of sps_alt_lfnst_nspt_set_range[ k ], starting from the base-2 logarithm (i.e., log2(min_area)) of the smallest block area (i.e., min_area) to which the non-separable transformation can be applied.
- the actual block-area values e_(k) and s_(k+1) can be calculated by left-shifting the value 1, as in Equation 16.
- e_(k) becomes equal to s_(k)
- the k-th interval becomes an empty interval with a length of 0, and the block area belonging to the interval does not exist.
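The boundary derivation described above can be illustrated with a short Python sketch. It is not the equation or table of the specification, which is not reproduced in this text. It assumes half-open intervals [s_(k), e_(k)), s_(0) equal to min_area, a power-of-two min_area, signaled boundaries that do not exceed max_area_plus_1, and a last interval that always ends at max_area_plus_1. The function name `derive_intervals` and its parameter names are hypothetical.

```python
def derive_intervals(cfg_flags, ranges, min_area, max_area_plus_1, num_intervals=4):
    """Return a list of (s_k, e_k) pairs, one per interval.

    cfg_flags[k] and ranges[k] stand in for sps_alt_lfnst_nspt_set_cfg_flag[k]
    and sps_alt_lfnst_nspt_set_range[k] (assumed already parsed).
    """
    if min_area <= 0 or min_area & (min_area - 1):
        raise ValueError("min_area is assumed to be a power of two")
    log2_end = min_area.bit_length() - 1  # log2(min_area), computed once
    start = min_area                      # assumption: s_0 == min_area
    closed = False
    intervals = []
    for k in range(num_intervals):
        if closed:
            # Intervals after a flag of 0 are empty (length 0).
            intervals.append((max_area_plus_1, max_area_plus_1))
            continue
        if k == num_intervals - 1 or cfg_flags[k] == 0:
            end = max_area_plus_1
            closed = True
        else:
            log2_end += ranges[k]  # accumulate in the log2 domain
            end = 1 << log2_end    # back to an actual block area
        intervals.append((start, end))
        start = end                # s_(k+1) == e_(k)
    return intervals
```

With `min_area = 16`, flags `[1, 1, 1]` and ranges `[2, 2, 1]`, the boundaries fall at 64, 256, and 512. A range value of 0 makes e_(k) equal to s_(k), which yields an empty interval.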
- Table 30 shows pseudo code that derives the number of transformation kernel candidates applicable to the current block by comparing the block area (area) of the current block with at least one interval.
- Table 30 shows the pseudo code for the case where the maximum number of applicable transformation kernel candidates is 3 (i.e., when L is 3), but it can be easily modified and applied according to the rules defined in Table 30 even when the maximum number of applicable transformation kernel candidates is greater than or less than 3.
- areaBoundaryValid[ k ] and areaBoundary[ k ] in Table 30 can be derived based on the associated syntax elements as shown in Table 31 below.
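Since the pseudo code of Table 30 is not reproduced in this text, the following Python sketch shows only one plausible form of the comparison. It assumes that areaBoundary[ k ] holds the upper bound e_(k) of the k-th interval in non-decreasing order, that areaBoundaryValid[ k ] is false once a configuration flag of 0 has been met, and that the k-th interval corresponds to k candidates. The function name `count_candidates` is hypothetical.

```python
def count_candidates(area, area_boundary_valid, area_boundary):
    """Count the valid boundaries that `area` has reached.

    The result is 0 below the first boundary and grows by one for each
    boundary crossed, up to len(area_boundary) (3 when L is 3).
    """
    count = 0
    for valid, boundary in zip(area_boundary_valid, area_boundary):
        # Boundaries are assumed non-decreasing, so the scan can stop at
        # the first invalid or not-yet-reached boundary.
        if not valid or area < boundary:
            break
        count += 1
    return count
```

Two equal boundaries model an empty interval: with boundaries 16, 256, and 256, an area of 256 or more jumps from one candidate straight to three.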
- sps_alt_lfnst_nspt_set_range[ k ] can be coded with a fixed-length code of a specific number of bits, such as 3 bits or 4 bits, or with a variable-length code, such as an Exponential-Golomb code.
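For the variable-length option, a minimal Python sketch of decoding one unsigned Exponential-Golomb codeword is given below. It assumes a 0th-order code of the kind used for ue(v) syntax elements in common video coding standards; the text does not state the order of the code. The bitstream is modeled as a string of '0' and '1' characters, and the function name `decode_ue` is hypothetical.

```python
def decode_ue(bits, pos=0):
    """Decode one 0th-order unsigned Exp-Golomb codeword starting at `pos`.

    Returns (value, next_pos), where next_pos is the index of the first
    bit after the codeword.
    """
    zeros = 0
    # Count the leading zero bits that precede the marker '1'.
    while pos + zeros < len(bits) and bits[pos + zeros] == "0":
        zeros += 1
    start = pos + zeros      # index of the marker '1'
    end = start + zeros + 1  # marker plus `zeros` suffix bits
    if end > len(bits):
        raise ValueError("truncated Exp-Golomb codeword")
    # The marker and suffix read as a binary number equal value + 1.
    return int(bits[start:end], 2) - 1, end
```

Under this code a range value of 0 costs a single bit, which suits an element whose typical values are small. A fixed-length alternative would simply read 3 or 4 bits with `int(bits[pos:pos + n], 2)`.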
- interval sequences for two or more combinations can also be set. For example, based on the syntax elements, interval sequences for the combinations of ([A-(1)]/[B-(2)]) and ([A-(2)]/[B-(2)]) can be set to be identical.
- the six combinations can be divided into multiple groups.
- the interval sequence can be set based on the fixed configuration, and for the other group (hereinafter referred to as the second group), the interval sequence can be set based on the HLS-based configuration.
- because the first group has a fixed interval sequence pre-defined in the encoding device and the decoding device, there is no need to define a syntax element for setting the interval sequence for the first group.
- a syntax element for setting the interval sequence can be signaled for each combination.
- the plurality of combinations belonging to the second group can have different interval sequences.
- an interval sequence can be determined based on the signaled syntax element for the second group, and the determined interval sequence can be applied equally to the plurality of combinations belonging to the second group.
- the plurality of combinations belonging to the second group can be divided into a plurality of subgroups.
- a syntax element for setting the interval sequence can be signaled for each subgroup.
- the subgroups belonging to the second group can have different interval sequences, and the combinations belonging to the same subgroup can have the same interval sequence.
- the interval sequence can be set based on the HLS-based configuration for the six combinations.
- the six combinations can be divided into multiple groups, and syntax elements related to the interval sequence setting can be signaled for each group. Different interval sequences can be set for the multiple groups. The same interval sequence can be set for one or more combinations belonging to the same group.
- the interval sequence can be set based on the fixed configuration for the six combinations.
- the six combinations can be divided into multiple groups, and a fixed interval sequence can be defined for each group. Different interval sequences can be set for the multiple groups. The same interval sequence can be set for one or more combinations within the same group.
- a fixed interval sequence can be set for [A-(1)]/[B-(1)], [A-(2)]/[B-(1)], [A-(3)]/[B-(1)], and [A-(3)]/[B-(2)], respectively.
- a syntax element for setting the interval sequence can be signaled for [A-(1)]/[B-(2)] and [A-(2)]/[B-(2)].
- the interval sequence determined based on the signaled syntax element can be set identically for [A-(1)]/[B-(2)] and [A-(2)]/[B-(2)].
- fixed interval sequences may be set for [A-(1)]/[B-(1)], [A-(2)]/[B-(1)], [A-(3)]/[B-(1)], and [A-(3)]/[B-(2)], respectively.
- Syntax elements as shown in Table 29 may be signaled for [A-(1)]/[B-(2)] and [A-(2)]/[B-(2)], and interval sequences for the corresponding combinations may be set based on the signaled syntax elements.
- fixed interval sequences can be set for [A-(1)]/[B-(1)], [A-(2)]/[B-(1)], and [A-(3)]/[B-(1)], respectively.
- a syntax element for setting the interval sequence can be signaled for [A-(1)]/[B-(2)] and [A-(2)]/[B-(2)].
- the interval sequence determined based on the signaled syntax element can be set identically for [A-(1)]/[B-(2)] and [A-(2)]/[B-(2)].
- a separate syntax element can be defined and signaled for [A-(3)]/[B-(2)].
- the interval sequence for [A-(3)]/[B-(2)] can be set based on the syntax element.
- a fixed interval sequence can be set for each of the six combinations.
- the six combinations can be divided into multiple groups.
- the multiple groups can include a first group composed of [A-(1)]/[B-(1)] and [A-(2)]/[B-(1)], a second group composed of [A-(1)]/[B-(2)] and [A-(2)]/[B-(2)], a third group composed of [A-(3)]/[B-(1)], and a fourth group composed of [A-(3)]/[B-(2)].
- a syntax element as shown in Table 29 can be defined and signaled, and an interval sequence for the group can be set based on the signaled syntax element.
- the multiple groups can have different interval sequences, and the same interval sequence can be set for combinations belonging to the same group.
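The four-group arrangement described above can be sketched as a lookup from a combination to its group, and from a group to its interval sequence. This is only an illustrative sketch: the names `GROUP_OF`, `interval_sequence_for`, and `EXAMPLE_SEQUENCES`, as well as all numeric interval bounds, are hypothetical and are not taken from this disclosure.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open interval [start, end) over block areas

# Group membership following the four-group example in the text.
GROUP_OF: Dict[str, int] = {
    "[A-(1)]/[B-(1)]": 1,
    "[A-(2)]/[B-(1)]": 1,
    "[A-(1)]/[B-(2)]": 2,
    "[A-(2)]/[B-(2)]": 2,
    "[A-(3)]/[B-(1)]": 3,
    "[A-(3)]/[B-(2)]": 4,
}


def interval_sequence_for(
    combination: str, per_group: Dict[int, List[Interval]]
) -> List[Interval]:
    """Return the interval sequence shared by every combination in the same group."""
    return per_group[GROUP_OF[combination]]


# Hypothetical per-group sequences (placeholder numbers). Each could come from
# a fixed setting or from a signaled syntax element; the lookup is the same.
EXAMPLE_SEQUENCES: Dict[int, List[Interval]] = {
    1: [(16, 64), (64, 256), (256, 1024), (1024, 4097)],
    2: [(16, 32), (32, 128), (128, 512), (512, 4097)],
    3: [(16, 16), (16, 256), (256, 2048), (2048, 4097)],
    4: [(16, 16), (16, 16), (16, 1024), (1024, 4097)],
}
```

Combinations in the same group resolve to the same sequence, while different groups may resolve to different ones.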
- the six combinations can be divided into multiple groups. For example, the six combinations could be organized into one group. Alternatively, the six combinations could be divided into two, three, four, or five groups. Alternatively, each of the six combinations could form a group.
- a flag may be defined for each group.
- the flag may be used to determine the number of transformation kernel candidates applicable to the current block.
- the flag may indicate whether a default interval sequence is applied to combinations belonging to the group. For example, if the flag for a specific group is the first value (1 or 0), the interval sequence corresponding to each combination may be applied to combinations belonging to the group. Conversely, if the flag for a specific group is the second value (0 or 1), the default interval sequence may be applied to combinations belonging to the group.
- the default interval sequence can be set for each group or for each combination within the same group. Alternatively, the default interval sequence can be set the same for all groups or for all combinations within the same group.
- the default interval sequence may be an interval sequence in which the number of applicable transformation kernel candidates is three for all block areas (e.g., [ min_area, min_area ), [ min_area, min_area ), [ min_area, min_area ), [ min_area, max_area_plus_1 ) ).
- the default interval sequence may be an interval sequence in which no transformation kernel is applied for all block areas (e.g., [ min_area, max_area_plus_1 ), [ max_area_plus_1, max_area_plus_1 ), [ max_area_plus_1, max_area_plus_1 ), [ max_area_plus_1, max_area_plus_1 ) ).
- the above flag can be signaled via the bitstream.
- the above flag can be signaled via a higher level syntax table such as SPS, PPS, PH, SH of the bitstream.
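The relationship between a group flag, an interval sequence, and the number of transformation kernel candidates can be sketched as follows. The sketch assumes that the i-th interval (counting from 0) corresponds to i applicable candidates, which is inferred from the two default examples above and is not stated explicitly. The values of `MIN_AREA` and `MAX_AREA_PLUS_1`, the sequence `own`, and the function names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open interval [start, end) over block areas

MIN_AREA = 16            # hypothetical smallest block area
MAX_AREA_PLUS_1 = 4097   # hypothetical largest block area plus one


def num_kernel_candidates(area: int, intervals: List[Interval]) -> int:
    """Return the index of the interval containing `area`.

    Assumption: interval i means that i kernel candidates are applicable.
    Empty intervals such as [a, a) never match.
    """
    for count, (start, end) in enumerate(intervals):
        if start <= area < end:
            return count
    raise ValueError("block area is outside every interval")


# Default sequence in which three candidates apply to all block areas.
DEFAULT_ALL_THREE = [(MIN_AREA, MIN_AREA)] * 3 + [(MIN_AREA, MAX_AREA_PLUS_1)]
# Default sequence in which no kernel applies to any block area.
DEFAULT_NONE = [(MIN_AREA, MAX_AREA_PLUS_1)] + [(MAX_AREA_PLUS_1, MAX_AREA_PLUS_1)] * 3


def select_sequence(group_flag: int, own: List[Interval],
                    default: List[Interval], first_value: int = 1) -> List[Interval]:
    """Use the combination's own sequence when the flag has the first value,
    otherwise fall back to the default sequence."""
    return own if group_flag == first_value else default
```

For example, with a hypothetical sequence `own = [(16, 64), (64, 256), (256, 1024), (1024, 4097)]`, a block of area 64 falls into the second interval, so one candidate would apply under the stated assumption.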
- the inverse transform of the current block may be an inverse separable linear transform and/or an LFNST-based inverse transform. That is, an inverse LFNST may be applied to all or part of the (inverse-quantized) transform coefficients of the current block, and then an inverse separable linear transform may be applied to the transform coefficients derived through the inverse LFNST to derive residual samples.
- an inverse LFNST may be applied to the (inverse-quantized) transform coefficients belonging to a certain region of the current block.
- the certain region means the region to which the forward LFNST is applied, and is hereinafter referred to as a region of interest (ROI).
- the transform coefficients derived through the inverse LFNST may be arranged in the ROI according to a predetermined scanning order.
- the predetermined scanning order may be a row-first order or a column-first order.
- an inverse separable linear transform may be applied to the transform coefficients derived through the inverse LFNST and to the transform coefficients belonging to the remaining area of the current block outside the ROI.
- an inverse separable linear transform may be applied to the transform coefficients derived through the inverse LFNST.
- the size of the ROI may be determined based on at least one of the width or the height of the current block.
- the size of the ROI may mean at least one of the width or the height of the ROI, or may mean the number of sample positions belonging to the ROI.
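The two-stage inverse transform described above can be sketched in a simplified form. The sketch makes several assumptions that are not part of this disclosure: the inverse secondary kernel is a square matrix whose size equals the number of ROI positions (a real LFNST maps a reduced number of coefficients to a larger ROI), the ROI is scanned in row-first order, and the separable bases are orthonormal so that their transposes act as inverses. All function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a and b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def inverse_secondary_on_roi(coeffs: Matrix, roi_w: int, roi_h: int,
                             inv_kernel: Matrix) -> Matrix:
    """Apply the inverse secondary kernel to the top-left ROI only."""
    # Gather the ROI coefficients in row-first order.
    roi_vec = [coeffs[y][x] for y in range(roi_h) for x in range(roi_w)]
    out = [sum(row[j] * roi_vec[j] for j in range(len(roi_vec)))
           for row in inv_kernel]
    # Write the results back into the ROI in the same row-first order;
    # coefficients outside the ROI are left unchanged.
    result = [row[:] for row in coeffs]
    for i, value in enumerate(out):
        result[i // roi_w][i % roi_w] = value
    return result


def inverse_separable(coeffs: Matrix, col_basis: Matrix, row_basis: Matrix) -> Matrix:
    """Inverse of coeffs = col_basis * X * row_basis^T for orthonormal bases."""
    return matmul(matmul(transpose(col_basis), coeffs), row_basis)


def inverse_transform(coeffs: Matrix, roi_w: int, roi_h: int, inv_kernel: Matrix,
                      col_basis: Matrix, row_basis: Matrix) -> Matrix:
    """Inverse secondary transform on the ROI, then inverse separable transform."""
    primary = inverse_secondary_on_roi(coeffs, roi_w, roi_h, inv_kernel)
    return inverse_separable(primary, col_basis, row_basis)
```

With identity separable bases and a toy kernel that swaps two ROI coefficients, only the ROI positions change, which illustrates that coefficients outside the ROI pass directly to the separable stage.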
- the current block can be restored based on the residual sample of the current block (S420).
- the current block can be restored based on the prediction block and residual block of the current block.
- a prediction block for the current block can be derived based on at least one of inter-prediction and intra-prediction.
- the current block can be divided into multiple partitions, and a prediction block for the current block can be generated based on the prediction for each partition.
- a current block can be divided into multiple partitions based on one or more dividing lines.
- the dividing lines can include at least one of a vertical line and a horizontal line.
- the current block can be divided into two partitions by a predetermined dividing line.
- the dividing line for the geometric partitioning can be defined based on a predetermined dividing direction (or dividing angle) and a distance from the center of the current block.
- the current block can be a coding block that is no longer divided through tree-based block partitioning.
- the prediction block of the current block can be generated as a weighted sum of the first prediction block for the first partition and the second prediction block for the second partition.
- the first and second prediction blocks can be generated based on intra prediction.
- the first and second prediction blocks can be generated based on inter prediction.
- either the first prediction block or the second prediction block can be generated based on intra prediction, and the other can be generated based on inter prediction.
- a prediction block for each partition can be generated based on inter prediction, and a prediction block of the current block can be generated based on a weighted sum of the generated prediction blocks.
- a prediction block for each partition can be generated based on intra prediction, and the prediction block for the current block can be generated based on a weighted sum of the generated prediction blocks.
- This is hereinafter referred to as the Spatial Geometric Partitioning Mode (SGPM).
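The weighted sum of two partition prediction blocks can be sketched as below. The integer weight range 0 to 8 and the rounding offset are assumptions borrowed from common practice in geometric partitioning designs; this disclosure does not specify them here. The function name `blend_predictions` is hypothetical.

```python
from typing import List

Block = List[List[int]]


def blend_predictions(pred0: Block, pred1: Block, weights: Block,
                      max_weight: int = 8) -> Block:
    """Per-sample weighted sum of two prediction blocks.

    weights[y][x] is the weight of pred0 (0..max_weight); pred1 receives the
    complement. Integer rounding to nearest is an assumption of this sketch.
    """
    offset = max_weight // 2
    return [
        [(w * p0 + (max_weight - w) * p1 + offset) // max_weight
         for p0, p1, w in zip(row0, row1, row_w)]
        for row0, row1, row_w in zip(pred0, pred1, weights)
    ]
```

Samples far from the dividing line would take the full weight of one partition, while samples near the line mix both predictions.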
- the partition type of the current block can be determined based on a partition type index that specifies the partitioning direction and position of the partition line.
- the partition type index can indicate any one of the predefined partition type candidates.
- the intra prediction mode of each partition within the current block can be derived based on a mode index that indicates any one of the multiple intra prediction mode candidates.
- the mode index can be defined for each partition within the current block.
- the above partition type index and the mode index for each partition can be signaled via the bitstream.
- the partition type index can be represented as partition_mode_idx.
- the mode index for each partition can be represented as intra_pred_mode0_idx and intra_pred_mode1_idx, respectively.
- the partition type index and the mode index for the first partition can be signaled via the bitstream.
- the mode index for the second partition can be derived based on the mode index for the first partition.
- At least one of the above partition type index or mode indexes can be signaled based on a flag (cu_sgpm_flag) indicating whether SGPM is applied to the current block. For example, when cu_sgpm_flag is 1, the partition type index and the mode indexes can be signaled, and when cu_sgpm_flag is 0, the partition type index and the mode indexes are not signaled.
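The conditional signaling of the SGPM syntax elements can be sketched as a small parsing routine. This covers only the variant in which both mode indexes are signaled explicitly. The `read` callable is a hypothetical stand-in for an entropy decoder, and the actual binarization of each element is not modeled.

```python
from typing import Callable, Dict, Optional


def parse_sgpm_syntax(read: Callable[[str], int]) -> Optional[Dict[str, int]]:
    """Parse SGPM syntax elements for one coding unit.

    `read(name)` is a hypothetical helper that returns the decoded value of the
    named syntax element. Returns None when SGPM is not applied.
    """
    if read("cu_sgpm_flag") != 1:
        # The partition type index and mode indexes are not signaled.
        return None
    return {
        "partition_mode_idx": read("partition_mode_idx"),
        "intra_pred_mode0_idx": read("intra_pred_mode0_idx"),
        "intra_pred_mode1_idx": read("intra_pred_mode1_idx"),
    }
```

In the variant where the second mode index is derived from the first, the last `read` call would be replaced by a derivation rule, which is not specified in this passage.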
- a candidate list for SGPM may be constructed.
- the candidate list may include a plurality of candidates, each of which may include one partition type index and two mode indices.
- the plurality of candidates in the candidate list may be derived from combinations of predefined partition type candidates (e.g., 26 partition type candidates) and predetermined intra prediction mode candidates (e.g., 3 intra prediction mode candidates).
- the maximum number of candidates that can be included in the candidate list may be 16.
- a partition type index for the current block and a mode index for each partition may be derived.
- a candidate index indicating any one of the plurality of candidates may be signaled through a bitstream.
- the above candidate list may be reordered based on a predetermined template region. For example, a SAD between a predicted sample and a reconstructed sample in the template region may be calculated. The SAD may be calculated for each of a plurality of candidates in the candidate list. The plurality of candidates in the candidate list may be reordered in ascending order of the SAD.
- the template region may include at least one of a top peripheral region or a left peripheral region adjacent to the current block. The height of the top peripheral region and the width of the left peripheral region may be fixed to a predetermined length (e.g., 1).
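The construction and template-based reordering of the SGPM candidate list can be sketched as follows. Several points are assumptions of this sketch: candidates pair two different mode indexes, the list is truncated to the maximum size after reordering, and `predict_template` is a hypothetical callable that returns the predicted template samples for a candidate.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

Candidate = Tuple[int, int, int]  # (partition type index, mode index 0, mode index 1)


def sad(predicted: Sequence[int], reconstructed: Sequence[int]) -> int:
    """Sum of absolute differences between two sample sequences."""
    return sum(abs(p - r) for p, r in zip(predicted, reconstructed))


def build_sgpm_candidate_list(
    num_partition_types: int,
    num_modes: int,
    predict_template: Callable[[Candidate], Sequence[int]],
    template_recon: Sequence[int],
    max_candidates: int = 16,
) -> List[Candidate]:
    """Enumerate candidates, sort them by template SAD, and keep the best ones."""
    candidates = [
        (p, m0, m1)
        for p, m0, m1 in product(range(num_partition_types),
                                 range(num_modes), range(num_modes))
        if m0 != m1  # assumption: the two partitions use different modes
    ]
    # Ascending SAD; Python's sort is stable, so ties keep enumeration order.
    candidates.sort(key=lambda c: sad(predict_template(c), template_recon))
    return candidates[:max_candidates]
```

A signaled candidate index would then select one entry of the returned list, yielding the partition type index and both mode indexes at once.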
- the above intra prediction mode candidates may be configured in an IPM list.
- An IPM list may be configured for each partition within the current block. At least one of the intra prediction mode candidates in the IPM list of the first partition may be different from the intra prediction mode candidates in the IPM list of the second partition.
- a single IPM list may be configured for the current block, and partitions within the current block may share the single IPM list.
- the IPM list may include three or more intra prediction mode candidates.
- the SGPM can be applied when the size of the current block satisfies a certain condition.
- the condition may include at least one of the following conditions 1 to 5.
- “width” and “height” may represent the width and height of the current block, respectively.
- a flag may be defined indicating whether blending between the prediction block of the first partition and the prediction block of the second partition for the current block is allowed.
- adaptive blending may be used in the SGPM.
- the blending depth for the adaptive blending may be derived based on the size of the current block. For example, if the minimum of the width and height of the current block is 4, the blending depth may be derived as 1/2 ⁇ . If the minimum of the width and height of the current block is 8, the blending depth may be derived as ⁇ . If the minimum of the width and height of the current block is 16, the blending depth may be derived as 2 ⁇ . If the minimum of the width and height of the current block is 32, the blending depth may be derived as 4 ⁇ . If the minimum of the width and height of the current block is greater than 32, the blending depth may be derived as 8 ⁇ .
- ⁇ may be any integer greater than 0.
- the blending depth can be derived as a default value (e.g., 1/4 ⁇ ). This means that blending is not used when the dividing lines of the geometric segmentation correspond to vertical or horizontal lines, and the width of the region to which blending is applied becomes relatively narrower when the dividing lines of the geometric segmentation do not correspond to vertical or horizontal lines (i.e., when the dividing lines have different dividing directions).
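The size-dependent derivation of the blending depth can be sketched as a lookup on the smaller block dimension. The parameter `base` stands for the base unit that the text multiplies by 1/2, 1, 2, 4, and 8; its actual value is not given here, and the function names are hypothetical. The behavior for a minimum dimension below 4, or not equal to one of the listed values, is not specified in the text, so the sketch raises an error in that case.

```python
from fractions import Fraction

# Scale factors applied to the base unit, keyed by min(width, height).
_SCALE_BY_MIN_DIM = {
    4: Fraction(1, 2),
    8: Fraction(1),
    16: Fraction(2),
    32: Fraction(4),
}


def blending_depth(width: int, height: int, base: int) -> Fraction:
    """Blending depth derived from the smaller of the block width and height."""
    min_dim = min(width, height)
    if min_dim > 32:
        return Fraction(8) * base
    if min_dim not in _SCALE_BY_MIN_DIM:
        raise ValueError("block size not covered by the described mapping")
    return _SCALE_BY_MIN_DIM[min_dim] * base


def default_blending_depth(base: int) -> Fraction:
    """Default value of one quarter of the base unit."""
    return Fraction(1, 4) * base
```

Using `Fraction` keeps the half and quarter multiples exact regardless of the integer chosen for the base unit.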
- FIG. 8 illustrates a schematic configuration of a decoding device (300) that performs an image decoding method according to the present disclosure.
- a decoding device (300) may include a transform coefficient derivation unit (800), a residual sample derivation unit (810), and a restoration block generation unit (820).
- the transform coefficient derivation unit (800) may be configured in the entropy decoding unit (310) of FIG. 3
- the residual sample derivation unit (810) may be configured in the residual processing unit (320) of FIG. 3
- the restoration block generation unit (820) may be configured in the adding unit (340) of FIG. 3.
- the transform coefficient derivation unit (800) can obtain residual information of the current block from the bitstream and decode it to derive the transform coefficient of the current block.
- the residual sample derivation unit (810) can derive a residual sample of the current block by performing at least one of inverse quantization or inverse transformation on the transform coefficient of the current block.
- the residual sample derivation unit (810) can determine a transformation kernel for the inverse transformation of the current block through a predetermined transformation kernel determination method, and derive the residual sample of the current block based on this. This has been described with reference to FIG. 4, and a detailed description thereof will be omitted here.
- the restoration block generation unit (820) can restore the current block based on the residual sample of the current block.
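The restoration step performed by the restoration block generation unit can be sketched as a per-sample addition of the prediction and the residual. Clipping to the valid sample range and the default bit depth of 8 are assumptions of this sketch, and the function name is hypothetical.

```python
from typing import List

Block = List[List[int]]


def reconstruct_block(pred: Block, residual: Block, bit_depth: int = 8) -> Block:
    """Add residual samples to prediction samples and clip to [0, 2^bit_depth - 1]."""
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(pred, residual)
    ]
```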
- FIG. 9 illustrates an image encoding method performed by an encoding device (200) as an embodiment according to the present disclosure.
- the residual samples of the current block can be derived by subtracting the prediction samples from the original samples of the current block.
- the prediction samples may be derived based on inter-prediction or intra-prediction.
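The residual derivation on the encoder side is a per-sample difference, which can be sketched as below. The function name `derive_residual` is hypothetical.

```python
from typing import List

Block = List[List[int]]


def derive_residual(original: Block, pred: Block) -> Block:
    """Residual sample = original sample minus prediction sample."""
    return [
        [o - p for o, p in zip(orig_row, pred_row)]
        for orig_row, pred_row in zip(original, pred)
    ]
```

Residual values may be negative, which is why they are handled separately from the sample range of the picture.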
- the current block can be divided into multiple partitions, and a prediction block of the current block can be generated based on the prediction for each partition.
- the prediction block of the current block can be generated as a weighted sum of a first prediction block for the first partition and a second prediction block for the second partition.
- each of the first and second prediction blocks can be generated based on intra prediction or inter prediction.
- the partition type of the current block and the intra prediction mode of each partition can be determined.
- a partition type index for indicating the determined partition type can be encoded in a bitstream.
- a mode index indicating the determined intra prediction mode among a plurality of intra prediction mode candidates can be encoded in the bitstream.
- the mode index can be encoded for each partition.
- the mode index for the second partition can be encoded based on the mode index for the first partition.
- At least one of the partition type index or the mode index can be encoded based on a flag (cu_sgpm_flag) indicating whether SGPM is applied to the current block.
- a candidate list for the SGPM of the current block may be constructed.
- Each candidate in the candidate list may include one partition type index and two mode indices.
- a partition type index for the current block and a mode index for each partition may be derived.
- a candidate index indicating any one of the multiple candidates may be encoded in the bitstream.
- the candidate list may be reordered based on a predetermined template region.
- the intra prediction mode candidates can be organized into an intra prediction mode (IPM) list.
- SGPM can also be applied when the current block size satisfies certain conditions.
- blending depth may be derived based on the size of the current block.
- the blending depth may be derived as a default value (e.g., 1/4 ⁇ ). Based on the determination, a flag indicating whether blending between the prediction blocks of the first and second partitions is allowed may be encoded in the bitstream.
- transform coefficients of the current block can be derived by performing at least one of transformation or quantization on the residual sample of the current block (S910).
- the transformation method according to the present disclosure can be understood as the reverse process of the inverse transformation described with reference to FIG. 4.
- the method for determining the transformation kernel for the above transformation is as described with reference to FIG. 4. A detailed description thereof will be omitted here.
- NSPT can be applied based on at least one of the tree type or component type of the current block.
- the NSPT kernel (or NSPT matrix) for the NSPT can be determined by utilizing the symmetry between intra prediction modes or the symmetry between block shapes.
- the NSPT kernel can be expressed as an r × MN matrix.
- r means the output length of NSPT or the number of transform coefficients generated by NSPT.
- MN is the product of the width and height of the current block, and can mean the input length of NSPT or the number of residual samples to which NSPT is applied.
- a method for determining the size of the NSPT kernel has been described with reference to FIG. 4.
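Applying an r × MN kernel to a block can be sketched as a matrix-vector product over the flattened residual samples. The row-first flattening order and the use of the transposed kernel for the inverse (which presumes orthonormal kernel rows) are assumptions of this sketch, and the function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def nspt_forward(residual: Matrix, kernel: Matrix) -> List[float]:
    """Map the M*N residual samples to r transform coefficients."""
    vec = [sample for row in residual for sample in row]  # row-first flattening
    if any(len(k_row) != len(vec) for k_row in kernel):
        raise ValueError("kernel must have M*N columns")
    return [sum(k * v for k, v in zip(k_row, vec)) for k_row in kernel]


def nspt_inverse(coeffs: List[float], kernel: Matrix,
                 width: int, height: int) -> Matrix:
    """Approximate the residual block from r coefficients using the transposed kernel."""
    flat = [sum(kernel[i][j] * coeffs[i] for i in range(len(coeffs)))
            for j in range(width * height)]
    return [flat[y * width:(y + 1) * width] for y in range(height)]
```

When r is smaller than MN, the inverse recovers only the part of the residual spanned by the kernel rows, so the round trip is lossy in general.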
- the LFNST index and/or NSPT index for transformation may be encoded as a single integrated syntax, or the LFNST index and NSPT index may be encoded separately and inserted into the bitstream. Binarization of the LFNST index and NSPT index, and assignment of CABAC context and initial value are as described with reference to FIG. 4.
- a non-separable transformation can be applied to the current block encoded with inter prediction (or intra prediction), as discussed with reference to FIG. 4.
- a bitstream can be generated by encoding the transform coefficients of the current block (S920).
- residual information about the transform coefficient can be generated, and a bitstream can be generated by encoding the residual information.
- FIG. 10 illustrates a schematic configuration of an encoding device (200) that performs an image encoding method according to the present disclosure.
- an encoding device (200) may include a residual sample derivation unit (1000), a transform coefficient derivation unit (1010), and a transform coefficient encoding unit (1020).
- the residual sample derivation unit (1000) and the transform coefficient derivation unit (1010) may be configured in the residual processing unit (230) of FIG. 2, and the transform coefficient encoding unit (1020) may be configured in the entropy encoding unit (240) of FIG. 2.
- the residual sample derivation unit (1000) can derive a residual sample of the current block by subtracting a predicted sample from an original sample of the current block.
- the predicted sample may be derived based on a predetermined intra prediction mode.
- the transform coefficient derivation unit (1010) can derive the transform coefficient of the current block by performing at least one of transform and quantization on the residual sample of the current block.
- the transform coefficient derivation unit (1010) can determine the transform kernel of the current block based on at least one of the above-described embodiments 1 to 6, and derive the transform coefficient by applying the transform kernel to the residual sample of the current block.
- the transform coefficient encoding unit (1020) can generate a bitstream by encoding the transform coefficient of the current block.
- the methods are described based on a flowchart as a series of steps or blocks. However, the embodiments are not limited to the order of the steps, and some steps may occur in a different order or simultaneously with other steps described above. Furthermore, those skilled in the art will understand that the steps depicted in the flowchart are not exclusive, and other steps may be included, or one or more steps in the flowchart may be deleted without affecting the scope of the embodiments of this document.
- the method according to the embodiments of the present document described above can be implemented in the form of software, and the encoding device and/or decoding device according to the present document can be included in a device that performs image processing, such as a TV, a computer, a smartphone, a set-top box, a display device, etc.
- the above-described method can be implemented as a module (process, function, etc.) that performs the above-described function.
- the module can be stored in memory and executed by a processor.
- the memory can be internal or external to the processor and can be connected to the processor by various well-known means.
- the processor can include an application-specific integrated circuit (ASIC), another chipset, logic circuit, and/or data processing device.
- the memory can include a read-only memory (ROM), a random access memory (RAM), flash memory, a memory card, a storage medium, and/or other storage devices. That is, the embodiments described in this document can be implemented and performed on a processor, a microprocessor, a controller, or a chip.
- each drawing can be implemented and performed on a computer, a processor, a microprocessor, a controller, or a chip.
- information for implementation (e.g., information on instructions) or an algorithm can be stored on a digital storage medium.
- the decoding device and encoding device to which the embodiment(s) of the present specification are applied may be included in a multimedia broadcasting transmitting and receiving device, a mobile communication terminal, a home cinema video device, a digital cinema video device, a surveillance camera, a video conversation device, a real-time communication device such as a video communication device, a mobile streaming device, a storage medium, a camcorder, a video-on-demand (VoD) service providing device, an OTT (over-the-top video) device, an Internet streaming service providing device, a three-dimensional (3D) video device, a VR (virtual reality) device, an AR (augmented reality) device, a video phone video device, a transportation terminal (ex.
- the OTT (Over the top video) device may include a game console, a Blu-ray player, an Internet-connected TV, a home theater system, a smartphone, a tablet PC, a DVR (Digital Video Recorder), etc.
- the processing method to which the embodiment(s) of the present specification are applied can be produced in the form of a computer-executable program and can be stored in a computer-readable recording medium.
- Multimedia data having a data structure according to the embodiment(s) of the present specification can also be stored in a computer-readable recording medium.
- the computer-readable recording medium includes all types of storage devices and distributed storage devices in which computer-readable data is stored.
- the computer-readable recording medium can include, for example, a Blu-ray disc (BD), a universal serial bus (USB), a ROM, a PROM, an EPROM, an EEPROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, and an optical data storage device.
- the computer-readable recording medium includes a medium implemented in the form of a carrier wave (e.g., transmission via the Internet).
- a bitstream generated by an encoding method can be stored in a computer-readable recording medium or transmitted via a wired or wireless communication network.
- embodiments of the present disclosure may be implemented as a computer program product comprising program code, and the program code may be executed on a computer according to the embodiments of the present disclosure.
- the program code may be stored on a computer-readable carrier.
- FIG. 11 illustrates an example of a content streaming system to which embodiments of the present disclosure can be applied.
- a content streaming system to which the embodiment(s) of the present specification are applied may largely include an encoding server, a streaming server, a web server, a media storage, a user device, and a multimedia input device.
- the encoding server compresses content input from multimedia input devices such as smartphones, cameras, and camcorders into digital data, generates a bitstream, and transmits it to the streaming server.
- when multimedia input devices such as smartphones, cameras, and camcorders directly generate bitstreams, the encoding server may be omitted.
- the above bitstream can be generated by an encoding method or a bitstream generation method to which the embodiment(s) of the present specification are applied, and the streaming server can temporarily store the bitstream during the process of transmitting or receiving the bitstream.
- the streaming server transmits multimedia data to a user device based on a user request via a web server, and the web server acts as an intermediary to inform the user of available services.
- the web server transmits the request to the streaming server, and the streaming server transmits the multimedia data to the user.
- the content streaming system may include a separate control server, in which case the control server controls commands/responses between each device within the content streaming system.
- the streaming server can receive content from a media repository and/or an encoding server. For example, when receiving content from the encoding server, the content can be received in real time. In this case, to provide a smooth streaming service, the streaming server can store the bitstream for a certain period of time.
- Examples of the user devices may include mobile phones, smart phones, laptop computers, digital broadcasting terminals, personal digital assistants (PDAs), portable multimedia players (PMPs), navigation devices, slate PCs, tablet PCs, ultrabooks, wearable devices (e.g., smartwatches, smart glasses, HMDs), digital TVs, desktop computers, digital signage, etc.
- Each server within the above content streaming system can be operated as a distributed server, in which case data received from each server can be processed in a distributed manner.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The present invention relates to an image decoding method and device capable of: deriving transform coefficients of a current block from a bitstream; deriving residual samples of the current block based on an inverse transform of the transform coefficients of the current block; and reconstructing the current block from the residual samples of the current block. The residual samples may be derived based on any one of a plurality of transform kernel candidates belonging to a transform set of the current block. The transform set of the current block may be determined to be either a base transform set or another transform set.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202463670736P | 2024-07-12 | 2024-07-12 | |
| US63/670,736 | 2024-07-12 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2026014976A1 (fr) | 2026-01-15 |
Family
ID=98387107
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/KR2025/010166 Pending WO2026014976A1 (fr) | Image encoding/decoding method and device, and recording medium on which a bitstream is recorded | 2024-07-12 | 2025-07-11 |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2026014976A1 (fr) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
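| KR20130110124A (ko) * | 2010-12-21 | 2013-10-08 | Electronics and Telecommunications Research Institute | Intra prediction mode encoding/decoding method and apparatus therefor |
| KR20220162184A (ko) * | 2019-04-16 | 2022-12-07 | LG Electronics Inc. | Transform in intra prediction-based image coding |
| KR20240065135A (ko) * | 2022-10-19 | 2024-05-14 | Tencent America LLC | Transform selection for intra prediction fusion |
| KR20240095193A (ko) * | 2021-10-28 | 2024-06-25 | LG Electronics Inc. | Image coding method and apparatus using an MPM list |
| KR20240108795A (ko) * | 2023-01-02 | 2024-07-09 | Hyundai Motor Company | Image encoding/decoding method and apparatus, and recording medium storing a bitstream |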
2025
- 2025-07-11 WO PCT/KR2025/010166 patent/WO2026014976A1/fr active Pending
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 25837652; Country of ref document: EP; Kind code of ref document: A1 |