WO2020162737A1 - Video signal processing method and apparatus using secondary transform - Google Patents
Video signal processing method and apparatus using secondary transform
- Publication number
- WO2020162737A1 (application PCT/KR2020/001853)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- transform
- quadratic
- current block
- block
- inverse
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/157—Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
- H04N19/159—Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/11—Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/119—Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/129—Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/40—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/593—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
- H04N19/619—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding the transform being operated outside the prediction loop
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/625—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
Definitions
- The present invention relates to a video signal processing method and apparatus, and more particularly, to a video signal processing method and apparatus for encoding or decoding a video signal.
- Compression coding refers to a series of signal processing techniques for transmitting digitized information through a communication line or storing it in a format suitable for a storage medium.
- Objects of compression encoding include audio, video, and text, and in particular, a technique for performing compression encoding on an image is called video image compression.
- Compression coding of a video signal is performed by removing redundant information in consideration of spatial correlation, temporal correlation, and probability correlation.
- A more efficient video signal processing method and apparatus are therefore required.
- An object of the present invention is to increase the coding efficiency of a video signal. Specifically, the present invention aims to improve coding efficiency by using a transform kernel suitable for a transform block.
- To this end, the present invention provides a video signal processing apparatus and a video signal processing method as follows.
- A method for processing a video signal is provided, comprising: determining whether a secondary inverse transform is applied to a current block; when the secondary inverse transform is applied to the current block, deriving, based on an intra prediction mode of the current block, a secondary transform kernel set applied to the current block from among predefined secondary transform kernel sets; determining a secondary transform kernel applied to the current block within the derived secondary transform kernel set; generating a secondary inverse-transformed block by performing the secondary inverse transform on a specific region at the upper left of the current block using the secondary transform kernel; and generating a residual block of the current block by performing a primary inverse transform on the secondary inverse-transformed block, wherein the secondary inverse transform receives inverse-quantized transform coefficients as input based on a fixed scan order regardless of the size of the secondary transform kernel.
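- The kernel selection and secondary inverse transform steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names, the `mode_to_set` mapping, the convention that a syntax value k > 0 picks kernel k - 1, and the toy 2x2 rotation kernels are all hypothetical. The sketch also assumes an orthonormal kernel, so that the inverse is the transpose.

```python
import math

def select_kernel(intra_mode, st_idx, kernel_sets, mode_to_set):
    """Pick a kernel from the set derived from the intra prediction mode.

    st_idx == 0 is taken to mean "no secondary transform" (assumption).
    """
    if st_idx == 0:
        return None
    kernel_set = kernel_sets[mode_to_set(intra_mode)]
    return kernel_set[st_idx - 1]

def forward_secondary(values, kernel):
    """y = K x, with one kernel row per output coefficient."""
    return [sum(k * v for k, v in zip(row, values)) for row in kernel]

def inverse_secondary(coeffs, kernel):
    """x = K^T y, valid when the kernel rows are orthonormal (assumption)."""
    n = len(kernel[0])
    return [sum(kernel[r][i] * coeffs[r] for r in range(len(kernel)))
            for i in range(n)]

def rotation(theta):
    """Toy 2x2 orthonormal kernel, used only to make the example runnable."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

# Two hypothetical kernel sets with two kernels each.
kernel_sets = [[rotation(0.3), rotation(0.6)], [rotation(0.9), rotation(1.2)]]
mode_to_set = lambda mode: mode % 2  # placeholder mapping, not from the source

kernel = select_kernel(intra_mode=5, st_idx=2, kernel_sets=kernel_sets,
                       mode_to_set=mode_to_set)
restored = inverse_secondary(forward_secondary([7.0, -2.0], kernel), kernel)
```

In a real codec the output of `inverse_secondary` would be written back into the upper-left region of the block before the primary inverse transform is applied.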
- The step of generating the secondary inverse-transformed block may include assigning the inverse-quantized transform coefficients to the input coefficient array of the secondary inverse transform based on an up-right diagonal scan order.
- The up-right diagonal scan order may be predefined as the scan order for a 4x4 block.
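- As an illustration of the scan described above, the following sketch generates an up-right diagonal scan order and uses the 4x4 order to gather coefficients into the input array of the secondary inverse transform. The function names and the `num_inputs` parameter are assumptions made for the example. The traversal (each anti-diagonal walked from its lower-left end to its upper-right end, starting at the top-left sample) follows the usual up-right diagonal convention.

```python
def up_right_diagonal_scan(width, height):
    """Return (x, y) positions in up-right diagonal order."""
    order = []
    for d in range(width + height - 1):   # d = x + y selects one anti-diagonal
        for x in range(d + 1):            # x grows while y shrinks: up and to the right
            y = d - x
            if x < width and y < height:
                order.append((x, y))
    return order

def gather_secondary_input(block, num_inputs):
    """Collect the first num_inputs coefficients of block using the 4x4 scan.

    The same 4x4 scan is used whatever num_inputs is, mirroring the
    "fixed scan order regardless of kernel size" idea in the text.
    """
    scan = up_right_diagonal_scan(4, 4)
    return [block[y][x] for x, y in scan[:num_inputs]]

# Example block whose value at (x, y) is 4 * y + x.
block = [[4 * y + x for x in range(4)] for y in range(4)]
first_eight = gather_secondary_input(block, 8)
```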
- The step of determining whether the secondary inverse transform is applied to the current block may include obtaining a syntax element indicating whether the secondary transform is applied to the current block when a predefined condition is satisfied.
- The predefined condition may include whether the width and height of the current block are less than or equal to the maximum transform size.
- Determining whether the secondary inverse transform is applied to the current block may include inferring the syntax element to be 0 when the predefined condition is not satisfied.
- The secondary transform kernel applied to the current block may be determined within the derived secondary transform kernel set according to the value of the syntax element.
- When the width or height of the current block is larger than the maximum transform size, the current block may be divided into a plurality of transform units.
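- The parsing condition and the split into transform units can be sketched as follows. The names `parse_secondary_transform_index`, `read_syntax_element`, and `split_into_transform_units` are hypothetical, and the simple tiling used for the split is an assumption; the text only states that an oversized block may be divided into a plurality of transform units.

```python
def parse_secondary_transform_index(width, height, max_tb_size, read_syntax_element):
    """Read the syntax element only when the block fits the maximum transform size.

    Otherwise the element is not parsed and is inferred to be 0.
    """
    if width <= max_tb_size and height <= max_tb_size:
        return read_syntax_element()
    return 0

def split_into_transform_units(width, height, max_tb_size):
    """Tile an oversized block into (x, y, w, h) transform units (assumed tiling)."""
    units = []
    for y in range(0, height, max_tb_size):
        for x in range(0, width, max_tb_size):
            units.append((x, y,
                          min(max_tb_size, width - x),
                          min(max_tb_size, height - y)))
    return units
```

For example, with a maximum transform size of 64, a 128x64 block is not eligible for the syntax element and is tiled into two 64x64 transform units.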
- A video signal processing apparatus including a processor is also provided, wherein the processor: determines whether a secondary inverse transform is applied to a current block; when the secondary inverse transform is applied to the current block, derives, based on the intra prediction mode of the current block, a secondary transform kernel set applied to the current block from among predefined secondary transform kernel sets; determines a secondary transform kernel applied to the current block within the derived secondary transform kernel set; generates a secondary inverse-transformed block by performing the secondary inverse transform on a specific region at the upper left of the current block using the secondary transform kernel; and generates a residual block of the current block by performing a primary inverse transform on the secondary inverse-transformed block, wherein the secondary inverse transform receives inverse-quantized transform coefficients as input based on a fixed scan order regardless of the size of the secondary transform kernel.
- The processor may assign the inverse-quantized transform coefficients to the input coefficient array of the secondary inverse transform based on an up-right diagonal scan order.
- The up-right diagonal scan order may be predefined as the scan order for a 4x4 block.
- When a predefined condition is satisfied, the processor may obtain a syntax element indicating whether the secondary transform is applied to the current block, and the predefined condition may include whether the width and height of the current block are less than or equal to the maximum transform size.
- When the predefined condition is not satisfied, the processor may infer the syntax element to be 0.
- The secondary transform kernel applied to the current block may be determined within the derived secondary transform kernel set according to the value of the syntax element.
- When the width or height of the current block is larger than the maximum transform size, the current block may be divided into a plurality of transform units.
- A video signal processing method is also provided, comprising: determining whether to apply a secondary transform to a current block; when the secondary transform is applied to the current block, deriving, based on an intra prediction mode of the current block, a secondary transform kernel set applied to the current block from among predefined secondary transform kernel sets; determining a secondary transform kernel applied to the current block within the derived secondary transform kernel set; generating a primary-transformed block by performing a primary transform on the residual block of the current block; generating a secondary-transformed block by performing the secondary transform on a specific region at the upper left of the primary-transformed block using the secondary transform kernel; and generating a bitstream by encoding the secondary-transformed block, wherein the secondary transform is performed by arranging the secondary-transformed coefficients into a transform coefficient array based on a fixed scan order regardless of the size of the secondary transform kernel.
- A non-transitory computer-readable medium is also provided that stores a computer-executable component configured to execute on one or more processors of a computing device, wherein the computer-executable component: determines whether a secondary inverse transform is applied to a current block; when the secondary inverse transform is applied to the current block, derives, based on the intra prediction mode of the current block, a secondary transform kernel set applied to the current block from among predefined secondary transform kernel sets; determines a secondary transform kernel applied to the current block within the derived secondary transform kernel set; and performs the secondary inverse transform on a specific region at the upper left of the current block using the secondary transform kernel, wherein the secondary inverse transform receives inverse-quantized transform coefficients as input based on a fixed scan order regardless of the size of the secondary transform kernel.
- According to an embodiment of the present invention, the coding efficiency of a video signal can be improved.
- In addition, a transform kernel suitable for the current transform block may be selected.
- FIG. 1 is a schematic block diagram of an apparatus for encoding a video signal according to an embodiment of the present invention.
- FIG. 2 is a schematic block diagram of a video signal decoding apparatus according to an embodiment of the present invention.
- FIG. 3 shows an embodiment in which a coding tree unit is divided into coding units within a picture.
- FIG. 4 shows an embodiment of a method for signaling division of a quad tree and a multi-type tree.
- FIGS. 5 and 6 illustrate in more detail an intra prediction method according to an embodiment of the present invention.
- FIG. 7 is a diagram specifically illustrating a method of transforming a residual signal by an encoder.
- FIG. 8 is a diagram specifically illustrating a method of obtaining a residual signal by inverse transforming a transform coefficient by an encoder and a decoder.
- FIG. 9 shows the kernel formulas of the discrete cosine transform type-II (DCT-II), discrete cosine transform type-V (DCT-V), discrete cosine transform type-VIII (DCT-VIII), discrete sine transform type-I (DST-I), and DST-VII used in adaptive multiple core transform (AMT).
- FIG. 10 is a diagram illustrating a transform set according to an intra prediction mode and a transform kernel candidate defined according to the transform set in AMT.
- FIG. 11 is a diagram showing the 0th basis function (the lowest-frequency component of the corresponding transform kernel) of the DCT-II, DCT-V, DCT-VIII, DST-I, and DST-VII transforms defined in FIG. 9.
- FIG. 12 illustrates a transform kernel used in a multiple transform selection (MTS) technique according to an embodiment of the present invention and a transform set and transform kernel candidates defined according to a prediction mode.
- FIG. 13 shows the definitions of the DST-IV and DCT-IV basis functions according to an embodiment of the present invention, together with graphs of the 0th basis function (the lowest-frequency component) of DCT-II, DCT-IV, DCT-VIII, DST-IV, and DST-VII.
- FIG. 14 is a block diagram illustrating a process of restoring a residual signal in a decoder that performs a secondary transform according to an embodiment of the present invention.
- FIG. 15 is a diagram illustrating, at a block level, a process of restoring a residual signal in a decoder that performs a secondary transform according to an embodiment of the present invention.
- FIG. 16 is a diagram illustrating a method of applying a secondary transform using a reduced number of samples according to an embodiment of the present invention.
- FIG. 17 is a diagram illustrating a method of determining an up-right diagonal scan order according to an embodiment of the present invention.
- FIG. 18 is a diagram illustrating the up-right diagonal scan order according to block size according to an embodiment of the present invention.
- FIG. 19 is a diagram illustrating an example of a secondary transform process according to an embodiment of the present invention.
- FIG. 20 is a diagram illustrating a process of deriving a secondary transform matrix according to an embodiment of the present invention.
- FIG. 21 is a flowchart illustrating a video signal processing method according to an embodiment of the present invention.
- Coding can be interpreted as encoding or decoding in some cases.
- An apparatus that generates a video signal bitstream by encoding a video signal is referred to as an encoding apparatus or an encoder, and an apparatus that reconstructs a video signal by decoding a video signal bitstream is referred to as a decoding apparatus or a decoder.
- a video signal processing apparatus is used as a term for a concept including both an encoder and a decoder.
- Information is a term that includes all values, parameters, coefficients, elements, etc., and the meaning may be interpreted differently in some cases, so the present invention is not limited thereto.
- 'Unit' is used to refer to a basic unit of image processing or a specific position of a picture, and refers to an image area including at least one of a luma component and a chroma component.
- A 'block' refers to an image area including a specific component among the luma component and the chroma components (i.e., Cb and Cr).
- Terms such as 'unit', 'block', 'partition', and 'area' may be used interchangeably.
- a unit may be used as a concept including all of a coding unit, a prediction unit, and a transform unit.
- a picture refers to a field or a frame, and the terms may be used interchangeably according to embodiments.
- The encoding apparatus 100 of the present invention includes a transform unit 110, a quantization unit 115, an inverse quantization unit 120, an inverse transform unit 125, a filtering unit 130, a prediction unit 150, and an entropy coding unit 160.
- the transform unit 110 converts a residual signal that is a difference between the input video signal and the prediction signal generated by the prediction unit 150 to obtain a transform coefficient value.
- For the transform, for example, a Discrete Cosine Transform (DCT), a Discrete Sine Transform (DST), or a Wavelet Transform may be used.
- The discrete cosine transform and the discrete sine transform are performed after dividing the input picture signal into blocks. Coding efficiency may vary depending on the distribution and characteristics of the values in the transform region.
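- To make the effect of the transform concrete, the following sketch implements a one-dimensional orthonormal DCT-II and its inverse in plain Python. It is an illustration only: real codecs use integer approximations and two-dimensional separable transforms, and the function names are chosen for this example. A flat input compacts into a single nonzero coefficient, which is the kind of value distribution that makes transform coding efficient.

```python
import math

def dct2(x):
    """Orthonormal 1-D DCT-II of the sequence x."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i in range(n)))
    return out

def idct2(coeffs):
    """Inverse of dct2 (the transpose of the orthonormal DCT-II matrix)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * coeffs[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(total)
    return out

flat = dct2([5.0, 5.0, 5.0, 5.0])            # energy lands in coefficient 0
round_trip = idct2(dct2([3.0, -1.0, 4.0, 1.0]))
```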
- the quantization unit 115 quantizes a transform coefficient value output from the transform unit 110.
- Rather than coding the picture signal as is, a method is used in which a picture is predicted using an already coded region through the prediction unit 150, and a reconstructed picture is obtained by adding the residual value between the original picture and the predicted picture to the predicted picture.
- information available in the decoder should be used when performing prediction in the encoder.
- the encoder performs a process of reconstructing the encoded current block again.
- The inverse quantization unit 120 inverse quantizes the transform coefficient value, and the inverse transform unit 125 restores the residual value by using the inverse-quantized transform coefficient value.
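- The reason the encoder reconstructs its own output can be seen in a small sketch. Quantization is lossy, so the encoder must predict from the values the decoder will actually recover, not from the original coefficients. The uniform scalar quantizer and the `step` parameter below are assumptions for illustration; the text does not specify the quantizer design.

```python
import math

def quantize(coeffs, step):
    """Map each coefficient to an integer level (round half away from zero)."""
    levels = []
    for c in coeffs:
        magnitude = int(math.floor(abs(c) / step + 0.5))
        levels.append(magnitude if c >= 0 else -magnitude)
    return levels

def dequantize(levels, step):
    """Recover approximate coefficients from integer levels."""
    return [level * step for level in levels]

original = [10.2, -3.7, 0.4]
levels = quantize(original, step=2.0)
recovered = dequantize(levels, step=2.0)     # what both encoder and decoder see
```

The difference between `original` and `recovered` is the quantization error, which is why the reference data inside the encoder is built from `recovered`.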
- the filtering unit 130 performs a filtering operation to improve the quality and encoding efficiency of the reconstructed picture.
- For example, the filtering may include a deblocking filter, a sample adaptive offset (SAO), and an adaptive loop filter.
- the filtered picture is output or stored in a decoded picture buffer (DPB) 156 to be used as a reference picture.
- the intra prediction unit 152 performs intra prediction within the current picture, and the inter prediction unit 154 predicts the current picture using a reference picture stored in the decoded picture buffer 156.
- the intra prediction unit 152 performs intra prediction from reconstructed regions in the current picture and transmits the intra encoding information to the entropy coding unit 160.
- the inter prediction unit 154 may again include a motion estimation unit 154a and a motion compensation unit 154b.
- the motion estimation unit 154a obtains a motion vector value of the current region by referring to the restored specific region.
- the motion estimating unit 154a transfers position information (reference frame, motion vector, etc.) of the reference region to the entropy coding unit 160 to be included in the bitstream.
- the motion compensation unit 154b performs inter-motion compensation using the motion vector value transferred from the motion estimation unit 154a.
- the prediction unit 150 includes an intra prediction unit 152 and an inter prediction unit 154.
- The intra prediction unit 152 performs intra prediction within the current picture, and the inter prediction unit 154 performs inter prediction, which predicts the current picture using a reference picture stored in the decoded picture buffer 156.
- the intra prediction unit 152 performs intra prediction from reconstructed samples in the current picture, and transmits intra-encoding information to the entropy coding unit 160.
- the intra encoding information may include at least one of an intra prediction mode, a Most Probable Mode (MPM) flag, and an MPM index.
- Intra encoding information may include information on a reference sample.
- the inter prediction unit 154 may include a motion estimation unit 154a and a motion compensation unit 154b.
- the motion estimation unit 154a obtains a motion vector value of the current region by referring to a specific region of the reconstructed reference picture.
- the motion estimation unit 154a transmits a motion information set (reference picture index, motion vector information, etc.) for the reference region to the entropy coding unit 160.
- the motion compensation unit 154b performs motion compensation using the motion vector value transmitted from the motion estimation unit 154a.
- the inter prediction unit 154 transmits inter encoding information including motion information on the reference region to the entropy coding unit 160.
- the prediction unit 150 may include an intra block copy (BC) prediction unit (not shown).
- the intra BC prediction unit performs intra BC prediction from reconstructed samples in the current picture, and transfers intra BC encoding information to the entropy coding unit 160.
- the intra BC predictor refers to a specific region in the current picture and obtains a block vector value indicating a reference region used for prediction of the current region.
- the intra BC prediction unit may perform intra BC prediction using the obtained block vector value.
- the intra BC prediction unit transfers intra BC encoding information to the entropy coding unit 160.
- Intra BC encoding information may include block vector information.
- the transform unit 110 obtains a transform coefficient value by transforming a residual value between the original picture and the predicted picture.
- the transformation may be performed in units of a specific block within the picture, and the size of the specific block may vary within a preset range.
- The quantization unit 115 quantizes the transform coefficient values generated by the transform unit 110 and transmits the quantized values to the entropy coding unit 160.
- the entropy coding unit 160 generates a video signal bitstream by entropy coding information representing a quantized transform coefficient, intra coding information, and inter coding information.
- For entropy coding, a variable length coding (VLC) method, an arithmetic coding method, or the like may be used.
- The VLC method converts input symbols into consecutive codewords whose lengths may vary. For example, frequently occurring symbols are represented by short codewords, and infrequently occurring symbols are represented by long codewords.
- A context-based adaptive variable length coding (CAVLC) scheme may be used as the variable length coding scheme.
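- The principle of variable length coding can be illustrated with a Huffman code, a classic prefix code. This is a stand-in chosen for the example, not CAVLC itself, and the function name is an assumption. The sketch shows that the most frequent symbol receives the shortest codeword.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code in which frequent symbols get shorter codewords."""
    freq = Counter(symbols)
    if len(freq) == 1:                        # degenerate case: a single symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie-breaker id, {symbol: partial codeword}).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)       # two least frequent subtrees
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]

message = "aaaaaaabbbc"
code = huffman_code(message)
encoded = "".join(code[s] for s in message)
```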
- Arithmetic coding converts consecutive data symbols into a single fractional number, and can approach the optimal, possibly fractional, number of bits needed to represent each symbol.
- Context-based Adaptive Binary Arithmetic Coding (CABAC) may be used as the arithmetic coding scheme.
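- The idea of mapping a whole symbol sequence to one fractional number can be sketched with exact rational arithmetic. This is a textbook-style illustration with a fixed probability table, not CABAC: there is no binarization, context modeling, or adaptation, and the function names and the choice to pass the message length to the decoder are assumptions.

```python
from fractions import Fraction

def build_ranges(probabilities):
    """Assign each symbol a sub-interval of [0, 1) proportional to its probability."""
    ranges, low = {}, Fraction(0)
    for sym, p in probabilities.items():
        ranges[sym] = (low, low + p)
        low += p
    return ranges

def arithmetic_encode(message, probabilities):
    """Narrow [0, 1) once per symbol and return a number inside the final interval."""
    ranges = build_ranges(probabilities)
    low, width = Fraction(0), Fraction(1)
    for sym in message:
        s_low, s_high = ranges[sym]
        low, width = low + width * s_low, width * (s_high - s_low)
    return low + width / 2

def arithmetic_decode(value, length, probabilities):
    """Invert the narrowing: find the symbol interval, then rescale the value."""
    ranges = build_ranges(probabilities)
    out = []
    for _ in range(length):
        for sym, (s_low, s_high) in ranges.items():
            if s_low <= value < s_high:
                out.append(sym)
                value = (value - s_low) / (s_high - s_low)
                break
    return "".join(out)

probs = {"a": Fraction(3, 4), "b": Fraction(1, 4)}
value = arithmetic_encode("aaba", probs)
decoded = arithmetic_decode(value, 4, probs)
```

Likely symbols shrink the interval only slightly, so they cost less than one bit each, which is what a codeword-per-symbol scheme cannot achieve.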
- the entropy coding unit 160 may binarize information representing a quantized transform coefficient.
- the entropy coding unit 160 may generate a bitstream by arithmetic coding the binary information.
- The generated bitstream is encapsulated using a Network Abstraction Layer (NAL) unit as the basic unit.
- A NAL unit includes an integer number of coded coding tree units.
- To decode a bitstream in a video decoder, the bitstream must first be separated into NAL units, and then each separated NAL unit must be decoded. Meanwhile, information necessary for decoding the video signal bitstream may be transmitted through the Raw Byte Sequence Payload (RBSP) of a higher-level set such as a picture parameter set (PPS), a sequence parameter set (SPS), or a video parameter set (VPS).
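- Separating a bitstream into NAL units can be sketched as below. The sketch assumes a byte-stream format in which each NAL unit is preceded by the three-byte start code 0x000001; the text itself does not say how NAL units are delimited, so this framing, the function name, and the sample bytes are assumptions for illustration. Four-byte start codes and emulation prevention bytes are not handled.

```python
def split_nal_units(stream: bytes):
    """Return the payloads found between 0x000001 start codes (assumed framing)."""
    start_code = b"\x00\x00\x01"
    units = []
    pos = stream.find(start_code)
    while pos != -1:
        begin = pos + len(start_code)
        nxt = stream.find(start_code, begin)   # start of the next unit, if any
        end = len(stream) if nxt == -1 else nxt
        units.append(stream[begin:end])
        pos = nxt
    return units

# Made-up bytes: two units, each introduced by a start code.
sample = b"\x00\x00\x01\x67\xaa" + b"\x00\x00\x01\x68\xbb\xcc"
nal_units = split_nal_units(sample)
```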
- FIG. 1 shows the encoding apparatus 100 according to an embodiment of the present invention, and separately displayed blocks show elements of the encoding apparatus 100 by logically distinguishing them. Accordingly, the elements of the encoding apparatus 100 described above may be mounted as one chip or as a plurality of chips according to the design of the device. According to an embodiment, the operation of each element of the encoding apparatus 100 described above may be performed by a processor (not shown).
- the decoding apparatus 200 of the present invention includes an entropy decoding unit 210, an inverse quantization unit 220, an inverse transform unit 225, a filtering unit 230, and a prediction unit 250.
- The entropy decoding unit 210 entropy-decodes the video signal bitstream to extract transform coefficient information, intra encoding information, inter encoding information, and the like for each region. For example, the entropy decoding unit 210 may obtain a binarized code for the transform coefficient information of a specific region from the video signal bitstream. The entropy decoding unit 210 then obtains quantized transform coefficients by inverse binarizing the binarized code. The inverse quantization unit 220 inverse quantizes the quantized transform coefficients, and the inverse transform unit 225 restores the residual value using the inverse-quantized transform coefficients. The video signal processing apparatus 200 restores the original pixel value by summing the residual value obtained by the inverse transform unit 225 with the predicted value obtained by the prediction unit 250.
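- The final summing step can be sketched as follows. Clipping the result to the valid sample range and the `bit_depth` parameter are assumptions added for the example; the text only states that the residual value and the predicted value are summed.

```python
def reconstruct(prediction, residual, bit_depth=8):
    """Add residual to prediction sample by sample and clip to [0, 2^bit_depth - 1]."""
    max_val = (1 << bit_depth) - 1
    return [[min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
            for pred_row, res_row in zip(prediction, residual)]

prediction = [[250, 10], [100, 100]]
residual = [[10, -20], [5, -5]]
reconstructed = reconstruct(prediction, residual)
```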
- the filtering unit 230 improves image quality by performing filtering on a picture.
- This may include a deblocking filter for reducing block distortion and/or an adaptive loop filter for removing distortion of an entire picture.
- the filtered picture is output or stored in the decoded picture buffer (DPB) 256 to be used as a reference picture for the next picture.
- the prediction unit 250 includes an intra prediction unit 252 and an inter prediction unit 254.
- the prediction unit 250 generates a prediction picture by using an encoding type decoded by the entropy decoding unit 210 described above, a transform coefficient for each region, and intra/inter encoding information.
- To reconstruct the current block being decoded, a decoded region of the current picture that includes the current block, or of other pictures, may be used.
- A picture (or tile/slice) that performs only intra prediction or intra BC prediction is called an intra picture or an I picture (or tile/slice), and a picture (or tile/slice) that can perform intra prediction, inter prediction, and intra BC prediction is called an inter picture (or tile/slice).
- Among inter pictures (or tiles/slices), a picture (or tile/slice) that uses at most one motion vector and reference picture index to predict the sample values of each block is referred to as a predictive picture or a P picture (or tile/slice), and a picture (or tile/slice) that uses up to two motion vectors and reference picture indexes is referred to as a bi-predictive picture or a B picture (or tile/slice).
- In other words, a P picture (or tile/slice) uses at most one motion information set to predict each block, and a B picture (or tile/slice) uses at most two motion information sets to predict each block.
- Here, a motion information set includes one or more motion vectors and one reference picture index.
- the intra prediction unit 252 generates a prediction block using intra encoding information and reconstructed samples in the current picture.
- the intra encoding information may include at least one of an intra prediction mode, a Most Probable Mode (MPM) flag, and an MPM index.
- MPM: Most Probable Mode
- the intra prediction unit 252 predicts sample values of the current block by using reconstructed samples located on the left side and/or above the current block as reference samples.
- reconstructed samples, reference samples, and samples of the current block may represent pixels. Also, sample values may represent pixel values.
- the reference samples may be samples included in a neighboring block of the current block.
- the reference samples may be samples adjacent to the left boundary of the current block and/or samples adjacent to the upper boundary.
- the reference samples may be samples located on a line within a preset distance from the left boundary of the current block and/or on a line within a preset distance from the upper boundary of the current block, among the samples of the neighboring blocks of the current block.
- the neighboring blocks of the current block may include at least one of a left (L) block, an above (A) block, a below-left (BL) block, an above-right (AR) block, or an above-left (AL) block.
- the inter prediction unit 254 generates a prediction block using a reference picture and inter encoding information stored in the decoded picture buffer 256.
- the inter-encoding information may include a motion information set (reference picture index, motion vector information, etc.) of the current block for the reference block.
- Inter prediction may include L0 prediction, L1 prediction, and Bi-prediction.
- L0 prediction means prediction using one reference picture included in the L0 picture list
- L1 prediction means prediction using one reference picture included in the L1 picture list.
- L0 or L1 prediction may require one set of motion information (e.g., a motion vector and a reference picture index)
- in bi-prediction, up to two reference regions may be used, and the two reference regions may exist in the same reference picture or in different pictures.
- the two motion vectors may correspond to the same reference picture index or to different reference picture indexes.
- reference pictures may be displayed (or output) temporally before or after the current picture.
- two reference regions used in the bi-prediction method may be regions selected from each of the L0 picture list and the L1 picture list.
- the inter prediction unit 254 may obtain a reference block of the current block using a motion vector and a reference picture index.
- the reference block exists in a reference picture corresponding to a reference picture index.
- a sample value of a block specified by a motion vector or an interpolated value thereof may be used as a predictor of the current block.
- an 8-tap interpolation filter may be used for a luminance signal and a 4-tap interpolation filter may be used for a color difference signal.
- the interpolation filter for motion prediction in units of subpels is not limited thereto.
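As a rough illustration of sub-sample interpolation with an 8-tap filter, the sketch below computes one half-sample value along a row. The text does not give filter coefficients, so the taps used here are an assumption: they are the half-sample luma taps of the HEVC standard, chosen only as a familiar example. The function name and the rounding step are likewise illustrative.

```python
# Assumption: HEVC half-sample luma taps (they sum to 64); the patent text
# itself does not specify any coefficients.
HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def interpolate_half_pel(samples, pos):
    """Return the value halfway between samples[pos] and samples[pos + 1].

    Needs 3 samples to the left of pos and 4 samples to the right of pos.
    """
    if pos < 3 or pos + 4 >= len(samples):
        raise ValueError("not enough neighboring samples for an 8-tap filter")
    window = samples[pos - 3 : pos + 5]
    acc = sum(tap * s for tap, s in zip(HALF_PEL_TAPS, window))
    return (acc + 32) >> 6  # round, then normalize by 64


row = [0, 10, 20, 30, 40, 50, 60, 70]
print(interpolate_half_pel(row, 3))  # halfway between 30 and 40 -> 35
```

On a linear ramp the filter returns the midpoint, which is the expected behavior for a well-formed interpolation filter.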
- the inter prediction unit 254 performs motion compensation for predicting the texture of the current unit from a previously restored picture.
- the inter prediction unit may use a motion information set.
- the prediction unit 250 may include an intra BC prediction unit (not shown).
- the intra BC predictor may reconstruct the current area by referring to a specific area including reconstructed samples in the current picture.
- the intra BC prediction unit obtains intra BC encoding information for the current region from the entropy decoding unit 210.
- the intra BC predictor obtains a block vector value of the current region indicating a specific region in the current picture.
- the intra BC prediction unit may perform intra BC prediction using the obtained block vector value.
- Intra BC encoding information may include block vector information.
- a reconstructed video picture is generated by adding a prediction value output from the intra prediction unit 252 or the inter prediction unit 254 and a residual value output from the inverse transform unit 225. That is, the video signal decoding apparatus 200 reconstructs the current block by using the prediction block generated by the prediction unit 250 and the residual obtained from the inverse transform unit 225.
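The reconstruction step described above can be sketched as a sample-wise addition of the prediction block and the residual block. The function name is hypothetical, and the clipping to the valid sample range for a given bit depth is an assumption added for the sketch; the text only states that the two values are summed.

```python
def reconstruct_block(pred, resid, bit_depth=8):
    """Add the residual to the prediction, sample by sample.

    Clipping to [0, 2**bit_depth - 1] is an assumption of this sketch.
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]


pred = [[100, 102], [250, 3]]
resid = [[-4, 5], [10, -7]]
print(reconstruct_block(pred, resid))  # [[96, 107], [255, 0]]
```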
- FIG. 2 shows the decoding apparatus 200 according to an embodiment of the present invention, and separately displayed blocks are shown by logically distinguishing elements of the decoding apparatus 200. Therefore, the elements of the decoding apparatus 200 described above may be mounted as one chip or as a plurality of chips according to the design of the device. According to an embodiment, the operation of each element of the decoding apparatus 200 described above may be performed by a processor (not shown).
- a coding tree unit (CTU) is divided into coding units (CUs) in a picture.
- a picture may be divided into a sequence of coding tree units (CTUs).
- the coding tree unit is composed of an NXN block of luma samples and two blocks of chroma samples corresponding thereto.
- the coding tree unit may be divided into a plurality of coding units.
- the coding tree unit is not divided and may be a leaf node. In this case, the coding tree unit itself may be a coding unit.
- the coding unit refers to a basic unit for processing a picture in the above-described video signal processing, that is, intra/inter prediction, transformation, quantization, and/or entropy coding.
- the size and shape of a coding unit in one picture may not be constant.
- the coding unit may have a square or rectangular shape.
- the rectangular coding unit (or rectangular block) includes a vertical coding unit (or vertical block) and a horizontal coding unit (or horizontal block).
- a vertical block is a block having a height greater than a width
- a horizontal block is a block having a width greater than the height.
- a non-square block may refer to a rectangular block, but the present invention is not limited thereto.
- a coding tree unit is first divided into a quad tree (QT) structure. That is, in a quad tree structure, one node having a size of 2NX2N may be divided into four nodes having a size of NXN.
- the quad tree may also be referred to as a quaternary tree. Quad tree partitioning can be performed recursively, and not all nodes need to be partitioned to the same depth.
- a leaf node of the quad tree described above may be further divided into a multi-type tree (MTT) structure.
- MTT: multi-type tree
- in a multi-type tree structure, one node may be divided into a binary or ternary tree structure, split either horizontally or vertically. That is, in the multi-type tree structure, there are four division structures: vertical binary division, horizontal binary division, vertical ternary division, and horizontal ternary division.
- the width and height of a node in each tree structure may both be powers of 2.
- a node having a size of 2NX2N may be divided into two NX2N nodes by vertical binary division, and divided into two 2NXN nodes by horizontal binary division.
- a node of 2NX2N size may be divided into nodes of (N/2)X2N, NX2N, and (N/2)X2N by vertical ternary division, and into nodes of 2NX(N/2), 2NXN, and 2NX(N/2) by horizontal ternary division.
- This multi-type tree division can be performed recursively.
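The child sizes produced by the quad tree, binary, and ternary divisions described above can be checked with a small helper. This is only a sketch: the function name and the mode labels ("QT", "BT_VER", and so on) are hypothetical and are not taken from the specification.

```python
def split_node(width, height, mode):
    """Return the (width, height) of each child node for one split mode."""
    if mode == "QT":      # quad tree: four equal quadrants
        return [(width // 2, height // 2)] * 4
    if mode == "BT_VER":  # vertical binary: left and right halves
        return [(width // 2, height)] * 2
    if mode == "BT_HOR":  # horizontal binary: top and bottom halves
        return [(width, height // 2)] * 2
    if mode == "TT_VER":  # vertical ternary: 1/4, 1/2, 1/4 of the width
        return [(width // 4, height), (width // 2, height), (width // 4, height)]
    if mode == "TT_HOR":  # horizontal ternary: 1/4, 1/2, 1/4 of the height
        return [(width, height // 4), (width, height // 2), (width, height // 4)]
    raise ValueError(f"unknown split mode: {mode}")


# A 2Nx2N node with N = 16:
print(split_node(32, 32, "TT_VER"))  # [(8, 32), (16, 32), (8, 32)]
```

Because each call returns plain sizes, the helper can be applied again to any child to model recursive division.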
- Leaf nodes of a multi-type tree can be coding units.
- the corresponding coding unit may be used as a unit of prediction and/or transformation without further division.
- the current coding unit may be divided into a plurality of transform units without explicit signaling regarding segmentation.
- at least one of the following parameters may be defined in advance or transmitted through the RBSP of a higher-level set such as the PPS, SPS, or VPS.
- CTU size: the size of the root node of the quad tree
- Minimum QT size (MinQtSize): the minimum allowed QT leaf node size
- Maximum BT size: the maximum allowed BT root node size
- Maximum TT size: the maximum allowed TT root node size
- Maximum MTT depth: the maximum allowed depth of MTT division from a leaf node of the QT
- Minimum BT size (MinBtSize): the minimum allowed BT leaf node size
- Minimum TT size: the minimum allowed TT leaf node size
- Preset flags may be used to signal the division of the quad tree and multi-type tree described above. Referring to FIG. 4, at least one of a flag 'qt_split_flag' indicating whether to divide a quad tree node, a flag 'mtt_split_flag' indicating whether to divide a multi-type tree node, a flag 'mtt_split_vertical_flag' indicating the splitting direction of a multi-type tree node, or a flag 'mtt_split_binary_flag' indicating the split shape of a multi-type tree node may be used.
- the coding tree unit is a root node of a quad tree, and may be first divided into a quad tree structure.
- in the quad tree structure, 'qt_split_flag' is signaled for each node 'QT_node'. If the value of 'qt_split_flag' is 1, the node is divided into four square nodes, and if the value of 'qt_split_flag' is 0, the node becomes 'QT_leaf_node', a leaf node of the quad tree.
- Each quad tree leaf node 'QT_leaf_node' may be further divided into a multi-type tree structure.
- in the multi-type tree structure, 'mtt_split_flag' is signaled for each node 'MTT_node'.
- when the value of 'mtt_split_flag' is 1, the corresponding node is divided into a plurality of rectangular nodes, and when the value of 'mtt_split_flag' is 0, the corresponding node becomes 'MTT_leaf_node', a leaf node of the multi-type tree.
- when the value of 'mtt_split_binary_flag' is 1, the node 'MTT_node' is divided into two rectangular nodes, and when the value of 'mtt_split_binary_flag' is 0, the node 'MTT_node' is divided into three rectangular nodes.
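The way the multi-type tree flags combine can be sketched as a small lookup. The flag names come from the text; the function name is hypothetical, and the reading that 'mtt_split_vertical_flag' equal to 1 means a vertical split is an assumption, since this passage only says that the flag indicates the splitting direction.

```python
def mtt_split_mode(mtt_split_flag, mtt_split_vertical_flag=0, mtt_split_binary_flag=0):
    """Map the three multi-type tree flags to a split description."""
    if mtt_split_flag == 0:
        return "MTT_leaf_node"  # the node is not divided any further
    # Assumption: 1 means vertical, 0 means horizontal.
    direction = "vertical" if mtt_split_vertical_flag == 1 else "horizontal"
    # 1 selects two child nodes (binary), 0 selects three (ternary).
    shape = "binary" if mtt_split_binary_flag == 1 else "ternary"
    return f"{direction} {shape}"


print(mtt_split_mode(1, 1, 0))  # vertical ternary
print(mtt_split_mode(0))        # MTT_leaf_node
```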
- Picture prediction (motion compensation) for coding is performed for coding units that are no longer divided (ie, leaf nodes of the coding unit tree).
- the basic unit that performs such prediction is hereinafter referred to as a prediction unit or a prediction block.
- the term unit used herein may be used as a term to replace the prediction unit, which is a basic unit for performing prediction.
- the present invention is not limited thereto, and more broadly, it may be understood as a concept including the coding unit.
- the intra prediction unit predicts sample values of the current block by using reconstructed samples located on the left and/or above of the current block as reference samples.
- FIG. 5 shows an embodiment of reference samples used for prediction of a current block in an intra prediction mode.
- the reference samples may be samples adjacent to the left boundary of the current block and/or samples adjacent to the upper boundary.
- reference samples can be set using up to 2W+2H+1 neighboring samples located on the left and/or above the current block, where W and H denote the width and height of the current block.
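The count 2W+2H+1 can be reproduced by enumerating candidate positions on a single reference line. The layout used here (one corner sample, 2W samples above, 2H samples to the left) is an assumption that matches the count; the function name and coordinate convention are illustrative only.

```python
def reference_sample_positions(w, h):
    """List (x, y) positions relative to the block's top-left sample at (0, 0).

    Assumed layout: one above-left corner, 2*w samples on the row above,
    and 2*h samples on the column to the left.
    """
    corner = [(-1, -1)]
    above = [(x, -1) for x in range(2 * w)]
    left = [(-1, y) for y in range(2 * h)]
    return corner + above + left


positions = reference_sample_positions(8, 4)
print(len(positions))  # 2*8 + 2*4 + 1 = 25
```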
- the intra prediction unit may obtain a reference sample by performing a reference sample padding process. Also, the intra prediction unit may perform a reference sample filtering process to reduce an error in intra prediction. That is, filtered reference samples may be obtained by performing filtering on neighboring samples and/or reference samples obtained by the reference sample padding process. The intra prediction unit predicts the samples of the current block using the reference samples thus obtained, which may be either unfiltered or filtered reference samples.
- neighboring samples may include samples on at least one reference line.
- the neighboring samples may include adjacent samples on a line adjacent to the boundary of the current block.
- FIG. 6 shows an embodiment of prediction modes used for intra prediction.
- intra prediction mode information indicating an intra prediction direction may be signaled.
- the intra prediction mode information indicates any one of a plurality of intra prediction modes constituting the intra prediction mode set.
- the decoder receives intra prediction mode information of the current block from the bitstream.
- the intra prediction unit of the decoder performs intra prediction on the current block based on the extracted intra prediction mode information.
- the intra prediction mode set may include all intra prediction modes (eg, a total of 67 intra prediction modes) used for intra prediction. More specifically, the intra prediction mode set may include a planar mode, a DC mode, and a plurality (eg, 65) angular modes (ie, directional modes). Each intra prediction mode may be indicated through a preset index (ie, an intra prediction mode index). For example, as shown in FIG. 6, intra prediction mode index 0 indicates a planar mode, and intra prediction mode index 1 indicates a DC mode.
- intra prediction mode indexes 2 to 66 may indicate different angular modes, respectively. The angle modes indicate different angles within a preset angle range, respectively.
- the angle mode may indicate an angle within an angle range (ie, a first angle range) between 45 degrees and -135 degrees clockwise.
- the angular mode may be defined based on the 12 o'clock direction.
- the intra prediction mode index 2 indicates a horizontal diagonal (HDIA) mode
- the intra prediction mode index 18 indicates a horizontal (HOR) mode
- the intra prediction mode index 34 indicates a diagonal (DIA) mode
- an intra prediction mode index 50 indicates a vertical (VER) mode
- an intra prediction mode index 66 indicates a vertical diagonal (VDIA) mode.
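The index assignments listed above can be summarized in a small lookup. This is a sketch for the 67-mode set only (indexes 0 to 66); the function name and the returned strings are hypothetical.

```python
# Named angular modes taken from the list above.
NAMED_ANGULAR_MODES = {2: "HDIA", 18: "HOR", 34: "DIA", 50: "VER", 66: "VDIA"}


def describe_intra_mode(index):
    """Describe an intra prediction mode index in the range 0..66."""
    if index == 0:
        return "planar"
    if index == 1:
        return "DC"
    if 2 <= index <= 66:
        name = NAMED_ANGULAR_MODES.get(index)
        return f"angular ({name})" if name else "angular"
    raise ValueError(f"index {index} is outside the 67-mode set")


print(describe_intra_mode(50))  # angular (VER)
print(describe_intra_mode(7))   # angular
```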
- the preset angle range may be set differently according to the shape of the current block.
- a wide-angle mode indicating an angle exceeding 45 degrees or less than -135 degrees clockwise may be additionally used.
- the angle mode may indicate an angle within an angular range (ie, a second angular range) between (45+offset1) degrees and (-135+offset1) degrees in a clockwise direction.
- angle modes 67 to 76 outside the first angle range may be additionally used.
- the angle mode may indicate an angle within an angle range (ie, a third angle range) between (45-offset2) degrees and (-135-offset2) degrees in a clockwise direction.
- angle modes -10 to -1 out of the first angle range may be additionally used.
- values of offset1 and offset2 may be determined differently according to a ratio between the width and height of the rectangular block. Further, offset1 and offset2 may be positive numbers.
- a plurality of angular modes constituting the intra prediction mode set may include a basic angular mode and an extended angular mode.
- the extended angle mode may be determined based on the basic angle mode.
- the basic angle mode is a mode corresponding to an angle used in intra prediction of the existing High Efficiency Video Coding (HEVC) standard
- the extended angle mode may be a mode corresponding to an angle newly added in intra prediction of the next-generation video codec standard.
- the basic angle mode may be an angle mode corresponding to any one of the intra prediction modes {2, 4, 6, ..., 66}
- the extended angle mode may be an angle mode corresponding to any one of the intra prediction modes {3, 5, 7, ..., 65}. That is, the extended angle mode may be an angle mode between basic angle modes within the first angle range. Accordingly, the angle indicated by the extended angle mode may be determined based on the angle indicated by the basic angle mode.
- the basic angle mode is a mode corresponding to an angle within the preset first angle range
- the extended angle mode may be a wide-angle mode outside the first angle range. That is, the basic angle mode may be an angle mode corresponding to any one of the intra prediction modes {2, 3, 4, ..., 66}, and the extended angle mode may be an angle mode corresponding to any one of the intra prediction modes {-10, -9, ..., -1} and {67, 68, ..., 76}.
- the angle indicated by the extended angle mode may be determined as an angle opposite to the angle indicated by the corresponding basic angle mode. Accordingly, the angle indicated by the extended angle mode may be determined based on the angle indicated by the basic angle mode.
- the number of extended angle modes is not limited thereto, and additional extended angles may be defined according to the size and/or shape of the current block.
- the extended angle mode may be defined as an angle mode corresponding to any one of the intra prediction modes {-14, -13, ..., -1} and {67, 68, ..., 80}.
- the total number of intra prediction modes included in the intra prediction mode set may vary according to the configuration of the above-described basic angle mode and extended angle mode.
- the spacing between the extended angle modes may be set based on the spacing between the corresponding basic angle modes.
- the spacing between the extended angle modes {3, 5, 7, ..., 65} may be determined based on the spacing between the corresponding basic angle modes {2, 4, 6, ..., 66}.
- the spacing between the extended angle modes {-10, -9, ..., -1} may be determined based on the spacing between the corresponding opposite basic angle modes {56, 57, ..., 65}, and the spacing between the extended angle modes {67, 68, ..., 76} may be determined based on the spacing between the corresponding opposite basic angle modes {3, 4, ..., 12}.
- the angular interval between the extended angle modes may be set to be the same as the angular interval between the corresponding basic angular modes. Also, the number of extended angle modes in the intra prediction mode set may be set to be less than or equal to the number of basic angle modes.
- the extended angle mode may be signaled based on the basic angle mode.
- the wide-angle mode (i.e., the extended angle mode) may replace at least one angle mode (i.e., a basic angle mode) within the first angle range.
- the replaced basic angle mode may be an angle mode corresponding to the opposite side of the wide angle mode. That is, the replaced basic angle mode is an angle mode that corresponds to an angle in a direction opposite to the angle indicated by the wide-angle mode or to an angle that differs by a preset offset index from the angle in the opposite direction.
- the preset offset index is 1.
- the intra prediction mode index corresponding to the replaced basic angular mode may be remapped to the wide-angle mode to signal the corresponding wide-angle mode.
- the wide-angle modes {-10, -9, ..., -1} may be signaled by the intra prediction mode indexes {57, 58, ..., 66}, respectively
- the wide-angle modes {67, 68, ..., 76} may be signaled by the intra prediction mode indexes {2, 3, ..., 11}, respectively.
- because the intra prediction mode index for a basic angle mode signals the extended angle mode in this way, the same set of intra prediction mode indexes can be used for signaling the intra prediction mode even if the configurations of the angle modes used for intra prediction of each block differ. Accordingly, signaling overhead due to a change in the intra prediction mode configuration can be minimized.
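The index remapping for wide-angle modes can be sketched as follows. The two index ranges come from the text; the function name and the `replace_low_modes` parameter are hypothetical. How a codec decides which range is replaced for a given block (for example, from the block shape) is outside this sketch and is left to the caller.

```python
def remap_to_wide_angle(signaled_index, replace_low_modes):
    """Map a signaled intra mode index to the mode actually used.

    replace_low_modes=True : indexes 2..11  stand for wide-angle modes 67..76
    replace_low_modes=False: indexes 57..66 stand for wide-angle modes -10..-1
    Any other index is returned unchanged.
    """
    if replace_low_modes and 2 <= signaled_index <= 11:
        return signaled_index + 65
    if not replace_low_modes and 57 <= signaled_index <= 66:
        return signaled_index - 67
    return signaled_index


print(remap_to_wide_angle(2, True))    # 67
print(remap_to_wide_angle(57, False))  # -10
```

Because the signaled index always stays within the same set, the bitstream syntax does not change when wide-angle modes are in use.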
- whether to use the extended angle mode may be determined based on at least one of the shape and size of the current block.
- when the size of the current block is larger than a preset size, the extended angle mode may be used for intra prediction of the current block; otherwise, only the basic angle mode may be used for intra prediction of the current block.
- when the current block is a non-square block, the extended angle mode may be used for intra prediction of the current block; when the current block is a square block, only the basic angle mode may be used for intra prediction of the current block.
- a method of quantizing a transform coefficient value obtained by transforming the residual signal and coding the quantized transform coefficient may be used instead of coding the above-described residual signal as it is.
- the transform unit may obtain a transform coefficient value by transforming the residual signal.
- the residual signal of a specific block may be distributed over the entire area of the current block. Accordingly, it is possible to improve coding efficiency by concentrating energy in the low frequency region through frequency domain conversion of the residual signal.
- a method of transforming or inversely transforming the residual signal will be described in detail.
- the residual signal in the spatial domain may be converted to the frequency domain.
- the encoder may convert the obtained residual signal to obtain a transform coefficient.
- the encoder may obtain at least one residual block including a residual signal for the current block.
- the residual block may be either the current block or blocks divided from the current block.
- the residual block may be referred to as a residual array or a residual matrix including residual samples of the current block.
- the residual block may represent a transform unit or a block having the same size as the transform block.
- the encoder can transform the residual block using a transform kernel.
- the transform kernel used for transforming the residual block may be a transform kernel having separable characteristics of vertical transform and horizontal transform.
- the transformation for the residual block may be performed separately into vertical transformation and horizontal transformation.
- the encoder may perform vertical transformation by applying a transformation kernel in the vertical direction of the residual block.
- the encoder may perform horizontal transformation by applying a transformation kernel in the horizontal direction of the residual block.
- a transform kernel may be used as a term referring to a parameter set used for transforming a residual signal, such as a transform matrix, a transform array, or a transform function.
- the transform kernel may be any one of a plurality of usable kernels.
- a transformation kernel based on different transformation types may be used for each of the vertical transformation and the horizontal transformation.
- the encoder may pass the transform block obtained by transforming the residual block to the quantization unit for quantization.
- the transform block may include a plurality of transform coefficients.
- the transform block may be composed of a plurality of transform coefficients arranged in two dimensions. Similar to the residual block, the size of the transform block may be the same as either the current block or a block divided from the current block.
- the transform coefficients transferred to the quantization unit may be expressed as quantized values.
- the encoder may perform additional transform before the transform coefficient is quantized.
- the above-described transform method may be referred to as a primary transform, and an additional transform may be referred to as a secondary transform.
- the secondary transform may be applied selectively for each residual block.
- the encoder may improve coding efficiency by performing a secondary transform on a region where it is difficult to concentrate energy in the low-frequency region using the primary transform alone.
- a secondary transform may be added for a block in which residual values appear larger in a direction other than the horizontal or vertical direction of the residual block.
- the residual values of an intra-predicted block may have a higher probability of changing in a direction other than the horizontal or vertical direction compared to the residual values of an inter-predicted block. Accordingly, the encoder may additionally perform a secondary transform on the residual signal of an intra-predicted block, and may omit the secondary transform for the residual signal of an inter-predicted block.
- whether to perform the secondary transform may be determined according to the size of the current block or the residual block.
- transform kernels having different sizes according to the size of the current block or the residual block may be used.
- an 8X8 secondary transform may be applied to a block in which the shorter of the width and height is greater than or equal to a first preset length.
- a 4X4 secondary transform may be applied to a block in which the shorter of the width and height is greater than or equal to a second preset length and smaller than the first preset length.
- the first preset length may be a value larger than the second preset length, but the present disclosure is not limited thereto.
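The size rule above can be sketched as a simple selection function. The text does not give the preset lengths, so the default values of 8 and 4 used here are assumptions for illustration, as is the function name.

```python
def secondary_transform_size(width, height, first_len=8, second_len=4):
    """Return 8 or 4 for an 8X8 or 4X4 secondary transform, or None for neither.

    first_len and second_len stand for the first and second preset lengths;
    their default values are assumptions, not values from the text.
    """
    shorter = min(width, height)
    if shorter >= first_len:
        return 8
    if second_len <= shorter < first_len:
        return 4
    return None


print(secondary_transform_size(16, 8))  # 8
print(secondary_transform_size(4, 16))  # 4
print(secondary_transform_size(2, 8))   # None
```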
- the secondary transform may not be performed separately as a vertical transform and a horizontal transform. Such a secondary transform may be referred to as a low frequency non-separable transform (LFNST).
- Whether to perform a transform on the residual signal of a specific region may be determined by a syntax element related to the transform of that region.
- the syntax element may include transform skip information.
- the transform skip information may be a transform skip flag.
- the encoder may immediately quantize the residual signal of a region on which the transform has not been performed. The operations of the encoder described with reference to FIG. 7 may be performed through the transform unit of FIG. 1.
- the above-described transform-related syntax elements may be information parsed from a video signal bitstream.
- the decoder may entropy decode the video signal bitstream to obtain syntax elements related to transformation.
- the encoder may generate a video signal bitstream by entropy coding the syntax elements related to the transformation.
- FIG. 8 is a diagram specifically illustrating a method of obtaining a residual signal by inverse transforming a transform coefficient by an encoder and a decoder.
- an inverse transform operation is performed through an inverse transform unit of each of the encoder and the decoder.
- the inverse transform unit may obtain a residual signal by inverse transforming the inverse quantized transform coefficient.
- the inverse transform unit may detect whether an inverse transform for a corresponding region is performed from a syntax element related to transformation of a specific region. According to an embodiment, when a transformation-related syntax element for a specific transformation block indicates transformation skip, transformation for the corresponding transformation block may be omitted.
- both the first-order inverse transform and the second-order inverse transform described above for the transform block may be omitted.
- the inverse quantized transform coefficient can be used as a residual signal.
- the decoder may reconstruct the current block by using the inverse quantized transform coefficient as a residual signal.
- the above-described first-order inverse transform represents an inverse transform with respect to a first-order transform, and may be referred to as an inverse primary transform.
- the second-order inverse transform represents an inverse transform for the second-order transform, and may be referred to as an inverse secondary transform or inverse LFNST.
- the primary (inverse) transform may also be referred to as a first (inverse) transform
- the secondary (inverse) transform may also be referred to as a second (inverse) transform.
- a transform related syntax element for a specific transform block may not indicate transform skip.
- the inverse transform unit may determine whether to perform the second-order inverse transform, that is, the inverse of the secondary transform. For example, when the transform block is a transform block of an intra-predicted block, a second-order inverse transform may be performed on the transform block. Also, the second-order transform kernel used for the transform block may be determined based on the intra prediction mode corresponding to the transform block. As another example, whether to perform the second-order inverse transform may be determined based on the size of the transform block. The second-order inverse transform may be performed after the inverse quantization process and before the first-order inverse transform is performed.
- the inverse transform unit may perform a first-order inverse transform on the inverse quantized transform coefficients or on the coefficients resulting from the second-order inverse transform.
- like the first-order transform, the first-order inverse transform may be performed separately as a vertical transform and a horizontal transform.
- the inverse transform unit may obtain a residual block by performing vertical inverse transform and horizontal inverse transform on the transform block.
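The ordering of the inverse transform stages described above can be sketched as a small dispatcher. The two inverse transforms are passed in as callables, and all names here are hypothetical stand-ins; the toy functions only record the call order and do not implement any real transform.

```python
def inverse_transform_block(coeffs, transform_skip, use_secondary,
                            inv_secondary, inv_primary):
    """Return the residual for one transform block.

    inv_secondary and inv_primary are caller-supplied callables
    (hypothetical stand-ins for the real inverse transforms).
    """
    if transform_skip:
        # Both inverse transforms are omitted; the inverse quantized
        # coefficients are used directly as the residual.
        return coeffs
    if use_secondary:
        # The secondary inverse transform runs before the primary one.
        coeffs = inv_secondary(coeffs)
    return inv_primary(coeffs)


# Toy stand-ins that only record the order in which they are called.
order = []

def toy_inv_secondary(c):
    order.append("secondary")
    return c

def toy_inv_primary(c):
    order.append("primary")
    return c

inverse_transform_block([5, 0, -1, 0], False, True,
                        toy_inv_secondary, toy_inv_primary)
print(order)  # ['secondary', 'primary']
```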
- the inverse transform unit may inverse transform the transform block based on the transform kernel used for transforming the transform block.
- the encoder may explicitly or implicitly signal information indicating a transform kernel applied to a current transform block among a plurality of usable transform kernels.
- the decoder may select a transform kernel to be used for inverse transform of a transform block from among a plurality of available transform kernels using information indicating the signaled transform kernel.
- the inverse transform unit may reconstruct the current block by using the residual signal obtained through inverse transform of the transform coefficient.
- the distribution of the residual signal of a picture may be different for each region.
- a distribution of values for a residual signal in a specific region may vary according to a prediction method.
- coding efficiency may vary for each transform region according to distributions and characteristics of values in the transform region.
- when the transform kernel used for transforming a specific transform block is adaptively selected from among a plurality of available transform kernels, coding efficiency may be additionally improved. That is, the encoder and decoder may additionally be configured to use a transform kernel other than the basic transform kernel in transforming a video signal.
- a method of adaptively selecting a transform kernel may be referred to as adaptive multiple core transform (AMT) or multiple transform selection (MTS).
- transform and inverse transform are collectively referred to as transform.
- transform kernel and the inverse transform kernel are collectively referred to as a transform kernel.
- the residual signal, which is the difference signal between the original signal and the prediction signal generated through inter prediction or intra prediction, has energy distributed over the entire pixel domain, so encoding the pixel values of the residual signal directly results in poor compression efficiency. Therefore, a process of concentrating energy in the low-frequency region of the frequency domain through transform coding of the residual signal in the pixel domain is required.
- DCT-II: discrete cosine transform type-II
- DST-VII: discrete sine transform type-VII
- AMT is a transform technique that adaptively selects a transform kernel from among several preset transform kernels according to the prediction method. Because the pattern of the residual signal in the pixel domain (its signal characteristics in the horizontal direction and in the vertical direction) differs depending on which prediction method is used, higher coding efficiency can be expected than when only DCT-II is used.
- AMT is not limited to its name, and may be referred to as multiple transform selection (MTS).
- FIG. 9 is a diagram showing the definitions of the transform kernels used in AMT, and shows the formulas of the DCT-II, DCT-V (discrete cosine transform type-V), DCT-VIII (discrete cosine transform type-VIII), DST-I (discrete sine transform type-I), and DST-VII kernels applied to AMT.
- DCT and DST can be expressed as a function of cosine and sine, respectively.
- index i represents the index in the frequency domain
- index j represents the index within the basis function. That is, a smaller i corresponds to a lower-frequency basis function, and a larger i corresponds to a higher-frequency basis function.
- the basis function Ti(j) can represent the j-th element of the i-th row. Since all of the transform kernels shown in FIG. 9 have separable characteristics, the transform can be performed on the residual signal X in the horizontal direction and in the vertical direction, respectively.
- When the residual signal block is denoted by X and the transform kernel matrix is denoted by T, the transform of the residual signal X may be expressed as TXT'.
- T' denotes the transpose of the transform kernel matrix T.
- Values of the transform matrix defined by the basis functions shown in FIG. 9 may be fractional (non-integer) values rather than integers. It may be difficult to implement fractional values in hardware in a video encoding device and a decoding device. Accordingly, a transform kernel approximated by integers from an original transform kernel including fractional values can be used for encoding and decoding a video signal.
- An approximated transform kernel including integer values may be generated through scaling and rounding of the original transform kernel.
- the integer value included in the approximated transform kernel may be a value within a range that can be represented by a preset number of bits. The preset number of bits may be 8-bit or 10-bit.
- As a result of the approximation, the orthonormal property of DCT and DST may not be maintained. However, since the resulting coding efficiency loss is not large, it may be advantageous in terms of hardware implementation to approximate the transform kernel in an integer form.
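- The scaling-and-rounding step described above can be sketched as follows. This is a minimal illustration, assuming the commonly published orthonormal DCT-II definition and a hypothetical scale factor of 128 for a 4-point kernel; the function names are illustrative, and the resulting integers are not claimed to match the kernel of any standard.

```python
import math

def dct2_kernel(n):
    """Orthonormal DCT-II kernel; row i holds the basis function T_i(j)."""
    kernel = []
    for i in range(n):
        scale = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        kernel.append([scale * math.cos(math.pi * i * (2 * j + 1) / (2 * n))
                       for j in range(n)])
    return kernel

def integer_kernel(kernel, scale):
    """Approximate a real-valued kernel by scaling and rounding (scale is an assumption)."""
    return [[int(round(v * scale)) for v in row] for row in kernel]

def gram(kernel):
    """Inner products between all pairs of rows; diagonal entries are squared row norms."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in kernel] for r1 in kernel]

K = integer_kernel(dct2_kernel(4), 128)
G = gram(K)
for row in K:
    print(row)
print("squared row norms:", [G[i][i] for i in range(4)])
```

- In this sketch every entry fits in a signed 8-bit range, but the squared row norms are no longer all equal, which is one way to see that the orthonormal property is only approximately kept after rounding.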
- FIG. 10 is a diagram illustrating a transform set according to an intra prediction mode and a transform kernel candidate defined according to the transform set in AMT.
- The intra prediction modes consist of non-directional prediction {INTRA_PLANAR (mode 0), INTRA_DC (mode 1)} and directional prediction {INTRA_ANGULAR2, INTRA_ANGULAR3, ..., INTRA_ANGULAR66}, which use reconstructed reference samples around the coding unit, and may additionally include a wide-angle prediction mode applied to a rectangular block and a cross-component linear model (CCLM) for predicting a color difference component signal from the reconstructed luminance component signal.
- A set of transform kernels that can be used is defined according to the prediction mode, and the transform candidate index used within the set is signaled with 1 bit for each of the horizontal and vertical directions, so that the decoder can perform the inverse transform by applying the optimal transform kernel found by the encoder.
- FIG. 10(a) shows the transform set index defined according to the prediction mode when 67 intra prediction modes are used, where V (vertical) denotes the transform set applied in the vertical direction and H (horizontal) denotes the transform set applied in the horizontal direction.
- Different transform sets may be used depending on the intra prediction mode, and transform sets applied to the horizontal and vertical directions in a specific prediction mode may be different.
- FIG. 10(b) shows the transform sets used in intra prediction and the transform kernel candidates that can be used according to each transform set.
- Transform Set 0 consists of {DST-VII, DCT-VIII}
- Transform Set 1 consists of {DST-VII, DST-I}
- Transform Set 2 consists of {DST-VII, DCT-V}
- Because intra prediction predicts the current block by using reconstructed reference samples around the current block, the energy of the residual signal tends to increase as the distance from the reference samples increases, that is, as the distance in the horizontal and vertical directions from the upper-left coordinate of the residual signal block increases. DST-VII, which expresses this pattern well, is therefore effective, and DST-VII can be included in all transform sets.
- FIG. 10(c) shows the transform set used in inter prediction and the transform kernel candidates that can be used according to the transform set.
- In inter prediction, there is one transform set that can be used, namely transform set 0, which consists of {DCT-VIII, DST-VII}.
- AMT is applicable only to the luminance component, and the DCT-II transform can be used for the color difference component, as in HEVC.
- On/off can be indicated with a 1-bit flag so that AMT can be controlled in units of coding units.
- When this flag indicates off, DCT-II, which is the basic kernel, can be used, as for the color difference component.
- The inverse transform may be performed by signaling the transform candidate index used within the preset transform set according to the prediction mode and applying the transform kernel corresponding to the index in the decoder. Since different transforms can be applied in the horizontal and vertical directions, the transform index used can be indicated with a total of 2 bits, 1 bit for each direction.
- the transform candidate index may not be signaled according to the number of non-zero coefficients. For example, when the number of non-zero coefficients is one or two, the transformation candidate index is not signaled, and in this case, the horizontal and vertical directions are encoded/decoded using DST-VII.
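- The AMT selection rules above can be sketched as a small lookup. This is a minimal sketch under stated assumptions: the function name and arguments are hypothetical, and candidate index 0 is assumed to select the first kernel listed in each intra transform set (the text does not state the ordering).

```python
# Intra transform sets as listed for FIG. 10(b); the order within each set is an assumption.
INTRA_TRANSFORM_SETS = {
    0: ("DST-VII", "DCT-VIII"),
    1: ("DST-VII", "DST-I"),
    2: ("DST-VII", "DCT-V"),
}

def select_amt_kernels(amt_flag, set_h, set_v, idx_h=None, idx_v=None, num_nonzero=None):
    """Return (horizontal kernel, vertical kernel) for one block (hypothetical helper)."""
    if not amt_flag:
        # Flag off: the basic kernel is used in both directions.
        return ("DCT-II", "DCT-II")
    if num_nonzero in (1, 2):
        # Index not signaled when only one or two non-zero coefficients exist.
        return ("DST-VII", "DST-VII")
    # One 1-bit candidate index per direction selects within the transform set.
    return (INTRA_TRANSFORM_SETS[set_h][idx_h], INTRA_TRANSFORM_SETS[set_v][idx_v])

print(select_amt_kernels(True, set_h=1, set_v=2, idx_h=1, idx_v=0, num_nonzero=5))
```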
- FIG. 11 is a diagram showing the 0-th basis function (the lowest-frequency component of the corresponding transform kernel) of the DCT-II, DCT-V, DCT-VIII, DST-I, and DST-VII transforms defined in FIG. 9.
- A kernel such as DST-VII, whose 0-th basis function increases as the index j increases, may be more efficient for a residual pattern in which the energy of the residual signal increases as the distance in the horizontal and vertical directions from the upper-left coordinate of the block increases, as in intra prediction.
- For DCT-II, the 0-th basis function represents DC, and may be effective for a residual pattern having a uniform distribution of pixel values in the residual block, as in inter prediction.
- DCT-V is similar to DCT-II, but since the value when j is 0 is smaller than the value when j is not 0, it has a signal model in which a straight line is bent when j is 1.
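- The shapes of the 0-th basis functions can be checked numerically. The sketch below assumes the commonly published definitions of DCT-II, DST-VII, and DCT-VIII (FIG. 9 is not reproduced in this text, so agreement with it is an assumption) and covers only these three kernels; the function names are illustrative.

```python
import math

def dct2_basis0(n):
    """0-th DCT-II basis function: a constant (DC) signal."""
    return [math.sqrt(1.0 / n)] * n

def dst7_basis0(n):
    """0-th DST-VII basis function: grows with the sample index j."""
    return [math.sqrt(4.0 / (2 * n + 1)) * math.sin(math.pi * (j + 1) / (2 * n + 1))
            for j in range(n)]

def dct8_basis0(n):
    """0-th DCT-VIII basis function: shrinks with the sample index j."""
    return [math.sqrt(4.0 / (2 * n + 1)) * math.cos(math.pi * (2 * j + 1) / (4 * n + 2))
            for j in range(n)]

N = 8
for name, basis in (("DCT-II", dct2_basis0(N)),
                    ("DST-VII", dst7_basis0(N)),
                    ("DCT-VIII", dct8_basis0(N))):
    print(name, [round(v, 3) for v in basis])
```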
- multiple transform selection (MTS) technology is a transform coding method capable of improving encoding efficiency by adaptively selecting a transform kernel according to a prediction mode.
- FIG. 12 illustrates a transform kernel used in MTS and a transform set and transform kernel candidates defined according to a prediction mode according to an embodiment of the present invention.
- FIG. 12(a) shows the equation of the basis function constituting the DCT-II, DCT-VIII, and DST-VII kernels used in MTS.
- DCT and DST can be expressed as a function of cosine and sine, respectively.
- index i represents the index in the frequency domain
- index j represents the index within the basis function. That is, a smaller i corresponds to a lower-frequency basis function, and a larger i corresponds to a higher-frequency basis function.
- the basis function Ti(j) can represent the j-th element of the i-th row. Since all the transform kernels shown in FIG. 12(a) have separable characteristics, the transform can be performed on the residual signal X in the horizontal direction and the vertical direction, respectively. That is, when the residual signal block is denoted by X and the transform kernel matrix is denoted by T, the transform of the residual signal X may be expressed as TXT'. In this case, T' denotes the transpose of the transform kernel matrix T.
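- The separable transform TXT' can be illustrated with plain lists. This is a minimal sketch, assuming the commonly published orthonormal DCT-II definition for T; the helper names are illustrative and no integer approximation is applied.

```python
import math

def dct2_kernel(n):
    """Orthonormal DCT-II kernel; row i holds the basis function T_i(j)."""
    kernel = []
    for i in range(n):
        scale = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        kernel.append([scale * math.cos(math.pi * i * (2 * j + 1) / (2 * n))
                       for j in range(n)])
    return kernel

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def forward_transform(x, t):
    """T X T': T X transforms the columns (vertical), X T' the rows (horizontal)."""
    return matmul(matmul(t, x), transpose(t))

def inverse_transform(y, t):
    """T' Y T, which inverts the forward transform when T is orthonormal."""
    return matmul(matmul(transpose(t), y), t)

T = dct2_kernel(4)
X = [[5, 5, 5, 5] for _ in range(4)]   # flat residual block
Y = forward_transform(X, T)
X_rec = inverse_transform(Y, T)
print(round(Y[0][0], 6))
```

- For the flat block used here, all energy ends up in the single lowest-frequency coefficient, which is the energy compaction the text refers to.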
- Values of the transform matrix defined by the basis functions may be fractional (non-integer) values rather than integers. It may be difficult to implement fractional values in hardware in a video encoding device and a decoding device. Accordingly, a transform kernel approximated by integers from an original transform kernel including fractional values can be used for encoding and decoding a video signal.
- An approximated transform kernel including integer values may be generated through scaling and rounding of the original transform kernel.
- the integer value included in the approximated transform kernel may be a value within a range that can be represented by a preset number of bits. The preset number of bits may be 8-bit or 10-bit.
- As a result of the approximation, the orthonormal property of DCT and DST may not be maintained. However, since the resulting coding efficiency loss is not large, it may be advantageous in terms of hardware implementation to approximate the transform kernel in an integer form.
- FIGS. 12(b) and (c) are diagrams showing a transform set according to an intra prediction mode and a transform kernel candidate defined according to the transform set.
- The intra prediction modes consist of non-directional prediction {INTRA_PLANAR (mode 0), INTRA_DC (mode 1)} and directional prediction {INTRA_ANGULAR2, INTRA_ANGULAR3, ..., INTRA_ANGULAR66}, which use reconstructed reference samples around the coding unit, and may additionally include a wide-angle prediction mode applied to a rectangular block and a cross-component linear model (CCLM) for predicting a color difference component signal from the reconstructed luminance signal.
- A set of transform kernels that can be used is defined according to the prediction mode, and the transform candidate index used within the set is signaled, so that the decoder can perform the inverse transform by applying the optimal transform kernel found by the encoder.
- FIG. 12(b) shows the transform set index defined according to the prediction mode when 67 intra prediction modes are used, where V (vertical) denotes the transform set applied in the vertical direction and H (horizontal) denotes the transform set applied in the horizontal direction.
- Different transform sets may be used depending on the intra prediction mode, and transform sets applied to the horizontal and vertical directions in a specific prediction mode may be different.
- FIG. 12(c) shows a transform set used in intra prediction and a transform kernel candidate that can be used according to the transform set.
- All of transform sets 0, 1, and 2 consist of {DST-VII, DCT-VIII}. In other words, this can be interpreted as using one transform set regardless of the intra prediction mode (the same transform kernel candidates are used for all intra prediction modes). Alternatively, each transform set may be composed of different transform kernel candidates, as in AMT.
- The transform kernel matrix can be approximated from an original real-valued matrix to an integer form, which can be expressed with 8-bit or 10-bit precision. Since all of these transform kernels must be stored in memory in the encoder and decoder in advance, the memory burden of the encoder and decoder increases as the number of transform kernels increases. Therefore, the amount of memory required to store the transform kernels can be reduced by using only the kernels based on DCT-II, DCT-VIII, and DST-VII, which have the greatest effect on encoding efficiency.
- FIG. 12(d) shows a transform set used in inter prediction and a transform kernel candidate that can be used according to the transform set.
- In inter prediction, there is one transform set that can be used, namely transform set 0, which consists of {DST-VII, DCT-VIII}.
- MTS is applicable only to the luminance component, and the DCT-II transform can be used for the color difference component.
- On/off can be indicated with a 1-bit flag so that MTS can be controlled in units of coding units, and when this flag indicates off, the basic kernel DCT-II can be applied in both the horizontal and vertical directions, as for the color difference component.
- The inverse transform may be performed by signaling the transform candidate index used within the preset transform set according to the prediction mode and applying the transform kernel corresponding to the index in the decoder. Since different transforms can be applied in the horizontal and vertical directions, the transform index used can be indicated with a total of 2 bits, 1 bit for each direction.
- the transform index can be indicated using a truncated unary binarization method.
- the encoder can signal these to the decoder as follows.
- a 1-bit on/off flag for controlling the MTS and an index indicating the transform kernel may be signaled as one syntax element.
- mts_idx may be expressed as a binary code using a truncated unary binarization method, and may indicate a transform kernel applied to a horizontal direction and a vertical direction.
- When mts_idx is 0 (binary code 0), it may indicate that a basic kernel based on DCT-II is applied in both the horizontal direction and the vertical direction.
- When mts_idx is 1 (binary code 10), it may indicate that a kernel based on DST-VII is applied in both the horizontal direction and the vertical direction.
- When mts_idx is 2 (binary code 110), it may indicate that a kernel based on DCT-VIII is applied in the horizontal direction and a kernel based on DST-VII is applied in the vertical direction.
- When mts_idx is 3 (binary code 1110), it may indicate that a kernel based on DST-VII is applied in the horizontal direction and a kernel based on DCT-VIII is applied in the vertical direction.
- When mts_idx is 4 (binary code 1111), it may indicate that a kernel based on DCT-VIII is applied in both the horizontal direction and the vertical direction.
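- The mts_idx mapping and its truncated unary binarization can be sketched as follows. The table is taken from the five cases listed above; the function names and the use of bit strings instead of an entropy-coded bitstream are simplifying assumptions.

```python
# (horizontal kernel, vertical kernel) for each mts_idx value listed above.
MTS_KERNELS = {
    0: ("DCT-II", "DCT-II"),
    1: ("DST-VII", "DST-VII"),
    2: ("DCT-VIII", "DST-VII"),
    3: ("DST-VII", "DCT-VIII"),
    4: ("DCT-VIII", "DCT-VIII"),
}
C_MAX = 4  # largest mts_idx value

def truncated_unary(value, c_max=C_MAX):
    """value ones followed by a terminating zero, which is dropped when value == c_max."""
    bits = "1" * value
    if value < c_max:
        bits += "0"
    return bits

def parse_truncated_unary(bits, c_max=C_MAX):
    """Count leading ones, stopping at the first zero or after c_max ones."""
    value = 0
    while value < c_max and bits[value] == "1":
        value += 1
    return value

for idx in range(C_MAX + 1):
    print(idx, truncated_unary(idx), MTS_KERNELS[idx])
```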
- DST-IV discrete sine transform type-IV
- DCT-IV discrete cosine transform type-IV
- the DCT-II kernel for the number of samples 2N contains a DCT-IV kernel for the number of samples N
- Since the DST-IV kernel for N samples can be implemented from the DCT-IV kernel for N samples by a simple operation that includes sorting the basis functions in reverse order, DST-IV and DCT-IV for N samples can be simply derived from DCT-II for 2N samples.
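- The two relationships above can be verified numerically. The sketch assumes the commonly published orthonormal definitions of DCT-II, DCT-IV, and DST-IV (FIG. 13 is not reproduced here). Under those definitions the odd-indexed rows of the 2N-point DCT-II, restricted to the first N samples, equal the N-point DCT-IV up to a factor of sqrt(2), and DST-IV is obtained from DCT-IV by reversing each basis function and flipping the sign of every odd-indexed row; the sign flip is a property of these definitions and is stated here as such.

```python
import math

def dct2_kernel(n):
    kernel = []
    for i in range(n):
        scale = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        kernel.append([scale * math.cos(math.pi * i * (2 * j + 1) / (2 * n))
                       for j in range(n)])
    return kernel

def dct4_kernel(n):
    return [[math.sqrt(2.0 / n) * math.cos(math.pi * (2 * i + 1) * (2 * j + 1) / (4 * n))
             for j in range(n)] for i in range(n)]

def dst4_kernel(n):
    return [[math.sqrt(2.0 / n) * math.sin(math.pi * (2 * i + 1) * (2 * j + 1) / (4 * n))
             for j in range(n)] for i in range(n)]

def dct4_from_dct2(n):
    """Take odd rows and the first n columns of the 2n-point DCT-II, rescaled by sqrt(2)."""
    big = dct2_kernel(2 * n)
    return [[math.sqrt(2.0) * big[2 * k + 1][j] for j in range(n)] for k in range(n)]

def dst4_from_dct4(c4):
    """Reverse each basis function and negate the odd-indexed rows."""
    n = len(c4)
    return [[(-1) ** i * c4[i][n - 1 - j] for j in range(n)] for i in range(n)]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

print(max_abs_diff(dct4_from_dct2(4), dct4_kernel(4)))
print(max_abs_diff(dst4_from_dct4(dct4_kernel(4)), dst4_kernel(4)))
```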
- FIG. 13 shows the definitions of the DST-IV and DCT-IV basis functions according to an embodiment of the present invention, and the 0-th basis functions (lowest-frequency components) of DCT-II, DCT-IV, DCT-VIII, DST-IV, and DST-VII.
- The 0-th basis functions are shown for the case where N is 8 and i is 0 in Ti(j), the DCT/DST transform basis function defined in FIGS. 12(a) and 13(a).
- DST-IV and DST-VII have similar signal models: since the signal tends to increase as the index j increases, they may be more efficient for a residual pattern in which the energy of the residual signal increases as the distance in the horizontal and vertical directions from the upper-left coordinate of the block increases, as in intra prediction.
- DCT-IV and DCT-VIII have similar signal models: since the signal magnitude decreases as the index j increases, they may be effective for a residual pattern in which the energy of the residual signal decreases as the distance in the horizontal and vertical directions from the upper-left coordinate of the block increases.
- a transform kernel set can be constructed in consideration of this.
- DST-IV/DCT-IV tends to be able to represent the residual signal more efficiently than DST-VII/DCT-VIII, so the 4x4 DST-VII kernel can be replaced with a 4x4 DST-IV kernel, and the 4x4 DCT-VIII kernel with a 4x4 DCT-IV kernel.
- DST-VII and DCT-VIII described in FIG. 12 may be used.
- The residual signal, which is the difference between the original signal and the prediction signal, has the characteristic that its energy distribution varies depending on the prediction method.
- Accordingly, encoding efficiency can be improved when a transform kernel is adaptively selected according to the prediction method, as in AMT or MTS.
- a second-order transformation can improve energy compaction for an intra-predicted residual signal block in which strong energy is likely to exist in a direction other than the horizontal or vertical direction of the residual signal block.
- this second order transform may be referred to as a low frequency non-separable transform (LFNST).
- the first-order transform may be referred to as a core transform.
- the entropy coder may parse a syntax element related to a residual signal from a bitstream and obtain a quantization coefficient through inverse binarization.
- a transform coefficient may be obtained by performing inverse quantization on the restored quantization coefficient, and a residual signal block may be restored by performing inverse transform on the transform coefficient.
- the inverse transform may be applied to a block to which transform skip (TS) is not applied, and the inverse transform may be performed in the order of a second-order inverse transform and a first-order inverse transform in the decoder.
- The second-order inverse transform may be omitted under certain conditions; for example, it may be omitted when the current block is an inter prediction block.
- the second-order inverse transform may be omitted depending on the block size condition.
- the reconstructed residual signal includes a quantization error, and the second-order transform changes the energy distribution of the residual signal, thereby reducing the quantization error compared with the case where only the first-order transform is performed.
- FIG. 15 is a diagram illustrating a process of restoring a residual signal in a decoder performing quadratic transformation according to an embodiment of the present invention at a block level.
- Restoration of the residual signal may be performed in a transform unit (TU) or a sub-block unit within the TU.
- FIG. 15 illustrates a process of reconstructing a residual signal block to which a second-order transform is applied, and a second-order inverse transform may first be performed on an inverse-quantized transform coefficient block. It is possible to perform the second-order inverse transform on all WxH samples in the TU (W: width, number of horizontal samples; H: height, number of vertical samples), but in consideration of complexity, the second-order inverse transform may be performed only on the upper-left W'xH' sub-block, which is the most influential low-frequency region.
- W' is less than or equal to W, and H' is less than or equal to H.
- min(x, y) represents an operation that returns x when x is less than or equal to y, and returns y when x is greater than y.
- Whether to perform the second-order transform may be indicated in the form of a 1-bit flag included in at least one of the high level syntax (HLS) RBSPs, such as a sequence parameter set (SPS), a picture parameter set (PPS), a picture header, a slice header, or a tile group header. Additionally, when performing the quadratic transformation, the size of the upper left sub-block to be considered in the quadratic transformation may be indicated. For example, in the case of quadratic transformation considering sub-blocks of 4x4 and 8x8 sizes, whether or not a sub-block of 8x8 size can be used may be indicated by a 1-bit flag.
- whether to apply the second-order transformation at a coding unit (CU) level may be indicated by a 1-bit flag.
- An index indicating the transform kernel used for the quadratic transformation may be indicated, and the quadratic transformation may be performed using the transform kernel indicated by the corresponding index within a preset transform kernel set according to the prediction mode.
- the index representing the transform kernel can be binarized using truncated unary or fixed length binarization methods.
- the 1-bit flag indicating whether or not the second-order transformation is applied and the index indicating the transformation kernel may be indicated using one syntax element.
- In the present invention, this syntax element is referred to as st_idx, but the present invention is not limited to this name.
- the st_idx may be referred to as a second-order transform index or an LFNST index.
- the first bit of st_idx may indicate whether or not a second-order transformation is applied at the CU level, and the remaining bits may indicate an index indicating the transform kernel used in the second-order transformation.
- the st_idx may be encoded using an entropy coder such as context adaptive binary arithmetic coding (CABAC) or context adaptive variable length coding (CAVLC) that adaptively encodes according to a context.
- the quadratic transformation may not be applied, and st_idx, a syntax element related to the quadratic transformation, may be set to 0 without signaling. For example, when st_idx is 0, it may indicate that the quadratic transformation is not used. On the other hand, when st_idx is greater than 0, it may indicate that a quadratic transformation is applied, and a transformation kernel used for the quadratic transformation may be selected based on st_idx.
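- The st_idx semantics described above can be sketched as a small helper. The function name is hypothetical, and mapping a non-zero st_idx to kernel index st_idx - 1 is an assumption made only for illustration; the text states only that the kernel is selected based on st_idx.

```python
def interpret_st_idx(signaled_value=None):
    """Return (secondary_transform_applied, kernel_index) for one coding unit.

    signaled_value is None when st_idx is not present in the bitstream,
    in which case st_idx is inferred to be 0.
    """
    st_idx = 0 if signaled_value is None else signaled_value
    if st_idx == 0:
        return (False, None)      # quadratic transformation not used
    return (True, st_idx - 1)     # kernel index derivation is an assumption

print(interpret_st_idx(None), interpret_st_idx(0), interpret_st_idx(2))
```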
- a leaf node of a multi-type tree may be a coding unit.
- the corresponding coding unit may be used as a unit of prediction and/or transformation without further division.
- the current coding unit may be divided into a plurality of transform units without explicit signaling regarding segmentation.
- When the size of the coding unit is larger than the maximum transform size, it may be divided into a plurality of transform blocks without signaling. In this case, when the second-order transform is applied, performance may degrade and complexity may increase, and thus the maximum coding block (or the maximum size of the coding block) to which the second-order transform is applied may be limited.
- the size of the maximum coding block may be the same as the maximum transform size. Alternatively, it may be defined as the size of a preset coding block. As an embodiment, the preset value may be 64, 32, or 16, but the present invention is not limited thereto. In this case, a value to be compared with a preset value (or maximum transform size) may be defined as the length of a long side or the total number of samples.
- Since the DCT-II, DST-VII, and DCT-VIII kernels used in the first-order transform have separable characteristics, two transforms, one in the vertical direction and one in the horizontal direction, are performed on the samples in an NxN residual block, and the size of the transform kernel may be NxN.
- In contrast, when the transform kernel has a non-separable characteristic and the number of samples considered in the quadratic transformation is nxn, a single transform can be performed, and the size of the transform kernel may be (n^2)x(n^2).
- For example, when performing the quadratic transformation on the upper-left 4x4 coefficient block, a 16x16 transform kernel may be applied, and when performing the quadratic transformation on the upper-left 8x8 coefficient block, a 64x64 transform kernel can be applied. Since the 64x64 transform kernel involves a large number of multiplication operations, it can be a heavy burden on the encoder and decoder. Accordingly, when the number of samples considered in the second-order transformation is reduced, the amount of computation and the memory required for storing the transform kernel can be reduced.
- the quadratic transform may be expressed as a product of a quadratic transform kernel matrix and a first transformed coefficient vector, and may be interpreted as mapping a first transformed coefficient to another space.
- When the number of quadratic-transformed coefficients is reduced, that is, when the number of basis vectors constituting the quadratic transformation kernel is reduced, the amount of computation required for the quadratic transformation and the memory capacity required for storing the transform kernel can be reduced.
- For example, when performing the quadratic transformation on the upper-left 8x8 coefficient block, a transform kernel of size 16(row)x64(column) (or 16(row)x48(column)) can be applied.
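- The kernel sizes quoted above translate directly into multiplication counts for one matrix-vector product. The following sketch only does this arithmetic; the helper names are illustrative.

```python
def nonseparable_kernel_shape(n, out_coeffs=None, in_coeffs=None):
    """(rows, cols) of a non-separable kernel for an n x n sub-block.

    By default the kernel is (n^2) x (n^2); out_coeffs and in_coeffs model
    a reduced kernel with fewer basis vectors or fewer input samples.
    """
    rows = n * n if out_coeffs is None else out_coeffs
    cols = n * n if in_coeffs is None else in_coeffs
    return (rows, cols)

def multiplications(shape):
    """Multiplications needed for one kernel-matrix by coefficient-vector product."""
    rows, cols = shape
    return rows * cols

for shape in (nonseparable_kernel_shape(4),
              nonseparable_kernel_shape(8),
              nonseparable_kernel_shape(8, out_coeffs=16),
              nonseparable_kernel_shape(8, out_coeffs=16, in_coeffs=48)):
    print(shape, multiplications(shape))
```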
- the transform unit of the encoder may obtain a second-order transformed coefficient vector through an inner product of each row vector constituting the transform kernel matrix and a first-order transformed coefficient vector.
- the inverse transform unit of the encoder and the decoder may obtain a first-order transformed coefficient vector through a dot product of each column vector constituting the transform kernel matrix and a second-order transformed coefficient vector.
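- The row-wise forward product and column-wise inverse product can be sketched as follows. The 4x4 kernel used here is a scaled Hadamard matrix chosen only because it is orthonormal and easy to check by hand; it is a stand-in, not an actual secondary transform kernel, and the function names are illustrative.

```python
# Stand-in orthonormal kernel (scaled Hadamard matrix); NOT a real LFNST kernel.
KERNEL = [[0.5, 0.5, 0.5, 0.5],
          [0.5, -0.5, 0.5, -0.5],
          [0.5, 0.5, -0.5, -0.5],
          [0.5, -0.5, -0.5, 0.5]]

def forward_secondary(kernel, x):
    """Inner product of each kernel row with the first-order transformed vector."""
    return [sum(k * v for k, v in zip(row, x)) for row in kernel]

def inverse_secondary(kernel, y):
    """Inner product of each kernel column with the second-order transformed vector."""
    return [sum(kernel[r][c] * y[r] for r in range(len(kernel)))
            for c in range(len(kernel[0]))]

x = [4, 2, 0, -2]
y_full = forward_secondary(KERNEL, x)
x_back = inverse_secondary(KERNEL, y_full)

reduced = KERNEL[:2]                       # keep only the first two basis vectors
y_reduced = forward_secondary(reduced, x)
x_approx = inverse_secondary(reduced, y_reduced)
print(y_full, x_back, y_reduced, x_approx)
```

- With the reduced kernel the output vector is shorter and the inverse returns a vector of the original length, but it is only an approximation of the input.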
- The encoder may first perform a forward primary transform on a residual signal block to obtain a first-order transformed coefficient block.
- When the size of the first-order transformed coefficient block is MxN, for an intra-predicted block in which min(M, N) is 4, a 4x4 forward secondary transform may be performed on the upper-left 4x4 samples of the first-order transformed coefficient block.
- an 8x8 quadratic transformation may be performed on the upper-left 8x8 samples of the first-order transformed coefficient block.
- A 4x4 quadratic transform may be performed on each of the two upper-left 4x4 sub-blocks in the first-order transformed coefficient block.
- the coefficients in the upper-left sub-block of the first-order transformed coefficient block can be arranged in vector form.
- The method of constructing the vector may depend on the intra prediction mode. For example, if the intra prediction mode is less than or equal to the 34th angular mode among the intra prediction modes shown in FIG. 6, the coefficients may be arranged as a vector by scanning the upper-left sub-block of the first-order transformed coefficient block in the horizontal direction.
- In this case, the vectorized coefficients can be expressed as [x_{0,0}, x_{0,1}, ..., x_{0,n-1}, x_{1,0}, x_{1,1}, ..., x_{1,n-1}, ..., x_{n-1,0}, x_{n-1,1}, ..., x_{n-1,n-1}].
- Otherwise, the coefficients may be arranged as a vector by scanning the upper-left sub-block of the first-order transformed coefficient block in the vertical direction.
- In this case, the vectorized coefficients can be expressed as [x_{0,0}, x_{1,0}, ..., x_{n-1,0}, x_{0,1}, x_{1,1}, ..., x_{n-1,1}, ..., x_{0,n-1}, x_{1,n-1}, ..., x_{n-1,n-1}].
- the coefficient x_ij with i>3 and j>3 in the above-described vector construction method may not be included.
- 16 first-order transformed coefficients may be considered as inputs of the quadratic transformation
- 48 first-order transformed coefficients may be considered as inputs of the quadratic transformation.
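- The vector construction described above can be sketched as follows. The function name is hypothetical; the mode threshold of 34 and the exclusion of coefficients with i>3 and j>3 follow the text, while applying that exclusion whenever n is 8 is an assumption controlled by a parameter.

```python
def vectorize_subblock(block, n, intra_mode, drop_bottom_right=True):
    """Arrange the upper-left n x n coefficients x[i][j] (row i, column j) as a vector.

    Horizontal (row-by-row) scan for intra_mode <= 34, vertical
    (column-by-column) scan otherwise. For n == 8 the coefficients with
    i > 3 and j > 3 can be left out, giving 48 inputs instead of 64.
    """
    horizontal = intra_mode <= 34
    vec = []
    for a in range(n):
        for b in range(n):
            i, j = (a, b) if horizontal else (b, a)
            if drop_bottom_right and n == 8 and i > 3 and j > 3:
                continue
            vec.append(block[i][j])
    return vec

# Test block where each value encodes its own position as 10 * row + column.
block = [[10 * i + j for j in range(8)] for i in range(8)]
print(vectorize_subblock(block, 4, intra_mode=18))
print(vectorize_subblock(block, 4, intra_mode=50))
print(len(vectorize_subblock(block, 8, intra_mode=18)))
```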
- The second-order transformed coefficients can be obtained by multiplying the vectorized upper-left sub-block samples of the first-order transformed coefficient block by a second-order transform kernel matrix, and the second-order transform kernel may be determined according to the size of the transform unit, the intra mode, and a syntax element indicating the transform kernel. As described above, when the number of coefficients to be quadratic transformed is reduced, the amount of calculation and the memory required for storing the transform kernel may be reduced, and thus the number of coefficients to be quadratic transformed may be determined according to the size of the current transform block. For example, in the case of a 4x4 block, a coefficient vector of length 8 may be obtained by multiplying a vector of length 16 and an 8 (row)x16 (column) transform kernel matrix.
- the 8 (row) x 16 (column) transform kernel matrix may be obtained based on the eighth basis vector from the first basis vector constituting the 16 (row) x 16 (column) transform kernel matrix.
- a coefficient vector having a length of 16 may be obtained by multiplying a vector having a length of 16 and a 16 (row)x16 (column) transform kernel matrix.
- a coefficient vector of length 8 may be obtained through the product of a vector of length 48 and an 8 (row)x48 (column) transform kernel matrix.
- the 8 (row) x 48 (column) transform kernel matrix may be obtained based on the eighth basis vector from the first basis vector constituting the 16 (row) x 48 (column) transform kernel matrix.
- a coefficient vector having a length of 16 may be obtained by multiplying a vector having a length of 48 and a 16 (row) x 48 (column) transform kernel matrix.
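- The kernel shapes listed above can be summarized in a small selection helper. This is a sketch under stated assumptions: the function names are hypothetical, 8 output coefficients are assumed for 4x4 and 8x8 blocks and 16 otherwise, and 16 inputs are assumed when min(width, height) is 4 and 48 inputs for larger blocks. The text does not spell out every one of these conditions next to each kernel size.

```python
def secondary_kernel_shape(width, height):
    """Return (rows, cols) = (output coefficients, input coefficients)."""
    in_coeffs = 16 if min(width, height) == 4 else 48              # assumed rule
    out_coeffs = 8 if (width, height) in ((4, 4), (8, 8)) else 16  # assumed rule
    return (out_coeffs, in_coeffs)

def reduce_kernel(full_kernel, rows):
    """Keep the first `rows` basis vectors (rows) of a larger kernel matrix."""
    return [list(r) for r in full_kernel[:rows]]

for size in ((4, 4), (8, 8), (4, 16), (16, 16)):
    print(size, secondary_kernel_shape(*size))

# Dummy 16x16 matrix standing in for a kernel; the values only encode positions.
dummy = [[100 * r + c for c in range(16)] for r in range(16)]
small = reduce_kernel(dummy, 8)
print(len(small), len(small[0]))
```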
- Since the second-order transformed coefficients are in the form of a vector, they may be expressed again as data in a two-dimensional form.
- The second-order transformed coefficients may be arranged into the upper-left coefficient sub-block according to a preset scan order.
- the preset scan order may be an up-right diagonal scan order.
- the present invention is not limited thereto, and the up-right diagonal scan order may be determined based on the methods described in FIGS. 17 and 18 to be described later.
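- Placing a coefficient vector into a sub-block along an up-right diagonal scan can be sketched as follows. The scan generated here follows the HEVC-style convention in which each anti-diagonal is traversed from its bottom-left end to its top-right end; whether this matches the order of FIGS. 17 and 18 is an assumption, and the function names are illustrative.

```python
def up_right_diagonal_scan(size):
    """List of (row, col) positions, one anti-diagonal at a time, bottom-left to top-right."""
    order = []
    for d in range(2 * size - 1):
        for row in range(min(d, size - 1), -1, -1):
            col = d - row
            if col < size:
                order.append((row, col))
    return order

def place_coefficients(vec, size):
    """Write vec into a size x size block along the scan; remaining positions stay 0."""
    block = [[0] * size for _ in range(size)]
    for value, (row, col) in zip(vec, up_right_diagonal_scan(size)):
        block[row][col] = value
    return block

block = place_coefficients([1, 2, 3, 4, 5, 6, 7, 8], 4)
for row in block:
    print(row)
```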
- transform coefficients of a total transform unit size including quadratic transformed coefficients may be included in a bitstream and transmitted after quantization.
- the bitstream may include syntax elements related to second-order transformation.
- the bitstream may include information on whether a second-order transform is applied to the current block and information indicating a transform kernel applied to the current block.
- the decoder may first parse the quantized transform coefficients from the bitstream, and obtain transform coefficients through de-quantization. Inverse-quantization may be referred to as scaling.
- the decoder may determine whether to perform a second-order inverse transform on the current block based on a syntax element related to the second-order transform. When the second-order inverse transform is applied to the current transform unit, depending on the size of the transform unit, 8 or 16 transform coefficients can be input to the second-order inverse transform, which may be the same as the number of coefficients output from the second-order transform of the encoder.
- For example, when the size of the transform unit is 4x4 or 8x8, 8 transform coefficients may be inputs of the second-order inverse transform; otherwise, 16 transform coefficients may be inputs of the second-order inverse transform.
- When the size of the transform coefficient block is MxN, a 4x4 quadratic inverse transform may be performed on 16 or 8 coefficients of the upper-left 4x4 sub-block of the transform coefficient block for an intra-predicted block in which min(M, N) is 4.
- an 8x8 quadratic inverse transformation may be performed on 16 or 8 coefficients of the upper left 4x4 subblock of the transform coefficient block.
- When min(M, N) is 4 and M or N is greater than 8 (for example, a rectangular block having a size of 4x16 or 16x4), a 4x4 quadratic inverse transformation may be performed on each of the two upper-left 4x4 sub-blocks of the transform coefficient block.
- Since the second-order inverse transform can be calculated as a product of the second-order inverse transform kernel matrix and the input vector, the decoder may first arrange the input inverse-quantized transform coefficient block in vector form according to a preset scan order.
- The preset scan order may be an up-right diagonal scan order, but the present invention is not limited thereto, and the up-right diagonal scan order may be determined based on the methods described in FIGS. 17 and 18 to be described later.
- the decoder may obtain a first-order transformed coefficient by multiplying a vectorized transform coefficient and a second-order inverse transform kernel matrix.
- the second-order inverse transform kernel may be determined according to the size of the transform unit, the intra mode, and a syntax element indicating the transform kernel.
- the second-order inverse transform kernel matrix may be a transposed matrix of a second-order transform kernel matrix, and the elements of the kernel matrix may be integers expressed with 10-bit or 8-bit accuracy in consideration of implementation complexity.
- the length of the vector used as the output of the second-order inverse transform may be determined based on the size of the current transform block.
- a coefficient vector of length 16 may be obtained by multiplying a vector of length 8 and an 8(row)x16(column) transform kernel matrix.
- the 8 (row) x 16 (column) transform kernel matrix may be obtained based on the eighth basis vector from the first basis vector constituting the 16 (row) x 16 (column) transform kernel matrix.
- a coefficient vector having a length of 16 may be obtained by multiplying a vector having a length of 16 and a 16 (row)x16 (column) transform kernel matrix.
- a coefficient vector of length 48 may be obtained by multiplying a vector of length 8 and an 8 (row)x48 (column) transform kernel matrix.
- the 8 (row) x 48 (column) transform kernel matrix may be obtained based on the eighth basis vector from the first basis vector constituting the 16 (row) x 48 (column) transform kernel matrix.
- a coefficient vector having a length of 48 may be obtained by multiplying a vector having a length of 16 and a 16 (row)x48 (column) transform kernel matrix.
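The vector-by-matrix products described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: a row-vector-times-matrix convention, and a toy identity kernel in place of the actual kernel values, which are not given here. The names `inverse_secondary_transform` and `toy_kernel` are hypothetical.

```python
def inverse_secondary_transform(u, kernel):
    """Multiply a length-N input vector by the first N rows of a
    (rows x columns) kernel matrix, producing one value per column."""
    n_in = len(u)
    if n_in > len(kernel):
        raise ValueError("input is longer than the number of basis vectors")
    n_out = len(kernel[0])
    # Using only the first n_in rows corresponds to the 8 x 16 (or 8 x 48)
    # matrix built from the first to the eighth basis vectors.
    return [sum(u[j] * kernel[j][i] for j in range(n_in)) for i in range(n_out)]


# Toy 16 x 16 kernel (identity), used only to make the shapes visible.
toy_kernel = [[1 if r == c else 0 for c in range(16)] for r in range(16)]
out8 = inverse_secondary_transform(list(range(1, 9)), toy_kernel)
out16 = inverse_secondary_transform(list(range(1, 17)), toy_kernel)
```

With a 16 (row) x 48 (column) kernel the same function would return 48 values for either an 8-element or a 16-element input.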
- the decoder may express it again as data in a two-dimensional form, which may depend on the intra mode.
- the mapping relationship based on the intra mode applied by the encoder can be applied in the same way.
- when the intra prediction mode is less than the 34th angular mode, the second-order inverse transformed coefficient vector can be scanned in the horizontal direction to obtain a two-dimensional transform coefficient array.
- a two-dimensional inverse transformed coefficient vector may be scanned in a vertical direction to obtain a two-dimensional transform coefficient array.
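The horizontal and vertical placements mentioned above might look like the following sketch. It assumes a square block with `size * size` output coefficients (the 4x4 case with 16 outputs) and takes the direction as a boolean, leaving the mode-dependent decision to the caller. The function name `vector_to_block` is hypothetical, and the 48-output case is not covered.

```python
def vector_to_block(v, size, horizontal):
    """Place a length size*size vector into block[y][x].

    horizontal=True fills row by row; horizontal=False fills column by column.
    """
    if len(v) != size * size:
        raise ValueError("vector length must equal size * size")
    block = [[0] * size for _ in range(size)]
    for k, value in enumerate(v):
        if horizontal:
            y, x = divmod(k, size)
        else:
            x, y = divmod(k, size)
        block[y][x] = value
    return block


rows = vector_to_block(list(range(16)), 4, horizontal=True)
cols = vector_to_block(list(range(16)), 4, horizontal=False)
```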
- a residual signal may be obtained by performing a first-order inverse transform on a transform coefficient block having a total transform unit size including transform coefficients obtained by performing a second-order inverse transform.
- a scaling process using a bit shift operation may be included in order to correct a scale that increases due to a transform kernel after transform or inverse transform.
- FIG. 17 is a diagram illustrating a method of determining an up-right diagonal scan order according to an embodiment of the present invention.
- a process of initializing a scan order may be performed during encoding or decoding.
- An array including scan order information may be initialized according to the block size.
- the process of initializing the up-right diagonal scan order array shown in FIG. 17 may be called (or performed) for each combination of log2BlockWidth and log2BlockHeight, with 1 << log2BlockWidth and 1 << log2BlockHeight as inputs.
- the output of the initializing process of the upper-right diagonal scan order arrangement may be allocated to DiagScanOrder[log2BlockWidth][log2BlockHeight].
- log2BlockWidth and log2BlockHeight are variables representing the base-2 logarithm of the width and height of the block, respectively, and may be values in the range [0, 4].
- the encoder/decoder may output the array diagScan[sPos][sComp] for blkWidth, which is the width of the received block, and blkHeight, which is the height of the block.
- the array index sPos may indicate the scan position and may be a value in the range of [0, blkWidth*blkHeight-1].
- when sComp, which is an index of the array, is 0, it may represent a horizontal component (x), and when sComp is 1, it may represent a vertical component (y).
- it can be interpreted that the x-coordinate and y-coordinate values on the two-dimensional coordinates at the scan position sPos are assigned to diagScan[sPos][0] and diagScan[sPos][1], respectively, according to the up-right diagonal scan order.
- the value stored in the DiagScanOrder[log2BlockWidth][log2BlockHeight][sPos][sComp] array may mean the coordinate value corresponding to sComp at the scan position sPos in the up-right diagonal scan order of a block whose width and height are 1 << log2BlockWidth and 1 << log2BlockHeight, respectively.
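The initialization described above can be sketched as follows. FIG. 17 is not reproduced in this text, so the loop below is an assumption: it follows the conventional up-right diagonal scan, in which each anti-diagonal is walked from its bottom-left end to its top-right end. The names `init_diag_scan` and `diag_scan_order` are hypothetical stand-ins for the process and for DiagScanOrder.

```python
def init_diag_scan(blk_width, blk_height):
    """Return a list of (x, y) pairs in up-right diagonal scan order."""
    scan = []
    x = y = 0
    total = blk_width * blk_height
    while len(scan) < total:
        # Walk one anti-diagonal from bottom-left to top-right,
        # skipping positions that fall outside the block.
        while y >= 0:
            if x < blk_width and y < blk_height:
                scan.append((x, y))
            y -= 1
            x += 1
        # The next diagonal starts one row lower, at the left edge.
        y = x
        x = 0
    return scan


# One table per (log2BlockWidth, log2BlockHeight) pair in the range [0, 4].
diag_scan_order = {
    (lw, lh): init_diag_scan(1 << lw, 1 << lh)
    for lw in range(5)
    for lh in range(5)
}
```

Here `diag_scan_order[(2, 2)][sPos]` plays the role of the pair (DiagScanOrder[2][2][sPos][0], DiagScanOrder[2][2][sPos][1]).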
- FIG. 18 is a diagram illustrating an upper-right diagonal scan order according to a block size according to an embodiment of the present invention.
- when both log2BlockWidth and log2BlockHeight are 2, it may mean a block having a size of 4x4.
- when both log2BlockWidth and log2BlockHeight are 3, it may mean a block having a size of 8x8.
- a number displayed in a gray shaded area indicates a scan position sPos.
- the x coordinate value and y coordinate value at the sPos position may be assigned to DiagScanOrder[log2BlockWidth][log2BlockHeight][sPos][0], DiagScanOrder[log2BlockWidth][log2BlockHeight][sPos][1], respectively.
- the encoder/decoder may code transform coefficient information based on the above-described scan order.
- an embodiment based on the case where the upper right scanning method is used is mainly described, but the present invention is not limited thereto, and other known scanning methods may also be applied.
- the following variable may be input in the transform process of the second-order transformation.
- -(xTbY, yTbY) indicates the position (or coordinates) of the upper left luma sample of the current luma transform block, and may be a position relative to the upper left luma sample of the current picture.
- -nTbW and nTbH represent the width and height of the current transform block, respectively.
- -cIdx Represents a variable indicating the color component of the current block, and when cIdx is 0, it may mean luma Y, when it is 1, it can mean chroma Cb, and when it is 2, it can mean chroma Cr.
- -d[x][y] An array of (nTbW)x(nTbH) size, indicating a transform coefficient array.
- x may be in the range of [0, nTbW-1]
- y may be in the range of [0, nTbH-1].
- the transform process according to the present embodiment may output r[x][y], which is an array of residual samples of (nTbW)x(nTbH) size, where x is in the range [0, nTbW-1] and y is in the range [0, nTbH-1].
- whether the second-order transform is applied to the current block may be determined according to a value of a syntax element st_idx[xTbY][yTbY] indicating a second-order transform index (or LFNST index). For example, when the value of st_idx[xTbY][yTbY] is greater than 0, the decoding process related to the quadratic transformation may be performed. When the value of st_idx[xTbY][yTbY] is 0, the quadratic transformation is not performed (or applied), and only the first-order transformation can be performed.
- the decoding process related to the quadratic transformation may not be performed according to the currently processed color component.
- when the second-order transform is applied, a residual sample may be obtained only after both the second-order inverse transform and the first-order inverse transform are performed in the decoder, so the delay time may increase compared to when only the first-order inverse transform is applied.
- the delay time generated by performing the quadratic transformation is the largest in a single tree coding structure in which both luma and chroma components can exist (a structure in which luma and chroma components are coded with the same coding tree).
- the second-order inverse transform may not be applied.
- a variable related to the conversion process may be set as follows.
- log2StSize may be set to 3 and nStOutSize may be set to 48. Otherwise, log2StSize may be set to 2 and nStOutSize may be set to 16.
- log2StSize is a variable representing a value obtained by taking the logarithm of the base 2 to the size to which the quadratic transformation is applied. When log2StSize is 2, it may indicate that 4x4 quadratic transformation is applied, and when log2StSize is 3, it may indicate that 8x8 quadratic transformation is applied.
- nStOutSize is a variable representing the number of samples output by the quadratic transformation.
- -nStSize may be set to (1 << log2StSize).
- nStSize is a variable indicating the size to which the quadratic transformation is applied.
- -log2SbSize is a variable representing the size of a sub-block and may be set to 2.
- variable numStX may be set to 2, otherwise it may be set to 1.
- numStX is a variable indicating the number of sub-blocks in the horizontal direction to be input for the quadratic transformation.
- variable numStY may be set to 2, otherwise it may be set to 1.
- numStY is a variable indicating the number of sub-blocks in the vertical direction to be input for the quadratic transformation.
- when both nTbW and nTbH are 4, or when both nTbW and nTbH are 8 (that is, for a 4x4 or 8x8 block), nonZeroSize may be set to 8; otherwise, it may be set to 16.
- nonZeroSize is a variable indicating the size of a coefficient vector used as an input for the quadratic transformation.
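The variable settings above can be collected into a small helper. This is a sketch under stated assumptions: the condition that selects log2StSize is not spelled out in this text, so log2StSize is taken as an input rather than derived, and numStX and numStY are omitted for the same reason. The function name `derive_st_sizes` is hypothetical.

```python
def derive_st_sizes(n_tb_w, n_tb_h, log2_st_size):
    """Derive nStSize, nStOutSize and nonZeroSize for one transform block."""
    if log2_st_size not in (2, 3):
        raise ValueError("log2StSize is expected to be 2 (4x4) or 3 (8x8)")
    n_st_size = 1 << log2_st_size
    # 48 output samples for the 8x8 case, 16 for the 4x4 case.
    n_st_out_size = 48 if log2_st_size == 3 else 16
    # 4x4 and 8x8 blocks use 8 input coefficients, all others use 16.
    is_4x4 = n_tb_w == 4 and n_tb_h == 4
    is_8x8 = n_tb_w == 8 and n_tb_h == 8
    non_zero_size = 8 if (is_4x4 or is_8x8) else 16
    return {
        "log2SbSize": 2,
        "nStSize": n_st_size,
        "nStOutSize": n_st_out_size,
        "nonZeroSize": non_zero_size,
    }


sizes_4x4 = derive_st_sizes(4, 4, 2)
sizes_16x16 = derive_st_sizes(16, 16, 3)
```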
- the following processes may be applied for xSbIdx, which is a subblock index in the horizontal direction, from 0 to numStX-1, and for ySbIdx, which is a subblock index in the vertical direction, from 0 to numStY-1.
- the present invention is not limited thereto, and the following process may be applied in the same manner even when the quadratic transform is applied only to one upper left sub-block (or a predetermined specific region on the upper left).
- the array u[x] is a coefficient vector used as an input of the quadratic transformation, and x may be in the range of [0, nonZeroSize-1].
- d[xC][yC], which is a scaled transform coefficient (or inverse quantized transform coefficient) that is an input to the transformation process according to the present embodiment, is a two-dimensional array, and the decoder can allocate nonZeroSize samples among the total samples of d[xC][yC] to u[x] according to the scan order.
- nonZeroSize is a variable representing the length of the input vector or the maximum number of significant coefficients that the input vector can contain.
- x which is the index of the array u, may mean a scan position in the scan order.
- the decoder may obtain the x coordinate value in the subblock for the scan position x through the value of DiagScanOrder[log2SbSize][log2SbSize][x][0], and determine xC based thereon.
- the decoder may determine the x coordinate value in the subblock for the scan position x as xC through the value of DiagScanOrder[log2SbSize][log2SbSize][x][0].
- the decoder may determine xC by adding (xSbIdx << log2StSize) to the upper-left x coordinate value of the subblock.
- a y-coordinate value in the subblock for the scan position x may be obtained through the value of DiagScanOrder[log2SbSize][log2SbSize][x][1], and yC may be determined based thereon.
- the decoder may determine the y coordinate value in the subblock for the scan position x as yC through the value of DiagScanOrder[log2SbSize][log2SbSize][x][1].
- yC may be determined by adding (ySbIdx << log2StSize) to the upper-left y coordinate value of the sub-block.
- the decoder can allocate d[xC][yC] to u[x] from 0 to nonZeroSize-1.
- the scan order may be an up-right diagonal scan order, and the method described above with reference to FIGS. 17 and 18 may be applied.
- the array u[x] may be determined (or derived) based on Equation 1 below.
- log2SbSize of Equation 1 may be defined (or set) as 2.
- the array u[x] may be determined (or derived) based on Equation 2 below.
- log2SbSize of Equation 2 may be defined (or set) as 2.
- the decoder may set the factor of DiagScanOrder to [log2StSize][log2StSize] to determine xC and yC.
- a problem arises that a transform coefficient not located in a top-left sub-block is allocated as an input for a quadratic transform according to the size of the transform block.
- when the factor of DiagScanOrder is set to [log2SbSize][log2SbSize]
- that is, d[xC][yC] in the upper-left sub-block can be allocated as u[x]
- the above-described problem can be solved.
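The difference between indexing the scan table with log2StSize and with log2SbSize can be checked with a short sketch. The helper names (`up_right_diag_scan`, `input_positions`, `gather_input`) are hypothetical, the scan loop assumes the conventional up-right diagonal order, and the 16x16 coefficient array is filled with made-up values purely for illustration.

```python
def up_right_diag_scan(size):
    """(x, y) pairs of a size x size block in up-right diagonal order."""
    scan, x, y = [], 0, 0
    while len(scan) < size * size:
        while y >= 0:
            if x < size and y < size:
                scan.append((x, y))
            y -= 1
            x += 1
        y, x = x, 0
    return scan


def input_positions(x_sb_idx, y_sb_idx, log2_st_size, non_zero_size, log2_scan_size):
    """Coordinates (xC, yC) read for u[0] .. u[nonZeroSize-1]."""
    scan = up_right_diag_scan(1 << log2_scan_size)
    return [((x_sb_idx << log2_st_size) + scan[x][0],
             (y_sb_idx << log2_st_size) + scan[x][1])
            for x in range(non_zero_size)]


def gather_input(d, positions):
    """u[x] = d[xC][yC] for each scan position x."""
    return [d[xc][yc] for xc, yc in positions]


# 8x8 secondary transform (log2StSize = 3) with 16 input coefficients.
with_sb_size = input_positions(0, 0, 3, 16, log2_scan_size=2)  # log2SbSize
with_st_size = input_positions(0, 0, 3, 16, log2_scan_size=3)  # log2StSize
d = [[16 * xc + yc for yc in range(16)] for xc in range(16)]   # illustrative data
u = gather_input(d, with_sb_size)
```

In this sketch, the 8x8 scan table reaches positions such as (0, 4) within the first 16 scan positions, whereas the 4x4 table keeps every position inside the upper-left 4x4 sub-block.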
- the array u[x] set in this way (the range of x is [0, nonZeroSize-1]) may be transformed into the array v[x] (the range of x is [0, nStOutSize-1]) by the second-order transform process of FIG. 19 to be described later.
- the second-order transformation process can receive as inputs nonZeroSize as the transform input length, nStOutSize as the transform output length, u[x] as the inverse quantized transform coefficients (the length is nonZeroSize, and the range of x is [0, nonZeroSize-1]), stPredModeIntra, which is the intra prediction mode of the current block, and st_idx[xTbY][yTbY], and can output the transformed coefficients v[x] (the length is nStOutSize, and the range of x is [0, nStOutSize-1]).
- v[x], which is the output of the quadratic transformation process, can be allocated to d[(xSbIdx << log2StSize)+x][(ySbIdx << log2StSize)+y] according to the intra prediction mode, where both x and y may be in the range [0, nStSize-1].
- FIG. 19 is a diagram illustrating an example of a second-order conversion process according to an embodiment of the present invention. Referring to FIG. 19, in the second-order conversion process according to an embodiment of the present invention, the following variables may be input.
- -nTrS represents a variable indicating the conversion output length.
- -nonZeroSize represents a variable indicating the conversion input length.
- Array x[j] represents a conversion input, and j may be in the range of [0, nonZeroSize-1].
- -stPredModeIntra represents a variable indicating the intra prediction mode of the current block, and can be used to determine the index of the transform kernel set.
- a specific transform kernel set may be determined based on stPredModeIntra, and a specific transform kernel within that transform kernel set may be selected based on stIdx. That is, stIdx is an index indicating the specific transform kernel used for the second-order transform of the current block within the transform kernel set determined based on stPredModeIntra.
- y[i], which is an array of transform output samples, may be output, and the range of i may be [0, nTrS-1].
- a transform matrix derivation process of FIG. 20 to be described later may be performed first.
- nTrS indicating a transform output length
- an index stIdx indicating a transform kernel in the transform kernel set
- secTransMatrix which is a transform kernel matrix
- the output secTransMatrix may be (nTrS)x (nonZeroSize) size, and the element of secTransMatrix may be an integer.
- the i-th element of the transform output, y[i], may be calculated as the dot product of the i-th column of secTransMatrix and the transform input array x.
- the calculation result may be clipped to a value between the minimum coefficient value CoeffMin and the maximum coefficient value CoeffMax through the clipping operation.
- the transform coefficient may be expressed with a preset precision, and the preset precision may be 16 bits.
- CoeffMin and CoeffMax may be set to -(2^16) and (2^16)-1, respectively.
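The per-element computation described above (a dot product with one matrix column, followed by clipping) can be sketched as follows. The assumptions are: the matrix is stored so that `sec_trans_matrix[j][i]` is the entry for input j and output i, the clipping bounds are passed in by the caller, and the optional rounding shift stands in for the scaling step mentioned earlier with a shift amount that is not specified in this text. All names are hypothetical.

```python
def clip3(lo, hi, value):
    """Clamp value to the closed range [lo, hi]."""
    return max(lo, min(hi, value))


def secondary_transform(x, sec_trans_matrix, n_tr_s, coeff_min, coeff_max, shift=0):
    """Compute y[i] for i in [0, nTrS-1] from the input array x."""
    non_zero_size = len(x)
    rounding = (1 << (shift - 1)) if shift > 0 else 0
    y = []
    for i in range(n_tr_s):
        # Dot product of the i-th column with the input array.
        acc = sum(sec_trans_matrix[j][i] * x[j] for j in range(non_zero_size))
        y.append(clip3(coeff_min, coeff_max, (acc + rounding) >> shift))
    return y


toy_matrix = [[1, 2, 3],
              [4, 5, 6]]          # 2 inputs, 3 outputs (illustrative values)
y_plain = secondary_transform([10, 100], toy_matrix, 3, -1000, 1000)
y_clipped = secondary_transform([10, 100], toy_matrix, 3, -600, 600)
y_scaled = secondary_transform([10, 100], toy_matrix, 3, -1000, 1000, shift=1)
```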
- FIG. 20 is a diagram illustrating a process of deriving a second-order transform matrix according to an embodiment of the present invention. According to an embodiment of the present invention, in the process of deriving a second-order transformation matrix, the following variables may be input.
- -nTrS represents a variable indicating the conversion output length.
- -stPredModeIntra represents a variable indicating the intra prediction mode of the current block, and can be used to determine the index of the transform kernel set.
- -stIdx indicates an index indicating a transform kernel in the selected transform kernel set.
- the determination of the quadratic transform kernel set according to an embodiment of the present invention may depend on an intra mode (or intra prediction mode). For example, as shown in the table shown in FIG. 20, the encoder/decoder may group the intra mode into four mode groups.
- the conversion kernel set for each group may be indicated (or allocated) by stTrSetIdx. Also, whether it is a 4x4 quadratic conversion kernel or an 8x8 quadratic conversion kernel may be indicated by an nTrS variable.
- nTrS is a variable indicating the conversion output length.
- the stIdx-th transform kernel matrix among the transform kernels of the size indicated by the nTrS variable may be output as secTransMatrix.
- nTrS may be 16 or 48.
- stPredModeIntra is 1, and the 0th transform kernel set may be selected.
- a first transform kernel matrix among 4x4 quadratic transform kernels may be output.
- stTrSetIdx may be determined to be 0 instead of 1, which may indicate that the 0 th transform kernel set is used.
- intra modes may be grouped according to whether the intra mode is odd or even.
- stTrSetIdx may be assigned as 2.
- when stPredModeIntra is one of 0, 1, 81, 82, or 83, stTrSetIdx may be assigned as 0.
- the encoder/decoder may differently allocate stTrSetIdx for CCLM modes.
- in INTRA_L_CCLM, a linear relationship between the reconstructed chroma samples on the left adjacent to the current block and the corresponding luma samples is derived and used for prediction, so the reconstructed samples on the left adjacent to the current block are used as reference samples.
- the pattern of the residual signal may be similar to that of the INTRA_ANGULAR_18 (18) mode.
- the encoder/decoder may set stTrSetIdx to 0 when stPredModeIntra is 81 (INTRA_LT_CCLM), and stTrSetIdx to 2 when stPredModeIntra is 82 (INTRA_L_CCLM) or 83 (INTRA_T_CCLM).
- all intra modes may use the same transform kernel set. That is, the used conversion kernel is not dependent on the intra mode, and may be determined by nTrS and stIdx, and the stIdx-th conversion kernel may be selected from among the conversion kernels having a size indicated by nTrS.
- the transform kernel set may be determined based on the transform kernel applied to the first transform, not based on the intra mode. For example, when DST-VII is applied to both horizontal and vertical directions, stTrSetIdx may be set to 1. When DST-VII is applied only to one of the horizontal and vertical directions, stTrSetIdx may be set to 2. Otherwise, stTrSetIdx may be set to 0.
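Two of the alternative stTrSetIdx assignments described above can be written down directly. This is a minimal sketch: the function names and the boolean parameters are hypothetical, and the four-group table of FIG. 20 is not reproduced in this text, so it is not modeled here.

```python
INTRA_LT_CCLM = 81
INTRA_L_CCLM = 82
INTRA_T_CCLM = 83


def st_tr_set_idx_for_cclm(st_pred_mode_intra):
    """Alternative in which the CCLM modes get different kernel sets."""
    if st_pred_mode_intra == INTRA_LT_CCLM:
        return 0
    if st_pred_mode_intra in (INTRA_L_CCLM, INTRA_T_CCLM):
        return 2
    raise ValueError("not a CCLM mode")


def st_tr_set_idx_from_primary(hor_is_dst7, ver_is_dst7):
    """Alternative in which the set depends on the first-order transform kernels."""
    if hor_is_dst7 and ver_is_dst7:
        return 1          # DST-VII in both directions
    if hor_is_dst7 or ver_is_dst7:
        return 2          # DST-VII in exactly one direction
    return 0              # all other combinations


cclm_sets = [st_tr_set_idx_for_cclm(m) for m in (81, 82, 83)]
primary_sets = [st_tr_set_idx_from_primary(h, v)
                for h in (False, True) for v in (False, True)]
```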
- FIG. 21 is a flowchart illustrating a video signal processing method according to an embodiment of the present invention.
- a decoder is mainly described for convenience of description, but the present invention is not limited thereto, and the video signal processing method according to the present embodiment may be applied to an encoder in substantially the same manner.
- the decoder determines whether a quadratic transform (or a quadratic inverse transform) is applied to the current block (S2101).
- the quadratic transform may be referred to as a low frequency non-separable transform (LFNST).
- the secondary transform may be applied after a primary transform is applied based on the encoder side. That is, the second-order transform may represent a transform applied before the first-order transform based on the decoder side despite the name.
- when the quadratic transform is applied to the current block, the decoder derives a quadratic transform kernel set applied to the current block from among predefined quadratic transform kernel sets based on the intra prediction mode of the current block (S2102).
- the decoder determines a quadratic transform kernel applied to the current block within the determined quadratic transform kernel set (S2103).
- the decoder generates a quadratic inverse transformed block by performing a quadratic inverse transform on a specific region at the upper left of the current block using the quadratic transform kernel (S2104).
- the decoder generates a residual block of the current block by performing a first-order inverse transform on the quadratic inverse transformed block (S2105).
- the quadratic inverse transform may be performed by receiving an inverse quantized transform coefficient based on a fixed scan order regardless of the size of the quadratic transform kernel.
- the generating of the quadratic inverse transformed block may include assigning the inverse quantized transform coefficient to the input coefficient array of the quadratic inverse transform based on an up-right diagonal scan order.
- the upper-right diagonal scan order may be predefined as a scan order for a 4x4 size block.
- the step of determining whether the quadratic transformation is applied to the current block may include obtaining a syntax element indicating whether the quadratic transformation is applied to the current block when a predefined condition is satisfied.
- the syntax element may be referred to as a second-order transform index or an LFNST index.
- the predefined condition may include whether the width and height of the current block are less than or equal to the maximum transform size.
- determining whether the quadratic transform is applied to the current block may include inferring the syntax element to be 0 when the predefined condition is not satisfied.
- a quadratic transform kernel applied to the current block may be determined.
- when the width or height of the current block is larger than the maximum transform size, the current block may be divided into a plurality of transform units.
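The parse-or-infer rule for the syntax element can be sketched as follows. Only the size condition named above is modeled; the text says the predefined condition may include it, so a real decoder could check more. The function names and the `parse_st_idx` callback are hypothetical placeholders for reading the value from the bitstream.

```python
def get_st_idx(width, height, max_tb_size, parse_st_idx):
    """Return the secondary transform index for the current block.

    The index is parsed only when both dimensions fit within the maximum
    transform size; otherwise it is inferred to be 0.
    """
    if width <= max_tb_size and height <= max_tb_size:
        return parse_st_idx()
    return 0


def secondary_inverse_applies(st_idx):
    """A value of 0 means no secondary inverse transform; a non-zero value
    also selects the kernel within the derived kernel set."""
    return st_idx != 0


parsed = get_st_idx(32, 32, 64, lambda: 2)
inferred = get_st_idx(128, 64, 64, lambda: 2)
```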
- embodiments of the present invention can be implemented through various means.
- embodiments of the present invention may be implemented by hardware, firmware, software, or a combination thereof.
- the method according to embodiments of the present invention may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, and the like.
- the method according to the embodiments of the present invention may be implemented in the form of a module, procedure, or function that performs the functions or operations described above.
- the software code can be stored in a memory and driven by a processor.
- the memory may be located inside or outside the processor, and data may be exchanged with the processor through various known means.
- Computer-readable media can be any available media that can be accessed by a computer, and includes both volatile and nonvolatile media, removable and non-removable media. Further, the computer-readable medium may include both computer storage media and communication media.
- Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
- Communication media typically includes computer readable instructions, data structures, program modules, or other data in a modulated data signal, or other transmission mechanisms, and includes any information delivery media.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Discrete Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Claims (16)
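- A video signal processing method, comprising: determining whether a secondary inverse transform is applied to a current block; when the secondary transform is applied to the current block, deriving a secondary transform kernel set applied to the current block from among predefined secondary transform kernel sets based on an intra prediction mode of the current block; determining a secondary transform kernel applied to the current block within the determined secondary transform kernel set; generating a secondary inverse transformed block by performing a secondary inverse transform on a specific upper-left region of the current block using the secondary transform kernel; and generating a residual block of the current block by performing a primary inverse transform on the secondary inverse transformed block, wherein the secondary inverse transform receives inverse quantized transform coefficients as input based on a fixed scan order regardless of the size of the secondary transform kernel.
- The method of claim 1, wherein generating the secondary inverse transformed block comprises assigning the inverse quantized transform coefficients to an input coefficient array of the secondary inverse transform based on an up-right diagonal scan order.
- The method of claim 2, wherein the up-right diagonal scan order is predefined as a scan order for a block of 4x4 size.
- The method of claim 1, wherein determining whether the secondary inverse transform is applied to the current block comprises obtaining, when a predefined condition is satisfied, a syntax element indicating whether the secondary transform is applied to the current block, and wherein the predefined condition includes whether the width and the height of the current block are less than or equal to a maximum transform size.
- The method of claim 4, wherein determining whether the secondary inverse transform is applied to the current block comprises inferring the syntax element to be 0 when the predefined condition is not satisfied.
- The method of claim 5, wherein, when the value of the syntax element is 0, it is determined that the secondary inverse transform is not applied to the current block, and when the value of the syntax element is not 0, the secondary transform kernel applied to the current block is determined within the determined secondary transform kernel set according to the value of the syntax element.
- The method of claim 4, wherein, when the width or the height of the current block is greater than the maximum transform size, the current block is split into a plurality of transform units.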
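- A video signal processing device, comprising a processor, wherein the processor is configured to: determine whether a secondary inverse transform is applied to a current block; when the secondary transform is applied to the current block, derive a secondary transform kernel set applied to the current block from among predefined secondary transform kernel sets based on an intra prediction mode of the current block; determine a secondary transform kernel applied to the current block within the determined secondary transform kernel set; generate a secondary inverse transformed block by performing a secondary inverse transform on a specific upper-left region of the current block using the secondary transform kernel; and generate a residual block of the current block by performing a primary inverse transform on the secondary inverse transformed block, wherein the secondary inverse transform receives inverse quantized transform coefficients as input based on a fixed scan order regardless of the size of the secondary transform kernel.
- The device of claim 8, wherein the processor assigns the inverse quantized transform coefficients to an input coefficient array of the secondary inverse transform based on an up-right diagonal scan order.
- The device of claim 9, wherein the up-right diagonal scan order is predefined as a scan order for a block of 4x4 size.
- The device of claim 8, wherein the processor obtains, when a predefined condition is satisfied, a syntax element indicating whether the secondary transform is applied to the current block, and wherein the predefined condition includes whether the width and the height of the current block are less than or equal to a maximum transform size.
- The device of claim 11, wherein the processor infers the syntax element to be 0 when the predefined condition is not satisfied.
- The device of claim 12, wherein, when the value of the syntax element is 0, it is determined that the secondary inverse transform is not applied to the current block, and when the value of the syntax element is not 0, the secondary transform kernel applied to the current block is determined within the determined secondary transform kernel set according to the value of the syntax element.
- The device of claim 11, wherein, when the width or the height of the current block is greater than the maximum transform size, the current block is split into a plurality of transform units.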
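- A video signal processing method, comprising: determining whether to apply a secondary transform to a current block; when the secondary transform is applied to the current block, deriving a secondary transform kernel set applied to the current block from among predefined secondary transform kernel sets based on an intra prediction mode of the current block; determining a secondary transform kernel applied to the current block within the determined secondary transform kernel set; generating a primary transformed block by performing a primary transform on a residual block of the current block; generating a secondary transformed block by performing a secondary transform on a specific upper-left region of the primary transformed block using the secondary transform kernel; and generating a bitstream by encoding the secondary transformed block, wherein the secondary transform is performed by arranging secondary transformed coefficients into a transform coefficient array based on a fixed scan order regardless of the size of the secondary transform kernel.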
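- A non-transitory computer-readable medium storing a computer-executable component configured to execute on one or more processors of a computing device, wherein the computer-executable component: determines whether a secondary transform is applied to a current block; when the secondary transform is applied to the current block, derives a secondary transform kernel set applied to the current block from among predefined secondary transform kernel sets based on an intra prediction mode of the current block; determines a secondary transform kernel applied to the current block within the determined secondary transform kernel set; generates a secondary inverse transformed block by performing a secondary inverse transform on a specific upper-left region of the current block using the secondary transform kernel; and generates a residual block of the current block by performing a primary inverse transform on the secondary inverse transformed block, wherein the secondary inverse transform receives inverse quantized transform coefficients as input based on a fixed scan order regardless of the size of the secondary transform kernel.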
Priority Applications (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR1020217017610A KR20210114386A (ko) | 2019-02-08 | 2020-02-10 | Video signal processing method and device using secondary transform |
| US17/348,227 US11616984B2 (en) | 2019-02-08 | 2021-06-15 | Video signal processing method and device using secondary transform |
| US18/164,460 US11973986B2 (en) | 2019-02-08 | 2023-02-03 | Video signal processing method and device using secondary transform |
| US18/620,899 US12356010B2 (en) | 2019-02-08 | 2024-03-28 | Video signal processing method and device using secondary transform |
| US19/231,346 US20250301173A1 (en) | 2019-02-08 | 2025-06-06 | Video signal processing method and device using secondary transform |
Applications Claiming Priority (6)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR10-2019-0014736 | 2019-02-08 | ||
| KR20190014736 | 2019-02-08 | ||
| KR20190035438 | 2019-03-27 | ||
| KR10-2019-0035438 | 2019-03-27 | ||
| KR20190051052 | 2019-04-30 | ||
| KR10-2019-0051052 | 2019-04-30 |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/348,227 Continuation US11616984B2 (en) | 2019-02-08 | 2021-06-15 | Video signal processing method and device using secondary transform |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2020162737A1 (ko) | 2020-08-13 |
Family
ID=71948040
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/KR2020/001853 Ceased WO2020162737A1 (ko) | 2019-02-08 | 2020-02-10 | Video signal processing method and device using secondary transform |
Country Status (3)
| Country | Link |
|---|---|
| US (4) | US11616984B2 (ko) |
| KR (1) | KR20210114386A (ko) |
| WO (1) | WO2020162737A1 (ko) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2022072289A1 (en) * | 2020-10-02 | 2022-04-07 | Qualcomm Incorporated | Extended low-frequency non-separable transform (lfnst) designs with worst-case complexity handling |
Families Citing this family (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
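| KR102772255B1 (ko) | 2019-02-01 | 2025-02-25 | LG Electronics Inc. | Image coding method based on secondary transform and apparatus therefor |
| WO2020175965A1 (ko) | 2019-02-28 | 2020-09-03 | Wilus Institute of Standards and Technology Inc. | Intra prediction-based video signal processing method and device |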
| WO2020228670A1 (en) | 2019-05-10 | 2020-11-19 | Beijing Bytedance Network Technology Co., Ltd. | Luma based secondary transform matrix selection for video processing |
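| CN117354521A (zh) * | 2019-06-07 | 2024-01-05 | Beijing Bytedance Network Technology Co., Ltd. | Conditional signaling of reduced secondary transform in video bitstreams |
| CN114208183B (zh) | 2019-08-03 | 2025-01-10 | Beijing Bytedance Network Technology Co., Ltd. | Position-based mode derivation in reduced secondary transforms for video |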
| WO2021032045A1 (en) | 2019-08-17 | 2021-02-25 | Beijing Bytedance Network Technology Co., Ltd. | Context modeling of side information for reduced secondary transforms in video |
| RS67060B1 (sr) * | 2019-09-21 | 2025-08-29 | Lg Electronics Inc | Transform-based image coding |
| US20220150518A1 (en) * | 2020-11-11 | 2022-05-12 | Tencent America LLC | Method and apparatus for video coding |
| US11930177B2 (en) | 2021-10-29 | 2024-03-12 | Tencent America LLC | Primary transforms for cross-component level reconstruction |
| WO2023197195A1 (zh) * | 2022-04-13 | 2023-10-19 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Video encoding and decoding method, encoder, decoder, and storage medium |
| US20250330614A1 (en) * | 2024-04-22 | 2025-10-23 | Tencent America LLC | Intra mode information based on non-conventional intra predictor |
| US20260006224A1 (en) * | 2024-07-01 | 2026-01-01 | Tencent America LLC | Adaptive intra secondary transform set selection and signaling |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
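| KR20100095992A (ko) * | 2009-02-23 | 2010-09-01 | Korea Advanced Institute of Science and Technology | Method for encoding partitioned blocks in video encoding, method for decoding partitioned blocks in video decoding, and recording medium implementing the same |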
| US20170019686A1 (en) * | 2015-07-16 | 2017-01-19 | Mediatek Inc. | Partial decoding circuit of video encoder/decoder for dealing with inverse second transform and partial encoding circuit of video encoder for dealing with second transform |
| KR20170117112A (ko) * | 2015-03-06 | 2017-10-20 | Korea Advanced Institute of Science and Technology | Image encoding and decoding method based on low-complexity transform, and apparatus using the same |
| US20180249179A1 (en) * | 2017-02-28 | 2018-08-30 | Google Inc. | Transform Kernel Selection and Entropy Coding |
| US20180302631A1 (en) * | 2017-04-14 | 2018-10-18 | Mediatek Inc. | Secondary Transform Kernel Size Selection |
Family Cites Families (26)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| AU2003253900A1 (en) * | 2002-07-16 | 2004-02-02 | Thomson Licensing S.A. | Interleaving of base and enhancement layers for hd-dvd |
| CA3007527C (en) | 2010-04-13 | 2020-06-23 | Samsung Electronics Co., Ltd. | Video encoding method and video encoding apparatus and video decoding method and video decoding apparatus, which perform deblocking filtering based on tree-structure encoding units |
| WO2012008130A1 (ja) | 2010-07-13 | 2012-01-19 | NEC Corporation | Video encoding device, video decoding device, video encoding method, video decoding method, and program |
| WO2013082291A2 (en) | 2011-11-29 | 2013-06-06 | Huawei Technologies Co., Ltd. | Unified partitioning structures and signaling methods for high efficiency video coding |
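| CN108632611A (zh) | 2012-06-29 | 2018-10-09 | Electronics and Telecommunications Research Institute (ETRI) | Video decoding method, video encoding method, and computer-readable medium |
| WO2016129980A1 (ko) | 2015-02-13 | 2016-08-18 | LG Electronics Inc. | Method and apparatus for encoding and decoding a video signal using transform-domain prediction |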
| US10425648B2 (en) | 2015-09-29 | 2019-09-24 | Qualcomm Incorporated | Video intra-prediction using position-dependent prediction combination for video coding |
| US10491922B2 (en) | 2015-09-29 | 2019-11-26 | Qualcomm Incorporated | Non-separable secondary transform for video coding |
| US10455228B2 (en) | 2016-03-21 | 2019-10-22 | Qualcomm Incorporated | Determining prediction parameters for non-square blocks in video coding |
| WO2017192995A1 (en) | 2016-05-06 | 2017-11-09 | Vid Scale, Inc. | Method and system for decoder-side intra mode derivation for block-based video coding |
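| CN116708776A (zh) | 2016-07-18 | 2023-09-05 | Electronics and Telecommunications Research Institute (ETRI) | Image encoding/decoding method and apparatus, and recording medium storing a bitstream |
| WO2018038554A1 (ko) | 2016-08-24 | 2018-03-01 | LG Electronics Inc. | Method and apparatus for encoding/decoding a video signal using a secondary transform |
| CN110024399B (zh) | 2016-11-28 | 2024-05-17 | Electronics and Telecommunications Research Institute (ETRI) | Method and apparatus for encoding/decoding an image, and recording medium storing a bitstream |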
| US10674165B2 (en) | 2016-12-21 | 2020-06-02 | Arris Enterprises Llc | Constrained position dependent intra prediction combination (PDPC) |
| AU2018230328B2 (en) | 2017-03-10 | 2021-05-06 | Hfi Innovation Inc. | Method and apparatus of implicit intra coding tool settings with intra directional prediction modes for video coding |
| US10742975B2 (en) | 2017-05-09 | 2020-08-11 | Futurewei Technologies, Inc. | Intra-prediction with multiple reference lines |
| US10805641B2 (en) | 2017-06-15 | 2020-10-13 | Qualcomm Incorporated | Intra filtering applied together with transform processing in video coding |
| KR102017379B1 (ko) | 2017-07-21 | 2019-09-02 | Konkuk University Industry-Academic Cooperation Foundation | Hash encryption method and apparatus using image vector processing |
| ES3030533T3 (en) | 2018-06-03 | 2025-06-30 | Lg Electronics Inc | Method and device for processing video signal by using reduced transform |
| EP3840387B1 (en) | 2018-10-12 | 2024-03-06 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Method for encoding/decoding image signal and device for same |
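| CN113170197B (zh) * | 2018-12-06 | 2023-07-18 | LG Electronics Inc. | Image coding method based on secondary transform, and apparatus therefor |
| CN112514384B (zh) | 2019-01-28 | 2024-12-24 | Apple Inc. | Video signal encoding/decoding method and apparatus therefor |
| KR102772255B1 (ko) | 2019-02-01 | 2025-02-25 | LG Electronics Inc. | Image coding method based on secondary transform, and apparatus therefor |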
| US12200202B2 (en) | 2019-02-21 | 2025-01-14 | Lg Electronics Inc. | Image decoding method and apparatus using intra prediction in image coding system |
| WO2020175965A1 (ko) | 2019-02-28 | 2020-09-03 | Wilus Institute of Standards and Technology Inc. | Intra-prediction-based video signal processing method and apparatus |
| MX2021016011A (es) | 2019-06-25 | 2022-02-22 | Fraunhofer Ges Forschung | Decoder, encoder, and methods comprising a coding for intra subpartitions |
2020
- 2020-02-10: WO application PCT/KR2020/001853, published as WO2020162737A1 (ko), status: not active (Ceased)
- 2020-02-10: KR application KR1020217017610A, published as KR20210114386A (ko), status: active (Pending)

2021
- 2021-06-15: US application US17/348,227, granted as US11616984B2 (en), status: Active

2023
- 2023-02-03: US application US18/164,460, granted as US11973986B2 (en), status: Active

2024
- 2024-03-28: US application US18/620,899, granted as US12356010B2 (en), status: Active

2025
- 2025-06-06: US application US19/231,346, published as US20250301173A1 (en), status: active (Pending)
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
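| KR20100095992A (ko) * | 2009-02-23 | 2010-09-01 | Korea Advanced Institute of Science and Technology (KAIST) | Method for encoding partitioned blocks in video encoding, method for decoding partitioned blocks in video decoding, and recording medium implementing the same |
| KR20170117112A (ko) * | 2015-03-06 | 2017-10-20 | Korea Advanced Institute of Science and Technology (KAIST) | Image encoding and decoding method based on low-complexity transform, and apparatus using the same |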
| US20170019686A1 (en) * | 2015-07-16 | 2017-01-19 | Mediatek Inc. | Partial decoding circuit of video encoder/decoder for dealing with inverse second transform and partial encoding circuit of video encoder for dealing with second transform |
| US20180249179A1 (en) * | 2017-02-28 | 2018-08-30 | Google Inc. | Transform Kernel Selection and Entropy Coding |
| US20180302631A1 (en) * | 2017-04-14 | 2018-10-18 | Mediatek Inc. | Secondary Transform Kernel Size Selection |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2022072289A1 (en) * | 2020-10-02 | 2022-04-07 | Qualcomm Incorporated | Extended low-frequency non-separable transform (lfnst) designs with worst-case complexity handling |
| CN116250233A (zh) * | 2020-10-02 | 2023-06-09 | Qualcomm Incorporated | Extended low-frequency non-separable transform (LFNST) designs with worst-case complexity handling |
| US11871010B2 (en) | 2020-10-02 | 2024-01-09 | Qualcomm Incorporated | Extended low-frequency non-separable transform (LFNST) designs with worst-case complexity handling |
| US12598313B2 (en) | 2020-10-02 | 2026-04-07 | Qualcomm Incorporated | Extended low-frequency non-separable transform (LFNST) designs with worst-case complexity handling |
Also Published As
| Publication number | Publication date |
|---|---|
| US20210314619A1 (en) | 2021-10-07 |
| KR20210114386A (ko) | 2021-09-23 |
| US11616984B2 (en) | 2023-03-28 |
| US20230188754A1 (en) | 2023-06-15 |
| US20240244263A1 (en) | 2024-07-18 |
| US11973986B2 (en) | 2024-04-30 |
| US12356010B2 (en) | 2025-07-08 |
| US20250301173A1 (en) | 2025-09-25 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
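| WO2020162737A1 (ko) | | Video signal processing method and apparatus using secondary transform |
| WO2020050702A1 (ko) | | Video signal processing method and apparatus using multiple transform kernels |
| WO2019125093A1 (ko) | | Video signal processing method and apparatus |
| WO2018174593A1 (ko) | | In-loop filtering method according to adaptive pixel classification criteria |
| WO2023277535A1 (ko) | | Video signal processing method using intra prediction and apparatus therefor |
| WO2018128322A1 (ko) | | Image processing method and apparatus therefor |
| WO2020009419A1 (ko) | | Video coding method and apparatus using merge candidates |
| WO2018044089A1 (ko) | | Video signal processing method and apparatus |
| WO2016159610A1 (ko) | | Video signal processing method and apparatus |
| WO2016114583A1 (ko) | | Video signal processing method and apparatus |
| WO2021125751A1 (ko) | | Method for predicting a block partitioned into an arbitrary shape, and decoding apparatus |
| WO2020130600A1 (ko) | | Video signal processing method and apparatus for signaling prediction mode |
| WO2023096472A1 (ko) | | Video signal processing method and apparatus therefor |
| WO2023132660A1 (ko) | | Video signal processing method using dependent quantization and apparatus therefor |
| WO2023043296A1 (ko) | | Video signal processing method using OBMC and apparatus therefor |
| WO2016122251A1 (ko) | | Video signal processing method and apparatus |
| WO2024053987A1 (ko) | | Video signal processing method using geometric partitioning and apparatus therefor |
| WO2021060801A1 (ko) | | Image encoding/decoding method and apparatus, and recording medium storing a bitstream |
| WO2018070788A1 (ko) | | Image encoding method/apparatus, image decoding method/apparatus, and recording medium storing a bitstream |
| WO2023055147A1 (ko) | | Video signal processing method based on MHP (multi-hypothesis prediction) mode and apparatus therefor |
| WO2018070723A1 (ko) | | Image encoding/decoding method and apparatus therefor |
| WO2020184936A1 (ko) | | Method and apparatus for encoding or decoding a video signal |
| WO2024237502A1 (ko) | | Method for extrapolation-based intra prediction |
| WO2016204479A1 (ko) | | Image encoding/decoding method and apparatus therefor |
| WO2020256510A1 (ko) | | Method and apparatus for controlling coding tools |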
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20752933; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 20217017610; Country of ref document: KR; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 20752933; Country of ref document: EP; Kind code of ref document: A1 |
| | WWR | Wipo information: refused in national office | Ref document number: 1020217017610; Country of ref document: KR |
| | WWC | Wipo information: continuation of processing after refusal or withdrawal | Ref document number: 1020217017610; Country of ref document: KR |



