EP4684526A1 - Cross-component prediction in inter pictures - Google Patents
Cross-component prediction in inter pictures
- Publication number
- EP4684526A1 EP24711560.3A
- Authority
- EP
- European Patent Office
- Prior art keywords
- block
- cross
- coding mode
- model
- component coding
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/11—Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/147—Data rate or code amount at the encoder output according to rate distortion criteria
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/157—Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/186—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/189—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
- H04N19/196—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
- H04N19/423—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/513—Processing of motion vectors
- H04N19/517—Processing of motion vectors by encoding
- H04N19/52—Processing of motion vectors by encoding by predictive encoding
Definitions
- pictures of the video content are divided into blocks of samples (i.e., pixels), these blocks then being partitioned into one or more sub-blocks, called original sub-blocks in the following.
- An intra or inter prediction is then applied to each sub-block to exploit intra or inter image correlations.
- a predictor sub-block is determined for each original sub-block.
- a sub-block representing a difference between the original sub-block and the predictor sub-block, often denoted as a prediction error sub-block, a prediction residual sub-block or simply a residual sub-block, is transformed, quantized and entropy coded to generate an encoded video stream.
- the compressed data is decoded by inverse processes corresponding to the transform, quantization and entropy coding.
- Intra prediction has recently been improved to better exploit the correlations between the components of a block.
- New tools have been proposed that perform cross-component intra prediction, wherein chroma samples of a block are predicted from reconstructed luma samples of the block using a model.
- Some of these cross-component (CC) coding tools (or CC coding modes) use models of blocks previously coded using a CC coding tool.
- in CC coding tools called non-adjacent CC prediction (NA-CCP) or history-based CC prediction (H-CCP), models of the CC coding tools applied to a block are kept for future use.
- one or more of the present embodiments provide a method comprising: obtaining a first block to reconstruct using a cross-component coding mode in which a model of the cross-component coding mode to apply to reconstruct the first block is obtained from a model of the cross-component coding mode associated to a reconstructed second block, wherein the second block was reconstructed using a mode different from the cross-component coding mode.
- one or more of the present embodiments provide a method comprising: applying a cross-component coding mode to a first block, a model of the cross-component coding mode to apply to the first block being obtained from a model of the cross-component coding mode associated to a second block encoded before the first block, wherein, the second block was encoded using a mode different from the cross-component coding mode.
- the model of the cross-component coding mode associated to the reconstructed second block is stored in at least one cell corresponding to the second block of a first buffer storing coding parameters of blocks of a current picture comprising the first and the second block.
- a cell of the at least one cell inherited the model of the cross-component coding mode associated to the reconstructed second block from another cell corresponding to a reference block of a reference picture designated by motion information of the second block, the another cell being comprised in a second buffer storing coding parameters of blocks of the reference picture.
- the cell of the at least one cell inherited the model of the cross-component coding mode associated to the reconstructed second block from the another cell depending on a value representative of a quality of the model of the cross-component coding mode inherited by the first block from the second block.
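The inheritance described above, where a cell of the current picture's buffer copies a model from the cell of the reference picture designated by the block's motion information, can be sketched as follows. This is only an illustrative sketch: the cell size `CELL`, the `CCModel` fields, the dictionary-based buffers and the `max_quality` threshold (where a lower `quality` value is taken to be better) are all assumptions introduced here, not details given in the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

CELL = 4  # hypothetical cell size, in samples


@dataclass(frozen=True)
class CCModel:
    alpha: float    # hypothetical linear-model slope
    beta: float     # hypothetical linear-model offset
    quality: float  # hypothetical quality value, lower is better


# A buffer maps a cell position (cell_x, cell_y) to the model stored there.
Buffer = Dict[Tuple[int, int], CCModel]


def inherit_model(cur_buf: Buffer, ref_buf: Buffer,
                  block_xy: Tuple[int, int], mv: Tuple[int, int],
                  max_quality: float) -> Optional[CCModel]:
    """Copy a model from the reference-picture cell pointed to by `mv`."""
    bx, by = block_xy
    ref_cell = ((bx + mv[0]) // CELL, (by + mv[1]) // CELL)
    model = ref_buf.get(ref_cell)
    # Inherit only if a model exists and its quality value is acceptable.
    if model is None or model.quality > max_quality:
        return None
    cur_buf[(bx // CELL, by // CELL)] = model
    return model


ref_buffer: Buffer = {(2, 1): CCModel(alpha=0.5, beta=10.0, quality=3.0)}
cur_buffer: Buffer = {}
inherited = inherit_model(cur_buffer, ref_buffer, (4, 4), (4, 0), max_quality=5.0)
rejected = inherit_model({}, ref_buffer, (4, 4), (4, 0), max_quality=1.0)
```

With this layout an inter-coded block carries a usable model in its own cells even though it was not itself coded with a cross-component mode.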
- the model of the cross-component coding mode to apply to the first block is selected from a table storing models of the cross-component coding mode of last reconstructed blocks for which a model of the cross-component coding mode is available, said table being updated after a reconstruction of a block for which a model of the cross-component coding mode is available.
- the model of the cross-component coding mode to apply to the first block is selected from models of the cross-component coding mode of a candidate region list or of candidate block list, regions of the candidate region list and blocks of the candidate block list being in a neighborhood of the first block.
- a model of the cross-component coding mode is computed for reconstructed blocks predicted using a mode different from the cross-component coding mode.
- one or more of the present embodiments provide a method for decoding a current picture comprising reconstructing a current block of the current picture according to a cross-component coding mode applying the method of the first aspect.
- one or more of the present embodiments provide a method for encoding a current picture comprising reconstructing a current block of the current picture according to a cross-component coding mode applying the method of the second aspect.
- information representative of the model of the cross-component coding mode applied to reconstruct the first block is stored in at least one cell corresponding to the current block of a current buffer storing coding parameters associated to the current picture.
- one or more of the present embodiments provide a device comprising electronic circuitry configured for: obtaining a first block to reconstruct using a cross-component coding mode in which a model of the cross-component coding mode to apply to reconstruct the first block is obtained from a model of the cross-component coding mode associated to a reconstructed second block, wherein, the second block was reconstructed using a mode different from the cross-component coding mode.
- one or more of the present embodiments provide a device comprising electronic circuitry configured for: applying a cross-component coding mode to a first block, a model of the cross-component coding mode to apply to the first block being obtained from a model of the cross-component coding mode associated to a second block encoded before the first block, wherein the second block was encoded using a mode different from the cross-component coding mode.
- the model of the cross-component coding mode associated to the reconstructed second block is stored in at least one cell corresponding to the second block of a first buffer storing coding parameters of blocks of a current picture comprising the first and the second block.
- a cell of the at least one cell inherited the model of the cross-component coding mode associated to the reconstructed second block from another cell corresponding to a reference block of a reference picture designated by motion information of the second block, the another cell being comprised in a second buffer storing coding parameters of blocks of the reference picture.
- the cell of the at least one cell inherited the model of the cross-component coding mode associated to the reconstructed second block from the another cell depending on a value representative of a quality of the model of the cross-component coding mode inherited by the first block from the second block.
- the model of the cross-component coding mode to apply to the first block is selected from a table storing models of the cross-component coding mode of last reconstructed blocks for which a model of the cross-component coding mode is available, said table being updated after a reconstruction of a block for which a model of the cross-component coding mode is available.
- the model of the cross-component coding mode to apply to the first block is selected from models of the cross-component coding mode of a candidate region list or of candidate block list, regions of the candidate region list and blocks of the candidate block list being in a neighborhood of the first block.
- a model of the cross-component coding mode is computed for reconstructed blocks predicted using a mode different from the cross-component coding mode.
- one or more of the present embodiments provide a system for decoding a current picture comprising electronic circuitry configured for reconstructing a current block of a current picture according to a cross-component coding mode comprising the device of the fifth aspect.
- one or more of the present embodiments provide a system for encoding a current picture comprising electronic circuitry configured for reconstructing a current block of a current picture according to a cross-component coding mode comprising the device of the sixth aspect.
- information representative of the model of the cross-component coding mode applied to reconstruct the first block is stored in at least one cell corresponding to the current block of a current buffer storing coding parameters associated to the current picture.
- one or more of the present embodiments provide a computer program comprising program code instructions for implementing the method according to the first, the second, the third or the fourth aspect.
- one or more of the present embodiments provide a non-transitory information storage medium storing program code instructions for implementing the method according to the first, the second, the third or the fourth aspect.
- Fig.1 illustrates schematically a context in which embodiments are implemented
- Fig. 2 illustrates schematically an example of partitioning undergone by a picture of pixels of an original video
- Fig.3 depicts schematically a method for encoding a video stream
- Fig.4 depicts schematically a method for decoding an encoded video stream
- Fig. 5A illustrates schematically an example of hardware architecture of a processing module able to implement an encoding module or a decoding module in which various aspects and embodiments are implemented
- FIG. 5B illustrates a block diagram of an example of a first system in which various aspects and embodiments are implemented
- Fig.5C illustrates a block diagram of an example of a second system in which various aspects and embodiments are implemented
- Fig.6A illustrates the LM_Chroma CCLM mode
- Fig.6B illustrates the MDLM_T CCLM mode
- Fig.6C illustrates the MDLM_L CCLM mode
- Fig. 7 illustrates classes used to determine models parameters in the Multi-model LM (MMLM) modes
- Fig.8 illustrates a 5-tap spatial filter component of a 7-tap convolutional filter used in the CCCM mode
- FIG. 9 illustrates a reference area consisting of six lines/columns of chroma samples above and left of a CU used in the CCCM mode;
- Fig. 10 illustrates a general process of the CC coding tools (e.g., CCLM, MMLM, CCCM);
- Fig. 11 illustrates a set of candidate regions for a non-adjacent cross component prediction mode;
- Fig.12 illustrates a spanInfo process;
- Fig.13 depicts schematically a modified method for encoding video data according to embodiments;
- Fig. 14 depicts schematically a method for decoding encoded video data according to an embodiment; and,
- Fig.15 depicts a method for inheriting candidates for a block coded in inter in a current picture.
- VVC Versatile Video Coding
- ITU-T H.266 Versatile Video Coding
- these embodiments are not limited to the video coding/decoding method corresponding to VVC.
- These embodiments are in particular adapted to various video formats comprising (and derived from), for example, HEVC (ISO/IEC 23008-2 – MPEG-H Part 2, High Efficiency Video Coding / ITU-T H.265), AVC (ISO/IEC 14496-10), EVC (Essential Video Coding/MPEG-5), AV1, AV2 and VP9.
- HEVC (ISO/IEC 23008-2 – MPEG-H Part 2, High Efficiency Video Coding / ITU-T H.265)
- AVC (ISO/IEC 14496-10)
- EVC Essential Video Coding/MPEG-5
- a system 11 that could be a camera, a storage device, a computer, a server or any device capable of delivering a video stream, transmits a video stream to a system 13 using a communication channel 12.
- the video stream is either encoded and transmitted by the system 11 or received and/or stored by the system 11 and then transmitted.
- the communication channel 12 is a wired (for example Internet or Ethernet) or a wireless (for example WiFi, 3G, 4G or 5G) network link.
- the system 13, that could be for example a set top box, receives and decodes the video stream to generate a sequence of decoded pictures. A post-processing may be applied to the decoded pictures.
- the obtained sequence of decoded pictures is then transmitted to a display system 15 using a communication channel 14, that could be a wired or wireless network.
- the display system 15 then displays said pictures.
- the system 13 is comprised in the display system 15.
- the system 13 and display system 15 are comprised in a TV, a computer, a tablet, a smartphone, a head-mounted display, etc.
- Figs.2, 3 and 4 introduce an example of video format.
- Fig.2 illustrates an example of partitioning undergone by a picture of pixels 21 of an original video sequence 20. It is considered here that a pixel is composed of three components: a luminance component and two chrominance components.
- a picture is divided into a plurality of coding entities.
- a picture is divided in a grid of blocks called coding tree units (CTU).
- CTU coding tree units
- a CTU consists of an $N \times N$ block of luminance samples together with two corresponding blocks of chrominance samples.
- N is generally a power of two having a maximum value of “128” for example.
- a picture is divided into one or more groups of CTU. For example, it can be divided into one or more tile rows and tile columns, a tile being a sequence of CTU covering a rectangular region of a picture.
- a tile could be divided into one or more bricks, each of which consisting of at least one row of CTU within the tile.
- another encoding entity called slice, exists, that can contain at least one tile of a picture or at least one brick of a tile.
- the picture 21 is divided into three slices S1, S2 and S3 of the raster-scan slice mode, each comprising a plurality of tiles (not represented), each tile comprising only one brick.
- a CTU may be partitioned into the form of a hierarchical tree of one or more sub-blocks called coding units (CU).
- CU coding units
- the CTU is the root (i.e., the parent node) of the hierarchical tree and can be partitioned into a plurality of CU (i.e., child nodes). Each CU becomes a leaf of the hierarchical tree if it is not further partitioned into smaller CU, or becomes a parent node of smaller CU (i.e., child nodes) if it is further partitioned.
- the CTU 24 is first partitioned in “4” square CU using a quadtree type partitioning.
- the upper left CU is a leaf of the hierarchical tree since it is not further partitioned, i.e., it is not a parent node of any other CU.
- the upper right CU is further partitioned in “4” smaller square CU using again a quadtree type partitioning.
- the bottom right CU is vertically partitioned in “2” rectangular CU using a binary tree type partitioning.
- the bottom left CU is vertically partitioned in “3” rectangular CU using a ternary tree type partitioning.
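The CTU 24 example above (one quadtree split, then a further quadtree, binary and ternary split of three of the four children) can be reproduced with a few rectangle helpers. This is a minimal sketch: the function names, the 128-sample CTU size and the 1:2:1 ratio of the ternary split are assumptions made for illustration, and "vertical" is taken to mean that the split lines are vertical.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def quad_split(r: Rect) -> List[Rect]:
    """Four equal squares: upper left, upper right, bottom left, bottom right."""
    x, y, w, h = r
    hw, hh = w // 2, h // 2
    return [(x, y, hw, hh), (x + hw, y, hw, hh),
            (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]


def binary_split_vertical(r: Rect) -> List[Rect]:
    """Two side-by-side rectangles of equal width."""
    x, y, w, h = r
    return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]


def ternary_split_vertical(r: Rect) -> List[Rect]:
    """Three side-by-side rectangles with widths in a 1:2:1 ratio (assumed)."""
    x, y, w, h = r
    q = w // 4
    return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]


ctu: Rect = (0, 0, 128, 128)
upper_left, upper_right, bottom_left, bottom_right = quad_split(ctu)
leaves = ([upper_left]                            # not further partitioned
          + quad_split(upper_right)               # "4" smaller square CU
          + binary_split_vertical(bottom_right)   # "2" rectangular CU
          + ternary_split_vertical(bottom_left))  # "3" rectangular CU
```

The resulting `leaves` list holds the ten leaf CU of the tree, and their areas add up to the area of the CTU.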
- the partitioning is adaptive, each CTU being partitioned so as to optimize a compression efficiency criterion for the CTU.
- PU prediction unit
- TU transform unit
- the coding entity that is used for prediction (i.e., a PU) and transform (i.e., a TU) can be a subdivision of a CU.
- a CU of size $2N \times 2N$ can be divided into PU 2411 of size $N \times 2N$ or of size $2N \times N$.
- said CU can be divided into “4” TU 2412 of size $N \times N$ or into “16” TU of size $\frac{N}{2} \times \frac{N}{2}$.
- frontiers of the TU and PU are aligned on the frontiers of the CU.
- a CU comprises generally one TU and one PU.
- the term “block” or “picture block” can be used to refer to any one of a CTU, a CU, a PU and a TU.
- the term “block” or “picture block” can be used to refer to a macroblock, a partition and a sub-block as specified in H.264/AVC or in other video coding formats, and more generally to refer to an array of samples of numerous sizes.
- Fig.3 depicts schematically a method for encoding a video stream executed by an encoding module. For instance, the method for encoding of Fig.3 is executed by the system 11. Variations of this method for encoding are contemplated, but the method for encoding of Fig.
- a current original picture of an original video sequence may go through a pre-processing.
- a color transform is applied to the current original picture (e.g., conversion from RGB 4:4:4 to YCbCr 4:2:0), or a remapping is applied to the current original picture components in order to get a signal distribution more resilient to compression (for instance using a histogram equalization of one of the color components).
- Pictures obtained by pre-processing are called pre-processed pictures in the following.
- the encoding of a pre-processed picture begins with a partitioning of the pre-processed picture during a step 302, as described in relation to Fig.2.
- the pre-processed picture is thus partitioned into CTU, CU, PU, TU, etc.
- the encoding module then determines a coding mode between an intra prediction and an inter prediction.
- the intra prediction consists of predicting, in accordance with an intra prediction method, during a step 303, the pixels of a current block from a prediction block derived from pixels of reconstructed blocks situated in a causal neighbourhood of the current block to be encoded.
- the result of the intra prediction is an intra prediction mode indicating which pixels of the blocks in the neighbourhood to use, and a residual block resulting from a calculation of a difference between the current block and the prediction block.
- the usual intra prediction described above is adapted to the intra prediction in one component (Y or Cb or Cr), i.e., predicted samples are predicted only from reference samples from the same component. This usual intra prediction could be called intra component prediction.
- Intra prediction has been recently enriched by several tools (or modes) consisting in intra prediction between different components, i.e., samples of a first component are predicted using samples of a second component. These tools (i.e., modes) are generally called cross-component prediction tools.
- CCLM Cross-Component Linear Model
- CC Cross-Component
- CCLM parameters $\alpha$ and $\beta$ are derived for each chroma component from a set of neighboring chroma samples of the same chroma component and their corresponding luma samples (down-sampled if necessary). In some implementations, a subset of neighboring chroma samples (e.g., at most four) and their corresponding luma samples are used. Also, the position of the neighboring samples may be signaled in the bitstream. For example, in VVC there exist three CCLM modes (LM_CHROMA, MDLM_T, MDLM_L) that differ in the location of the neighboring chroma samples.
- Fig.6A illustrates the LM_Chroma CCLM mode.
- Fig.6B illustrates the MDLM_T CCLM mode.
- Fig.6C illustrates the MDLM_L CCLM mode.
- the CCLM mode to be used is coded per CU.
- the set of neighboring luma samples at the selected positions are down-sampled (if required by the chroma format) and compared to find two smaller values, $x_A^0$ and $x_A^1$, and two larger values, $x_B^0$ and $x_B^1$.
- Their corresponding chroma sample values are denoted as $y_A^0$, $y_A^1$, $y_B^0$ and $y_B^1$.
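One way the two smaller and two larger luma values can lead to the parameters $\alpha$ and $\beta$ is sketched below. The text does not give the derivation itself, so the steps here are assumptions: each pair is averaged with rounding, and a straight line is drawn through the two averaged points. The function name `cclm_from_four` and the floating-point arithmetic are illustrative only; a real codec would use integer arithmetic.

```python
from typing import List, Tuple


def cclm_from_four(pairs: List[Tuple[int, int]]) -> Tuple[float, float]:
    """Derive (alpha, beta) from four (luma, chroma) neighbor pairs."""
    ordered = sorted(pairs)                 # ascending luma value
    (xa0, ya0), (xa1, ya1) = ordered[:2]    # two smaller luma values
    (xb0, yb0), (xb1, yb1) = ordered[2:]    # two larger luma values
    # Rounded averages of each group (assumed, not stated in the text).
    xa, ya = (xa0 + xa1 + 1) >> 1, (ya0 + ya1 + 1) >> 1
    xb, yb = (xb0 + xb1 + 1) >> 1, (yb0 + yb1 + 1) >> 1
    if xa == xb:
        return 0.0, float(ya)               # flat model when luma does not vary
    alpha = (ya - yb) / (xa - xb)
    beta = ya - alpha * xa
    return alpha, beta


alpha, beta = cclm_from_four([(100, 60), (200, 110), (120, 70), (180, 100)])
predicted_chroma = alpha * 150 + beta       # chroma predicted from luma 150
```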
- MMLM for multi-model LM
- neighboring luma samples and neighboring chroma samples of the current block are classified into several classes, each class being used as a training set to derive a linear model (i.e., particular $\alpha$ and $\beta$ are derived for a particular class).
- the samples of the current luma block are also classified based on the same rule for the classification of neighboring luma samples.
- MMLM MMLM
- M MMLM2
- M MMLM3
- the encoder chooses the optimal mode in an RDO process and signals the mode.
- Fig.7 shows an example of classifying the neighboring samples into two groups. Threshold is calculated as the average value of the neighboring reconstructed Luma samples.
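The two-class case can be sketched as follows: the threshold is the average of the neighboring reconstructed luma samples, each class trains its own linear model, and a luma sample of the current block is classified with the same rule before prediction. The least-squares fit, the function names and the fallback for an empty class are assumptions made for this sketch; the text does not specify how each class's model is fitted.

```python
from typing import List, Sequence, Tuple

Model = Tuple[float, float]  # (alpha, beta)


def fit_linear(pairs: Sequence[Tuple[float, float]]) -> Model:
    """Least-squares line through (luma, chroma) pairs (assumed fitting rule)."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    var = sum((x - mx) ** 2 for x, _ in pairs)
    if var == 0:
        return 0.0, my
    alpha = sum((x - mx) * (y - my) for x, y in pairs) / var
    return alpha, my - alpha * mx


def fit_mmlm(luma: List[float], chroma: List[float]) -> Tuple[float, Model, Model]:
    """Return (threshold, model of the low class, model of the high class)."""
    threshold = sum(luma) / len(luma)
    pairs = list(zip(luma, chroma))
    low = [p for p in pairs if p[0] <= threshold]
    high = [p for p in pairs if p[0] > threshold] or low  # fallback if empty
    return threshold, fit_linear(low), fit_linear(high)


def predict(luma_sample: float, threshold: float, low: Model, high: Model) -> float:
    alpha, beta = low if luma_sample <= threshold else high
    return alpha * luma_sample + beta


thr, m_low, m_high = fit_mmlm([10, 20, 30, 110, 120, 130],
                              [21, 41, 61, 155, 160, 165])
```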
- a slope adjustment is applied to cross-component linear model (CCLM) and to Multi-model LM prediction.
- the adjustment tilts the linear function which maps luma values to chroma values around a center point determined by the average luma value of the reconstructed luma samples.
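Tilting a line around a center point means that the slope changes while the value of the line at that point stays the same. A minimal sketch of this, assuming the adjustment is expressed as a slope offset `delta` (a name introduced here, not taken from the text):

```python
from typing import Tuple


def adjust_slope(alpha: float, beta: float, delta: float,
                 mean_luma: float) -> Tuple[float, float]:
    """Tilt chroma = alpha * luma + beta around luma = mean_luma."""
    new_alpha = alpha + delta
    # Compensate the offset so the prediction at mean_luma is unchanged.
    new_beta = beta - delta * mean_luma
    return new_alpha, new_beta


a2, b2 = adjust_slope(alpha=0.5, beta=10.0, delta=0.25, mean_luma=100.0)
at_center_before = 0.5 * 100.0 + 10.0
at_center_after = a2 * 100.0 + b2
```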
- in CCCM, the linear model of CCLM is replaced by a CC prediction model taking the form of an adaptive 7-tap convolutional filter.
- the 7-tap convolutional filter consists of a 5-tap spatial filter component, a nonlinear term P and a bias term B.
- the input to the 5-tap spatial filter component consists of (possibly) down-sampled luma samples comprising a center luma sample C which is collocated with a chroma sample to be predicted, a luma sample N above the center luma sample C, a luma sample S below the center luma sample C, a luma sample W on the left of the center luma sample C and a luma sample E on the right of the center luma sample C, as illustrated in Fig.8.
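- the predicted chroma samples for a chroma component are obtained by applying this filter, i.e., as the sum of the five spatial inputs C, N, S, W and E, the nonlinear term P and the bias term B, each weighted by its filter coefficient.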
- the filter coefficients $c_i$ are calculated by minimizing the MSE between predicted and reconstructed chroma samples in a reference area.
- Fig. 9 illustrates the reference area, which consists of six lines/columns of chroma samples above and left of the CU. The reference area extends one CU width to the right and one CU height below the CU boundaries. The area is adjusted to include only available samples.
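Applying the 7-tap filter at one position can be sketched as below. Several details are assumptions made for illustration because the text does not state them: the order of the coefficients, the definition of the nonlinear term as the squared center sample scaled back to the sample range, and the bias term set to the mid-range value. The MSE minimization that yields the coefficients is not shown.

```python
from typing import List, Sequence


def cccm_inputs(luma: Sequence[Sequence[int]], x: int, y: int,
                bit_depth: int = 10) -> List[int]:
    """Build the seven filter inputs [C, N, S, E, W, P, B] at position (x, y)."""
    mid = 1 << (bit_depth - 1)
    c = luma[y][x]
    n, s = luma[y - 1][x], luma[y + 1][x]
    w, e = luma[y][x - 1], luma[y][x + 1]
    p = (c * c + mid) >> bit_depth  # assumed form of the nonlinear term P
    b = mid                         # assumed form of the bias term B
    return [c, n, s, e, w, p, b]


def cccm_predict(coeffs: Sequence[float], inputs: Sequence[int]) -> float:
    """Weighted sum of the seven inputs."""
    return sum(ci * ti for ci, ti in zip(coeffs, inputs))


luma_patch = [[500, 510, 520],
              [530, 512, 540],
              [550, 560, 570]]
taps = cccm_inputs(luma_patch, x=1, y=1)
identity = cccm_predict([1, 0, 0, 0, 0, 0, 0], taps)  # passes C through
```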
- $G_x = (2W + NW + SW) - (2E + NE + SE)$. Luma samples NW, NE, SW and SE are illustrated in Fig.8.
- the Y and X parameters are vertical and horizontal coordinates of the center luma sample location.
- the reconstructed luma samples are not down-sampled.
- A general process of the chroma prediction modes using cross-component luma model (e.g., CCLM, MMLM, CCCM) is depicted in Fig.10.
- the process of Fig.10 is executed by a processing module similar to the processing module described in the following in relation to Fig.5A.
- the processing module selects reference samples (reconstructed luma and chroma sample values) from the neighborhood of the current CU.
- CCCM uses six lines of reference samples above the current CU and six columns of reference samples on the left of the current CU.
- in a step 1015, the processing module filters the reconstructed luma sample values to obtain down-sampled luma samples.
- Step 1015 is optional.
- the processing module determines a threshold to classify the reference samples in at least two classes.
- Step 1020 is optional and is applied when multiple CC prediction models are used as in MMLM.
- the processing module derives the parameters of the (multi-) CC prediction model(s) (see coefficients $c_i$) from the reference luma and chroma sample values.
- the processing module uses the CC prediction model(s) to derive the chroma sample prediction values from the co-located (possibly down-sampled) reconstructed luma sample values.
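The sequence of steps of Fig.10 (down-sample the luma, derive a model from reference samples, predict the chroma) can be sketched for a single linear model. The 2×2 averaging filter, the least-squares derivation, the clipping to the sample range and all function names are assumptions chosen to keep the sketch short; the classification step used with multiple models is omitted.

```python
from typing import List, Sequence, Tuple


def downsample_2x2(luma: Sequence[Sequence[int]]) -> List[List[int]]:
    """Average each 2x2 luma neighborhood (assumed down-sampling filter)."""
    h, w = len(luma) // 2, len(luma[0]) // 2
    return [[(luma[2 * y][2 * x] + luma[2 * y][2 * x + 1]
              + luma[2 * y + 1][2 * x] + luma[2 * y + 1][2 * x + 1] + 2) >> 2
             for x in range(w)] for y in range(h)]


def derive_linear_model(ref_luma: Sequence[float],
                        ref_chroma: Sequence[float]) -> Tuple[float, float]:
    """Least-squares (alpha, beta) from reference samples (assumed rule)."""
    n = len(ref_luma)
    mx, my = sum(ref_luma) / n, sum(ref_chroma) / n
    var = sum((x - mx) ** 2 for x in ref_luma)
    if var == 0:
        return 0.0, my
    cov = sum((x - mx) * (y - my) for x, y in zip(ref_luma, ref_chroma))
    alpha = cov / var
    return alpha, my - alpha * mx


def predict_chroma(luma_ds: Sequence[Sequence[int]], model: Tuple[float, float],
                   bit_depth: int = 10) -> List[List[int]]:
    """Apply the model to co-located luma and clip to the sample range."""
    alpha, beta = model
    top = (1 << bit_depth) - 1
    return [[min(top, max(0, int(round(alpha * v + beta)))) for v in row]
            for row in luma_ds]


rec_luma = [[10, 10, 20, 20], [10, 10, 20, 20],
            [30, 30, 40, 40], [30, 30, 40, 40]]
luma_ds = downsample_2x2(rec_luma)
model = derive_linear_model([10, 20, 30], [25, 45, 65])
pred = predict_chroma(luma_ds, model)
```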
- NA-CCP Non-Adjacent Cross-Component Prediction
- samples in regions non-adjacent to a current block can be used to derive a CCCM model for the current block as described in document K.Zhang, L.Zhang, Z.Deng, “Non-EE2: Non-Local Cross-Component Prediction,” document JVET-AC0176, 29th Meeting, by teleconference, 11–20 January 2023.
- a candidate region list with “6” candidates is constructed by checking potential 8×8 regions in a neighborhood of the current block in a specified order. If a checked region is available, it is put into the candidate region list. The top-left positions of the potential 8×8 regions are predetermined as depicted in Fig.11.
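The construction of the candidate region list can be sketched as a scan over predetermined positions that keeps the first available ones. The actual positions of Fig.11 are not reproduced in the text, so the `offsets` below are purely hypothetical, as is the availability rule (a region is treated as available when it lies at non-negative coordinates).

```python
from typing import Callable, List, Sequence, Tuple

Pos = Tuple[int, int]


def build_candidate_regions(block_xy: Pos, offsets: Sequence[Pos],
                            is_available: Callable[[Pos, int], bool],
                            max_candidates: int = 6, size: int = 8) -> List[Pos]:
    """Check potential regions in the given order and keep the available ones."""
    bx, by = block_xy
    candidates: List[Pos] = []
    for dx, dy in offsets:
        top_left = (bx + dx, by + dy)
        if is_available(top_left, size):
            candidates.append(top_left)
            if len(candidates) == max_candidates:
                break
    return candidates


def inside_picture(top_left: Pos, size: int) -> bool:
    # Hypothetical availability rule: the region must not cross the picture's
    # top or left border.
    return top_left[0] >= 0 and top_left[1] >= 0


# Hypothetical checking order, expressed relative to the current block.
offsets = [(-16, -16), (-24, 0), (0, -24), (-32, -8),
           (8, -16), (-16, 8), (-8, -32), (-40, 0)]
regions = build_candidate_regions((64, 64), offsets, inside_picture)
none_available = build_candidate_regions((0, 0), offsets[:4], inside_picture)
```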
- a flag is signaled to indicate whether NA-CCP is applied to a chroma block. If NA-CCP is applied, an index is signaled to indicate which candidate in the candidate region list is used to derive the CCCM model for the current block. 14 2023PF00217 As can be seen, in NA-CCP, it is considered that non-adjacent regions may be more correlated with the chroma block signal of the current block than adjacent regions.
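- The construction of the candidate region list can be sketched as follows. The positions of Figure 11 are not reproduced here, so the position list and the availability test are hypothetical placeholders supplied by the caller; only the checking order, the availability rule and the list size of “6” come from the description above.

```python
MAX_CANDIDATES = 6  # list size given in the description


def build_candidate_list(potential_top_lefts, is_available,
                         max_candidates=MAX_CANDIDATES):
    """Check potential 8x8 regions in the given order and keep the available ones."""
    candidates = []
    for top_left in potential_top_lefts:
        if len(candidates) == max_candidates:
            break  # the list is full, remaining positions are not checked
        if is_available(top_left):
            candidates.append(top_left)
    return candidates


def select_region(candidates, na_ccp_flag, index):
    """Return the region designated by the signaled index, or None if NA-CCP is off."""
    if not na_ccp_flag:
        return None
    return candidates[index]
```

- For example, with an availability test that only accepts regions whose top-left corner lies inside the picture, positions with a negative coordinate are skipped and the signaled index then designates one of the remaining regions.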
- H-CCP History-based cross-component prediction
- A table H-CCLM of CCLM models of previous blocks encoded according to the CCLM mode and a table H-CCCM of CCCM models of previous blocks encoded according to the CCCM mode are maintained similarly to a history-based motion vector prediction (HMVP) table.
- In HMVP, the motion information of a previously coded block is stored in a table and used as a motion vector predictor candidate for the current CU.
- the table with multiple HMVP candidates is maintained during the encoding/decoding process.
- the table is reset (emptied) when a new CTU row is encountered.
- the HMVP table size is generally set to be “6”.
- A constrained first-in-first-out (FIFO) rule is utilized, wherein a redundancy check is first applied to remove duplicate candidates from the HMVP table.
- In H-CCP, after decoding a CCLM or CCCM encoded block, the corresponding table (the H-CCLM table or the H-CCCM table) is updated with the CCLM or CCCM model of the block.
- The size of both the H-CCLM table and the H-CCCM table is also “6”. If the current block is coded with the CCLM or CCCM mode, a flag is signaled to indicate whether H-CCP is applied.
- H-CCP avoids computing a CCLM or CCCM model for each block encoded according to the CCLM or CCCM mode. It amounts to replacing step 1030 of Fig. 10 with a step of selecting a model from the H-CCLM or H-CCCM table.
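- The table maintenance described above can be sketched as follows. This is a minimal illustration of a constrained FIFO with a redundancy check and a size of “6”. The class name, the representation of a model as a tuple of parameters and the use of the table position as the selection index are assumptions made for this example.

```python
class HistoryTable:
    """History table of cross-component models with a constrained FIFO update."""

    def __init__(self, max_size=6):
        self.max_size = max_size
        self.entries = []  # oldest entry first, most recent entry last

    def update(self, model):
        """Insert the model of the block that has just been coded."""
        if model in self.entries:
            # redundancy check: remove the duplicate so it is re-inserted as newest
            self.entries.remove(model)
        elif len(self.entries) == self.max_size:
            # table full: drop the oldest entry (first in, first out)
            self.entries.pop(0)
        self.entries.append(model)

    def select(self, index):
        """Return the stored model designated by an index (assumed table position)."""
        return self.entries[index]

    def reset(self):
        """Empty the table."""
        self.entries.clear()
```

- After seven distinct models have been inserted, the table holds the last six. Inserting a model that is already stored moves it to the most recent position without changing the table size.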
- The H-CCP mode has some limitations in inter slices. Indeed, since most of the blocks of inter slices are coded in inter mode, the H-CCLM and H-CCCM tables are rarely updated.
- the inter prediction consists in predicting the pixels of a current block from a block of pixels, referred to as the reference block, of a picture preceding or following the current picture, this picture being referred to as the reference picture.
- In the reference picture, the block closest to the current block, in accordance with a similarity criterion, is determined by a motion estimation step 304.
- a motion vector indicating the position of the reference block in the reference picture is determined. Said motion vector is used during a motion compensation step 305 during which a residual block is calculated in the form of a difference between the current block and the reference block.
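- Motion estimation and the residual computation can be sketched as follows. This is a minimal illustration that assumes an exhaustive integer-sample search, the sum of absolute differences (SAD) as the similarity criterion and square blocks stored as lists of rows. The function names and these choices are assumptions made for this example, and practical encoders use faster searches and sub-sample accuracy.

```python
def get_block(picture, x, y, size):
    """Extract a size x size block whose top-left sample is at (x, y)."""
    return [row[x:x + size] for row in picture[y:y + size]]


def sad(block_a, block_b):
    """Sum of absolute differences, used here as the similarity criterion."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def motion_estimation(cur_block, ref_picture, x0, y0, search_range):
    """Return the motion vector (dx, dy) of the closest block in the reference picture."""
    size = len(cur_block)
    height, width = len(ref_picture), len(ref_picture[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x, y = x0 + dx, y0 + dy
            if x < 0 or y < 0 or x + size > width or y + size > height:
                continue  # candidate block falls outside the reference picture
            cost = sad(cur_block, get_block(ref_picture, x, y, size))
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv


def compute_residual(cur_block, ref_picture, x0, y0, mv):
    """Residual block: difference between the current block and the reference block."""
    ref_block = get_block(ref_picture, x0 + mv[0], y0 + mv[1], len(cur_block))
    return [[c - r for c, r in zip(cur_row, ref_row)]
            for cur_row, ref_row in zip(cur_block, ref_block)]
```

- When the current block is an exact copy of a block of the reference picture located one sample to the right, the search returns the motion vector (1, 0) and the residual block is all zeros.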
- In early video compression standards, the mono-directional inter prediction mode described above was the only inter mode available. As video compression standards evolved, the family of inter modes has grown significantly and now comprises many different inter modes.
- the prediction mode optimising the compression performances in accordance with a rate/distortion optimization criterion (i.e., RDO criterion), among the prediction modes tested (Intra prediction modes, Inter prediction modes), is selected by the encoding module.
- the residual block is transformed during a step 307.
- the transformed block is then quantized during a step 309.
- the encoding module can skip the transform and apply quantization directly to the non-transformed residual signal.
- an intra prediction mode and the transformed and quantized residual block are encoded by an entropic encoder during a step 310.
- a motion vector of the block is predicted from a prediction vector selected from a set of motion vector predictors derived from reconstructed blocks situated in a spatial and temporal vicinity of the block to be encoded.
- the motion information is next encoded by the entropic encoder during step 310 in the form of a motion residual and an index for identifying the prediction vector.
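- The motion vector coding described above can be sketched as follows. This is a minimal illustration in which the encoder picks the predictor giving the smallest motion residual and the decoder rebuilds the motion vector from the index and the residual. The function names and the selection criterion (sum of absolute component differences) are assumptions made for this example, and the derivation of the predictor set from neighboring blocks is not shown.

```python
def encode_motion(mv, predictors):
    """Return (index of the chosen prediction vector, motion residual)."""
    def cost(i):
        # assumption: pick the predictor with the smallest absolute residual
        return abs(mv[0] - predictors[i][0]) + abs(mv[1] - predictors[i][1])

    index = min(range(len(predictors)), key=cost)
    pred = predictors[index]
    return index, (mv[0] - pred[0], mv[1] - pred[1])


def decode_motion(index, motion_residual, predictors):
    """Rebuild the motion vector from the predictor index and the motion residual."""
    pred = predictors[index]
    return (pred[0] + motion_residual[0], pred[1] + motion_residual[1])
```

- Both sides must derive the same predictor set for the index to designate the same prediction vector, which is why the predictors are derived from already reconstructed blocks.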
- the transformed and quantized residual block is encoded by the entropic encoder during step 310.
- the encoding module can bypass both transform and quantization, i.e., the entropic encoding is applied on the residual without the application of the transform or quantization processes.
- the result of the entropic encoding is inserted in an encoded video stream (i.e., encoded video data) 311. Metadata such as SEI (supplemental enhancement information) messages can be attached to the encoded video stream 311.
- a SEI message as defined for example in standards such as AVC, HEVC or VVC (or in standard Versatile supplemental enhancement information (VSEI) messages for coded video bitstreams – H.274) is a data container or a syntax structure associated to a video stream and comprising metadata providing information relative to the video stream.
- the encoding module applies, when appropriate, during a step 316, a motion compensation using the motion vector of the current block in order to identify the reference block of the current block.
- When the current block was encoded according to an intra prediction mode, during a step 315, the intra prediction mode corresponding to the current block is used for reconstructing the prediction block of the current block.
- the prediction block and the reconstructed residual block are added in order to obtain the reconstructed current block.
- an in-loop filtering intended to reduce the encoding artefacts is applied, during a step 317, to the reconstructed block.
- In-loop filtering tools comprise deblocking filtering, SAO (Sample Adaptive Offset) and ALF (Adaptive Loop Filtering).
- FIG. 4 schematically depicts a method, executed by a decoding module, for decoding the encoded video stream 311 encoded according to the method described in relation to Fig. 3.
- the method for decoding of Fig.4 is executed by the system 13. Variations of this method for decoding are contemplated, but the method for decoding of Fig. 4 is described below for purposes of clarity without describing all expected variations.
- The decoding is done block by block. For a current block, it starts with an entropic decoding of the current block during a step 410. Entropic decoding makes it possible to obtain, at least, the prediction mode of the block. If the block has been encoded according to an inter prediction mode, the entropic decoding makes it possible to obtain, when appropriate, a prediction vector index, a motion residual and a residual block.
- A motion vector is reconstructed for the current block using the prediction vector index and the motion residual. If the block has been encoded according to an intra prediction mode, entropic decoding makes it possible to obtain an intra prediction mode and a residual block. Steps 412, 413, 414, 415, 416 and 417 implemented by the decoding module are in all respects identical respectively to steps 312, 313, 314, 315, 316 and 317 implemented by the encoding module. Decoded blocks are saved in decoded pictures and the decoded pictures are stored in a DPB 419 in a step 418.
- When the decoding module decodes a given picture, the pictures stored in the DPB 419 are identical to the pictures stored in the DPB 319 by the encoding module during the encoding of said given picture.
- the decoded picture can also be outputted by the decoding module for instance to be displayed.
- a post-processing step 421 may be applied.
- Figs. 5A, 5B and 5C describe examples of devices, apparatuses and/or systems for implementing the various embodiments.
- FIG. 5A illustrates schematically an example of hardware architecture of a processing module 500 able to implement an encoding module or a decoding module capable of implementing respectively a method for encoding of Fig.3 and a method for decoding of Fig. 4 modified according to different aspects and embodiments.
- the encoding module is for example comprised in the system 11 when this system is in charge of encoding the video stream.
- the decoding module is for example comprised in the system 13.
- the processing module 500 comprises, connected by a communication bus 5005: a processor or CPU (central processing unit) 5000 encompassing one or more microprocessors, general purpose computers, special purpose computers, and processors based on a multi-core architecture, as non-limiting examples; a random access memory (RAM) 5001; a read only memory (ROM) 5002; a storage unit 5003, which can include non-volatile memory and/or volatile memory, including, but not limited to, Electrically Erasable Programmable Read-Only Memory (EEPROM), Read-Only Memory (ROM), Programmable Read-Only Memory (PROM), Random Access Memory (RAM), Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), flash, magnetic disk drive, and/or optical disk drive, or a storage medium reader, such as a SD (secure digital) card reader and/or a hard disc drive (HDD) and/or a network accessible storage device; and at least one communication interface 5004 for exchanging data with other modules, devices or systems.
- the communication interface 5004 can include, but is not limited to, a transceiver configured to transmit and to receive data over a communication channel.
- the communication interface 5004 can include, but is not limited to, a modem or network card. If the processing module 500 implements a decoding module, the communication interface 5004 enables for instance the processing module 500 to receive encoded video streams (i.e., video data) and to provide a sequence of decoded pictures. If the processing module 500 implements an encoding module, the communication interface 5004 enables for instance the processing module 500 to receive a sequence of original picture data to encode and to provide an encoded video stream.
- the processor 5000 is capable of executing instructions loaded into the RAM 5001 from the ROM 5002, from an external memory (not shown), from a storage medium, or from a communication network. When the processing module 500 is powered up, the processor 5000 is capable of reading instructions from the RAM 5001 and executing them.
- These instructions form a computer program causing, for example, the implementation by the processor 5000 of a decoding method as described in relation with Fig.14 and/or an encoding method described in relation to Fig.13, these methods comprising various aspects and embodiments described below in this document.
- The methods described in relation to Figs. 13 or 14 may be implemented in software form by the execution of a set of instructions by a programmable machine such as a DSP (digital signal processor) or a microcontroller, or be implemented in hardware form by a machine or a dedicated component such as an FPGA (field-programmable gate array) or an ASIC (application-specific integrated circuit).
- FIG. 5C illustrates a block diagram of an example of the system 13 in which various aspects and embodiments are implemented.
- the system 13 can be embodied as a device including the various components described below and is configured to perform one or more of the aspects and embodiments described in this document. Examples of such devices include, but are not limited to, various electronic devices such as personal computers, laptop computers, smartphones, tablet computers, digital multimedia set top boxes, digital television receivers, personal video recording systems, connected home appliances and head mounted display. Elements of system 13, singly or in combination, can be embodied in a single integrated circuit (IC), multiple ICs, and/or discrete components.
- the system 13 comprises one processing module 500 that implements a decoding module.
- system 13 is communicatively coupled to one or more other systems, or other electronic devices, via, for example, a communications bus or through dedicated input and/or output ports. In various embodiments, the system 13 is configured to implement one or more of the aspects described in this document.
- the input to the processing module 500 can be provided through various input modules as indicated in block 531.
- Such input modules include, but are not limited to, (i) a radio frequency (RF) module that receives an RF signal transmitted, for example, over the air by a broadcaster, (ii) a component (COMP) input module (or a set of COMP input modules), (iii) a Universal Serial Bus (USB) input module, and/or (iv) a High Definition Multimedia Interface (HDMI) input module.
- Other examples, not shown in FIG. 5C, include composite video.
- the input modules of block 531 have associated respective input processing elements as known in the art.
- the RF module can be associated with elements suitable for (i) selecting a desired frequency (also referred to as selecting a signal, or band-limiting a signal to a band of frequencies), (ii) down-converting the selected signal, (iii) band-limiting again to a narrower band of frequencies to select (for example) a signal frequency band which can be referred to as a channel in certain embodiments, (iv) demodulating the down-converted and band-limited signal, (v) performing error correction, and (vi) demultiplexing to select the desired stream of data packets.
- the RF module of various embodiments includes one or more elements to perform these functions, for example, frequency selectors, signal selectors, band-limiters, channel selectors, filters, downconverters, demodulators, error correctors, and demultiplexers.
- the RF portion can include a tuner that performs various of these functions, including, for example, down-converting the received signal to a lower frequency (for example, an intermediate frequency or a near-baseband frequency) or to baseband.
- The RF module and its associated input processing element receive an RF signal transmitted over a wired (for example, cable) medium, and perform frequency selection by filtering, down-converting, and filtering again to a desired frequency band.
- Adding elements can include inserting elements in between existing elements, such as, for example, inserting amplifiers and an analog-to-digital converter.
- the RF module includes an antenna.
- the USB and/or HDMI modules can include respective interface processors for connecting system 13 to other electronic devices across USB and/or HDMI connections. It is to be understood that various aspects of input processing, for example, Reed-Solomon error correction, can be implemented, for example, within a separate input processing IC or within the processing module 500 as necessary. Similarly, aspects of USB or HDMI interface processing can be implemented within separate interface ICs or within the processing module 500 as necessary.
- the demodulated, error corrected, and demultiplexed stream is provided to the processing module 500.
- Various elements of system 13 can be provided within an integrated housing. Within the integrated housing, the various elements can be interconnected and transmit data therebetween using suitable connection arrangements, for example, an internal bus as known in the art, including the Inter-IC (I2C) bus, wiring, and printed circuit boards.
- the processing module 500 is interconnected to other elements of said system 13 by the bus 5005.
- the communication interface 5004 of the processing module 500 allows the system 13 to communicate on the communication channel 12.
- the communication channel 12 can be implemented, for example, within a wired and/or a wireless medium.
- Data is streamed, or otherwise provided, to the system 13, in various embodiments, using a wireless network such as a Wi-Fi network, for example IEEE 802.11 (IEEE refers to the Institute of Electrical and Electronics Engineers).
- The Wi-Fi signal of these embodiments is received over the communications channel 12 and the communications interface 5004, which are adapted for Wi-Fi communications.
- the communications channel 12 of these embodiments is typically connected to an access point or router that provides access to external networks including the Internet for allowing streaming applications and other over-the-top communications.
- Other embodiments provide streamed data to the system 13 using the RF connection of the input block 531. As indicated above, various embodiments provide data in a non-streaming manner.
- various embodiments use wireless networks other than Wi-Fi, for example a cellular network or a Bluetooth network.
- the system 13 can provide an output signal to various output devices, including the display system 15, speakers 535, and other peripheral devices 536.
- the display system 15 of various embodiments includes one or more of, for example, a touchscreen display, an organic light-emitting diode (OLED) display, a curved display, and/or a foldable display.
- the display system 15 can be for a television, a tablet, a laptop, a cell phone (mobile phone), a head mounted display or other devices.
- the display system 15 can also be integrated with other components (for example, as in a smart phone), or separate (for example, an external monitor for a laptop).
- the other peripheral devices 536 include, in various examples of embodiments, one or more of a stand-alone digital video disc (or digital versatile disc) (DVR, for both terms), a disk player, a stereo system, and/or a lighting system.
- Various embodiments use one or more peripheral devices 536 that provide a function based on the output of the system 13. For example, a disk player performs the function of playing an output of the system 13.
- control signals are communicated between the system 13 and the display system 15, speakers 535, or other peripheral devices 536 using signaling such as AV.Link, Consumer Electronics Control (CEC), or other communications protocols that enable device-to-device control with or without user intervention.
- the output devices can be communicatively coupled to system 13 via dedicated connections through respective interfaces 532, 533, and 534. Alternatively, the output devices can be connected to system 13 using the communications channel 12 via the communications interface 5004 or a dedicated communication channel corresponding to the communication channel 12 in Fig. 5C via the communication interface 5004.
- the display system 15 and speakers 535 can be integrated in a single unit with the other components of system 13 in an electronic device such as, for example, a television.
- the display interface 532 includes a display driver, such as, for example, a timing controller (T Con) chip.
- the display system 15 and speaker 535 can alternatively be separate from one or more of the other components.
- the output signal can be provided via dedicated output connections, including, for example, HDMI ports, USB ports, or COMP outputs.
- Fig. 5B illustrates a block diagram of an example of the system 11 in which various aspects and embodiments are implemented.
- System 11 is very similar to system 13.
- the system 11 can be embodied as a device including the various components described below and is configured to perform one or more of the aspects and embodiments described in this document. Examples of such devices include, but are not limited to, various electronic devices such as personal computers, laptop computers, smartphones, tablet computers, a camera and a server.
- Elements of system 11, singly or in combination, can be embodied in a single integrated circuit (IC), multiple ICs, and/or discrete components.
- the system 11 comprises one processing module 500 that implements an encoding module.
- the system 11 is communicatively coupled to one or more other systems, or other electronic devices, via, for example, a communications bus or through dedicated input and/or output ports.
- the system 11 is configured to implement one or more of the aspects described in this document.
- the input to the processing module 500 can be provided through various input modules as indicated in block 531 already described in relation to Fig.5C.
- Various elements of system 11 can be provided within an integrated housing.
- the various elements can be interconnected and transmit data therebetween using suitable connection arrangements, for example, an internal bus as known in the art, including the Inter-IC (I2C) bus, wiring, and printed circuit boards.
- the processing module 500 is interconnected to other elements of said system 11 by the bus 5005.
- the communication interface 5004 of the processing module 500 allows the system 11 to communicate on the communication channel 12.
- Data is streamed, or otherwise provided, to the system 11, in various embodiments, using a wireless network such as a Wi-Fi network, for example IEEE 802.11 (IEEE refers to the Institute of Electrical and Electronics Engineers).
- The Wi-Fi signal of these embodiments is received over the communications channel 12 and the communications interface 5004, which are adapted for Wi-Fi communications.
- the communications channel 12 of these embodiments is typically connected to an access point or router that provides access to external networks including the Internet for allowing streaming applications and other over-the-top communications.
- Other embodiments provide streamed data to the system 11 using the RF connection of the input block 531.
- various embodiments provide data in a non-streaming manner.
- various embodiments use wireless networks other than Wi-Fi, for example a cellular network or a Bluetooth network.
- The data provided to the system 11 can be provided in different formats. In various embodiments these data are encoded and compliant with a known video compression format such as AV1, VP9, VVC, HEVC, AVC, etc.
- these data are raw data provided for example by a picture and/or audio acquisition module connected to the system 11 or comprised in the system 11.
- The processing module 500 takes charge of the encoding of these data.
- the system 11 can provide an output signal to various output devices capable of storing and/or decoding the output signal such as the system 13.
- decoding can encompass all or part of the processes performed, for example, on a received encoded video stream in order to produce a final output suitable for display.
- such processes include one or more of the processes typically performed by a decoder, for example, entropy decoding, inverse quantization, inverse transformation, and prediction.
- such processes also, or alternatively, include processes performed by a decoder of various implementations described in this application, for example, for applying a CC coding tool (e.g., CCLM, MMLM, CCCM or any variant of these coding tools described in this document).
- such processes include one or more of the processes typically performed by an encoder, for example, partitioning, prediction, transformation, quantization, and entropy encoding.
- such processes also, or alternatively, include processes performed by an encoder of various implementations described in this application, for example, for applying a CC coding tool (e.g., CCLM, MMLM, CCCM or any variant of these coding tools described in this document).
- Various embodiments refer to rate distortion optimization.
- the balance or trade-off between a rate and a distortion is usually considered.
- the rate distortion optimization is usually formulated as minimizing a rate distortion function, which is a weighted sum of the rate and of the distortion. There are different approaches to solve the rate distortion optimization problem.
- the approaches may be based on an extensive testing of all encoding options, including all considered modes or coding parameters values, with a complete evaluation of their coding cost and related distortion of a reconstructed signal after coding and decoding.
- Faster approaches may also be used, to save encoding complexity, in particular with computation of an approximated distortion based on a prediction or a prediction residual signal, not the reconstructed one.
- A mix of these two approaches can also be used, such as by using an approximated distortion for only some of the possible encoding options, and a complete distortion for other encoding options.
- Other approaches only evaluate a subset of the possible encoding options.
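- The rate distortion optimization described above can be sketched as follows. This is a minimal illustration of the exhaustive approach, where the cost of each tested option is the weighted sum J = D + λ·R and the option with the lowest cost is kept. The function names, the tuple layout and the numerical values are assumptions made for this example.

```python
def rd_cost(distortion, rate, lam):
    """Rate distortion cost: weighted sum of the distortion and of the rate."""
    return distortion + lam * rate


def select_mode(candidates, lam):
    """Return the name of the tested option with the lowest rate distortion cost.

    candidates: iterable of (name, distortion, rate) tuples, one per tested option.
    """
    best = min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))
    return best[0]


# Hypothetical measurements for two tested modes of one block.
tested = [("intra", 100, 20), ("inter", 150, 5)]
```

- With these hypothetical values, a large λ (rate is expensive) favors the cheaper "inter" option, while a small λ favors the lower-distortion "intra" option. A faster approach would feed `rd_cost` with an approximated distortion computed from the prediction residual instead of the reconstructed signal.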
- implementations and aspects described herein can be implemented in, for example, a method or a process, an apparatus, a software program, a data stream, or a signal. Even if only discussed in the context of a single form of implementation (for example, discussed only as a method), the implementation of features discussed can also be implemented in other forms (for example, an apparatus or program).
- An apparatus can be implemented in, for example, appropriate hardware, software, and firmware.
- The term “processor” refers to processing devices in general, including, for example, a computer, a microprocessor, an integrated circuit, or a programmable logic device.
- Processors also include communication devices, such as, for example, computers, cell phones, portable/personal digital assistants ("PDAs"), and other devices that facilitate communication of information between end-users.
- This application may refer to “determining” various pieces of information. Determining the information can include one or more of, for example, estimating the information, calculating the information, predicting the information, retrieving the information from memory or obtaining the information, for example, from another device, from a module or from a user. Further, this application may refer to “accessing” various pieces of information.
- Accessing the information can include one or more of, for example, receiving the information, retrieving the information (for example, from memory), storing the information, moving the information, copying the information, calculating the information, determining the information, predicting the information, or estimating the information. Additionally, this application may refer to “receiving” various pieces of information. Receiving is, as with “accessing”, intended to be a broad term. Receiving the information can include one or more of, for example, accessing the information, or retrieving the information (for example, from memory).
- “receiving” is typically involved, in one way or another, during operations such as, for example, storing the information, processing the information, transmitting the information, moving the information, copying the information, erasing the information, calculating the information, determining the information, predicting the information, or estimating the information. It is to be appreciated that the use of any of the following “/”, “and/or”, and “at least one of”, “one or more of” for example, in the cases of “A and/or B” and “at least one of A and B”, “one or more of A and B” is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of both options (A and B).
- the word “signal” refers to, among other things, indicating something to a corresponding decoder.
- the encoder signals a use of some coding tools.
- the same parameters can be used at both the encoder side and the decoder side.
- an encoder can transmit (explicit signaling) a particular parameter to the decoder so that the decoder can use the same particular parameter.
- signaling can be used without transmitting (implicit signaling) to simply allow the decoder to know and select the particular parameter. By avoiding transmission of any actual functions, a bit savings is realized in various embodiments.
- signaling can be accomplished in a variety of ways. For example, one or more syntax elements, flags, and so forth are used to signal information to a corresponding decoder in various embodiments. While the preceding relates to the verb form of the word “signal”, the word “signal” can also be used herein as a noun. As will be evident to one of ordinary skill in the art, implementations can produce a variety of signals formatted to carry information that can be, for example, stored or transmitted. The information can include, for example, instructions for performing a method, or data produced by one of the described implementations. For example, a signal can include a signal indicating how to apply a CC coding tool.
- Such a signal can be formatted, for example, as an electromagnetic wave (for example, using a radio frequency portion of spectrum) or as a baseband signal.
- the formatting can include, for example, encoding an encoded video stream and modulating a carrier with the encoded video stream.
- the information that the signal carries can be, for example, analog or digital information.
- the signal can be transmitted over a variety of different wired or wireless links, as is known.
- the signal can be stored on a processor-readable medium.
- various embodiments of the H-CCP and NA-CCP modes are proposed.
- CCLM and CCCM model parameters are associated to inter coded blocks.
- a picture can be associated to a buffer, called Prediction Mode (PM) buffer in the following.
- the PM buffer allows storing some coding parameters for blocks of the associated picture.
- This buffer is stored in the DPB 319 or 419 along with the associated pictures.
- reference pictures of the DPB are associated to PM buffers.
- A PM buffer can be viewed as a grid of cells of pre-determined size (e.g., 4x4).
- a cell of the PM buffer is filled with the coding parameters of a block of the associated picture covering this cell. For instance, a block of size 16x16 covers “16” 4x4 cells, each cell of the “16” 4x4 cells being filled with the coding parameters of the corresponding 16x16 block.
- the coding parameters include information representing an intra/inter mode of a block, the motion vector(s) of a block, the Picture Order Count (POC) of (a) reference picture(s) used in inter prediction of a block, etc.
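- The PM buffer layout can be sketched as follows. This is a minimal illustration that assumes 4x4 cells, blocks given as (x, y, width, height) tuples lying entirely inside the picture, and coding parameters held in a dictionary. The class name, the method names and the dictionary keys are assumptions made for this example.

```python
CELL_SIZE = 4  # pre-determined cell size (4x4 in the example of the description)


class PMBuffer:
    """Grid of cells, each holding the coding parameters of the block covering it."""

    def __init__(self, width, height, cell=CELL_SIZE):
        self.cell = cell
        cols = (width + cell - 1) // cell
        rows = (height + cell - 1) // cell
        self.cells = [[None] * cols for _ in range(rows)]

    def fill(self, block, params):
        """Fill every cell covered by the block; return the number of cells filled."""
        x, y, w, h = block
        count = 0
        for cy in range(y // self.cell, (y + h - 1) // self.cell + 1):
            for cx in range(x // self.cell, (x + w - 1) // self.cell + 1):
                self.cells[cy][cx] = params
                count += 1
        return count

    def get(self, x, y):
        """Return the coding parameters stored in the cell covering sample (x, y)."""
        return self.cells[y // self.cell][x // self.cell]
```

- Filling the buffer for a 16x16 block touches “16” cells, all of which then return the coding parameters of that block.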
- A process called spanInfo allows an inter coded block to inherit the coding parameters of the reference block used for inter predicting the inter coded block.
- The cell(s) of the PM buffer associated with the current picture that correspond to the inter coded block inherit the coding parameters stored in the cell(s) corresponding to the reference block in the PM buffer associated with the reference picture comprising the reference block.
- Another process called storeInfo allows storing coding parameters of a current block in the PM buffer associated with the current picture, in the cell(s) corresponding to the current block.
- Fig.12 illustrates an example of application of the spanInfo process.
- Fig. 12 represents “3” pictures P0, P1 and P2.
- Picture P0 (respectively picture P1 and P2) is associated with a PM buffer PM0 (respectively PM1 and PM2).
- a block B0 of picture P0 is encoded in INTRA mode.
- the coding parameters of block B0 are stored in a cell C0 of the PM buffer PM0.
- a block B1 of picture P1 is inter coded using block B0 as a reference block (block B0 is pointed by a motion vector MV1 of block B1).
- the block B1 is associated to a cell C1 of the PM buffer PM1.
- the cell C1 inherits from the coding parameters (i.e., the coding parameters of block B0) stored in the cell C0 of the PM buffer PM0.
- a block B2 of picture P2 is inter coded using block B1 as a reference block (block B1 is pointed by a motion vector MV2 of block B2).
- the block B2 is associated to a cell C2 of the PM buffer PM2.
- the cell C2 inherits from the coding parameters (i.e., the coding parameters of block B0) stored in the cell C1 of the PM buffer PM1.
- the spanInfo process is invoked before the encoding/decoding of a current block coded in inter using the motion information of the current block (i.e., motion vector and index of the reference picture), while the storeInfo process is invoked after the encoding/decoding of a current block.
- the spanInfo process is applied just before the encoding/decoding of block B1 (respectively B2) so that cell C1 (respectively C2) inherits from the parameters stored in cell C0 (respectively C1) that may be used to encode B1 (respectively B2).
- edges of a reference block are not necessarily aligned with the edges of encoded blocks as defined by the partitioning of the reference picture, and a reference block may cover several encoded blocks of the reference picture.
- the spanInfo process uses the cell of the reference picture corresponding to a pre-determined location in the reference block (the center of the reference block for example).
- a specific inheritance process is also applied to bi-predicted inter blocks to determine which motion vector and reference picture index to use.
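The inheritance chain of Fig. 12 can be sketched as follows. This is a minimal illustration under stated assumptions: PM buffers are modelled as plain dictionaries keyed by cell coordinates, the helper names `store_info` and `span_info` and the block positions and motion vectors are hypothetical, and the predetermined location is taken to be the center of the reference block, as in the example given in the text. Bi-prediction is not handled.

```python
CELL = 4  # cell size in pixels (the 4x4 example value)


def cell_index(x, y):
    """(row, col) of the cell containing pixel (x, y)."""
    return (y // CELL, x // CELL)


def store_info(pm, x, y, w, h, params):
    """Write params into every cell covered by the block."""
    for row in range(y // CELL, (y + h - 1) // CELL + 1):
        for col in range(x // CELL, (x + w - 1) // CELL + 1):
            pm[(row, col)] = params


def span_info(cur_pm, ref_pm, x, y, w, h, mv):
    """Copy into the current block's cells the parameters stored at the
    center of the reference block designated by the motion vector."""
    ref_cx = x + mv[0] + w // 2
    ref_cy = y + mv[1] + h // 2
    inherited = ref_pm.get(cell_index(ref_cx, ref_cy))
    if inherited is not None:
        store_info(cur_pm, x, y, w, h, inherited)
    return inherited


pm0, pm1, pm2 = {}, {}, {}

# B0: 8x8 intra block at (0, 0) in picture P0.
store_info(pm0, 0, 0, 8, 8, {"mode": "intra", "origin": "B0"})

# B1: 8x8 inter block at (8, 8) in P1, MV1 points to B0.
b1 = span_info(pm1, pm0, 8, 8, 8, 8, mv=(-8, -8))

# B2: 8x8 inter block at (16, 8) in P2, MV2 points into B1.
# The reference block starts at (10, 10), so it is not aligned with B1.
b2 = span_info(pm2, pm1, 16, 8, 8, 8, mv=(-6, 2))

print(b2["origin"])  # B0
```

After the two calls, the cells of B2 hold the parameters originally stored for B0, mirroring the C0 to C1 to C2 propagation of the figure.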
- it is proposed to modify the PM buffer so that it allows storing CCCM and CCLM model parameters, or to manage another PM buffer for storing CCCM and CCLM models, similarly to the regular PM buffer.
- the spanInfo process allows a cell corresponding to an inter coded block to inherit CCCM or CCLM models from another cell.
- Cells of PM buffers corresponding to inter blocks could therefore store CCCM or CCLM models.
- the CCLM or CCCM model is stored in the cell(s) corresponding to the current block of the PM buffer associated to the current picture using the storeInfo process. Doing so, in inter slices, when applying the H-CCP mode to a current block, the frequency of updating of the H-CCLM and H-CCCM tables increases.
- the blocks associated to CCLM or CCCM models are not restricted to blocks actually encoded using the CCLM or the CCCM mode; any type of block can be associated to CCLM or CCCM models.
- even though the H-CCLM and H-CCCM tables are reset when a new CTU row is encountered, the likelihood of finding a block associated to a CCLM or CCCM model in a current CTU row of an inter slice is increased.
- the spanInfo process can be invoked to use the coding parameters stored in the PM buffer associated with at least one reference block used to predict the adjacent block at a corresponding position in the reference block (1502), as depicted in Fig.15.
- Fig. 13 depicts schematically a modified method for encoding video data according to embodiments. Compared to the method of Fig. 3, the method of Fig. 13 comprises new steps 1301 and 1302, and step 315 is replaced by a step 1300. All additional steps are executed for example by the processing module 500 of the system 11.
- Step 1300 is identical to step 315 except in that, responsive to an application of the H-CCP mode to a current block, the determination of the CCLM (or CCCM) model for the current block uses a H-CCLM (or H-CCCM) table updated according to the first embodiment.
- the CCLM model (or the CCCM model) to apply to the current block is obtained from a CCLM model (or CCCM model) associated to a block encoded before the current block.
- the block from which the CCLM (or CCCM) model is obtained was encoded using a mode different from the CCLM (or CCCM) coding mode, i.e., was encoded according to an inter mode.
- the H-CCLM and H-CCCM tables were filled using cells of the PM buffer corresponding in majority to inter blocks. If the current block is encoded according to the H-CCP mode, the CCLM (or CCCM) model used for the current block is stored (using the storeInfo process) in a cell (or in cells) of the PM buffer associated to the current picture corresponding to the current block in step 1301.
- the spanInfo process is applied to the current block.
- the cell(s) (called current cell) corresponding to the current block of the PM buffer associated to the current picture inherits from the coding parameters stored in the cell (called target cell) corresponding to a reference block designated by the motion information of the current block in the PM buffer associated to a reference picture.
- the coding parameters comprise a CCLM (or CCCM) model
- the model is also inherited. Therefore, the CCLM (or CCCM) model of the target cell is copied into the current cell.
- Fig.14 depicts schematically a method for decoding encoded video data according to an embodiment.
- the method of Fig. 14 comprises new steps 1401 and 1402, and step 415 is replaced by a step 1400. All additional steps are executed for example by the processing module 500 of the system 13. Steps 1400, 1401, 1402 and 1403 are respectively identical to steps 1300, 1301, 1302 and 1303.
- the current cell inherits from the CCLM (or CCCM) model of the target cell in steps 1302 and 1402 only if the inter prediction error of the current block is low.
- the current block inherits from the CCLM (or CCCM) models in step 1302 under a condition that compares a value derived from $DC$ and $QP$ with a threshold $TH1$, where:
- $DC$ is the value of the DC coefficient of the reconstructed residual of the current block
- $QP$ is the quantization parameter of the current block
- $TH1$ is a predefined threshold.
- the current cell inherits from the CCLM (or CCCM) model of the target cell in steps 1302 and 1402 only if the number of successive applications of the spanInfo process that led to storing the CCLM (or CCCM) model in the target cell is below a predefined threshold TH2.
- the value derived from $DC$ and $QP$ and the number of successive applications of the spanInfo process that led to storing the CCLM (or CCCM) model in the target cell can be considered as values representative of a quality of the CCLM (or CCCM) model inherited between the current cell and the target cell.
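The TH2 criterion can be sketched as follows. This is only an illustration of counting successive spanInfo applications: the record `StoredModel`, the function `inherit_model` and the value chosen for `TH2` are hypothetical, and the criterion based on the DC coefficient and QP is deliberately left out.

```python
from dataclasses import dataclass, replace
from typing import Optional

TH2 = 3  # hypothetical threshold value, chosen only for the example


@dataclass(frozen=True)
class StoredModel:
    params: tuple        # model parameters, e.g. (alpha, beta) for CCLM
    span_count: int = 0  # successive spanInfo applications that carried it here


def inherit_model(target: Optional[StoredModel], th2: int = TH2) -> Optional[StoredModel]:
    """Return the model to store in the current cell, or None when the
    target cell has no model or its model was already propagated too often."""
    if target is None or target.span_count >= th2:
        return None
    return replace(target, span_count=target.span_count + 1)


# Propagate one model from picture to picture until the limit is reached.
model = StoredModel(params=(0.5, 12))
hops = 0
while (model := inherit_model(model)) is not None:
    hops += 1
print(hops)  # 3
```

Keeping the counter next to the model parameters means the check needs no information beyond what the target cell already stores.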
- CCLM model or CCCM model or both are computed for all or some reconstructed blocks predicted using a prediction mode different from the CCLM or CCCM mode (i.e., either using an inter mode or in intra mode not using the CCLM or CCCM mode). For example, whatever the actual mode of a reconstructed block is (inter, intra with or without CCLM or CCCM), a CCLM model or CCCM model or both are computed for the reconstructed block. Each computed model is then stored (using the storeInfo process) in the cell corresponding to the block of the PM buffer associated to the current picture.
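Computing a CCLM model for an arbitrary reconstructed block can be sketched as follows. CCLM predicts a chroma sample as a linear function of the co-located luma sample; the text does not say how the two parameters are derived, so the least-squares fit below, the function name `fit_cclm` and the use of floating point are assumptions. The luma samples are assumed to be already downsampled to the chroma grid.

```python
def fit_cclm(luma, chroma):
    """Least-squares fit of chroma ~= alpha * luma + beta over co-located
    reconstructed samples. Returns (alpha, beta)."""
    n = len(luma)
    if n == 0 or n != len(chroma):
        raise ValueError("luma and chroma must be non-empty and of equal length")
    mean_l = sum(luma) / n
    mean_c = sum(chroma) / n
    var_l = sum((l - mean_l) ** 2 for l in luma)
    if var_l == 0:
        # Flat luma: fall back to a constant (DC) prediction.
        return 0.0, mean_c
    cov = sum((l - mean_l) * (c - mean_c) for l, c in zip(luma, chroma))
    alpha = cov / var_l
    beta = mean_c - alpha * mean_l
    return alpha, beta


# Reconstructed samples of a block, whatever its actual coding mode was.
luma = [100, 120, 140, 160]
chroma = [70, 80, 90, 100]
alpha, beta = fit_cclm(luma, chroma)
print(alpha, beta)  # 0.5 20.0
```

The resulting pair is what would then be written, through the storeInfo process, into the cell(s) of the PM buffer corresponding to the block.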
- the candidate region list is replaced by a candidate block list.
- a flag is signaled to indicate whether NA-CCP is applied to the chroma block of the current block. If NA-CCP is applied, an index is signaled to indicate which candidate block in the candidate block list provides the CCCM (or CCLM) model to the current block.
- the candidate region list is kept.
- a plurality of blocks may be comprised in a region.
- the CCCM (or CCLM) model of the block comprising a pre-determined (e.g., top left) pixel location of the region is considered as the CCCM (or CCLM) model of the region.
- a block may cover at least partially several regions.
- the CCCM (or CCLM) model of the block comprising the top-left pixel of the region is considered as the CCCM (or CCLM) model of the region.
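The rule that maps a region to a single model can be sketched as follows. The block list layout `(x, y, w, h, model)` and the function name `model_for_region` are hypothetical; only the rule itself (take the model of the block that contains the region's top-left pixel) comes from the text.

```python
def model_for_region(region_x, region_y, blocks):
    """Return the model of the block containing the region's top-left pixel.

    blocks is a list of (x, y, w, h, model) tuples describing reconstructed
    blocks; None is returned when no block contains that pixel."""
    for x, y, w, h, model in blocks:
        if x <= region_x < x + w and y <= region_y < y + h:
            return model
    return None


blocks = [
    (0, 0, 16, 16, "model_A"),
    (16, 0, 8, 8, "model_B"),
    (16, 8, 8, 8, "model_C"),
]

# A 16x16 region at (8, 8) overlaps blocks A and C, but its top-left
# pixel lies in block A.
print(model_for_region(8, 8, blocks))   # model_A
print(model_for_region(16, 8, blocks))  # model_C
```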
- an example of implementation of the second embodiment in the case of the NA-CCP mode is represented by the insertion of a step 1303 to the method of Fig.3 in addition to steps 1300 and 1301.
- step 1302 is not applied.
- the processing module 500 of the system 11 computes a CCCM (or CCLM) model for a current block if necessary. For instance, in step 1303, a CCCM (or CCLM) model is computed for each inter block and for each intra block not using the CCCM (or CCLM) mode of an inter slice. Each computed CCCM (or CCLM) model is stored in a cell of the PM buffer associated to the current picture corresponding to the current block.
- In Fig. 14, an example of implementation of the second embodiment in the case of the NA-CCP mode is represented by the insertion of a step 1403 into the method of Fig. 4 in addition to steps 1400 and 1401.
- step 1402 is not applied.
- Step 1403 is identical to step 1303.
- the CCCM (or CCLM) model to apply to the current block is obtained from a CCCM (or CCLM) model associated to a block encoded/decoded before the current block.
- the block from which the CCCM (or CCLM) model is obtained was encoded using a mode different from the CCCM (or CCLM) mode, i.e., was encoded according to an inter mode or to an intra mode not using the CCCM (or CCLM) mode.
- the blocks of the candidate block list were encoded in inter mode or in an intra mode not using the CCCM (or CCLM) mode.
- a CCCM (or CCLM) model is computed only for a subset of blocks of the inter slice to ensure that each block of the inter slice has at least one block in its candidate block list associated to a cell of a PM buffer comprising a CCCM (or CCLM) model.
- the candidate block list comprises six blocks
- each time the encoding module has reconstructed six blocks, it computes a CCCM (or CCLM) model.
- a frequency of computing the CCCM or CCLM model is predefined, for example, at pre-determined locations in pictures, after having reconstructed a given number of pixels, after having reconstructed a block of a given size.
- CCLM or CCCM or both model(s) is (are) computed for an inter block so that the H-CCLM and H-CCCM tables are updated even if the CCLM (or CCCM) mode is not used in the inter slice.
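- an example of implementation of the second embodiment in the case of the H-CCP mode is represented by the insertion of a step 1303 into the method of Fig. 3 in addition to steps 1300 and 1301. In this embodiment, step 1302 is not applied.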
- the processing module 500 of the system 11 computes a CCLM (or CCCM or both) model for a reconstructed block if necessary (for example, if the reconstructed block was not yet encoded according to the CCLM or CCCM mode). For instance, in step 1303, a CCLM (or CCCM or both) model is computed for each inter block and for each intra block not using the CCLM (or CCCM) mode of an inter slice. After having computed the CCLM (or CCCM or both) model for the reconstructed block, the corresponding table (the H-CCLM table or the H-CCCM table) is updated with the CCLM (or CCCM) model of the current block.
- in step 1300, if a current block is coded with the CCLM (or CCCM) mode, a flag is signaled to indicate whether H-CCP is applied. If H-CCP is used, an index is further signaled to indicate which candidate model in the H-CCLM table or H-CCCM table is selected.
- In Fig. 14, an example of implementation of the second embodiment in the case of the H-CCP mode is represented by the insertion of a step 1403 into the method of Fig. 4 in addition to steps 1400 and 1401. In this embodiment, step 1402 is not applied. Step 1403 is identical to step 1303.
- in step 1400, if a current block is coded with the CCLM (or CCCM) mode, a flag indicating whether H-CCP is applied to the current block is decoded. If H-CCP is used, an index indicating which candidate model in the H-CCLM table or H-CCCM table is selected is decoded. Therefore, when applying the NA-CCP mode to a current block in steps 1300 and 1400, the CCCM model to apply to the current block is obtained from a CCCM model associated to a block encoded/decoded before the current block.
- step 1302 is applied by the encoding module (resp. the decoding module) when a CCLM (or CCCM) model is not computed for each inter block or intra block not using the CCLM (or CCCM) mode.
- the model is only computed for inter blocks that would inherit CCLM (or CCCM) model from an inter block via the spanInfo process.
- the model is not computed for inter blocks inheriting CCLM (or CCCM) model from an intra block using the CCLM (or CCCM) mode via the spanInfo process.
- no CCLM (or CCCM) model is computed for the block B1 since it inherits its model from an INTRA block.
- since block B2 would inherit its CCLM (or CCCM) model from the inter block B1 via the spanInfo process, a new CCLM (or CCCM) model is computed for block B2.
- the cell size of the grid of the PM buffer used for storing CCCM and CCLM models is different from that of the regular PM buffer used for storing the intra prediction mode or the motion information.
- the PM buffer for storing the intra prediction mode or for storing the motion information is 4x4
- the cells size of the grid of the PM buffer used for storing CCCM and CCLM models is 8x8.
- Increasing the size of the cells allows for reducing the amount of internal memory, whereas reducing the size of the cells allows increasing the accuracy and the number of CC-model candidates potentially.
- the cell size may vary per frame. For example, one may allocate less memory (larger cell size) for some slices or pictures.
- the selection/derivation of the cell size may be a function of the temporal id of the frame.
- the precision of the CC-model parameters may be reduced to reduce the internal memory. For example, one may use 64 bits for deriving the CC-model parameters and 32 bits for storing them in the PM buffer.
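The memory trade-off can be illustrated with a short calculation. The picture size (1920x1080) and the count of two parameters per model are assumptions made only for this example; the 4x4 and 8x8 cell sizes and the 64-bit and 32-bit precisions are the example values from the text.

```python
def cell_count(width, height, cell_size):
    """Number of cells needed to cover a picture (sizes rounded up)."""
    cols = -(-width // cell_size)   # ceiling division
    rows = -(-height // cell_size)
    return cols * rows


def storage_bytes(width, height, cell_size, params_per_model, bits_per_param):
    """Bytes needed when each cell stores one full model."""
    return cell_count(width, height, cell_size) * params_per_model * bits_per_param // 8


# Hypothetical configuration: 1920x1080 picture, 2 parameters per model.
fine = storage_bytes(1920, 1080, cell_size=4, params_per_model=2, bits_per_param=64)
coarse = storage_bytes(1920, 1080, cell_size=8, params_per_model=2, bits_per_param=32)

print(cell_count(1920, 1080, 4), cell_count(1920, 1080, 8))  # 129600 32400
print(fine // coarse)  # 8
```

Doubling the cell side divides the number of cells by four, and halving the stored precision halves the bytes per cell, hence the factor of eight in this configuration.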
- the CCCM and CCLM models are stored into a look up table (LUT) and the PM buffer stores an index pointing to the LUT (Fig.15). In that way, the CCCM and CCLM models are not duplicated into the grid and only one index is stored in each cell. One LUT is associated to each picture.
- the spanInfo process can be modified by inheriting and/or copying index rather than the CCCM and CCLM model parameters directly.
- a process for converting the indexes of one reference picture into new indexes for the current picture and for updating the current LUT with CCCM and CCLM models from the reference picture LUT is applied.
- the spanInfo process can be invoked to update the current LUT with models stored in the reference LUTs, and fill-in the current PM buffer with new indexes converted from the indexes in the reference PM buffer.
- the size of the LUT is limited to a predefined or signaled value maxLUT.
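The LUT-based variant can be sketched as follows. The class `ModelLUT` and the function `span_info_index` are hypothetical names. Two behaviours are assumptions not stated in the text: identical models share one LUT entry, and a model that does not fit once `max_lut` entries are used is simply not stored (the index is `None`).

```python
class ModelLUT:
    """Per-picture table of models; PM buffer cells store indexes into it."""

    def __init__(self, max_lut):
        self.max_lut = max_lut
        self.models = []   # index -> model
        self._index = {}   # model -> index, used to avoid duplicates

    def add(self, model):
        """Return the index of model, inserting it if new.
        Returns None when the table is full (assumed policy)."""
        if model in self._index:
            return self._index[model]
        if len(self.models) >= self.max_lut:
            return None
        self._index[model] = len(self.models)
        self.models.append(model)
        return self._index[model]


def span_info_index(cur_lut, ref_lut, ref_index):
    """Convert an index valid in the reference picture's LUT into an
    index valid in the current picture's LUT, updating the latter."""
    if ref_index is None:
        return None
    return cur_lut.add(ref_lut.models[ref_index])


ref_lut = ModelLUT(max_lut=4)
idx_a = ref_lut.add((0.5, 20))    # models as hashable tuples
idx_b = ref_lut.add((0.25, 64))

cur_lut = ModelLUT(max_lut=2)
cur_lut.add((0.75, 3))                                # index 0 in the current LUT
new_b = span_info_index(cur_lut, ref_lut, idx_b)      # inserted at index 1
same_b = span_info_index(cur_lut, ref_lut, idx_b)     # reused, not duplicated
overflow = span_info_index(cur_lut, ref_lut, idx_a)   # table full
print(new_b, same_b, overflow)  # 1 1 None
```

Each cell then carries a small integer instead of a full parameter set, which is the memory saving the LUT is meant to provide.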
- a TV, set-top box, cell phone, tablet, or other electronic device that performs at least one of the embodiments described, and that displays (e.g., using a monitor, screen, or other type of display) a resulting picture.
- a TV, set-top box, cell phone, tablet, or other electronic device that tunes (e.g., using a tuner) a channel to receive a signal including an encoded video stream, and performs at least one of the embodiments described.
- a TV, set-top box, cell phone, tablet, or other electronic device that receives (e.g., using an antenna) a signal over the air that includes an encoded video stream, and performs at least one of the embodiments described.
- a server, camera, cell phone, tablet or other electronic device that transmits (e.g., using an antenna) a signal over the air that includes an encoded video stream, and performs at least one of the embodiments described.
- a server, camera, cell phone, tablet or other electronic device that tunes (e.g., using a tuner) a channel to transmit a signal including an encoded video stream, and performs at least one of the embodiments described.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computing Systems (AREA)
- Theoretical Computer Science (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
A method comprises: obtaining a first block to be reconstructed using a cross-component coding mode, wherein a model of the cross-component coding mode to apply for reconstructing the first block is obtained from a model of the cross-component coding mode associated with a second reconstructed block, the second block having been reconstructed using a mode different from the cross-component coding mode.
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP23305388 | 2023-03-22 | | |
| EP23306017 | 2023-06-26 | | |
| PCT/EP2024/057124 WO2024194243A1 (fr) | 2023-03-22 | 2024-03-18 | Prédiction inter-composantes dans des images inter |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| EP4684526A1 (fr) | 2026-01-28 |
Family
ID=90364829
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| EP24711560.3A Pending EP4684526A1 (fr) | 2023-03-22 | 2024-03-18 | Prédiction inter-composantes dans des images inter |
Country Status (5)
| Country | Link |
|---|---|
| EP (1) | EP4684526A1 (fr) |
| JP (1) | JP2026511121A (fr) |
| KR (1) | KR20250162536A (fr) |
| CN (1) | CN120642328A (fr) |
| WO (1) | WO2024194243A1 (fr) |
-
2024
- 2024-03-18 KR KR1020257029917A patent/KR20250162536A/ko active Pending
- 2024-03-18 EP EP24711560.3A patent/EP4684526A1/fr active Pending
- 2024-03-18 JP JP2025555498A patent/JP2026511121A/ja active Pending
- 2024-03-18 WO PCT/EP2024/057124 patent/WO2024194243A1/fr not_active Ceased
- 2024-03-18 CN CN202480010597.XA patent/CN120642328A/zh active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| WO2024194243A1 (fr) | 2024-09-26 |
| CN120642328A (zh) | 2025-09-12 |
| KR20250162536A (ko) | 2025-11-18 |
| JP2026511121A (ja) | 2026-04-10 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12549747B2 (en) | Spatial resolution adaptation of in-loop and post-filtering of compressed video using metadata | |
| US20250386037A1 (en) | Simplification for cross-component intra prediction | |
| EP3854098A1 (fr) | Procédé et dispositif de codage et de décodage d'images | |
| US20250267307A1 (en) | Method and device for image encoding and decoding | |
| US20240275960A1 (en) | High-level syntax for picture resampling | |
| US20260099953A1 (en) | Mixing analog and digital neural networks implementations in video coding processes | |
| US12395637B2 (en) | Spatial illumination compensation on large areas | |
| US20240291986A1 (en) | Coding of last significant coefficient in a block of a picture | |
| WO2024194243A1 (fr) | Prédiction inter-composantes dans des images inter | |
| WO2021001215A1 (fr) | Matrices de quantification dépendant du format de chrominance pour encodage et décodage vidéo | |
| US20260067484A1 (en) | Film grain synthesis using encoding information | |
| WO2024208638A1 (fr) | Transformations non séparables pour applications à faible retard | |
| KR20250152590A (ko) | 인트라 예측에 대한 변환 도메인 접근법 | |
| EP4714107A1 (fr) | Suppression de certaines redondances dans un codage d'informations de mouvement | |
| WO2022214363A1 (fr) | Matrices 4x4 de transformées de dst7 et de dct8 de haute précision | |
| KR20250107180A (ko) | 가장 가능성이 높은 모드의 동적 목록을 사용하는 인트라 예측 모드의 인코딩 및 디코딩 방법과 대응하는 장치 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: UNKNOWN |
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
| | PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
| | 17P | Request for examination filed | Effective date: 20250826 |
| | AK | Designated contracting states | Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC ME MK MT NL NO PL PT RO RS SE SI SK SM TR |