WO2024007789A1 - Prediction generation with out-of-bound check in video coding - Google Patents

Prediction generation with out-of-bound check in video coding

Info

Publication number
WO2024007789A1
Authority
WO
WIPO (PCT)
Prior art keywords
block
oob
prediction
sample
predictor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2023/098287
Other languages
English (en)
Inventor
Yu-Ling Hsiao
Chih-Wei Hsu
Yu-Wen Huang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
MediaTek Inc
Original Assignee
MediaTek Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by MediaTek Inc filed Critical MediaTek Inc
Priority to CN202380051898.2A priority Critical patent/CN119586131A/zh
Priority to TW112125288A priority patent/TW202420819A/zh
Publication of WO2024007789A1 publication Critical patent/WO2024007789A1/fr
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103Selection of coding mode or of prediction mode
    • H04N19/109Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/563Motion estimation with padding, i.e. with filling of non-object values in an arbitrarily shaped picture block or region for estimation purposes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/577Motion compensation with bidirectional frame interpolation, i.e. using B-pictures

Definitions

  • the present disclosure relates generally to video coding.
  • the present disclosure relates to methods of coding pixel blocks by inter-and intra-prediction.
  • High-Efficiency Video Coding (HEVC) is an international video coding standard developed by the Joint Collaborative Team on Video Coding (JCT-VC).
  • HEVC is based on the hybrid block-based motion-compensated DCT-like transform coding architecture.
  • The basic unit for compression, termed coding unit (CU), is a 2Nx2N square block of pixels, and each CU can be recursively split into four smaller CUs until the predefined minimum size is reached.
  • Each CU contains one or multiple prediction units (PUs) .
  • Versatile Video Coding (VVC) is a video coding standard developed by the Joint Video Expert Team (JVET).
  • the input video signal is predicted from the reconstructed signal, which is derived from the coded picture regions.
  • the prediction residual signal is processed by a block transform.
  • the transform coefficients are quantized and entropy coded together with other side information in the bitstream.
  • the reconstructed signal is generated from the prediction signal and the reconstructed residual signal after inverse transform on the de-quantized transform coefficients.
  • the reconstructed signal is further processed by in-loop filtering for removing coding artifacts.
  • the decoded pictures are stored in the frame buffer for predicting the future pictures in the input video signal.
  • a coded picture is partitioned into non-overlapped square block regions represented by the associated coding tree units (CTUs) .
  • the leaf nodes of a coding tree correspond to the coding units (CUs) .
  • a coded picture can be represented by a collection of slices, each comprising an integer number of CTUs. The individual CTUs in a slice are processed in raster-scan order.
  • a bi-predictive (B) slice may be decoded using intra prediction or inter prediction with at most two motion vectors and reference indices to predict the sample values of each block.
  • a predictive (P) slice is decoded using intra prediction or inter prediction with at most one motion vector and reference index to predict the sample values of each block.
  • An intra (I) slice is decoded using intra prediction only.
  • a CTU can be partitioned into one or multiple non-overlapped coding units (CUs) using the quadtree (QT) with nested multi-type-tree (MTT) structure to adapt to various local motion and texture characteristics.
  • a CU can be further split into smaller CUs using one of the five split types: quad-tree partitioning, vertical binary tree partitioning, horizontal binary tree partitioning, vertical center-side triple-tree partitioning, horizontal center-side triple-tree partitioning.
  • Each CU contains one or more prediction units (PUs) .
  • The prediction unit, together with the associated CU syntax, works as a basic unit for signaling the predictor information.
  • the specified prediction process is employed to predict the values of the associated pixel samples inside the PU.
  • Each CU may contain one or more transform units (TUs) for representing the prediction residual blocks.
  • A transform unit (TU) comprises a transform block (TB) of luma samples and two corresponding transform blocks of chroma samples, and each TB corresponds to one residual block of samples from one color component.
  • An integer transform is applied to a transform block.
  • the level values of quantized coefficients together with other side information are entropy coded in the bitstream.
  • CTB: coding tree block; CB: coding block; PB: prediction block; TB: transform block.
  • motion parameters consisting of motion vectors, reference picture indices and reference picture list usage index, and additional information are used for inter-predicted sample generation.
  • the motion parameter can be signalled in an explicit or implicit manner.
  • When a CU is coded with skip mode, the CU is associated with one PU and has no significant residual coefficients, no coded motion vector delta, and no reference picture index.
  • A merge mode is specified whereby the motion parameters for the current CU are obtained from neighbouring CUs, including spatial and temporal candidates, as well as additional candidate types introduced in VVC.
  • the merge mode can be applied to any inter-predicted CU.
  • The alternative to merge mode is the explicit transmission of motion parameters, where the motion vector, the corresponding reference picture index for each reference picture list, the reference picture list usage flag, and other needed information are signalled explicitly for each CU.
  • Some embodiments of the disclosure provide a method of coding video pictures using predictive coding with out-of-bound (OOB) checks.
  • a video coder receives data to be encoded or decoded as a current block of a current picture of a video.
  • the video coder identifies a first reference block in a first reference picture based on a first block vector of the current block and a second reference block in a second reference picture based on a second block vector of the current block.
  • the video coder performs out-of-bound (OOB) check for the first and second reference blocks.
  • the video coder generates a predictor for the current block based on the first and second reference blocks and based on the OOB check.
  • the video coder encodes or decodes the current block by using the generated predictor.
  • the first and second block vectors may be motion vectors.
  • the first reference picture or the second reference picture is the current picture, and the first block vector or the second block vector is an intra-prediction direction or mode.
  • In some embodiments, bi-directional motion compensation is not applied to at least one of a horizontal direction and a vertical direction.
  • For example, the bi-directional motion compensation is not applied when a sample of the first reference block is OOB over a left or right boundary, or over a top or bottom boundary, of the first reference picture.
  • When a sample of the first reference block is a non-OOB sample, a corresponding predictor sample is generated by blending the non-OOB sample of the first reference block with a corresponding non-OOB sample of the second reference block.
  • When a sample of the first reference block is an OOB sample, a corresponding predictor sample is generated by using the corresponding non-OOB sample of the second reference block without blending.
  • The video may be a 360-degree video in an equi-rectangular projection (ERP) format.
  • When a sample of the first reference block is OOB across the left or right boundary of the first reference picture, a corresponding predictor sample is generated by blending a horizontal wrap-around sample from the first reference picture with a corresponding non-OOB sample from the second reference block in the second reference picture.
  • When a sample of the first reference block is OOB across the top or bottom boundary, a corresponding predictor sample is generated by using the corresponding non-OOB sample from the second reference block without blending.
  • In some embodiments, the current block is partitioned into multiple geometric partitions; the first reference block is used to generate a first predictor for a first geometric partition, and the second reference block is used to generate a second predictor for a second geometric partition.
  • a corresponding predictor sample is generated by blending samples of the first and second predictors along a boundary between the first and second geometric partitions, with out-of-bound samples of the first and second predictors excluded from the blending.
  • In some embodiments, the generated predictor is a combined inter-intra prediction (CIIP), wherein the first reference block provides inter-prediction samples and the second reference block provides intra-prediction samples.
  • the current block is coded using multi-hypothesis prediction (MHP) , such that the encoder performs an additional OOB check for a third reference block, and the predictor is generated further based on the third reference block and the additional OOB check.
  • the first block vector is refined to minimize a calculated template matching cost between a reference template neighboring the first reference block and a current template neighboring the current block, with OOB samples of the reference template excluded from the cost calculation.
  • the first and second block vectors are refined to minimize a bilateral matching cost based on the first and second reference blocks, with OOB samples in the first and second reference blocks excluded from the cost calculation.
  • FIG. 1 illustrates bi-directional prediction with out-of-bound (OOB) reference blocks.
  • FIG. 2 conceptually illustrates multiple hypothesis prediction (MHP) for coding a current block.
  • FIG. 3 illustrates the partitioning of a coding unit (CU) by the geometric partitioning mode (GPM) .
  • FIG. 4 illustrates an example uni-prediction candidate list for a GPM partition and the selection of a uni-prediction MV for GPM.
  • FIG. 5 illustrates an example partition edge blending process for GPM for a CU.
  • FIG. 6 conceptually illustrates performing template matching based on a search area around an initial motion vector (MV) .
  • FIG. 7 conceptually illustrates a current block having sub-block motion.
  • FIG. 8 conceptually illustrates refinement of a prediction candidate by bilateral matching (BM) .
  • FIG. 9 shows the intra-prediction modes in different directions.
  • FIGS. 10A-B conceptually illustrate top and left reference templates with extended lengths for supporting wide-angular direction mode for non-square blocks of different aspect ratios.
  • FIGS. 11A-C conceptually illustrate horizontal wrap-around motion compensation with out-of-boundary (OOB) checks.
  • FIG. 12 conceptually illustrates intra prediction with OOB reference samples.
  • FIG. 13 illustrates an example video encoder that may implement predictive coding.
  • FIG. 14 illustrates portions of the video encoder that implement out-of-bound checks for predictive coding.
  • FIG. 15 conceptually illustrates a process for using out-of-bound checks for predictive coding of a block of pixels.
  • FIG. 16 illustrates an example video decoder that may implement predictive coding.
  • FIG. 17 illustrates portions of the video decoder that implement out-of-bound checks for predictive coding.
  • FIG. 18 conceptually illustrates a process for using out-of-bound checks for predictive coding of a block of pixels.
  • FIG. 19 conceptually illustrates an electronic system with which some embodiments of the present disclosure are implemented.
  • FIG. 1 illustrates bi-directional prediction with out-of-boundary (OOB) reference blocks.
  • bi-directional motion compensation is performed to generate an inter prediction block (predictor) of a current block 105 in a current picture 100.
  • a list 0 reference block 120 is partially out-of-boundary (OOB) of a reference picture 110.
  • a list 1 reference block 121 is fully inside a reference picture 111.
  • The OOB portion 140 of the reference block 120 is filled with repetitive padding samples when the video coder performs a prediction generation process 130.
  • The prediction generation process 130 uses the reference blocks 120 and 121 to generate a prediction block 135 for the current block 105. Since the OOB portion 140 of the reference block 120 is padded with repetitive samples derived from the boundary samples within the reference picture 110, the part of the motion compensated block 135 that corresponds to the OOB portion 140 may provide less prediction efficiency.
  • a uni-directional motion compensated sample is regarded as OOB when one of its reference samples is located outside the reference picture beyond half sample.
  • the corresponding bi-directional sample is set equal to the non-OOB sample instead of using an average of the OOB and non-OOB samples.
  • the OOB prediction samples are discarded, and only the non-OOB predictors are used to generate the final predictor.
  • Let Pos_x^{i,j} and Pos_y^{i,j} denote the x and y positions of one prediction sample (i, j) in the current block, and let Pos_LeftBdry, Pos_RightBdry, Pos_TopBdry, and Pos_BottomBdry be the positions of the four boundaries of the picture.
  • One prediction sample is regarded as OOB when at least one of the following conditions is satisfied:
    (Pos_x^{i,j} + MV_LX_x^{i,j}) > (Pos_RightBdry + half_pixel),
    (Pos_x^{i,j} + MV_LX_x^{i,j}) < (Pos_LeftBdry − half_pixel),
    (Pos_y^{i,j} + MV_LX_y^{i,j}) > (Pos_BottomBdry + half_pixel),
    (Pos_y^{i,j} + MV_LX_y^{i,j}) < (Pos_TopBdry − half_pixel).
  • half_pixel is equal to 8, which represents the half-sample distance at 1/16-pel sample precision.
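The following is a minimal Python sketch (not the normative specification) of the per-sample OOB check and of the bi-prediction rule described above; the helper names and the equal-weight average are illustrative assumptions, and all positions, MVs, and boundaries are taken to be in 1/16-pel units.

```python
HALF_PIXEL = 8  # half-sample distance at 1/16-pel MV precision

def is_oob(pos_x, pos_y, mv_x, mv_y, left, right, top, bottom):
    """Return True if the reference position for prediction sample
    (pos_x, pos_y) lies more than half a sample outside the reference
    picture boundaries (all values in 1/16-pel units)."""
    ref_x, ref_y = pos_x + mv_x, pos_y + mv_y
    return (ref_x > right + HALF_PIXEL or ref_x < left - HALF_PIXEL or
            ref_y > bottom + HALF_PIXEL or ref_y < top - HALF_PIXEL)

def bi_pred_sample(p0, oob0, p1, oob1):
    """Discard an OOB uni-directional sample and use the non-OOB one;
    otherwise apply a regular average (equal weights assumed here)."""
    if oob0 and not oob1:
        return p1
    if oob1 and not oob0:
        return p0
    return (p0 + p1 + 1) >> 1  # bi-prediction average with rounding
```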
  • the horizontal wrap around motion compensation is a 360-specific coding tool designed to improve the visual quality of reconstructed 360-degree video in an equi-rectangular projection (ERP) format.
  • ERP equi-rectangular projection
  • In conventional motion compensation, when a motion vector refers to samples beyond the picture boundaries of the reference picture, repetitive padding is applied to derive the values of the out-of-bounds samples by copying from the nearest neighbors on the corresponding picture boundary.
  • For 360-degree video, this method of repetitive padding is not suitable and may cause visual artefacts called “seam artefacts” in a reconstructed viewport video. Because a 360-degree video is captured on a sphere and inherently has no “boundary,” the reference samples that are out of the boundaries of a reference picture in the projected domain can always be obtained from neighboring samples in the spherical domain.
  • the horizontal wrap around motion compensation can be combined with the (non-normative) padding method often used in 360-degree video coding. This may be achieved by signaling a high-level syntax element to indicate the wrap-around offset, which may be set to the width of padding applied to the ERP picture. This syntax may be used to adjust the position of horizontal wrap around accordingly. This syntax is not affected by the specific amount of padding on the left and right picture boundaries, and therefore naturally supports asymmetric padding of the ERP picture, i.e., when left and right padding are different.
  • the horizontal wrap around motion compensation provides more meaningful information for motion compensation when the reference samples are outside of the reference picture’s left and right boundaries.
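As an illustration, the following Python sketch shows how an out-of-bounds horizontal position can be wrapped back into the ERP picture while the vertical direction still uses repetitive padding; integer-sample positions and a wrap_offset equal to the signaled wrap-around width are assumptions of this sketch.

```python
def wrap_around_x(ref_x, wrap_offset):
    """Wrap a horizontally out-of-bounds x position back into the picture;
    on the ERP sphere there is no left/right seam, and Python's modulo
    (always in [0, wrap_offset)) implements the wrap directly."""
    return ref_x % wrap_offset

def fetch_ref_sample(picture, ref_x, ref_y, height, wrap_offset):
    """picture: list of rows; horizontal OOB -> wrap-around,
    vertical OOB -> repetitive padding by clamping."""
    x = wrap_around_x(ref_x, wrap_offset)
    y = min(max(ref_y, 0), height - 1)
    return picture[y][x]
```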
  • In the multi-hypothesis inter prediction mode (MHP), one or more additional motion-compensated prediction signals are signaled, in addition to the conventional bi-prediction signal.
  • the resulting overall prediction signal is obtained by sample-wise weighted superposition.
  • The weighting factor α (or blending ratio) is specified by a syntax element add_hyp_weight_idx in the coded video to be, e.g., 1/4 or −1/8.
  • FIG. 2 conceptually illustrates multiple hypothesis prediction (MHP) for coding a current block 220.
  • the current block 220 is in a current picture 200.
  • The current block has bi-prediction motion vectors MV0 and MV1 for fetching predictors 230 (in reference picture 210) and 231 (in reference picture 211), respectively.
  • the bi-prediction predictors are used to generate the bi-prediction signal p bi .
  • the current block is coded by using MHP prediction.
  • the MHP prediction is based on an additional hypothesis selected from several MHP candidates A, B, C, and D, which respectively locate predictors or prediction signals 241, 242, 243, and 244 in different reference pictures.
  • the hypothesis C is selected to generate the additional prediction signal h 3 , which is prediction signal 243.
  • more than one additional prediction signal can be used.
  • the resulting overall prediction signal is accumulated iteratively with each additional prediction signal.
  • p_{n+1} = (1 − α_{n+1}) · p_n + α_{n+1} · h_{n+1}
  • The resulting prediction signal p_{n+1} is obtained as a blending of the last accumulated prediction p_n and the newest additional prediction signal h_{n+1}.
  • The first existing prediction of the current block is p_0.
  • p_0 is indicated with the existing merge index.
  • The blending of h_{n+1} and p_n is a weighted sum based on a blending ratio or weighting factor α_{n+1}.
  • a weighting index may be signaled/parsed to indicate the weighting factor.
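A minimal sketch of the iterative superposition above, using floating point for clarity (a real codec would use integer weights and rounding); the function and variable names are illustrative.

```python
def mhp_accumulate(p0, hypotheses, alphas):
    """p0: initial (e.g., bi-prediction) sample value; hypotheses/alphas:
    additional prediction samples and their signaled weighting factors
    (e.g., 1/4 or -1/8), applied as p = (1 - a) * p + a * h."""
    p = p0
    for h, a in zip(hypotheses, alphas):
        p = (1 - a) * p + a * h  # blend accumulated prediction with new hypothesis
    return p

# Example: one extra hypothesis with weight 1/4:
# mhp_accumulate(100.0, [120.0], [0.25]) == 105.0
```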
  • Combined inter and intra prediction (CIIP) combines an inter prediction signal with an intra prediction signal.
  • The inter prediction signal in the CIIP mode, P_inter, is derived using the same inter prediction process applied to regular merge mode; the intra prediction signal P_intra is derived following the regular intra prediction process with the planar mode or with one or more intra prediction modes derived from a pre-defined mechanism.
  • the pre-defined mechanism is based on the neighboring reference regions (template) of the current block.
  • the intra prediction mode of a CU is implicitly derived by a neighboring template at both encoder and decoder, instead of being signalled as the exact intra prediction mode bits to the decoder.
  • the prediction samples of the template are generated using the reference samples of the template for each candidate mode.
  • A cost is calculated as the sum of absolute transformed differences (SATD) between the prediction samples and the reconstruction samples of the template.
  • the intra prediction mode with the minimum cost and/or some intra prediction modes with the smaller costs are selected and used for intra prediction of the CU.
  • The candidate modes may be all MPMs and/or any subset of the MPMs, the 67 intra prediction modes as in VVC, or an extended set of 131 intra prediction modes.
  • The intra and inter prediction signals are combined using weighted averaging, where the weight value is calculated depending on the coding modes of the top and left neighbouring blocks.
  • When a CU is coded in merge mode, if the CU contains at least 64 luma samples (that is, CU width times CU height is equal to or larger than 64), and if both CU width and CU height are less than 128 luma samples, an additional flag may be signaled to indicate whether CIIP mode is applied to the current CU.
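A sketch of CIIP blending following the VVC-style weight derivation, where the weight depends on whether the top and left neighbouring blocks are intra-coded; treating an unavailable neighbour as not intra-coded is an assumption of this sketch.

```python
def ciip_weight(top_is_intra, left_is_intra):
    """VVC-style CIIP weight: 3 if both neighbours are intra-coded,
    2 if exactly one is, 1 otherwise."""
    if top_is_intra and left_is_intra:
        return 3
    if top_is_intra or left_is_intra:
        return 2
    return 1

def ciip_sample(p_inter, p_intra, wt):
    """P = ((4 - wt) * P_inter + wt * P_intra + 2) >> 2 (integer samples)."""
    return ((4 - wt) * p_inter + wt * p_intra + 2) >> 2
```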
  • Geometric partitioning mode is a coding tool for inter prediction that partitions the current block being coded into two or more parts.
  • Each GPM partitioning or GPM split is a partition mode characterized by a distance-angle pairing that defines a bisecting or segmenting line.
  • The geometric partitioning mode (GPM) is signalled using a CU-level flag as one kind of merge mode, alongside other merge modes that include the regular merge mode, the MMVD mode, the CIIP mode, and the subblock merge mode.
  • FIG. 3 illustrates the partitioning of a CU by the geometric partitioning mode (GPM) .
  • the figure illustrates examples of the GPM splits grouped by identical angles.
  • GPM geometric partitioning mode
  • a CU is split into at least two parts by a geometrically located straight line.
  • the location of the splitting line is mathematically derived from the angle and offset parameters of a specific partition.
  • Each partition in the CU formed by a partition mode of GPM is inter-predicted using its own motion vector.
  • only uni-prediction is allowed for each partition, that is, each part has one motion vector and one reference index.
  • The uni-prediction motion constraint is applied to ensure that, as in conventional bi-prediction, only two motion compensated predictions are performed for each CU.
  • a geometric partition index indicating the partition mode of the geometric partitioning (angle and offset) and two merge indices (one for each partition) are further signalled.
  • Each of the at least two partitions created by the geometric partitioning according to a partition mode may be assigned a merge index to select a candidate from a uni-prediction candidate list (also referred to as the GPM candidate list) .
  • the pair of merge indices of the two partitions therefore select a pair of merge candidates.
  • the maximum number of candidates in the GPM candidate list may be signalled explicitly in SPS to specify syntax binarization for GPM merge indices.
  • The sample values along the geometric partitioning edge are adjusted using a blending process with adaptive weights. This is the prediction signal for the whole CU, and the transform and quantization processes will be applied to the whole CU as in other prediction modes.
  • the motion field of the CU as predicted by GPM is then stored.
  • the uni-prediction candidate list for a GPM partition may be derived directly from the merge candidate list of the current CU.
  • FIG. 4 illustrates an example uni-prediction candidate list 400 for a GPM partition and the selection of a uni-prediction MV for GPM.
  • The GPM candidate list 400 is constructed in an even-odd manner with only uni-prediction candidates that alternate between L0 MVs and L1 MVs. Let n be the index of the uni-prediction motion in the uni-prediction candidate list for GPM.
  • the LX (i.e., L0 or L1) motion vector of the n-th extended merge candidate, with X equal to the parity of n, is used as the n-th uni-prediction motion vector for GPM. (These motion vectors are marked with “x” in the figure. ) In case a corresponding LX motion vector of the n-th extended merge candidate does not exist, the L (1 -X) motion vector of the same candidate is used instead as the uni-prediction motion vector for GPM.
  • The sample values along the geometric partition edge are adjusted using a blending process with adaptive weights. Specifically, after predicting each part of a geometric partition using its own motion, blending is applied to the at least two prediction signals to derive samples around the geometric partition edge.
  • The blending weight for each position of the CU is derived based on the distance between the individual position and the partition edge.
  • The distance for a position (x, y) to the partition edge is derived as d(x, y) = (2x + 1 − w) · cos(φ_i) + (2y + 1 − h) · sin(φ_i) − ρ_j, where w and h are the width and height of the CU.
  • i, j are the indices for the angle and offset of a geometric partition, which depend on the signaled geometric partition index.
  • The signs of ρ_{x,j} and ρ_{y,j} depend on the angle index i.
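A floating-point Python sketch of this weight derivation follows; the actual codec uses integer arithmetic with cosine lookup tables, so the radian angle, the explicit cos/sin calls, and the floor-based clipping here are simplifying assumptions.

```python
import math

def gpm_blend_weights(w, h, angle_rad, rho):
    """Per-sample weights in [0, 1] for the first GPM partition, using
    d(x, y) = (2x + 1 - w)cos(phi) + (2y + 1 - h)sin(phi) - rho
    and a ramp of width 8 samples centered on the partition edge."""
    weights = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            d = ((2 * x + 1 - w) * math.cos(angle_rad) +
                 (2 * y + 1 - h) * math.sin(angle_rad) - rho)
            w_idx = 32 - d  # mirrored to 32 + d for the other partition index
            weights[y][x] = min(max(math.floor((w_idx + 4) / 8.0), 0), 8) / 8.0
    return weights

# Blending per sample: pred[y][x] = w * P0[y][x] + (1 - w) * P1[y][x]
```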
  • FIG. 5 illustrates an example partition edge blending process for GPM for a CU 500.
  • blending weights are generated based on an initial blending weight w 0 .
  • motionIdx is equal to d (4x+2, 4y+2) , which is recalculated from equation (2) .
  • The partIdx depends on the angle index i. If sType is equal to 0 or 1, Mv1 or Mv2 is stored in the corresponding motion field; otherwise, if sType is equal to 2, a combined Mv from Mv1 and Mv2 is stored.
  • the combined Mv are generated using the following process: (i) If Mv1 and Mv2 are from different reference picture lists (one from L0 and the other from L1) , then Mv1 and Mv2 are simply combined to form the bi-prediction motion vectors; (ii) otherwise, if Mv1 and Mv2 are from the same list, only uni-prediction motion Mv2 is stored.
  • Template matching (TM) is a decoder-side MV derivation method to refine the motion information of the current CU by finding the closest match between a template of the current CU (e.g., top and/or left neighbouring blocks of the current CU) in the current picture and a set of pixels of the same size as the template in a reference picture.
  • FIG. 6 conceptually illustrates performing template matching based on a search area around an initial motion vector (MV) .
  • the video coder searches the reference picture or frame 601 within a [-8, +8] -pel search range around an initial MV 610 for a better or refined MV 611.
  • the initial MV 610 identifies an initial reference template 630.
  • the refined MV 611 identifies a refined reference template 631.
  • the search is based on minimizing the difference (or cost) between a current template 620 neighboring the current block 605 and the refined reference template 631.
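A simplified Python sketch of this search, assuming full-pel positions, a flattened-template SAD cost, and no picture-boundary handling; the helper names are illustrative.

```python
def get_template(pic, x, y, w, h):
    """Top row and left column neighbouring a w-by-h block at (x, y),
    returned as one flat list (no bounds checks in this sketch)."""
    top = [pic[y - 1][x + i] for i in range(w)]
    left = [pic[y + j][x - 1] for j in range(h)]
    return top + left

def tm_refine(ref_pic, cur_template, bx, by, w, h, init_mv, search_range=8):
    """Search a [-8, +8]-pel window around init_mv for the MV whose
    reference template best matches the current template (minimum SAD)."""
    best_mv, best_cost = init_mv, float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            mv = (init_mv[0] + dx, init_mv[1] + dy)
            ref_template = get_template(ref_pic, bx + mv[0], by + mv[1], w, h)
            cost = sum(abs(a - b) for a, b in zip(cur_template, ref_template))
            if cost < best_cost:
                best_cost, best_mv = cost, mv
    return best_mv
```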
  • The template matching may be performed with a search step size that is determined based on an adaptive motion vector resolution (AMVR) mode.
  • The template matching process can be cascaded with a bilateral matching process in merge modes.
  • an MVP candidate is determined based on template matching error to select the one that reaches the minimum difference between the current block template and the reference block template, and then TM is performed only for this particular MVP candidate for MV refinement.
  • the TM process refines this MVP candidate, starting from full-pel MVD precision (or 4-pel for 4-pel AMVR mode) within a [–8, +8] -pel search range by using iterative diamond search.
  • The AMVP candidate may be further refined by using cross search with full-pel MVD precision (or 4-pel for 4-pel AMVR mode), followed sequentially by half-pel and quarter-pel ones depending on an AMVR mode search pattern.
  • Adaptive Reordering of Merge Candidates with Template Matching (ARMC-TM) is a method to re-order merge candidates based on template-matching (TM) cost, where signaling efficiency is improved by sorting merge candidates in ascending order of TM costs.
  • merge candidates are reordered before the refinement process.
  • The template matching cost of a merge candidate may be measured by the sum of absolute differences (SAD) between samples of the current template 620 of the current block and their corresponding reference samples in the reference template (e.g., reference template 630).
  • merge candidates are divided into several subgroups.
  • the subgroup size is set to 5 for regular merge mode and TM merge mode.
  • the subgroup size is set to 3 for affine merge mode.
  • Merge candidates in each subgroup are reordered ascendingly according to cost values based on template matching. In some embodiments, merge candidates in the last but not the first subgroup are not reordered.
  • the above template may include several sub-templates with the size of Wsub ⁇ 1, and the left template includes several sub-templates with the size of 1 ⁇ Hsub.
  • the motion information of the subblocks in the first row and the first column of current block is used to derive the reference samples of each sub-template.
  • FIG. 7 conceptually illustrates a current block 700 having sub-block motion.
  • the current block 700 is coded by using the motion information of the subblocks in the first row and the first column (subblocks A-G) of the current block.
  • The motion information of subblocks A-G is used to identify reference subblocks A'-G' in a reference picture.
  • The current block 700 has a neighboring template 710, which includes sub-templates 711-717 that neighbor subblocks A-G, respectively, above and to the left of the current block 700.
  • The reference subblocks A'-G' have corresponding neighboring reference sub-templates 721-727 in the reference picture.
  • the TM costs of the motion information of the different subblocks can be computed by matching the sub-templates 711-717 with the corresponding respective sub-templates 721-727.
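In the spirit of the TM sketch above, the per-subblock cost accumulation might look like the following; pairing the current and reference sub-templates per subblock is assumed to have been done by the caller.

```python
def subblock_tm_cost(sub_templates, ref_sub_templates):
    """sub_templates / ref_sub_templates: lists of equal-length sample lists,
    one pair per first-row / first-column subblock (A..G); the total TM cost
    is the SAD accumulated over all sub-template pairs."""
    return sum(sum(abs(a - b) for a, b in zip(cur, ref))
               for cur, ref in zip(sub_templates, ref_sub_templates))
```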
  • A bilateral-matching (BM) based decoder-side motion vector refinement can be applied to refine MVs.
  • a refined MV is searched around the initial MVs in the reference picture list L0 and reference picture list L1.
  • the BM method calculates the distortion between the two candidate blocks in the reference picture list L0 and list L1.
  • The MV candidate with the lowest SAD becomes the refined MV and is used to generate the bi-predicted signal.
  • A multi-pass decoder-side motion vector refinement (MP-DMVR) method is applied in regular merge mode if the selected merge candidate meets the DMVR conditions.
  • In the first pass, bilateral matching (BM) is applied to the coding block.
  • In the second pass, BM is applied to each 16x16 subblock within the coding block.
  • In the third pass, the MV in each 8x8 subblock is refined by applying bi-directional optical flow (BDOF).
  • The BM refines a pair of motion vectors MV0 and MV1 under the constraint that the motion vector difference MVD0 (i.e., MV0' − MV0) has exactly the opposite sign of the motion vector difference MVD1 (i.e., MV1' − MV1).
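A simplified sketch of bilateral matching under this mirrored-MVD constraint: the same offset is added to MV0 and subtracted from MV1, and the SAD between the two candidate blocks is minimized. The block-fetching callables and the small full-pel search range are assumptions of this sketch.

```python
def bm_refine(block0_at, block1_at, mv0, mv1, search_range=2):
    """block0_at(mv) / block1_at(mv): callables returning the L0/L1 candidate
    block samples (flat lists) for a given MV."""
    best, best_cost = (mv0, mv1), float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            cand0 = (mv0[0] + dx, mv0[1] + dy)
            cand1 = (mv1[0] - dx, mv1[1] - dy)  # MVD1 = -MVD0
            cost = sum(abs(a - b)
                       for a, b in zip(block0_at(cand0), block1_at(cand1)))
            if cost < best_cost:
                best_cost, best = cost, (cand0, cand1)
    return best
```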
  • The number of directional intra modes may be extended from 33, as used in HEVC, to 65 directional modes, so that the range of k is from ±1 to ±16.
  • These denser directional intra prediction modes apply for all block sizes and for both luma and chroma intra predictions.
  • The number of intra-prediction modes is 35 (or 67).
  • some modes are identified as a set of most probable modes (MPM) for intra-prediction in current prediction block.
  • the encoder may reduce bit rate by signaling an index to select one of the MPMs instead of an index to select one of the 35 (or 67) intra-prediction modes.
  • the intra-prediction mode used in the left prediction block and the intra-prediction mode used in the above prediction block are used as MPMs.
  • If the two neighboring blocks use the same intra-prediction mode, that intra-prediction mode can be used as an MPM.
  • If that mode is a directional mode, the two neighboring directions immediately next to it can also be used as MPMs.
  • DC mode and Planar mode are also considered as MPMs to fill the available spots in the MPM set, especially if the above or left neighboring blocks are not available or not coded in intra-prediction, or if the intra-prediction modes in the neighboring blocks are not directional modes.
  • If the intra-prediction mode for the current prediction block is one of the modes in the MPM set, 1 or 2 bits are used to signal which one it is. Otherwise, the intra-prediction mode of the current block is not the same as any entry in the MPM set, and the current block will be coded as a non-MPM mode. There are altogether 32 such non-MPM modes and a (5-bit) fixed-length coding method is applied to signal this mode.
  • Additional conditions in the MPM derivation depend on the difference between the largest and smallest neighbouring angular modes, e.g., whether Max − Min is greater than or equal to 62, or equal to 2.
  • Conventional angular intra prediction directions are defined from 45 degrees to -135 degrees in clockwise direction.
  • In VVC, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes for non-square blocks.
  • the replaced modes are signalled using the original mode indices, which are remapped to indices of wide angular modes after parsing.
  • the total number of intra prediction modes is unchanged, i.e., 67, and the intra mode coding method is unchanged.
  • a top reference template with length 2W+1 and a left reference template with length 2H+1 are defined.
  • FIGS. 10A-B conceptually illustrate top and left reference templates with extended lengths for supporting wide-angular direction mode for non-square blocks of different aspect ratios.
  • In Overlapped Block Motion Compensation (OBMC), the final prediction is a weighted blend of several prediction signals, in which Inter_predY represents the samples predicted by the motion of the current block in the original domain, Intra_predY represents the samples predicted in the mapped domain, OBMC_predY represents the samples predicted by the motion of neighboring blocks in the original domain, and w_0 and w_1 are the weights.
  • a subblock-boundary OBMC is performed by applying the same blending to the top, left, bottom, and right subblock boundary pixels using neighboring subblocks’ motion information.
  • OBMC may be enabled for subblock-based coding tools such as affine AMVP modes, affine merge modes, subblock-based temporal motion vector prediction, and subblock-based bilateral matching.
  • In some embodiments, inter blending is performed prior to luma mapping with chroma scaling (LMCS) of the inter samples.
  • LMCS is applied to the blended inter samples, which are then combined with LMCS-applied intra samples in CIIP mode.
  • Some embodiments of the disclosure provide a method for applying the OOB check to predictive coding modes other than bi-directional motion compensation.
  • In some embodiments, bi-directional motion compensation with OOB check is applied only in the vertical direction; that is, horizontal wrap-around motion compensation is used for OOB samples in the horizontal direction, and bi-directional motion compensation with OOB check is used for OOB samples in the vertical direction.
  • In some embodiments, bi-directional motion compensation with OOB check is not applied for the horizontal wrap-around motion compensation process or for motion compensation of 360-degree video in the ERP projection format.
  • FIGS. 11A-C conceptually illustrate horizontal wrap-around motion compensation with out-of-boundary (OOB) checks.
  • a current block 1105 in a current picture 1100 is coded by bi-prediction using motion vectors MV0 and MV1.
  • FIG. 11A shows MV0 pointing to a reference block 1120 in a reference picture 1110.
  • FIG. 11B shows MV1 pointing to a reference block 1121 in an L1 reference picture 1111.
  • a part 1130 of the reference block 1120 is OOB by being outside of the left (or right) boundary of the projected domain of the reference picture 1110.
  • The samples for the OOB part 1130 of the reference block 1120 are generated by horizontal wrap-around motion compensation, specifically by replicating from a spherical neighbor part 1131 that is located within the reference picture 1110 toward the right boundary in the projected domain.
  • a part 1140 of the reference block 1121 is OOB by being outside of the top (or bottom) boundary of the projected domain of the L1 reference picture 1111. In some embodiments, repetitive padding is applied to generate the samples of the OOB part 1140.
  • FIG. 11C illustrates generating a bi-directional motion compensation predictor 1135 based on the OOB and the non-OOB samples for ERP video.
  • The methods described in Section I above can be used to generate the predictor 1135 for the current block 1105 based on the reference blocks 1120 and 1121 and their respective OOB status.
  • the samples generated by horizontal wrap-around for the OOB part 1130 of the reference block 1120 are not considered OOB samples.
  • The non-OOB samples from the reference block 1120 (including the horizontal wrap-around samples of part 1130) and the non-OOB samples from the reference block 1121 are blended (e.g., by weighted averaging) to generate prediction samples for the predictor 1135.
  • At sample positions that correspond to OOB samples (e.g., positions in the OOB part 1140), only the non-OOB samples from the reference block 1120 are used, so OOB samples are not included in the predictor 1135.
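A minimal sketch of this ERP-specific rule, assuming the samples have already been fetched with horizontal wrap-around applied (so wrapped samples count as non-OOB and only the vertical check remains):

```python
HALF_PIXEL = 8  # half-sample distance at 1/16-pel precision

def erp_vertical_oob(ref_y, top, bottom):
    """Vertical-only OOB test for ERP: horizontal positions are wrapped,
    so only top/bottom overshoot beyond half a sample makes a sample OOB."""
    return ref_y > bottom + HALF_PIXEL or ref_y < top - HALF_PIXEL

def erp_bi_pred_sample(p0, oob0, p1, oob1):
    """Blend the two lists' samples unless one of them is (vertically) OOB,
    in which case only the non-OOB sample is used."""
    if oob0 and not oob1:
        return p1
    if oob1 and not oob0:
        return p0
    return (p0 + p1 + 1) >> 1
```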
  • In some embodiments, the OOB check is applied to intra-predicted blocks.
  • If a reference sample used for intra prediction is a padded sample rather than a real reconstructed sample, the video coder may treat the corresponding predicted sample as OOB.
  • FIG. 12 conceptually illustrates intra prediction with OOB reference samples.
  • a current block 1210 is intra-predicted.
  • the samples of an upper/top L-neighbor 1220 of the current block are real reconstructed samples, and the samples of a left L-neighbor 1222 of the current block are padding samples.
  • the block 1210 is intra predicted by a diagonal direction.
  • the samples in an upper triangle 1230 are non-OOB and the samples in a lower triangle 1232 are OOB samples.
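A sketch of this classification: a predicted sample is marked OOB when the reference sample its prediction direction points to is a padded sample rather than a real reconstructed one. The use_top predicate and per-side padded flags are illustrative assumptions.

```python
def intra_oob_map(w, h, top_is_padded, left_is_padded, use_top):
    """use_top(x, y) -> True if sample (x, y) predicts from the top reference
    row under the chosen angular mode; samples fed by a padded reference
    side are treated as OOB."""
    return [[top_is_padded if use_top(x, y) else left_is_padded
             for x in range(w)]
            for y in range(h)]

# FIG. 12-style example: real top reference, padded left reference, and a
# diagonal mode splitting the block into a non-OOB upper triangle and an
# OOB lower triangle (the exact predicate depends on the mode):
# intra_oob_map(4, 4, False, True, lambda x, y: x >= y)
```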
  • In some embodiments, the OOB check is applied to a DMVR sub-block (or coding-block-level MP-DMVR, or subblock-level MP-DMVR).
  • If a search point (e.g., the initial MV plus an MV offset) is OOB, the search point is designated as unavailable or deprioritized relative to non-OOB search points (e.g., by applying additional costs or a pre-defined large cost).
  • In some embodiments, the refined MV is set to the initial MV.
  • In some embodiments, even if the initial MV is OOB, the video coder still compares the initial MV with other non-OOB search points; the video coder does not set the initial MV as unavailable, set the initial MV to a lower priority than non-OOB search points, add additional costs, or apply a pre-defined large cost.
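A sketch of OOB-aware search-point handling for a DMVR-style search; the cost function, the OOB predicate, and the large-penalty constant are assumptions of this sketch.

```python
LARGE_COST = 1 << 30  # pre-defined large cost used to deprioritize OOB points

def dmvr_select(search_points, cost_of, is_oob_point, penalize=True):
    """Pick the lowest-cost search point; OOB points are either skipped
    (treated as unavailable) or penalized so non-OOB points are preferred.
    Returns None if every point was skipped (a caller may then fall back
    to the initial MV)."""
    best, best_cost = None, float("inf")
    for pt in search_points:
        if is_oob_point(pt):
            if not penalize:
                continue                      # OOB point is unavailable
            cost = cost_of(pt) + LARGE_COST   # or deprioritize with a penalty
        else:
            cost = cost_of(pt)
        if cost < best_cost:
            best_cost, best = cost, pt
    return best
```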
  • In some embodiments, the OOB check is applied to the two (or more) partitions of a GPM-coded block (e.g., GPM-TM, GPM-MMVD, GPM-INTER-INTRA, GPM-INTRA-INTRA). Specifically, OOB checks may be performed for two Inter partitions, or one Inter partition and one Intra partition, or two Intra partitions. GPM blending is then applied to the results of the OOB checks according to the following:
  • In some embodiments, the OOB check is not applied to the Intra part and is only applied to the Inter part, such that the OOB check result of the Intra part is always non-OOB.
  • GPM Blending for GPM with one Inter part and one Intra part may be computed according to the following:
  • OOB of P_{i,j}^{Inter} = (OOB of P_{i,j}^{Inter,L0}) & (OOB of P_{i,j}^{Inter,L1})
  • In some embodiments, the OOB check is applied to the Inter part and the Intra part of CIIP, and CIIP blending is then applied with the results of the OOB checks, where the OOB status of the Inter part is derived from the per-list OOB results:
  • OOB of P_{i,j}^{Inter} = (OOB of P_{i,j}^{Inter,L0}) & (OOB of P_{i,j}^{Inter,L1}) for bi-prediction
  • OOB of P_{i,j}^{Inter} = OOB of P_{i,j}^{Inter,L0}, or OOB of P_{i,j}^{Inter} = OOB of P_{i,j}^{Inter,L1}, for uni-prediction from L0 or L1, respectively
  • P_{i,j}^{final} = CIIP_blending(P_{i,j}^{Inter}, P_{i,j}^{Intra})
  • In some embodiments, the OOB check is not applied to the Intra part and is only applied to the Inter part; that is, the OOB check result of the Intra part is always non-OOB.
  • CIIP blending for CIIP with one Inter part and one Intra part is performed according to the following:
  • P_{i,j}^{final} = CIIP_blending(P_{i,j}^{Inter}, P_{i,j}^{Intra})
  • If a sample of the Inter part is OOB, the corresponding sample of the Intra part is used as the predictor; otherwise, CIIP blending is applied to the samples of the two CIIP parts.
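A per-sample sketch of this rule, reusing the VVC-style CIIP blend shown earlier; the weight wt is assumed to come from the neighbour-based derivation.

```python
def ciip_sample_with_oob(p_inter, inter_is_oob, p_intra, wt):
    """Fall back to the intra sample where the inter sample is OOB
    (the intra part is always treated as non-OOB here); otherwise
    apply regular CIIP blending."""
    if inter_is_oob:
        return p_intra
    return ((4 - wt) * p_inter + wt * p_intra + 2) >> 2
```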
  • In some embodiments, the OOB check is applied to the template regions of template matching (TM) or ARMC-TM, and the results of the OOB checks are then applied when computing the matching cost.
  • the video coder may perform motion vector refinement based on minimizing TM matching cost between a reference template (e.g., reference template 630) neighboring the reference block and a current template (e.g., current template 620) neighboring the current block.
  • the TM cost may be calculated based on OOB checks of the reference template and/or the current template. For example, OOB samples in the reference template will not be used for determining the TM cost.
  • OOB of P_{i,j}^{Inter} = OOB of P_{i,j}^{Inter,L0}
  • In some embodiments, the video coder applies the OOB check to multi-hypothesis prediction (MHP) and then adds the additional inter prediction signal/hypothesis with the results of the OOB checks. If the additional inter prediction signal is OOB, the additional inter prediction signal is not added to the original prediction signal. Otherwise, if the additional inter prediction signal is not OOB, the additional inter prediction signal is added to the original prediction signal.
  • the OOB check results may be derived from OOB results of L0 prediction (reference block) and OOB results of L1 prediction (reference block) according to the following:
  • OOB of P_{i,j}^{Inter} = (OOB of P_{i,j}^{Inter,L0}) & (OOB of P_{i,j}^{Inter,L1}) (bi-prediction)
  • OOB of P_{i,j}^{Inter} = OOB of P_{i,j}^{Inter,L0} (uni-prediction)
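A sketch of OOB-gated MHP accumulation: an additional hypothesis sample is blended in only where it is non-OOB, and OOB positions keep the already-accumulated prediction. Per-sample OOB flags are an assumption of this sketch.

```python
def mhp_accumulate_with_oob(p, hyp, hyp_oob, alpha):
    """p, hyp: flat sample lists; hyp_oob: per-sample OOB flags for the
    additional hypothesis; alpha: its signaled weighting factor."""
    return [pi if oob else (1 - alpha) * pi + alpha * hi
            for pi, hi, oob in zip(p, hyp, hyp_oob)]
```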
  • the foregoing proposed method can be implemented in encoders and/or decoders.
  • The proposed method can be implemented in an inter prediction module and/or an intra block copy prediction module of an encoder, and/or an inter prediction module (and/or intra block copy prediction module) of a decoder.
  • FIG. 13 illustrates an example video encoder 1300 that may implement predictive coding.
  • the video encoder 1300 receives input video signal from a video source 1305 and encodes the signal into bitstream 1395.
  • The video encoder 1300 has several components or modules for encoding the signal from the video source 1305, at least including some components selected from a transform module 1310, a quantization module 1311, an inverse quantization module 1314, an inverse transform module 1315, an intra-picture estimation module 1320, an intra-prediction module 1325, a motion compensation module 1330, a motion estimation module 1335, an in-loop filter 1345, a reconstructed picture buffer 1350, a MV buffer 1365, a MV prediction module 1375, and an entropy encoder 1390.
  • the motion compensation module 1330 and the motion estimation module 1335 are part of an inter-prediction module 1340.
  • the modules 1310 –1390 are modules of software instructions being executed by one or more processing units (e.g., a processor) of a computing device or electronic apparatus. In some embodiments, the modules 1310 –1390 are modules of hardware circuits implemented by one or more integrated circuits (ICs) of an electronic apparatus. Though the modules 1310 –1390 are illustrated as being separate modules, some of the modules can be combined into a single module.
  • the video source 1305 provides a raw video signal that presents pixel data of each video frame without compression.
  • A subtractor 1308 computes the difference between the raw video pixel data of the video source 1305 and the predicted pixel data 1313 from the motion compensation module 1330 or intra-prediction module 1325 as prediction residual 1309.
  • The transform module 1310 converts the difference (the residual pixel data or residual signal 1309) into transform coefficients (e.g., by performing Discrete Cosine Transform, or DCT).
  • the quantization module 1311 quantizes the transform coefficients into quantized data (or quantized coefficients) 1312, which is encoded into the bitstream 1395 by the entropy encoder 1390.
  • the inverse quantization module 1314 de-quantizes the quantized data (or quantized coefficients) 1312 to obtain transform coefficients, and the inverse transform module 1315 performs inverse transform on the transform coefficients to produce reconstructed residual 1319.
  • the reconstructed residual 1319 is added with the predicted pixel data 1313 to produce reconstructed pixel data 1317.
  • the reconstructed pixel data 1317 is temporarily stored in a line buffer (not illustrated) for intra-picture prediction and spatial MV prediction.
  • the reconstructed pixels are filtered by the in-loop filter 1345 and stored in the reconstructed picture buffer 1350.
  • the reconstructed picture buffer 1350 is a storage external to the video encoder 1300.
  • the reconstructed picture buffer 1350 is a storage internal to the video encoder 1300.
  • the intra-picture estimation module 1320 performs intra-prediction based on the reconstructed pixel data 1317 to produce intra prediction data.
  • the intra-prediction data is provided to the entropy encoder 1390 to be encoded into bitstream 1395.
  • the intra-prediction data is also used by the intra-prediction module 1325 to produce the predicted pixel data 1313.
  • the motion estimation module 1335 performs inter-prediction by producing MVs to reference pixel data of previously decoded frames stored in the reconstructed picture buffer 1350. These MVs are provided to the motion compensation module 1330 to produce predicted pixel data.
  • The MV prediction module 1375 generates the predicted MVs based on reference MVs that were generated for encoding previous video frames, i.e., the motion compensation MVs that were used to perform motion compensation.
  • the MV prediction module 1375 retrieves reference MVs from previous video frames from the MV buffer 1365.
  • the video encoder 1300 stores the MVs generated for the current video frame in the MV buffer 1365 as reference MVs for generating predicted MVs.
  • the MV prediction module 1375 uses the reference MVs to create the predicted MVs.
  • the predicted MVs can be computed by spatial MV prediction or temporal MV prediction.
  • the difference between the predicted MVs and the motion compensation MVs (MC MVs) of the current frame (residual motion data) are encoded into the bitstream 1395 by the entropy encoder 1390.
  • the entropy encoder 1390 encodes various parameters and data into the bitstream 1395 by using entropy-coding techniques such as context-adaptive binary arithmetic coding (CABAC) or Huffman encoding.
  • the entropy encoder 1390 encodes various header elements, flags, along with the quantized transform coefficients 1312, and the residual motion data as syntax elements into the bitstream 1395.
  • the bitstream 1395 is in turn stored in a storage device or transmitted to a decoder over a communications medium such as a network.
  • the in-loop filter 1345 performs filtering or smoothing operations on the reconstructed pixel data 1317 to reduce the artifacts of coding, particularly at boundaries of pixel blocks.
  • The filtering or smoothing operations performed by the in-loop filter 1345 include deblock filter (DBF), sample adaptive offset (SAO), and/or adaptive loop filter (ALF).
  • FIG. 14 illustrates portions of the video encoder 1300 that implement out-of-bound checks for predictive coding. Specifically, the figure illustrates the components of the motion compensation module 1330 of the video encoder 1300.
  • the motion compensation module 1330 has a candidate selector 1410 that selects block vectors (including motion vectors and intra prediction modes) from the MV buffer 1365 and the intra prediction module 1325. The selection is controlled by the motion estimation module 1335.
  • The selected block vectors are examined by an OOB check module 1430, which determines whether the reference blocks identified by the block vectors are out-of-bound.
  • the selected block vectors are also provided to a prediction generator module 1420.
  • the prediction generator module 1420 generates a predictor or prediction block for the current block as the predicted pixel data 1313 by fetching samples from the reconstructed picture buffer 1350.
  • The motion estimation module 1335 selects one or more prediction coding tools for the current block (e.g., bi-prediction, GPM, CIIP, MHP, intra-prediction, IBC, etc.), and the prediction generator module 1420 generates the predictor based on the selected prediction tool(s).
  • the motion estimation module 1335 also provides the prediction tool selection to the entropy encoder 1390 to be signaled in the bitstream 1395.
  • The prediction generator module 1420 also uses the OOB check result from the OOB check module 1430 to determine whether to exclude certain reference samples identified by the selected block vector(s) when performing blending to generate the predictor. For example, if the current picture is an ERP picture and the reference block identified by the block vector has OOB samples, the prediction generator module 1420 may use the horizontal wrap-around samples fetched from the reconstructed picture buffer 1350 for blending if the reference block is OOB at the left or right boundary, or exclude the OOB samples from blending if the reference block is OOB at the top or bottom boundary. The prediction generator module 1420 may also use the OOB result to determine whether to exclude samples from BM cost calculation or TM cost calculation.
  • FIG. 15 conceptually illustrates a process 1500 for using out-of-bound checks for predictive coding of a block of pixels.
  • In some embodiments, one or more processing units (e.g., a processor) of a computing device implementing the encoder 1300 perform the process 1500 by executing instructions stored in a computer-readable medium.
  • In some embodiments, an electronic apparatus implementing the encoder 1300 performs the process 1500.
  • the encoder receives (at block 1510) data to be encoded as a current block of pixels in a current picture of a video.
  • the encoder identifies (at block 1520) a first reference block in a first reference picture based on a first block vector of the current block.
  • the encoder identifies (at block 1530) a second reference block in a second reference picture based on a second block vector of the current block.
  • the first and second block vectors may be motion vectors.
  • In some embodiments, the first reference picture or the second reference picture is the current picture, and the first block vector or the second block vector is an intra-prediction direction or mode.
  • The encoder performs (at block 1540) an out-of-bound (OOB) check for the first and second reference blocks.
  • The encoder generates (at block 1550) a predictor for the current block based on the first and second reference blocks and based on the OOB check.
  • In some embodiments, bi-directional motion compensation is not applied to at least one of a horizontal direction and a vertical direction.
  • the bi-directional motion compensation is not applied when a sample of the first reference block is OOB over a left or right boundary of the first reference picture or when a sample of the first reference block is OOB over a top or bottom boundary of the first reference picture.
  • When a sample of the first reference block is a non-OOB sample, a corresponding predictor sample is generated by blending the non-OOB sample of the first reference block with a corresponding non-OOB sample of the second reference block.
  • When a sample of the first reference block is an OOB sample, a corresponding predictor sample is generated by using the corresponding non-OOB sample of the second reference block without blending.
  • The video may be a 360-degree video in an equi-rectangular projection (ERP) format.
  • When a sample of the first reference block is OOB across the left or right boundary of the first reference picture, a corresponding predictor sample is generated by blending a horizontal wrap-around sample from the first reference picture with a corresponding non-OOB sample from the second reference block in the second reference picture.
  • When a sample of the first reference block is OOB across the top or bottom boundary, a corresponding predictor sample is generated by using the corresponding non-OOB sample from the second reference block without blending.
  • In some embodiments, the current block is partitioned into multiple geometric partitions; the first reference block is used to generate a first predictor for a first geometric partition, and the second reference block is used to generate a second predictor for a second geometric partition.
  • a corresponding predictor sample is generated by blending samples of the first and second predictors along a boundary between the first and second geometric partitions, with out-of-bound samples of the first and second predictors excluded from the blending.
  • In some embodiments, the generated predictor is a combined inter-intra prediction (CIIP), wherein the first reference block provides inter-prediction samples and the second reference block provides intra-prediction samples.
  • the current block is coded using MHP, such that the encoder performs an additional OOB check for a third reference block, and the predictor is generated further based on the third reference block and the additional OOB check.
  • the first block vector is refined to minimize a calculated template matching cost between a reference template neighboring the first reference block and a current template neighboring the current block, with OOB samples of the reference template excluded from the cost calculation.
  • the first and second block vectors are refined to minimize a bilateral matching cost based on the first and second reference blocks, with OOB samples in the first and second reference blocks excluded from the cost calculation.
  • the encoder encodes (at block 1560) the current block by using the generated predictor to produce prediction residuals.
  • An encoder may signal (or generate) one or more syntax elements in a bitstream, such that a decoder may parse said one or more syntax elements from the bitstream.
  • FIG. 16 illustrates an example video decoder 1600 that may implement predictive coding.
  • the video decoder 1600 is an image-decoding or video-decoding circuit that receives a bitstream 1695 and decodes the content of the bitstream into pixel data of video frames for display.
  • the video decoder 1600 has several components or modules for decoding the bitstream 1695, including some components selected from an inverse quantization module 1611, an inverse transform module 1610, an intra-prediction module 1625, a motion compensation module 1630, an in-loop filter 1645, a decoded picture buffer 1650, an MV buffer 1665, an MV prediction module 1675, and a parser 1690.
  • the motion compensation module 1630 is part of an inter-prediction module 1640.
  • the modules 1610–1690 are modules of software instructions being executed by one or more processing units (e.g., a processor) of a computing device. In some embodiments, the modules 1610–1690 are modules of hardware circuits implemented by one or more ICs of an electronic apparatus. Though the modules 1610–1690 are illustrated as being separate modules, some of the modules can be combined into a single module.
  • the parser 1690 receives the bitstream 1695 and performs initial parsing according to the syntax defined by a video-coding or image-coding standard.
  • the parsed syntax elements include various header elements and flags, as well as quantized data (or quantized coefficients) 1612.
  • the parser 1690 parses out the various syntax elements by using entropy-coding techniques such as context-adaptive binary arithmetic coding (CABAC) or Huffman encoding.
  • the inverse quantization module 1611 de-quantizes the quantized data (or quantized coefficients) 1612 to obtain transform coefficients 1616, and the inverse transform module 1610 performs an inverse transform on the transform coefficients 1616 to produce a reconstructed residual signal 1619.
  • the reconstructed residual signal 1619 is added with predicted pixel data 1613 from the intra-prediction module 1625 or the motion compensation module 1630 to produce decoded pixel data 1617.
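The addition in this step is the usual predict-plus-residual reconstruction; a minimal sketch (the 10-bit clipping range is an assumption):

```python
import numpy as np

def reconstruct_block(predicted, residual, bit_depth=10):
    """Add the reconstructed residual signal to the predicted pixel data
    and clip to the valid sample range (illustrative sketch)."""
    out = predicted.astype(np.int32) + residual.astype(np.int32)
    return np.clip(out, 0, (1 << bit_depth) - 1)
```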
  • the decoded pixel data 1617 is filtered by the in-loop filter 1645 and stored in the decoded picture buffer 1650.
  • the decoded picture buffer 1650 may be a storage external to the video decoder 1600, or a storage internal to the video decoder 1600.
  • the content of the decoded picture buffer 1650 is used for display.
  • a display device 1655 either retrieves the content of the decoded picture buffer 1650 for display directly, or retrieves it to a display buffer.
  • the display device receives pixel values from the decoded picture buffer 1650 through a pixel transport.
  • the motion compensation module 1630 produces predicted pixel data 1613 from the decoded pixel data 1617 stored in the decoded picture buffer 1650 according to motion compensation MVs (MC MVs). These motion compensation MVs are decoded by adding the residual motion data received from the bitstream 1695 to the predicted MVs received from the MV prediction module 1675.
  • the MV prediction module 1675 generates the predicted MVs based on reference MVs that were generated for decoding previous video frames, e.g., the motion compensation MVs that were used to perform motion compensation.
  • the MV prediction module 1675 retrieves the reference MVs of previous video frames from the MV buffer 1665.
  • the video decoder 1600 stores the motion compensation MVs generated for decoding the current video frame in the MV buffer 1665 as reference MVs for producing predicted MVs.
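A hedged sketch of this MV decoding flow: the motion-compensation MV is the predicted MV plus the residual motion data parsed from the bitstream, and the result is stored back as a reference MV. The tuple representation and buffer type are assumptions:

```python
def decode_mc_mv(pred_mv, mvd, mv_buffer):
    """Reconstruct a motion-compensation MV from a predicted MV and a
    parsed MV difference, then keep it for future MV prediction
    (illustrative sketch)."""
    mv = (pred_mv[0] + mvd[0], pred_mv[1] + mvd[1])
    mv_buffer.append(mv)   # reference MV for decoding later video frames
    return mv
```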
  • the in-loop filter 1645 performs filtering or smoothing operations on the decoded pixel data 1617 to reduce the artifacts of coding, particularly at boundaries of pixel blocks.
  • the filtering or smoothing operations performed by the in-loop filter 1645 include deblock filter (DBF), sample adaptive offset (SAO), and/or adaptive loop filter (ALF).
  • FIG. 17 illustrates portions of the video decoder 1600 that implement out-of-bound checks for predictive coding. Specifically, the figure illustrates the components of the motion compensation module 1630 of the video decoder 1600.
  • the motion compensation module 1630 has a candidate selector 1710 that selects block vectors (including motion vectors and intra prediction modes) from the MV buffer 1665 and the intra prediction module 1625.
  • the selection is provided by the entropy decoder 1690 based on syntax elements parsed from the bitstream 1695.
  • the selected block vectors are examined by an OOB check module 1730, which determines whether the reference blocks identified by the block vectors are out-of-bound.
  • the selected block vectors are also provided to a prediction generator module 1720.
  • the prediction generator module 1720 generates a predictor or prediction block for the current block as the predicted pixel data 1613 by fetching samples from the decoded picture buffer 1650.
  • the entropy decoder 1690 selects one or more prediction coding tools for the current block (e.g., bi-prediction, GPM, CIIP, MHP, intra-prediction, IBC, etc.), and the prediction generator module 1720 generates the predictor based on the selected prediction tool(s).
  • the prediction generator module 1720 also uses the OOB check result from the OOB check module 1730 to determine whether to exclude certain reference samples identified by the selected block vector(s) when performing blending to generate the predictor. For example, if the current picture is an ERP picture and the reference block identified by the block vector has OOB samples, the prediction generator module 1720 may use the horizontal wrap-around samples fetched from the decoded picture buffer 1650 for blending if the reference block is OOB at the left or right boundary, or exclude the OOB samples from blending if the reference block is OOB at the top or bottom boundary. The prediction generator module 1720 may also use the OOB result to determine whether to exclude samples from BM cost calculation or TM cost calculation.
  • FIG. 18 conceptually illustrates a process 1800 for using out-of-bound checks for predictive coding of a block of pixels.
  • one or more processing units (e.g., a processor) of a computing device implementing the decoder 1600 perform the process 1800 by executing instructions stored in a computer readable medium.
  • an electronic apparatus implementing the decoder 1600 performs the process 1800.
  • the decoder receives (at block 1810) data to be decoded as a current block of pixels in a current picture of a video.
  • the decoder identifies (at block 1820) a first reference block in a first reference picture based on a first block vector of the current block.
  • the decoder identifies (at block 1830) a second reference block in a second reference picture based on a second block vector of the current block.
  • the first and second block vectors may be motion vectors.
  • the first reference picture or the second reference picture may be the current picture, and the first block vector or the second block vector may be an intra-prediction direction or mode.
  • the decoder performs (at block 1840) out-of-bound (OOB) check for the first and second reference blocks.
  • the decoder generates (at block 1850) a predictor for the current block based on the first and second reference blocks and based on the OOB check.
  • when a sample of the first reference block is a non-OOB sample, a corresponding predictor sample is generated by blending that non-OOB sample with a corresponding non-OOB sample of the second reference block.
  • when a sample of the first reference block is an OOB sample, a corresponding predictor sample is generated by using a corresponding non-OOB sample of the second reference block without blending.
  • the video may be a 360-degree video in an equi-rectangular projection (ERP) format.
  • when a sample of the first reference block is OOB over a left or right boundary of the ERP picture, a corresponding predictor sample is generated by blending a horizontal wrap-around sample from the first reference picture with a corresponding non-OOB sample from the second reference block in the second reference picture.
  • when a sample of the first reference block is OOB over a top or bottom boundary, a corresponding predictor sample is generated by using a corresponding non-OOB sample from the second reference block without blending.
  • the current block is partitioned into multiple geometric partitions
  • the first reference block is used to generate a first predictor for a first geometric partition
  • the second reference block is used to generate a second predictor for a second geometric partition.
  • a corresponding predictor sample is generated by blending samples of the first and second predictors along a boundary between the first and second geometric partitions, with out-of-bound samples of the first and second predictors excluded from the blending.
  • the generated predictor is a combined inter-intra prediction (CIIP), wherein the first reference block provides inter-prediction samples and the second reference block provides intra-prediction samples.
  • the current block is coded using MHP, such that the decoder performs an additional OOB check for a third reference block, and the predictor is generated further based on the third reference block and the additional OOB check.
  • the first block vector is refined to minimize a calculated template matching cost between a reference template neighboring the first reference block and a current template neighboring the current block, with OOB samples of the reference template excluded from the cost calculation.
  • the first and second block vectors are refined to minimize a bilateral matching cost based on the first and second reference blocks, with OOB samples in the first and second reference blocks excluded from the cost calculation.
  • the decoder reconstructs (at block 1860) the current block by using the generated predictor.
  • the decoder may then provide the reconstructed current block for display as part of the reconstructed current picture.
  • many of the above-described features are implemented as software processes specified as a set of instructions recorded on a computer readable storage medium (also referred to as computer readable medium). When these instructions are executed by one or more computational or processing unit(s) (e.g., one or more processors, cores of processors, or other processing units), they cause the processing unit(s) to perform the actions indicated in the instructions.
  • Examples of computer readable media include, but are not limited to, CD-ROMs, flash drives, random-access memory (RAM) chips, hard drives, erasable programmable read-only memories (EPROMs), electrically erasable programmable read-only memories (EEPROMs), etc.
  • the computer readable media do not include carrier waves and electronic signals passing wirelessly or over wired connections.
  • the term “software” is meant to include firmware residing in read-only memory or applications stored in magnetic storage which can be read into memory for processing by a processor.
  • multiple software inventions can be implemented as sub-parts of a larger program while remaining distinct software inventions.
  • multiple software inventions can also be implemented as separate programs.
  • any combination of separate programs that together implement a software invention described here is within the scope of the present disclosure.
  • the software programs when installed to operate on one or more electronic systems, define one or more specific machine implementations that execute and perform the operations of the software programs.
  • FIG. 19 conceptually illustrates an electronic system 1900 with which some embodiments of the present disclosure are implemented.
  • the electronic system 1900 may be a computer (e.g., a desktop computer, personal computer, tablet computer, etc.), phone, PDA, or any other sort of electronic device.
  • Such an electronic system includes various types of computer readable media and interfaces for various other types of computer readable media.
  • Electronic system 1900 includes a bus 1905, processing unit(s) 1910, a graphics-processing unit (GPU) 1915, a system memory 1920, a network 1925, a read-only memory 1930, a permanent storage device 1935, input devices 1940, and output devices 1945.
  • the bus 1905 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of the electronic system 1900.
  • the bus 1905 communicatively connects the processing unit(s) 1910 with the GPU 1915, the read-only memory 1930, the system memory 1920, and the permanent storage device 1935.
  • the processing unit(s) 1910 retrieves instructions to execute and data to process in order to execute the processes of the present disclosure.
  • the processing unit(s) may be a single processor or a multi-core processor in different embodiments. Some instructions are passed to and executed by the GPU 1915.
  • the GPU 1915 can offload various computations or complement the image processing provided by the processing unit (s) 1910.
  • the read-only memory (ROM) 1930 stores static data and instructions that are used by the processing unit(s) 1910 and other modules of the electronic system.
  • the permanent storage device 1935 is a read-and-write memory device. This device is a non-volatile memory unit that stores instructions and data even when the electronic system 1900 is off. Some embodiments of the present disclosure use a mass-storage device (such as a magnetic or optical disk and its corresponding disk drive) as the permanent storage device 1935.
  • the system memory 1920 is a read-and-write memory device. However, unlike storage device 1935, the system memory 1920 is a volatile read-and-write memory, such as a random-access memory.
  • the system memory 1920 stores some of the instructions and data that the processor uses at runtime.
  • processes in accordance with the present disclosure are stored in the system memory 1920, the permanent storage device 1935, and/or the read-only memory 1930.
  • the various memory units include instructions for processing multimedia clips in accordance with some embodiments. From these various memory units, the processing unit(s) 1910 retrieves instructions to execute and data to process in order to execute the processes of some embodiments.
  • the bus 1905 also connects to the input and output devices 1940 and 1945.
  • the input devices 1940 enable the user to communicate information and select commands to the electronic system.
  • the input devices 1940 include alphanumeric keyboards and pointing devices (also called “cursor control devices”), cameras (e.g., webcams), microphones or similar devices for receiving voice commands, etc.
  • the output devices 1945 display images generated by the electronic system or otherwise output data.
  • the output devices 1945 include printers and display devices, such as cathode ray tubes (CRT) or liquid crystal displays (LCD) , as well as speakers or similar audio output devices. Some embodiments include devices such as a touchscreen that function as both input and output devices.
  • bus 1905 also couples electronic system 1900 to a network 1925 through a network adapter (not shown).
  • the computer can be a part of a network of computers (such as a local area network (“LAN”), a wide area network (“WAN”), or an intranet), or a network of networks (such as the Internet). Any or all components of electronic system 1900 may be used in conjunction with the present disclosure.
  • Some embodiments include electronic components, such as microprocessors, storage and memory that store computer program instructions in a machine-readable or computer-readable medium (alternatively referred to as computer-readable storage media, machine-readable media, or machine-readable storage media).
  • computer-readable media include RAM, ROM, read-only compact discs (CD-ROM), recordable compact discs (CD-R), rewritable compact discs (CD-RW), read-only digital versatile discs (e.g., DVD-ROM, dual-layer DVD-ROM), and a variety of recordable/rewritable DVDs (e.g., DVD-RAM, DVD-RW, DVD+RW, etc.).
  • the computer-readable media may store a computer program that is executable by at least one processing unit and includes sets of instructions for performing various operations. Examples of computer programs or computer code include machine code, such as is produced by a compiler, and files including higher-level code that are executed by a computer, an electronic component, or a microprocessor using an interpreter.
  • some embodiments are performed by one or more integrated circuits, such as application specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs); in some embodiments, such integrated circuits execute instructions that are stored on the circuit itself.
  • some embodiments execute software stored in programmable logic devices (PLDs), read-only memory (ROM), or random-access memory (RAM) devices.
  • the terms “computer”, “server”, “processor”, and “memory” all refer to electronic or other technological devices. These terms exclude people or groups of people.
  • the terms “display” or “displaying” mean displaying on an electronic device.
  • the terms “computer readable medium,” “computer readable media,” and “machine readable medium” are entirely restricted to tangible, physical objects that store information in a form that is readable by a computer. These terms exclude any wireless signals, wired download signals, and any other ephemeral signals.
  • any two components so associated can also be viewed as being “operably connected”, or “operably coupled”, to each other to achieve the desired functionality, and any two components capable of being so associated can also be viewed as being “operably couplable” to each other to achieve the desired functionality.
  • specific examples of operably couplable include, but are not limited to, physically mateable and/or physically interacting components, wirelessly interactable and/or wirelessly interacting components, and/or logically interacting and/or logically interactable components.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A method of coding video pictures using predictive coding with out-of-bound (OOB) checks is provided. A video coder receives data to be encoded or decoded as a current block of a current picture of a video. The video coder identifies a first reference block in a first reference picture based on a first block vector of the current block and a second reference block in a second reference picture based on a second block vector of the current block. The video coder performs an out-of-bound (OOB) check for the first and second reference blocks. The video coder generates a predictor for the current block based on the first and second reference blocks and based on the OOB check, such that the use of bi-directional motion compensation may be constrained. The video coder encodes or decodes the current block by using the generated predictor.
PCT/CN2023/098287 2022-07-06 2023-06-05 Prediction generation with out-of-bound check in video coding Ceased WO2024007789A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202380051898.2A 2022-07-06 2023-06-05 Prediction generation with out-of-bound check in video coding
TW112125288A 2023-07-06 Prediction generation with out-of-bound check in video coding

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202263367738P 2022-07-06 2022-07-06
US63/367,738 2022-07-06

Publications (1)

Publication Number Publication Date
WO2024007789A1 true WO2024007789A1 (fr) 2024-01-11

Family

ID=89454133

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/098287 Ceased WO2024007789A1 (fr) Prediction generation with out-of-bound check in video coding 2022-07-06 2023-06-05

Country Status (3)

Country Link
CN (1) CN119586131A (fr)
TW (1) TW202420819A (fr)
WO (1) WO2024007789A1 (fr)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109417632A (zh) * 2016-07-08 2019-03-01 Vid Scale, Inc. 360-degree video coding using geometry projection
CN111684807A (zh) * 2018-02-14 2020-09-18 Qualcomm Incorporated Intra prediction for 360-degree video
WO2021127430A1 (fr) * 2019-12-20 2021-06-24 Qualcomm Incorporated Motion compensation using the size of a reference picture
US20210266594A1 (en) * 2020-02-24 2021-08-26 Alibaba Group Holding Limited Methods for combining decoder side motion vector refinement with wrap-around motion compensation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Y.-W. CHEN, X. XIU (KWAI), H.-J. JHU, N. YAN, X. WANG (KWAI), H. HUANG, Y-J. CHANG, C.-C. CHEN, M. KARCZEWICZ, V. SEREGIN, Y. ZHAN: "EE2-Test2.2: Enhanced bi-directional motion compensation", 26. JVET MEETING; 20220420 - 20220429; TELECONFERENCE; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), 13 April 2022 (2022-04-13), XP030301020 *
Z. ZHANG (QUALCOMM), H. HUANG, C.-C. CHEN, Y.-J. CHANG, Y. ZHANG, V. SEREGIN, M. COBAN, M. KARCZEWICZ (QUALCOMM), F. LE LÉANNEC (X: "EE2-related: Motion compensation boundary padding", 26. JVET MEETING; 20220420 - 20220429; TELECONFERENCE; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), 25 April 2022 (2022-04-25), XP030301010 *

Also Published As

Publication number Publication date
CN119586131A (zh) 2025-03-07
TW202420819A (zh) 2024-05-16

Similar Documents

Publication Publication Date Title
US11553173B2 (en) Merge candidates with multiple hypothesis
US11172203B2 (en) Intra merge prediction
US11297348B2 (en) Implicit transform settings for coding a block of pixels
WO2023198105A1 Region-based implicit intra mode derivation and prediction
WO2023198187A1 Template-based intra mode derivation and prediction
WO2023241347A9 Adaptive regions for decoder-side intra mode derivation and prediction
US20250274604A1 (en) Extended template matching for video coding
WO2024037645A1 Boundary sample derivation in video coding
US20250317579A1 (en) Threshold of similarity for candidate list
WO2025021011A1 Combined prediction mode
WO2025016418A1 Intra merge mode
WO2024131778A1 Intra prediction with region-based derivation
WO2023236914A1 Multi-hypothesis prediction coding
WO2024007789A1 Prediction generation with out-of-bound check in video coding
WO2025152999A1 Geometric partitioning mode extensions
WO2024222399A1 Refinement for motion vector difference in merge mode
WO2024022144A1 Intra prediction based on multiple reference lines
WO2024016955A1 Out-of-bound check in video coding
WO2024027700A1 Joint indexing of geometric partitioning mode in video coding
WO2024017224A1 Affine candidate refinement
WO2026046374A1 Adaptive predictor blending and processing order in overlapped blocks
WO2024146511A1 Representative prediction mode of a block of pixels
WO2024213123A1 Intra block copy with sub-block modes and template matching
WO2023236916A1 Updating motion attributes of merge candidates
WO2024037641A1 Out-of-bound reference block handling

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23834563

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202380051898.2

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

WWP Wipo information: published in national office

Ref document number: 202380051898.2

Country of ref document: CN

122 Ep: pct application non-entry in european phase

Ref document number: 23834563

Country of ref document: EP

Kind code of ref document: A1