WO2020181476A1 - Video image prediction method and device - Google Patents
Video image prediction method and device
- Publication number
- WO2020181476A1 (application PCT/CN2019/077726)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- motion vector
- prediction block
- block
- reference prediction
- candidate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/513—Processing of motion vectors
- H04N19/517—Processing of motion vectors by encoding
- H04N19/52—Processing of motion vectors by encoding by predictive encoding
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/513—Processing of motion vectors
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/513—Processing of motion vectors
- H04N19/517—Processing of motion vectors by encoding
Definitions
- This application relates to the field of image coding and decoding technologies, and in particular to video image prediction methods and devices, and corresponding video encoders and video decoders.
- Video compression technology has increasingly become an indispensable key technology in the field of video applications.
- the basic principle of video coding and compression is to use the correlation between spatial domain, time domain and codewords to remove redundancy as much as possible.
- the current popular approach is to use a hybrid video coding framework based on image blocks to implement video coding compression through steps such as prediction (including intra-frame prediction and inter-frame prediction), transformation, quantization, and entropy coding.
- motion estimation/motion compensation in inter-frame prediction is a key technology that affects encoding/decoding performance.
- merge with motion vector difference (MMVD)
- the embodiments of the present application provide video image prediction methods, devices, and corresponding encoders and decoders, which can reduce redundancy to a certain extent, improve image prediction accuracy, and thereby improve coding and decoding performance.
- an embodiment of the present application provides a video image prediction method, including:
- determining a first initial motion vector predictor, a second initial motion vector predictor, a first motion vector difference, and a second motion vector difference of the current image block to be processed; performing a motion vector correction process (also called a motion vector refinement process, such as decoder-side motion vector refinement (DMVR)) according to the first initial motion vector predictor and the second initial motion vector predictor to obtain a first corrected motion vector predictor and a second corrected motion vector predictor; determining a first motion vector predictor according to the first corrected motion vector predictor and the first motion vector difference, and determining a second motion vector predictor according to the second corrected motion vector predictor and the second motion vector difference; and predicting the current image block to be processed according to the first motion vector predictor and the second motion vector predictor.
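The order of operations in this method (refine the two initial predictors first, then apply the motion vector differences) can be sketched as follows. This is only a minimal illustration: the function names `derive_final_mvps` and `toy_refine` are hypothetical, the refinement step is passed in as a placeholder callable, and combining a predictor with its difference by component-wise addition is an assumption, since the text only says the result is determined "according to" the two values.

```python
from typing import Callable, Tuple

MV = Tuple[int, int]  # (horizontal offset, vertical offset)


def add_mv(a: MV, b: MV) -> MV:
    """Component-wise sum of two motion vectors (assumed combination rule)."""
    return (a[0] + b[0], a[1] + b[1])


def derive_final_mvps(
    init0: MV,
    init1: MV,
    mvd0: MV,
    mvd1: MV,
    refine: Callable[[MV, MV], Tuple[MV, MV]],
) -> Tuple[MV, MV]:
    """Refine both initial predictors, then apply the per-list differences."""
    corrected0, corrected1 = refine(init0, init1)
    return add_mv(corrected0, mvd0), add_mv(corrected1, mvd1)


def toy_refine(mv0: MV, mv1: MV) -> Tuple[MV, MV]:
    """Hypothetical stand-in for a DMVR-style search: shifts by a fixed offset."""
    return (mv0[0] + 1, mv0[1]), (mv1[0] - 1, mv1[1])


final0, final1 = derive_final_mvps((4, -2), (-4, 2), (8, 0), (-8, 0), toy_refine)
print(final0, final1)
```

In a real codec the `refine` callable would be replaced by a search over reference samples; the fixed shift here only makes the data flow visible.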
- a motion vector correction process or a motion vector refinement process, such as decoder-side motion vector refinement (DMVR)
- decoder-side motion vector refinement (DMVR)
- the image block currently to be processed (referred to as the current block for short) herein can be understood as the image block currently being processed.
- the current block for short
- in the encoding process, the image block currently being processed refers to the image block currently being encoded (encoding block); in the decoding process, it refers to the image block currently being decoded (decoding block).
- the first initial motion vector prediction value corresponds to the initial motion vector prediction value of the first list (i.e., list0), and accordingly, the second initial motion vector prediction value corresponds to the initial motion vector prediction value of the second list (i.e., list1).
- the first initial motion vector prediction value corresponds to the initial motion vector prediction value in the first direction (for example, forward), and correspondingly, the second initial motion vector prediction value corresponds to the initial motion vector prediction value in the second direction (for example, backward); this application does not limit this.
- the initial motion information of the current image block in the embodiment of the present application may include a motion vector MV and reference image indication information.
- the initial motion information may also include either or both of them.
- the initial motion information may only include the motion vector MV.
- the reference image indication information is used to indicate which reconstructed image or images are used by the current block as the reference image, and the motion vector indicates the position offset of the reference block position relative to the current block position in the reference image used, generally including a horizontal component offset and a vertical component offset. For example, use (x, y) to represent the MV, with x representing the position offset in the horizontal direction and y representing the position offset in the vertical direction.
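As an illustration of the (x, y) offset described above, the following sketch locates a reference block in a reference image from the current block position and a motion vector. It assumes integer-sample motion vectors and an image stored as a list of rows; the function name `fetch_reference_block` and the sample values are hypothetical.

```python
def fetch_reference_block(ref_image, block_x, block_y, width, height, mv):
    """Return the block displaced by mv = (x, y) from the current block position.

    Assumes integer-sample offsets; fractional-sample interpolation is omitted.
    """
    dx, dy = mv
    x0, y0 = block_x + dx, block_y + dy
    if x0 < 0 or y0 < 0 or x0 + width > len(ref_image[0]) or y0 + height > len(ref_image):
        raise ValueError("reference block lies outside the reference image")
    return [row[x0:x0 + width] for row in ref_image[y0:y0 + height]]


# A 6x6 toy reference image whose sample at row r, column c is 10 * r + c.
ref_image = [[10 * r + c for c in range(6)] for r in range(6)]

# Current block at (1, 1), size 2x2, MV (2, 1): two samples right, one sample down.
block = fetch_reference_block(ref_image, 1, 1, 2, 2, (2, 1))
print(block)
```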
- the reference image indication information may include a reference image list and/or a reference image index corresponding to the reference image list.
- the reference image index is used to identify the reference image corresponding to the used motion vector in the specified reference image list (list0 or list1).
- the image may be referred to as a frame, and the reference image may be referred to as a reference frame.
- the initial motion information of the current image block in the embodiment of the present application is initial bidirectional prediction motion information, that is, it includes motion information used for forward and backward prediction directions.
- the forward and backward prediction directions are the two prediction directions of the bidirectional prediction mode. It can be understood that "forward" and "backward" respectively correspond to reference image list 0 (list0, the first list above) and reference image list 1 (list1, the second list above).
- the execution subject of the method in the embodiments of the present application may be an image prediction device, for example, a video encoder or a video decoder, or an electronic device with video encoding and decoding functions; for example, it may be an inter prediction unit in a video encoder, or a motion compensation unit in a video decoder.
- performing the motion vector correction process according to the first initial motion vector predicted value and the second initial motion vector predicted value to obtain the first corrected motion vector predicted value and the second corrected motion vector predicted value may include: performing a motion vector correction process according to the first initial motion vector prediction value to obtain a first modified motion vector prediction value, and performing a motion vector correction process according to the second initial motion vector prediction value to obtain a second modified motion vector prediction value.
- the above design can be applied to inter-frame prediction on the encoding side, and can also be applied to inter-frame prediction on the decoding side.
- the motion vector correction process may be a DMVR process, and in the embodiment of the present application, the two may be replaced with each other.
- the first modified motion vector prediction value or the second modified motion vector prediction value may also be referred to as the first refined motion vector prediction value or the second refined motion vector prediction value.
- the image prediction method in the embodiments of this application is not only suitable for merge prediction mode (merge) and/or advanced motion vector prediction mode (advanced motion vector prediction, AMVP), but also suitable for other modes in which the motion information of spatial reference blocks, temporal reference blocks, and/or inter-view reference blocks is used to predict the motion information of the current image block, thereby improving the coding and decoding performance.
- the bidirectional initial motion vector prediction value is optimized by the motion vector refinement method and then combined with the MVD information for decoding; based on the matching relationship between the first reference image and the second reference image (that is, between the forward and backward prediction images), redundancy is reduced to a certain extent compared with the traditional method, so that the prediction accuracy is relatively improved.
- determining the first initial motion vector prediction value, the second initial motion vector prediction value, the first motion vector difference, and the second motion vector difference of the current image block to be processed includes:
- when the first flag parsed from the code stream (such as mmvd_flag[x0][y0]) indicates that the inter-frame prediction of the current image block to be processed uses the merge with motion vector difference (MMVD) mode, determining the first initial motion vector predictor, the second initial motion vector predictor, the first motion vector difference, and the second motion vector difference of the current image block to be processed.
- the first flag may also be called mmvd_flag[x0][y0], and the above name is also used in the standard text or code.
- when mmvd_flag[x0][y0] is the first value, it indicates that the inter-frame prediction of the current image block to be processed uses the merge with motion vector difference (MMVD) mode, and when mmvd_flag[x0][y0] is the second value, it indicates that the inter-frame prediction of the current image block to be processed does not use the MMVD mode.
- the first value can be 1 (or true), and the second value can be 0 (or false).
- the MMVD method combined with the motion vector refinement method (ie, the DMVR method based on MMVD) is used to perform inter-frame prediction.
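The flag semantics above amount to a two-way dispatch. The sketch below assumes the first value is 1 and the second value is 0, as the text suggests; the function name `select_inter_path` and the returned labels are hypothetical and only name the two processing paths.

```python
def select_inter_path(mmvd_flag: int) -> str:
    """Map the parsed mmvd_flag value to a processing path label."""
    if mmvd_flag not in (0, 1):
        raise ValueError("mmvd_flag is expected to be 0 or 1")
    # 1: MMVD combined with motion vector refinement; 0: some other inter mode.
    return "mmvd_with_mv_refinement" if mmvd_flag == 1 else "other_inter_mode"


print(select_inter_path(1))
print(select_inter_path(0))
```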
- the bidirectional prediction process is optimized and then combined with MVD information for decoding; based on the matching relationship between the first reference image and the second reference image (that is, between the forward and backward prediction images), the redundancy is reduced to a certain extent, and the prediction accuracy can be relatively improved.
- when it is determined that the inter-frame prediction of the current image block to be processed uses the merge with motion vector difference (MMVD) mode, determining the first initial motion vector predictor, the second initial motion vector predictor, the first motion vector difference, and the second motion vector difference of the current image block to be processed.
- multiple inter-frame prediction modes may be available; for example, the inter-frame prediction mode with the least rate-distortion cost can be selected from among them.
- the selected inter-frame prediction mode is MMVD
- the solution provided in this application can be used when determining the prediction block of the current processing block based on MMVD, and then a rate-distortion cost algorithm can be used to compare it with the prediction blocks determined by other inter-frame prediction modes, and the inter prediction mode with the least rate-distortion cost is selected.
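The mode decision described above can be sketched with the common Lagrangian cost J = D + lambda * R. The text does not state which cost formula is used, so this formula is an assumption; the mode names and the distortion and rate numbers are purely illustrative, not measured results.

```python
def rd_cost(distortion: float, rate_bits: float, lagrange_multiplier: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R (assumed formula)."""
    return distortion + lagrange_multiplier * rate_bits


def select_mode(candidates: dict, lagrange_multiplier: float) -> str:
    """Return the name of the candidate mode with the least rate-distortion cost."""
    return min(
        candidates,
        key=lambda name: rd_cost(candidates[name][0], candidates[name][1], lagrange_multiplier),
    )


# Illustrative (distortion, rate in bits) pairs for three hypothetical modes.
modes = {
    "merge": (1200.0, 10),
    "amvp": (900.0, 60),
    "mmvd_refined": (950.0, 18),
}
best_mode = select_mode(modes, lagrange_multiplier=10.0)
print(best_mode)
```

With these toy numbers the costs are 1300, 1500, and 1130, so the MMVD-based candidate is chosen; a different multiplier can change the outcome.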
- the determining the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed includes:
- the corresponding candidate motion information (for example, base candidate) is determined from the candidate list, and the candidate motion information includes a third motion vector predictor and a fourth motion vector predictor; the third motion vector predictor is used as the first initial motion vector predictor, and the fourth motion vector predictor is used as the second initial motion vector predictor; or, the third motion vector predictor and the fourth motion vector predictor included in the candidate motion information at the first position in the candidate list are determined to be the first initial motion vector predictor and the second initial motion vector predictor.
- the determining the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed includes:
- selecting candidate motion information (for example, base candidate), where the candidate motion information includes a third motion vector predictor and a fourth motion vector predictor; the third motion vector predictor is used as the first initial motion vector predictor, and the fourth motion vector predictor is used as the second initial motion vector predictor; or, determining that the third motion vector predictor and the fourth motion vector predictor included in the candidate motion information at the first position in the candidate list are the first initial motion vector predictor and the second initial motion vector predictor.
- the executing a motion vector correction process according to the first initial motion vector predicted value and the second initial motion vector predicted value includes:
- when the candidate motion information corresponding to the candidate index (or the selected candidate motion information) comes from a temporal neighboring block of the current image block (for example, the motion information of the T1 pixel position of the corresponding position, which is not located in the image where the current image block to be processed is located), performing the motion vector correction process according to the first initial motion vector predicted value and the second initial motion vector predicted value.
- the motion vector correction process is performed in this case because, when the candidate comes from a different image, the success rate of finding a better motion vector predictor through correction is higher; the redundancy can therefore be reduced through the above design.
- the above design can be applied to the encoding side or the decoding side.
- it also includes:
- when the candidate motion information corresponding to the candidate index (or the selected candidate motion information) comes from a spatial neighboring block of the current image block (for example, the motion information of the A0 pixel position, which is located in the image where the current image block to be processed is located), determining the first target motion vector predictor according to the first initial motion vector predictor and the first motion vector difference, and determining the second target motion vector predictor according to the second initial motion vector predictor and the second motion vector difference; and decoding the current image block to be processed according to the first target motion vector predictor and the second target motion vector predictor.
- the above design can be applied to the encoding side as well as the decoding side.
- the motion vector correction process is not performed in this case because, when the candidate comes from the same image, the success rate of finding a better motion vector predictor through correction is low; skipping the correction can improve resource utilization to a certain extent.
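The two cases above (correction for a candidate taken from a temporal neighboring block, no correction for a candidate taken from a spatial neighboring block) can be sketched as a single branch. The string labels "temporal" and "spatial", the function names, and the use of component-wise addition to combine a predictor with its difference are assumptions made for illustration; `toy_refine` is a hypothetical stand-in for the actual correction process.

```python
def add_mv(a, b):
    """Component-wise sum of two (x, y) motion vectors (assumed combination rule)."""
    return (a[0] + b[0], a[1] + b[1])


def derive_target_mvps(init0, init1, mvd0, mvd1, candidate_source, refine):
    """Apply the correction only when the candidate comes from a temporal neighbor."""
    if candidate_source == "temporal":
        init0, init1 = refine(init0, init1)
    elif candidate_source != "spatial":
        raise ValueError("candidate_source must be 'temporal' or 'spatial'")
    return add_mv(init0, mvd0), add_mv(init1, mvd1)


def toy_refine(mv0, mv1):
    """Hypothetical correction step: shifts the pair by a fixed offset."""
    return (mv0[0] + 1, mv0[1]), (mv1[0] - 1, mv1[1])


spatial_result = derive_target_mvps((4, -2), (-4, 2), (8, 0), (-8, 0), "spatial", toy_refine)
temporal_result = derive_target_mvps((4, -2), (-4, 2), (8, 0), (-8, 0), "temporal", toy_refine)
print(spatial_result, temporal_result)
```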
- the determining the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed includes:
- the candidate includes the first candidate motion information and the second candidate motion information
- the first candidate motion information includes a fifth motion vector predictor and a sixth motion vector predictor
- the second candidate motion information includes a seventh motion vector predictor and an eighth motion vector predictor
- the fifth motion vector predictor and the sixth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value (that is, the fifth motion vector prediction value is the first initial motion vector prediction value, and the sixth motion vector prediction value is the second initial motion vector prediction value); or,
- the seventh motion vector predictor and the eighth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value (that is, the seventh motion vector prediction value is the first initial motion vector prediction value, and the eighth motion vector prediction value is the second initial motion vector prediction value).
- the construction of the candidate list uses the motion information (for example, motion vector MV) of image blocks that were encoded or decoded before the current image block to be processed.
- Some previously encoded or decoded image blocks are processed using the motion vector correction method provided in this application, and some previously encoded or decoded image blocks are processed in the traditional way.
- the candidate index only corresponds to the second candidate motion information (that is, the original candidate motion information, that is, the non-correction method).
- the candidate index corresponds to the first candidate motion information (the candidate motion vector information obtained in the correction mode during the previous encoding or decoding process) and the second candidate motion information.
- the fifth motion vector predicted value and the sixth motion vector predicted value are modified motion vector predicted values
- the seventh motion vector predicted value and the eighth motion vector predicted value are original motion vector predicted values.
- the image block to which the first candidate motion information and the second candidate motion information belong is the same image block (such as the A0 pixel position); the first candidate motion information has been corrected (by DMVR), and the second candidate motion information is uncorrected (without DMVR).
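One way to picture a candidate that carries both corrected and original motion information for the same neighboring block is the small data structure below. The class and field names are hypothetical, and the boolean `prefer_corrected` is only a placeholder for whatever selection rule an implementation applies; the text lists both options as alternatives.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MV = Tuple[int, int]


@dataclass(frozen=True)
class MotionInfo:
    mv_list0: MV  # predictor for list0
    mv_list1: MV  # predictor for list1


@dataclass(frozen=True)
class Candidate:
    original: MotionInfo                    # second candidate motion information (uncorrected)
    corrected: Optional[MotionInfo] = None  # first candidate motion information (after correction)


def initial_mvps(candidate: Candidate, prefer_corrected: bool) -> Tuple[MV, MV]:
    """Pick the pair of initial predictors from one candidate entry."""
    use_corrected = prefer_corrected and candidate.corrected is not None
    info = candidate.corrected if use_corrected else candidate.original
    return info.mv_list0, info.mv_list1


# A block processed with correction stores both versions; a traditional one stores only the original.
both = Candidate(original=MotionInfo((4, 0), (-4, 0)), corrected=MotionInfo((5, 0), (-5, 0)))
only_original = Candidate(original=MotionInfo((2, 2), (-2, -2)))
print(initial_mvps(both, True), initial_mvps(only_original, True))
```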
- the determining the first initial motion vector predictor and the second initial motion vector predictor of the current image block to be processed includes:
- the corresponding candidate (for example, base candidate) is determined from the candidate list.
- the candidate includes first candidate motion information and second candidate motion information, wherein the first candidate motion information includes a fifth motion vector predictor and a sixth motion vector predictor, and the second candidate motion information includes a seventh motion vector predictor and an eighth motion vector predictor.
- the fifth motion vector predictor and the sixth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value (that is, the fifth motion vector prediction value is the first initial motion vector prediction value, and the sixth motion vector prediction value is the second initial motion vector prediction value); or,
- the seventh motion vector predictor and the eighth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value (that is, the seventh motion vector prediction value is the first initial motion vector prediction value, and the eighth motion vector prediction value is the second initial motion vector prediction value).
- the candidate at the first position in the candidate list includes first candidate motion information (modified candidate motion information) and second candidate motion information (original candidate motion information).
- the first candidate motion information includes a fifth motion vector predictor and a sixth motion vector predictor, and the second candidate motion information includes a seventh motion vector predictor and an eighth motion vector predictor;
- the determining the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed includes:
- the fifth motion vector predictor and the sixth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value
- the seventh motion vector predictor and the eighth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value.
- the above design can be applied to the encoding side as well as the decoding side.
- the executing a motion vector correction process according to the first initial motion vector predicted value and the second initial motion vector predicted value includes:
- the difference between the first modified reference prediction block and the second modified reference prediction block is less than or equal to the difference between the first reference prediction block and the second reference prediction block
- the first modified reference prediction block is an image block in a first preset area that has the same size as the first reference prediction block, and the first preset area includes the first reference prediction block
- the second modified reference prediction block is an image block in a second preset area that has the same size as the second reference prediction block, and the second preset area includes the second reference prediction block; the first modified reference prediction block corresponds to the first modified motion vector predictor, and the second modified reference prediction block corresponds to the second modified motion vector predictor.
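The condition above (the corrected block pair differs no more than the original pair, and each corrected block stays within a preset area around its original block) can be illustrated with a small exhaustive search. This sketch makes several assumptions not stated in the text: the difference is measured as the sum of absolute differences (SAD), candidate pairs use mirrored integer offsets in the two reference images, and the preset area is a square of `search_range` samples around each block. All function names are hypothetical.

```python
from itertools import product


def get_block(image, x, y, size):
    """Return the size x size block whose top-left sample is (x, y)."""
    return [row[x:x + size] for row in image[y:y + size]]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for ra, rb in zip(block_a, block_b) for a, b in zip(ra, rb))


def inside(image, x, y, size):
    return x >= 0 and y >= 0 and x + size <= len(image[0]) and y + size <= len(image)


def refine_pair(ref0, ref1, pos0, pos1, size, search_range):
    """Search mirrored offsets and return (best_offset, best_cost).

    The zero offset is the starting point, so the returned cost never exceeds
    the cost of the original block pair.
    """
    best_offset = (0, 0)
    best_cost = sad(get_block(ref0, pos0[0], pos0[1], size),
                    get_block(ref1, pos1[0], pos1[1], size))
    for dy, dx in product(range(-search_range, search_range + 1), repeat=2):
        x0, y0 = pos0[0] + dx, pos0[1] + dy
        x1, y1 = pos1[0] - dx, pos1[1] - dy  # mirrored offset (assumption)
        if not (inside(ref0, x0, y0, size) and inside(ref1, x1, y1, size)):
            continue
        cost = sad(get_block(ref0, x0, y0, size), get_block(ref1, x1, y1, size))
        if cost < best_cost:
            best_offset, best_cost = (dx, dy), cost
    return best_offset, best_cost


def sample(x, y):
    """Synthetic texture used only to build the toy reference images."""
    return (7 * x + 13 * y * y + 3 * x * y) % 251


# ref1 is ref0 shifted horizontally by two samples.
ref0 = [[sample(x, y) for x in range(10)] for y in range(10)]
ref1 = [[sample(x + 2, y) for x in range(10)] for y in range(10)]

initial_cost = sad(get_block(ref0, 4, 4, 2), get_block(ref1, 4, 4, 2))
offset, refined_cost = refine_pair(ref0, ref1, (4, 4), (4, 4), size=2, search_range=2)
print(initial_cost, offset, refined_cost)
```

The corrected motion vector predictors would then follow by adding the returned offset to the list0 predictor and subtracting it from the list1 predictor, under the same mirroring assumption.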
- the above design can be applied to the encoding side as well as the decoding side.
- determining a first modified reference prediction block according to the first reference prediction block and determining a second modified reference prediction block according to the second reference prediction block includes:
- the first reference prediction block pair includes the first reference prediction block and the second reference prediction block;
- the second reference prediction block pair includes a third reference prediction block and a fourth reference prediction block; the third reference prediction block is obtained by performing a motion search in the first preset area based on the first reference prediction block, and the fourth reference prediction block is obtained by performing a motion search in the second preset area based on the second reference prediction block;
- the third reference prediction block pair includes a fifth reference prediction block and a sixth reference prediction block
- the fifth reference prediction block is obtained by performing a motion search in the first preset area based on the third reference prediction block included in the second reference prediction block pair with the smallest difference, and the sixth reference prediction block is obtained by performing a motion search in the second preset area based on the fourth reference prediction block included in the second reference prediction block pair with the smallest difference;
- the fifth reference prediction block included in the third reference prediction block pair with the smallest difference is the first modified reference prediction block
- the above design can be applied to the encoding side as well as the decoding side.
- it also includes:
- the first reference prediction block is the first modified reference prediction block
- it is determined that the second reference prediction block is a second modified reference prediction block.
- the above design can be applied to the encoding side as well as the decoding side.
- it also includes:
- the difference between the fifth reference prediction block and the sixth reference prediction block included in the third reference prediction block pair with the smallest difference is greater than the difference between the third reference prediction block and the fourth reference prediction block included in the second reference prediction block pair with the smallest difference
- the third reference prediction block included in the second reference prediction block pair with the smallest difference is the first modified reference prediction block
- it is determined that the fourth reference prediction block included in the second reference prediction block pair with the smallest difference is the second modified reference prediction block.
- the above design can be applied to the encoding side as well as the decoding side.
- the executing a motion vector correction process according to the first initial motion vector predicted value and the second initial motion vector predicted value includes:
- when the difference of the reference prediction block pair with the smallest difference among the at least one reference prediction block pair obtained by the motion search is less than the difference of the basic reference prediction block pair
- the reference prediction block pair with the smallest difference is updated to be the basic reference prediction block pair, and the motion search continues based on the updated basic reference prediction block pair;
- the difference of a reference prediction block pair is the difference between the first reference prediction block and the second reference prediction block included in that pair; the first reference prediction block included in a reference prediction block pair obtained by the search
- is obtained by performing a motion search in the surrounding preset area of the first reference prediction block included in the basic reference prediction block pair, and that surrounding preset area is located in the first preset area;
- the second reference prediction block included in the reference prediction block pair is obtained by performing a motion search in the surrounding preset area of the second reference prediction block included in the basic reference prediction block pair.
- the surrounding preset area of the second reference prediction block included in the basic reference prediction block pair is located in the second preset area;
- the motion search is stopped, the first reference prediction block included in the basic reference prediction block pair is used as the target reference prediction block, and the second reference prediction block included in the basic reference prediction block pair is used as the target reference prediction block;
- if it is determined that the search area of the first reference prediction block included in the basic reference prediction block pair exceeds the first preset area, or that the search area of the second reference prediction block included in the basic reference prediction block pair exceeds the second preset area, the motion search is stopped.
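The iterative search above can be sketched as a loop that keeps a basic pair, examines neighboring pairs, moves to the best neighbor only if its difference is strictly smaller, and stops when there is no improvement or when the search area would leave the preset area. In this sketch a block pair is represented by a single integer offset `(x, y)` and the difference is supplied as a callable; that representation, the one-sample neighborhood, and the name `iterative_search` are assumptions made for brevity.

```python
def iterative_search(cost, max_offset):
    """Greedy search over integer offsets, bounded by a square preset area.

    cost: callable mapping an (x, y) offset to the difference of the block pair
          at that offset (for example, a SAD value).
    max_offset: half-width of the preset area around the starting position.
    Returns (offset, cost) of the final basic pair.
    """
    base = (0, 0)
    base_cost = cost(base)
    while True:
        neighbours = [
            (base[0] + dx, base[1] + dy)
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)
        ]
        # Stop when the search area around the basic pair would exceed the preset area.
        if any(abs(x) > max_offset or abs(y) > max_offset for x, y in neighbours):
            break
        best = min(neighbours, key=cost)
        best_cost = cost(best)
        # Stop when no neighbouring pair has a smaller difference than the basic pair.
        if best_cost >= base_cost:
            break
        base, base_cost = best, best_cost
    return base, base_cost


def toy_cost(offset):
    """Illustrative difference measure with its minimum at offset (2, -1)."""
    return abs(offset[0] - 2) + abs(offset[1] + 1)


wide = iterative_search(toy_cost, max_offset=3)
narrow = iterative_search(toy_cost, max_offset=1)
print(wide, narrow)
```

With the wider area the search reaches the minimum and stops because no neighbor improves on it; with the narrower area it stops one step earlier because the next search area would leave the preset area.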
- the above design can be applied to the encoding side as well as the decoding side.
- an embodiment of the present application provides a video image prediction method, including:
- […] predicted value to obtain a first corrected motion vector predicted value, and performing a motion vector correction process according to the second initial motion vector predicted value to obtain a second corrected motion vector predicted value); performing a motion vector correction process according to the first motion vector predictor and the second motion vector predictor to obtain a first modified motion vector predictor and a second modified motion vector predictor; and predicting the current image block to be processed according to the first modified motion vector predictor and the second modified motion vector predictor.
- the first initial motion vector prediction value corresponds to the initial motion vector prediction value of the first list (i.e., list0), and accordingly, the second initial motion vector prediction value corresponds to the initial motion vector prediction value of the second list (i.e., list1).
- the first initial motion vector prediction value corresponds to the initial motion vector prediction value in the first direction (for example, forward), and correspondingly, the second initial motion vector prediction value corresponds to the initial motion vector prediction value in the second direction (for example, backward); this application does not limit this.
- the above design can be applied to the encoding side as well as the decoding side.
- the motion vector correction process may be a DMVR process, and in the embodiment of the present application, the two may be replaced with each other.
- the first modified motion vector prediction value or the second modified motion vector prediction value may also be referred to as the first refined motion vector prediction value or the second refined motion vector prediction value.
- the motion vector correction process is performed, and the corrected motion vector prediction value is then used for inter-frame prediction; compared with the traditional processing method, the prediction accuracy can be improved.
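A minimal sketch of the ordering used in this design: the motion vector differences are added to the initial predictors first, and the correction process is then run on the resulting pair. The `refine` callable stands in for the motion vector correction (DMVR) process and is a hypothetical placeholder, as are the other names.

```python
from typing import Callable, Tuple

MV = Tuple[int, int]  # (horizontal, vertical) components, units assumed


def add_mv(a: MV, b: MV) -> MV:
    """Component-wise sum of two motion vectors."""
    return (a[0] + b[0], a[1] + b[1])


def offset_then_refine(init0: MV, init1: MV, mvd0: MV, mvd1: MV,
                       refine: Callable[[MV, MV], Tuple[MV, MV]]) -> Tuple[MV, MV]:
    """Add each motion vector difference, then correct the resulting pair."""
    mvp0 = add_mv(init0, mvd0)  # first motion vector predictor
    mvp1 = add_mv(init1, mvd1)  # second motion vector predictor
    return refine(mvp0, mvp1)   # first and second modified predictors
```

The returned pair is what would be used to predict the image block to be processed.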
- determining the first initial motion vector prediction value, the second initial motion vector prediction value, the first motion vector difference, and the second motion vector difference of the current image block to be processed includes:
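- when the first flag parsed from the code stream indicates that the current image block to be processed is inter-predicted using the merge with motion vector difference (MMVD) mode, the first initial motion vector prediction value, the second initial motion vector prediction value, the first motion vector difference, and the second motion vector difference of the current image block to be processed are determined.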
- the first flag may also be called mmvd_flag[x0][y0], and the above name is also used in the standard text or code.
- when mmvd_flag[x0][y0] is the first value, it indicates that the inter-frame prediction of the current image block to be processed uses the MMVD mode, and when mmvd_flag[x0][y0] is the second value, it indicates that the inter-frame prediction of the current image block to be processed does not use the MMVD mode.
- the first value can be 1 (or true), and the second value can be 0 (or false).
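As a small illustration of the flag semantics, assuming the first value is 1 and the second value is 0 as suggested above (the helper name `uses_mmvd` is hypothetical):

```python
def uses_mmvd(mmvd_flag: int) -> bool:
    """Interpret the first flag: 1 (true) selects the MMVD mode, 0 (false) does not."""
    if mmvd_flag not in (0, 1):
        raise ValueError("mmvd_flag is expected to be 0 or 1")
    return mmvd_flag == 1
```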
- with the solution provided by the embodiment of this application, the motion vector correction process is performed based on the two initial motion vector prediction values obtained by the MMVD method; compared with the traditional method, the accuracy can be improved.
- on the decoding side, before the first initial motion vector predictor, the second initial motion vector predictor, the first motion vector difference, and the second motion vector difference of the current image block to be processed are determined, the method further includes: determining that the merge with motion vector difference (MMVD) mode is used for inter-frame prediction of the current image block to be processed.
- determining the first initial motion vector predictor and the second initial motion vector predictor of the current image block to be processed includes: determining corresponding candidate motion information from the candidate list according to the candidate index parsed from the code stream, where the candidate motion information includes a third motion vector predictor and a fourth motion vector predictor, the third motion vector predictor is used as the first initial motion vector predictor, and the fourth motion vector predictor is used as the second initial motion vector predictor; or, determining that the third motion vector predictor and the fourth motion vector predictor included in the candidate motion information at the first position in the candidate list are the first initial motion vector predictor and the second initial motion vector predictor.
- determining the first initial motion vector predictor and the second initial motion vector predictor of the current image block to be processed includes: selecting candidate motion information from a candidate list according to a rate-distortion cost algorithm, where the candidate motion information includes a third motion vector predictor and a fourth motion vector predictor, the third motion vector predictor is used as the first initial motion vector predictor, and the fourth motion vector predictor is used as the second initial motion vector predictor; or, determining that the third motion vector predictor and the fourth motion vector predictor included in the candidate motion information at the first position in the candidate list are the first initial motion vector predictor and the second initial motion vector predictor.
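The selection options described above can be sketched as follows. The data layout, the function names, and the `rd_cost` callable are assumptions for illustration; in particular, treating "no parsed index" as the first-position case is a simplification, since the text presents the first position as an alternative rather than as a fallback.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple

MV = Tuple[int, int]


class CandidateMotionInfo(NamedTuple):
    third_mvp: MV   # becomes the first initial motion vector predictor
    fourth_mvp: MV  # becomes the second initial motion vector predictor


def select_by_index(candidates: List[CandidateMotionInfo],
                    parsed_index: Optional[int] = None) -> Tuple[MV, MV]:
    """Decoder-style choice: the parsed index, or the first position when None."""
    cand = candidates[0 if parsed_index is None else parsed_index]
    return cand.third_mvp, cand.fourth_mvp


def select_by_rd_cost(candidates: List[CandidateMotionInfo],
                      rd_cost: Callable[[CandidateMotionInfo], float]
                      ) -> Tuple[int, Tuple[MV, MV]]:
    """Encoder-style choice: the candidate with the lowest rate-distortion cost."""
    best_index = min(range(len(candidates)), key=lambda i: rd_cost(candidates[i]))
    best = candidates[best_index]
    return best_index, (best.third_mvp, best.fourth_mvp)
```

In the encoder-style variant the chosen index is returned as well, since it is the value a decoder would later parse from the code stream.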
- the executing the motion vector correction process according to the first motion vector predicted value and the second motion vector predicted value includes:
- a motion vector correction process is performed according to the first motion vector predicted value and the second motion vector predicted value.
- the above design can be applied to the encoding side as well as the decoding side.
- it also includes:
- the current image block to be processed is predicted according to the first motion vector prediction value and the second motion vector prediction value.
- the above design can be applied to the encoding side as well as the decoding side.
- determining the first initial motion vector predictor and the second initial motion vector predictor of the current image block to be processed includes: determining a corresponding candidate (for example, a base candidate) from the candidate list according to a candidate index (for example, a base candidate index) parsed from the code stream, where the candidate includes first candidate motion information and second candidate motion information, the first candidate motion information includes a fifth motion vector predictor and a sixth motion vector predictor, and the second candidate motion information includes a seventh motion vector predictor and an eighth motion vector predictor;
- the fifth motion vector predictor and the sixth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value; or,
- the seventh motion vector predictor and the eighth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value.
- the construction of the candidate list uses the motion information (for example, motion vector MV) of the image blocks that have been previously encoded or decoded before the current image block to be processed.
- some previously encoded or decoded image blocks are processed using the motion vector correction method provided in this application, and some previously encoded or decoded image blocks are processed in the traditional way.
- for an image block processed in the traditional way, the candidate index corresponds only to the second candidate motion information (that is, the original candidate motion information obtained without correction).
- for an image block processed using the correction method, the candidate index corresponds to both the first candidate motion information (the candidate motion vector information obtained in the correction mode during the previous encoding or decoding process) and the second candidate motion information.
- the fifth motion vector predicted value and the sixth motion vector predicted value are modified motion vector predicted values
- the seventh motion vector predicted value and the eighth motion vector predicted value are original motion vector predicted values.
- the corresponding candidate (for example, base candidate) is determined from the candidate list.
- the candidate includes first candidate motion information and second candidate motion information, wherein the first candidate motion information includes the fifth motion vector predictor and the sixth motion vector predictor, and the second candidate motion information includes the seventh motion vector predictor and the eighth motion vector predictor.
- the fifth motion vector predictor and the sixth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value; or,
- the seventh motion vector predictor and the eighth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value.
- the determining the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed includes:
- the candidate at the first position in the candidate list includes first candidate motion information and second candidate motion information, where the first candidate motion information includes a fifth motion vector predictor and a sixth motion vector predictor.
- the second candidate motion information includes the seventh motion vector predictor and the eighth motion vector predictor;
- the fifth motion vector predictor and the sixth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value; or,
- the seventh motion vector predictor and the eighth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value.
- the above design can be applied to the encoding side as well as the decoding side.
- the executing the motion vector correction process according to the first motion vector predicted value and the second motion vector predicted value includes:
- the difference between the first modified reference prediction block and the second modified reference prediction block is less than or equal to the difference between the first reference prediction block and the second reference prediction block
- the first modified reference prediction block is an image block in a first preset area that has the same size as the first reference prediction block, and the first preset area includes the first reference prediction block;
- the second modified reference prediction block is an image block in a second preset area that has the same size as the second reference prediction block, and the second preset area includes the second reference prediction block;
- the first modified reference prediction block corresponds to the first modified motion vector predictor, and the second modified reference prediction block corresponds to the second modified motion vector predictor.
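The constraint above (the modified pair differs no more than the original pair) can be illustrated with a small integer-sample search. This is a sketch under several assumptions that are not stated in the text: the difference is measured as a sum of absolute differences (SAD), the preset areas are square windows of `radius` samples around each reference prediction block, and the two blocks are displaced by mirrored offsets. All names are hypothetical.

```python
from typing import List, Tuple

Picture = List[List[int]]  # rows of sample values


def block(pic: Picture, x: int, y: int, w: int, h: int) -> List[List[int]]:
    """Cut a w-by-h block whose top-left sample is at (x, y)."""
    return [row[x:x + w] for row in pic[y:y + h]]


def sad(a: List[List[int]], b: List[List[int]]) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def refine_pair(ref0: Picture, ref1: Picture, pos0: Tuple[int, int],
                pos1: Tuple[int, int], size: Tuple[int, int],
                radius: int) -> Tuple[Tuple[int, int], int]:
    """Return the offset (applied to ref0, mirrored on ref1) with the lowest SAD."""
    w, h = size
    best_offset = (0, 0)
    best_cost = sad(block(ref0, pos0[0], pos0[1], w, h),
                    block(ref1, pos1[0], pos1[1], w, h))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x0, y0 = pos0[0] + dx, pos0[1] + dy
            x1, y1 = pos1[0] - dx, pos1[1] - dy
            # Skip candidates whose blocks would fall outside either picture.
            if not (0 <= x0 <= len(ref0[0]) - w and 0 <= y0 <= len(ref0) - h):
                continue
            if not (0 <= x1 <= len(ref1[0]) - w and 0 <= y1 <= len(ref1) - h):
                continue
            cost = sad(block(ref0, x0, y0, w, h), block(ref1, x1, y1, w, h))
            if cost < best_cost:
                best_cost, best_offset = cost, (dx, dy)
    return best_offset, best_cost
```

Because the unshifted pair is the starting point and is only replaced by a strictly better candidate, the returned difference is never larger than the difference between the original reference prediction blocks.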
- determining a first modified reference prediction block according to the first reference prediction block and determining a second modified reference prediction block according to the second reference prediction block includes:
- the first reference prediction block pair includes the first reference prediction block and the second reference prediction block;
- the second reference prediction block pair includes a third reference prediction block and a fourth reference prediction block, the third reference prediction block is obtained by performing a motion search in the first preset area based on the first reference prediction block, and the fourth reference prediction block is obtained by performing a motion search in the second preset area based on the second reference prediction block;
- the third reference prediction block pair includes a fifth reference prediction block and a sixth reference prediction block
- the fifth reference prediction block is obtained by performing a motion search in the first preset area based on the third reference prediction block included in the second reference prediction block pair with the smallest difference, and the sixth reference prediction block is obtained by performing a motion search in the second preset area based on the fourth reference prediction block included in the second reference prediction block pair with the smallest difference;
- the fifth reference prediction block included in the third reference prediction block pair with the smallest difference is the first modified reference prediction block
- the above design can be applied to the encoding side as well as the decoding side.
- it also includes:
- the first reference prediction block is the first modified reference prediction block
- it is determined that the second reference prediction block is a second modified reference prediction block.
- it also includes:
- when the difference between the fifth reference prediction block and the sixth reference prediction block included in the third reference prediction block pair with the smallest difference is greater than the difference between the third reference prediction block and the fourth reference prediction block included in the second reference prediction block pair with the smallest difference,
- it is determined that the third reference prediction block included in the second reference prediction block pair with the smallest difference is the first modified reference prediction block, and
- it is determined that the fourth reference prediction block included in the second reference prediction block pair with the smallest difference is the second modified reference prediction block.
- the above design can be applied to the encoding side as well as the decoding side.
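The staged choice among the first, second, and third reference prediction block pairs can be sketched as below. The `neighbours` callable (the pairs reachable by one motion search step) and the `difference` callable (the pair's matching cost) are hypothetical placeholders. The condition for keeping the first pair is not stated in the text above; the sketch assumes it applies when no second pair is at least as good as the first pair.

```python
from typing import Callable, Iterable, TypeVar

Pair = TypeVar("Pair")


def choose_modified_pair(first_pair: Pair,
                         neighbours: Callable[[Pair], Iterable[Pair]],
                         difference: Callable[[Pair], float]) -> Pair:
    """Pick the pair of modified reference prediction blocks in up to two search stages."""
    # Stage 1: second pairs are searched around the first pair.
    second_best = min(neighbours(first_pair), key=difference)
    if difference(second_best) > difference(first_pair):
        return first_pair  # assumed condition: the search did not help
    # Stage 2: third pairs are searched around the best second pair.
    third_best = min(neighbours(second_best), key=difference)
    if difference(third_best) > difference(second_best):
        return second_best  # fall back to the best second pair
    return third_best
```

For example, with pairs modelled as integers, `neighbours` returning `[p - 1, p + 1]`, and `difference` as the distance to 5, starting from 2 the function moves to 3 and then to 4.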
- an embodiment of the present application provides a video image prediction device, including:
- a prediction unit configured to determine the first initial motion vector predictor, the second initial motion vector predictor, the first motion vector difference, and the second motion vector difference of the current image block to be processed
- a correction unit configured to perform a motion vector correction process according to the first initial motion vector predicted value and the second initial motion vector predicted value to obtain a first corrected motion vector predicted value and a second corrected motion vector predicted value;
- the prediction unit is further configured to determine a first motion vector prediction value according to the first modified motion vector prediction value and the first motion vector difference, and determine a second motion vector prediction value according to the second modified motion vector prediction value and the second motion vector difference; and predict the current image block to be processed according to the first motion vector predictor and the second motion vector predictor.
- the image prediction device is for example applied to a video encoding device (video encoder) or a video decoding device (video decoder).
- the function of the foregoing apparatus may be implemented by an inter prediction unit.
- the inter prediction unit includes a prediction unit and a correction unit.
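A structural sketch of this device, assuming the ordering of this design: the correction unit refines the initial predictors, and the prediction unit then applies the motion vector differences. The class and method names are illustrative only, and `refine` is a hypothetical stand-in for the motion vector correction process.

```python
from typing import Callable, Tuple

MV = Tuple[int, int]
Refine = Callable[[MV, MV], Tuple[MV, MV]]


class CorrectionUnit:
    """Runs the motion vector correction process on a pair of predictors."""

    def __init__(self, refine: Refine) -> None:
        self._refine = refine

    def correct(self, init0: MV, init1: MV) -> Tuple[MV, MV]:
        return self._refine(init0, init1)


class PredictionUnit:
    """Combines a modified predictor with its motion vector difference."""

    @staticmethod
    def apply_difference(modified: MV, mvd: MV) -> MV:
        return (modified[0] + mvd[0], modified[1] + mvd[1])


class InterPredictionUnit:
    """Holds a prediction unit and a correction unit."""

    def __init__(self, correction_unit: CorrectionUnit) -> None:
        self.prediction_unit = PredictionUnit()
        self.correction_unit = correction_unit

    def motion_vectors(self, init0: MV, init1: MV,
                       mvd0: MV, mvd1: MV) -> Tuple[MV, MV]:
        # Correct the initial predictors first, then add the differences.
        mod0, mod1 = self.correction_unit.correct(init0, init1)
        return (self.prediction_unit.apply_difference(mod0, mvd0),
                self.prediction_unit.apply_difference(mod1, mvd1))
```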
- in determining the first initial motion vector predicted value, the second initial motion vector predicted value, the first motion vector difference, and the second motion vector difference of the current image block to be processed, the prediction unit is specifically used for:
- determining the first initial motion vector predictor and the second initial motion vector predictor of the current image block to be processed.
- the device provided by this design can be applied to the decoder.
- the action of parsing the first identifier from the code stream may be performed by the entropy decoding unit in the decoder, and the entropy decoding unit parses the first identifier from the code stream and transmits it to the prediction unit in the image prediction device.
- in determining the first initial motion vector predicted value, the second initial motion vector predicted value, the first motion vector difference, and the second motion vector difference of the current image block to be processed, the prediction unit is specifically used for:
- determining the first initial motion vector predictor, the second initial motion vector predictor, the first motion vector difference, and the second motion vector difference of the current image block to be processed.
- the prediction unit is specifically configured to determine the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed:
- the corresponding candidate motion information is determined from the candidate list.
- the candidate motion information includes a third motion vector predictor and a fourth motion vector predictor.
- the third motion vector predictor is used as the first initial motion vector prediction value, and the fourth motion vector predictor is used as the second initial motion vector prediction value.
- the device provided by the above design can be applied to a decoder.
- when the device is applied to an encoder, in determining the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed, the prediction unit is specifically used as follows:
- the candidate motion information includes a third motion vector predictor and a fourth motion vector predictor.
- the third motion vector predictor is used as the first initial motion vector predicted value, and the fourth motion vector predictor is used as the second initial motion vector predicted value.
- the prediction unit is specifically configured to determine the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed:
- the third motion vector predictor and the fourth motion vector predictor included in the candidate motion information of the first position in the candidate list are determined as the first initial motion vector predictor and the second initial motion vector predictor.
- the device provided by the above design can be applied to an encoder or a decoder.
- the correction unit is specifically used for:
- a motion vector correction process is performed according to the first initial motion vector predicted value and the second initial motion vector predicted value.
- the prediction unit is further configured to: when the image block to which the candidate motion information belongs and the current image block to be processed belong to the same image, determine the first target motion vector predictor according to the first initial motion vector predicted value and the first motion vector difference, and determine the second target motion vector predictor according to the second initial motion vector predictor and the second motion vector difference; and predict the current image block to be processed according to the first target motion vector predictor and the second target motion vector predictor.
- the device provided by the above design can be applied to an encoder or a decoder.
- the prediction unit is specifically configured to determine the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed:
- a corresponding candidate item is determined from the candidate list according to the candidate index parsed from the code stream, the candidate item includes the first candidate motion information and the second candidate motion information, wherein the first candidate motion information includes the fifth motion vector predictor and the sixth motion vector predictor, and the second candidate motion information includes the seventh motion vector predictor and the eighth motion vector predictor;
- the fifth motion vector predictor and the sixth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value; or,
- the seventh motion vector predictor and the eighth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value.
- the device provided by the above design can be applied to a decoder.
- when the device provided by this design is applied to an encoder, in determining the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed, the prediction unit is specifically used as follows:
- a corresponding candidate item is determined from the candidate list.
- the candidate item includes the first candidate motion information and the second candidate motion information, wherein the first candidate motion information includes the fifth motion vector predictor and the sixth motion vector predictor, and the second candidate motion information includes the seventh motion vector predictor and the eighth motion vector predictor.
- the fifth motion vector predictor and the sixth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value; or,
- the seventh motion vector predictor and the eighth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value.
- in determining the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed, the prediction unit is specifically configured as follows:
- the candidate at the first position in the candidate list includes first candidate motion information and second candidate motion information, wherein the first candidate motion information includes the fifth motion vector predictor and the sixth motion vector predictor, and the second candidate motion information includes the seventh motion vector predictor and the eighth motion vector predictor.
- the fifth motion vector predictor and the sixth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value; or,
- the seventh motion vector predictor and the eighth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value.
- the device provided by the above design can be applied to an encoder or a decoder.
- the correction unit is specifically configured to perform a motion vector correction process according to the first initial motion vector predicted value and the second initial motion vector predicted value:
- the difference between the first modified reference prediction block and the second modified reference prediction block is less than or equal to the difference between the first reference prediction block and the second reference prediction block
- the first modified reference prediction block is an image block in a first preset area that has the same size as the first reference prediction block, and the first preset area includes the first reference prediction block;
- the second modified reference prediction block is an image block in a second preset area that has the same size as the second reference prediction block, and the second preset area includes the second reference prediction block; the first modified reference prediction block corresponds to the first modified motion vector predictor, and the second modified reference prediction block corresponds to the second modified motion vector predictor.
- the device provided by the above design can be applied to an encoder or a decoder.
- in determining a first modified reference prediction block according to the first reference prediction block and determining a second modified reference prediction block according to the second reference prediction block, the correction unit is specifically used as follows:
- the first reference prediction block pair includes the first reference prediction block and the second reference prediction block;
- the second reference prediction block pair includes a third reference prediction block and a fourth reference prediction block, the third reference prediction block is obtained by performing a motion search in the first preset area based on the first reference prediction block, and the fourth reference prediction block is obtained by performing a motion search in the second preset area based on the second reference prediction block;
- the third reference prediction block pair includes a fifth reference prediction block and a sixth reference prediction block
- the fifth reference prediction block is obtained by performing a motion search in the first preset area based on the third reference prediction block included in the second reference prediction block pair with the smallest difference, and the sixth reference prediction block is obtained by performing a motion search in the second preset area based on the fourth reference prediction block included in the second reference prediction block pair with the smallest difference;
- the fifth reference prediction block included in the third reference prediction block pair with the smallest difference is the first modified reference prediction block
- the device provided by the above design can be applied to an encoder or a decoder.
- the correction unit is also used for:
- the first reference prediction block is the first modified reference prediction block
- it is determined that the second reference prediction block is a second modified reference prediction block.
- the device provided by the above design can be applied to an encoder or a decoder.
- the correction unit is also used for:
- when the difference between the fifth reference prediction block and the sixth reference prediction block included in the third reference prediction block pair with the smallest difference is greater than the difference between the third reference prediction block and the fourth reference prediction block included in the second reference prediction block pair with the smallest difference,
- it is determined that the third reference prediction block included in the second reference prediction block pair with the smallest difference is the first modified reference prediction block, and
- it is determined that the fourth reference prediction block included in the second reference prediction block pair with the smallest difference is the second modified reference prediction block.
- the device provided by the above design can be applied to an encoder or a decoder.
- an embodiment of the present application provides a video image prediction device, including:
- a prediction unit configured to determine a first initial motion vector prediction value, a second initial motion vector prediction value, a first motion vector difference, and a second motion vector difference of the first image block to be processed;
- the correction unit is configured to determine a first motion vector predictor according to the first initial motion vector predictor and the first motion vector difference, and determine a second motion vector predictor according to the second initial motion vector predictor and the second motion vector difference;
- the prediction unit is further configured to perform a motion vector correction process according to the first motion vector predicted value and the second motion vector predicted value to obtain the first modified motion vector prediction value and the second modified motion vector prediction value, and to predict the first image block to be processed according to the first modified motion vector prediction value and the second modified motion vector prediction value.
- the video image prediction device is applied to, for example, a video encoding device (video encoder) or a video decoding device (video decoder).
- in determining the first initial motion vector predicted value, the second initial motion vector predicted value, the first motion vector difference, and the second motion vector difference of the current image block to be processed, the prediction unit is specifically used for:
- determining the first initial motion vector predictor and the second initial motion vector predictor of the current image block to be processed.
- the device provided by the above design can be applied to a decoder.
- when applied to an encoder, in determining the first initial motion vector prediction value, the second initial motion vector prediction value, the first motion vector difference, and the second motion vector difference of the current image block to be processed, the prediction unit is specifically used for:
- determining the first initial motion vector predictor, the second initial motion vector predictor, the first motion vector difference, and the second motion vector difference of the current image block to be processed.
- the prediction unit is specifically configured to determine the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed:
- the corresponding candidate motion information is determined from the candidate list.
- the candidate motion information includes a third motion vector predictor and a fourth motion vector predictor.
- the third motion vector predictor is used as the first initial motion vector prediction value, and the fourth motion vector predictor is used as the second initial motion vector prediction value; or,
- the third motion vector predictor and the fourth motion vector predictor included in the candidate motion information of the first position in the candidate list are determined as the first initial motion vector predictor and the second initial motion vector predictor.
- the device provided by the above design can be applied to a decoder.
- when applied to an encoder, in determining the first initial motion vector predicted value and the second initial motion vector predicted value of the current image block to be processed, the prediction unit is specifically used as follows:
- the candidate motion information includes a third motion vector predictor and a fourth motion vector predictor.
- the third motion vector predictor is used as the first initial motion vector predicted value, and the fourth motion vector predictor is used as the second initial motion vector predicted value; or,
- the third motion vector predictor and the fourth motion vector predictor included in the candidate motion information of the first position in the candidate list are determined as the first initial motion vector predictor and the second initial motion vector predictor.
- the correction unit is specifically configured to perform a motion vector correction process according to the first motion vector predicted value and the second motion vector predicted value:
- a motion vector correction process is performed according to the first motion vector predicted value and the second motion vector predicted value.
- the device provided by the above design can be applied to an encoder or a decoder.
- the prediction unit is also used for:
- the current image block to be processed is predicted according to the first motion vector prediction value and the second motion vector prediction value.
- the device provided by the above design can be applied to an encoder or a decoder.
- the prediction unit is specifically configured to determine the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed:
- a corresponding candidate item is determined from the candidate list according to the candidate index parsed from the code stream, the candidate item includes the first candidate motion information and the second candidate motion information, wherein the first candidate motion information includes the fifth motion vector predictor and the sixth motion vector predictor, and the second candidate motion information includes the seventh motion vector predictor and the eighth motion vector predictor;
- the fifth motion vector predictor and the sixth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value; or,
- the seventh motion vector predictor and the eighth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value.
- the device provided by the above design can be applied to a decoder.
- when the device provided by this design is applied to an encoder, the prediction unit, in determining the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed, is specifically used for the following:
- the corresponding candidate item is determined from the candidate list according to the rate-distortion cost algorithm.
- the candidate item includes first candidate motion information and second candidate motion information, wherein the first candidate motion information includes the fifth motion vector predictor and the sixth motion vector predictor, and the second candidate motion information includes the seventh motion vector predictor and the eighth motion vector predictor;
- the fifth motion vector predictor and the sixth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value, respectively; or,
- the seventh motion vector predictor and the eighth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value, respectively.
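The encoder-side choice by rate-distortion cost can be sketched as below. The text names a rate-distortion cost algorithm without giving a formula, so the Lagrangian form J = D + λ·R used here is an assumption, and `rd_cost`, `pick_candidate`, and the sample numbers are hypothetical.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R (assumed form)."""
    return distortion + lam * rate_bits


def pick_candidate(stats, lam):
    """Return the index of the candidate with the lowest cost.

    stats is a list of (distortion, rate_bits) pairs, one per candidate.
    """
    costs = [rd_cost(d, r, lam) for d, r in stats]
    return min(range(len(costs)), key=costs.__getitem__)


# Illustrative (distortion, rate) measurements for three candidates.
stats = [(120.0, 2), (100.0, 6), (90.0, 12)]
best = pick_candidate(stats, lam=4.0)
```

A larger λ favors candidates that are cheap to signal, and a smaller λ favors candidates with low distortion.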
- in determining the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed, the prediction unit is specifically configured as follows:
- the candidate at the first position in the candidate list includes first candidate motion information and second candidate motion information, where the first candidate motion information includes a fifth motion vector predictor and a sixth motion vector predictor, and the second candidate motion information includes a seventh motion vector predictor and an eighth motion vector predictor;
- the fifth motion vector predictor and the sixth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value, respectively; or,
- the seventh motion vector predictor and the eighth motion vector predictor are the first initial motion vector prediction value and the second initial motion vector prediction value, respectively.
- the device provided by the above design can be applied to an encoder or a decoder.
- the motion vector correction process that the correction unit performs according to the first motion vector predicted value and the second motion vector predicted value specifically satisfies the following:
- the difference between the first modified reference prediction block and the second modified reference prediction block is less than or equal to the difference between the first reference prediction block and the second reference prediction block;
- the first modified reference prediction block is an image block in a first preset area that has the same size as the first reference prediction block, and the first preset area includes the first reference prediction block;
- the second modified reference prediction block is an image block in a second preset area that has the same size as the second reference prediction block, and the second preset area includes the second reference prediction block; the first modified reference prediction block corresponds to the first modified motion vector predictor, and the second modified reference prediction block corresponds to the second modified motion vector predictor.
- the device provided by the above design can be applied to an encoder or a decoder.
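One way to picture the correction conditions above is a small search over the two preset areas that keeps the pair of blocks with the smallest difference. The sketch below makes several assumptions not stated in the text: the difference is measured as a sum of absolute differences (SAD), the two blocks move by mirrored integer offsets, and the preset area is a square of the given radius. All function names are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(a - b) for ra, rb in zip(block_a, block_b) for a, b in zip(ra, rb))


def crop(picture, x, y, w, h):
    """Cut a w x h block whose top-left sample is at (x, y)."""
    return [row[x:x + w] for row in picture[y:y + h]]


def refine(ref0, ref1, pos0, pos1, size, radius=1):
    """Return (offset, cost) of the block pair with the smallest SAD.

    The blocks move by mirrored offsets (+d in ref0, -d in ref1); this mirroring
    is an assumption of the sketch. The caller must keep the search window
    inside both pictures.
    """
    w, h = size
    # Start from the unmodified pair so the result is never worse than it.
    best = ((0, 0), sad(crop(ref0, *pos0, w, h), crop(ref1, *pos1, w, h)))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            b0 = crop(ref0, pos0[0] + dx, pos0[1] + dy, w, h)
            b1 = crop(ref1, pos1[0] - dx, pos1[1] - dy, w, h)
            cost = sad(b0, b1)
            if cost < best[1]:
                best = ((dx, dy), cost)
    return best


def f(x, y):
    """Synthetic sample pattern used only to build the test pictures."""
    return x * x + 7 * y


ref0 = [[f(x, y) for x in range(8)] for y in range(8)]
ref1 = [[f(x + 2, y) for x in range(8)] for y in range(8)]  # same content shifted by 2
offset, cost = refine(ref0, ref1, pos0=(2, 2), pos1=(2, 2), size=(2, 2), radius=1)
```

Because the search starts from the zero offset, the difference between the returned blocks cannot exceed the difference between the original reference prediction blocks, which mirrors the first condition above.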
- a fifth aspect of the present application provides an image prediction device, the device includes a processor and a memory coupled to the processor; the processor is configured to execute the method in the various implementations of the first aspect or the second aspect.
- a sixth aspect of the present application provides a video encoder.
- the video encoder is used to encode a current image block to be processed and includes an inter-frame prediction module, an entropy coding module, and a reconstruction module, wherein the inter-frame prediction module includes the image prediction device provided by the design applied to the encoder in the third aspect or the fourth aspect;
- the inter-frame prediction module is used to predict the predicted value of the pixel value of the current image block to be processed;
- the entropy coding module is used to encode the indication information into the code stream;
- the indication information is used to indicate the initial motion information of the image block (including the first initial motion vector prediction value and the second initial motion vector prediction value);
- the reconstruction module is used to reconstruct the image block according to the predicted value of the pixel value of the current image block to be processed.
- the seventh aspect of the present application provides a video decoder, which is used to decode image blocks from a code stream, and includes: an entropy decoding module, which is used to decode indication information from the code stream.
- the inter-frame prediction module includes the image prediction device provided by the design applied to the decoder in the third or fourth aspect, and is used to predict the predicted value of the pixel value of the current image block to be processed; the reconstruction module is used to reconstruct the current image block to be processed based on the predicted value of its pixel value.
- an embodiment of the present application provides a device for decoding video data, and the device includes:
- the memory is used to store video data in the form of a code stream, and the video data includes one or more image blocks;
- a video decoder is used to determine (or obtain) the first initial motion vector predictor, the second initial motion vector predictor, the first motion vector difference, and the second motion vector difference of the current image block to be processed; perform a motion vector correction (or motion vector refinement, such as DMVR) process according to the first initial motion vector predictor and the second initial motion vector predictor to obtain the first modified motion vector predictor and the second modified motion vector predictor (in other words, the motion vector correction process is performed according to the first initial motion vector predictor to obtain the first modified motion vector predictor, and according to the second initial motion vector predictor to obtain the second modified motion vector predictor); determine the first motion vector prediction value according to the first modified motion vector predictor and the first motion vector difference, and determine the second motion vector prediction value according to the second modified motion vector predictor and the second motion vector difference; and predict the current image block to be processed according to the first motion vector prediction value and the second motion vector prediction value.
- the video decoder can specifically implement the method corresponding to the design of the decoder described in the first aspect.
- the video decoder includes any device in the third aspect applied to the design of an inter prediction unit or a decoder.
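The order of operations in this first design (refine the initial predictors, then apply the motion vector differences) can be sketched as follows. The sketch assumes that "determining according to" the modified predictor and the difference means component-wise addition, which the text does not state. `toy_refine` is a hypothetical stand-in for a DMVR-style search and simply nudges the pair by a fixed mirrored offset.

```python
def add_mv(a, b):
    """Component-wise sum of two (x, y) motion vectors."""
    return (a[0] + b[0], a[1] + b[1])


def predict_mvs_refine_then_add(init_mvp0, init_mvp1, mvd0, mvd1, refine):
    """Refine the initial predictors first, then add the motion vector differences."""
    mod_mvp0, mod_mvp1 = refine(init_mvp0, init_mvp1)
    return add_mv(mod_mvp0, mvd0), add_mv(mod_mvp1, mvd1)


def toy_refine(mvp0, mvp1):
    """Hypothetical refinement: shift the pair by a fixed mirrored offset."""
    return add_mv(mvp0, (1, 0)), add_mv(mvp1, (-1, 0))


mv0, mv1 = predict_mvs_refine_then_add((4, -2), (-4, 2), (2, 1), (-2, -1), toy_refine)
```

In a real codec the refinement would compare reference prediction blocks rather than apply a fixed offset.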
- a video decoder is used to determine (or obtain) the first initial motion vector predictor, the second initial motion vector predictor, the first motion vector difference, and the second motion vector difference of the first image block to be processed; determine a first motion vector prediction value according to the first initial motion vector predictor and the first motion vector difference, and determine a second motion vector prediction value according to the second initial motion vector predictor and the second motion vector difference; perform a motion vector correction process according to the first motion vector prediction value and the second motion vector prediction value to obtain the first modified motion vector prediction value and the second modified motion vector prediction value (in other words, the motion vector correction process is performed according to the first motion vector prediction value to obtain the first modified motion vector prediction value, and according to the second motion vector prediction value to obtain the second modified motion vector prediction value); and predict the first image block to be processed according to the first modified motion vector prediction value and the second modified motion vector prediction value.
- the video decoder can specifically implement the method corresponding to the design of the decoder described in the second aspect.
- the video decoder includes any device in the fourth aspect applied to the design of an inter prediction unit or a decoder.
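In this second design the motion vector differences are applied before the correction process. A minimal sketch under the same caveats: component-wise addition is an assumption, and `toy_refine` is a hypothetical stand-in that snaps each component down to an even value so that the effect of the ordering is visible.

```python
def add_mv(a, b):
    """Component-wise sum of two (x, y) motion vectors."""
    return (a[0] + b[0], a[1] + b[1])


def predict_mvs_add_then_refine(init_mvp0, init_mvp1, mvd0, mvd1, refine):
    """Add the motion vector differences first, then refine the resulting pair."""
    mv0 = add_mv(init_mvp0, mvd0)
    mv1 = add_mv(init_mvp1, mvd1)
    return refine(mv0, mv1)


def toy_refine(mv0, mv1):
    """Hypothetical refinement: snap every component down to an even value."""
    def snap(v):
        return v - (v % 2)
    return (snap(mv0[0]), snap(mv0[1])), (snap(mv1[0]), snap(mv1[1]))


mod0, mod1 = predict_mvs_add_then_refine((4, -2), (-4, 2), (1, 1), (-1, -1), toy_refine)
```

The refined pair is what the prediction of the image block then uses.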
- an embodiment of the present application provides a device for encoding video data, and the device includes:
- the memory is used to store video data in the form of a code stream, and the video data includes one or more image blocks;
- a video encoder is used to determine (or obtain) the first initial motion vector predictor, the second initial motion vector predictor, the first motion vector difference, and the second motion vector difference of the current image block to be processed; perform a motion vector correction (or motion vector refinement, such as DMVR) process according to the first initial motion vector predictor and the second initial motion vector predictor to obtain the first modified motion vector predictor and the second modified motion vector predictor (in other words, the motion vector correction process is performed according to the first initial motion vector predictor to obtain the first modified motion vector predictor, and according to the second initial motion vector predictor to obtain the second modified motion vector predictor); determine the first motion vector prediction value according to the first modified motion vector predictor and the first motion vector difference, and determine the second motion vector prediction value according to the second modified motion vector predictor and the second motion vector difference; and predict the current image block to be processed according to the first motion vector prediction value and the second motion vector prediction value.
- the video encoder may implement the method corresponding to the design of the encoder described in the first aspect.
- the video encoder includes any device in the third aspect applied to the design of the inter prediction unit.
- a video encoder is used to determine (or obtain) the first initial motion vector predictor, the second initial motion vector predictor, the first motion vector difference, and the second motion vector difference of the first image block to be processed; determine a first motion vector prediction value according to the first initial motion vector predictor and the first motion vector difference, and determine a second motion vector prediction value according to the second initial motion vector predictor and the second motion vector difference; perform a motion vector correction process according to the first motion vector prediction value and the second motion vector prediction value to obtain the first modified motion vector prediction value and the second modified motion vector prediction value (in other words, the motion vector correction process is performed according to the first motion vector prediction value to obtain the first modified motion vector prediction value, and according to the second motion vector prediction value to obtain the second modified motion vector prediction value); and predict the first image block to be processed according to the first modified motion vector prediction value and the second modified motion vector prediction value.
- the video encoder may implement the method corresponding to the design of the encoder described in the second aspect.
- the video encoder includes any device applied to the design of an inter prediction unit in the fourth aspect.
- an embodiment of the present application provides an encoding device, including a non-volatile memory and a processor coupled with each other, wherein the processor calls the program code stored in the memory to execute some or all of the steps of the method, described in the first aspect or the second aspect, that corresponds to the design applied to the encoder.
- an embodiment of the present application provides a decoding device, including a non-volatile memory and a processor coupled with each other, wherein the processor calls the program code stored in the memory to execute some or all of the steps of the method, described in the first aspect or the second aspect, that corresponds to the design applied to the decoder.
- an embodiment of the present application provides a computer-readable storage medium that stores program code, where the program code includes instructions for executing some or all of the steps of any method of the first aspect or the second aspect.
- an embodiment of the present application provides a computer program product which, when run on a computer, causes the computer to execute some or all of the steps of any method of the first aspect or the second aspect.
- a fourteenth aspect of the present application provides an electronic device, including the video encoder according to the sixth aspect, or the video decoder according to the seventh aspect, or the image prediction device according to the third, fourth, or fifth aspect.
- FIG. 1A is a block diagram of an example of a video encoding and decoding system 10 used to implement an embodiment of the present application;
- FIG. 1B is a block diagram of an example of a video decoding system 40 used to implement an embodiment of the present application;
- FIG. 2 is a block diagram of an example structure of an encoder 20 used to implement an embodiment of the present application;
- FIG. 3 is a block diagram of an example structure of a decoder 30 used to implement an embodiment of the present application;
- FIG. 4 is a block diagram of an example of a video decoding device 400 used to implement an embodiment of the present application;
- FIG. 5 is a block diagram of another example of an encoding device or a decoding device used to implement an embodiment of the present application;
- FIG. 6 is a schematic diagram of candidate blocks in the spatial domain and the time domain used to implement an embodiment of the present application;
- FIG. 7A is a schematic diagram of MMVD search points used to implement an embodiment of the present application;
- FIG. 7B is a schematic diagram of an MMVD search process used to implement an embodiment of the present application;
- FIG. 8 is a schematic flowchart of a video image prediction method according to an embodiment of the present application;
- FIG. 9 is a schematic diagram of forward and backward reference images used to implement an embodiment of the present application;
- FIG. 10A is a schematic diagram of a candidate list used to implement an embodiment of the present application;
- FIG. 10B is a schematic diagram of selecting a motion vector of a prediction block of a current block used to implement an embodiment of the present application;
- FIG. 11 is a schematic diagram of a search point used to implement an embodiment of the present application;
- FIG. 12 is a schematic diagram of a motion vector refinement process used to implement an embodiment of the present application;
- FIG. 13 is a schematic flowchart of another video image prediction method used to implement an embodiment of the present application;
- FIG. 14 is a schematic diagram of a motion vector refinement process used to implement an embodiment of the present application;
- FIG. 15 is a structural block diagram of a video image prediction device 1500 used to implement an embodiment of the present application.
- the corresponding device may include one or more units, such as functional units, to perform the described one or more method steps (for example, one unit performing one or more steps, or multiple units each performing one or more of the multiple steps), even if such one or more units are not explicitly described or illustrated in the drawings.
- the corresponding method may include one step to perform the functionality of one or more units (for example, one step performing the functionality of one or more units, or multiple steps each performing the functionality of one or more of the multiple units), even if such one or more steps are not explicitly described or illustrated in the drawings.
- Video coding generally refers to processing a sequence of pictures that form a video or video sequence.
- the terms "picture", "frame", and "image" can be used as synonyms.
- Video coding as used herein means video encoding or video decoding.
- Video encoding is performed on the source side and usually includes processing (for example, by compressing) the original video picture to reduce the amount of data required to represent the video picture, so as to store and/or transmit more efficiently.
- Video decoding is performed on the destination side and usually involves inverse processing relative to the encoder to reconstruct the video picture.
- the "coding" of video pictures involved in the embodiments should be understood as involving "encoding" or "decoding" of a video sequence.
- the combination of the encoding part and the decoding part is also called codec (encoding and decoding).
- a video sequence includes a series of pictures, the pictures are further divided into slices, and the slices are divided into blocks.
- Video coding is performed in units of blocks.
- the concept of blocks is further expanded.
- a macroblock (MB) can be further divided into multiple prediction blocks (partitions) that can be used for predictive coding.
- in high-efficiency video coding (HEVC), basic concepts such as coding unit (CU), prediction unit (PU), and transform unit (TU) are adopted, which separate the different types of block units by function.
- the CU can be divided into smaller CUs according to the quadtree, and the smaller CUs can be further divided to form a quadtree structure.
- the CU is a basic unit for dividing and encoding the coded image.
- PU can correspond to prediction block and is the basic unit of prediction coding.
- the CU is further divided into multiple PUs according to the division mode.
- the TU can correspond to the transform block and is the basic unit for transforming the prediction residual.
- whether CU, PU, or TU, each is in essence a block (or image block).
- a coding tree unit (CTU) is split into multiple CUs by using a quadtree structure represented as a coding tree.
- a decision is made at the CU level whether to use inter-picture (temporal) or intra-picture (spatial) prediction to encode picture regions.
- Each CU can be further split into one, two or four PUs according to the PU split type.
- the same prediction process is applied in a PU, and relevant information is transmitted to the decoder on the basis of the PU.
- the CU may be divided into transform units (TU) according to another quadtree structure similar to the coding tree used for the CU.
- a quad-tree and binary tree (Quad-tree and Binary Tree, QTBT) partitioning framework is used to divide coding blocks.
- the CU may have a square or rectangular shape.
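The quadtree division of a CTU into CUs described above can be illustrated with a short recursive sketch. The split decision `should_split`, the minimum size, and the sample rule are hypothetical; a real encoder would typically decide from rate-distortion cost, and binary-tree splits are not modeled here.

```python
def quadtree_split(x, y, size, min_size, should_split):
    """Return the leaf CUs of a square block as (x, y, size) tuples."""
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]
    half = size // 2
    leaves = []
    # Visit the four quadrants in raster order: top-left, top-right,
    # bottom-left, bottom-right.
    for oy in (0, half):
        for ox in (0, half):
            leaves += quadtree_split(x + ox, y + oy, half, min_size, should_split)
    return leaves


# Hypothetical rule: keep splitting only the block at the top-left corner.
leaves = quadtree_split(0, 0, 64, 16, lambda x, y, s: x == 0 and y == 0)
```

The leaves tile the original 64 x 64 block without overlap, which is the property the coding tree relies on.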
- the image block to be processed in the current image can be referred to as the current block or image block to be processed.
- a reference block is a block that provides a reference signal for the current block, where the reference signal represents the pixel value in the image block.
- the block in the reference image that provides the prediction signal for the current block may be a prediction block, where the prediction signal represents the pixel value or sample value or sample signal in the prediction block. For example, after traversing multiple reference blocks, the best reference block is found. This best reference block will provide prediction for the current block, and this block is called a prediction block.
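The traversal of reference blocks mentioned above can be sketched as a simple minimum search. The matching criterion is not specified in the text; the sum of absolute differences (SAD) used here is an assumption, and the function names and sample values are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(a - b) for ra, rb in zip(block_a, block_b) for a, b in zip(ra, rb))


def best_reference_block(current, reference_blocks):
    """Return (index, block) of the reference block with the smallest SAD."""
    costs = [sad(current, ref) for ref in reference_blocks]
    index = min(range(len(costs)), key=costs.__getitem__)
    return index, reference_blocks[index]


current = [[10, 12], [11, 13]]
reference_blocks = [
    [[0, 0], [0, 0]],
    [[10, 11], [11, 14]],
    [[20, 20], [20, 20]],
]
index, prediction_block = best_reference_block(current, reference_blocks)
```

The block returned here plays the role of the prediction block for the current block.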
- in lossless video coding, the original video picture can be reconstructed, that is, the reconstructed video picture has the same quality as the original video picture (assuming no transmission loss or other data loss during storage or transmission).
- in lossy video coding, quantization is performed for further compression, reducing the amount of data required to represent the video picture; the decoder side cannot completely reconstruct the video picture, that is, the quality of the reconstructed video picture is lower than that of the original video picture.
- Video coding standards since H.261 belong to "lossy hybrid video codecs" (that is, they combine spatial and temporal prediction in the sample domain with 2D transform coding for applying quantization in the transform domain).
- Each picture of a video sequence is usually divided into a set of non-overlapping blocks, and is usually coded at the block level.
- the encoder side usually processes the video at the block (video block) level, that is, encodes the video.
- for example, the prediction block is generated by spatial (intra-picture) prediction or temporal (inter-picture) prediction; the prediction block is subtracted from the current block (the block currently processed or to be processed) to obtain the residual block; the residual block is transformed and quantized in the transform domain to reduce the amount of data to be transmitted (compressed); and the decoder side applies the inverse of the encoder processing to the encoded or compressed block to reconstruct the current block for representation.
- the encoder duplicates the decoder processing loop, so that the encoder and the decoder generate the same prediction (for example, intra prediction and inter prediction) and/or reconstruction for processing, that is, for encoding subsequent blocks.
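The residual loop described above can be sketched in a few lines. To stay short, this sketch omits the transform (effectively assuming an identity transform) and uses plain uniform quantization with a hypothetical step size; the function names are illustrative only.

```python
def encode_block(current, prediction, step):
    """Quantize the residual (current - prediction) with a uniform step size."""
    return [[round((c - p) / step) for c, p in zip(cr, pr)]
            for cr, pr in zip(current, prediction)]


def reconstruct_block(levels, prediction, step):
    """Dequantize the levels and add the prediction back."""
    return [[p + q * step for q, p in zip(qr, pr)]
            for qr, pr in zip(levels, prediction)]


current = [[53, 55], [61, 59]]
prediction = [[50, 50], [60, 60]]
levels = encode_block(current, prediction, step=4)
# The encoder runs the same reconstruction as the decoder, so both sides
# predict subsequent blocks from identical data.
recon = reconstruct_block(levels, prediction, step=4)
```

With `step=4` the reconstruction differs from `current`, which illustrates the lossy case; with `step=1` on integer samples the block is recovered exactly.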
- FIG. 1A exemplarily shows a schematic block diagram of a video encoding and decoding system 10 applied in an embodiment of the present application.
- the video encoding and decoding system 10 may include a source device 12 and a destination device 14.
- the source device 12 generates encoded video data. Therefore, the source device 12 may be referred to as a video encoding device.
- the destination device 14 can decode the encoded video data generated by the source device 12, and therefore, the destination device 14 can be referred to as a video decoding device.
- Various implementations of source device 12, destination device 14, or both may include one or more processors and memory coupled to the one or more processors.
- the memory may include, but is not limited to, RAM, ROM, EEPROM, flash memory, or any other medium that can be used to store desired program codes in the form of instructions or data structures accessible by a computer, as described herein.
- the source device 12 and the destination device 14 may include various devices, including desktop computers, mobile computing devices, notebook (for example, laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, on-board computers, wireless communication equipment, or the like.
- although FIG. 1A shows the source device 12 and the destination device 14 as separate devices, a device embodiment may also include both the source device 12 and the destination device 14, or the functionality of both, that is, the source device 12 or corresponding functionality and the destination device 14 or corresponding functionality.
- in such an embodiment, the same hardware and/or software, separate hardware and/or software, or any combination thereof may be used to implement the source device 12 or corresponding functionality and the destination device 14 or corresponding functionality.
- the source device 12 and the destination device 14 may communicate with each other via a link 13, and the destination device 14 may receive encoded video data from the source device 12 via the link 13.
- Link 13 may include one or more media or devices capable of moving encoded video data from source device 12 to destination device 14.
- link 13 may include one or more communication media that enable source device 12 to transmit encoded video data directly to destination device 14 in real time.
- the source device 12 may modulate the encoded video data according to a communication standard, such as a wireless communication protocol, and may transmit the modulated video data to the destination device 14.
- the one or more communication media may include wireless and/or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
- RF radio frequency
- the one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (e.g., the Internet).
- the one or more communication media may include routers, switches, base stations, or other devices that facilitate communication from source device 12 to destination device 14.
- the source device 12 includes an encoder 20, and optionally, the source device 12 may also include a picture source 16, a picture preprocessor 18, and a communication interface 22.
- the encoder 20, the picture source 16, the picture preprocessor 18, and the communication interface 22 may be hardware components in the source device 12, or may be software programs in the source device 12. They are described as follows:
- the picture source 16 may include or be any type of picture capture device, for example for capturing real-world pictures, and/or any type of device for generating pictures or comments (for screen content coding, some text on the screen is also considered part of the picture or image to be encoded), for example a computer graphics processor for generating computer-animated pictures, or any type of device for obtaining and/or providing real-world pictures or computer-animated pictures (for example, screen content or virtual reality (VR) pictures), and/or any combination thereof (for example, augmented reality (AR) pictures).
- the picture source 16 may be a camera for capturing pictures or a memory for storing pictures.
- the picture source 16 may also include any type of (internal or external) interface for storing previously captured or generated pictures and/or acquiring or receiving pictures.
- when the picture source 16 is a camera, it may be, for example, a local camera or a camera integrated in the source device; when the picture source 16 is a memory, it may be, for example, a local memory or a memory integrated in the source device.
- when the picture source 16 includes an interface, the interface may be, for example, an external interface for receiving pictures from an external video source.
- the external video source is, for example, an external picture capturing device such as a camera, an external memory, or an external picture generating device such as an external computer graphics processor, computer, or server.
- the interface can be any type of interface according to any proprietary or standardized interface protocol, such as a wired interface, a wireless interface, or an optical interface.
- a picture can be regarded as a two-dimensional array or matrix of picture elements.
- the pixels in the array can also be called sampling points.
- the number of sampling points of the array or picture in the horizontal and vertical directions (or axes) defines the size and/or resolution of the picture.
- to represent color, three color components are usually used, that is, a picture can be represented as or contain three sample arrays.
- in the RGB format or color space, a picture includes corresponding red, green, and blue sample arrays.
- in video coding, however, each pixel is usually expressed in a luminance/chrominance format or color space.
- for example, a picture in the YUV format includes the luminance component indicated by Y (sometimes indicated by L) and the two chrominance components indicated by U and V.
- the luma component Y represents brightness or gray level intensity (for example, the two are the same in a grayscale picture), and the two chroma components U and V represent chroma or color information components.
- a picture in the YUV format includes a luminance sample array of luminance sample values (Y), and two chrominance sample arrays of chrominance values (U and V).
- Pictures in RGB format can be converted or transformed to YUV format, and vice versa; this process is also called color transformation or conversion. If the picture is black and white, the picture may include only the luminance sample array.
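The color conversion mentioned above can be illustrated per pixel. The text does not specify conversion coefficients; the sketch assumes the classic BT.601 luma weights with the analog YUV scaling factors 0.492 and 0.877, and the function names are hypothetical.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV using BT.601 luma weights (assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


def yuv_to_rgb(y, u, v):
    """Invert rgb_to_yuv."""
    r = y + v / 0.877
    b = y + u / 0.492
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


gray = rgb_to_yuv(128, 128, 128)          # a gray pixel has (near) zero chroma
round_trip = yuv_to_rgb(*rgb_to_yuv(200, 30, 90))
```

A gray pixel maps to chrominance values of approximately zero, which matches the remark that a black-and-white picture needs only the luminance sample array.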
- the picture transmitted from the picture source 16 to the picture processor may also be referred to as original picture data 17.
- the picture preprocessor 18 is configured to receive the original picture data 17 and perform preprocessing on the original picture data 17 to obtain the preprocessed picture 19 or the preprocessed picture data 19.
- the pre-processing performed by the picture preprocessor 18 may include trimming, color format conversion (for example, conversion from RGB format to YUV format), color correction, or denoising.
- the encoder 20 (or video encoder 20) is configured to receive the pre-processed picture data 19 and process it using a relevant prediction mode (such as the prediction mode in the various embodiments herein), thereby providing the encoded picture data 21 (the structural details of the encoder 20 are described further below based on FIG. 2, FIG. 4, or FIG. 5).
- the encoder 20 may be used to implement the various embodiments described below to realize the application of the video image prediction method described in this application on the encoding side.
- the communication interface 22 can be used to receive the encoded picture data 21 and transmit it via the link 13 to the destination device 14 or any other device (such as a memory) for storage or direct reconstruction; the other device can be any device used for decoding or storage.
- the communication interface 22 may be used, for example, to encapsulate the encoded picture data 21 into a suitable format, such as a data packet, for transmission on the link 13.
- the destination device 14 includes a decoder 30, and optionally, the destination device 14 may also include a communication interface 28, a picture post processor 32, and a display device 34. They are described as follows:
- the communication interface 28 may be used to receive the encoded picture data 21 from the source device 12 or any other source, for example, a storage device, and the storage device is, for example, an encoded picture data storage device.
- the communication interface 28 can be used to transmit or receive the encoded picture data 21 via the link 13 between the source device 12 and the destination device 14 or via any type of network.
- the link 13 is, for example, a direct wired or wireless connection.
- the type of network is, for example, a wired or wireless network or any combination thereof, or any type of private network and public network, or any combination thereof.
- the communication interface 28 may be used, for example, to decapsulate the data packet transmitted by the communication interface 22 to obtain the encoded picture data 21.
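The encapsulation performed by the communication interface 22 and the matching decapsulation by the communication interface 28 can be sketched as below. The packet layout (a 16-bit sequence number followed by a 32-bit payload length, in network byte order) is purely hypothetical; the text only says that the data is packed into a suitable format such as a data packet.

```python
import struct

# Hypothetical header: 16-bit sequence number, 32-bit payload length, big-endian.
HEADER = struct.Struct("!HI")


def encapsulate(seq: int, payload: bytes) -> bytes:
    """Prefix the encoded picture data with the header."""
    return HEADER.pack(seq, len(payload)) + payload


def decapsulate(packet: bytes):
    """Return (sequence number, payload); raise if the packet is truncated."""
    seq, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated packet")
    return seq, payload


encoded_picture_data = b"\x00\x01\x02\x03"
packet = encapsulate(7, encoded_picture_data)
seq, recovered = decapsulate(packet)
```

The round trip returns the original bytes unchanged, which is all the decoder 30 needs from the transport.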
- Both the communication interface 28 and the communication interface 22 can be configured as a one-way communication interface or a two-way communication interface, and can be used, for example, to send and receive messages to establish a connection, and to acknowledge and exchange any other information related to the communication link and/or the data transmission, such as the transmission of encoded picture data.
- the decoder 30 (or video decoder 30) is used to receive the encoded picture data 21 and provide the decoded picture data 31 or the decoded picture 31 (the structural details of the decoder 30 are described further below based on FIG. 3, FIG. 4, or FIG. 5).
- the decoder 30 may be used to implement the various embodiments described below to realize the application of the video image prediction method described in this application on the decoding side.
- the picture post processor 32 is configured to perform post-processing on the decoded picture data 31 (also referred to as reconstructed picture data) to obtain post-processed picture data 33.
- the post-processing performed by the picture post processor 32 may include color format conversion (for example, conversion from YUV format to RGB format), color correction, trimming, resampling, or any other processing; the picture post processor 32 can also be used to transmit the post-processed picture data 33 to the display device 34.
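- The color format conversion mentioned above can be sketched for a single sample triple. This is only an illustration: the function name `yuv_to_rgb` is hypothetical, and the coefficients assume 8-bit full-range BT.601 (JFIF-style) YUV, which the text does not specify.

```python
def yuv_to_rgb(y, u, v):
    """Convert one 8-bit full-range BT.601 YUV triple to RGB (assumed matrix)."""
    def clip(value):
        # Round to the nearest integer and clamp to the 8-bit range.
        return max(0, min(255, int(round(value))))

    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    return clip(r), clip(g), clip(b)


# Neutral chroma (u = v = 128) leaves a pure grey level.
grey = yuv_to_rgb(128, 128, 128)
```

- A real post processor would typically also upsample subsampled chroma planes before this step; that part is omitted here.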
- the display device 34 is configured to receive the post-processed image data 33 to display the image to, for example, users or viewers.
- the display device 34 may be or may include any type of display for presenting reconstructed pictures, for example, an integrated or external display or monitor.
- the display may include a liquid crystal display (LCD), an organic light emitting diode (OLED) display, a plasma display, a projector, a micro LED display, a liquid crystal on silicon (LCoS) display, a digital light processor (DLP), or any other type of display.
- Although FIG. 1A shows the source device 12 and the destination device 14 as separate devices, a device embodiment may also include both the source device 12 and the destination device 14 or the functionality of both, that is, the source device 12 or corresponding functionality and the destination device 14 or corresponding functionality.
- the same hardware and/or software, separate hardware and/or software, or any combination thereof may be used to implement the source device 12 or the corresponding functionality and the destination device 14 or the corresponding functionality.
- the source device 12 and the destination device 14 may include any of a variety of devices, including any type of handheld or stationary device, for example, a notebook or laptop computer, mobile phone, smartphone, tablet or tablet computer, video camera, desktop computer, set-top box, television, camera, in-vehicle device, display device, digital media player, video game console, video streaming device (such as a content service server or content distribution server), broadcast receiver device, or broadcast transmitter device, and may use no operating system or any type of operating system.
- Both the encoder 20 and the decoder 30 can be implemented as any of various suitable circuits, for example, one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof.
- the device can store the software instructions in a suitable non-transitory computer-readable storage medium, and can use one or more processors to execute the instructions in hardware to perform the techniques of the present disclosure. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) can be regarded as one or more processors.
- the video encoding and decoding system 10 shown in FIG. 1A is only an example, and the technology of this application can be applied to video coding settings that do not necessarily include any data communication between encoding and decoding devices (for example, video encoding or video decoding).
- the data can be retrieved from local storage, streamed on the network, etc.
- the video encoding device can encode data and store the data to the memory, and/or the video decoding device can retrieve the data from the memory and decode the data.
- encoding and decoding are performed by devices that do not communicate with each other but only encode data to the memory and/or retrieve data from the memory and decode the data.
- FIG. 1B is an explanatory diagram of an example of a video coding system 40 including the encoder 20 of FIG. 2 and/or the decoder 30 of FIG. 3 according to an exemplary embodiment.
- the video coding system 40 can implement a combination of various technologies in the embodiments of the present application.
- the video coding system 40 may include an imaging device 41, an encoder 20, a decoder 30 (and/or a video encoder/decoder implemented by the logic circuit 47 of the processing circuit 46), an antenna 42, one or more processors 43, one or more memories 44, and/or a display device 45.
- the imaging device 41, the antenna 42, the processing circuit 46, the logic circuit 47, the encoder 20, the decoder 30, the processor 43, the memory 44, and/or the display device 45 can communicate with each other.
- Although the encoder 20 and the decoder 30 are both used to illustrate the video coding system 40, in different examples the video coding system 40 may include only the encoder 20 or only the decoder 30.
- antenna 42 may be used to transmit or receive an encoded bitstream of video data.
- the display device 45 may be used to present video data.
- the logic circuit 47 may be implemented by the processing circuit 46.
- the processing circuit 46 may include application-specific integrated circuit (ASIC) logic, graphics processor, general purpose processor, and so on.
- the video coding system 40 may also include an optional processor 43, and the optional processor 43 may similarly include application-specific integrated circuit (ASIC) logic, a graphics processor, a general-purpose processor, and the like.
- the logic circuit 47 may be implemented by hardware, such as dedicated hardware for video encoding, and the processor 43 may be implemented by general software, an operating system, and the like.
- the memory 44 may be any type of memory, such as volatile memory (for example, static random access memory (SRAM) or dynamic random access memory (DRAM)) or non-volatile memory (for example, flash memory).
- the memory 44 may be implemented by cache memory.
- the logic circuit 47 may access the memory 44 (e.g., to implement an image buffer).
- the logic circuit 47 and/or the processing circuit 46 may include memory (e.g., cache, etc.) for implementing image buffers and the like.
- the encoder 20 implemented by logic circuits may include an image buffer (e.g., implemented by the processing circuit 46 or the memory 44) and a graphics processing unit (e.g., implemented by the processing circuit 46).
- the graphics processing unit may be communicatively coupled to the image buffer.
- the graphics processing unit may include an encoder 20 implemented by a logic circuit 47 to implement various modules discussed with reference to FIG. 2 and/or any other encoder system or subsystem described herein.
- Logic circuits can be used to perform the various operations discussed herein.
- decoder 30 may be implemented by logic circuit 47 in a similar manner to implement the various modules discussed with reference to decoder 30 of FIG. 3 and/or any other decoder systems or subsystems described herein.
- the decoder 30 implemented by logic circuits may include an image buffer (implemented, for example, by the processing circuit 46 or the memory 44) and a graphics processing unit (implemented, for example, by the processing circuit 46).
- the graphics processing unit may be communicatively coupled to the image buffer.
- the graphics processing unit may include a decoder 30 implemented by a logic circuit 47 to implement the various modules discussed with reference to FIG. 3 and/or any other decoder systems or subsystems described herein.
- antenna 42 may be used to receive an encoded bitstream of video data.
- the encoded bitstream may include data, indicators, index values, mode selection data, etc., related to the encoded video frames discussed herein, such as data related to coding partitions (for example, transform coefficients or quantized transform coefficients, optional indicators (as discussed), and/or data defining the coding partitions).
- the video coding system 40 may also include a decoder 30 coupled to the antenna 42 and used to decode the encoded bitstream.
- the display device 45 is used to present video frames.
- the decoder 30 may be used to perform the reverse process.
- the decoder 30 can be used to receive and parse such syntax elements, and decode related video data accordingly.
- the encoder 20 may entropy encode the syntax elements into an encoded video bitstream. In such instances, the decoder 30 can parse such syntax elements and decode related video data accordingly.
- the video image encoding method described in the embodiment of the application occurs at the encoder 20, and the video image decoding method described in the embodiment of the application occurs at the decoder 30.
- the encoder 20 and the decoder 30 in the embodiment of the application may be, for example, an encoder/decoder corresponding to video standard protocols such as H.263, H.264, HEVC, MPEG-2, MPEG-4, VP8, VP9, or next-generation video standard protocols (such as H.266).
- Fig. 2 shows a schematic/conceptual block diagram of an example of an encoder 20 for implementing an embodiment of the present application.
- the encoder 20 includes a residual calculation unit 204, a transformation processing unit 206, a quantization unit 208, an inverse quantization unit 210, an inverse transformation processing unit 212, a reconstruction unit 214, a buffer 216, and a loop filter unit 220.
- the prediction processing unit 260 may include an inter prediction unit 244, an intra prediction unit 254, and a mode selection unit 262.
- the inter prediction unit 244 may include a motion estimation unit and a motion compensation unit (not shown).
- the encoder 20 shown in FIG. 2 may also be referred to as a hybrid video encoder or a video encoder according to a hybrid video codec.
- for example, the residual calculation unit 204, the transform processing unit 206, the quantization unit 208, the prediction processing unit 260, and the entropy encoding unit 270 form the forward signal path of the encoder 20, while the inverse quantization unit 210, the inverse transform processing unit 212, the reconstruction unit 214, the buffer 216, the loop filter 220, the decoded picture buffer (DPB) 230, and the prediction processing unit 260 form the backward signal path of the encoder, where the backward signal path of the encoder corresponds to the signal path of the decoder (see decoder 30 in FIG. 3).
- the encoder 20 receives a picture 201 or an image block 203 of a picture 201, for example, a picture in a picture sequence that forms a video or a video sequence through, for example, an input 202.
- the image block 203 may also be called the current picture block or the picture block to be encoded
- the picture 201 may be called the current picture or the picture to be encoded (especially in video encoding, to distinguish the current picture from other pictures, for example previously encoded and/or decoded pictures of the same video sequence, that is, the video sequence that also includes the current picture).
- the embodiment of the encoder 20 may include a segmentation unit (not shown in FIG. 2) for segmenting the picture 201 into a plurality of blocks such as the image block 203, usually into a plurality of non-overlapping blocks.
- the segmentation unit can be used to apply the same block size, and the corresponding grid defining the block size, to all pictures in the video sequence, or to change the block size between pictures or subsets or groups of pictures, and to divide each picture into the corresponding blocks.
- the prediction processing unit 260 of the encoder 20 may be used to perform any combination of the aforementioned segmentation techniques.
- the image block 203 is also or can be regarded as a two-dimensional array or matrix of sampling points with sample values, although its size is smaller than that of the picture 201.
- the image block 203 may include, for example, one sampling array (for example, a luminance array in the case of a black-and-white picture 201), three sampling arrays (for example, one luminance array and two chrominance arrays in the case of a color picture), or any other number and/or type of arrays depending on the color format applied.
- the number of sampling points in the horizontal and vertical directions (or axes) of the image block 203 defines the size of the image block 203.
- the encoder 20 shown in FIG. 2 is used to encode the picture 201 block by block, for example, to perform encoding and prediction on each image block 203.
- the residual calculation unit 204 is configured to calculate the residual block 205 based on the picture image block 203 and the prediction block 265 (other details of the prediction block 265 are provided below), for example, by subtracting the sample values of the prediction block 265 from the sample values of the picture image block 203 sample by sample (pixel by pixel) to obtain the residual block 205 in the sample domain.
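- As a minimal sketch of the sample-by-sample subtraction performed by the residual calculation unit 204, assuming blocks are represented as lists of rows of integer samples (the function name `residual_block` and the sample values are hypothetical):

```python
def residual_block(current, prediction):
    """Return current - prediction, computed sample by sample."""
    same_shape = len(current) == len(prediction) and all(
        len(c_row) == len(p_row) for c_row, p_row in zip(current, prediction)
    )
    if not same_shape:
        raise ValueError("current and prediction blocks must have the same size")
    return [
        [c - p for c, p in zip(c_row, p_row)]
        for c_row, p_row in zip(current, prediction)
    ]


current = [[52, 55], [61, 59]]      # samples of the block being encoded
prediction = [[50, 50], [60, 60]]   # samples of the prediction block
residual = residual_block(current, prediction)
```

- Residual samples can be negative, so they need a signed representation with one more bit than the input samples.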
- the transform processing unit 206 is configured to apply a transform such as discrete cosine transform (DCT) or discrete sine transform (DST) to the sample values of the residual block 205 to obtain transform coefficients 207 in the transform domain.
- the transform coefficient 207 may also be referred to as a transform residual coefficient, and represents the residual block 205 in the transform domain.
- the transform processing unit 206 may be used to apply an integer approximation of DCT/DST, such as the transform specified for HEVC/H.265. Compared with the orthogonal DCT transform, this integer approximation is usually scaled by a factor. In order to maintain the norm of the residual block processed by the forward and inverse transformation, an additional scaling factor is applied as part of the transformation process.
- the scaling factor is usually selected based on certain constraints. For example, the scaling factor is a trade-off between the power of 2 used for the shift operation, the bit depth of the transform coefficient, accuracy, and implementation cost.
- a specific scaling factor is specified for the inverse transform, for example by the inverse transformation processing unit 212, and accordingly a corresponding scaling factor is specified for the forward transform on the encoder 20 side by the transformation processing unit 206.
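- The forward and inverse transform of a residual block can be illustrated with a floating-point, orthonormal two-dimensional DCT-II. This is a sketch only: it is not the scaled integer approximation used by HEVC/H.265, it assumes square blocks, and all function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append(
            [scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]
        )
    return rows


def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_dct(block):
    """Sample domain -> transform domain: C * X * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def inverse_dct(coeffs):
    """Transform domain -> sample domain: C^T * Y * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


flat_residual = [[3] * 4 for _ in range(4)]
coeffs = forward_dct(flat_residual)   # energy concentrates in coeffs[0][0]
restored = inverse_dct(coeffs)
```

- Because this matrix is orthonormal, the forward and inverse transforms preserve the norm of the block without any extra scaling; an integer approximation loses that property, which is why the additional scaling factors described above are needed.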
- the quantization unit 208 is used to quantize the transform coefficient 207 by applying scalar quantization or vector quantization, for example, to obtain the quantized transform coefficient 209.
- the quantized transform coefficient 209 may also be referred to as a quantized residual coefficient 209.
- the quantization process can reduce the bit depth associated with some or all of the transform coefficients 207. For example, n-bit transform coefficients can be rounded down to m-bit transform coefficients during quantization, where n is greater than m.
- the degree of quantization can be modified by adjusting the quantization parameter (QP). For example, for scalar quantization, different scales can be applied to achieve finer or coarser quantization.
- a smaller quantization step size corresponds to a finer quantization
- a larger quantization step size corresponds to a coarser quantization.
- the appropriate quantization step size can be indicated by a quantization parameter (QP).
- the quantization parameter may be an index of a predefined set of suitable quantization steps.
- a smaller quantization parameter can correspond to fine quantization (smaller quantization step size)
- a larger quantization parameter can correspond to coarse quantization (larger quantization step size)
- the quantization may include division by a quantization step size, and the corresponding inverse quantization, performed by, for example, the inverse quantization unit 210, may include multiplication by the quantization step size.
- Embodiments according to some standards such as HEVC may use quantization parameters to determine the quantization step size.
- the quantization step size can be calculated based on the quantization parameter using a fixed-point approximation of an equation including division. Additional scaling factors can be introduced for quantization and inverse quantization to restore the norm of the residual block that may be modified due to the scale used in the fixed-point approximation of the equations for the quantization step size and the quantization parameter.
- the scales of inverse transform and inverse quantization may be combined.
- a custom quantization table can be used and signaled from the encoder to the decoder in, for example, a bitstream. Quantization is a lossy operation, where the larger the quantization step, the greater the loss.
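- The relationship between the quantization parameter, the quantization step size, and the loss can be sketched with plain scalar quantization. The mapping `2 ** ((qp - 4) / 6)` is the commonly cited H.264/HEVC relationship in which the step size doubles every 6 QP values; it is used here as an assumption, in floating point rather than the fixed-point approximation described above, and the function names and coefficient values are hypothetical.

```python
def quantization_step(qp):
    """Approximate step size for a given QP (assumed H.264/HEVC-style mapping)."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeffs, qp):
    """Scalar quantization: divide by the step size and round to an integer level."""
    step = quantization_step(qp)
    return [int(round(c / step)) for c in coeffs]


def dequantize(levels, qp):
    """Inverse quantization: multiply the levels by the same step size."""
    step = quantization_step(qp)
    return [level * step for level in levels]


coefficients = [100.0, -37.0, 12.0, 3.0]
for qp in (10, 22):
    levels = quantize(coefficients, qp)
    restored = dequantize(levels, qp)
    worst = max(abs(c - r) for c, r in zip(coefficients, restored))
    print(qp, quantization_step(qp), levels, worst)
```

- Running the loop shows the larger QP producing smaller levels and a larger worst-case reconstruction error, which is the lossy behaviour described above.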
- the inverse quantization unit 210 is configured to apply the inverse quantization of the quantization unit 208 to the quantized coefficients to obtain the inverse quantized coefficients 211, for example, by applying the inverse of the quantization scheme applied by the quantization unit 208, based on or using the same quantization step size as the quantization unit 208.
- the inverse quantized coefficients 211 may also be referred to as the inverse quantized residual coefficients 211, and correspond to the transform coefficients 207, although they usually differ from the transform coefficients because of the loss due to quantization.
- the inverse transform processing unit 212 is configured to apply the inverse of the transform applied by the transform processing unit 206, for example, an inverse discrete cosine transform (DCT) or an inverse discrete sine transform (DST), to obtain the inverse transform block 213 in the sample domain.
- the inverse transformation block 213 may also be referred to as an inverse transformation and inverse quantization block 213 or an inverse transformation residual block 213.
- the reconstruction unit 214 (for example, the summer 214) is used to add the inverse transform block 213 (that is, the reconstructed residual block 213) to the prediction block 265 to obtain the reconstructed block 215 in the sample domain, for example, by adding the sample values of the reconstructed residual block 213 to the sample values of the prediction block 265.
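- The addition performed by the reconstruction unit 214 can be sketched as follows. Clipping the sum to the valid sample range is an assumption added here (the text only describes the addition), and the function name and the `bit_depth` parameter are hypothetical.

```python
def reconstruct_block(prediction, residual, bit_depth=8):
    """Add the reconstructed residual to the prediction, sample by sample."""
    max_value = (1 << bit_depth) - 1
    return [
        # Assumed step: clamp each sum to [0, max_value] so it stays a valid sample.
        [min(max(p + r, 0), max_value) for p, r in zip(p_row, r_row)]
        for p_row, r_row in zip(prediction, residual)
    ]


prediction = [[50, 250], [60, 0]]
reconstructed_residual = [[2, 10], [1, -5]]
reconstructed = reconstruct_block(prediction, reconstructed_residual)
```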
- the buffer unit 216 (or “buffer” 216 for short) such as the line buffer 216 is used to buffer or store the reconstructed block 215 and the corresponding sample value, for example, for intra prediction.
- the encoder can be used to use the unfiltered reconstructed block and/or the corresponding sample value stored in the buffer unit 216 to perform any type of estimation and/or prediction, such as intra-frame prediction.
- the embodiment of the encoder 20 may be configured such that the buffer unit 216 is used not only to store the reconstructed block 215 for the intra prediction 254 but also for the loop filter unit 220 (not shown in FIG. 2), and/or such that, for example, the buffer unit 216 and the decoded picture buffer unit 230 form one buffer.
- Other embodiments may use the filtered block 221 and/or blocks or samples from the decoded picture buffer 230 (neither shown in FIG. 2) as the input or basis for the intra prediction 254.
- the loop filter unit 220 (or “loop filter” 220 for short) is used to filter the reconstructed block 215 to obtain the filtered block 221, so as to smooth pixel transitions or otherwise improve video quality.
- the loop filter unit 220 is intended to represent one or more loop filters, such as a deblocking filter, a sample-adaptive offset (SAO) filter, or other filters, such as a bilateral filter, an adaptive loop filter (ALF), a sharpening or smoothing filter, or a collaborative filter.
- Although the loop filter unit 220 is shown as an in-loop filter in FIG. 2, in other configurations the loop filter unit 220 may be implemented as a post-loop filter.
- the filtered block 221 may also be referred to as a filtered reconstructed block 221.
- the decoded picture buffer 230 may store the reconstructed coded block after the loop filter unit 220 performs a filtering operation on the reconstructed coded block.
- the embodiment of the encoder 20 may be used to output loop filter parameters (e.g., sample adaptive offset information), for example, directly or after entropy encoding by the entropy encoding unit 270 or any other entropy encoding unit, so that, for example, the decoder 30 can receive and apply the same loop filter parameters for decoding.
- the decoded picture buffer (DPB) 230 may be a reference picture memory that stores reference picture data for the encoder 20 to encode video data.
- DPB 230 can be formed by any of a variety of memory devices, such as dynamic random access memory (DRAM) (including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), and resistive RAM (RRAM)) or other types of memory devices.
- the DPB 230 and the buffer 216 may be provided by the same memory device or by separate memory devices.
- a decoded picture buffer (DPB) 230 is used to store the filtered block 221.
- the decoded picture buffer 230 may be further used to store other previously filtered blocks, such as the previously reconstructed and filtered block 221, of the same current picture or of different pictures such as previously reconstructed pictures, and may provide complete previously reconstructed, that is, decoded, pictures (and corresponding reference blocks and samples) and/or a partially reconstructed current picture (and corresponding reference blocks and samples), for example, for inter prediction.
- a decoded picture buffer (DPB) 230 is used to store the reconstructed block 215.
- the prediction processing unit 260, also called the block prediction processing unit 260, is used to receive or obtain the image block 203 (the current image block 203 of the current picture 201) and reconstructed picture data, such as reference samples of the same (current) picture from the buffer 216 and/or the reference picture data 231 of one or more previously decoded pictures from the decoded picture buffer 230, and to process such data for prediction, that is, to provide the prediction block 265, which can be an inter prediction block 245 or an intra prediction block 255.
- the mode selection unit 262 may be used to select a prediction mode (for example, intra or inter prediction mode) and/or the corresponding prediction block 245 or 255 used as the prediction block 265 to calculate the residual block 205 and reconstruct the reconstructed block 215.
- the embodiment of the mode selection unit 262 can be used to select a prediction mode (for example, from those supported by the prediction processing unit 260) that provides the best match or the minimum residual (a minimum residual means better compression in transmission or storage), or that provides the minimum signaling overhead (minimum signaling overhead means better compression in transmission or storage), or that considers or balances both.
- the mode selection unit 262 may be configured to determine a prediction mode based on rate distortion optimization (RDO), that is, to select the prediction mode that provides the minimum rate distortion, or to select a prediction mode whose associated rate distortion at least meets the prediction mode selection criteria.
- the encoder 20 is used to determine or select the best or optimal prediction mode from a set of (predetermined) prediction modes.
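- One common way to realize rate distortion optimization is to minimize a Lagrangian cost J = D + λ·R over the candidate modes, where D is the distortion and R the number of bits. The text does not specify the cost function, so the following is only a sketch under that assumption; the mode names, distortion values, and bit counts are made up for illustration.

```python
def select_mode(candidates, lam):
    """Pick the candidate with the smallest cost J = distortion + lam * rate.

    candidates: list of (mode_name, distortion, rate_in_bits) tuples.
    """
    best = min(candidates, key=lambda c: c[1] + lam * c[2])
    return best[0]


# Hypothetical measurements for one block.
candidates = [
    ("intra_dc", 400.0, 20),
    ("inter_merge", 450.0, 6),
    ("inter_explicit_mv", 300.0, 45),
]

mode_when_bits_are_expensive = select_mode(candidates, lam=5.0)
mode_when_bits_are_cheap = select_mode(candidates, lam=0.5)
```

- A larger λ favours modes with low signaling overhead, while a smaller λ favours modes with a small residual, which mirrors the balance between the two criteria described above.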
- the prediction mode set may include, for example, an intra prediction mode and/or an inter prediction mode.
- the set of intra prediction modes may include 35 different intra prediction modes, for example, non-directional modes such as DC (or mean) mode and planar mode, or directional modes as defined in H.265, or may include 67 different intra prediction modes, for example, non-directional modes such as DC (or mean) mode and planar mode, or directional modes as defined in H.266 under development.
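- As an illustration of a non-directional mode, the DC (or mean) mode can be sketched as filling the prediction block with the rounded mean of the reconstructed neighbouring samples above and to the left of the block. This is a simplified sketch: the function name is hypothetical, and any boundary smoothing a particular standard applies to DC-predicted blocks is omitted.

```python
def dc_prediction(above, left):
    """Predict a len(left) x len(above) block as the mean of its neighbours."""
    neighbours = list(above) + list(left)
    # Integer mean with rounding to nearest.
    dc = (sum(neighbours) + len(neighbours) // 2) // len(neighbours)
    return [[dc] * len(above) for _ in range(len(left))]


above = [100, 102, 104, 106]   # reconstructed row above the block
left = [98, 98, 100, 100]      # reconstructed column left of the block
predicted = dc_prediction(above, left)
```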
- the set of inter prediction modes depends on the available reference pictures (that is, for example, the aforementioned at least partially decoded pictures stored in the DPB 230) and other inter prediction parameters, for example, on whether the entire reference picture or only a part of the reference picture, such as a search window area surrounding the area of the current block, is used to search for the best matching reference block, and/or on whether pixel interpolation such as half-pixel and/or quarter-pixel interpolation is applied.
- the set of inter prediction modes may include, for example, skip mode and merge mode.
- the inter-frame prediction mode set may include the skip-based merged motion vector difference (MMVD) mode in the embodiment of the present application, or the merge-based MMVD mode.
- the inter prediction unit 244 may be used to perform any combination of the inter prediction techniques described below.
- the embodiments of the present application may also apply skip mode and/or direct mode.
- the prediction processing unit 260 may be further used to divide the image block 203 into smaller block partitions or sub-blocks, for example, by iteratively using quad-tree (QT) segmentation, binary-tree (BT) segmentation, triple-tree (TT) segmentation, or any combination thereof, and to perform prediction, for example, for each of the block partitions or sub-blocks, where the mode selection includes selecting the tree structure of the segmented image block 203 and selecting the prediction mode applied to each of the block partitions or sub-blocks.
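- The iterative quad-tree segmentation can be sketched as a recursive split of a square block into four equal sub-blocks. The split criterion used here (sample range above a threshold) is purely an assumption for illustration; an encoder would normally decide splits through mode selection. The function name and parameters are hypothetical.

```python
def quadtree_partition(block, x=0, y=0, size=None, min_size=4, threshold=10):
    """Return leaf partitions as (x, y, size) tuples for a square block."""
    if size is None:
        size = len(block)
    samples = [block[y + j][x + i] for j in range(size) for i in range(size)]
    # Stop splitting when the block is small enough or nearly flat.
    if size <= min_size or max(samples) - min(samples) <= threshold:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_partition(
                block, x + dx, y + dy, half, min_size, threshold
            )
    return leaves


# 8x8 block whose top-left 4x4 quadrant is bright and the rest is dark.
block = [[100 if (i < 4 and j < 4) else 0 for i in range(8)] for j in range(8)]
leaves = quadtree_partition(block)
```

- Binary-tree and triple-tree splits would follow the same recursive pattern with two or three children per node instead of four.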
- the inter prediction unit 244 may include a motion estimation (ME) unit (not shown in FIG. 2) and a motion compensation (MC) unit (not shown in FIG. 2).
- the motion estimation unit is used to receive or obtain the picture image block 203 (the current picture image block 203 of the current picture 201) and the decoded picture 231, or at least one or more previously reconstructed blocks, for example, reconstructed blocks of one or more other/different previously decoded pictures 231, for motion estimation.
- the video sequence may include the current picture and the previously decoded picture 231, or in other words, the current picture and the previously decoded picture 231 may be part of, or form, the picture sequence forming the video sequence.
- the encoder 20 may be used to select a reference block from multiple reference blocks of the same or different pictures among multiple other pictures, and to provide the motion estimation unit (not shown in FIG. 2) with the reference picture and/or the offset (spatial offset) between the position (X, Y coordinates) of the reference block and the position of the current block as an inter prediction parameter. This offset is also called a motion vector (MV).
- the motion compensation unit is used to obtain inter prediction parameters, and perform inter prediction based on or using the inter prediction parameters to obtain the inter prediction block 245.
- the motion compensation performed by the motion compensation unit may include fetching or generating a prediction block based on a motion/block vector determined by motion estimation (interpolation of sub-pixel accuracy may be performed). Interpolation filtering can generate additional pixel samples from known pixel samples, thereby potentially increasing the number of candidate prediction blocks that can be used to encode picture blocks.
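- Motion estimation and motion compensation can be sketched with an exhaustive integer-pixel block-matching search that minimizes the sum of absolute differences (SAD). The search strategy, the SAD criterion, and all function names are assumptions for illustration; sub-pixel interpolation is omitted.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def get_block(picture, x, y, size):
    return [row[x:x + size] for row in picture[y:y + size]]


def motion_search(current_block, reference, bx, by, search_range):
    """Return ((dx, dy), cost) of the best match around position (bx, by)."""
    size = len(current_block)
    height, width = len(reference), len(reference[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + size > width or y + size > height:
                continue  # candidate would fall outside the reference picture
            cost = sad(current_block, get_block(reference, x, y, size))
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


def motion_compensate(reference, bx, by, mv, size):
    """Fetch the prediction block the motion vector points to."""
    return get_block(reference, bx + mv[0], by + mv[1], size)


# Synthetic 8x8 reference picture in which every 2x2 block is unique.
reference = [[x * x + 10 * y * y for x in range(8)] for y in range(8)]
# The current 2x2 block at (2, 2) has the content found at (3, 1) in the reference.
current_block = get_block(reference, 3, 1, 2)
mv, cost = motion_search(current_block, reference, bx=2, by=2, search_range=2)
prediction = motion_compensate(reference, 2, 2, mv, 2)
```

- The returned offset plays the role of the motion vector, and the fetched block plays the role of the inter prediction block 245.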
- the motion compensation unit 246 can locate the prediction block pointed to by the motion vector in a reference picture list.
- the motion compensation unit 246 may also generate syntax elements associated with the blocks and video slices for use by the decoder 30 when decoding picture blocks of the video slices.
- the aforementioned inter prediction unit 244 may transmit syntax elements to the entropy encoding unit 270, and the syntax elements include inter prediction parameters (for example, indication information of the inter prediction mode selected for prediction of the current block after traversing multiple inter prediction modes).
- the inter-frame prediction parameter may not be carried in the syntax element.
- the decoder 30 can directly use the default prediction mode for decoding. It can be understood that the inter prediction unit 244 may be used to perform any combination of inter prediction techniques.
- the intra prediction unit 254 is used to obtain, for example, receive a picture block 203 (current picture block) of the same picture and one or more previously reconstructed blocks, for example reconstructed adjacent blocks, for intra estimation.
- the encoder 20 may be used to select an intra prediction mode from a plurality of (predetermined) intra prediction modes.
- the embodiment of the encoder 20 may be used to select an intra prediction mode based on optimization criteria, for example, based on a minimum residual (for example, an intra prediction mode that provides a prediction block 255 most similar to the current picture block 203) or a minimum rate distortion.
- the intra prediction unit 254 is further configured to determine the intra prediction block 255 based on the intra prediction parameters of the selected intra prediction mode. In any case, after selecting the intra prediction mode for the block, the intra prediction unit 254 is also used to provide intra prediction parameters to the entropy encoding unit 270, that is, to provide information indicating the selected intra prediction mode for the block. In one example, the intra prediction unit 254 may be used to perform any combination of intra prediction techniques.
- the aforementioned intra prediction unit 254 may transmit syntax elements to the entropy encoding unit 270, and the syntax elements include intra prediction parameters (for example, indication information of the intra prediction mode selected for prediction of the current block after traversing multiple intra prediction modes).
- the intra prediction parameters may not be carried in the syntax element.
- the decoder 30 can directly use the default prediction mode for decoding.
- the entropy coding unit 270 is used to apply entropy coding algorithms or schemes (for example, variable length coding (VLC) scheme, context adaptive VLC (context adaptive VLC, CAVLC) scheme, arithmetic coding scheme, context adaptive binary arithmetic) Coding (context adaptive binary arithmetic coding, CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding or other entropy Encoding method or technique) applied to quantized residual coefficients 209, inter-frame prediction parameters, intra-frame prediction parameters and/or loop filter parameters, one or all (or not applied), to obtain the output 272
- VLC variable length coding
- CAVLC context adaptive VLC
- CABAC context adaptive binary arithmetic coding
- SBAC syntax-based context-adaptive binary arithmetic coding
- PIPE probability interval partitioning entropy
- encoded picture data 21 output in the form of encoded bitstream
- the encoded bitstream can be transmitted to the video decoder 30, or archived for later transmission or retrieval by the video decoder 30.
- the entropy encoding unit 270 may also be used for entropy encoding other syntax elements of the current video slice being encoded.
- the non-transform-based encoder 20 may directly quantize the residual signal without the transform processing unit 206 for certain blocks or frames.
- the encoder 20 may have a quantization unit 208 and an inverse quantization unit 210 combined into a single unit.
- the encoder 20 may be used to implement the video image encoding method described in the following embodiments.
- the video encoder 20 may directly quantize the residual signal without processing by the transform processing unit 206, in which case processing by the inverse transform processing unit 212 is not needed either; or, for some image blocks or image frames, the video encoder 20 does not generate residual data, and accordingly processing by the transform processing unit 206, quantization unit 208, inverse quantization unit 210, and inverse transform processing unit 212 is not needed; or, the video encoder 20 may store the reconstructed image block directly as a reference block without processing by the filter 220; or, the quantization unit 208 and the inverse quantization unit 210 in the video encoder 20 may be combined together.
- the loop filter 220 is optional, and for lossless compression coding, the transform processing unit 206, the quantization unit 208, the inverse quantization unit 210, and the inverse transform processing unit 212 are optional. It should be understood that, according to different application scenarios, the inter prediction unit 244 and the intra prediction unit 254 may be selectively activated.
- FIG. 3 shows a schematic/conceptual block diagram of an example of a decoder 30 for implementing an embodiment of the present application.
- the video decoder 30 is used to receive, for example, encoded picture data (for example, an encoded bit stream) 21 encoded by the encoder 20 to obtain a decoded picture 231.
- video decoder 30 receives video data from video encoder 20, such as an encoded video bitstream and associated syntax elements that represent picture blocks of an encoded video slice.
- the decoder 30 includes an entropy decoding unit 304, an inverse quantization unit 310, an inverse transform processing unit 312, a reconstruction unit 314 (such as a summer 314), a buffer 316, a loop filter 320, a decoded picture buffer 330, and a prediction processing unit 360.
- the prediction processing unit 360 may include an inter prediction unit 344, an intra prediction unit 354, and a mode selection unit 362.
- video decoder 30 may perform decoding passes that are substantially reciprocal of the encoding passes described with video encoder 20 of FIG. 2.
- the entropy decoding unit 304 is configured to perform entropy decoding on the encoded picture data 21 to obtain, for example, quantized coefficients 309 and/or decoded encoding parameters (not shown in FIG. 3), for example, any one or all of the (decoded) inter prediction parameters, intra prediction parameters, loop filter parameters and/or other syntax elements.
- the entropy decoding unit 304 is further configured to forward the inter prediction parameters, intra prediction parameters and/or other syntax elements to the prediction processing unit 360.
- the video decoder 30 may receive syntax elements at the video slice level and/or the video block level.
- the inverse quantization unit 310 can be functionally the same as the inverse quantization unit 210
- the inverse transformation processing unit 312 can be functionally the same as the inverse transformation processing unit 212
- the reconstruction unit 314 can be functionally the same as the reconstruction unit 214
- the buffer 316 can be functionally identical to the corresponding buffer of the encoder 20.
- the loop filter 320 may be functionally the same as the loop filter 220
- the decoded picture buffer 330 may be functionally the same as the decoded picture buffer 230.
- the prediction processing unit 360 may include an inter prediction unit 344 and an intra prediction unit 354.
- the inter prediction unit 344 may be functionally similar to the inter prediction unit 244, and the intra prediction unit 354 may be functionally similar to the intra prediction unit 254.
- the prediction processing unit 360 is generally used to perform block prediction and/or obtain a prediction block 365 from the encoded data 21, and to receive or obtain (explicitly or implicitly) prediction-related parameters and/or information about the selected prediction mode, for example from the entropy decoding unit 304.
- the intra prediction unit 354 of the prediction processing unit 360 is used to generate a prediction block 365 for the picture block of the current video slice based on the signaled intra prediction mode and data from previously decoded blocks of the current frame or picture.
- the inter prediction unit 344 (for example, a motion compensation unit) of the prediction processing unit 360 is used to generate a prediction block 365 for the video block of the current video slice based on the motion vector and the other syntax elements received from the entropy decoding unit 304.
- a prediction block can be generated from a reference picture in a reference picture list.
- the video decoder 30 may use the default construction technique to construct a list of reference frames based on the reference pictures stored in the DPB 330: list 0 and list 1.
- the prediction processing unit 360 is configured to determine prediction information for the video block of the current video slice by parsing the motion vector and other syntax elements, and use the prediction information to generate the prediction block for the current video block being decoded.
- the prediction processing unit 360 uses some of the received syntax elements to determine the prediction mode (for example, intra or inter prediction), the inter prediction slice type (for example, B slice, P slice or GPB slice), construction information for one or more of the reference picture lists for the slice, the motion vector for each inter-coded video block of the slice, the inter prediction status of each inter-coded video block of the slice, and other information, in order to decode the video block of the current video slice.
- the syntax elements received by the video decoder 30 from the bitstream include syntax elements in one or more of an adaptive parameter set (APS), a sequence parameter set (SPS), a picture parameter set (PPS), or a slice header.
- APS adaptive parameter set
- SPS sequence parameter set
- PPS picture parameter set
- the inverse quantization unit 310 may be used to inverse quantize (that is, dequantize) the quantized transform coefficients provided in the bitstream and decoded by the entropy decoding unit 304.
- the inverse quantization process may include using the quantization parameter calculated by the video encoder 20 for each video block in the video slice to determine the degree of quantization that should be applied and also determine the degree of inverse quantization that should be applied.
- the inverse transform processing unit 312 is used to apply an inverse transform (for example, an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process) to transform coefficients so as to generate a residual block in the pixel domain.
- the reconstruction unit 314 (for example, the summer 314) is used to add the inverse transform block 313 (that is, the reconstructed residual block 313) to the prediction block 365 to obtain the reconstructed block 315 in the sample domain, for example by adding the sample values of the reconstructed residual block 313 to the sample values of the prediction block 365.
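The sample-wise addition performed by the reconstruction unit can be sketched as follows. This is a minimal illustration only: the function name `reconstruct_block`, the list-of-rows block layout, and the clipping of the sum to the range allowed by `bit_depth` are assumptions made for this example and are not stated in the text above.

```python
def reconstruct_block(residual, prediction, bit_depth=8):
    """Add residual samples to prediction samples, position by position.

    Clipping to [0, 2**bit_depth - 1] is an assumption of this sketch.
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(r + p, 0), max_val) for r, p in zip(res_row, pred_row)]
        for res_row, pred_row in zip(residual, prediction)
    ]


# Hypothetical 2x2 residual block and prediction block.
residual_313 = [[5, -10], [300, 0]]
prediction_365 = [[100, 4], [10, 7]]
reconstructed_315 = reconstruct_block(residual_313, prediction_365)
```

In this sketch the reconstructed block has the same dimensions as its inputs, and out-of-range sums are saturated rather than wrapped.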
- the loop filter unit 320 (in the coding loop or after the coding loop) is used to filter the reconstructed block 315 to obtain the filtered block 321, so as to smooth pixel transitions or otherwise improve video quality.
- the loop filter unit 320 may be used to perform any combination of the filtering techniques described below.
- the loop filter unit 320 is intended to represent one or more loop filters, such as deblocking filters, sample-adaptive offset (SAO) filters or other filters, such as bilateral filters, adaptive loop filters (ALF), sharpening or smoothing filters, or collaborative filters.
- although the loop filter unit 320 is shown as an in-loop filter in FIG. 3, in other configurations the loop filter unit 320 may be implemented as a post-loop filter.
- the decoded video block 321 in a given frame or picture is then stored in a decoded picture buffer 330 that stores reference pictures for subsequent motion compensation.
- the decoder 30 is used, for example, to output the decoded picture 31 through the output 332 for presentation or viewing by the user.
- the decoder 30 may generate an output video stream without the loop filter unit 320.
- the non-transform-based decoder 30 may directly inversely quantize the residual signal without the inverse transform processing unit 312 for certain blocks or frames.
- the video decoder 30 may have an inverse quantization unit 310 and an inverse transform processing unit 312 combined into a single unit.
- the decoder 30 is used to implement the video image decoding method described in the following embodiments.
- the video decoder 30 may generate an output video stream without processing by the filter 320; or, for some image blocks or image frames, the entropy decoding unit 304 of the video decoder 30 does not decode quantized coefficients, and accordingly processing by the inverse quantization unit 310 and the inverse transform processing unit 312 is not needed.
- the loop filter 320 is optional; and for lossless compression, the inverse quantization unit 310 and the inverse transform processing unit 312 are optional.
- the inter prediction unit and the intra prediction unit may be selectively activated.
- the processing result of a given stage can be further processed before being output to the next stage, for example, after stages such as interpolation filtering, motion vector derivation or loop filtering.
- operations such as clip or shift are further performed on the processing result of the corresponding stage.
- the motion vector of a control point of the current image block derived from the motion vector of an adjacent affine coding block, or the motion vector of a sub-block of the current image block derived from that motion vector, may undergo further processing, which is not limited in this application.
- restrict the value range of the motion vector so that it is within a certain bit width. Assuming that the allowed bit width of the motion vector is bitDepth, the range of the motion vector is -2^(bitDepth-1) to 2^(bitDepth-1)-1, where the "^" symbol represents exponentiation. If bitDepth is 16, the value range is -32768 to 32767. If bitDepth is 18, the value range is -131072 to 131071.
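The bit-width restriction can be illustrated with a short sketch. It assumes the restriction is applied by saturating (clipping) each motion vector component; the helper name `clip_mv` and the tuple representation of a motion vector are hypothetical choices for this example.

```python
def mv_range(bit_depth):
    """Return (min, max) of a signed value that fits in bit_depth bits."""
    return -(1 << (bit_depth - 1)), (1 << (bit_depth - 1)) - 1


def clip_mv(mv, bit_depth=16):
    """Clip each component of mv = (mv_x, mv_y) into the allowed range."""
    lo, hi = mv_range(bit_depth)
    return tuple(min(max(component, lo), hi) for component in mv)


clipped_16 = clip_mv((40000, -40000), bit_depth=16)
clipped_18 = clip_mv((131072, 5), bit_depth=18)
```

Clipping is only one way to enforce the range; the text does not exclude other approaches, such as discarding the overflowing high-order bits.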
- the value of the motion vector (for example, the motion vector MV of the four 4x4 sub-blocks in an 8x8 image block) is restricted, so that the maximum difference between the integer parts of the four 4x4 sub-blocks MV does not exceed N pixels, for example, no more than one pixel.
- FIG. 4 is a schematic structural diagram of a video coding device 400 (for example, a video encoding device 400 or a video decoding device 400) provided by an embodiment of the present application.
- the video coding device 400 is suitable for implementing the embodiments described herein.
- the video coding device 400 may be a video decoder (for example, the decoder 30 of FIG. 1A) or a video encoder (for example, the encoder 20 of FIG. 1A).
- the video coding device 400 may be one or more components of the decoder 30 in FIG. 1A or the encoder 20 in FIG. 1A described above.
- the video coding device 400 includes: an ingress port 410 and a receiver unit (Rx) 420 for receiving data; a processor, logic unit or central processing unit (CPU) 430 for processing data; a transmitter unit (Tx) 440 (or simply transmitter 440) and an egress port 450 for transmitting data; and a memory 460 for storing data.
- the video coding device 400 may also include optical-to-electrical conversion components and electro-optical (EO) components coupled to the ingress port 410, the receiver unit 420 (or simply receiver 420), the transmitter unit 440 and the egress port 450, serving as the egress or ingress for optical or electrical signals.
- EO electro-optical
- the processor 430 is implemented by hardware and software.
- the processor 430 may be implemented as one or more CPU chips, cores (e.g., multi-core processors), FPGA, ASIC, and DSP.
- the processor 430 communicates with the ingress port 410, the receiver unit 420, the transmitter unit 440, the egress port 450, and the memory 460.
- the processor 430 includes a coding module 470 (for example, an encoding module 470 or a decoding module 470).
- the encoding/decoding module 470 implements the embodiments disclosed herein to implement the chroma block prediction method provided in the embodiments of the present application. For example, the encoding/decoding module 470 implements, processes, or provides various encoding operations.
- the encoding/decoding module 470 provides a substantial improvement to the function of the video coding device 400 and effects the transformation of the video coding device 400 to a different state.
- the encoding/decoding module 470 is implemented by instructions stored in the memory 460 and executed by the processor 430.
- the memory 460 includes one or more magnetic disks, tape drives, and solid-state hard disks, and can be used as an overflow data storage device for storing programs when these programs are selectively executed, and storing instructions and data read during program execution.
- the memory 460 may be volatile and/or non-volatile, and may be read-only memory (ROM), random access memory (RAM), ternary content-addressable memory (TCAM), and/or static random access memory (SRAM).
- FIG. 5 is a simplified block diagram of an apparatus 500 that can be used as either or both of the source device 12 and the destination device 14 in FIG. 1A according to an exemplary embodiment.
- the device 500 can implement the technology of the present application.
- FIG. 5 is a schematic block diagram of an implementation manner of an encoding device or a decoding device (referred to as a decoding device 500 for short) according to an embodiment of the application.
- the decoding device 500 may include a processor 510, a memory 530, and a bus system 550.
- the processor and the memory are connected through a bus system, the memory is used to store instructions, and the processor is used to execute instructions stored in the memory.
- the memory of the decoding device stores program codes, and the processor can call the program codes stored in the memory to execute the various video image encoding or decoding methods described in this application, especially the video encoding or decoding methods in the various inter prediction modes or intra prediction modes. To avoid repetition, they are not described in detail here.
- the processor 510 may be a central processing unit (CPU), and the processor 510 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
- the memory 530 may include a read only memory (ROM) device or a random access memory (RAM) device. Any other suitable type of storage device can also be used as the memory 530.
- the memory 530 may include code and data 531 accessed by the processor 510 using the bus 550.
- the memory 530 may further include an operating system 533 and an application program 535.
- the application program 535 includes at least one program that allows the processor 510 to execute the video encoding or decoding method described in this application (especially the video image prediction method described in this application).
- the application program 535 may include applications 1 to N, which further include a video encoding or decoding application (referred to as a video coding application) that executes the video encoding or decoding method described in this application.
- the bus system 550 may also include a power bus, a control bus, and a status signal bus. However, for clear description, various buses are marked as the bus system 550 in the figure.
- the decoding device 500 may further include one or more output devices, such as a display 570.
- the display 570 may be a touch-sensitive display that merges the display with a touch-sensitive unit operable to sense touch input.
- the display 570 may be connected to the processor 510 via the bus 550.
- in merge mode, a candidate motion list (also referred to as a candidate list) is first constructed based on the motion information of coded blocks that neighbor the current block in the spatial or temporal domain; the candidate motion information with the least rate-distortion cost in the candidate motion list is used as the motion vector predictor (MVP) of the current block, and the index value of the position of the optimal candidate motion information in the candidate motion list (for example, denoted as merge index, the same below) is then passed to the decoding end.
- merge index the index value of the position of the optimal candidate motion information in the candidate motion list
- the rate-distortion cost is calculated by formula (1), J = SAD + λR, where J represents the rate-distortion cost (RD cost), SAD is the sum of absolute differences between the predicted pixel values obtained after motion estimation using the candidate motion vector predictor and the original pixel values, R represents the bit rate, and λ represents the Lagrangian multiplier.
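As a rough illustration of how the rate-distortion cost drives candidate selection, the sketch below evaluates the cost for each candidate and returns the index with the smallest value. It assumes the usual Lagrangian form J = SAD + λ·R built from the symbols defined above; the function names, the representation of a candidate as a `(predicted_block, rate_in_bits)` pair, and the sample values are hypothetical.

```python
def sad(predicted, original):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(p - o)
        for pred_row, orig_row in zip(predicted, original)
        for p, o in zip(pred_row, orig_row)
    )


def rd_cost(predicted, original, rate_bits, lagrange_multiplier):
    """J = SAD + lambda * R (assumed form of the cost)."""
    return sad(predicted, original) + lagrange_multiplier * rate_bits


def best_candidate_index(candidates, original, lagrange_multiplier):
    """Return the index of the candidate with the least RD cost.

    candidates: list of (predicted_block, rate_bits) pairs.
    """
    costs = [
        rd_cost(pred, original, rate, lagrange_multiplier)
        for pred, rate in candidates
    ]
    return min(range(len(costs)), key=costs.__getitem__)


original_block = [[10, 10], [10, 10]]
candidates = [
    ([[10, 10], [10, 12]], 8),   # small distortion, cheap to signal
    ([[10, 10], [10, 10]], 20),  # no distortion, expensive to signal
]
index_high_lambda = best_candidate_index(candidates, original_block, 0.5)
index_low_lambda = best_candidate_index(candidates, original_block, 0.1)
```

The two calls show the trade-off: a larger multiplier penalises rate more heavily and favours the cheaper candidate, while a smaller multiplier favours the candidate with less distortion.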
- the encoding end transmits the index value of the selected motion vector predictor in the candidate motion list to the decoding end. Further, the motion search is performed in the neighborhood centered on the MVP to obtain the actual motion vector of the current block, and the encoding end transmits the difference (motion vector difference) (ie, residual) between the MVP and the actual motion vector to the decoding end.
- the current block spatial and temporal candidate motion information is shown in Figure 6.
- the spatial candidate motion information comes from 5 spatially adjacent blocks (A0, A1, B0, B1 and B2), see Figure 6. If an adjacent block is unavailable (the neighboring block does not exist, the neighboring block is not coded, or the prediction mode adopted by the neighboring block is not an inter prediction mode), the motion information of that neighboring block is not added to the candidate motion list.
- the temporal candidate motion information of the current block is obtained by scaling the MV of the corresponding block in the reference frame according to the picture order count (POC) of the reference frame and the current frame. First, determine whether the block at position T in the reference frame is available, and if not, select the block at position C in the reference frame.
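The temporal candidate derivation can be sketched as below. This is a simplified illustration, not the fixed-point procedure of any particular standard: it assumes the motion vector is scaled linearly by the ratio of POC distances, and the names `pick_collocated_block` and `scale_temporal_mv`, as well as the use of `None` to mark an unavailable block, are hypothetical.

```python
def pick_collocated_block(block_t, block_c):
    """Use the block at position T if available, otherwise the one at C."""
    return block_t if block_t is not None else block_c


def scale_temporal_mv(col_mv, cur_poc, cur_ref_poc, col_poc, col_ref_poc):
    """Scale the collocated MV by the ratio of POC distances.

    tb: POC distance between the current picture and its reference.
    td: POC distance between the collocated picture and its reference.
    """
    tb = cur_poc - cur_ref_poc
    td = col_poc - col_ref_poc
    if td == 0:
        return col_mv  # nothing to scale against in this sketch
    return tuple(round(component * tb / td) for component in col_mv)


# T is unavailable here, so the motion vector stored at C is used.
col_mv = pick_collocated_block(None, (8, -4))
temporal_candidate = scale_temporal_mv(
    col_mv, cur_poc=4, cur_ref_poc=0, col_poc=8, col_ref_poc=0
)
```

In this example the current picture is half as far from its reference as the collocated picture is from its own, so the motion vector is halved.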
- POC picture order count
- the position and traversal order of neighboring blocks in merge mode are also predefined, and the position and traversal order of neighboring blocks may be different in different modes.
- a list of candidate motions needs to be maintained in the merge mode. Before adding new motion information to the candidate list, it will first check whether the same motion information already exists in the list, and if it does, the motion information will not be added to the list. We call this checking process the pruning of the candidate motion list. List pruning is to prevent the same motion information from appearing in the list and avoid redundant rate-distortion cost calculation.
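The pruning check described above can be sketched as follows. The representation of motion information as a tuple, the function name `add_with_pruning`, and the `max_size` limit are assumptions made for this example.

```python
def add_with_pruning(candidate_list, motion_info, max_size=6):
    """Append motion_info unless the list is full or already contains it.

    Returns True if the candidate was added, False if it was pruned.
    """
    if len(candidate_list) >= max_size:
        return False
    if motion_info in candidate_list:
        return False  # identical motion information already present
    candidate_list.append(motion_info)
    return True


# Hypothetical motion information: (mv_x, mv_y, reference_index).
candidate_list = []
results = [
    add_with_pruning(candidate_list, (4, -2, 0)),
    add_with_pruning(candidate_list, (4, -2, 0)),  # duplicate, pruned
    add_with_pruning(candidate_list, (4, -2, 1)),  # differs in reference index
]
```

Because duplicates never enter the list, the rate-distortion cost is evaluated at most once per distinct piece of motion information.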
- MMVD makes use of merge candidates. Select one or more candidate motion information from the merge candidate motion list, and then perform motion vector (MV) extended expression based on the candidate motion information.
- MV expansion expression includes MV starting point, movement step length and movement direction.
- the selected candidate motion vector is the default merge type (for example, MRG_TYPE_DEFAULT_N).
- the selected candidate motion vector is the starting point of the MV, in other words, the selected candidate motion vector is used to determine the initial position of the MV.
- the basic candidate index indicates which candidate motion vector in the candidate motion list is selected as the optimal candidate motion vector.
- the Basecandidate IDX may not be determined.
- the first candidate motion information in the candidate motion list is used as the selected candidate motion information.
- the step identifier represents the offset distance information of the motion vector.
- the value of the step size represents the distance from the initial position (for example, the preset distance), and the definition of the preset distance is shown in Table 2.
- the direction IDX indicates the direction of the motion vector difference (MVD) based on the initial position.
- the direction indicator can include four situations in total, see Table 3 for specific definitions.
- the solid line is the corresponding position of the motion vector in the L0 reference frame and the L1 reference frame of the bidirectional prediction in the motion vector starting point
- the dashed line is the position pointed to by the motion vector combined with the MVD; the difference between the two vectors is the MVD.
- the motion vector difference can be determined based on Distance IDX and Direction IDX.
- the solid black dots are the positions pointed to by the peripheral offset motion vectors (the motion vector value at the starting point of the motion vector plus the MVD) at one times the distance (shown in Table 2)
- the hollow dots are the positions pointed to by the peripheral offset motion vectors (the motion vector value at the starting point of the motion vector plus the MVD) at twice the distance.
- the process of determining the predicted pixel value of the current image block according to the MMVD method may include:
- the solid line is the bidirectional prediction in the L0 reference frame and the L1 reference frame.
- determine which direction to shift based on the starting point of the MV and then determine how many pixels to shift in the direction indicated by the Direction IDX based on the Distance IDX.
- the motion vector difference (MVD) can be determined based on the Direction IDX and the Distance IDX, and then the motion vector predictor identified by the Basecandidate IDX is added to the determined MVD to obtain the motion vector predictor required for decoding.
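The derivation of the MVD from the Distance IDX and the Direction IDX, and its addition to the base candidate, can be sketched as follows. Tables 2 and 3 are not reproduced in this section, so the distance and direction tables below are assumed, illustrative values (distances expressed in quarter-pixel units, four axis-aligned directions); the function names are likewise hypothetical.

```python
# Assumed stand-in for Table 2: offset distance in quarter-pixel units.
ASSUMED_DISTANCES_QPEL = [1, 2, 4, 8, 16, 32, 64, 128]
# Assumed stand-in for Table 3: sign of the offset along x and y.
ASSUMED_DIRECTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def mmvd_offset(distance_idx, direction_idx):
    """Build the MVD from the distance index and the direction index."""
    step = ASSUMED_DISTANCES_QPEL[distance_idx]
    sign_x, sign_y = ASSUMED_DIRECTIONS[direction_idx]
    return (sign_x * step, sign_y * step)


def mmvd_motion_vector(candidate_list, base_idx, distance_idx, direction_idx):
    """Add the MVD to the base candidate selected by base_idx."""
    base_x, base_y = candidate_list[base_idx]
    mvd_x, mvd_y = mmvd_offset(distance_idx, direction_idx)
    return (base_x + mvd_x, base_y + mvd_y)


candidate_list = [(10, -3), (0, 0)]
mvd = mmvd_offset(distance_idx=2, direction_idx=1)
final_mv = mmvd_motion_vector(candidate_list, 0, distance_idx=3, direction_idx=2)
```

Only three small indices need to be signalled in this scheme, and the decoder rebuilds the full motion vector from the candidate list and the two lookup tables.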
- the candidate motion information may include the forward motion vector predictor and the backward motion vector predictor.
- the forward motion vector predicted value and the backward motion vector predicted value may be predicted values obtained by forward and backward prediction with reference to the two reference frame lists List0 and List1.
- the candidate motion information may include the forward and backward motion vector predictors and the picture order count (POC) corresponding to the forward and backward reference prediction blocks.
- POC PictureOrderCount
- At least one means one or more, and “multiple” means two or more.
- “And/or” describes the association relationship of the associated objects, indicating that there can be three relationships, for example, A and/or B, which can mean: A alone exists, A and B exist at the same time, and B exists alone, where A, B can be singular or plural.
- the character “/” generally indicates that the associated objects are in an “or” relationship.
- “At least one of the following items” or similar expressions refers to any combination of these items, including any combination of a single item or plural items.
- At least one of a, b, or c can represent: a; b; c; a and b; a and c; b and c; or a, b and c, where a, b, and c can each be single or multiple.
- first, second, etc. may be used in the embodiments of the present application to describe each object (such as a motion vector prediction value, a reference prediction block, etc.), these terms are only used to distinguish each object from each other.
- when existing inter-frame prediction adopts the MMVD mode in the case of bidirectional prediction, the encoding and decoding accuracy is low because the matching relationship between the forward and backward reference prediction blocks is not sufficiently used.
- the embodiments of the present application provide a video image prediction method and device.
- the MMVD mode can be combined with a motion vector correction (or motion vector refinement) method, such as the decoder-side motion vector refinement (DMVR) method, to correct the two bidirectional motion vector predictors; the decoding operation is then performed based on the corrected motion vector predictors, thereby improving decoding accuracy.
- DMVR decoder-side motion vector refinement
- the method and the device are based on the same inventive concept. Since the principles by which the method and the device solve the problem are similar, the implementations of the device and the method can be referred to each other, and repeated descriptions are omitted.
- the embodiments of the present application illustrate the following two methods, using the MMVD method combined with the motion vector correction method for prediction:
- the first possible implementation manner: after the initial motion vector predictor is determined based on the candidate index, in the case of bidirectional prediction, the initial forward and backward motion vector predictors are corrected based on the motion vector correction method, and the current image block to be processed is decoded based on the motion vector predictor obtained by combining the corrected motion vector predictor with the MVD.
- the second possible implementation manner: after the initial motion vector predictor is determined based on the candidate index, in the case of bidirectional prediction, the initial forward and backward motion vector predictors are corrected based on the motion vector correction method, and the current image block to be processed is decoded based on the motion vector predictor obtained by combining the corrected motion vector predictor with the MVD.
- the two implementation manners described above may be specifically executed by a video codec device, a video codec, a video codec system, and other devices with a video codec function.
- the two implementation manners described above can occur both in the encoding process and the decoding process. More specifically, the two implementation manners described above can occur in the inter-frame prediction process during encoding and decoding.
- see FIG. 8 for a schematic flowchart of the first possible implementation manner of the video image prediction provided by this application.
- S801 Determine the first initial motion vector prediction value, the second initial motion vector prediction value, the first motion vector difference, and the second motion vector difference of the current image block to be processed.
- the first initial motion vector prediction value corresponds to the initial motion vector prediction value of the first list (that is, list0), and accordingly, the second initial motion vector prediction value corresponds to the initial motion vector prediction value of the second list (that is, list1).
- the first initial motion vector prediction value corresponds to the initial motion vector prediction value in the first direction (for example, forward), and correspondingly, the second initial motion vector prediction value corresponds to the initial motion vector prediction value in the second direction (for example, backward); this application does not limit this.
- the current image to which the current image block to be processed belongs in the embodiment of the present application has two reference images, one before it and one after it, which are the first reference image (such as the forward reference image) and the second reference image (such as the backward reference image) respectively. That is, the first initial motion vector prediction value may be the initial forward motion vector prediction value in the forward prediction direction, and the second initial motion vector prediction value may be the initial backward motion vector prediction value in the backward prediction direction.
- the sum of the first modified motion vector prediction value and the first motion vector difference may be used as the first motion vector prediction value, and the sum of the second modified motion vector prediction value and the second motion vector difference may be used as the second motion vector prediction value.
- S804 Predict the current image block to be processed according to the first motion vector prediction value and the second motion vector prediction value.
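The way the modified (refined) prediction values and the motion vector differences are combined before step S804 can be sketched as follows. The refinement itself is not specified in this part of the text, so `refine` is a hypothetical callback standing in for a motion vector correction method such as DMVR; `toy_refine` is a dummy used only to make the example executable, and all names are assumptions.

```python
def add_mv(a, b):
    """Component-wise sum of two motion vectors."""
    return (a[0] + b[0], a[1] + b[1])


def final_predictors(init_mvp0, init_mvp1, mvd0, mvd1, refine):
    """Refine both initial predictors, then add the matching MVD to each.

    refine: hypothetical callback taking (mvp0, mvp1) and returning the
    modified pair; it stands in for the motion vector correction step.
    """
    modified_mvp0, modified_mvp1 = refine(init_mvp0, init_mvp1)
    return add_mv(modified_mvp0, mvd0), add_mv(modified_mvp1, mvd1)


def toy_refine(mvp0, mvp1):
    """Dummy refinement: mirrored one-unit horizontal correction."""
    return (mvp0[0] + 1, mvp0[1]), (mvp1[0] - 1, mvp1[1])


mvp_first, mvp_second = final_predictors(
    (4, 4), (-4, -4), (2, 0), (-2, 0), toy_refine
)
```

Passing the refinement as a callback keeps the combination step independent of whichever correction method is actually used.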
- the current image block to be processed may be a sub-block after the current block is divided, or the current block.
- the image to be processed is divided into 16×16 image blocks for encoding and decoding.
- inter-frame prediction can be performed for 16×16 image blocks.
- the 16×16 image block can also be further divided, for example, into 16 4×4 sub-blocks, and the image prediction method provided in the embodiment of the present application is used for each sub-block to perform inter-frame prediction.
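The division of a 16×16 image block into 4×4 sub-blocks can be sketched as follows. The function name, the raster scan order, and the `(x, y, width, height)` tuple layout are assumptions made for this example.

```python
def split_into_subblocks(x0, y0, width=16, height=16, sub_size=4):
    """List the sub-blocks of a block as (x, y, width, height) tuples.

    Sub-blocks are listed in raster order (left to right, top to bottom).
    """
    return [
        (x0 + dx, y0 + dy, sub_size, sub_size)
        for dy in range(0, height, sub_size)
        for dx in range(0, width, sub_size)
    ]


sub_blocks = split_into_subblocks(0, 0)
```

Each returned tuple can then be handed to the per-sub-block prediction, so that every sub-block is refined independently.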
- the forward and backward motion vector predictors and the forward and backward motion vector differences of multiple sub-blocks belonging to the same image block are the same, but the forward and backward motion vector predictors of each sub-block after refinement will generally be different.
- the execution of the solution provided in the embodiments of the application may have trigger conditions, for example, it is determined that the current image block to be processed uses the MMVD method for inter-frame prediction, and then the execution is started.
- that is, step S801 is performed to determine the first initial motion vector predictor, the second initial motion vector predictor, the first motion vector difference, and the second motion vector difference of the current image block to be processed. When it is determined not to adopt the MMVD method, other methods can be used for inter-frame prediction.
- the first flag (such as mmvd_flag[x0][y0]) is parsed from the code stream; when the first flag indicates that the merge with motion vector difference (MMVD) method is used for inter-frame prediction of the current image block to be processed, the first initial motion vector predictor, the second initial motion vector predictor, the first motion vector difference, and the second motion vector difference of the current image block to be processed are determined.
- the first flag such as mmvd_flag[x0][y0]
- the first flag may also be called mmvd_flag[x0][y0], and the above name is also used in the standard text or code.
- When mmvd_flag[x0][y0] is the first value, it indicates that inter-frame prediction of the current image block to be processed uses the MMVD method; when mmvd_flag[x0][y0] is the second value, it indicates that inter-frame prediction of the current image block to be processed does not use the MMVD method.
- the first value can be 1 (or true), and the second value can be 0 (or false).
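The flag semantics above can be sketched as a small dispatch. This is only an illustration, assuming the first value is 1 and the second value is 0 as suggested above; the function name `choose_inter_mode` and its return strings are hypothetical and do not come from any standard text.

```python
def choose_inter_mode(mmvd_flag: int) -> str:
    """Map a parsed mmvd_flag[x0][y0] value to a prediction path.

    Assumption: first value = 1 (MMVD is used), second value = 0 (MMVD is not used).
    """
    if mmvd_flag == 1:
        return "mmvd"    # determine initial predictors and motion vector differences
    if mmvd_flag == 0:
        return "other"   # fall back to another inter-frame prediction method
    raise ValueError(f"unexpected mmvd_flag value: {mmvd_flag}")
```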
- a candidate list is constructed according to the motion information of neighboring blocks of the current image block to be processed, and a certain candidate motion information is selected from the candidate list as the predicted motion information of the current image block to be processed.
- the motion vector predictor of the neighboring block A0 is selected as the predicted motion information of the current image block.
- The forward motion vector of A0 is used as the forward predictive motion vector of the current block.
- The backward motion vector of A0 is used as the backward predictive motion vector of the current block.
- the constructed candidate motion vector list may include multiple candidate motion information, or may only include one candidate motion information.
- The code stream includes the candidate index, so that during decoding, before the candidate motion information of the current block to be processed is determined, the candidate index is parsed from the code stream and the corresponding candidate motion information is determined from the candidate list according to the candidate index.
- the candidate motion information corresponding to the candidate index includes a third motion vector predictor and a fourth motion vector predictor.
- the third motion vector predictor is used as the first initial motion vector predictor
- The fourth motion vector predictor is used as the second initial motion vector predictor.
- When the candidate list includes multiple candidate motion information, then during encoding, when candidate motion information is selected for the current image block to be processed, the candidate motion information with the lowest rate-distortion cost in the candidate list can be used as the motion vector predictor of the current block.
- The candidate index may be omitted from the code stream during encoding, in which case the third motion vector predictor and the fourth motion vector predictor included in the candidate motion information at the first position in the candidate list are taken as the first initial motion vector predictor and the second initial motion vector predictor.
- the candidate list includes multiple candidate motion information.
- The candidate index may not be written into the code stream; when no candidate index is present in the code stream during decoding, the third motion vector predictor and the fourth motion vector predictor included in the candidate motion information at the first position in the candidate list are directly determined to be the first initial motion vector predictor and the second initial motion vector predictor.
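The selection rule described above (use the signalled candidate index when present, otherwise fall back to the first position in the candidate list) can be sketched as follows. The names `select_initial_mvps`, `MV` and `Candidate` are hypothetical, and a candidate is assumed to be a simple pair of (third predictor, fourth predictor).

```python
from typing import List, Optional, Tuple

MV = Tuple[int, int]
Candidate = Tuple[MV, MV]  # assumed layout: (third MVP, fourth MVP)


def select_initial_mvps(candidates: List[Candidate],
                        candidate_index: Optional[int] = None) -> Candidate:
    """Return (first initial MVP, second initial MVP) for the current block."""
    if not candidates:
        raise ValueError("candidate list is empty")
    # No index in the code stream: use the candidate at the first position.
    index = 0 if candidate_index is None else candidate_index
    first_initial, second_initial = candidates[index]
    return first_initial, second_initial
```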
- Method 2 Construct a candidate list based on the motion information of the neighboring blocks of the current image block to be processed.
- the candidate list is constructed by using the MVs of the neighboring blocks that have been previously encoded or decoded.
- the neighboring blocks may be provided in accordance with this application.
- The candidate motion information corresponding to the candidate index may include only the original candidate motion information (candidate motion information obtained without motion vector refinement processing), or it may include both the original candidate motion information and the refined candidate motion information.
- L0 represents the first list (list0)
- L1 represents the second list (list1)
- mvL0_A, ref0 and mvL1_A, ref1 indicated by candidate index 1 represent the candidate motion information after refinement processing.
- mvL0_D, ref0 and mvL1_D, ref1 indicated by candidate index 0 represent the original candidate motion information.
- mvL0_B, ref0 and mvL1_B, ref1 indicated by candidate index 1 represent the candidate motion information after refinement processing.
- mvL0_C, ref0 and mvL1_C, ref1 indicated by candidate index 1 represent the original candidate motion information.
- The code stream includes the candidate index, so that during decoding, before the candidate motion information of the current block to be processed is determined, the candidate index is parsed from the code stream and the corresponding candidate in the candidate list is determined according to the candidate index.
- The candidate includes two pieces of candidate motion information: the first candidate motion information (refined candidate motion information) and the second candidate motion information (non-refined candidate motion information).
- For example, when the candidate index is 0, the first candidate motion information includes mvL0_B, ref0 and mvL1_B, ref1, and the second candidate motion information includes mvL0_C, ref0 and mvL1_C, ref1.
- the two motion vector predictors included in the refined candidate motion information can be selected as the first initial motion vector predictor and the second initial motion vector predictor.
- Alternatively, the two motion vector predictors included in the non-refined candidate motion information can be selected as the first initial motion vector predictor and the second initial motion vector predictor.
- It can also be determined whether the image block to which the candidate belongs and the current image block to be processed belong to different images. When they belong to different images, the two motion vector predictors included in the refined candidate motion information can be selected as the first initial motion vector predictor and the second initial motion vector predictor.
- When they belong to the same image, the two motion vector predictors included in the non-refined candidate motion information can be selected as the first initial motion vector predictor and the second initial motion vector predictor.
- The candidate motion information (first candidate motion information or second candidate motion information) corresponding to the candidate is the motion information from the T1 pixel position of the temporal neighboring block of the current image block to be processed; that is, the T1 pixel position is outside the image where the current image block to be processed is located.
- the candidate motion information corresponding to the candidate item is the motion information from the A0 pixel position of the spatial neighboring block of the current image block to be processed, that is, the A0 pixel position is located in the image where the current image block to be processed is located.
- The candidate index may not be written into the code stream, in which case the candidate at the first position in the candidate list is determined.
- Each candidate includes two pieces of candidate motion information: the first candidate motion information (refined, or modified, candidate motion information) and the second candidate motion information (non-refined candidate motion information). In one example, the two motion vector predictors included in the refined candidate motion information are selected as the first initial motion vector predictor and the second initial motion vector predictor. In another example, the two motion vector predictors included in the non-refined candidate motion information are selected as the first initial motion vector predictor and the second initial motion vector predictor.
- In yet another example, it is determined whether the image block to which the candidate belongs and the current image block to be processed belong to different images. When they belong to different images, the two motion vector predictors included in the refined candidate motion information can be selected as the first initial motion vector predictor and the second initial motion vector predictor. When they belong to the same image, the two motion vector predictors included in the non-refined candidate motion information can be selected as the first initial motion vector predictor and the second initial motion vector predictor.
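The choice between refined and non-refined candidate motion information, based on whether the candidate's block lies in a different image, can be sketched as below. The `CandidateItem` class, its field names, and the use of an integer picture identifier are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Tuple

MV = Tuple[int, int]


@dataclass
class CandidateItem:
    refined: Tuple[MV, MV]    # first candidate motion information (refined)
    original: Tuple[MV, MV]   # second candidate motion information (non-refined)
    source_picture: int       # hypothetical id of the image the candidate's block belongs to


def pick_initial_mvps(item: CandidateItem, current_picture: int) -> Tuple[MV, MV]:
    """Refined pair for a block in a different image, non-refined pair otherwise."""
    if item.source_picture != current_picture:
        return item.refined
    return item.original
```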
- this article exemplifies a partial syntax structure for parsing the inter-frame prediction mode (including parsing the first identifier and the candidate index) used in the current image block to be processed, as shown in Table 4.
- mmvd_flag[x0][y0] corresponds to the first flag
- mmvd_merge_flag[x0][y0] in Table 4 can also be called mmvd_merge_idx[x0][y0]
- mmvd_merge_idx[x0][y0] is used to indicate the base candidate index selected in the MMVD candidate motion vector list; mmvd_merge_flag[x0][y0] or mmvd_merge_idx[x0][y0] corresponds to the candidate index mentioned in this embodiment of the application.
- mmvd_distance_idx[x0][y0] is used to indicate the distance index of the offset initial position.
- mmvd_direction_idx[x0][y0] is used to indicate the direction of the initial position MVD.
- The first motion vector difference and the second motion vector difference mentioned in the embodiment of the present application may be determined according to mmvd_distance_idx[x0][y0] and mmvd_direction_idx[x0][y0].
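As an illustration of how a distance index and a direction index can be combined into one motion vector difference, the sketch below uses lookup tables in the style of the VVC draft (distances in quarter-sample units, four axis-aligned directions). The tables and the name `mmvd_offset` are assumptions, since this text does not list them, and how the second motion vector difference is derived from the first (for example by mirroring or scaling) is not covered here.

```python
from typing import Tuple

# Assumed tables, not stated in this text: distances in quarter-sample units
# and the sign of the offset along (x, y) for each direction index.
MMVD_DISTANCE_QPEL = [1, 2, 4, 8, 16, 32, 64, 128]
MMVD_DIRECTION = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def mmvd_offset(distance_idx: int, direction_idx: int) -> Tuple[int, int]:
    """Combine mmvd_distance_idx and mmvd_direction_idx into an (x, y) offset."""
    step = MMVD_DISTANCE_QPEL[distance_idx]
    sign_x, sign_y = MMVD_DIRECTION[direction_idx]
    return (sign_x * step, sign_y * step)
```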
- Step S802 may have a starting condition. For example, when the image block to which the candidate motion information selected from the candidate list belongs and the current image block to be processed belong to different images, the motion vector correction process is performed according to the first initial motion vector predictor and the second initial motion vector predictor.
- Otherwise, non-refinement processing can be performed directly. Specifically, the first target motion vector predictor is determined according to the first initial motion vector predictor and the first motion vector difference, the second target motion vector predictor is determined according to the second initial motion vector predictor and the second motion vector difference, and the current image block to be processed is predicted according to the first target motion vector predictor and the second target motion vector predictor.
- There may be multiple ways to perform the correction process in step S802.
- the following examples describe two possible motion vector correction methods.
- other possible motion vector correction methods can also be used in the embodiment of the present application, which will not be repeated in this application.
- A1: Determine the predicted motion information of the current image block to be processed.
- The predicted motion information includes the initial forward motion vector predictor and the initial backward motion vector predictor.
- A2: The forward reference prediction block of the current image block to be processed is obtained from the forward reference image by motion compensation.
- A3: The backward reference prediction block of the current image block to be processed is obtained from the backward reference image by motion compensation.
- A4: Determine the difference between the forward reference prediction block obtained in A2 and the backward reference prediction block obtained in A3 according to the pixel values of the two blocks.
- A5: In the forward reference image, the forward reference prediction block obtained in A2 is used as the starting point to perform a motion search with integer-pixel or sub-pixel steps.
- The starting point may be an integer pixel or a sub-pixel (1/2 pixel, 1/4 pixel, 1/8 pixel, 1/16 pixel, and so on); in either case an integer-pixel step motion search is performed to obtain at least one forward prediction block of the currently decoded block.
- the (0,0) point position is the search starting point.
- the search is performed at 8 full-pixel step search points around the search starting point to obtain 8 forward reference prediction blocks.
- the search method used is not limited, and any search method may be used.
- A6: Similar to A5, in the backward reference image described in A3, the backward reference prediction block obtained in A3 is used as the starting point to perform a motion search with an integer-pixel step to obtain 8 backward reference prediction blocks.
- The difference between each forward reference prediction block and the corresponding backward reference prediction block is calculated to obtain 8 difference values, and the difference value corresponding to the search starting point is also determined.
- The forward and backward reference prediction blocks corresponding to the smallest difference are the best forward reference prediction block and the best backward reference prediction block.
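Steps A4 to A6 and the final selection can be sketched as a single search round over the starting point and its 8 integer-pixel neighbours. This is a minimal sketch under stated assumptions: images are plain lists of rows, the difference is measured with the sum of absolute differences, and each forward offset is paired with the mirrored backward offset. The mirrored pairing and all names (`search_best_pair`, `block_at`, `sad`) are illustrative choices, not something this text prescribes. The caller is expected to keep all positions inside both images.

```python
from typing import List, Tuple

Image = List[List[int]]
Pos = Tuple[int, int]  # (x, y) of the top-left sample of a block

# The 8 integer-pixel positions around the search starting point (0, 0).
NEIGHBOURS = [(-1, -1), (0, -1), (1, -1), (-1, 0), (1, 0), (-1, 1), (0, 1), (1, 1)]


def block_at(image: Image, pos: Pos, size: int) -> Image:
    """Cut a size x size block out of the image (no bounds checking)."""
    x, y = pos
    return [row[x:x + size] for row in image[y:y + size]]


def sad(a: Image, b: Image) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def search_best_pair(fwd_img: Image, bwd_img: Image,
                     fwd_pos: Pos, bwd_pos: Pos, size: int) -> Tuple[Pos, Pos, int]:
    """Return the forward/backward block positions with the smallest difference."""
    best_cost = sad(block_at(fwd_img, fwd_pos, size), block_at(bwd_img, bwd_pos, size))
    best = (fwd_pos, bwd_pos)
    for dx, dy in NEIGHBOURS:
        f = (fwd_pos[0] + dx, fwd_pos[1] + dy)
        b = (bwd_pos[0] - dx, bwd_pos[1] - dy)  # assumption: mirrored offset
        cost = sad(block_at(fwd_img, f, size), block_at(bwd_img, b, size))
        if cost < best_cost:
            best_cost, best = cost, (f, b)
    return best[0], best[1], best_cost
```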
- the first initial motion vector prediction value corresponds to the first reference prediction block
- the second initial motion vector prediction value corresponds to the second reference prediction block.
- a first modified reference prediction block is determined according to the first reference prediction block
- a second modified reference prediction block is determined according to the second reference prediction block.
- The difference between the first modified reference prediction block and the second modified reference prediction block is less than or equal to the difference between the first reference prediction block and the second reference prediction block.
- The first modified reference prediction block is an image block in a first preset area that has the same size as the first reference prediction block, and the first preset area includes the first reference prediction block.
- The second modified reference prediction block is an image block in a second preset area that has the same size as the second reference prediction block, and the second preset area includes the second reference prediction block. The first modified reference prediction block corresponds to the first modified motion vector predictor, and the second modified reference prediction block corresponds to the second modified motion vector predictor.
- Because the first reference prediction block and the second reference prediction block appear in pairs, for convenience of description they are referred to as the first reference prediction block pair.
- The motion vector correction process takes the reference prediction block pair consisting of the two reference prediction blocks as the starting search point and searches the surrounding reference prediction block pairs for the pair with the smallest difference.
- the first modified reference prediction block is determined according to the first reference prediction block
- the second modified reference prediction block is determined according to the second reference prediction block
- The first reference prediction block pair includes the first reference prediction block and the second reference prediction block.
- The second reference prediction block pair includes a third reference prediction block and a fourth reference prediction block; the third reference prediction block is obtained by a motion search in the first preset area based on the first reference prediction block, and the fourth reference prediction block is obtained by a motion search in the second preset area based on the second reference prediction block.
- When the motion search based on the first reference prediction block pair is performed in B1, the search may proceed in integer-pixel or sub-pixel steps to obtain at least one second reference prediction block pair.
- the sub-pixels can be 1/2 pixels, 1/4 pixels, 1/8 pixels, or 1/16 pixels.
- When the third reference prediction block is compared with the fourth reference prediction block, the sum of the absolute values of the pixel differences between the two image blocks may be used as the difference value between the third reference prediction block and the fourth reference prediction block.
- Alternatively, the sum of the squares of the pixel differences between the two image blocks may be used as the difference value between the third reference prediction block and the fourth reference prediction block; the method of comparing the difference is not specifically limited.
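The two difference measures mentioned above can be written down directly. The function names `sad` and `ssd` are illustrative, and blocks are assumed to be equally sized lists of rows of integer pixel values.

```python
from typing import List

Block = List[List[int]]


def sad(a: Block, b: Block) -> int:
    """Sum of the absolute values of the pixel differences."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def ssd(a: Block, b: Block) -> int:
    """Sum of the squares of the pixel differences."""
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))
```

The squared measure penalises a few large pixel errors more heavily than the absolute measure, which is one reason a codec might prefer one over the other.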
- The third reference prediction block pair includes a fifth reference prediction block and a sixth reference prediction block.
- The fifth reference prediction block is obtained by a motion search in the first preset area based on the third reference prediction block included in the second reference prediction block pair with the smallest difference, and the sixth reference prediction block is obtained by a motion search in the second preset area based on the fourth reference prediction block included in the second reference prediction block pair with the smallest difference.
- B5: If it is determined that the difference between the fifth reference prediction block and the sixth reference prediction block included in the third reference prediction block pair with the smallest difference is smaller than the difference between the third reference prediction block and the fourth reference prediction block included in the second reference prediction block pair with the smallest difference, the fifth reference prediction block included in the third reference prediction block pair with the smallest difference is determined to be the first modified reference prediction block, and the sixth reference prediction block included in that pair is determined to be the second modified reference prediction block.
- Alternatively, when the difference between the fifth reference prediction block and the sixth reference prediction block included in the third reference prediction block pair with the smallest difference is smaller than the difference between the third reference prediction block and the fourth reference prediction block included in the second reference prediction block pair with the smallest difference, the motion search continues with the third reference prediction block pair with the smallest difference as the search starting point, until the number of searches reaches a preset threshold or the searched position exceeds the search area.
- B3: Determine the reference prediction block pair with the smallest difference among the at least one second reference prediction block pair. If the difference between the third reference prediction block and the fourth reference prediction block included in the second reference prediction block pair with the smallest difference is greater than the difference between the first reference prediction block and the second reference prediction block, the first reference prediction block is determined to be the first modified reference prediction block and the second reference prediction block is determined to be the second modified reference prediction block.
- Otherwise, the third reference prediction block included in the second reference prediction block pair with the smallest difference is determined to be the first modified reference prediction block, and the fourth reference prediction block included in that pair is determined to be the second modified reference prediction block.
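The repeated pair search of steps B1 to B5 can be sketched as a loop that keeps moving to the best neighbouring pair until the neighbours are worse or a round limit is reached. The sketch works on motion vectors and takes the difference measure as a callable, so it stays independent of how blocks are fetched. The mirrored offset for the backward vector, the round limit of 2, and the names `refine_pair` and `cost` are assumptions for illustration.

```python
from typing import Callable, Tuple

MV = Tuple[int, int]

NEIGHBOURS = [(-1, -1), (0, -1), (1, -1), (-1, 0), (1, 0), (-1, 1), (0, 1), (1, 1)]


def refine_pair(fwd: MV, bwd: MV,
                cost: Callable[[MV, MV], int],
                max_rounds: int = 2) -> Tuple[MV, MV]:
    """Iteratively move to the neighbouring pair with the smallest difference."""
    best_cost = cost(fwd, bwd)
    for _ in range(max_rounds):
        round_best = None
        for dx, dy in NEIGHBOURS:
            f = (fwd[0] + dx, fwd[1] + dy)
            b = (bwd[0] - dx, bwd[1] - dy)  # assumption: mirrored offset
            c = cost(f, b)
            if round_best is None or c < round_best[0]:
                round_best = (c, f, b)
        if round_best[0] > best_cost:
            break  # every neighbouring pair is worse: keep the current pair
        best_cost, fwd, bwd = round_best
    return fwd, bwd
```

With a synthetic cost whose minimum lies at (-20, 18) and (0, 12), two rounds starting from (-22, 18) and (2, 12) pass through (-21, 18) and (1, 12) before reaching the minimum.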
- the predicted motion information of the current image block to be processed is obtained.
- It is assumed that the forward and backward motion vector prediction values of the current image block to be processed are MV0 (-22, 18) and MV1 (2, 12), respectively.
- The forward and backward motion vector differences are MVD0 (1, 0) and MVD1 (-1, 0).
- the forward and backward prediction is performed on the current image block to be processed to obtain the forward prediction block and the backward prediction block of the current image block to be processed.
- the first precision is 1 pixel.
- The forward and backward reference prediction blocks q0 and h0 are used as the search starting point for a motion search with the first precision. The difference between the forward and backward reference prediction blocks of each pair found in the search is determined, for example for the 8 pairs of forward and backward reference prediction blocks around q0 and h0, together with the difference between the forward reference prediction block q0 and the backward reference prediction block h0. It is assumed that the motion vector prediction values of the forward and backward reference prediction blocks with the smallest difference are (-21, 18) and (1, 12), respectively.
- the updated search points are (-21, 18) and (1, 12) respectively corresponding to the forward reference prediction block q1 and the backward reference prediction block h1, and the motion search with the first precision is continued.
- The forward and backward reference prediction blocks q1 and h1 are used as the search starting point for a motion search with the first precision. The difference between the forward and backward reference prediction blocks of each pair found in the search is determined, for example for the 8 pairs of forward and backward reference prediction blocks around q1 and h1, together with the difference between the forward reference prediction block q1 and the backward reference prediction block h1. It is assumed that the motion vector prediction values of the forward and backward reference prediction blocks with the smallest difference are (-20, 18) and (0, 12), respectively.
- (-20, 18) and (0, 12) correspond to the forward reference prediction block q2 and the backward reference prediction block h2, respectively.
- The number of motion searches with the first precision can be configured, for example once or twice. Alternatively, a motion search range can be set, and the search stops when it goes out of range.
- The motion vector prediction value (-20, 18) of the forward reference prediction block q2 and MVD0 (1, 0) are summed to obtain (-19, 18), and the motion vector prediction value (0, 12) of the backward reference prediction block h2 and MVD1 (-1, 0) are summed to obtain (-1, 12). Therefore, the current image block to be processed is predicted based on the forward motion vector predictor (-19, 18) and the backward motion vector predictor (-1, 12).
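The last step of this example, in which the motion vector differences are added after refinement, is a component-wise addition. The sketch below checks the arithmetic with the values given above; with MVD1 = (-1, 0), the backward sum is (-1, 12). The names `add_mv` and `targets_after_refinement` are hypothetical.

```python
from typing import Tuple

MV = Tuple[int, int]


def add_mv(a: MV, b: MV) -> MV:
    """Component-wise sum of two motion vectors."""
    return (a[0] + b[0], a[1] + b[1])


def targets_after_refinement(refined_fwd: MV, refined_bwd: MV,
                             mvd0: MV, mvd1: MV) -> Tuple[MV, MV]:
    """Add each motion vector difference to the already refined predictor."""
    return add_mv(refined_fwd, mvd0), add_mv(refined_bwd, mvd1)
```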
- Figure 12 shows only one kind of motion search process.
- The first precision can be any set precision, for example 1 pixel, 1/2 pixel, 1/4 pixel, or 1/8 pixel.
- C1 Obtain the predicted motion information of the current image block to be processed (including the initial forward motion vector predicted value and the initial backward motion vector predicted value);
- C2: The forward reference prediction block of the current image block to be processed is obtained from the forward reference image by motion compensation.
- C3: The backward reference prediction block of the current image block to be processed is obtained from the backward reference image by motion compensation.
- C4: The pixel values of the template matching block are obtained by weighting the pixel values of the forward and backward reference prediction blocks.
- C5: In the forward reference image described in C2, perform a motion search with an integer-pixel step. Regardless of whether the search starting point is an integer pixel (the starting point can be an integer pixel or a sub-pixel, such as 1/2, 1/4, 1/8, or 1/16 pixel), an integer-pixel step motion search is performed to obtain at least one forward reference prediction block of the currently decoded block.
- the (0,0) point position is the search starting point.
- the search is performed at 8 full-pixel step search points around the search starting point to obtain the corresponding prediction block.
- The search method used is not limited, and any search method may be used. The matching error between each forward reference prediction block and the template matching block described in C4 is calculated, and the forward motion vector predictor corresponding to the forward reference prediction block with the smallest matching error is selected as the optimal forward motion vector predictor.
- the matching error can be calculated using the SAD criterion.
- C6: Similar to C5, in the backward reference image described in C3, a motion search with an integer-pixel step is performed. Regardless of whether the search starting point is an integer pixel (the starting point can be an integer pixel or a sub-pixel, such as 1/2, 1/4, 1/8, or 1/16 pixel), an integer-pixel step motion search is performed to obtain at least one backward prediction block of the currently decoded block. The matching error between each backward prediction block and the template matching block described in C4 is calculated, and the backward motion vector predictor corresponding to the backward prediction block with the smallest matching error is selected as the optimal backward motion vector predictor.
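Steps C4 to C6 can be sketched as building a template from the two reference prediction blocks and then searching each reference image independently for the block closest to that template. This is a minimal sketch: equal weights with rounding are assumed for the template, the matching error is the SAD criterion, the search covers only the starting point and its 8 integer-pixel neighbours, and all names (`make_template`, `best_match`, `block_at`) are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]
Pos = Tuple[int, int]  # (x, y) of the top-left sample of a block

NEIGHBOURS = [(-1, -1), (0, -1), (1, -1), (-1, 0), (1, 0), (-1, 1), (0, 1), (1, 1)]


def block_at(image: Image, pos: Pos, width: int, height: int) -> Image:
    """Cut a width x height block out of the image (no bounds checking)."""
    x, y = pos
    return [row[x:x + width] for row in image[y:y + height]]


def sad(a: Image, b: Image) -> int:
    """Matching error: sum of absolute differences."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def make_template(fwd_block: Image, bwd_block: Image) -> Image:
    """C4: weight the two reference blocks (assumption: equal weights, rounded)."""
    return [[(p + q + 1) >> 1 for p, q in zip(rf, rb)]
            for rf, rb in zip(fwd_block, bwd_block)]


def best_match(image: Image, start: Pos, template: Image) -> Pos:
    """C5/C6: position around the start whose block best matches the template."""
    height, width = len(template), len(template[0])
    candidates = [start] + [(start[0] + dx, start[1] + dy) for dx, dy in NEIGHBOURS]
    return min(candidates,
               key=lambda pos: sad(block_at(image, pos, width, height), template))
```

Calling `best_match` once on the forward reference image and once on the backward reference image yields the two optimal positions; repeating the procedure with a template rebuilt from those positions corresponds to running the refinement multiple times.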
- the motion vector refinement process provided in the second possible example above may be performed multiple times to complete image prediction.
- After the optimal forward motion vector predictor and the optimal backward motion vector predictor are obtained through C1-C6, they are used as the search starting points for a further motion search.
- Each forward reference prediction block found by the search is compared with the template matching block, and the forward motion vector prediction value corresponding to the forward reference prediction block with the smallest matching error is selected as the optimal forward motion vector prediction value of the second search.
- The template matching blocks used in multiple searches can be the same, or they can be updated.
- To update the template, the forward reference prediction block of the current image block to be processed is obtained from the forward reference image by motion compensation according to the optimal forward motion vector prediction value of the previous search, and the backward reference prediction block of the current image block to be processed is obtained from the backward reference image by motion compensation according to the optimal backward motion vector prediction value of the previous search. The pixel values of the forward and backward reference prediction blocks obtained by motion compensation are then weighted to obtain the latest template matching block.
- Refer to FIG. 13 for a schematic flowchart of the second possible implementation of the video image prediction provided by this application.
- the method shown in FIG. 13 may be executed by a video codec device, a video codec, a video codec system, and other devices with video codec functions.
- the method shown in FIG. 13 may occur during the encoding process or the decoding process. More specifically, the method shown in FIG. 13 may occur during the inter-frame prediction process during encoding and decoding.
- S1301 Determine the first initial motion vector predictor, the second initial motion vector predictor, the first motion vector difference, and the second motion vector difference of the first image block to be processed.
- S1302. Determine a first motion vector prediction value according to the first initial motion vector prediction value and the first motion vector difference, and determine a second motion vector prediction value according to the second initial motion vector prediction value and the second motion vector difference .
- S1303. The motion vector correction process is performed according to the first motion vector prediction value to obtain the first modified motion vector prediction value.
- The motion vector correction process is performed according to the second motion vector prediction value to obtain the second modified motion vector prediction value.
- The sum of the first initial motion vector prediction value and the first motion vector difference may be used as the first motion vector prediction value.
- The sum of the second initial motion vector prediction value and the second motion vector difference may be used as the second motion vector prediction value.
- S1304 Predict the first image block to be processed according to the first modified motion vector prediction value and the second modified motion vector prediction value.
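The flow S1301 to S1304 can be summarised in a few lines. This is only a sketch, assuming that the correction step starts from the sums computed in S1302 (as in the worked example for this implementation) and that the correction itself is supplied by the caller as a callable. The names `predict_mvs` and `refine` are hypothetical.

```python
from typing import Callable, Tuple

MV = Tuple[int, int]
Refine = Callable[[MV, MV], Tuple[MV, MV]]


def add_mv(a: MV, b: MV) -> MV:
    """Component-wise sum of two motion vectors."""
    return (a[0] + b[0], a[1] + b[1])


def predict_mvs(init0: MV, init1: MV, mvd0: MV, mvd1: MV,
                refine: Refine) -> Tuple[MV, MV]:
    """Return the modified motion vector predictors used for prediction in S1304."""
    mvp0 = add_mv(init0, mvd0)   # S1302: first motion vector prediction value
    mvp1 = add_mv(init1, mvd1)   # S1302: second motion vector prediction value
    return refine(mvp0, mvp1)    # S1303: motion vector correction (caller-supplied)
```

Passing the correction as a parameter keeps the sketch neutral about which search method is used, since this text leaves that choice open.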
- The first initial motion vector prediction value corresponds to the initial motion vector prediction value of the first list (that is, list0), and accordingly the second initial motion vector prediction value corresponds to the initial motion vector prediction value of the second list (that is, list1).
- Alternatively, the first initial motion vector prediction value corresponds to the initial motion vector prediction value in the first direction (for example, forward), and correspondingly the second initial motion vector prediction value corresponds to the initial motion vector prediction value in the second direction (for example, backward); this application does not limit this.
- the motion vector correction process may be a DMVR process, and in the embodiment of the present application, the two may be replaced with each other.
- the first modified motion vector prediction value or the second modified motion vector prediction value may also be referred to as the first refined motion vector prediction value or the second refined motion vector prediction value.
- Before determining the first initial motion vector prediction value, the second initial motion vector prediction value, the first motion vector difference, and the second motion vector difference of the current image block to be processed, the method further includes parsing the first flag, such as mmvd_flag[x0][y0], from the code stream.
- S1303 may have a start condition in the embodiment of the present application. For example, when the image block to which the candidate motion information selected from the candidate list belongs and the current image block to be processed belong to different images, the motion vector correction process is performed according to the first motion vector prediction value and the second motion vector prediction value. When the image block to which the candidate motion information selected from the candidate list belongs and the current image block to be processed belong to the same image, the current image block to be processed is decoded according to the first motion vector prediction value and the second motion vector prediction value obtained in step S1302.
- The predicted motion information of the current image block to be processed is obtained. It is assumed that the forward and backward motion vector prediction values of the current image block to be processed are MV0 (-22, 18) and MV1 (2, 12), respectively, and that the forward and backward motion vector differences are MVD0 (1, 0) and MVD1 (-1, 0).
- the forward reference prediction block corresponding to MV2 is q0
- the backward reference prediction block corresponding to MV3 is h0.
- the first precision is 1 pixel.
- The forward and backward reference prediction blocks q0 and h0 are used as the search starting point for a motion search with the first precision. The difference between the forward and backward reference prediction blocks of each pair found in the search is determined, for example for the 8 pairs of forward and backward reference prediction blocks around q0 and h0, together with the difference between the forward reference prediction block q0 and the backward reference prediction block h0. It is assumed that the motion vector prediction values of the forward and backward reference prediction blocks with the smallest difference are (-21, 17) and (1, 11), respectively.
- the updated search points are (-21, 17) and (1, 11) respectively corresponding to the forward reference prediction block q1 and the backward reference prediction block h1, and the motion search with the first precision is continued.
- the previous and backward reference prediction blocks q1 and h1 are used as the search starting point to perform the first-precision motion search to determine the difference between the front and back reference prediction blocks obtained in each search, such as the forward and backward reference prediction blocks q1 and h1 around 8 forward and backward reference predictions
- the difference between blocks, and the difference between the forward reference prediction block q1 and the backward reference prediction block h1, assuming that the motion vector prediction values of the front and back reference prediction blocks with the smallest difference are (-21,16) and (1,10), respectively .
- (-21, 16) and (1, 10) correspond to the forward reference prediction block q2 and the backward reference prediction block h2, respectively.
- the number of motion searches with the first precision can be configured, for example once or twice. Alternatively, a motion search range can be defined, and the search stops when a search point falls outside that range.
- the current image block to be processed is predicted based on the forward motion vector predictor (-21, 16) and the backward motion vector predictor (1, 10).
- Figure 14 shows only one kind of motion search process.
- the first precision can be any configured precision, for example 1-pixel, 1/2-pixel, 1/4-pixel, or 1/8-pixel precision.
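The search walked through above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the sum of absolute differences (SAD) as the block difference, the names `sad`, `fetch`, and `refine`, the parameters `max_rounds` and `max_range`, and the choice to move both vectors by the same offset (which matches the step from (-21, 17), (1, 11) to (-21, 16), (1, 10) in the example) are all assumptions. Only 1-pixel precision is handled.

```python
from typing import List, Tuple

MV = Tuple[int, int]          # (horizontal, vertical) displacement in whole pixels
Picture = List[List[int]]     # rows of samples

# Offsets of the 8 search points surrounding the current search point.
NEIGHBOURS = [(-1, -1), (0, -1), (1, -1), (-1, 0), (1, 0), (-1, 1), (0, 1), (1, 1)]


def sad(a: Picture, b: Picture) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def fetch(ref: Picture, x: int, y: int, w: int, h: int, mv: MV) -> Picture:
    """Return the w x h block at (x, y) displaced by mv (caller keeps it in bounds)."""
    left, top = x + mv[0], y + mv[1]
    return [row[left:left + w] for row in ref[top:top + h]]


def refine(ref0, ref1, x, y, w, h, mv0, mv1, max_rounds=2, max_range=2):
    """Greedy integer-precision search; returns the refined (mv0, mv1)."""
    start0 = mv0
    best = sad(fetch(ref0, x, y, w, h, mv0), fetch(ref1, x, y, w, h, mv1))
    for _ in range(max_rounds):
        winner = (best, mv0, mv1)
        for dx, dy in NEIGHBOURS:
            c0 = (mv0[0] + dx, mv0[1] + dy)
            c1 = (mv1[0] + dx, mv1[1] + dy)  # assumption: same offset on both sides
            if max(abs(c0[0] - start0[0]), abs(c0[1] - start0[1])) > max_range:
                continue  # search point lies outside the configured range
            cost = sad(fetch(ref0, x, y, w, h, c0), fetch(ref1, x, y, w, h, c1))
            if cost < winner[0]:
                winner = (cost, c0, c1)
        if winner[0] >= best:
            break  # no surrounding pair is better than the current pair
        best, mv0, mv1 = winner
    return mv0, mv1
```

`max_rounds` corresponds to the configurable number of searches and `max_range` to the configurable search range; the starting pair is always kept when no surrounding pair has a smaller difference.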
- FIG. 15 is a schematic block diagram of an image prediction device according to an embodiment of the present application. It should be noted that the image prediction device 1500 is suitable both for inter-frame prediction when decoding video images and for inter-frame prediction when encoding video images. It should be understood that the image prediction device 1500 herein may correspond to the inter prediction unit 244 in FIG. 2, or may correspond to the inter prediction unit 344 in FIG. 3. The image prediction device 1500 may include a prediction unit 1501 and a correction unit 1502.
- the prediction unit 1501 is configured to determine the first initial motion vector prediction value, the second initial motion vector prediction value, the first motion vector difference, and the second motion vector difference of the current image block to be processed;
- the correction unit 1502 is configured to perform a motion vector correction process according to the first initial motion vector predicted value and the second initial motion vector predicted value to obtain the first corrected motion vector predicted value and the second corrected motion vector predicted value;
- the prediction unit 1501 is further configured to determine a first motion vector prediction value according to the first corrected motion vector prediction value and the first motion vector difference, determine a second motion vector prediction value according to the second corrected motion vector prediction value and the second motion vector difference, and predict the current image block to be processed according to the first motion vector prediction value and the second motion vector prediction value.
- when determining the first initial motion vector prediction value, the second initial motion vector prediction value, the first motion vector difference, and the second motion vector difference of the current image block to be processed, the prediction unit 1501 is specifically configured as follows:
- the first initial motion vector predictor and the second initial motion vector predictor of the current image block to be processed are determined.
- the prediction unit 1501 is specifically configured to determine the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed:
- the corresponding candidate motion information is determined from the candidate list according to the candidate index (or according to a rate-distortion cost algorithm).
- the candidate motion information includes a third motion vector predictor and a fourth motion vector predictor.
- the third motion vector predictor is used as the first initial motion vector prediction value
- the fourth motion vector predictor is used as the second initial motion vector prediction value.
- the prediction unit 1501 is specifically configured to determine the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed:
- the third motion vector predictor and the fourth motion vector predictor included in the candidate motion information of the first position in the candidate list are determined as the first initial motion vector predictor and the second initial motion vector predictor.
- correction unit 1502 is specifically configured to:
- a motion vector correction process is performed according to the first initial motion vector predicted value and the second initial motion vector predicted value.
- the prediction unit 1501 is further configured to: when the image block to which the candidate motion information belongs and the current image block to be processed belong to the same image, determine a first target motion vector prediction value according to the first initial motion vector prediction value and the first motion vector difference, and determine a second target motion vector prediction value according to the second initial motion vector prediction value and the second motion vector difference; and predict the current image block to be processed according to the first target motion vector prediction value and the second target motion vector prediction value.
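The start condition and the first unit arrangement described above can be sketched as follows. This is an illustrative sketch only: the names `derive_target_mvs`, `add`, and `shift_up_two`, the use of opaque picture identifiers for the "same image" test, and the use of plain addition to apply the motion vector difference are assumptions, and the refinement step is passed in as a stand-in callable rather than a real search.

```python
from typing import Callable, Hashable, Tuple

MV = Tuple[int, int]
Refiner = Callable[[MV, MV], Tuple[MV, MV]]


def add(mv: MV, mvd: MV) -> MV:
    """Apply a motion vector difference (assumed: component-wise addition)."""
    return (mv[0] + mvd[0], mv[1] + mvd[1])


def derive_target_mvs(init0: MV, init1: MV, mvd0: MV, mvd1: MV,
                      candidate_picture: Hashable, current_picture: Hashable,
                      refine: Refiner) -> Tuple[MV, MV]:
    """Correct the initial predictors only when the candidate comes from another picture."""
    if candidate_picture != current_picture:
        # Different images: run the motion vector correction process first.
        init0, init1 = refine(init0, init1)
    # In both cases the motion vector differences are applied afterwards.
    return add(init0, mvd0), add(init1, mvd1)


def shift_up_two(mv0: MV, mv1: MV) -> Tuple[MV, MV]:
    """Hypothetical stand-in for the correction process: move both vectors up 2 pixels."""
    return (mv0[0], mv0[1] - 2), (mv1[0], mv1[1] - 2)
```

Passing the refiner as a parameter keeps the gating logic separate from whichever search the correction unit actually performs.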
- the prediction unit 1501 is specifically configured to determine the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed:
- the corresponding candidate is determined from the candidate list.
- the candidate includes the first candidate motion information and the second candidate motion information.
- the first candidate motion information includes a fifth motion vector predictor and a sixth motion vector predictor, and the second candidate motion information includes a seventh motion vector predictor and an eighth motion vector predictor;
- the fifth motion vector prediction value and the sixth motion vector prediction value are the first initial motion vector prediction value and the second initial motion vector prediction value;
- the seventh motion vector predictor and the eighth motion vector predictor are the first initial motion vector predicted value and the second initial motion vector predicted value.
- the candidate at the first position in the candidate list includes first candidate motion information and second candidate motion information, wherein the first candidate motion information includes the fifth motion vector predictor and the sixth motion vector predictor,
- the second candidate motion information includes a seventh motion vector predictor and an eighth motion vector predictor;
- the prediction unit 1501 is specifically configured to determine the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed:
- the fifth motion vector predictor and the sixth motion vector predictor are the first initial motion vector predictor and the second initial motion vector prediction value
- the seventh motion vector predictor and the eighth motion vector predictor are the first initial motion vector predicted value and the second initial motion vector predicted value.
- the correction unit 1502 is specifically configured to perform a motion vector correction process according to the first initial motion vector predicted value and the second initial motion vector predicted value:
- the difference between the first modified reference prediction block and the second modified reference prediction block is less than or equal to the difference between the first reference prediction block and the second reference prediction block
- the first modified reference prediction block is an image block in a first preset area that has the same size as the first reference prediction block, and the first preset area includes the first reference prediction block
- the second modified reference prediction block is an image block in a second preset area that has the same size as the second reference prediction block, and the second preset area includes the second reference prediction block; the first modified reference prediction block corresponds to the first modified motion vector predictor, and the second modified reference prediction block corresponds to the second modified motion vector predictor.
- when determining a first modified reference prediction block according to the first reference prediction block and determining a second modified reference prediction block according to the second reference prediction block, the correction unit 1502 is specifically configured as follows:
- the first reference prediction block pair includes the first reference prediction block and the second reference prediction block;
- the second reference prediction block pair includes a third reference prediction block and a fourth reference prediction block, the third reference prediction block is obtained by a motion search in the first preset area based on the first reference prediction block, and the fourth reference prediction block is obtained by a motion search in the second preset area based on the second reference prediction block;
- the third reference prediction block pair includes a fifth reference prediction block and a sixth reference prediction block
- the fifth reference prediction block is obtained by performing a motion search in the first preset area based on the third reference prediction block included in the second reference prediction block pair with the smallest difference, and the sixth reference prediction block is obtained by performing a motion search in the second preset area based on the fourth reference prediction block included in the second reference prediction block pair with the smallest difference;
- the fifth reference prediction block included in the third reference prediction block pair with the smallest difference is the first modified reference prediction block
- correction unit 1502 is further configured to:
- the first reference prediction block is the first modified reference prediction block
- it is determined that the second reference prediction block is a second modified reference prediction block.
- correction unit 1502 is further configured to:
- the difference between the fifth reference prediction block and the sixth reference prediction block included in the third reference prediction block pair with the smallest difference is greater than the difference between the third reference prediction block and the fourth reference prediction block included in the second reference prediction block pair with the smallest difference
- the third reference prediction block included in the second reference prediction block pair with the smallest difference is the first modified reference prediction block
- it is determined that the fourth reference prediction block included in the second reference prediction block pair with the smallest difference is the second modified reference prediction block.
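The choice among the first, second, and third reference prediction block pairs can be sketched as a staged selection. The sketch below is an assumption-laden illustration, not the application's procedure: `select_modified_pair`, `cost`, and `search` are hypothetical names, `search` is assumed to return a non-empty list of candidate pairs, and the tie rule for the first stage (keep the first pair when no second pair is strictly better) is an assumption, since the exact condition is not recoverable from this passage.

```python
from typing import Callable, List, TypeVar

Pair = TypeVar("Pair")  # any representation of a (forward, backward) block pair


def select_modified_pair(first_pair: Pair,
                         cost: Callable[[Pair], int],
                         search: Callable[[Pair], List[Pair]]) -> Pair:
    """Return the pair whose blocks serve as the modified reference prediction blocks."""
    # Stage 1: second pairs found by searching around the first pair.
    second = min(search(first_pair), key=cost)
    if cost(second) >= cost(first_pair):
        return first_pair  # assumed rule: the first pair stays when nothing is better
    # Stage 2: third pairs found by searching around the best second pair.
    third = min(search(second), key=cost)
    if cost(third) > cost(second):
        return second      # the best third pair is worse, so fall back to the second
    return third
```

Because a later pair is accepted only when its difference is not larger, the returned pair never has a larger difference than the first pair, which is consistent with the "less than or equal to" condition stated for the modified reference prediction blocks.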
- a prediction unit configured to determine a first initial motion vector prediction value, a second initial motion vector prediction value, a first motion vector difference, and a second motion vector difference of the first image block to be processed;
- the correction unit is configured to determine a first motion vector predictor according to the first initial motion vector predictor and the first motion vector difference, and determine a second motion vector predicted value according to the second initial motion vector predictor and the second motion vector difference;
- the prediction unit is further configured to perform a motion vector correction process according to the first motion vector predicted value and the second motion vector predicted value to obtain the first corrected motion vector predicted value and the second corrected motion vector predicted value, and to predict the first image block to be processed according to the first modified motion vector prediction value and the second modified motion vector prediction value.
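In this second arrangement the motion vector differences are applied before the correction process, so the corrected vectors are used for prediction directly. A minimal sketch, assuming plain addition for the motion vector difference and using a hypothetical stand-in `shift_up_two` in place of a real search (all names here are illustrative):

```python
from typing import Callable, Tuple

MV = Tuple[int, int]
Refiner = Callable[[MV, MV], Tuple[MV, MV]]


def add(mv: MV, mvd: MV) -> MV:
    """Apply a motion vector difference (assumed: component-wise addition)."""
    return (mv[0] + mvd[0], mv[1] + mvd[1])


def add_then_correct(init0: MV, init1: MV, mvd0: MV, mvd1: MV,
                     refine: Refiner) -> Tuple[MV, MV]:
    """Apply the MVDs first, then run the correction process on the results."""
    mv0, mv1 = add(init0, mvd0), add(init1, mvd1)
    return refine(mv0, mv1)  # the corrected vectors are used for prediction as-is


def shift_up_two(mv0: MV, mv1: MV) -> Tuple[MV, MV]:
    """Hypothetical stand-in for the correction process: move both vectors up 2 pixels."""
    return (mv0[0], mv0[1] - 2), (mv1[0], mv1[1] - 2)
```

With a data-dependent search in place of the stand-in, this ordering and the correct-then-add ordering of the first arrangement need not give the same result, because the search starts from different points.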
- the prediction unit is specifically used for determining the first initial motion vector prediction value, the second initial motion vector prediction value, the first motion vector difference, and the second motion vector difference of the current image block to be processed:
- the first initial motion vector predictor and the second initial motion vector predictor of the current image block to be processed are determined.
- the prediction unit is specifically configured to determine the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed:
- the candidate motion information includes the third motion vector predictor and the fourth motion vector predictor.
- the third motion vector predicted value is used as the first initial motion vector predicted value
- the fourth motion vector predicted value is used as the second initial motion vector predicted value
- the third motion vector predictor and the fourth motion vector predictor included in the candidate motion information of the first position in the candidate list are determined as the first initial motion vector predictor and the second initial motion vector predictor.
- the correction unit is specifically configured to perform a motion vector correction process according to the first motion vector predicted value and the second motion vector predicted value:
- a motion vector correction process is performed according to the first motion vector predicted value and the second motion vector predicted value.
- the prediction unit is further used for:
- the current image block to be processed is predicted according to the first motion vector prediction value and the second motion vector prediction value.
- the prediction unit 1501 is specifically configured to determine the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed:
- the candidate includes the first candidate motion information and the second candidate motion information, wherein the first candidate motion information includes a fifth motion vector predictor and a sixth motion vector predictor, and the second candidate motion information includes a seventh motion vector predictor and an eighth motion vector predictor;
- the fifth motion vector prediction value and the sixth motion vector prediction value are the first initial motion vector prediction value and the second initial motion vector prediction value;
- the seventh motion vector predictor and the eighth motion vector predictor are the first initial motion vector predicted value and the second initial motion vector predicted value.
- the prediction unit 1501 is specifically configured to determine the first initial motion vector prediction value and the second initial motion vector prediction value of the current image block to be processed:
- the candidate at the first position in the candidate list includes first candidate motion information and second candidate motion information, where the first candidate motion information includes a fifth motion vector predictor and a sixth motion vector predictor.
- the second candidate motion information includes the seventh motion vector predictor and the eighth motion vector predictor;
- the fifth motion vector predictor and the sixth motion vector predictor are the first initial motion vector predictor and the second initial motion vector prediction value
- the seventh motion vector predictor and the eighth motion vector predictor are the first initial motion vector predicted value and the second initial motion vector predicted value.
- the correction unit 1502 is specifically configured to perform a motion vector correction process according to the first motion vector predicted value and the second motion vector predicted value, including:
- the difference between the first modified reference prediction block and the second modified reference prediction block is less than or equal to the difference between the first reference prediction block and the second reference prediction block
- the first modified reference prediction block is an image block in a first preset area that has the same size as the first reference prediction block, and the first preset area includes the first reference prediction block
- the second modified reference prediction block is an image block in a second preset area that has the same size as the second reference prediction block, and the second preset area includes the second reference prediction block; the first modified reference prediction block corresponds to the first modified motion vector predictor, and the second modified reference prediction block corresponds to the second modified motion vector predictor.
- the apparatus 1500 including only the prediction unit 1501 and the correction unit 1502 may correspond to an inter-frame prediction unit, and may be applied to both the encoding end and the decoding end.
- the positions of the prediction unit 1501 and the correction unit 1502 in FIG. 15 correspond to the position of the inter prediction unit 344 in FIG. 3; in other words, for the specific implementation of the functions of the prediction unit 1501 and the correction unit 1502, refer to the specific details of the inter prediction unit 344 in FIG. 3.
- the positions of the prediction unit 1501 and the correction unit 1502 correspond to the position of the inter prediction unit 244 in FIG. 2; in other words, for the specific implementation of the functions of the prediction unit 1501 and the correction unit 1502, refer to the specific details of the inter prediction unit 244 in FIG. 2.
- the foregoing apparatus 1500 may execute the method shown in FIG. 8 or FIG. 13, and the apparatus 1500 may be a video encoding apparatus, a video decoding apparatus, a video encoding and decoding system, or other equipment with a video encoding and decoding function.
- the apparatus 1500 can be used to perform image prediction during the encoding process, and can also be used to perform image prediction during the decoding process.
- the computer-readable medium may include a computer-readable storage medium, which corresponds to a tangible medium, such as a data storage medium, or a communication medium that includes any medium that facilitates the transfer of a computer program from one place to another (for example, according to a communication protocol).
- computer-readable media may generally correspond to (1) non-transitory tangible computer-readable storage media, or (2) communication media, such as signals or carrier waves.
- Data storage media can be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, codes, and/or data structures for implementing the techniques described in this application.
- the computer program product may include a computer-readable medium.
- such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage devices, magnetic disk storage devices or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium.
- for example, if instructions are transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium.
- the computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory tangible storage media.
- magnetic disks and optical discs include compact discs (CD), laser discs, optical discs, digital versatile discs (DVD), and Blu-ray discs. Disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included in the scope of computer-readable media.
- DSP digital signal processors
- ASIC application-specific integrated circuits
- FPGA field programmable logic arrays
- the term "processor" as used herein may refer to any of the foregoing structures or any other structure suitable for implementing the techniques described herein.
- the functions described by the various illustrative logical blocks, modules, and steps described herein may be provided in dedicated hardware and/or software modules configured for encoding and decoding, or incorporated into a combined codec.
- the technology may be fully implemented in one or more circuits or logic elements.
- the technology of this application can be implemented in a variety of devices or apparatuses, including wireless handsets, integrated circuits (ICs), or a set of ICs (for example, chipsets).
- ICs integrated circuits
- Various components, modules, or units are described in this application to emphasize the functional aspects of the device for implementing the disclosed technology, but they do not necessarily need to be implemented by different hardware units.
- various units can be combined in a codec hardware unit together with appropriate software and/or firmware, or provided by interoperable hardware units (including one or more processors as described above).
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The present invention relates to a video image prediction method and device, intended to resolve, to a certain extent, the problem of low prediction accuracy encountered in the prior art. In certain embodiments of the present invention, inter-frame prediction is performed using merge mode with motion vector difference (MMVD) in combination with a decoder-side motion vector refinement (DMVR) method (that is, an MMVD-based DMVR method). For the case where MMVD involves a bidirectional prediction process, decoding is performed by combining the MVD information after the bidirectional prediction process has been optimized. This makes it possible to take full advantage of the correspondence between a first reference image and a second reference image (that is, between the forward and backward prediction images). Compared with the traditional method, the present invention reduces redundancy to a certain extent and thereby relatively improves prediction accuracy.
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310526825.6A CN116600139A (zh) | 2019-03-11 | 2019-03-11 | Video image decoding method, encoding method, and apparatus |
| PCT/CN2019/077726 WO2020181476A1 (fr) | 2019-03-11 | 2019-03-11 | Video image prediction method and device |
| CN201980093851.6A CN113557738B (zh) | 2019-03-11 | 2019-03-11 | Video image prediction method and apparatus |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2019/077726 WO2020181476A1 (fr) | 2019-03-11 | 2019-03-11 | Video image prediction method and device |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2020181476A1 (fr) | 2020-09-17 |
Family
ID=72426320
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2019/077726 Ceased WO2020181476A1 (fr) | Video image prediction method and device | 2019-03-11 | 2019-03-11 |
Country Status (2)
| Country | Link |
|---|---|
| CN (2) | CN116600139A (fr) |
| WO (1) | WO2020181476A1 (fr) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106713933A (zh) * | 2010-01-19 | 2017-05-24 | 三星电子株式会社 | Image decoding method |
| CN107113424A (zh) * | 2014-11-18 | 2017-08-29 | 联发科技股份有限公司 | Bi-predictive video coding method based on motion vectors from uni-prediction and merge candidates |
| WO2017194773A1 (fr) * | 2016-05-13 | 2017-11-16 | Telefonaktiebolaget Lm Ericsson (Publ) | Motion vector difference coding and decoding |
| CN109155847A (zh) * | 2016-03-24 | 2019-01-04 | 英迪股份有限公司 | Method and apparatus for encoding/decoding a video signal |
| CN109218733A (zh) * | 2017-06-30 | 2019-01-15 | 华为技术有限公司 | Method for determining a motion vector predictor and related device |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10750203B2 (en) * | 2016-12-22 | 2020-08-18 | Mediatek Inc. | Method and apparatus of adaptive bi-prediction for video coding |
| US10965955B2 (en) * | 2016-12-22 | 2021-03-30 | Mediatek Inc. | Method and apparatus of motion refinement for video coding |
| US10477237B2 (en) * | 2017-06-28 | 2019-11-12 | Futurewei Technologies, Inc. | Decoder side motion vector refinement in video coding |
-
2019
- 2019-03-11 WO PCT/CN2019/077726 patent/WO2020181476A1/fr not_active Ceased
- 2019-03-11 CN CN202310526825.6A patent/CN116600139A/zh active Pending
- 2019-03-11 CN CN201980093851.6A patent/CN113557738B/zh active Active
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
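| CN114449283A (zh) * | 2020-10-30 | 2022-05-06 | 腾讯科技(深圳)有限公司 | Data processing method and apparatus, computer device, and storage medium |
| CN114449283B (zh) * | 2020-10-30 | 2024-06-07 | 腾讯科技(深圳)有限公司 | Data processing method and apparatus, computer device, and storage medium |
| CN114095736A (zh) * | 2022-01-11 | 2022-02-25 | 杭州微帧信息科技有限公司 | Fast motion estimation video encoding method |
| CN114095736B (zh) * | 2022-01-11 | 2022-05-24 | 杭州微帧信息科技有限公司 | Fast motion estimation video encoding method |
| CN115379240A (zh) * | 2022-07-07 | 2022-11-22 | 康达洲际医疗器械有限公司 | Direction-extension-based MMVD prediction method and system for video coding |
| CN115643414A (zh) * | 2022-09-26 | 2023-01-24 | 北京达佳互联信息技术有限公司 | Video processing method and apparatus, electronic device, and storage medium |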
Also Published As
| Publication number | Publication date |
|---|---|
| CN116600139A (zh) | 2023-08-15 |
| CN113557738B (zh) | 2023-04-28 |
| CN113557738A (zh) | 2021-10-26 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12549711B2 (en) | | Video picture prediction method and apparatus |
| CN111953997B (zh) | | Method and apparatus for obtaining a candidate motion vector list, and codec |
| CN111788833B (zh) | | Inter prediction method and apparatus, and corresponding encoder and decoder |
| TWI806212B (zh) | | Video encoder, video decoder, and corresponding methods |
| CN113491132B (zh) | | Video image decoding and encoding methods, apparatus, and readable storage medium |
| WO2020232845A1 (fr) | | Inter prediction device and method |
| CN114270847B (zh) | | Method and apparatus for constructing a merge candidate motion information list, and codec |
| CN113557738B (zh) | | Video image prediction method and apparatus |
| CN111432219B (zh) | | Inter prediction method and apparatus |
| CN111263166B (zh) | | Video image prediction method and apparatus |
| WO2020259353A1 (fr) | | Entropy coding/decoding method for syntax elements, device, and codec |
| CN111726617A (zh) | | Optimization method and apparatus for merge with motion vector difference technology, and codec |
| WO2020114509A1 (fr) | | Video image encoding and decoding method and apparatus |
| CN112135137A (zh) | | Video encoder, video decoder, and corresponding methods |
| WO2020186882A1 (fr) | | Processing method and device based on triangle prediction unit mode |
| WO2020187062A1 (fr) | | Optimization method and apparatus for merge with motion vector difference technology, and codec |
| WO2020108168A9 (fr) | | Video image prediction method and device |
| CN111726630A (zh) | | Processing method and apparatus based on triangle prediction unit mode |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 19919485; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: PCT application non-entry in European phase | Ref document number: 19919485; Country of ref document: EP; Kind code of ref document: A1 |