WO2020232845A1 - Method and device for inter-frame prediction - Google Patents
Method and device for inter-frame prediction
- Publication number: WO2020232845A1 (PCT/CN2019/100751)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- block
- motion vector
- sub
- reference frame
- processed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- All classifications below fall under H: ELECTRICITY > H04: ELECTRIC COMMUNICATION TECHNIQUE > H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION > H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/513: Processing of motion vectors
- H04N19/52: Processing of motion vectors by encoding by predictive encoding
- H04N19/503: Predictive coding involving temporal prediction
- H04N19/105: Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
- H04N19/124: Quantisation
- H04N19/137: Motion inside a coding unit, e.g. average field, frame or block difference
- H04N19/14: Coding unit complexity, e.g. amount of activity or edge presence estimation
- H04N19/159: Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
- H04N19/176: Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
- H04N19/186: Adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
- H04N19/42: Implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
- H04N19/577: Motion compensation with bidirectional frame interpolation, i.e. using B-pictures
- H04N19/593: Predictive coding involving spatial prediction techniques
- H04N19/96: Tree coding, e.g. quad-tree coding
Definitions
- The present invention relates to the field of video coding and decoding, and in particular to a method and device for inter-frame prediction of video images.
- Digital video capabilities can be incorporated into a variety of devices, including digital televisions, digital live broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video game devices, video game consoles, cellular or satellite radio telephones (so-called "smart phones"), video teleconferencing devices, video streaming devices, and the like.
- Digital video devices implement video compression techniques, for example those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), and H.265/High Efficiency Video Coding (HEVC), and in extensions of such standards.
- Video devices can transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video compression techniques.
- Video compression techniques perform spatial (intra-image) prediction and/or temporal (inter-image) prediction to reduce or remove redundancy inherent in video sequences.
- a video slice (i.e., a video frame or part of a video frame)
- Image blocks can also be called tree blocks, coding units (CUs), and/or coding nodes.
- Image blocks in an intra-coded (I) slice of an image are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same image.
- Image blocks in an inter-coded (P or B) slice of an image may use spatial prediction with respect to reference samples in neighboring blocks in the same image, or temporal prediction with respect to reference samples in other reference images.
- An image may be referred to as a frame, and a reference image may be referred to as a reference frame.
- the embodiments of the present application provide a method and device for inter-frame prediction of video images, and corresponding encoders and decoders, which improve the prediction accuracy of motion information of image blocks and reduce implementation complexity.
- An embodiment of the present application provides an inter-frame prediction method, where a block to be processed includes one or more sub-blocks, and the method includes: determining a time domain offset vector of the block to be processed according to spatial neighboring blocks of the block to be processed, where the time domain offset vector is used to determine the corresponding sub-block of a sub-block of the block to be processed; and determining the motion vector of the sub-block of the block to be processed according to the motion vector of the corresponding sub-block, wherein, when the motion vector of the corresponding sub-block is not available, the motion vector of the sub-block of the block to be processed is obtained according to a first preset motion vector.
- This embodiment uses the sub-block as the unit for obtaining the motion vector, which improves the accuracy of motion vector prediction and the coding efficiency.
- The motion vector of the sub-block is obtained based on the preset motion vector, which reduces implementation complexity compared with methods that derive a default motion vector.
- Determining the time domain offset vector of the block to be processed according to the spatial neighboring blocks of the block to be processed includes: sequentially checking, in a preset order, whether the motion vectors of the spatial neighboring blocks at a plurality of first preset positions are available, until the first spatial neighboring block in the preset order whose motion vector is available is obtained; and using the motion vector of that spatial neighboring block as the time domain offset vector.
- a plurality of spatial neighboring blocks are used to obtain a time-domain offset vector, which makes full use of the spatial correlation of the prediction object.
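- When none of the motion vectors of the spatial neighboring blocks at the plurality of first preset positions is available, the second preset motion vector is used as the time domain offset vector.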
- the second preset motion vector is a zero motion vector.
- A zero motion vector is used as a fallback when the motion vectors of the spatial neighboring blocks at the multiple preset positions are not available, which reduces implementation complexity.
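The checking procedure above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `temporal_offset_from_neighbors`, the use of `None` to mark an unavailable motion vector, and the tuple representation of a motion vector are all assumptions made for the example.

```python
from typing import Optional, Sequence, Tuple

MV = Tuple[int, int]          # hypothetical (horizontal, vertical) motion vector
ZERO_MV: MV = (0, 0)          # stands in for the "second preset motion vector"


def temporal_offset_from_neighbors(
    neighbor_mvs: Sequence[Optional[MV]],
    fallback: MV = ZERO_MV,
) -> MV:
    """Return the first available neighbor motion vector in the given order.

    `neighbor_mvs` lists the spatial neighbors at the first preset positions,
    already arranged in the preset checking order; None means "unavailable".
    """
    for mv in neighbor_mvs:
        if mv is not None:
            return mv          # first available motion vector wins
    return fallback            # no neighbor had a usable motion vector


# Usage: the second neighbor in the order is the first available one.
offset = temporal_offset_from_neighbors([None, (3, -1), (5, 5)])
```

Because the loop stops at the first hit, later neighbors are never inspected, which matches the "until ... is obtained" wording.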
- Determining the time domain offset vector of the block to be processed according to the spatial neighboring block of the block to be processed includes: acquiring the motion vector and reference frame of the spatial neighboring block at a second preset position, wherein the motion vector of the spatial neighboring block at the second preset position is available; and using the motion vector of the spatial neighboring block at the second preset position as the time domain offset vector.
- The time domain offset vector is obtained from the spatial neighboring block at a single preset position, which eliminates the checking steps of the foregoing embodiment and further reduces implementation complexity.
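- When the motion vector of the spatial neighboring block at the second preset position is not available, the third preset motion vector is used as the time domain offset vector.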
- the third preset motion vector is a zero motion vector.
- This embodiment adopts a zero motion vector as a backup solution when the motion vector of the spatial neighboring block at the preset position is not available, which reduces the complexity of implementation.
- The motion vector of the spatial neighboring block at the second preset position includes a first-direction motion vector based on the first reference frame list, and the reference frame of the spatial neighboring block at the second preset position includes a first-direction reference frame corresponding to the first-direction motion vector.
- Using the motion vector of the spatial neighboring block at the second preset position as the time domain offset vector includes: when the first-direction reference frame is the same as the image frame in which the corresponding sub-block is located, using the first-direction motion vector as the time domain offset vector.
- When the first-direction reference frame and the image frame in which the corresponding sub-block is located are different, the method includes: using the third preset motion vector as the time domain offset vector.
- The motion vector of the spatial neighboring block at the second preset position further includes a second-direction motion vector based on the second reference frame list, and the reference frame of the spatial neighboring block at the second preset position includes a second-direction reference frame corresponding to the second-direction motion vector. When the first-direction reference frame and the image frame in which the corresponding sub-block is located are different, the method includes: when the second-direction reference frame is the same as the image frame in which the corresponding sub-block is located, using the second-direction motion vector as the time domain offset vector; when the second-direction reference frame and the image frame in which the corresponding sub-block is located are different, using the third preset motion vector as the time domain offset vector.
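A minimal sketch of this single-position variant follows. It assumes, purely for illustration, that frames are identified by their picture order count, so "the same image frame" becomes an integer comparison; the `NeighborMotion` class, its field names, and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MV = Tuple[int, int]
ZERO_MV: MV = (0, 0)   # stands in for the "third preset motion vector"


@dataclass
class NeighborMotion:
    """Hypothetical motion data of the neighbor at the second preset position."""
    mv_l0: Optional[MV] = None        # first-direction motion vector
    ref_poc_l0: Optional[int] = None  # first-direction reference frame (as POC)
    mv_l1: Optional[MV] = None        # second-direction motion vector
    ref_poc_l1: Optional[int] = None  # second-direction reference frame (as POC)


def offset_from_single_neighbor(nb: Optional[NeighborMotion], col_poc: int) -> MV:
    """Pick the offset vector; `col_poc` identifies the frame that holds
    the corresponding sub-blocks."""
    if nb is None:                                   # neighbor motion unavailable
        return ZERO_MV
    if nb.mv_l0 is not None and nb.ref_poc_l0 == col_poc:
        return nb.mv_l0                              # first direction matches
    if nb.mv_l1 is not None and nb.ref_poc_l1 == col_poc:
        return nb.mv_l1                              # second direction matches
    return ZERO_MV                                   # neither direction matches


nb = NeighborMotion(mv_l0=(2, 3), ref_poc_l0=8, mv_l1=(-1, 0), ref_poc_l1=16)
offset = offset_from_single_neighbor(nb, col_poc=16)
```

Only one neighbor is examined, so the cost is at most two reference-frame comparisons per block.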
- The motion vector of the spatial neighboring block at the second preset position includes a first-direction motion vector based on the first reference frame list and a second-direction motion vector based on the second reference frame list.
- The reference frame of the spatial neighboring block at the second preset position includes a first-direction reference frame corresponding to the first-direction motion vector and a second-direction reference frame corresponding to the second-direction motion vector.
- Using the motion vector of the spatial neighboring block at the second preset position as the time domain offset vector includes: when the image frame in which the corresponding sub-block is located is obtained from the second reference frame list: when the second-direction reference frame is the same as the image frame in which the corresponding sub-block is located, using the second-direction motion vector as the time domain offset vector; when the second-direction reference frame differs from the image frame in which the corresponding sub-block is located and the first-direction reference frame is the same as that image frame, using the first-direction motion vector as the time domain offset vector; when the image frame in which the corresponding sub-block is located is obtained from the first reference frame list: when the first-direction reference frame is the same as the image frame in which the corresponding sub-block is located, using the first-direction motion vector as the time domain offset vector; when the first-direction reference frame and the corresponding sub-block are located in
- Using the motion vector of the spatial neighboring block at the second preset position as the time domain offset vector includes: when the image frame in which the corresponding sub-block is located is obtained from the second reference frame list and the display order of all reference frames in the reference frame list of the block to be processed is before the image frame in which the block to be processed is located: when the second-direction reference frame is the same as the image frame in which the corresponding sub-block is located, using the second-direction motion vector as the time domain offset vector; when the second-direction reference frame differs from the image frame in which the corresponding sub-block is located and the first-direction reference frame is the same as that image frame, using the first-direction motion vector as the time domain offset vector; when the image frame in which the corresponding sub-block is located is obtained from the first reference frame list, or the display order of at least one reference frame in the reference frame list of the block to be processed is after the
- the third preset motion vector is used as the time domain offset vector.
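The list-dependent checking order can be sketched as below. The source text for the first-list branch is cut off, so this example assumes that branch mirrors the second-list branch (first direction checked first, then second direction); that mirroring, the POC-based frame identity, and every name in the code are assumptions for illustration only.

```python
from typing import Optional, Tuple

MV = Tuple[int, int]
ZERO_MV: MV = (0, 0)   # stands in for the "third preset motion vector"


def offset_by_collocated_list(
    mv_l0: Optional[MV], ref_poc_l0: Optional[int],
    mv_l1: Optional[MV], ref_poc_l1: Optional[int],
    col_poc: int, col_from_list1: bool,
) -> MV:
    """Check the direction tied to the collocated frame's list first.

    `col_poc` identifies the frame holding the corresponding sub-blocks and
    `col_from_list1` says whether that frame came from the second list.
    """
    first = (mv_l0, ref_poc_l0)
    second = (mv_l1, ref_poc_l1)
    # Assumed mirroring: list-1 collocated frame -> try second direction first.
    order = (second, first) if col_from_list1 else (first, second)
    for mv, ref_poc in order:
        if mv is not None and ref_poc == col_poc:
            return mv
    return ZERO_MV       # neither reference frame is the collocated frame


# Both directions point at the collocated frame; the list flag breaks the tie.
a = offset_by_collocated_list((1, 1), 8, (2, 2), 8, col_poc=8, col_from_list1=True)
b = offset_by_collocated_list((1, 1), 8, (2, 2), 8, col_poc=8, col_from_list1=False)
```

The checking order only matters when both directions reference the collocated frame; otherwise at most one direction can match.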
- The index, in the reference frame list of the spatial neighboring block of the block to be processed, of the image frame in which the corresponding sub-block is located is obtained by parsing the bitstream.
- This embodiment enables multiple possibilities for selecting the image frame where the corresponding sub-block is located, and improves the coding performance.
- The condition that the motion vector of a spatial neighboring block is unavailable includes one or a combination of the following: the spatial neighboring block has not been encoded/decoded; the spatial neighboring block uses intra prediction or intra block copy mode; the spatial neighboring block does not exist; or the spatial neighboring block and the block to be processed are located in different coding regions.
- The coding region includes: an image, a slice, a tile, or a tile group.
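The unavailability conditions listed above map naturally onto a small predicate. The sketch below is illustrative only: the `NeighborBlock` class, the `PredMode` enum, and the integer `region_id` used to compare coding regions are assumptions, not structures defined in the source.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class PredMode(Enum):
    INTER = auto()
    INTRA = auto()
    IBC = auto()       # intra block copy


@dataclass
class NeighborBlock:
    """Hypothetical description of a spatial neighboring block."""
    is_coded: bool     # already encoded/decoded?
    mode: PredMode     # prediction mode of the neighbor
    region_id: int     # id of the coding region containing the neighbor


def neighbor_mv_available(nb: Optional[NeighborBlock], current_region_id: int) -> bool:
    """Return True only if none of the listed unavailability conditions holds."""
    if nb is None:                                   # neighbor does not exist
        return False
    if not nb.is_coded:                              # not yet encoded/decoded
        return False
    if nb.mode in (PredMode.INTRA, PredMode.IBC):    # carries no inter motion
        return False
    if nb.region_id != current_region_id:            # different coding region
        return False
    return True


ok = neighbor_mv_available(NeighborBlock(True, PredMode.INTER, region_id=0), 0)
```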
- Before determining the motion vector of the sub-block of the block to be processed, the method further includes: determining whether the motion vector corresponding to the preset position in the block of the corresponding sub-block is available. Correspondingly, determining the motion vector of the sub-block of the block to be processed includes: when the motion vector corresponding to the preset position in the block is available, obtaining the motion vector of the sub-block of the block to be processed according to the motion vector corresponding to the preset position in the block; when the motion vector corresponding to the preset position in the block is not available, obtaining the motion vector of the sub-block of the block to be processed according to the first preset motion vector.
- The preset position in the block is the geometric center of the corresponding sub-block.
- The geometric center is used as the preset position in the block here; other positions in the block, such as the upper-left vertex of the corresponding sub-block, may also be used as the preset position in the block.
- the motion vector corresponding to the preset position in the block is not available;
- when the prediction unit where the preset position in the block is located uses inter-frame prediction, the motion vector corresponding to the preset position in the block is available.
- the prediction mode is used to determine whether the corresponding sub-block motion vector is available, which further reduces the implementation complexity.
- Obtaining the motion vector of the sub-block of the block to be processed according to the first preset motion vector includes: using the first preset motion vector as the motion vector of the sub-block of the block to be processed.
- the first preset motion vector is a zero motion vector.
- the zero motion vector is used as a backup solution for the motion vector of the sub-block of the block to be processed, which further reduces the implementation complexity.
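The per-sub-block derivation with its zero-vector fallback can be sketched as follows. The example assumes an offset vector in whole-sample units, square sub-blocks, and a caller-supplied lookup `col_motion(x, y)` that returns `None` when the unit covering that position in the collocated frame is not inter-predicted; all of these, and the function name, are illustrative assumptions. Picture-boundary clipping and motion vector scaling are left out.

```python
from typing import Callable, Dict, Optional, Tuple

MV = Tuple[int, int]
ZERO_MV: MV = (0, 0)   # stands in for the "first preset motion vector"


def derive_subblock_mvs(
    x0: int, y0: int, width: int, height: int, sub: int,
    offset: MV,
    col_motion: Callable[[int, int], Optional[MV]],
    fallback: MV = ZERO_MV,
) -> Dict[Tuple[int, int], MV]:
    """Map each sub-block (keyed by its offset inside the block) to a motion vector."""
    mvs: Dict[Tuple[int, int], MV] = {}
    for sy in range(0, height, sub):
        for sx in range(0, width, sub):
            # Geometric center of the sub-block, displaced by the offset vector.
            cx = x0 + sx + sub // 2 + offset[0]
            cy = y0 + sy + sub // 2 + offset[1]
            mv = col_motion(cx, cy)
            # Fall back to the preset vector instead of deriving a default.
            mvs[(sx, sy)] = mv if mv is not None else fallback
    return mvs


def demo_col_motion(x: int, y: int) -> Optional[MV]:
    # Toy motion field: inter-coded on the left, intra-coded on the right.
    return (4, 0) if x < 8 else None


mvs = derive_subblock_mvs(0, 0, 16, 8, 8, (0, 0), demo_col_motion)
```

Each sub-block needs one lookup and one availability test, with no extra derivation when the lookup fails.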
- The motion vector of the sub-block includes a first-direction sub-block motion vector based on the first reference frame list and/or a second-direction sub-block motion vector based on the second reference frame list. When the motion vector corresponding to the preset position in the block is not available, obtaining the motion vector of the sub-block of the block to be processed according to the first preset motion vector includes: determining that the sub-block of the block to be processed adopts unidirectional prediction based on the first-direction sub-block motion vector, and obtaining the first-direction sub-block motion vector of the sub-block of the block to be processed according to the first preset motion vector; or determining that the sub-block of the block to be processed adopts unidirectional prediction based on the second-direction sub-block motion vector, and obtaining the second-direction sub-block motion vector of the sub-block of the block to be processed according to the first preset motion vector.
- Obtaining the motion vector of the sub-block of the block to be processed according to the first preset motion vector includes:
- the prediction type of the coding region where the block to be processed is located is type B prediction
- when the prediction type of the coding region where the block to be processed is located is P-type prediction, determining that the sub-block of the block to be processed adopts unidirectional prediction, and obtaining the first-direction sub-block motion vector of the sub-block of the block to be processed according to the first preset motion vector.
- That the prediction type of the coding region where the block to be processed is located is B-type prediction means that the region where the block to be processed is located is a B-type region.
- For example, the block to be processed is located in a B frame, a B slice, a B tile, a B tile group, and so on; in this case the block to be processed may use bidirectional prediction or unidirectional prediction.
- That the prediction type of the coding region where the block to be processed is located is P-type prediction means that the region where the block to be processed is located is a P-type region.
- For example, the block to be processed is located in a P frame, a P slice, a P tile, a P tile group, and so on; in this case the block to be processed may only use unidirectional prediction.
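One way to picture the region-type-dependent fallback is the sketch below. The P-type branch follows the text (unidirectional, first direction). The B-type branch here fills both directions with the preset vector; the text only states that a B-type region permits bidirectional or unidirectional prediction, so that choice is an assumption of this example, as are the `"L0"`/`"L1"` keys and the function name.

```python
from typing import Dict, Tuple

MV = Tuple[int, int]
ZERO_MV: MV = (0, 0)   # stands in for the "first preset motion vector"


def fallback_subblock_motion(region_type: str, preset: MV = ZERO_MV) -> Dict[str, MV]:
    """Return fallback motion per direction for a sub-block.

    Keys "L0" and "L1" denote the first and second reference frame lists.
    """
    if region_type == "P":
        # P-type region: only unidirectional prediction is allowed.
        return {"L0": preset}
    if region_type == "B":
        # Assumed choice: bidirectional prediction with the preset in both lists.
        return {"L0": preset, "L1": preset}
    raise ValueError(f"unsupported region type: {region_type!r}")


p_motion = fallback_subblock_motion("P")
b_motion = fallback_subblock_motion("B")
```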
- Obtaining the motion vector of the sub-block of the block to be processed according to the motion vector corresponding to the preset position in the block includes: scaling the motion vector corresponding to the preset position in the block based on the ratio of a first time domain distance difference to a second time domain distance difference, to obtain the motion vector of the sub-block of the block to be processed, wherein the first time domain distance difference is the picture order count (POC) difference between the image frame where the block to be processed is located and the reference frame of the block to be processed, and the second time domain distance difference is the POC difference between the image frame where the corresponding sub-block is located and the reference frame of the corresponding sub-block.
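The scaling step amounts to multiplying the collocated motion vector by the ratio of the two picture order count differences. The sketch below uses exact integer arithmetic with round-half-away-from-zero; real codecs typically use a clipped fixed-point approximation instead, so the rounding rule, the handling of a zero denominator, and the function names are assumptions made for illustration.

```python
from typing import Tuple

MV = Tuple[int, int]


def _div_round(n: int, d: int) -> int:
    """Integer division of n by d, rounding halves away from zero (d != 0)."""
    sign = -1 if (n < 0) != (d < 0) else 1
    n, d = abs(n), abs(d)
    return sign * ((2 * n + d) // (2 * d))


def scale_mv(mv: MV, cur_poc: int, cur_ref_poc: int,
             col_poc: int, col_ref_poc: int) -> MV:
    """Scale `mv` by (cur_poc - cur_ref_poc) / (col_poc - col_ref_poc)."""
    tb = cur_poc - cur_ref_poc      # first time domain distance difference
    td = col_poc - col_ref_poc      # second time domain distance difference
    if td == 0:
        return mv                   # assumed: leave the vector unscaled
    return (_div_round(mv[0] * tb, td), _div_round(mv[1] * tb, td))


# The current block is half as far from its reference as the collocated block.
scaled = scale_mv((8, -4), cur_poc=4, cur_ref_poc=3, col_poc=8, col_ref_poc=6)
```

When the two distances have opposite signs, the scaled vector flips direction, which is the expected behavior for references on opposite temporal sides.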
- The index of the reference frame of the block to be processed in the reference frame list of the block to be processed is obtained by parsing the bitstream.
- the selection of the reference frame has multiple possibilities, and the coding performance is improved.
- the index of the reference frame of the to-be-processed block in the reference frame list of the to-be-processed block is 0.
- the bit rate for transmitting related information is saved.
- The method further includes: performing motion compensation on the sub-block of the block to be processed based on the motion vector of the sub-block of the block to be processed and the reference frame of the block to be processed, to obtain the predicted value of the sub-block of the block to be processed.
- This prediction method can serve as one of several possible inter-frame prediction modes, can participate in the construction of a candidate prediction vector list, and can be combined with other prediction modes such as merge and affine prediction to reconstruct the block to be processed.
- An embodiment of the present application provides an inter-frame prediction device, where a block to be processed includes one or more sub-blocks, and the device includes: an offset acquisition module, configured to determine a time domain offset vector of the block to be processed according to spatial neighboring blocks of the block to be processed, where the time domain offset vector is used to determine the corresponding sub-block of a sub-block of the block to be processed; and a motion vector acquisition module, configured to determine the motion vector of the sub-block of the block to be processed according to the motion vector of the corresponding sub-block, wherein, when the motion vector of the corresponding sub-block is not available, the motion vector of the sub-block of the block to be processed is obtained according to a first preset motion vector.
- The offset acquisition module is specifically configured to: sequentially check, in a preset order, whether the motion vectors of the spatial neighboring blocks at a plurality of first preset positions are available, until the first spatial neighboring block in the preset order whose motion vector is available is obtained; and use the motion vector of that spatial neighboring block as the time domain offset vector.
- The offset acquisition module is specifically configured to: when none of the motion vectors of the spatial neighboring blocks at the plurality of first preset positions is available, use the second preset motion vector as the time domain offset vector.
- the second preset motion vector is a zero motion vector.
- The offset acquisition module is specifically configured to: acquire the motion vector and reference frame of the spatial neighboring block at a second preset position, where the motion vector of the spatial neighboring block at the second preset position is available; and use the motion vector of the spatial neighboring block at the second preset position as the time domain offset vector.
- The offset acquisition module is specifically configured to: when the motion vector of the spatial neighboring block at the second preset position is not available, use the third preset motion vector as the time domain offset vector.
- the third preset motion vector is a zero motion vector.
- The motion vector of the spatial neighboring block at the second preset position includes a first-direction motion vector based on the first reference frame list, and the reference frame of the spatial neighboring block at the second preset position includes a first-direction reference frame corresponding to the first-direction motion vector.
- The offset acquisition module is specifically configured to: when the first-direction reference frame is the same as the image frame in which the corresponding sub-block is located, use the first-direction motion vector as the time domain offset vector.
- When the first-direction reference frame and the image frame in which the corresponding sub-block is located are different, the offset acquisition module is specifically configured to: use the third preset motion vector as the time domain offset vector.
- The motion vector of the spatial neighboring block at the second preset position further includes a second-direction motion vector based on the second reference frame list, and the reference frame of the spatial neighboring block at the second preset position includes a second-direction reference frame corresponding to the second-direction motion vector. When the first-direction reference frame and the image frame in which the corresponding sub-block is located are different, the offset acquisition module is specifically configured to: when the second-direction reference frame is the same as the image frame in which the corresponding sub-block is located, use the second-direction motion vector as the time domain offset vector; when the second-direction reference frame and the image frame in which the corresponding sub-block is located are different, use the third preset motion vector as the time domain offset vector.
- the motion vector of the spatial neighboring block at the second preset position includes the motion vector based on the first reference frame The first-direction motion vector of the list and the second-direction motion vector based on the second reference frame list.
- the reference frame of the spatial neighboring block at the second preset position includes the first-direction motion vector corresponding to the first-direction motion vector.
- the offset obtaining module is specifically configured to: when the image frame in which the corresponding sub-block is located is obtained from the second reference frame list: when the second-direction reference frame is the same as the image frame in which the corresponding sub-block is located, use the second-direction motion vector as the time domain offset vector; when the second-direction reference frame differs from the image frame in which the corresponding sub-block is located and the first-direction reference frame is the same as that image frame, use the first-direction motion vector as the time domain offset vector; when the image frame in which the corresponding sub-block is located is obtained from the first reference frame list: when the first-direction reference frame is the same as the image frame in which the corresponding sub-block is located, use the first-direction motion vector as the time domain offset vector; when the first-direction reference frame and the image frame in which the corresponding sub-block is located are different
- the offset obtaining module is specifically configured to: when the image frame in which the corresponding sub-block is located is obtained from the second reference frame list and all reference frames in the reference frame list of the block to be processed precede, in display order, the image frame where the block to be processed is located: when the second-direction reference frame is the same as the image frame where the corresponding sub-block is located, use the second-direction motion vector as the time domain offset vector; when the second-direction reference frame differs from the image frame where the corresponding sub-block is located and the first-direction reference frame is the same as that image frame, use the first-direction motion vector as the time domain offset vector; when the image frame in which the corresponding sub-block is located is obtained from the first reference frame list, or at least one reference frame in the reference frame list of the block to be processed follows, in display order, the image frame where the block to be processed is located: when the first-direction reference frame and the image frame where the corresponding sub-block is
- the offset acquisition module is specifically configured to: use the third preset motion vector as the time domain offset vector.
- the index of the image frame in which the corresponding sub-block is located in the reference frame list of the spatial neighboring block of the block to be processed is obtained by parsing the code stream.
- the condition under which the motion vector of the spatial neighboring block is unavailable includes one or a combination of the following: the spatial neighboring block has not been encoded/decoded; the spatial neighboring block adopts intra prediction or intra block copy mode; the spatial neighboring block does not exist; or the spatial neighboring block and the block to be processed are located in different coding regions.
- the coding region includes: a picture, a slice, a tile, or a tile group.
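- The unavailability conditions listed above can be sketched as a single predicate. This is a minimal illustration only: the `NeighborBlock` type, its field names, and the string labels for prediction modes are assumptions made for this sketch and do not appear in the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NeighborBlock:
    # Hypothetical fields chosen for this sketch.
    is_coded: bool   # True once the block has been encoded/decoded
    pred_mode: str   # "inter", "intra", or "ibc" (intra block copy)
    region_id: int   # id of the coding region that contains the block


def neighbor_mv_available(nb: Optional[NeighborBlock], current_region_id: int) -> bool:
    """Return True only if none of the listed unavailability conditions holds."""
    if nb is None:                        # the neighboring block does not exist
        return False
    if not nb.is_coded:                   # not yet encoded/decoded
        return False
    if nb.pred_mode in ("intra", "ibc"):  # intra prediction or intra block copy
        return False
    if nb.region_id != current_region_id: # located in a different coding region
        return False
    return True
```

- A caller would pass `None` for a neighbor that lies outside the picture, so that the "does not exist" case is handled by the same check.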
- it further includes: a judging module for judging whether the motion vector corresponding to the position in the preset block of the corresponding sub-block is available; correspondingly, the motion vector acquiring module is specifically configured to: when the motion vector corresponding to the position in the preset block is available, obtain the motion vector of the sub-block of the block to be processed according to the motion vector corresponding to the position in the preset block; when the motion vector corresponding to the position in the preset block is not available, obtain the motion vector of the sub-block of the block to be processed according to the first preset motion vector.
- the position within the preset block is the geometric center position of the corresponding sub-block.
- the motion vector corresponding to the position in the preset block is not available;
- when the prediction unit where the position in the preset block is located uses inter-frame prediction, the motion vector corresponding to the position in the preset block is available.
- the motion vector acquiring module is specifically configured to use the first preset motion vector as the motion vector of the sub-block of the block to be processed.
- the first preset motion vector is a zero motion vector.
- the motion vector of the sub-block includes the first-direction sub-block motion vector based on the first reference frame list and/or the second-direction sub-block motion vector based on the second reference frame list; when the motion vector corresponding to the position in the preset block is not available, the motion vector acquisition module is specifically configured to: determine that the sub-block of the block to be processed adopts unidirectional prediction based on the first-direction sub-block motion vector, and obtain the first-direction sub-block motion vector of the sub-block of the block to be processed according to the first preset motion vector; or determine that the sub-block of the block to be processed adopts unidirectional prediction based on the second-direction sub-block motion vector, and obtain the second-direction sub-block motion vector of the sub-block of the block to be processed according to the first preset motion vector.
- the motion vector acquisition module is specifically configured to: when the prediction type of the coding region where the block to be processed is located is B-type prediction, determine that the sub-block of the block to be processed adopts bidirectional prediction, and obtain the first-direction sub-block motion vector and the second-direction sub-block motion vector of the sub-block of the block to be processed respectively according to the first preset motion vector; when the prediction type of the coding region where the block to be processed is located is P-type prediction, determine that the sub-block of the block to be processed adopts unidirectional prediction, and obtain the first-direction sub-block motion vector of the sub-block of the block to be processed according to the first preset motion vector.
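- The fallback for an unavailable motion vector, keyed on the prediction type of the coding region, might be sketched as follows. The function name, the `"L0"`/`"L1"` dictionary keys, and the `"B"`/`"P"` labels are assumptions for illustration; the first preset motion vector defaults to the zero motion vector, as one of the options above states.

```python
from typing import Dict, Optional, Tuple

MV = Tuple[int, int]   # (horizontal, vertical) components
ZERO_MV: MV = (0, 0)


def fallback_subblock_mv(region_type: str, preset_mv: MV = ZERO_MV) -> Dict[str, Optional[MV]]:
    """Sub-block motion vectors used when the collocated motion vector is unavailable.

    "L0" holds the first-direction sub-block motion vector and "L1" the
    second-direction one; None means that direction is not used.
    """
    if region_type == "B":   # B-type prediction: bidirectional
        return {"L0": preset_mv, "L1": preset_mv}
    if region_type == "P":   # P-type prediction: unidirectional, first direction only
        return {"L0": preset_mv, "L1": None}
    raise ValueError(f"unsupported prediction type: {region_type!r}")
```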
- the motion vector acquisition module is specifically configured to: scale the motion vector corresponding to the position in the preset block based on the ratio of a first time domain distance difference to a second time domain distance difference, to obtain the motion vector of the sub-block of the block to be processed, wherein the first time domain distance difference is the picture order count (POC) difference between the image frame where the block to be processed is located and the reference frame of the block to be processed, and the second time domain distance difference is the POC difference between the image frame in which the corresponding sub-block is located and the reference frame of the corresponding sub-block.
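- The scaling by the ratio of the two time domain distances can be illustrated with a short sketch. The function and parameter names are hypothetical, and plain rounding is used here as a simplification; practical codecs typically use fixed-point arithmetic with clipping, which this sketch does not attempt to reproduce. Returning the vector unscaled when the second distance is zero is likewise a choice made for this sketch, not something the source specifies.

```python
from typing import Tuple

MV = Tuple[int, int]


def scale_motion_vector(mv: MV, cur_poc: int, cur_ref_poc: int,
                        col_poc: int, col_ref_poc: int) -> MV:
    """Scale mv by (cur_poc - cur_ref_poc) / (col_poc - col_ref_poc)."""
    td_cur = cur_poc - cur_ref_poc   # first time domain distance difference
    td_col = col_poc - col_ref_poc   # second time domain distance difference
    if td_col == 0:
        # Sketch-only choice: avoid division by zero by leaving mv unscaled.
        return mv
    return (round(mv[0] * td_cur / td_col), round(mv[1] * td_cur / td_col))
```

- For example, with a first distance of 4 and a second distance of 8, the vector (8, -4) is halved to (4, -2).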
- the index of the reference frame of the to-be-processed block in the reference frame list of the to-be-processed block is obtained by parsing a code stream.
- the index of the reference frame of the to-be-processed block in the reference frame list of the to-be-processed block is 0.
- a motion compensation module configured to perform motion compensation on the sub-block of the block to be processed based on the motion vector of the sub-block of the block to be processed and the reference frame of the block to be processed, to obtain the predicted value of the sub-block of the block to be processed.
- an embodiment of the present application provides a video encoder, where the video encoder is used to encode image blocks and includes: the inter-frame prediction device according to the second aspect of the embodiment of the present application, wherein the inter-frame prediction device is used to predict the motion information of the currently encoded image block based on the target candidate motion information, and determine the predicted pixel value of the currently encoded image block based on the motion information of the currently encoded image block;
- an entropy coding module configured to encode an index identifier of the target candidate motion information into a code stream, where the index identifier indicates the target candidate motion information used for the currently encoded image block;
- the reconstruction module is configured to reconstruct the currently coded image block based on the predicted pixel value.
- an embodiment of the present application provides a video decoder, which is used to decode image blocks from a bitstream, and includes: an entropy decoding module, used to decode an index identifier from the bitstream, where the index identifier indicates the target candidate motion information of the currently decoded image block; the inter-frame prediction device according to the second aspect of the embodiment of the present application, used to predict the motion information of the currently decoded image block based on the target candidate motion information indicated by the index identifier, and determine the predicted pixel value of the currently decoded image block based on the motion information of the currently decoded image block;
- the reconstruction module is configured to reconstruct the currently decoded image block based on the predicted pixel value.
- an embodiment of the present application provides an encoding device, including: a non-volatile memory and a processor coupled with each other, where the processor calls the program code stored in the memory to execute part or all of the steps of any method of the first aspect.
- an embodiment of the present application provides a decoding device, including: a non-volatile memory and a processor coupled to each other, where the processor calls the program code stored in the memory to execute part or all of the steps of any method of the first aspect.
- an embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores program code, and the program code includes instructions for part or all of the steps.
- an embodiment of the present application provides a computer program product which, when run on a computer, causes the computer to execute part or all of the steps of any method in the first aspect.
- an embodiment of the present application provides an inter-frame prediction method.
- a block to be processed includes one or more sub-blocks.
- the method includes: acquiring a spatial neighboring block of the block to be processed; and obtaining a time domain offset vector according to the spatial neighboring block, where the time domain offset vector is used to determine the corresponding sub-block of a sub-block of the block to be processed, wherein when the spatial neighboring block has a first-direction reference frame located in the first reference frame list, and the image frame in which the corresponding sub-block is located is the same as the first-direction reference frame, the time domain offset vector is the first-direction motion vector of the spatial neighboring block, and the first-direction motion vector corresponds to the first-direction reference frame.
- when the spatial neighboring block does not have a first-direction reference frame located in the first reference frame list, or the image frame in which the corresponding sub-block is located differs from the first-direction reference frame, the method further includes: when the spatial neighboring block has a second-direction reference frame located in the second reference frame list, and the image frame in which the corresponding sub-block is located is the same as the second-direction reference frame, the time domain offset vector is the second-direction motion vector of the spatial neighboring block, and the second-direction motion vector corresponds to the second-direction reference frame.
- acquiring the spatial neighboring block of the block to be processed includes: checking whether the spatial neighboring block is available; and when the spatial neighboring block is available, acquiring the spatial neighboring block.
- the image frame where the corresponding sub-block is located being the same as the first-direction reference frame means that the POC of the image frame where the corresponding sub-block is located is the same as the POC of the first-direction reference frame.
- the image frame in which the corresponding sub-block is located being the same as the second-direction reference frame means that the POC of the image frame in which the corresponding sub-block is located is the same as the POC of the second-direction reference frame.
- the method further includes: parsing the code stream to obtain index information of the image frame in which the corresponding sub-block is located.
- the method further includes: using an image frame having a preset relationship with the block to be processed as the image frame where the corresponding sub-block is located.
- the preset relationship includes: the image frame where the corresponding sub-block is located is adjacent in decoding order to the image frame where the block to be processed is located, and is decoded earlier than the image frame where the block to be processed is located.
- the preset relationship includes: the image frame in which the corresponding sub-block is located is the reference frame whose reference frame index is 0 in the first-direction reference frame list or the second-direction reference frame list of the block to be processed.
- when the spatial neighboring block does not have a second-direction reference frame located in the second reference frame list, or the image frame in which the corresponding sub-block is located differs from the second-direction reference frame, the method further includes: using a zero motion vector as the time domain offset vector.
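- The derivation of the time domain offset vector described in this method (first direction, then second direction, then the zero motion vector) can be sketched as below. The `SpatialNeighbor` type and its field names are assumptions for this sketch; "same frame" is tested by comparing POC values, as the method states.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MV = Tuple[int, int]


@dataclass
class SpatialNeighbor:
    # Hypothetical container: a direction is absent when its fields are None.
    mv_l0: Optional[MV] = None        # first-direction motion vector
    ref_poc_l0: Optional[int] = None  # POC of the first-direction reference frame
    mv_l1: Optional[MV] = None        # second-direction motion vector
    ref_poc_l1: Optional[int] = None  # POC of the second-direction reference frame


def temporal_offset_vector(nb: Optional[SpatialNeighbor], col_poc: int) -> MV:
    """col_poc is the POC of the image frame in which the corresponding sub-block is located."""
    if nb is None:                                         # neighbor not available
        return (0, 0)
    if nb.mv_l0 is not None and nb.ref_poc_l0 == col_poc:  # first direction matches
        return nb.mv_l0
    if nb.mv_l1 is not None and nb.ref_poc_l1 == col_poc:  # second direction matches
        return nb.mv_l1
    return (0, 0)                                          # zero motion vector fallback
```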
- an embodiment of the present application provides a video encoding and decoding device, including: a non-volatile memory and a processor coupled with each other, where the processor calls the program code stored in the memory to execute the method described in the aspect.
- FIG. 1A is a block diagram of an example of a video encoding and decoding system 10 used to implement an embodiment of the present invention.
- FIG. 1B is a block diagram of an example of a video decoding system 40 used to implement an embodiment of the present invention.
- FIG. 2 is a block diagram of an example structure of an encoder 20 used to implement an embodiment of the present invention.
- FIG. 3 is a block diagram of an example structure of a decoder 30 used to implement an embodiment of the present invention.
- FIG. 4 is a block diagram of an example of a video decoding device 400 used to implement an embodiment of the present invention.
- FIG. 5 is a block diagram of another example of an encoding device or a decoding device used to implement an embodiment of the present invention.
- FIG. 6 is an exemplary schematic diagram of spatial neighboring blocks and time domain reference blocks used to implement an embodiment of the present invention.
- FIG. 7 is an exemplary schematic diagram of an AMVP prediction mode used to implement an embodiment of the present invention.
- FIG. 8 is an exemplary schematic diagram of sub-blocks used to implement an embodiment of the present invention.
- FIG. 9 is an exemplary flowchart of an inter-frame prediction method used to implement an embodiment of the present invention.
- FIG. 10 is an exemplary schematic diagram of motion vector scaling processing used to implement an embodiment of the present invention.
- FIG. 11 is an exemplary schematic diagram of sub-blocks of a block to be processed and their corresponding sub-blocks, used to implement an embodiment of the present invention.
- FIG. 12 is an exemplary flowchart of another inter-frame prediction method used to implement an embodiment of the present invention.
- FIG. 13 is an exemplary block diagram of an inter-frame prediction apparatus used to implement an embodiment of the present invention.
- the corresponding device may include one or more units, such as functional units, to perform the described one or more method steps (for example, one unit that performs the one or more steps, or multiple units, each of which performs one or more of the multiple steps), even if such one or more units are not explicitly described or illustrated in the drawings.
- the corresponding method may include one step to perform the functionality of one or more units (for example, one step that performs the functionality of the one or more units, or multiple steps, each of which performs the functionality of one or more of the multiple units), even if such one or more steps are not explicitly described or illustrated in the drawings.
- the technical solutions involved in the embodiments of the present invention may not only be applied to existing video coding standards (such as H.264, HEVC and other standards), but may also be applied to future video coding standards (such as H.266 standard).
- the terms used in the embodiment of the present invention are only used to explain specific embodiments of the present invention, and are not intended to limit the present invention. The following briefly introduces some concepts that may be involved in the embodiments of the present invention.
- Video coding generally refers to processing a sequence of pictures that form a video or video sequence.
- the terms "picture", "frame", and "image" can be used as synonyms.
- Video coding as used herein means video encoding or video decoding.
- Video encoding is performed on the source side and usually includes processing (for example, by compressing) the original video picture to reduce the amount of data required to represent the video picture, so as to store and/or transmit more efficiently.
- Video decoding is performed on the destination side and usually involves inverse processing relative to the encoder to reconstruct the video picture.
- the "coding" of video pictures involved in the embodiments should be understood as involving the "encoding" or "decoding" of a video sequence.
- the combination of the encoding part and the decoding part is also called codec (encoding and decoding).
- a video sequence includes a series of pictures, the pictures are further divided into slices, and the slices are divided into blocks.
- Video coding is performed in units of blocks.
- the concept of blocks is further expanded.
- MB macroblock
- the macroblock can be further divided into multiple prediction blocks (partitions) that can be used for predictive coding.
- HEVC high-efficiency video coding
- basic concepts such as coding unit (CU), prediction unit (PU), and transform unit (TU) are used.
- a variety of block units are divided, and a new tree-based structure is used for description.
- the CU can be divided into smaller CUs according to the quadtree, and the smaller CUs can be further divided to form a quadtree structure.
- the CU is a basic unit for dividing and encoding the coded image.
- PU can correspond to prediction block and is the basic unit of prediction coding.
- the CU is further divided into multiple PUs according to the division mode.
- the TU can correspond to the transform block and is the basic unit for transforming the prediction residual.
- whether CU, PU, or TU, each is in essence a block (or image block).
- a CTU is split into multiple CUs by using a quadtree structure represented as a coding tree.
- a decision is made at the CU level whether to use inter-picture (temporal) or intra-picture (spatial) prediction to encode picture regions.
- Each CU can be further split into one, two or four PUs according to the PU split type.
- the same prediction process is applied in a PU, and relevant information is transmitted to the decoder on the basis of the PU.
- the CU may be divided into transform units (TUs) according to another quadtree structure similar to the coding tree used for the CU.
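- The recursive quadtree split of a CTU into CUs described above can be sketched as follows. The split decision is passed in as a callback because the source does not specify how an encoder makes that choice (in practice it is typically a rate-distortion decision); all names and the `(x, y, size)` block representation are assumptions for this sketch.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int]  # (x, y, size) of a square block


def quadtree_split(x: int, y: int, size: int, min_size: int,
                   should_split: Callable[[int, int, int], bool]) -> List[Block]:
    """Return the leaf blocks (CUs) obtained by recursively splitting a square block."""
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]             # leaf: this block becomes one CU
    half = size // 2
    leaves: List[Block] = []
    for dy in (0, half):                  # visit the four quadrants in raster order
        for dx in (0, half):
            leaves.extend(quadtree_split(x + dx, y + dy, half, min_size, should_split))
    return leaves
```

- For instance, splitting a 64x64 block with a rule that only refines the top-left corner down to 16x16 yields four 16x16 leaves and three 32x32 leaves, which together still cover the whole 64x64 area.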
- a quad-tree and binary tree (QTBT) partitioning frame is used to divide coding blocks.
- the CU may have a square or rectangular shape.
- the image block to be encoded in the currently encoded image may be referred to as the current block.
- a reference block is a block that provides a reference signal for the current block, where the reference signal represents the pixel value in the image block.
- the block in the reference image that provides the prediction signal for the current block may be a prediction block, where the prediction signal represents the pixel value or sample value or sample signal in the prediction block. For example, after traversing multiple reference blocks, the best reference block is found. This best reference block will provide prediction for the current block, and this block is called a prediction block.
- the original video picture can be reconstructed, that is, the reconstructed video picture has the same quality as the original video picture (assuming no transmission loss or other data loss during storage or transmission).
- quantization is performed for further compression to reduce the amount of data required to represent the video picture, and the decoder side cannot completely reconstruct the video picture; that is, the quality of the reconstructed video picture is lower or worse than that of the original video picture.
- Several video coding standards since H.261 belong to "lossy hybrid video coding and decoding" (that is, combining spatial and temporal prediction in the sample domain with 2D transform coding for applying quantization in the transform domain).
- Each picture of a video sequence is usually divided into a set of non-overlapping blocks, and is usually coded at the block level.
- the encoder side usually processes the video at the block (video block) level, that is, encodes the video.
- the prediction block is generated by spatial (intra-picture) prediction or temporal (inter-picture) prediction; the prediction block is subtracted from the current block (the block currently processed or to be processed) to obtain the residual block; the residual block is transformed and quantized in the transform domain to reduce the amount of data to be transmitted (compressed); and the decoder side applies the inverse processing relative to the encoder to the encoded or compressed block to reconstruct the current block for representation.
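- The predict, subtract, quantize, and reconstruct loop described above can be illustrated in a few lines. This is a deliberately reduced sketch: the transform stage is omitted, blocks are flat lists of sample values, and the uniform quantizer with step size `step` is an assumption made here, not the quantizer of any particular standard.

```python
from typing import List


def encode_block(current: List[int], prediction: List[int], step: int) -> List[int]:
    """Subtract the prediction and quantize the residual (transform omitted)."""
    residual = [c - p for c, p in zip(current, prediction)]
    return [round(r / step) for r in residual]       # quantized levels


def reconstruct_block(levels: List[int], prediction: List[int], step: int) -> List[int]:
    """Dequantize the levels and add the prediction back."""
    return [p + q * step for p, q in zip(prediction, levels)]
```

- Because the encoder runs the same `reconstruct_block` step as the decoder, both sides hold identical reconstructed samples for predicting subsequent blocks. With `step` equal to 1 the reconstruction is exact; larger steps trade accuracy for fewer distinct levels, with a per-sample error of at most half a step.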
- the encoder duplicates the decoder processing loop, so that the encoder and the decoder generate the same prediction (for example, intra prediction and inter prediction) and/or reconstruction for processing, that is, to encode subsequent blocks.
- FIG. 1A exemplarily shows a schematic block diagram of a video encoding and decoding system 10 to which an embodiment of the present invention is applied.
- the video encoding and decoding system 10 may include a source device 12 and a destination device 14.
- the source device 12 generates encoded video data. Therefore, the source device 12 may be referred to as a video encoding device.
- the destination device 14 can decode the encoded video data generated by the source device 12, and therefore, the destination device 14 can be referred to as a video decoding device.
- Various implementations of source device 12, destination device 14, or both may include one or more processors and memory coupled to the one or more processors.
- the memory may include, but is not limited to, RAM, ROM, EEPROM, flash memory, or any other medium that can be used to store desired program codes in the form of instructions or data structures accessible by a computer, as described herein.
- the source device 12 and the destination device 14 may include various devices, including desktop computers, mobile computing devices, notebook (for example, laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, computers, televisions, cameras, display devices, digital media players, video game consoles, on-board computers, wireless communication equipment, or the like.
- although FIG. 1A shows the source device 12 and the destination device 14 as separate devices, a device embodiment may also include both the source device 12 and the destination device 14, or the functionality of both, that is, the source device 12 or corresponding functionality and the destination device 14 or corresponding functionality.
- the same hardware and/or software, separate hardware and/or software, or any combination thereof may be used to implement the source device 12 or the corresponding functionality and the destination device 14 or the corresponding functionality.
- the source device 12 and the destination device 14 may communicate with each other via a link 13, and the destination device 14 may receive encoded video data from the source device 12 via the link 13.
- Link 13 may include one or more media or devices capable of moving encoded video data from source device 12 to destination device 14.
- link 13 may include one or more communication media that enable source device 12 to transmit encoded video data directly to destination device 14 in real time.
- the source device 12 may modulate the encoded video data according to a communication standard, such as a wireless communication protocol, and may transmit the modulated video data to the destination device 14.
- the one or more communication media may include wireless and/or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
- the one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (e.g., the Internet).
- the one or more communication media may include routers, switches, base stations, or other devices that facilitate communication from source device 12 to destination device 14.
- the source device 12 includes an encoder 20, and optionally, the source device 12 may also include a picture source 16, a picture preprocessor 18, and a communication interface 22.
- the encoder 20, the picture source 16, the picture preprocessor 18, and the communication interface 22 may be hardware components in the source device 12, or may be software programs in the source device 12. They are described as follows:
- the picture source 16 may include or be any type of picture capture device, for example for capturing real-world pictures, and/or any type of device for generating pictures or comments (for screen content coding, some text on the screen is also considered part of the picture or image to be encoded), for example a computer graphics processor for generating computer-animated pictures, or any type of device for obtaining and/or providing real-world pictures or computer-animated pictures (for example, screen content or virtual reality (VR) pictures), and/or any combination thereof (for example, augmented reality (AR) pictures).
- the picture source 16 may be a camera for capturing pictures or a memory for storing pictures.
- the picture source 16 may also include any type of (internal or external) interface for storing previously captured or generated pictures and/or acquiring or receiving pictures.
- when the picture source 16 is a camera, the picture source 16 may be, for example, a local camera or a camera integrated in the source device; when the picture source 16 is a memory, the picture source 16 may be, for example, a local memory or a memory integrated in the source device.
- when the picture source 16 includes an interface, the interface may be, for example, an external interface for receiving pictures from an external video source.
- the external video source is, for example, an external picture capturing device such as a camera, an external memory, or an external picture generating device such as an external computer graphics processor, computer, or server.
- the interface can be any type of interface according to any proprietary or standardized interface protocol, such as a wired or wireless interface, or an optical interface.
- a picture can be regarded as a two-dimensional array or matrix of picture elements.
- the pixel points in the array can also be called sampling points.
- the number of sampling points of the array or picture in the horizontal and vertical directions (or axis) defines the size and/or resolution of the picture.
- three color components are usually used, that is, pictures can be represented as or contain three sample arrays.
- a picture includes corresponding red, green, and blue sample arrays.
- each pixel is usually expressed in a luminance/chrominance format or color space.
- a picture in the YUV format includes the luminance component indicated by Y (sometimes indicated by L) and the two chrominance components indicated by U and V.
- the luma component Y represents brightness or gray level intensity (for example, the two are the same in a grayscale picture), and the two chroma components U and V represent chroma or color information components.
- a picture in the YUV format includes a luminance sample array of luminance sample values (Y), and two chrominance sample arrays of chrominance values (U and V).
- Pictures in RGB format can be converted or transformed to YUV format, and vice versa. This process is also called color transformation or conversion. If the picture is black and white, the picture may only include the luminance sample array.
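- As an illustration of the color conversion mentioned above, the sketch below converts one pixel between RGB and YUV. The source does not name a conversion matrix; the coefficients used here are the commonly cited ITU-R BT.601 analog YUV weights, chosen as an assumption for this example, and the function names are hypothetical.

```python
from typing import Tuple


def rgb_to_yuv(r: float, g: float, b: float) -> Tuple[float, float, float]:
    """Convert one RGB pixel to YUV using BT.601 luma weights (assumed here)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    u = 0.492 * (b - y)                     # blue-difference chrominance
    v = 0.877 * (r - y)                     # red-difference chrominance
    return y, u, v


def yuv_to_rgb(y: float, u: float, v: float) -> Tuple[float, float, float]:
    """Inverse of rgb_to_yuv, obtained by solving the same three equations."""
    r = y + v / 0.877
    b = y + u / 0.492
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b
```

- A gray pixel (equal R, G, and B) maps to chrominance values of zero, which is consistent with the statement that a black-and-white picture may only include the luminance sample array.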
- the picture transmitted from the picture source 16 to the picture processor may also be referred to as original picture data 17.
- the picture preprocessor 18 is configured to receive the original picture data 17 and perform preprocessing on the original picture data 17 to obtain the preprocessed picture 19 or the preprocessed picture data 19.
- the pre-processing performed by the picture pre-processor 18 may include trimming, color format conversion (for example, conversion from RGB format to YUV format), toning, or denoising.
- the encoder 20 (or video encoder 20) is configured to receive the pre-processed picture data 19, and process the pre-processed picture data 19 using a relevant prediction mode (such as the prediction mode in the various embodiments herein), thereby providing the encoded picture data 21 (the structural details of the encoder 20 are described further below based on FIG. 2, FIG. 4, or FIG. 5).
- the encoder 20 may be used to implement the various embodiments described below to realize the application of the chrominance block prediction method described in the present invention on the encoding side.
- the communication interface 22 can be used to receive the encoded picture data 21, and can transmit the encoded picture data 21 to the destination device 14 or any other device (such as a memory) via the link 13 for storage or direct reconstruction, where the other device can be any device used for decoding or storage.
- the communication interface 22 may be used, for example, to encapsulate the encoded picture data 21 into a suitable format, such as a data packet, for transmission on the link 13.
- the destination device 14 includes a decoder 30, and optionally, the destination device 14 may also include a communication interface 28, a picture post processor 32, and a display device 34. They are described as follows:
- the communication interface 28 may be used to receive the encoded picture data 21 from the source device 12 or any other source, for example, a storage device, and the storage device is, for example, an encoded picture data storage device.
- the communication interface 28 can be used to transmit or receive the encoded picture data 21 via the link 13 between the source device 12 and the destination device 14 or via any type of network.
- the link 13 is, for example, a direct wired or wireless connection.
- the type of network is, for example, a wired or wireless network or any combination thereof, or any type of private network and public network, or any combination thereof.
- the communication interface 28 may be used, for example, to decapsulate the data packet transmitted by the communication interface 22 to obtain the encoded picture data 21.
- Both the communication interface 28 and the communication interface 22 can be configured as a one-way communication interface or a two-way communication interface, and can be used, for example, to send and receive messages to establish a connection, and to acknowledge and exchange any other information related to the communication link and/or to data transmission, such as the transmission of encoded picture data.
- the decoder 30 (or video decoder 30) is used to receive the encoded picture data 21 and provide the decoded picture data 31 or the decoded picture 31 (the structural details of the decoder 30 are described further below based on FIG. 3, FIG. 4, or FIG. 5).
- the decoder 30 may be used to implement the various embodiments described below to realize the application of the chrominance block prediction method described in the present invention on the decoding side.
- the picture post processor 32 is configured to perform post-processing on the decoded picture data 31 (also referred to as reconstructed picture data) to obtain post-processed picture data 33.
- the post-processing performed by the picture post-processor 32 may include color format conversion (for example, conversion from YUV format to RGB format), color correction, trimming, resampling, or any other processing. The picture post-processor 32 can also be used to transmit the post-processed picture data 33 to the display device 34.
- the display device 34 is configured to receive the post-processed image data 33 to display the image to, for example, users or viewers.
- the display device 34 may be or may include any type of display for presenting reconstructed pictures, for example, an integrated or external display or monitor.
- the display may include a liquid crystal display (LCD), an organic light emitting diode (OLED) display, a plasma display, a projector, a micro LED display, a liquid crystal on silicon (LCoS) display, a digital light processor (DLP), or any other type of display.
- Although FIG. 1A shows the source device 12 and the destination device 14 as separate devices, a device embodiment may also include both the source device 12 and the destination device 14 or the functionality of both, that is, the source device 12 or corresponding functionality and the destination device 14 or corresponding functionality.
- In such an embodiment, the same hardware and/or software, separate hardware and/or software, or any combination thereof may be used to implement the source device 12 or the corresponding functionality and the destination device 14 or the corresponding functionality.
- the source device 12 and the destination device 14 may include any of a variety of devices, including any type of handheld or stationary device, for example, a notebook or laptop computer, mobile phone, smartphone, tablet computer, video camera, desktop computer, set-top box, television, camera, in-vehicle device, display device, digital media player, video game console, video streaming device (such as a content service server or content distribution server), broadcast receiver device, broadcast transmitter device, and so on, and may use no operating system or any type of operating system.
- Both the encoder 20 and the decoder 30 can be implemented as any of various suitable circuits, for example, one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof.
- the device can store the instructions of the software in a suitable non-transitory computer-readable storage medium, and can use one or more processors to execute the instructions in hardware to carry out the technology of the present disclosure. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) can be regarded as one or more processors.
- the video encoding and decoding system 10 shown in FIG. 1A is only an example, and the technology of this application can be applied to video coding settings that do not necessarily include any data communication between encoding and decoding devices (for example, video encoding or video decoding).
- the data can be retrieved from local storage, streamed on the network, etc.
- the video encoding device can encode data and store the data to the memory, and/or the video decoding device can retrieve the data from the memory and decode the data.
- encoding and decoding are performed by devices that do not communicate with each other but only encode data to the memory and/or retrieve data from the memory and decode the data.
- FIG. 1B is an explanatory diagram of an example of a video coding system 40 including the encoder 20 of FIG. 2 and/or the decoder 30 of FIG. 3 according to an exemplary embodiment.
- the video coding system 40 can implement a combination of various technologies in the embodiments of the present invention.
- the video coding system 40 may include an imaging device 41, an encoder 20, a decoder 30 (and/or a video encoder/decoder implemented by the logic circuit 47 of the processing unit 46), an antenna 42, one or more processors 43, one or more memories 44, and/or a display device 45.
- the imaging device 41, the antenna 42, the processing unit 46, the logic circuit 47, the encoder 20, the decoder 30, the processor 43, the memory 44, and/or the display device 45 can communicate with each other.
- Although the encoder 20 and the decoder 30 are both used to illustrate the video coding system 40, in different examples the video coding system 40 may include only the encoder 20 or only the decoder 30.
- antenna 42 may be used to transmit or receive an encoded bitstream of video data.
- the display device 45 may be used to present video data.
- the logic circuit 47 may be implemented by the processing unit 46.
- the processing unit 46 may include application-specific integrated circuit (ASIC) logic, a graphics processor, a general-purpose processor, and so on.
- the video coding system 40 may also include an optional processor 43, and the optional processor 43 may similarly include application-specific integrated circuit (ASIC) logic, a graphics processor, a general-purpose processor, and the like.
- the logic circuit 47 may be implemented by hardware, such as dedicated hardware for video encoding, and the processor 43 may be implemented by general software, an operating system, and the like.
- the memory 44 may be any type of memory, such as volatile memory (for example, static random access memory (SRAM) or dynamic random access memory (DRAM)) or non-volatile memory (for example, flash memory).
- the memory 44 may be implemented by cache memory.
- the logic circuit 47 may access the memory 44 (e.g., to implement an image buffer).
- the logic circuit 47 and/or the processing unit 46 may include memory (e.g., a cache, etc.) for implementing image buffers and the like.
- the encoder 20 implemented by logic circuits may include an image buffer (e.g., implemented by the processing unit 46 or the memory 44) and a graphics processing unit (e.g., implemented by the processing unit 46).
- the graphics processing unit may be communicatively coupled to the image buffer.
- the graphics processing unit may include an encoder 20 implemented by a logic circuit 47 to implement various modules discussed with reference to FIG. 2 and/or any other encoder system or subsystem described herein.
- Logic circuits can be used to perform the various operations discussed herein.
- decoder 30 may be implemented by logic circuit 47 in a similar manner to implement the various modules discussed with reference to decoder 30 of FIG. 3 and/or any other decoder systems or subsystems described herein.
- the decoder 30 implemented by logic circuits may include an image buffer (e.g., implemented by the processing unit 46 or the memory 44) and a graphics processing unit (e.g., implemented by the processing unit 46).
- the graphics processing unit may be communicatively coupled to the image buffer.
- the graphics processing unit may include a decoder 30 implemented by a logic circuit 47 to implement the various modules discussed with reference to FIG. 3 and/or any other decoder systems or subsystems described herein.
- antenna 42 may be used to receive an encoded bitstream of video data.
- the encoded bitstream may include data, indicators, index values, mode selection data, etc., related to the encoded video frames discussed herein, such as data related to coding partitions (for example, transform coefficients or quantized transform coefficients, optional indicators (as discussed), and/or data defining the coding partitions).
- the video coding system 40 may also include a decoder 30 coupled to the antenna 42 and used to decode the encoded bitstream.
- the display device 45 is used to present video frames.
- the decoder 30 may be used to perform the reverse process.
- the decoder 30 can be used to receive and parse such syntax elements, and decode related video data accordingly.
- the encoder 20 may entropy encode the syntax elements into an encoded video bitstream. In such instances, the decoder 30 can parse such syntax elements and decode related video data accordingly.
- the motion vector prediction method described in the embodiment of the present invention is mainly used for the inter-frame prediction process. This process exists in both the encoder 20 and the decoder 30.
- the encoder 20 and the decoder 30 in the embodiment of the present invention can be, for example, an encoder/decoder corresponding to video standard protocols such as H.263, H.264, HEVC, MPEG-2, MPEG-4, VP8, VP9, or next-generation video standard protocols (such as H.266, etc.).
- Fig. 2 shows a schematic/conceptual block diagram of an example of an encoder 20 for implementing an embodiment of the present invention.
- the encoder 20 includes a residual calculation unit 204, a transformation processing unit 206, a quantization unit 208, an inverse quantization unit 210, an inverse transformation processing unit 212, a reconstruction unit 214, a buffer 216, a loop filter unit 220, a decoded picture buffer (DPB) 230, a prediction processing unit 260, and an entropy coding unit 270.
- the prediction processing unit 260 may include an inter prediction unit 244, an intra prediction unit 254, and a mode selection unit 262.
- the inter prediction unit 244 may include a motion estimation unit and a motion compensation unit (not shown).
- the encoder 20 shown in FIG. 2 may also be referred to as a hybrid video encoder or a video encoder according to a hybrid video codec.
- for example, the residual calculation unit 204, the transform processing unit 206, the quantization unit 208, the prediction processing unit 260, and the entropy encoding unit 270 form the forward signal path of the encoder 20, whereas, for example, the inverse quantization unit 210, the inverse transform processing unit 212, the reconstruction unit 214, the buffer 216, the loop filter 220, the decoded picture buffer (DPB) 230, and the prediction processing unit 260 form the backward signal path of the encoder, wherein the backward signal path of the encoder corresponds to the signal path of the decoder (see decoder 30 in FIG. 3).
- the encoder 20 receives the picture 201 or the image block 203 of the picture 201 through, for example, an input 202, for example, a picture in a picture sequence that forms a video or a video sequence.
- the image block 203 may also be called the current picture block or the picture block to be encoded
- the picture 201 may be called the current picture or the picture to be encoded (especially in video encoding, to distinguish the current picture from other pictures, for example, previously encoded and/or decoded pictures of the same video sequence, that is, the video sequence that also includes the current picture).
- the embodiment of the encoder 20 may include a segmentation unit (not shown in FIG. 2) for segmenting the picture 201 into a plurality of blocks such as the image block 203, usually into a plurality of non-overlapping blocks.
- the segmentation unit can be configured to use the same block size, and the corresponding grid defining the block size, for all pictures in the video sequence, or to change the block size between pictures, subsets of pictures, or groups of pictures, and to divide each picture into the corresponding blocks.
- the prediction processing unit 260 of the encoder 20 may be used to perform any combination of the aforementioned segmentation techniques.
- the image block 203 is also or can be regarded as a two-dimensional array or matrix of sampling points with sample values, although its size is smaller than that of the picture 201.
- the image block 203 may include, for example, one sampling array (for example, a luminance array in the case of a black-and-white picture 201), three sampling arrays (for example, one luminance array and two chrominance arrays in the case of a color picture), or any other number and/or type of arrays depending on the color format applied.
- the number of sampling points in the horizontal and vertical directions (or axes) of the image block 203 defines the size of the image block 203.
- the encoder 20 shown in FIG. 2 is used to encode the picture 201 block by block, for example, to perform encoding and prediction on each image block 203.
- the residual calculation unit 204 is configured to calculate the residual block 205 based on the picture image block 203 and the prediction block 265 (other details of the prediction block 265 are provided below), for example, by subtracting the sample values of the prediction block 265 from the sample values of the picture image block 203 sample by sample (pixel by pixel) to obtain the residual block 205 in the sample domain.
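The sample-by-sample subtraction performed by the residual calculation unit can be sketched as follows. Blocks are modeled as lists of rows of integers, and the function name `residual_block` is hypothetical.

```python
def residual_block(current, prediction):
    """Subtract the prediction block from the current block, sample by sample."""
    if len(current) != len(prediction):
        raise ValueError("blocks must have the same number of rows")
    return [
        [cur - pred for cur, pred in zip(cur_row, pred_row)]
        for cur_row, pred_row in zip(current, prediction)
    ]
```

A good prediction leaves residual samples close to zero, which is what makes the later transform and quantization stages effective.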
- the transform processing unit 206 is configured to apply a transform such as discrete cosine transform (DCT) or discrete sine transform (DST) on the sample values of the residual block 205 to obtain transform coefficients 207 in the transform domain.
- the transform coefficient 207 may also be referred to as a transform residual coefficient, and represents the residual block 205 in the transform domain.
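As an illustration of moving a residual block into the transform domain, the sketch below applies a floating-point orthonormal 2-D DCT-II and its inverse. It is not the integer approximation used by real codecs such as HEVC; the helper names are hypothetical and square blocks are assumed.

```python
import math


def dct_matrix(n):
    """Return the n x n orthonormal DCT-II basis matrix."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def forward_dct(residual):
    """Separable 2-D transform: C * X * C^T."""
    c = dct_matrix(len(residual))
    return matmul(matmul(c, residual), transpose(c))


def inverse_dct(coeffs):
    """Inverse of forward_dct: C^T * Y * C (C is orthonormal)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)
```

A flat 4x4 block with every sample equal to 10 concentrates all of its energy in the single DC coefficient (value 40), which illustrates why smooth residuals compress well after the transform.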
- the transform processing unit 206 may be used to apply an integer approximation of DCT/DST, such as the transform specified for HEVC/H.265. Compared with the orthogonal DCT transform, this integer approximation is usually scaled by a factor. In order to maintain the norm of the residual block processed by the forward and inverse transformation, an additional scaling factor is applied as part of the transformation process.
- the scaling factor is usually selected based on certain constraints. For example, the scaling factor is chosen to be a power of 2 so that it can be applied with shift operations, and it reflects the bit depth of the transform coefficients and a trade-off between accuracy and implementation cost.
- For example, a specific scaling factor is specified for the inverse transform performed by the inverse transform processing unit 212, and accordingly a corresponding scaling factor is specified for the forward transform performed by the transform processing unit 206 on the encoder 20 side.
- the quantization unit 208 is used to quantize the transform coefficient 207 by applying scalar quantization or vector quantization, for example, to obtain the quantized transform coefficient 209.
- the quantized transform coefficient 209 may also be referred to as a quantized residual coefficient 209.
- the quantization process can reduce the bit depth associated with some or all of the transform coefficients 207. For example, n-bit transform coefficients can be rounded down to m-bit transform coefficients during quantization, where n is greater than m.
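The bit-depth reduction mentioned above can be sketched as a right shift that discards the low-order bits. This is only an illustration of rounding an n-bit value down to m bits, and the function name `reduce_bit_depth` is hypothetical.

```python
def reduce_bit_depth(coeff, n_bits, m_bits):
    """Round an n-bit coefficient down to m bits by dropping the low-order bits."""
    if n_bits <= m_bits:
        raise ValueError("n_bits must be greater than m_bits")
    # An arithmetic right shift floors the value, i.e. it rounds down.
    return coeff >> (n_bits - m_bits)
```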
- the degree of quantization can be modified by adjusting the quantization parameter (QP). For example, for scalar quantization, different scales can be applied to achieve finer or coarser quantization.
- a smaller quantization step size corresponds to a finer quantization
- a larger quantization step size corresponds to a coarser quantization.
- the appropriate quantization step size can be indicated by a quantization parameter (QP).
- the quantization parameter may be an index of a predefined set of suitable quantization steps.
- a smaller quantization parameter can correspond to fine quantization (smaller quantization step size)
- a larger quantization parameter can correspond to coarse quantization (larger quantization step size)
- Quantization may include division by a quantization step size, and the corresponding inverse quantization, performed for example by the inverse quantization unit 210, may include multiplication by the quantization step size.
- Embodiments according to some standards such as HEVC may use quantization parameters to determine the quantization step size.
- the quantization step size can be calculated based on the quantization parameter using a fixed-point approximation of an equation including division. Additional scaling factors can be introduced for quantization and inverse quantization to restore the norm of the residual block that may be modified due to the scale used in the fixed-point approximation of the equations for the quantization step size and the quantization parameter.
- the scales of inverse transform and inverse quantization may be combined.
- a custom quantization table can be used and signaled from the encoder to the decoder in, for example, a bitstream. Quantization is a lossy operation, where the larger the quantization step, the greater the loss.
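The relationship between the quantization parameter, the step size, and the resulting loss can be sketched as follows. The sketch assumes the HEVC/H.264-style relation in which the step size doubles every 6 QP values, Qstep = 2^((QP - 4) / 6), computed in floating point rather than with the fixed-point approximation described above. The function names are hypothetical.

```python
def qstep_from_qp(qp):
    """Quantization step size for a given QP (assumed HEVC-style relation)."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeffs, qp):
    """Scalar quantization: divide by the step size and round to nearest."""
    step = qstep_from_qp(qp)
    levels = []
    for c in coeffs:
        magnitude = int(abs(c) / step + 0.5)
        levels.append(magnitude if c >= 0 else -magnitude)
    return levels


def dequantize(levels, qp):
    """Inverse quantization: multiply the levels by the same step size."""
    step = qstep_from_qp(qp)
    return [level * step for level in levels]


def total_error(coeffs, qp):
    """Sum of absolute differences between original and dequantized values."""
    recon = dequantize(quantize(coeffs, qp), qp)
    return sum(abs(c - r) for c, r in zip(coeffs, recon))
```

With the coefficients `[40.0, -13.0, 5.0, 1.0]`, QP 4 (step size 1) reproduces the input exactly, while QP 22 (step size 8) yields the levels `[5, -2, 1, 0]` and a visibly larger reconstruction error, in line with the statement that a larger step causes greater loss.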
- the inverse quantization unit 210 is configured to apply the inverse quantization of the quantization unit 208 on the quantized coefficients to obtain the inverse quantized coefficients 211, for example, by applying, based on or using the same quantization step size as the quantization unit 208, the inverse of the quantization scheme applied by the quantization unit 208.
- the inverse quantized coefficients 211 may also be referred to as inverse quantized residual coefficients 211, and correspond to the transform coefficients 207, although they are usually not identical to the transform coefficients because of the loss caused by quantization.
- the inverse transform processing unit 212 is used to apply the inverse of the transform applied by the transform processing unit 206, for example, an inverse discrete cosine transform (DCT) or an inverse discrete sine transform (DST), to obtain the inverse transform block 213 in the sample domain.
- the inverse transformation block 213 may also be referred to as an inverse transformation and inverse quantization block 213 or an inverse transformation residual block 213.
- the reconstruction unit 214 (for example, the summer 214) is used to add the inverse transform block 213 (that is, the reconstructed residual block 213) to the prediction block 265 to obtain the reconstructed block 215 in the sample domain, for example, by adding the sample values of the reconstructed residual block 213 and the sample values of the prediction block 265.
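The summation performed by the reconstruction unit can be sketched as follows. Clipping the result to the valid sample range is an assumption added here for a well-defined output; the source only describes the addition. The name `reconstruct_block` is hypothetical.

```python
def reconstruct_block(prediction, residual, bit_depth=8):
    """Add the reconstructed residual to the prediction, sample by sample."""
    max_val = (1 << bit_depth) - 1
    return [
        # Clip to [0, max_val]; this clipping step is an assumption.
        [min(max_val, max(0, p + r)) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]
```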
- the buffer unit 216 (or “buffer” 216 for short) such as the line buffer 216 is used to buffer or store the reconstructed block 215 and the corresponding sample value, for example, for intra prediction.
- the encoder can be configured to use the unfiltered reconstructed block and/or the corresponding sample values stored in the buffer unit 216 for any type of estimation and/or prediction, such as intra prediction.
- the embodiment of the encoder 20 may be configured such that the buffer unit 216 is used not only for storing the reconstructed block 215 for intra prediction 254, but also for the loop filter unit 220 (not shown in FIG. 2), and/or such that, for example, the buffer unit 216 and the decoded picture buffer unit 230 form one buffer.
- Other embodiments may be used to use the filtered block 221 and/or blocks or samples from the decoded picture buffer 230 (neither shown in FIG. 2) as the input or basis for the intra prediction 254.
- the loop filter unit 220 (or “loop filter” 220 for short) is used to filter the reconstructed block 215 to obtain the filtered block 221, thereby smoothing pixel transitions or otherwise improving video quality.
- the loop filter unit 220 is intended to represent one or more loop filters, such as a deblocking filter, a sample-adaptive offset (SAO) filter, or other filters, such as a bilateral filter, an adaptive loop filter (ALF), a sharpening or smoothing filter, or a collaborative filter.
- Although the loop filter unit 220 is shown as an in-loop filter in FIG. 2, in other configurations the loop filter unit 220 may be implemented as a post-loop filter.
- the filtered block 221 may also be referred to as a filtered reconstructed block 221.
- the decoded picture buffer 230 may store the reconstructed coded block after the loop filter unit 220 performs a filtering operation on the reconstructed coded block.
- the embodiment of the encoder 20 may be used to output loop filter parameters (e.g., sample adaptive offset information), for example, directly or after entropy encoding by the entropy encoding unit 270 or any other entropy coding unit, so that the decoder 30 can receive and apply the same loop filter parameters for decoding.
- the decoded picture buffer (DPB) 230 may be a reference picture memory that stores reference picture data for the encoder 20 to encode video data.
- DPB 230 can be formed by any of a variety of memory devices, such as dynamic random access memory (DRAM), including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), resistive RAM (RRAM), or other types of memory devices.
- the DPB 230 and the buffer 216 may be provided by the same memory device or by separate memory devices.
- a decoded picture buffer (DPB) 230 is used to store the filtered block 221.
- the decoded picture buffer 230 may be further used to store other previously filtered blocks, such as previously reconstructed and filtered blocks 221, of the same current picture or of different pictures such as previously reconstructed pictures, and may provide complete previously reconstructed (that is, decoded) pictures (and corresponding reference blocks and samples) and/or a partially reconstructed current picture (and corresponding reference blocks and samples), for example, for inter prediction.
- a decoded picture buffer (DPB) 230 is used to store the reconstructed block 215.
- the prediction processing unit 260, also called the block prediction processing unit 260, is used to receive or obtain the image block 203 (the current image block 203 of the current picture 201) and reconstructed picture data, such as reference samples of the same (current) picture from the buffer 216 and/or reference picture data 231 of one or more previously decoded pictures from the decoded picture buffer 230, and to process such data for prediction, that is, to provide a prediction block 265, which may be an inter prediction block 245 or an intra prediction block 255.
- the mode selection unit 262 may be used to select a prediction mode (for example, intra or inter prediction mode) and/or the corresponding prediction block 245 or 255 used as the prediction block 265 to calculate the residual block 205 and reconstruct the reconstructed block 215.
- the embodiment of the mode selection unit 262 can be used to select a prediction mode (for example, from those supported by the prediction processing unit 260) that provides the best match or the minimum residual (a minimum residual means better compression for transmission or storage), or that provides the minimum signaling overhead (minimum signaling overhead means better compression for transmission or storage), or that considers or balances both.
- the mode selection unit 262 may be configured to determine the prediction mode based on rate distortion optimization (RDO), that is, to select the prediction mode that provides the minimum rate distortion, or to select a prediction mode whose associated rate distortion at least meets a prediction mode selection criterion.
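Rate-distortion based mode selection can be sketched with the usual Lagrangian cost J = D + lambda * R, where D is the distortion and R is the number of bits needed to signal the mode and its residual. The cost formula, the candidate values, and the name `select_mode` are illustrative assumptions; the source does not state how the rate distortion is computed.

```python
def select_mode(candidates, lam):
    """Pick the mode with the smallest cost J = D + lam * R.

    candidates maps a mode name to a (distortion, rate_bits) pair.
    """
    def cost(mode):
        distortion, rate_bits = candidates[mode]
        return distortion + lam * rate_bits

    return min(candidates, key=cost)


# Hypothetical measurements for three candidate modes of one block.
example_candidates = {
    "intra": (100, 20),
    "inter": (60, 50),
    "skip": (150, 1),
}
```

With a small multiplier the low-distortion mode wins, while a larger multiplier favors the mode with minimal signaling overhead, which mirrors the balance between residual size and signaling overhead described above.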
- the encoder 20 is used to determine or select the best or optimal prediction mode from a set of (predetermined) prediction modes.
- the prediction mode set may include, for example, an intra prediction mode and/or an inter prediction mode.
- the set of intra prediction modes may include 35 different intra prediction modes, for example, non-directional modes such as DC (or mean) mode and planar mode, or directional modes as defined in H.265, or may include 67 different intra prediction modes, for example, non-directional modes such as DC (or mean) mode and planar mode, or directional modes as defined in H.266 under development.
- the set of inter prediction modes depends on the available reference pictures (that is, for example, the aforementioned at least partially decoded pictures stored in the DPB 230) and on other inter prediction parameters, for example, on whether the entire reference picture or only a part of it, such as a search window area surrounding the area of the current block, is used to search for the best matching reference block, and/or on whether pixel interpolation such as half-pixel and/or quarter-pixel interpolation is applied.
- the set of inter prediction modes may include, for example, an advanced motion vector prediction (AMVP) mode and a merge mode.
- the inter-frame prediction mode set may include an improved AMVP mode based on a control point in the embodiment of the present invention, and an improved merge mode based on a control point.
- the inter prediction unit 244 may be used to perform any combination of the inter prediction techniques described below.
- the embodiment of the present invention may also apply skip mode and/or direct mode.
- the prediction processing unit 260 may be further used to divide the image block 203 into smaller block partitions or sub-blocks, for example, by iteratively using quad-tree (QT) segmentation, binary-tree (BT) segmentation, triple-tree (TT) segmentation, or any combination thereof, and to perform prediction, for example, for each of the block partitions or sub-blocks, where the mode selection includes selecting the tree structure of the segmented image block 203 and selecting the prediction mode applied to each of the block partitions or sub-blocks.
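Iterative quad-tree segmentation can be sketched as a recursion that splits a square block into four quadrants until a decision function declines to split or a minimum size is reached. The split criterion used here (sample range above a threshold) is a hypothetical stand-in for the encoder's real mode decision, and all names are illustrative.

```python
def quadtree_partition(block, x, y, size, min_size, should_split):
    """Return the leaf partitions as (x, y, size) tuples."""
    if size > min_size and should_split(block, x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(quadtree_partition(
                    block, x + dx, y + dy, half, min_size, should_split))
        return leaves
    return [(x, y, size)]


def range_exceeds(threshold):
    """Hypothetical criterion: split when the sample range is too large."""
    def decide(block, x, y, size):
        values = [block[j][i]
                  for j in range(y, y + size)
                  for i in range(x, x + size)]
        return max(values) - min(values) > threshold
    return decide
```

For an 8x8 block that is flat except for one bright sample in its top-left corner, only the quadrant containing that sample is split further, so detail is coded with small partitions and flat areas with large ones.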
- the inter prediction unit 244 may include a motion estimation (ME) unit (not shown in FIG. 2) and a motion compensation (MC) unit (not shown in FIG. 2).
- the motion estimation unit is used to receive or obtain the picture image block 203 (the current picture image block 203 of the current picture 201) and the decoded picture 231, or at least one or more previously reconstructed blocks, for example, reconstructed blocks of one or more other/different previously decoded pictures 231, for motion estimation.
- the video sequence may include the current picture and the previously decoded picture 231, or in other words, the current picture and the previously decoded picture 231 may be part of, or form, the picture sequence that forms the video sequence.
- the encoder 20 may be used to select a reference block from multiple reference blocks of the same or different pictures among multiple other pictures, and to provide the reference picture and/or the offset (spatial offset) between the position (X, Y coordinates) of the reference block and the position of the current block to the motion estimation unit (not shown in FIG. 2) as an inter prediction parameter. This offset is also called a motion vector (MV).
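Selecting a reference block and deriving the motion vector as a spatial offset can be sketched with an exhaustive block-matching search. The sum of absolute differences (SAD) cost, the integer-sample search window, and the function names are assumptions for illustration; the source does not prescribe a search strategy.

```python
def sad(cur, ref, bx, by, rx, ry, size):
    """Sum of absolute differences between a current and a reference block."""
    return sum(abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))


def motion_search(cur, ref, bx, by, size, search_range):
    """Return ((dx, dy), cost) for the best match inside the search window."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, size)
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best
```

The returned pair `(dx, dy)` is the offset between the position of the best matching reference block and the position of the current block, that is, the motion vector.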
- the motion compensation unit is used to obtain inter prediction parameters, and perform inter prediction based on or using the inter prediction parameters to obtain the inter prediction block 245.
- the motion compensation performed by the motion compensation unit may include fetching or generating a prediction block based on a motion/block vector determined by motion estimation (interpolation of sub-pixel accuracy may be performed). Interpolation filtering can generate additional pixel samples from known pixel samples, thereby potentially increasing the number of candidate prediction blocks that can be used to encode picture blocks.
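Fetching a prediction block at sub-pixel accuracy can be sketched with half-sample motion vectors and bilinear averaging. This is a deliberate simplification: codecs such as HEVC use longer interpolation filters (for example, 8-tap filters for luminance) and quarter-sample precision. The border clamping and the name `predict_block` are assumptions.

```python
def predict_block(ref, x, y, size, mv_x, mv_y):
    """Build a prediction block; mv_x and mv_y are in half-sample units."""
    height, width = len(ref), len(ref[0])

    def sample(sx, sy):
        # Positions outside the picture reuse the nearest border sample
        # (an assumed padding rule).
        return ref[min(height - 1, max(0, sy))][min(width - 1, max(0, sx))]

    block = []
    for j in range(size):
        row = []
        for i in range(size):
            px = 2 * (x + i) + mv_x      # horizontal position in half samples
            py = 2 * (y + j) + mv_y      # vertical position in half samples
            x0, fx = px // 2, px % 2     # integer part and half-sample flag
            y0, fy = py // 2, py % 2
            total = (sample(x0, y0) + sample(x0 + fx, y0)
                     + sample(x0, y0 + fy) + sample(x0 + fx, y0 + fy))
            row.append((total + 2) // 4)  # rounded average of the 4 taps
        block.append(row)
    return block
```

When both half-sample flags are zero the four taps coincide and the block is copied unchanged; otherwise new samples are interpolated between known ones, which is how interpolation enlarges the set of candidate prediction blocks.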
- the motion compensation unit 246 can locate the prediction block pointed to by the motion vector in a reference picture list.
- the motion compensation unit 246 may also generate syntax elements associated with the blocks and video slices for use by the decoder 30 when decoding picture blocks of the video slices.
- the aforementioned inter prediction unit 244 may transmit syntax elements to the entropy encoding unit 270, and the syntax elements include inter prediction parameters (for example, indication information of the inter prediction mode selected for prediction of the current block after traversing multiple inter prediction modes).
- the inter-frame prediction parameter may not be carried in the syntax element.
- the decoder 30 can directly use the default prediction mode for decoding. It can be understood that the inter prediction unit 244 may be used to perform any combination of inter prediction techniques.
- the intra prediction unit 254 is used to obtain, for example receive, the picture block 203 (current picture block) and one or more previously reconstructed blocks of the same picture, for example reconstructed adjacent blocks, for intra estimation.
- the encoder 20 may be used to select an intra prediction mode from a plurality of (predetermined) intra prediction modes.
- the embodiment of the encoder 20 may be used to select an intra prediction mode based on optimization criteria, for example, based on a minimum residual (for example, an intra prediction mode that provides a prediction block 255 most similar to the current picture block 203) or a minimum rate distortion.
- the intra prediction unit 254 is further configured to determine the intra prediction block 255 based on the intra prediction parameters of the selected intra prediction mode. In any case, after selecting the intra prediction mode for the block, the intra prediction unit 254 is also used to provide intra prediction parameters to the entropy encoding unit 270, that is, to provide information indicating the intra prediction mode selected for the block. In one example, the intra prediction unit 254 may be used to perform any combination of intra prediction techniques.
- the aforementioned intra-prediction unit 254 may transmit syntax elements to the entropy encoding unit 270, and the syntax elements include intra-prediction parameters (for example, indication information of the intra-prediction mode selected for prediction of the current block after traversing multiple intra-prediction modes).
- the intra prediction parameter may not be carried in the syntax element.
- the decoder 30 can directly use the default prediction mode for decoding.
- the entropy coding unit 270 is used to apply an entropy coding algorithm or scheme (for example, a variable length coding (VLC) scheme, a context adaptive VLC (CAVLC) scheme, an arithmetic coding scheme, context adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding method or technique) to one or all of the quantized residual coefficients 209, inter prediction parameters, intra prediction parameters, and/or loop filter parameters (or to none of them), to obtain the encoded picture data 21, which can be output through the output 272 in the form of, for example, an encoded bitstream.
- VLC variable length coding
- CAVLC context adaptive VLC
- CABAC context adaptive binary arithmetic coding
- SBAC syntax-based context-adaptive binary arithmetic coding
- PIPE probability interval partitioning entropy
- encoded picture data 21 output in the form of encoded bitstream
- the encoded bitstream can be transmitted to the video decoder 30, or archived for later transmission or retrieval by the video decoder 30.
- the entropy encoding unit 270 may also be used for entropy encoding other syntax elements of the current video slice being encoded.
- the non-transform-based encoder 20 may directly quantize the residual signal without the transform processing unit 206 for certain blocks or frames.
- the encoder 20 may have a quantization unit 208 and an inverse quantization unit 210 combined into a single unit.
- the encoder 20 may be used to implement the motion vector prediction method described in the following embodiments.
- the video encoder 20 may directly quantize the residual signal without processing by the transform processing unit 206, and accordingly without processing by the inverse transform processing unit 212; or, for some image blocks or image frames, the video encoder 20 does not generate residual data, and accordingly no processing by the transform processing unit 206, quantization unit 208, inverse quantization unit 210, and inverse transform processing unit 212 is needed; or, the video encoder 20 may store the reconstructed image block directly as a reference block without processing by the filter 220; or, the quantization unit 208 and the inverse quantization unit 210 in the video encoder 20 may be combined together.
- the loop filter 220 is optional, and for lossless compression coding, the transform processing unit 206, the quantization unit 208, the inverse quantization unit 210, and the inverse transform processing unit 212 are optional. It should be understood that, according to different application scenarios, the inter prediction unit 244 and the intra prediction unit 254 may be selectively activated.
- Fig. 3 shows a schematic/conceptual block diagram of an example of a decoder 30 for implementing an embodiment of the present invention.
- the video decoder 30 is used to receive, for example, encoded picture data (for example, an encoded bit stream) 21 encoded by the encoder 20 to obtain a decoded picture 231.
- video decoder 30 receives video data from video encoder 20, such as an encoded video bitstream and associated syntax elements that represent picture blocks of an encoded video slice.
- the decoder 30 includes an entropy decoding unit 304, an inverse quantization unit 310, an inverse transform processing unit 312, a reconstruction unit 314 (such as a summer 314), a buffer 316, a loop filter 320, a decoded picture buffer 330, and a prediction processing unit 360.
- the prediction processing unit 360 may include an inter prediction unit 344, an intra prediction unit 354, and a mode selection unit 362.
- video decoder 30 may perform a decoding process that is substantially reciprocal of the encoding process described with video encoder 20 of FIG. 2.
- the entropy decoding unit 304 is configured to perform entropy decoding on the encoded picture data 21 to obtain, for example, quantized coefficients 309 and/or decoded encoding parameters (not shown in FIG. 3), for example, any one or all of the (decoded) inter prediction parameters, intra prediction parameters, loop filter parameters, and/or other syntax elements.
- the entropy decoding unit 304 is further configured to forward the inter prediction parameters, intra prediction parameters and/or other syntax elements to the prediction processing unit 360.
- the video decoder 30 may receive syntax elements at the video slice level and/or the video block level.
- the inverse quantization unit 310 can be functionally the same as the inverse quantization unit 210
- the inverse transformation processing unit 312 can be functionally the same as the inverse transformation processing unit 212
- the reconstruction unit 314 can be functionally the same as the reconstruction unit 214
- the buffer 316 can be functionally identical.
- the loop filter 320 may be functionally the same as the loop filter 220
- the decoded picture buffer 330 may be functionally the same as the decoded picture buffer 230.
- the prediction processing unit 360 may include an inter prediction unit 344 and an intra prediction unit 354.
- the inter prediction unit 344 may be functionally similar to the inter prediction unit 244, and the intra prediction unit 354 may be functionally similar to the intra prediction unit 254.
- the prediction processing unit 360 is generally used to perform block prediction and/or obtain a prediction block 365 from the encoded data 21, and to receive or obtain (explicitly or implicitly) prediction-related parameters and/or information about the selected prediction mode, for example from the entropy decoding unit 304.
- the intra-prediction unit 354 of the prediction processing unit 360 is used to generate a prediction block 365 for the picture block of the current video slice based on the signaled intra-prediction mode and data from previously decoded blocks of the current frame or picture.
- the inter-frame prediction unit 344 (for example, a motion compensation unit) of the prediction processing unit 360 is used to generate a prediction block 365 for the video block of the current video slice based on the motion vector and the other syntax elements received from the entropy decoding unit 304.
- a prediction block can be generated from a reference picture in a reference picture list.
- the video decoder 30 may use the default construction technique to construct a list of reference frames based on the reference pictures stored in the DPB 330: list 0 and list 1.
- the prediction processing unit 360 is configured to determine prediction information for the video block of the current video slice by parsing the motion vector and other syntax elements, and use the prediction information to generate the prediction block for the current video block being decoded.
- the prediction processing unit 360 uses some of the received syntax elements to determine the prediction mode (for example, intra or inter prediction), the inter prediction slice type (for example, B slice, P slice, or GPB slice), construction information for one or more of the reference picture lists for the slice, the motion vector of each inter-coded video block of the slice, the inter prediction status of each inter-coded video block of the slice, and other information, in order to decode the video blocks of the current video slice.
- the syntax elements received by the video decoder 30 from the bitstream include syntax elements in one or more of an adaptive parameter set (APS), a sequence parameter set (SPS), a picture parameter set (PPS), or a slice header.
- APS adaptive parameter set
- SPS sequence parameter set
- PPS picture parameter set
- the inverse quantization unit 310 may be used to inverse quantize (that is, dequantize) the quantized transform coefficients provided in the bitstream and decoded by the entropy decoding unit 304.
- the inverse quantization process may include using the quantization parameter calculated by the video encoder 20 for each video block in the video slice to determine the degree of quantization that should be applied and also determine the degree of inverse quantization that should be applied.
- the inverse transform processing unit 312 is used to apply an inverse transform (for example, an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process) to transform coefficients so as to generate a residual block in the pixel domain.
- an inverse transform for example, an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process
- the reconstruction unit 314 (for example, the summer 314) is used to add the inverse transform block 313 (that is, the reconstructed residual block 313) to the prediction block 365 to obtain the reconstructed block 315 in the sample domain, for example by adding the sample values of the reconstructed residual block 313 to the sample values of the prediction block 365.
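The sample-wise addition performed by the reconstruction unit can be sketched as follows. This is a minimal illustration, not the decoder's actual implementation: the function name `reconstruct_block` and the clipping of the sum to the valid sample range for an assumed `bit_depth` are assumptions that the text above does not state.

```python
def reconstruct_block(pred, resid, bit_depth=8):
    """Add a residual block to a prediction block, sample by sample.

    pred and resid are lists of rows of integers with the same shape.
    Clipping to [0, 2**bit_depth - 1] is an assumption of this sketch.
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]


prediction = [[100, 250], [3, 128]]
residual = [[5, 10], [-7, 0]]
reconstructed = reconstruct_block(prediction, residual)
```

With 8-bit samples, sums above 255 or below 0 are clipped to the nearest valid value in this sketch.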
- the loop filter unit 320 (in the coding loop or after the coding loop) is used to filter the reconstructed block 315 to obtain the filtered block 321, so as to smooth pixel transitions or improve video quality.
- the loop filter unit 320 may be used to perform any combination of the filtering techniques described below.
- the loop filter unit 320 is intended to represent one or more loop filters, such as deblocking filters, sample-adaptive offset (SAO) filters, or other filters, such as bilateral filters, adaptive loop filters (ALF), sharpening or smoothing filters, or collaborative filters.
- although the loop filter unit 320 is shown as an in-loop filter in FIG. 3, in other configurations the loop filter unit 320 may be implemented as a post-loop filter.
- the decoded video block 321 in a given frame or picture is then stored in a decoded picture buffer 330 that stores reference pictures for subsequent motion compensation.
- the decoder 30 is used, for example, to output the decoded picture 31 through the output 332 for presentation or viewing by the user.
- the decoder 30 may generate an output video stream without the loop filter unit 320.
- the non-transform-based decoder 30 may directly inversely quantize the residual signal without the inverse transform processing unit 312 for certain blocks or frames.
- the video decoder 30 may have an inverse quantization unit 310 and an inverse transform processing unit 312 combined into a single unit.
- the decoder 30 is used to implement the motion vector prediction method described in the following embodiments.
- the video decoder 30 may generate an output video stream without processing by the filter 320; or, for some image blocks or image frames, the entropy decoding unit 304 of the video decoder 30 does not decode quantized coefficients, and accordingly no processing by the inverse quantization unit 310 and the inverse transform processing unit 312 is needed.
- the loop filter 320 is optional; and for lossless compression, the inverse quantization unit 310 and the inverse transform processing unit 312 are optional.
- the inter prediction unit and the intra prediction unit may be selectively activated.
- the processing result of a given stage may be further processed before being output to the next stage; for example, after stages such as interpolation filtering, motion vector derivation, or loop filtering, operations such as Clip or shift are further performed on the processing result of the corresponding stage.
- the motion vector of the control point of the current image block derived from the motion vector of the adjacent affine coding block, or the motion vector of the sub-block of the current image block derived from that motion vector, may undergo further processing, which is not limited in this application.
- restrict the value range of the motion vector so that it is within a certain bit width. Assuming that the allowed bit width of the motion vector is bitDepth, the range of the motion vector is $-2^{bitDepth-1}$ to $2^{bitDepth-1}-1$. If bitDepth is 16, the value range is -32768 to 32767. If bitDepth is 18, the value range is -131072 to 131071.
- the value of the motion vector (for example, the motion vectors of the four 4x4 sub-blocks in an 8x8 image block) is restricted so that the maximum difference between the integer parts of the MVs of the four 4x4 sub-blocks does not exceed N pixels, for example, no more than one pixel.
- $ux = (vx + 2^{bitDepth}) \bmod 2^{bitDepth}$
- vx is the horizontal component of the motion vector of the image block or the sub-block of the image block
- vy is the vertical component of the motion vector of the image block or the sub-block of the image block
- ux and uy are intermediate values
- bitDepth represents bit width
- for example, if the value of vx is -32769, the value obtained by the above formula is 32767. Values are stored in a computer in two's complement form, and the two's complement of -32769 is 1,0111,1111,1111,1111 (17 bits). The computer handles the overflow by discarding the high bit, so the value of vx becomes 0111,1111,1111,1111, that is, 32767, which is consistent with the result obtained by the formula.
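The wrap-around behavior in this example can be checked with a short sketch. The first step follows the modulo formula for the intermediate value ux; the second step, which maps ux back into the signed range, is not shown in the text above and is an assumption based on ordinary two's complement interpretation. The function name `wrap_mv_component` is hypothetical.

```python
def wrap_mv_component(v, bit_depth=16):
    """Wrap a motion vector component into a signed bit_depth-bit range."""
    modulus = 1 << bit_depth
    # Intermediate value: u = (v + 2^bitDepth) % 2^bitDepth
    u = (v + modulus) % modulus
    # Assumed second step: reinterpret u as a signed two's complement value.
    return u - modulus if u >= (1 << (bit_depth - 1)) else u


wrapped = wrap_mv_component(-32769, bit_depth=16)
```

For vx = -32769 and a 16-bit width, the sketch returns 32767, matching the value stated in the worked example.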
- $vx = \mathrm{Clip3}(-2^{bitDepth-1},\ 2^{bitDepth-1}-1,\ vx)$
- vx is the horizontal component of the motion vector of the image block or the sub-block of the image block
- vy is the vertical component of the motion vector of the image block or the sub-block of the image block
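- x, y, and z correspond to the three input values of the MV clamping process Clip3, respectively
- the definition of Clip3 is to clamp the value of z to the interval [x, y]: $\mathrm{Clip3}(x, y, z) = x$ if $z < x$; $y$ if $z > y$; and $z$ otherwise.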
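The clamping approach can be sketched as follows, using the interval definition of Clip3 and the bit-width range given earlier. The helper name `clamp_mv` is hypothetical and is used here only for illustration.

```python
def clip3(x, y, z):
    """Clamp z to the closed interval [x, y]."""
    if z < x:
        return x
    if z > y:
        return y
    return z


def clamp_mv(vx, vy, bit_depth=16):
    """Clamp both MV components to the signed bit_depth-bit range."""
    lo = -(1 << (bit_depth - 1))
    hi = (1 << (bit_depth - 1)) - 1
    return clip3(lo, hi, vx), clip3(lo, hi, vy)


clamped_16 = clamp_mv(-32769, 40000, bit_depth=16)
clamped_18 = clamp_mv(-200000, 5, bit_depth=18)
```

Unlike the modulo approach, clamping saturates an out-of-range component at the nearest bound instead of wrapping it around.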
- FIG. 4 is a schematic structural diagram of a video coding device 400 (for example, a video encoding device 400 or a video decoding device 400) provided by an embodiment of the present invention.
- the video coding device 400 is suitable for implementing the embodiments described herein.
- the video coding device 400 may be a video decoder (for example, the decoder 30 of FIG. 1A) or a video encoder (for example, the encoder 20 of FIG. 1A).
- the video coding device 400 may be one or more components of the decoder 30 in FIG. 1A or the encoder 20 in FIG. 1A described above.
- the video coding device 400 includes: an ingress port 410 and a receiver unit (Rx) 420 for receiving data; a processor, logic unit, or central processing unit (CPU) 430 for processing data; a transmitter unit (Tx) 440 and an egress port 450 for transmitting data; and a memory 460 for storing data.
- the video coding device 400 may further include an optical-to-electrical conversion component and an electro-optical (EO) component coupled with the ingress port 410, the receiver unit 420, the transmitter unit 440, and the egress port 450, for the egress or ingress of optical signals or electrical signals.
- EO electro-optical
- the processor 430 is implemented by hardware and software.
- the processor 430 may be implemented as one or more CPU chips, cores (for example, multi-core processors), FPGA, ASIC, and DSP.
- the processor 430 communicates with the ingress port 410, the receiver unit 420, the transmitter unit 440, the egress port 450, and the memory 460.
- the processor 430 includes a coding module 470 (for example, an encoding module 470 or a decoding module 470).
- the encoding/decoding module 470 implements the embodiments disclosed herein to implement the chroma block prediction method provided by the embodiments of the present invention. For example, the encoding/decoding module 470 implements, processes, or provides various encoding operations.
- the encoding/decoding module 470 provides a substantial improvement to the function of the video decoding device 400 and affects the conversion of the video decoding device 400 to different states.
- the encoding/decoding module 470 is implemented by instructions stored in the memory 460 and executed by the processor 430.
- the memory 460 includes one or more magnetic disks, tape drives, and solid-state hard disks, and can be used as an overflow data storage device for storing programs when these programs are selectively executed, and storing instructions and data read during program execution.
- the memory 460 may be volatile and/or non-volatile, and may be read-only memory (ROM), random access memory (RAM), ternary content-addressable memory (TCAM), and/or static random access memory (SRAM).
- FIG. 5 is a simplified block diagram of an apparatus 500 that can be used as either or both of the source device 12 and the destination device 14 in FIG. 1A according to an exemplary embodiment.
- the device 500 can implement the technology of the present application.
- FIG. 5 is a schematic block diagram of an implementation manner of an encoding device or a decoding device (referred to as a decoding device 500 for short) according to an embodiment of the application.
- the decoding device 500 may include a processor 510, a memory 530, and a bus system 550.
- the processor and the memory are connected through a bus system, the memory is used to store instructions, and the processor is used to execute instructions stored in the memory.
- the memory of the decoding device stores program codes, and the processor can call the program codes stored in the memory to execute various video encoding or decoding methods described in this application, especially various new inter-frame prediction methods. To avoid repetition, it will not be described in detail here.
- the processor 510 may be a central processing unit (CPU), and the processor 510 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
- the memory 530 may include a read only memory (ROM) device or a random access memory (RAM) device. Any other suitable type of storage device can also be used as the memory 530.
- the memory 530 may include code and data 531 accessed by the processor 510 using the bus 550.
- the memory 530 may further include an operating system 533 and an application program 535.
- the application program 535 includes at least one program that allows the processor 510 to execute the video encoding or decoding method described in this application (especially the motion vector prediction method described in this application).
- the application program 535 may include applications 1 to N, which further include a video encoding or decoding application (referred to as a video coding application) that executes the video encoding or decoding method described in this application.
- the bus system 550 may also include a power bus, a control bus, and a status signal bus. However, for clear description, various buses are marked as the bus system 550 in the figure.
- the decoding device 500 may further include one or more output devices, such as a display 570.
- the display 570 may be a touch-sensitive display that merges the display with a touch-sensitive unit operable to sense touch input.
- the display 570 may be connected to the processor 510 via the bus 550.
- FIG. 4 and FIG. 5 may be used to execute the method in the embodiment of the present application.
- inter-frame prediction is an important part of the video codec system.
- HEVC introduces two inter-frame prediction modes, namely Advanced Motion Vector Prediction (AMVP) mode and Merge mode.
- AMVP Advanced Motion Vector Prediction
- Merge mode Merge mode
- for the AMVP mode, a candidate motion vector list is first constructed based on the motion information of the coded units that are spatially or temporally adjacent to the current coding unit, and then the optimal motion vector is determined from the candidate motion vector list as the motion vector predictor (MVP) of the current coding unit.
- MVP Motion vector predictor
- the rate-distortion cost is calculated by formula (1), $J = SAD + \lambda R$, where J is the rate-distortion cost (RD cost), SAD is the sum of absolute differences (SAD) between the original pixel values and the predicted pixel values obtained after motion estimation using a candidate motion vector predictor, R is the bit rate, and $\lambda$ is the Lagrangian multiplier. The encoder transmits the index value, in the candidate motion vector list, of the motion vector predictor selected based on the rate-distortion cost, together with the reference frame index value, to the decoding end.
- RD Cost rate-distortion cost
- SAD Sum of Absolute Differences
- a motion search is performed in a neighborhood centered on the MVP to obtain the actual motion vector of the current coding unit, and the encoding end transmits the difference (Motion vector difference, MVD) between the MVP and the actual motion vector to the decoding end.
- MVD Motion vector difference
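The two AMVP steps described above, selecting the predictor with the lowest rate-distortion cost and then forming the motion vector difference, can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the per-candidate SAD and rate values are supplied as precomputed inputs, and the sign convention MVD = MV - MVP is an assumption.

```python
def select_mvp_index(candidates, lam):
    """Return the index of the candidate with the lowest cost J = SAD + lam * R.

    candidates is a list of (sad, rate) pairs, one per candidate predictor.
    """
    costs = [sad + lam * rate for sad, rate in candidates]
    return costs.index(min(costs))


def motion_vector_difference(mv, mvp):
    """Component-wise difference between the actual MV and the predictor.

    The sign convention (MV minus MVP) is an assumption of this sketch.
    """
    return (mv[0] - mvp[0], mv[1] - mvp[1])


# Costs are 136.0, 140.0, and 122.0, so the third candidate is selected.
best_index = select_mvp_index([(120, 4), (100, 10), (110, 3)], lam=4.0)
mvd = motion_vector_difference(mv=(5, -3), mvp=(4, -1))
```

Only the selected index, the reference frame index, and the MVD need to be signaled; the decoder rebuilds the same candidate list and adds the MVD back to the predictor.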
- the current coding unit spatial and temporal candidate motion information is shown in Figure 6.
- the spatial candidate motion information comes from 5 spatially adjacent blocks (A0, A1, B0, B1 and B2). If an adjacent block is unavailable, or is coded in intra coding mode or intra block copy mode, its motion information is not added to the candidate motion information list.
- the temporal candidate motion information of the current coding unit is obtained after scaling the MV of the corresponding block in the reference frame according to the picture order count (POC) of the reference frame and the current frame.
- the determination of the corresponding position block includes first determining whether the block with position T in the reference frame is available, and if it is not available, selecting the block with position C as the corresponding position block.
- motion compensation is performed based on the assumption that the motion information of all pixels in the coding unit is the same to obtain the predicted value of the pixels of the coding unit.
- however, not all pixels in a coding unit necessarily have the same motion characteristics. Therefore, using the same motion information to predict all pixels in the CU may reduce the accuracy of motion compensation, thereby increasing residual information.
- the coding unit is divided into at least two sub-coding units, the motion information of each sub-coding unit is then derived, and motion compensation is performed according to the motion information of each sub-coding unit, thereby improving the accuracy of prediction; an example is the sub-coding unit motion vector prediction (Sub-CU based motion vector prediction, SMVP) technology.
- sub-coding unit motion vector prediction Sub-CU based motion vector prediction, SMVP
- SMVP divides the current coding unit into MxN sub-coding units, and derives the motion information of each sub-coding unit, and then uses the motion information of each sub-coding unit to perform motion compensation to obtain the predicted value of the current coding unit.
- the candidate motion information list is also called a sub-block based merging candidate list.
- the sub-block based merging candidate list includes one or more of: the ATMVP prediction mode, the affine model-based prediction mode (including the use of inherited control point motion vector prediction methods and/or the use of constructed control point motion vector prediction methods), and the inter-plane prediction mode (PLANAR).
- PLANAR inter-plane prediction modes
- the ATMVP technology first determines the corresponding position reference frame (collocated reference picture), then divides the current coding unit into MxN sub-coding units, obtains the motion information of the pixel at the center point of the sub-coding unit in the corresponding position reference frame that corresponds to each current sub-coding unit, scales it, and converts it into the motion information of each current sub-coding unit.
- ATMVP is also called subblock-based temporal motion vector prediction (SbTMVP).
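The division into sub-coding units and the choice of a center point for each one can be sketched as follows. The function name `sub_block_centers`, the raster-scan ordering, and the convention that the center is the top-left corner plus half the sub-block size are assumptions of this sketch.

```python
def sub_block_centers(x, y, width, height, sub_w, sub_h):
    """Split a CU at (x, y) into sub_w x sub_h sub-blocks in raster order.

    Returns a list of (sub_x, sub_y, center_x, center_y) tuples.
    """
    result = []
    for sy in range(y, y + height, sub_h):
        for sx in range(x, x + width, sub_w):
            result.append((sx, sy, sx + sub_w // 2, sy + sub_h // 2))
    return result


# A 16x16 CU at the origin split into four 8x8 sub-coding units.
centers = sub_block_centers(0, 0, 16, 16, 8, 8)
```

Each center point would then be displaced into the collocated picture to look up the motion information for that sub-coding unit.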
- the STMVP technology obtains the motion information of the spatially adjacent positions above and to the left of each sub-coding unit and of the temporally adjacent position at the lower right, averages it, and converts it into the motion information of each current sub-coding unit.
- the current coding unit is divided into four sub coding units A, B, C, and D.
- the motion information of the sub coding unit A is derived from the motion information of the adjacent positions c and b in the spatial domain and the position D in the corresponding position reference frame.
- the filled motion information can be 0 or other motion information agreed in advance, which is not limited .
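The averaging used by STMVP can be sketched as follows. This is a simplified illustration: the function name `average_mv` is hypothetical, unavailable neighbors are represented as `None`, any scaling of the neighbor motion vectors before averaging is omitted, and the zero fallback when no neighbor is available is only one of the options the text allows.

```python
def average_mv(neighbor_mvs, fill=(0, 0)):
    """Average the available neighbor motion vectors.

    neighbor_mvs holds (mvx, mvy) tuples, with None for unavailable
    neighbors. Returns fill when no neighbor is available.
    """
    available = [mv for mv in neighbor_mvs if mv is not None]
    if not available:
        return fill
    n = len(available)
    return (sum(mv[0] for mv in available) / n,
            sum(mv[1] for mv in available) / n)


# Above neighbor, unavailable left neighbor, and temporal neighbor.
sub_cu_mv = average_mv([(4, 2), None, (8, -6)])
```

A practical codec would keep the result in integer motion vector units; floating-point division is used here only to keep the sketch short.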
- the process of using ATMVP technology to perform inter-frame prediction on the current to-be-processed image mainly includes: determining the offset motion vector of the current to-be-processed block in the current to-be-processed image; determining, according to the position of the to-be-processed sub-block in the current to-be-processed block and the above offset motion vector, the corresponding sub-block of the to-be-processed sub-block in the corresponding reference image; determining the motion vector of the to-be-processed sub-block according to the motion vector of the corresponding sub-block; and performing motion compensation prediction on the to-be-processed sub-block according to its motion vector to obtain the predicted pixel value of the to-be-processed sub-block.
- as shown in FIG. 9, a feasible implementation manner includes:
- the offset motion vector is used to determine the location, in the collocated picture, of the point corresponding to the preset location point in the current CU (that is, the current CU to be processed, which can also be called the block to be processed, the block to be coded, the block to be decoded, etc.).
- the offset motion vector may use the motion vector of the spatial neighboring block of the current CU.
- the corresponding image of the frame where the current CU is located can be obtained by parsing the code stream, or can be a preset value (the preset value has the same value on the codec side). Exemplarily, let us set the corresponding image as an image with a reference frame index of 0 in the reference frame list of the current CU.
- the position of the current CU in the corresponding image block may also be determined according to the offset motion vector and the corresponding image block, and the block at this position may be referred to as a corresponding/collocated block (corresponding/collocated block).
- the offset motion vector can be obtained by any of the following methods.
- Method 1: Determine whether the motion vector of A1 in Figure 6 is available. It should be understood that “available” means that the motion vector exists and can be used. Exemplarily, when the adjacent block does not exist, or the adjacent block and the current block are not in the same coding area (such as a slice, tile, or tile group), or the adjacent block adopts intra prediction mode or intra block copy (IBC) mode, the motion vector of the adjacent block is not available. Conversely, if the adjacent block adopts the inter prediction mode, or the adjacent block adopts the inter prediction mode and is in the same coding area as the current block, the motion vector of the adjacent block is available.
- IBC intra block copy
- the IBC mode is a block-level coding mode. At the coding end, a block matching method is used to find the best block vector or motion vector for each CU.
- the motion vector here is mainly used to represent the displacement from the current block to the reference block, which is also called a displacement vector.
- the reference block is a reconstructed block in the current image.
- the IBC mode can be considered as the third prediction mode in addition to the intra prediction or inter prediction mode. In order to save storage space and reduce the complexity of the decoder, the IBC mode in some embodiments only allows the reconstruction part of the predefined area of the current CTU to be used for prediction.
- the zero motion vector can be used as the offset motion vector of the current CU.
- A1 may have a first-directional motion vector based on the first reference frame list list0 and a second-directional motion vector based on the second reference frame list list1.
- the second-direction motion vector of A1 based on list1 is used as the offset motion vector of the current CU.
- the conditions include:
- A1 uses the reference frame in list1 for prediction
- the reference frame in list1 of A1 used for prediction is the same as the corresponding image of the image frame where the current CU is located;
- the POC of the reference frame in list1 and the POC of the corresponding image of the image frame where the current CU is located are the same, wherein the information characterizing the corresponding image can be obtained by analyzing the code stream.
- the image type of the image where the current CU is located is a B image, or the slice where the current CU is located is a B slice, or the tile group where the current CU is located is a B tile group;
- Conditions include:
- A1 uses the reference frame in list0 for prediction
- the reference frame in the list0 of A1 used for prediction is the same as the corresponding image of the image frame where the current CU is located.
- Method 2: In the order A1, B1, B0, A0 shown in Figure 6, find the first neighboring block whose motion vector is available. If the reference frame of that neighboring block is the corresponding image of the frame where the current CU is located, that is, the found motion vector points to the corresponding image, then the motion vector of the neighboring block is used as the offset motion vector of the current CU. Otherwise, that is, if the reference frame of the neighboring block is not the corresponding image of the frame where the current CU is located, in a feasible implementation, the zero motion vector can be used as the offset motion vector of the current CU.
- the motion vector of the first available neighboring block is scaled based on the POC of the corresponding image, the POC of the image where the current CU is located, and the POC of the reference frame of the first available neighboring block, so that the scaled motion vector points to the corresponding image; the scaled motion vector is then used as the offset motion vector of the current CU.
- for the scaling method, refer to the prior-art method for obtaining a time-domain motion vector, or to the scaling method in step 1005 of this embodiment; details are not repeated.
- the image block in the corresponding image at the same position as the current CU is the corresponding block of the current CU in the corresponding image.
- the ATMVP prediction mode is not used to obtain the motion vector of the sub-block of the current CU.
- S902 Determine whether the ATMVP mode is available according to the offset motion vector.
- the image block where the corresponding point of the current CU preset position point in the corresponding image is located is the S sub-block, and its coordinate position is (x col , y col ), for example,
- (x, y) is the coordinates of the upper left vertex of the current CU
- W is the width of the current CU
- H is the height of the current CU
- (x off , y off ) is the offset motion vector.
- step 902 and subsequent steps are stopped.
- the prediction mode of the S sub-block is the inter-frame prediction mode
- the motion information of the S sub-block is determined as the initial default motion vector.
- the initial default motion vector MV is scaled to obtain the default motion vector MVs of the sub-block to be processed.
- MVs can be obtained by formula (3):
- CurPoc is the POC of the frame where the current CU is located
- ColPoc is the POC of the corresponding image
- CurRefPoc is the POC of the reference frame of the current CU
- ColRefPoc is the POC of the reference frame of the sub-block S.
- MV includes a horizontal motion vector MVx and a vertical motion vector MVy, which can be calculated according to the above formulas to obtain the scaled horizontal motion vector MVsx and the scaled vertical motion vector MVsy, respectively.
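The scaling step can be sketched as follows. Formula (3) itself is missing from this extract, so the sketch assumes the conventional ratio of POC distances, (CurPoc - CurRefPoc) / (ColPoc - ColRefPoc), applied to each component. The function name `scale_mv`, the use of floating-point arithmetic, and the handling of a zero denominator are illustrative assumptions; a real codec would use fixed-point arithmetic with rounding and clipping.

```python
from typing import Tuple


def scale_mv(mv: Tuple[float, float],
             cur_poc: int, cur_ref_poc: int,
             col_poc: int, col_ref_poc: int) -> Tuple[float, float]:
    """Scale (MVx, MVy) by the ratio of two POC distances (assumed form)."""
    tb = cur_poc - cur_ref_poc   # current frame to its reference frame
    td = col_poc - col_ref_poc   # corresponding image to the reference of sub-block S
    if td == 0:
        # Degenerate case: no temporal distance to scale by (assumption: keep MV).
        return (float(mv[0]), float(mv[1]))
    factor = tb / td
    # The horizontal and vertical components are scaled independently.
    return (mv[0] * factor, mv[1] * factor)


if __name__ == "__main__":
    # Equal distances leave the vector unchanged; half the distance halves it.
    print(scale_mv((8, -4), cur_poc=8, cur_ref_poc=4, col_poc=6, col_ref_poc=2))
    print(scale_mv((8, -4), cur_poc=8, cur_ref_poc=6, col_poc=6, col_ref_poc=2))
```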
- S903 Determine the motion information of the sub-block to be processed according to the motion information of the corresponding sub-block.
- the block to be processed that is, the current CU, is located in the current image.
- the block to be processed contains 4 sub-blocks, and one sub-block to be processed is the upper left sub-block of the block to be processed.
- the sub-block at the upper left position in the corresponding image may be determined as the corresponding sub-block of the sub-block to be processed. It should be understood that different corresponding sub-blocks may exist in the form of corresponding blocks as a whole, or may exist in the form of separate sub-blocks.
- the motion information corresponding to the position of the geometric center of the corresponding sub-block can be obtained.
- the center position (x (i, j) , y (i, j) ) of the corresponding sub-block of the (i, j)-th to-be-processed sub-block of the current CU (that is, the i-th from left to right and the j-th from top to bottom) can be obtained according to formula (4)
- (x, y) represents the coordinates of the upper left vertex of the current CU
- M represents the width of the sub-block to be processed
- H represents the height of the sub-block to be processed.
- the motion vector at the center position is available, and the motion vector at this position is obtained. Based on the temporal relationship between the image frame where the sub-block to be processed is located and the image frame where the corresponding sub-block is located, the motion vector is scaled to obtain the motion vector of the sub-block to be processed.
- the scaling method is similar to formula (2); for example,
- MV is the motion vector at the aforementioned center position
- MV R is the motion vector of the sub-block to be processed.
- the POC CurRefPoc of the reference frame of the current CU may be preset to the POC of the reference frame whose reference frame index is 0 in the reference image list of the frame where the current CU is located.
- CurRefPoc may also be other reference frames in the reference image list of the frame where the current CU is located, and is not limited.
- the motion vector at the center position is not available, and the default motion vector MVs of the sub-block to be processed, determined in step S902, is used as the motion vector of the sub-block to be processed.
- S904 Perform motion compensation based on the motion information of the sub-block of the current CU to obtain the predicted pixel value of the current CU.
- for each sub-block, perform motion compensation based on the motion vector determined in step 903 and the reference frame of the image where the current CU is located, for example, the motion vector MV R and the reference frame CurRefPoc, to obtain the predicted pixel value of the sub-block.
- the process of motion compensation can refer to the foregoing description, or any improvement scheme in the prior art, and will not be repeated.
- the predicted pixel value of the current CU is also obtained.
- obtaining the motion vector of each sub-block makes it possible to reflect more complex motion within the block to be processed, improves the accuracy of the motion vector, and thereby improves the coding efficiency.
- when the motion information of the corresponding sub-block is not available, the default motion information needs to be calculated, which affects the encoding and decoding speed.
- the to-be-processed block includes one or more sub-blocks, and the identification information of the corresponding image of the to-be-processed block is parsed and obtained from the code stream.
- the motion vector of a spatial neighboring block at the preset position of the block to be processed is available and the reference frame corresponding to the motion vector is the corresponding image
- the motion vector is determined as the time-domain offset vector.
- the position of the corresponding sub-block of the sub-block of the block to be processed is determined in the corresponding image based on the time-domain offset vector. Determine whether the motion vector of the corresponding sub-block is available, and determine the motion vector of the sub-block of the block to be processed based on the motion vector of the corresponding sub-block.
- the inter-frame prediction method is specifically:
- the time domain offset vector is used to determine the corresponding sub-block of the sub-block of the block to be processed.
- the block to be processed includes 4 sub-blocks to be processed, and for each sub-block of the block to be processed, the corresponding sub-block is determined in the corresponding image (denoted as the target image in the figure, that is, the image where the corresponding sub-block is located) based on its position in the current image and the time-domain offset vector (identified in the figure as the offset motion vector).
- the index of the image frame (corresponding image) where the corresponding sub-block is located in the reference frame list of the spatial neighboring block of the block to be processed is obtained by parsing the code stream, that is, the decoding end can determine the corresponding image by parsing the corresponding information in the code stream.
- the image frame with the best performance can be determined as the corresponding image through the RDO selection method, or a certain frame is designated as the corresponding image, and the instruction information of the corresponding image is written into the code stream.
- the corresponding image can also be set according to a prior agreement between the encoding end and the decoding end.
- step 1201 may specifically be:
- the motion vector of the first spatial neighboring block, in the preset order, whose motion vector is available is used as the time-domain offset vector.
- the second preset motion vector is used as the time domain offset vector.
- the spatial neighboring blocks A1, B1, B0, A0 of the block to be processed can be checked in turn to see whether their motion vectors are available, until the first spatial neighboring block whose motion vector is available is found.
- checking stops at B0
- the motion vector of B0 is used as the time-domain offset vector.
- the motion vector of B0 can be scaled so that the scaled motion vector uses the corresponding image as the reference frame.
- the second preset motion vector may be set as a zero motion vector as the time-domain offset vector.
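The neighbor check described above can be sketched as follows. This is a minimal illustration: the dictionary representation of the neighbors, the use of `None` to mark an unavailable motion vector, and the function name `temporal_offset_vector` are assumptions for the example. The optional scaling of the found vector toward the corresponding image is omitted.

```python
from typing import Dict, Optional, Tuple

MV = Tuple[int, int]

ZERO_MV: MV = (0, 0)                      # second preset motion vector
CHECK_ORDER = ("A1", "B1", "B0", "A0")    # preset checking order


def temporal_offset_vector(neighbors: Dict[str, Optional[MV]]) -> MV:
    """Return the MV of the first available neighbor, else the zero MV."""
    for name in CHECK_ORDER:
        mv = neighbors.get(name)          # None means "not available"
        if mv is not None:
            return mv                     # stop checking at the first hit
    return ZERO_MV


if __name__ == "__main__":
    # A1 and B1 are unavailable, so checking stops at B0.
    print(temporal_offset_vector({"A1": None, "B1": None, "B0": (3, -1), "A0": (7, 7)}))
```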
- the spatial neighboring block at the first preset position is pre-configured by the codec end or determined according to higher-level syntax elements, which is not limited in the embodiment of the present application.
- the condition that the motion vector of the spatial neighboring block is unavailable includes one or a combination of the following: the spatial neighboring block is not encoded/decoded (if the prediction method is implemented at the encoding end, the spatial neighboring block is not encoded; if it is implemented at the decoding end, the spatial neighboring block is not decoded); or, the spatial neighboring block adopts intra prediction or intra block copy mode; or, the spatial neighboring block does not exist; or, the spatial neighboring block and the block to be processed are located in different coding regions.
- the coding area includes: an image, a slice, a tile, or a tile group.
- step 1201 may also specifically be:
- the third preset motion vector is used as the time domain offset vector.
- the spatial neighboring block at the second preset position also satisfies the condition that its reference frame is the same as the image frame in which the corresponding sub-block is located.
- the motion vector of the spatial neighboring block at the second preset position includes a first-directional motion vector based on the first reference frame list
- the reference frame of the spatial neighboring block at the second preset position includes the first-direction reference frame corresponding to the first-direction motion vector.
- the use of the motion vector of the spatial neighboring block at the second preset position as the time-domain offset vector is specifically: when the first-direction reference frame and the image frame where the corresponding sub-block is located are the same, the first-direction motion vector is used as the time-domain offset vector.
- when the first-direction reference frame and the image frame where the corresponding sub-block is located are different, the third preset motion vector is used as the time-domain offset vector.
- the third preset motion vector is a zero motion vector.
- the motion vector of the spatial neighboring block at the second preset position also includes the second-direction motion vector based on the second reference frame list.
- the use of the motion vector of the spatial neighboring block at the second preset position as the time-domain offset vector is specifically: when the first-direction reference frame and the image frame in which the corresponding sub-block is located are the same, the first-direction motion vector is used as the time-domain offset vector.
- the method further includes: judging whether the coding region of the spatial neighboring block is of type B, that is, whether it is a B frame, B slice, B tile, or B tile group, etc.
- the use of the motion vector of the spatial neighboring block at the second preset position as the time-domain offset vector is specifically: when the image frame in which the corresponding sub-block is located is obtained from the second reference frame list, determine whether the second-direction reference frame and the image frame in which the corresponding sub-block is located are the same; when they are the same, use the second-direction motion vector as the time-domain offset vector; when the second-direction reference frame and the image frame in which the corresponding sub-block is located are different and the first-direction reference frame and the image frame in which the corresponding sub-block is located are the same, use the first-direction motion vector as the time-domain offset vector; when the image frame in which the corresponding sub-block is located is obtained from the first reference frame list, determine whether the first-direction reference frame and the image frame in which the corresponding sub-block is located are the same
- the third preset motion vector is used as the time-domain offset vector.
- the use of the motion vector of the spatial neighboring block at the second preset position as the time-domain offset vector is specifically: when the image frame in which the corresponding sub-block is located is obtained from the second reference frame list and the display order of all reference frames in the reference frame list of the block to be processed is before the image frame where the block to be processed is located, determine whether the second-direction reference frame and the image frame where the corresponding sub-block is located are the same; when they are the same, the second-direction motion vector is used as the time-domain offset vector; when the second-direction reference frame and the image frame where the corresponding sub-block is located are different and the first-direction reference frame and the image frame where the corresponding sub-block is located are the same, the first-direction motion vector is used as the time-domain offset vector; when the image frame in which the corresponding sub-block is located is obtained from the first reference frame list, or the display order of at least one reference frame in the reference frame list of the block to be processed is after the image frame where the block to be processed is located, determine whether the first-direction reference frame and the image frame where the corresponding sub-block is located are the same
- the third preset motion vector is used as the time-domain offset vector.
- whether the image frame in which the corresponding sub-block is located is obtained from the first/second reference frame list can be obtained according to the syntax element collocated_from_l0_flag in the parsing code stream.
- when collocated_from_l0_flag is 1, the image frame (corresponding image) in which the corresponding sub-block is located is obtained from the first reference frame list
- when the collocated_from_l0_flag information is not carried in the code stream, the image frame in which the corresponding sub-block is located is obtained from the first reference frame list by default.
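The list-dependent choice of the time-domain offset vector can be sketched as follows. Reference frames are compared by POC, as in the earlier condition on the POC of the list1 reference frame. The `Neighbor` record, the function name `select_offset`, and the zero vector as the third preset motion vector follow the text where it is complete. The fallback to the second-direction motion vector in the list0 branch is an assumption, because the source text is truncated at that point. The low-delay variant of the condition is not modeled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MV = Tuple[int, int]
ZERO_MV: MV = (0, 0)  # third preset motion vector


@dataclass
class Neighbor:
    """Hypothetical record for the spatial neighbor at the second preset position."""
    mv_l0: Optional[MV] = None        # first-direction motion vector (list0)
    ref_poc_l0: Optional[int] = None  # POC of the first-direction reference frame
    mv_l1: Optional[MV] = None        # second-direction motion vector (list1)
    ref_poc_l1: Optional[int] = None  # POC of the second-direction reference frame


def select_offset(nb: Neighbor, col_poc: int, collocated_from_l0_flag: int = 1) -> MV:
    """Pick the offset vector whose reference frame is the corresponding image."""
    l0_hit = nb.mv_l0 is not None and nb.ref_poc_l0 == col_poc
    l1_hit = nb.mv_l1 is not None and nb.ref_poc_l1 == col_poc
    if collocated_from_l0_flag == 0:
        # Corresponding image comes from list1: check list1 first, then list0.
        order = ((l1_hit, nb.mv_l1), (l0_hit, nb.mv_l0))
    else:
        # Corresponding image comes from list0 (also the default when the flag
        # is absent). The list1 fallback here is an assumption.
        order = ((l0_hit, nb.mv_l0), (l1_hit, nb.mv_l1))
    for hit, mv in order:
        if hit:
            return mv
    return ZERO_MV


if __name__ == "__main__":
    nb = Neighbor(mv_l0=(2, 2), ref_poc_l0=4, mv_l1=(-6, 1), ref_poc_l1=12)
    print(select_offset(nb, col_poc=12, collocated_from_l0_flag=0))
```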
- the display order of all reference frames in the reference frame list of the to-be-processed block is before the image frame where the to-be-processed block is located, that is, a low-delay encoding frame structure is adopted.
- the reference frames used are all before the current frame to be encoded in display order.
- the reference frames used are all before the current frame to be decoded in display order.
- the spatial neighboring block will not adopt bidirectional prediction, and implementations involving bidirectional prediction may not achieve good technical effects. Therefore, optionally, before performing the above-mentioned implementation involving bidirectional prediction, it can be judged whether the coding region where the spatial neighboring block is located is of type B.
- S1202. Determine whether the motion vector corresponding to the position in the preset block of the corresponding sub-block is available.
- the position of the corresponding sub-block of the sub-block in the corresponding image may be determined according to the position coordinates of a sub-block (may be called an example sub-block) in the block to be processed and the time-domain offset vector determined in step S1201.
- a sub-block may be called an example sub-block
- the position of the corresponding sub-block can be obtained in the manner of formula (4) herein.
- x and y respectively represent the horizontal and vertical coordinates of the upper left vertex of the block to be processed
- i and j indicate the position of the example sub-block within the block to be processed, that is, the i-th from left to right and the j-th from top to bottom
- x off and y off represent the horizontal and vertical coordinates of the time domain offset vector
- M and N represent the width and height of the sub-block
- x (i, j) and y (i, j) respectively represent the position coordinates of the corresponding sub-blocks of the example sub-blocks (may be referred to as the corresponding sub-blocks for short).
- M/2 and N/2 in formula (4) indicate that the position in the preset block is the geometric center position of the corresponding sub-block.
- the position in the preset block can also be other positions in the block such as the upper left vertex of the corresponding sub-block, and is not limited.
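The position computation can be sketched from the symbol definitions above. Formula (4) itself is missing from this extract, so the form below is a reconstruction under stated assumptions: i and j are zero-based, the offset vector is in whole pixels, and integer division stands in for M/2 and N/2. The function name `corresponding_center` is illustrative.

```python
from typing import Tuple


def corresponding_center(x: int, y: int, i: int, j: int,
                         m: int, n: int,
                         x_off: int, y_off: int) -> Tuple[int, int]:
    """Center of the corresponding sub-block for sub-block (i, j).

    (x, y)         upper-left vertex of the block to be processed
    (i, j)         column and row index of the sub-block (zero-based, assumed)
    (m, n)         sub-block width and height
    (x_off, y_off) time-domain offset vector
    """
    x_ij = x + m * i + m // 2 + x_off   # M/2 selects the horizontal center
    y_ij = y + n * j + n // 2 + y_off   # N/2 selects the vertical center
    return (x_ij, y_ij)


if __name__ == "__main__":
    # Second sub-block in the top row of a block at (16, 32) with 8x8 sub-blocks.
    print(corresponding_center(16, 32, i=1, j=0, m=8, n=8, x_off=-3, y_off=5))
```

Dropping the `m // 2` and `n // 2` terms would give the upper-left vertex of the corresponding sub-block, which the text names as an alternative preset position.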
- the motion vector corresponding to the preset position in the corresponding sub-block can also be used as the motion vector of the corresponding sub-block.
- the prediction unit in the corresponding image where the position coordinates are located can be determined, and whether the motion vector corresponding to the position in the preset block of the corresponding sub-block is available can be determined according to the prediction information of the prediction unit.
- the prediction unit is a result of the actual encoding of the corresponding image, and may be inconsistent with the corresponding sub-block.
- when the prediction mode of the prediction unit is inter prediction, the motion vector corresponding to the position in the preset block is available, and when the prediction mode of the prediction unit is intra prediction or intra block copy mode, the motion vector corresponding to the position in the preset block is not available.
- the prediction mode information of the prediction unit may be examined, and the prediction mode of the prediction unit may be determined according to the prediction mode information as intra prediction, inter prediction, or intra block copy mode and other modes.
- the motion information of the prediction unit can be examined, for example, the prediction direction can be examined.
- if the prediction direction flag predFlagL0 and/or predFlagL1 is 1, the mode is inter prediction; otherwise it is intra prediction or another prediction mode in which motion vectors are not available.
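The two availability checks described above can be sketched as follows. The `PredMode` enumeration and both function names are illustrative assumptions; only the flag names predFlagL0 and predFlagL1 come from the text.

```python
from enum import Enum


class PredMode(Enum):
    """Hypothetical prediction-mode labels for a prediction unit."""
    INTRA = 0
    INTER = 1
    IBC = 2   # intra block copy


def mv_available_by_mode(mode: PredMode) -> bool:
    """Check via prediction mode: only inter prediction provides a usable MV."""
    return mode is PredMode.INTER


def mv_available_by_flags(pred_flag_l0: int, pred_flag_l1: int) -> bool:
    """Check via prediction direction: either flag set to 1 means inter prediction."""
    return pred_flag_l0 == 1 or pred_flag_l1 == 1


if __name__ == "__main__":
    print(mv_available_by_mode(PredMode.IBC))   # False
    print(mv_available_by_flags(0, 1))          # True
```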
- when the motion vector of the corresponding sub-block is not available, the motion vector of the sub-block of the block to be processed is obtained according to the first preset motion vector.
- when the motion vector corresponding to the position in the preset block is available, the motion vector of the sub-block of the block to be processed is obtained according to the motion vector corresponding to the position in the preset block.
- the motion vector corresponding to the position in the preset block may be scaled to obtain the motion vector of the sub-block of the block to be processed.
- the first time-domain distance difference is the difference in picture sequence count between the image frame in which the block to be processed is located and the reference frame of the block to be processed
- the second time-domain distance difference is the corresponding sub-block
- the index of the reference frame of the to-be-processed block in the reference frame list of the to-be-processed block is obtained by parsing the code stream, that is, the decoding end can determine the reference frame of the to-be-processed block by parsing the corresponding information in the code stream.
- the image frame with the best performance can be determined as the reference frame of the block to be processed by RDO selection, or a certain frame can be designated as the reference frame of the block to be processed, and the indication information of the reference frame of the block to be processed is written into the code stream.
- the reference frame of the block to be processed can also be set according to a prior agreement between the encoding end and the decoding end.
- the index of the reference frame of the to-be-processed block in the reference frame list of the to-be-processed block is 0.
- the motion vector corresponding to the position in the preset block may be directly obtained from the motion vector storage unit corresponding to the position, may be obtained from the motion vector storage unit of an adjacent position, or may be obtained by interpolation filtering from the motion vectors in the motion vector storage units of adjacent positions, which is not limited.
- when the motion vector corresponding to the position in the preset block is not available, the motion vector of the sub-block of the block to be processed is obtained according to the first preset motion vector.
- the motion vector of the sub-block includes a first-direction sub-block motion vector based on a first reference frame list and/or a second-direction sub-block motion vector based on a second reference frame list, and obtaining the motion vector of the sub-block of the block to be processed according to the first preset motion vector is specifically: determining that the sub-block of the block to be processed adopts unidirectional prediction based on the first-direction sub-block motion vector, and obtaining the first-direction sub-block motion vector of the sub-block of the block to be processed according to the first preset motion vector; or, determining that the sub-block of the block to be processed adopts unidirectional prediction based on the second-direction sub-block motion vector, and obtaining the second-direction sub-block motion vector of the sub-block of the block to be processed according to the first preset motion vector.
- the motion vector of the sub-block includes a first-direction sub-block motion vector based on a first reference frame list and a second-direction sub-block motion vector based on a second reference frame list.
- when the motion vector corresponding to the position in the preset block is not available, the motion vector of the sub-block of the block to be processed is acquired according to the first preset motion vector, specifically: when the prediction type of the coding region where the block to be processed is located is B-type prediction, it is determined that the sub-block of the block to be processed adopts bidirectional prediction, and the first-direction sub-block motion vector and the second-direction sub-block motion vector of the sub-block of the block to be processed are respectively obtained according to the first preset motion vector; when the prediction type of the coding region where the block to be processed is located is P-type prediction, it is determined that the sub-block of the block to be processed adopts unidirectional prediction, and the first-direction sub-block motion vector of the sub-block of the block to be processed is obtained according to the first preset motion vector.
- the coding region includes: an image, a slice, a tile, or a tile group.
- obtaining the motion vector of the sub-block of the block to be processed according to the first preset motion vector includes: using the first preset motion vector as the motion vector of the sub-block of the block to be processed.
- the first preset motion vector is a zero motion vector.
- the flag predFlagL0 of the first-direction prediction can be set to 1, the flag predFlagL1 of the second-direction prediction is 0, and the first-direction sub-block motion vector mvL0 can be set to (0, 0);
- the flag predFlagL0 for the first-direction prediction can be set to 0, the flag predFlagL1 for the second-direction prediction is 1, and the second-direction sub-block motion vector mvL1 is (0, 0);
- the flag predFlagL0 of the first-direction prediction can be set to 1, the flag predFlagL1 of the second-direction prediction to 1, the first-direction sub-block motion vector mvL0 to (0, 0), and the second-direction sub-block motion vector mvL1 to (0, 0); otherwise (when the coding region of the example sub-block is not the above-mentioned bidirectional prediction region), the flag predFlagL0 of the first-direction prediction can be set to 1, the flag predFlagL1 of the second-direction prediction to 0, and the first-direction sub-block motion vector mvL0 to (0, 0).
- the processing described above for the example sub-block is performed for each sub-block of the block to be processed, and the predicted value of each sub-block can be obtained. Since the block to be processed is composed of sub-blocks, once the predicted value of each sub-block is determined, the predicted value of the block to be processed is determined accordingly.
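The default assignment for a sub-block whose corresponding motion vector is unavailable can be sketched as follows. The dictionary layout, the function name `default_sub_block_motion`, and the boolean `is_b_region` (true when the coding region is of type B) are assumptions for illustration; the flag and vector names are those used in the text, with the zero vector as the first preset motion vector.

```python
from typing import Dict, Optional, Tuple, Union

MV = Tuple[int, int]
ZERO_MV: MV = (0, 0)  # first preset motion vector

Motion = Dict[str, Union[int, Optional[MV]]]


def default_sub_block_motion(is_b_region: bool) -> Motion:
    """Return the default motion information of a sub-block."""
    if is_b_region:
        # Bidirectional prediction: both directions use the zero vector.
        return {"predFlagL0": 1, "predFlagL1": 1, "mvL0": ZERO_MV, "mvL1": ZERO_MV}
    # Otherwise: unidirectional prediction from the first reference frame list.
    return {"predFlagL0": 1, "predFlagL1": 0, "mvL0": ZERO_MV, "mvL1": None}


if __name__ == "__main__":
    print(default_sub_block_motion(is_b_region=True))
    print(default_sub_block_motion(is_b_region=False))
```

Because the default needs no lookup in the corresponding image, it can be assigned without the extra default-vector derivation of the earlier embodiment.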
- An inter-frame prediction apparatus 1300 includes:
- the offset acquisition module 1301 is configured to determine the time-domain offset vector of the block to be processed according to the spatial neighboring blocks of the block to be processed, where the time-domain offset vector is used to determine the corresponding sub-block of the sub-block of the block to be processed; the motion vector acquisition module 1302 is configured to determine the motion vector of the sub-block of the block to be processed according to the motion vector of the corresponding sub-block, wherein, when the motion vector of the corresponding sub-block is not available, the motion vector of the sub-block of the block to be processed is obtained according to the first preset motion vector.
- the offset acquisition module 1301 is specifically configured to: check, in a preset order, whether the motion vectors of the spatial neighboring blocks at multiple first preset positions are available, until the first spatial neighboring block in the preset order whose motion vector is available is found; and use the motion vector of that spatial neighboring block as the time-domain offset vector.
- the offset acquisition module 1301 is specifically configured to: when none of the motion vectors of the spatial neighboring blocks at the multiple first preset positions is available, use the second preset motion vector as the time-domain offset vector.
- the second preset motion vector is a zero motion vector.
- the offset acquisition module 1301 is specifically configured to: acquire the motion vector and reference frame of the spatial neighboring block at the second preset position, wherein the motion vector of the spatial neighboring block at the second preset position is available; and use the motion vector of the spatial neighboring block at the second preset position as the time-domain offset vector.
- the offset acquisition module 1301 is specifically configured to: when the motion vector of the spatial neighboring block at the second preset position is not available, use the third preset motion vector as the time-domain offset vector.
- the third preset motion vector is a zero motion vector.
- the motion vector of the spatial neighboring block at the second preset position includes a first-direction motion vector based on the first reference frame list, and the reference frame of the spatial neighboring block at the second preset position includes the first-direction reference frame corresponding to the first-direction motion vector; the offset acquisition module 1301 is specifically configured to: when the first-direction reference frame and the image frame where the corresponding sub-block is located are the same, use the first-direction motion vector as the time-domain offset vector.
- the offset acquisition module 1301 is specifically configured to: use the third preset motion vector as the time-domain offset vector.
- the motion vector of the spatial neighboring block at the second preset position further includes the second-direction motion vector based on the second reference frame list, and the reference frame of the spatial neighboring block at the second preset position includes the second-direction reference frame corresponding to the second-direction motion vector; when the first-direction reference frame and the image frame in which the corresponding sub-block is located are different, the offset acquisition module 1301 is specifically configured to: when the second-direction reference frame and the image frame in which the corresponding sub-block is located are the same, use the second-direction motion vector as the time-domain offset vector; when the second-direction reference frame and the image frame in which the corresponding sub-block is located are different, use the third preset motion vector as the time-domain offset vector.
- the motion vector of the spatial neighboring block at the second preset position includes the first-direction motion vector based on the first reference frame list and the second-direction motion vector based on the second reference frame list.
- the reference frame of the spatial neighboring block at the second preset position includes the first-direction reference frame corresponding to the first-direction motion vector.
- the offset acquisition module 1301 is specifically configured to: when the image frame in which the corresponding sub-block is located is obtained from the second reference frame list: when the second-direction reference frame and the image frame in which the corresponding sub-block is located are the same, use the second-direction motion vector as the time-domain offset vector; when the second-direction reference frame and the image frame in which the corresponding sub-block is located are different and the first-direction reference frame and the image frame in which the corresponding sub-block is located are the same, use the first-direction motion vector as the time-domain offset vector; when the image frame in which the corresponding sub-block is located is obtained from the first reference frame list: when the first-direction reference frame and the image frame in which the corresponding sub-block is located are the same, use the first-direction motion vector as the time-domain offset vector; when the first-direction reference frame and the image frame in which the corresponding sub-
- the offset acquisition module 1301 is specifically configured to: when the image frame in which the corresponding sub-block is located is obtained from the second reference frame list and the display order of all the reference frames in the reference frame list of the block to be processed is before the image frame where the block to be processed is located: when the second-direction reference frame and the image frame in which the corresponding sub-block is located are the same, use the second-direction motion vector as the time-domain offset vector; when the second-direction reference frame and the image frame in which the corresponding sub-block is located are different and the first-direction reference frame and the image frame in which the corresponding sub-block is located are the same, use the first-direction motion vector as the time-domain offset vector; when the image frame in which the corresponding sub-block is located is obtained from the first reference frame list, or the display order of at least one reference frame in the reference frame list of the block to be processed is after the image frame where the block to be processed is located: when the first-direction reference frame and the image frame where the corresponding sub-block is
- the offset acquisition module 1301 is specifically configured to: use the third preset motion vector as the time domain offset vector.
- the index of the image frame in which the corresponding sub-block is located in the reference frame list of the spatial neighboring block of the block to be processed is obtained by parsing the code stream.
- the condition that the motion vector of the spatial neighboring block is unavailable includes one or a combination of the following: the spatial neighboring block has not been encoded/decoded; or, the spatial neighboring block uses intra prediction or intra block copy mode; or, the spatial neighboring block does not exist; or, the spatial neighboring block and the block to be processed are located in different coding regions.
- the coding region includes: a picture, a slice, a tile, or a tile group.
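The unavailability conditions listed above can be sketched as a single predicate. This is a minimal illustration, not the codec's actual data model: the `Neighbor` record, its field names, and the use of one integer `region_id` to stand for the picture, slice, tile, or tile group are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Neighbor:
    """Hypothetical record describing a spatial neighboring block."""
    coded: bool      # True once the block has been encoded/decoded
    pred_mode: str   # "inter", "intra", or "ibc" (intra block copy)
    region_id: int   # assumed identifier of the coding region it lies in


def mv_unavailable(neighbor: Optional[Neighbor], current_region_id: int) -> bool:
    """Return True if the neighbor's motion vector must be treated as unavailable."""
    if neighbor is None:                        # the neighboring block does not exist
        return True
    if not neighbor.coded:                      # not yet encoded/decoded
        return True
    if neighbor.pred_mode in ("intra", "ibc"):  # no inter motion vector to borrow
        return True
    # Located in a different coding region than the block to be processed.
    return neighbor.region_id != current_region_id
```

Any one of the four conditions is sufficient, so the checks are ordered from cheapest to most specific and return early.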
- it further includes: a judging module 1303 configured to determine whether the motion vector corresponding to the preset in-block position of the corresponding sub-block is available; correspondingly, the motion vector acquiring module 1302 is specifically configured to: when the motion vector corresponding to the preset in-block position is available, obtain the motion vector of the sub-block of the block to be processed according to the motion vector corresponding to the preset in-block position; when the motion vector corresponding to the preset in-block position is not available, obtain the motion vector of the sub-block of the block to be processed according to the first preset motion vector.
- the preset in-block position is the geometric center position of the corresponding sub-block.
- when the prediction unit in which the preset in-block position is located uses intra prediction or intra block copy mode, the motion vector corresponding to the preset in-block position is not available;
- when the prediction unit in which the preset in-block position is located uses inter prediction, the motion vector corresponding to the preset in-block position is available.
- the motion vector acquiring module 1302 is specifically configured to use the first preset motion vector as the motion vector of the sub-block of the block to be processed.
- the first preset motion vector is a zero motion vector.
- the motion vector of the sub-block includes the first-direction sub-block motion vector based on the first reference frame list and/or the second-direction sub-block motion vector based on the second reference frame list; when the motion vector corresponding to the preset in-block position of the corresponding sub-block is not available, the motion vector acquisition module 1302 is specifically configured to: determine that the sub-block of the block to be processed uses unidirectional prediction based on the first-direction sub-block motion vector, and obtain the first-direction sub-block motion vector of the sub-block of the block to be processed according to the first preset motion vector; or determine that the sub-block of the block to be processed uses unidirectional prediction based on the second-direction sub-block motion vector, and obtain the second-direction sub-block motion vector of the sub-block of the block to be processed according to the first preset motion vector.
- the motion vector obtaining module 1302 is specifically configured to: when the prediction type of the coding region in which the block to be processed is located is B-type prediction, determine that the sub-block of the block to be processed uses bidirectional prediction, and obtain the first-direction sub-block motion vector and the second-direction sub-block motion vector of the sub-block of the block to be processed respectively according to the first preset motion vector; when the prediction type of the coding region in which the block to be processed is located is P-type prediction, determine that the sub-block of the block to be processed uses unidirectional prediction, and obtain the first-direction sub-block motion vector of the sub-block of the block to be processed according to the first preset motion vector.
- the motion vector acquisition module 1302 is specifically configured to: scale the motion vector corresponding to the preset in-block position based on the ratio of the first time-domain distance difference to the second time-domain distance difference, to obtain the motion vector of the sub-block of the block to be processed, wherein the first time-domain distance difference is the picture order count (POC) difference between the image frame in which the block to be processed is located and the reference frame of the block to be processed, and the second time-domain distance difference is the picture order count difference between the image frame in which the corresponding sub-block is located and the reference frame of the corresponding sub-block.
- the index of the reference frame of the to-be-processed block in the reference frame list of the to-be-processed block is obtained by parsing a code stream.
- the index of the reference frame of the to-be-processed block in the reference frame list of the to-be-processed block is 0.
- each module in the embodiment of the present application shown in FIG. 13 is used to execute the method shown in FIG. 12 and each feasible implementation manner, and has the same technical effect.
- Step 1 Determine the offset motion vector
- the offset motion vector is used to determine the position, in the collocated picture, of the block corresponding to the current CU; the motion vector of a spatial neighboring block of the current CU may be used as the offset motion vector. The position of the current CU in the corresponding image is determined according to the offset motion vector and the corresponding image of the current frame, and the block at that position is called the corresponding block (corresponding/collocated block).
- the acquisition of the offset motion vector may be one of the following methods:
- Method 1 In an implementation, as shown in Fig. 6, if A1 is available, the offset motion vector is determined according to the following method.
- A1 is considered unavailable if the block at A1 has not been decoded, or is not an inter prediction block (that is, it uses intra prediction or intra block copy, IBC), or lies outside the current slice, tile, tile group, or picture.
- the intra block copy mode may also be referred to as the intra block copy mode.
- if all of the following conditions are met, the offset motion vector of the current block is the motion vector corresponding to list1 of A1:
- A1 uses list1 for prediction
- the reference frame that A1 predicts from using list1 is the same as the collocated picture of the current frame (this is determined by checking whether the POCs are the same, that is, whether the POC of the reference frame index equals the POC of the corresponding image index; the index of the corresponding image of the current block can be obtained from the code stream)
- a low-delay coding structure is used, that is, only images whose display order is before the current image are used for prediction
- the image type, slice type, or tile group type of the current block is B
- collocated_from_l0_flag is 0, where collocated_from_l0_flag equal to 1 means that the co-located image for temporal motion vector prediction is obtained from the reference image list list0, and collocated_from_l0_flag equal to 0 means that it is obtained from the reference image list list1; when the flag does not appear in the code stream, its value is 1
- otherwise, if all of the following conditions are met, the offset motion vector is the motion vector corresponding to list0 of A1:
- A1 uses list0 for prediction
- the reference frame that A1 predicts from using list0 is the same as the collocated picture of the current frame.
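Method 1 can be read as two ordered condition sets: list1 of A1 is tried first under the low-delay, B-type, and collocated_from_l0_flag conditions, and list0 of A1 is tried next. The sketch below follows that reading, which is an interpretation of the fragmentary condition list rather than a confirmed transcription. The `NeighborMotion` record and its fields are hypothetical, reference pictures are compared by POC, and falling back to the zero vector when neither condition set holds is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MV = Tuple[int, int]
ZERO_MV: MV = (0, 0)


@dataclass
class NeighborMotion:
    """Hypothetical motion data of neighbor A1; None means the list is unused."""
    mv_l0: Optional[MV]
    ref_poc_l0: Optional[int]
    mv_l1: Optional[MV]
    ref_poc_l1: Optional[int]


def offset_mv_method1(a1: Optional[NeighborMotion], col_poc: int,
                      low_delay: bool, slice_is_b: bool,
                      collocated_from_l0_flag: int) -> MV:
    """Pick the offset motion vector from A1, trying list1 before list0."""
    if a1 is None:  # A1 unavailable
        return ZERO_MV
    # First condition set: A1 uses list1, its list1 reference is the collocated
    # picture, low-delay structure, B type, and collocated_from_l0_flag == 0.
    if (a1.mv_l1 is not None and a1.ref_poc_l1 == col_poc
            and low_delay and slice_is_b and collocated_from_l0_flag == 0):
        return a1.mv_l1
    # Second condition set: A1 uses list0 and its list0 reference is the
    # collocated picture.
    if a1.mv_l0 is not None and a1.ref_poc_l0 == col_poc:
        return a1.mv_l0
    return ZERO_MV  # assumed fallback: zero offset vector
```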
- Method 2 For example, the neighboring blocks are checked in the order A1, B1, B0, A0 in Fig. 6 to find the first available motion vector; if it points to the corresponding image, it is used as the offset motion vector of the current CU. Otherwise, the zero motion vector can be used, or the motion vector can be scaled so that it points to the corresponding image and then used as the offset motion vector of the current CU.
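A minimal sketch of the Method 2 scan follows. It assumes the neighbors are supplied as a dictionary from position name to a `(motion vector, reference POC)` pair, with unavailable positions absent or set to `None`; that container is hypothetical. Only the zero-vector fallback is shown; the scaling alternative mentioned in Method 2 is left out.

```python
from typing import Dict, Optional, Tuple

MV = Tuple[int, int]
SCAN_ORDER = ("A1", "B1", "B0", "A0")  # checking order from Method 2


def offset_mv_method2(neighbors: Dict[str, Optional[Tuple[MV, int]]],
                      col_poc: int) -> MV:
    """Use the first available neighbor MV if it points to the collocated picture."""
    for name in SCAN_ORDER:
        candidate = neighbors.get(name)
        if candidate is None:  # this neighbor has no available motion vector
            continue
        mv, ref_poc = candidate
        # Only the first available motion vector is examined.
        return mv if ref_poc == col_poc else (0, 0)
    return (0, 0)  # no neighbor had an available motion vector
```

Note that the scan stops at the first available motion vector even when that vector does not point to the corresponding image; later neighbors are not consulted.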
- the offset vector used may also be a zero offset vector.
- the image block in the corresponding image at the same position as the block to be processed is the corresponding block of the block to be processed in the corresponding image.
- the ATMVP technology may not be used, but other technologies may be used to obtain the motion vector of the sub-block to be processed.
- Step 2 Obtain ATMVP availability information and default motion information
- the corresponding block can be obtained according to the offset vector first, and the prediction mode of the corresponding sub-block S at the preset position in the corresponding block can be obtained, and the coordinates (xcol, ycol) of the preset position can be obtained according to formula (6).
- the default motion information and availability of ATMVP are obtained according to the prediction mode and motion information of the corresponding sub-block S.
- the specific method is as follows:
- (x, y) represents the coordinates of the upper left vertex of the current CU
- (x_off, y_off) represents the offset motion vector
- W represents the current CU width
- H represents the current CU height.
- if the prediction mode of the sub-block S corresponding to the preset position is the intra prediction mode or the intra block copy mode, the ATMVP motion information is not available.
- otherwise, the motion information of the corresponding sub-block S is further extracted: the coordinates of the preset position are obtained according to formula (6), and the motion information at that position in the motion vector field of the corresponding image is used as the motion information of the corresponding sub-block S.
- the motion information of the corresponding sub-block S is called the default motion information of the corresponding sub-block.
- the default motion vector MV of the corresponding sub-block S is scaled to obtain the default motion vector MVs of the sub-block to be processed, and the scaled motion vector MVs is used as the default motion information.
- formula (7) can be used to obtain the scaled MVs.
- the scaling method is not specifically limited here.
- the POC number of the frame where the current block is located is CurPoc
- the POC number of the reference frame of the current block is CurRefPoc
- the POC number of the corresponding image is ColPoc
- the POC number of the reference frame of the corresponding sub-block is ColRefPoc
- the motion vector to be scaled is MV.
- the MV is decomposed into a horizontal motion vector MVx and a vertical motion vector MVy, which are respectively calculated according to the above formula to obtain a horizontal motion vector MVsx and a vertical motion vector MVsy.
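The scaling described here multiplies each component of MV by the ratio of the two POC distances, (CurPoc - CurRefPoc) / (ColPoc - ColRefPoc). The sketch below applies that ratio to the horizontal and vertical components separately. Formula (7) is not reproduced in this text, so the rounding rule (nearest integer, ties away from zero) and the absence of clipping are assumptions made for the example.

```python
from typing import Tuple

MV = Tuple[int, int]


def scale_component(mv: int, cur_poc: int, cur_ref_poc: int,
                    col_poc: int, col_ref_poc: int) -> int:
    """Scale one MV component by (CurPoc - CurRefPoc) / (ColPoc - ColRefPoc)."""
    tb = cur_poc - cur_ref_poc   # POC distance of the current block
    td = col_poc - col_ref_poc   # POC distance of the corresponding sub-block
    if td == 0:
        raise ValueError("ColPoc and ColRefPoc must differ")
    num = mv * tb
    sign = -1 if (num < 0) != (td < 0) else 1
    # Integer round-to-nearest with ties away from zero (assumed rounding rule).
    return sign * ((2 * abs(num) + abs(td)) // (2 * abs(td)))


def scale_mv(mv: MV, cur_poc: int, cur_ref_poc: int,
             col_poc: int, col_ref_poc: int) -> MV:
    """Scale MVx and MVy independently to obtain (MVsx, MVsy)."""
    return (scale_component(mv[0], cur_poc, cur_ref_poc, col_poc, col_ref_poc),
            scale_component(mv[1], cur_poc, cur_ref_poc, col_poc, col_ref_poc))
```

When the two POC distances are equal, the ratio is 1 and the motion vector is returned unchanged.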
- Step 3 Determine the motion information of the sub-block to be processed according to the motion information of the corresponding sub-block
- the corresponding sub-block of the sub-block in the corresponding image is determined, and the motion information of the corresponding sub-block is obtained.
- the prediction mode of the corresponding sub-block is obtained. If the prediction mode of the corresponding sub-block is inter prediction and the motion information of the corresponding sub-block is available, the motion information at the corresponding position in the motion vector field of the corresponding image is used as the motion information of the corresponding sub-block. The motion information of the corresponding sub-block is then scaled to obtain the motion information of the sub-block to be processed.
- the scaling method is the same as the method in step 2, and will not be repeated here.
- if the motion information of the corresponding sub-block is not available, the default motion information obtained in step 2 can be used as the motion information of the corresponding sub-block.
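Step 3 amounts to a per-sub-block choice between a scaled collocated motion vector and the default from step 2. The following sketch illustrates that choice under assumed inputs: each corresponding sub-block is given as a `(prediction mode, motion vector)` pair, the scaling is passed in as a callable, and the default motion vector is taken to be already scaled, as step 2 describes.

```python
from typing import Callable, List, Optional, Tuple

MV = Tuple[int, int]


def derive_subblock_mvs(col_subblocks: List[Tuple[str, Optional[MV]]],
                        default_mv: MV,
                        scale: Callable[[MV], MV]) -> List[MV]:
    """Return one motion vector per sub-block of the block to be processed."""
    result: List[MV] = []
    for pred_mode, col_mv in col_subblocks:
        if pred_mode == "inter" and col_mv is not None:
            # Motion information available: scale it for the current sub-block.
            result.append(scale(col_mv))
        else:
            # Intra / intra block copy or missing motion: use the step 2 default.
            result.append(default_mv)
    return result
```

Because the default is computed once in step 2, an unavailable corresponding sub-block costs no further lookup.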
- Step 4 Perform motion compensation prediction according to the motion information of each sub-block to obtain the predicted pixel value of the current CU
- the coordinates of the pixel point at the upper left corner of the sub-block are added to the motion vector to find the corresponding coordinate point in the reference frame. If the motion vector has sub-pixel accuracy, interpolation filtering is required to obtain the predicted pixel value of the sub-block; otherwise, the pixel value in the reference frame is directly obtained as the predicted pixel value of the sub-block.
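The position lookup in step 4 can be illustrated as follows. The text does not state the motion vector precision, so the sketch assumes motion vectors are stored in fixed-point units with `frac_bits` fractional bits (4 bits, that is 1/16 sample, is only a placeholder default). The function returns the integer sample position in the reference frame, the fractional phase, and whether interpolation filtering would be needed.

```python
from typing import Tuple


def reference_position(x: int, y: int, mv: Tuple[int, int],
                       frac_bits: int = 4) -> Tuple[Tuple[int, int], Tuple[int, int], bool]:
    """Add a fixed-point MV to the top-left sample position of a sub-block."""
    mask = (1 << frac_bits) - 1
    mvx, mvy = mv
    # Arithmetic right shift floors toward negative infinity, so the fractional
    # phase obtained with the mask is always non-negative.
    int_pos = (x + (mvx >> frac_bits), y + (mvy >> frac_bits))
    frac = (mvx & mask, mvy & mask)
    needs_interpolation = frac != (0, 0)  # sub-pixel accuracy requires filtering
    return int_pos, frac, needs_interpolation
```

When both fractional phases are zero, the reference samples can be copied directly as the predicted pixel values of the sub-block.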
- Step 1 Determine the offset motion vector
- the offset motion vector (offset motion vector) is used to determine the position of the corresponding block of the current CU in the corresponding image, and the offset motion vector may use the motion vector of the spatial neighboring block of the current CU.
- the reference frame of the neighboring block in the spatial domain is taken as the collocated picture of the current CU.
- the position of the current CU in the corresponding image block is determined, which is called a corresponding block (corresponding/collocated block).
- Step 2 Obtain the motion information of the corresponding sub-block
- the corresponding block can first be obtained according to the offset vector, and then, according to the position of the sub-block to be processed, the corresponding sub-block that has a relative position relationship with the sub-block to be processed is determined in the target image (this can also be understood as determining, in the corresponding block, the corresponding sub-block that has a relative position relationship with the sub-block to be processed).
- the prediction mode of the corresponding sub-block is obtained. If the prediction mode of the corresponding sub-block is inter prediction and the motion information of the corresponding sub-block is available, the motion information at the corresponding position in the motion vector field of the corresponding image is used as the motion information of the corresponding sub-block.
- the motion information of the current sub-block is derived from the motion information of the corresponding sub-block.
- the motion vector of the corresponding sub-block is scaled and converted into the motion vector of the sub-block.
- the scaling method can be a scaling method from the prior art, which will not be repeated here.
- if the prediction mode of the corresponding sub-block is intra prediction or intra block copy mode, the motion information of the corresponding sub-block is not available, and one of the following processing methods can be used:
- predFlagL0 = 1
- predFlagL1 = 0
- mvL0 = 0
- mvL1 = 0.
- predFlagL0 and predFlagL1 indicate whether list0 and list1, respectively, are used as prediction directions
- mvL0 and mvL1 represent the motion vectors used for prediction using list0 and list1, respectively
- mvL1 = 0 means that the horizontal and vertical components of mvL1 are both zero.
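The preset fill can be sketched as a small record. The field names mirror predFlagL0, predFlagL1, mvL0, and mvL1 from the list above; the `MotionInfo` class and the `bi_pred` switch are illustrative assumptions. The bidirectional variant corresponds to the B-type case described earlier for module 1302, in which both directions receive the preset (zero) motion vector.

```python
from dataclasses import dataclass
from typing import Tuple

ZERO_MV: Tuple[int, int] = (0, 0)  # both horizontal and vertical components are zero


@dataclass(frozen=True)
class MotionInfo:
    """Hypothetical container for the motion information of one sub-block."""
    pred_flag_l0: int
    pred_flag_l1: int
    mv_l0: Tuple[int, int]
    mv_l1: Tuple[int, int]


def fill_default_motion(bi_pred: bool = False) -> MotionInfo:
    """Preset motion information used when the corresponding sub-block has none."""
    if bi_pred:
        # B-type variant: predict from both lists with zero motion vectors.
        return MotionInfo(1, 1, ZERO_MV, ZERO_MV)
    # Listed option: unidirectional list0 prediction with zero motion vectors.
    return MotionInfo(1, 0, ZERO_MV, ZERO_MV)
```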
- Step 3 Perform motion compensation prediction according to the motion information of each sub-block to obtain the predicted pixel value of the current CU
- the coordinates of the pixel point at the upper left corner of the sub-block are added to the motion vector to find the corresponding coordinate point in the reference frame. If the motion vector has sub-pixel accuracy, interpolation filtering is required to obtain the predicted pixel value of the sub-block; otherwise, the pixel value in the reference frame is directly obtained as the predicted pixel value of the sub-block.
- Attached text revision (for the basis of the text revision, please refer to JVET-N1001-v3, which can also be consulted for the meaning of the following pseudo-code; the text can be downloaded from the website http://phenix.int-evry.fr/jvet/)
- by directly filling in preset motion information, this embodiment of the application avoids the prior-art need for a complex initial offset motion vector calculation to determine the ATMVP availability information and default motion information, which reduces the complexity of encoding and decoding.
- This embodiment relates to an inter-frame prediction method, which optimizes the method for obtaining offset motion vectors.
- Steps 2 and 3 are the same as those of Embodiment A, and the specific description is as follows:
- Step 1 Determine the offset motion vector
- the offset motion vector (offset motion vector) is used to determine the position of the corresponding block of the current CU in the corresponding image, and the offset motion vector may use the motion vector of the spatial neighboring block of the current CU.
- the reference frame of the neighboring block in the spatial domain is taken as the collocated picture of the current CU.
- the position of the current CU in the corresponding image block is determined, which is called a corresponding block (corresponding/collocated block).
- the motion vector of A1 is used as the offset motion vector of the current CU
- the acquisition of the offset motion vector may be one of the following methods. If A1 is not available, the offset motion vector value is 0.
- A1 is considered unavailable if the block at A1 has not been decoded, or is an intra prediction block or uses intra block copy mode, or lies outside the current slice, tile, tile group, or picture.
- Method 1 Determine whether the following preset conditions are all met. If they are: check whether the reference frame corresponding to list1 of A1 is the same as the corresponding image of the current frame; if they are the same, use the motion vector corresponding to list1 as the offset motion vector; if they are different, check whether the reference frame corresponding to list0 is the same as the corresponding image of the current frame; if they are the same, use the motion vector corresponding to list0 as the offset motion vector; otherwise, the offset motion vector is 0.
- the image type or slice type or tile group type of the current block is B.
- collocated_from_l0_flag is 0, where collocated_from_l0_flag equal to 1 means that the co-located image for temporal motion vector prediction is obtained from the reference image list list0, and collocated_from_l0_flag equal to 0 means that it is obtained from the reference image list list1. When the flag does not appear in the code stream, its value is 1.
- Method 2 Determine whether the following preset conditions are all met. If they are: check whether the reference frame corresponding to list1 of A1 is the same as the corresponding image of the current frame; if they are the same, use the motion vector corresponding to list1 as the offset motion vector; if they are different, check whether the reference frame corresponding to list0 is the same as the corresponding image of the current frame; if they are the same, use the motion vector corresponding to list0 as the offset motion vector; otherwise, the offset motion vector is 0.
- the image type or slice type or tile group type of the current block is B
- collocated_from_l0_flag is 0, where collocated_from_l0_flag equal to 1 means that the co-located image for temporal motion vector prediction is obtained from the reference image list list0, and collocated_from_l0_flag equal to 0 means that it is obtained from the reference image list list1. When the flag does not appear in the code stream, its value is 1.
- Method 3 First check whether the reference frame corresponding to list0 of A1 is the same as the corresponding image of the current frame. If they are the same, use the motion vector corresponding to list0 of A1 as the offset motion vector; there is then no need to check whether the reference frame corresponding to list1 is the same as the corresponding image of the current frame. If the reference frame corresponding to list0 of A1 is not the same as the corresponding image of the current frame, and the image type or slice type or tile group type of the current block is B, it is also necessary to determine whether the reference frame corresponding to list1 is the same as the corresponding image of the current frame. If so, the motion vector corresponding to list1 of A1 can be used as the offset motion vector; otherwise, the offset motion vector is 0.
- Method 4 Check whether the reference frame corresponding to the list0 of A1 and the corresponding image of the current frame are the same, if they are the same, use the motion vector corresponding to the list0 of A1 as the offset motion vector, otherwise, the offset motion vector is 0.
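Methods 3 and 4 differ only in whether list1 is consulted after list0 fails, so one sketch covers both. The `A1Motion` record and its fields are hypothetical, and reference frames are compared by POC as an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MV = Tuple[int, int]


@dataclass
class A1Motion:
    """Hypothetical motion data of neighbor A1; None means the list is unused."""
    mv_l0: Optional[MV]
    ref_poc_l0: Optional[int]
    mv_l1: Optional[MV] = None
    ref_poc_l1: Optional[int] = None


def offset_mv_list0_first(a1: Optional[A1Motion], col_poc: int,
                          check_list1: bool) -> MV:
    """Method 3 when check_list1 reflects a B-type block, Method 4 when it is False."""
    if a1 is None:  # A1 unavailable: the offset motion vector is 0
        return (0, 0)
    if a1.mv_l0 is not None and a1.ref_poc_l0 == col_poc:
        return a1.mv_l0  # list0 matches, so list1 is never examined
    if check_list1 and a1.mv_l1 is not None and a1.ref_poc_l1 == col_poc:
        return a1.mv_l1
    return (0, 0)
```

Calling the function with `check_list1=False` reproduces Method 4, which needs at most one reference frame comparison.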
- the index number (idx) of the corresponding image of the current frame of the image block can be obtained from the code stream.
- the offset vector used may also be a zero offset vector.
- the image block in the corresponding image at the same position as the block to be processed is the corresponding block of the block to be processed in the corresponding image.
- the ATMVP technology may not be used, but other technologies may be used to obtain the motion vector of the sub-block to be processed.
- checkL1First is set equal to 1:
- DiffPicOrderCnt(aPic, currPic) is less than or equal to 0 for every picture aPic in every reference picture list of the current slice
- tempMV is set equal to mvLKA1:
- K is set equal to (1 - checkL1First), and if all of the following conditions are true, tempMV is set equal to mvLKA1:
- checkL1First is set equal to 1:
- tempMV is set equal to mvLKA1:
- K is set equal to (1 - checkL1First), and if all of the following conditions are true, tempMV is set equal to mvLKA1:
- tempMV is set equal to mvL0A1:
- tempMV is set equal to mvL1A1:
- tempMV is set equal to mvL0A1:
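The pseudo-code fragments above can be illustrated with a short sketch. DiffPicOrderCnt(a, b) is the POC of picture a minus the POC of picture b, so the condition "less than or equal to 0 for every picture" expresses the low-delay structure. The full condition list for checkL1First is not reproduced in the fragments; combining the low-delay test with the B-type and collocated_from_l0_flag conditions, and checking list checkL1First before list K = 1 - checkL1First, are assumptions taken from the Method 1 description.

```python
from typing import List


def diff_pic_order_cnt(poc_a: int, poc_b: int) -> int:
    """POC difference between picture a and picture b."""
    return poc_a - poc_b


def check_l1_first(ref_pocs_per_list: List[List[int]], curr_poc: int,
                   slice_is_b: bool, collocated_from_l0_flag: int) -> int:
    """Return 1 if list1 of A1 should be examined first, else 0 (assumed conditions)."""
    low_delay = all(diff_pic_order_cnt(poc, curr_poc) <= 0
                    for ref_list in ref_pocs_per_list
                    for poc in ref_list)
    return int(low_delay and slice_is_b and collocated_from_l0_flag == 0)


def list_check_order(check_l1_first_flag: int) -> List[int]:
    """First the list indexed by checkL1First, then K = 1 - checkL1First."""
    return [check_l1_first_flag, 1 - check_l1_first_flag]
```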
- the embodiment of the present application provides a new method for calculating the offset motion vector, which improves the coding and decoding efficiency while reducing the coding and decoding complexity.
- the computer-readable medium may include a computer-readable storage medium, which corresponds to a tangible medium such as a data storage medium, or a communication medium, which includes any medium that facilitates the transfer of a computer program from one place to another (for example, according to a communication protocol).
- computer-readable media may generally correspond to (1) non-transitory tangible computer-readable storage media, or (2) communication media, such as signals or carrier waves.
- Data storage media can be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, codes, and/or data structures for implementing the techniques described in this application.
- the computer program product may include a computer-readable medium.
- such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage devices, magnetic disk storage devices or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium.
- for example, if instructions are transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of media.
- the computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory tangible storage media.
- magnetic disks and optical discs include compact discs (CD), laser discs, optical discs, digital versatile discs (DVD), and Blu-ray discs. Disks usually reproduce data magnetically, while discs use lasers to reproduce data optically. Combinations of the above should also be included in the scope of computer-readable media.
- DSP digital signal processors
- ASIC application-specific integrated circuits
- FPGA field programmable logic arrays
- the term "processor" as used herein may refer to any of the foregoing structures or any other structure suitable for implementing the techniques described herein.
- the functions described by the various illustrative logical blocks, modules, and steps described herein may be provided in dedicated hardware and/or software modules configured for encoding and decoding, or incorporated into a combined codec.
- the technology may be fully implemented in one or more circuits or logic elements.
- the technology of this application can be implemented in a variety of devices or apparatuses, including wireless handsets, integrated circuits (ICs), or a set of ICs (for example, chipsets).
- Various components, modules, or units are described in this application to emphasize the functional aspects of the device for performing the disclosed technology, but they do not necessarily need to be implemented by different hardware units.
- various units can be combined in a codec hardware unit together with appropriate software and/or firmware, or provided by interoperating hardware units (including one or more processors as described above).
Claims (68)
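- An inter prediction method, wherein a block to be processed comprises one or more sub-blocks, the method comprising: determining a time-domain offset vector of the block to be processed according to a spatial neighboring block of the block to be processed, wherein the time-domain offset vector is used to determine a corresponding sub-block of a sub-block of the block to be processed; and determining a motion vector of the sub-block of the block to be processed according to a motion vector of the corresponding sub-block, wherein, when the motion vector of the corresponding sub-block is unavailable, the motion vector of the sub-block of the block to be processed is obtained according to a first preset motion vector.
- The method according to claim 1, wherein determining the time-domain offset vector of the block to be processed according to the spatial neighboring block of the block to be processed comprises: checking, in a preset order, whether the motion vectors of spatial neighboring blocks at a plurality of first preset positions are available, until the motion vector of the first spatial neighboring block in the preset order whose motion vector is available is obtained; and using the motion vector of the first spatial neighboring block in the preset order whose motion vector is available as the time-domain offset vector.
- The method according to claim 2, wherein, when none of the motion vectors of the spatial neighboring blocks at the plurality of first preset positions is available, a second preset motion vector is used as the time-domain offset vector.
- The method according to claim 3, wherein the second preset motion vector is a zero motion vector.
- The method according to claim 1, wherein determining the time-domain offset vector of the block to be processed according to the spatial neighboring block of the block to be processed comprises: obtaining a motion vector and a reference frame of a spatial neighboring block at a second preset position, wherein the motion vector of the spatial neighboring block at the second preset position is available; and using the motion vector of the spatial neighboring block at the second preset position as the time-domain offset vector.
- The method according to claim 5, wherein, when the motion vector of the spatial neighboring block at the second preset position is unavailable, a third preset motion vector is used as the time-domain offset vector.
- The method according to claim 6, wherein the third preset motion vector is a zero motion vector.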
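- The method according to any one of claims 5 to 7, wherein the motion vector of the spatial neighboring block at the second preset position comprises a first-direction motion vector based on the first reference frame list, the reference frame of the spatial neighboring block at the second preset position comprises a first-direction reference frame corresponding to the first-direction motion vector, and using the motion vector of the spatial neighboring block at the second preset position as the time-domain offset vector comprises: when the first-direction reference frame is the same as the image frame in which the corresponding sub-block is located, using the first-direction motion vector as the time-domain offset vector.
- The method according to claim 8, wherein, when the first-direction reference frame differs from the image frame in which the corresponding sub-block is located, the method comprises: using the third preset motion vector as the time-domain offset vector.
- The method according to claim 8, wherein, when the spatial neighboring block at the second preset position uses bidirectional prediction, the motion vector of the spatial neighboring block at the second preset position further comprises a second-direction motion vector based on the second reference frame list, the reference frame of the spatial neighboring block at the second preset position comprises a second-direction reference frame corresponding to the second-direction motion vector, and when the first-direction reference frame differs from the image frame in which the time-domain corresponding block of the block to be processed is located, the method comprises: when the second-direction reference frame is the same as the image frame in which the corresponding sub-block is located, using the second-direction motion vector as the time-domain offset vector; and when the second-direction reference frame differs from the image frame in which the corresponding sub-block is located, using the third preset motion vector as the time-domain offset vector.
- The method according to any one of claims 5 to 7, wherein, when the spatial neighboring block at the second preset position uses bidirectional prediction, the motion vector of the spatial neighboring block at the second preset position comprises a first-direction motion vector based on the first reference frame list and a second-direction motion vector based on the second reference frame list, the reference frame of the spatial neighboring block at the second preset position comprises a first-direction reference frame corresponding to the first-direction motion vector and a second-direction reference frame corresponding to the second-direction motion vector, and using the motion vector of the spatial neighboring block at the second preset position as the time-domain offset vector comprises: when the image frame in which the corresponding sub-block is located is obtained from the second reference frame list: when the second-direction reference frame is the same as the image frame in which the corresponding sub-block is located, using the second-direction motion vector as the time-domain offset vector; and when the second-direction reference frame differs from the image frame in which the corresponding sub-block is located and the first-direction reference frame is the same as the image frame in which the corresponding sub-block is located, using the first-direction motion vector as the time-domain offset vector; and when the image frame in which the corresponding sub-block is located is obtained from the first reference frame list: when the first-direction reference frame is the same as the image frame in which the corresponding sub-block is located, using the first-direction motion vector as the time-domain offset vector; and when the first-direction reference frame differs from the image frame in which the corresponding sub-block is located and the second-direction reference frame is the same as the image frame in which the corresponding sub-block is located, using the second-direction motion vector as the time-domain offset vector.
- The method according to claim 11, wherein using the motion vector of the spatial neighboring block at the second preset position as the time-domain offset vector comprises: when the image frame in which the corresponding sub-block is located is obtained from the second reference frame list and the display order of all reference frames in the reference frame list of the block to be processed is before the image frame in which the block to be processed is located: when the second-direction reference frame is the same as the image frame in which the corresponding sub-block is located, using the second-direction motion vector as the time-domain offset vector; and when the second-direction reference frame differs from the image frame in which the corresponding sub-block is located and the first-direction reference frame is the same as the image frame in which the corresponding sub-block is located, using the first-direction motion vector as the time-domain offset vector; and when the image frame in which the corresponding sub-block is located is obtained from the first reference frame list or the display order of at least one reference frame in the reference frame list of the block to be processed is after the image frame in which the block to be processed is located: when the first-direction reference frame is the same as the image frame in which the corresponding sub-block is located, using the first-direction motion vector as the time-domain offset vector; and when the first-direction reference frame differs from the image frame in which the corresponding sub-block is located and the second-direction reference frame is the same as the image frame in which the corresponding sub-block is located, using the second-direction motion vector as the time-domain offset vector.
- The method according to claim 11 or 12, wherein, when the second-direction reference frame differs from the image frame in which the corresponding sub-block is located and the first-direction reference frame differs from the image frame in which the corresponding sub-block is located, the third preset motion vector is used as the time-domain offset vector.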
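- The method according to any one of claims 8 to 13, wherein the index, in the reference frame list of the spatial neighboring block of the block to be processed, of the image frame in which the corresponding sub-block is located is obtained by parsing the code stream.
- The method according to any one of claims 2 to 14, wherein the condition that the motion vector of the spatial neighboring block is unavailable comprises one or a combination of the following: the spatial neighboring block has not been encoded/decoded; or, the spatial neighboring block uses intra prediction or intra block copy mode; or, the spatial neighboring block does not exist; or, the spatial neighboring block and the block to be processed are located in different coding regions.
- The method according to claim 15, wherein the coding region comprises: a picture, a slice, a tile, or a tile group.
- The method according to any one of claims 1 to 16, wherein, before determining the motion vector of the sub-block of the block to be processed, the method further comprises: determining whether the motion vector corresponding to a preset in-block position of the corresponding sub-block is available; and correspondingly, determining the motion vector of the sub-block of the block to be processed comprises: when the motion vector corresponding to the preset in-block position is available, obtaining the motion vector of the sub-block of the block to be processed according to the motion vector corresponding to the preset in-block position; and when the motion vector corresponding to the preset in-block position is unavailable, obtaining the motion vector of the sub-block of the block to be processed according to the first preset motion vector.
- The method according to claim 17, wherein the preset in-block position is the geometric center position of the corresponding sub-block.
- The method according to claim 17 or 18, wherein, when the prediction unit in which the preset in-block position is located uses intra prediction or intra block copy mode, the motion vector corresponding to the preset in-block position is unavailable; and when the prediction unit in which the preset in-block position is located uses inter prediction, the motion vector corresponding to the preset in-block position is available.
- The method according to any one of claims 17 to 19, wherein obtaining the motion vector of the sub-block of the block to be processed according to the first preset motion vector comprises: using the first preset motion vector as the motion vector of the sub-block of the block to be processed.
- The method according to any one of claims 1 to 20, wherein the first preset motion vector is a zero motion vector.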
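- The method according to any one of claims 17 to 21, wherein the motion vector of the sub-block comprises a first-direction sub-block motion vector based on a first reference frame list and/or a second-direction sub-block motion vector based on a second reference frame list, and when the motion vector corresponding to the preset in-block position is unavailable, obtaining the motion vector of the sub-block of the block to be processed according to the first preset motion vector comprises: determining that the sub-block of the block to be processed uses unidirectional prediction based on the first-direction sub-block motion vector, and obtaining the first-direction sub-block motion vector of the sub-block of the block to be processed according to the first preset motion vector; or determining that the sub-block of the block to be processed uses unidirectional prediction based on the second-direction sub-block motion vector, and obtaining the second-direction sub-block motion vector of the sub-block of the block to be processed according to the first preset motion vector.
- The method according to claim 22, wherein, when the motion vector corresponding to the preset in-block position is unavailable, obtaining the motion vector of the sub-block of the block to be processed according to the first preset motion vector comprises: when the prediction type of the coding region in which the block to be processed is located is B-type prediction, determining that the sub-block of the block to be processed uses bidirectional prediction, and obtaining the first-direction sub-block motion vector and the second-direction sub-block motion vector of the sub-block of the block to be processed respectively according to the first preset motion vector; and when the prediction type of the coding region in which the block to be processed is located is P-type prediction, determining that the sub-block of the block to be processed uses unidirectional prediction, and obtaining the first-direction sub-block motion vector of the sub-block of the block to be processed according to the first preset motion vector.
- The method according to any one of claims 17 to 23, wherein obtaining the motion vector of the sub-block of the block to be processed according to the motion vector corresponding to the preset in-block position comprises: scaling the motion vector corresponding to the preset in-block position based on the ratio of a first time-domain distance difference to a second time-domain distance difference, to obtain the motion vector of the sub-block of the block to be processed, wherein the first time-domain distance difference is the picture order count difference between the image frame in which the block to be processed is located and the reference frame of the block to be processed, and the second time-domain distance difference is the picture order count difference between the image frame in which the corresponding sub-block is located and the reference frame of the corresponding sub-block.
- The method according to claim 24, wherein the index of the reference frame of the block to be processed in the reference frame list of the block to be processed is obtained by parsing the code stream.
- The method according to claim 24 or 25, wherein the index of the reference frame of the block to be processed in the reference frame list of the block to be processed is 0.
- The method according to any one of claims 1 to 26, further comprising: performing motion compensation on the sub-block of the block to be processed based on the motion vector of the sub-block of the block to be processed and the reference frame of the block to be processed, to obtain a predicted value of the sub-block of the block to be processed.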
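- An inter prediction apparatus, wherein a block to be processed comprises one or more sub-blocks, the apparatus comprising: an offset obtaining module, configured to determine a time-domain offset vector of the block to be processed according to a spatial neighboring block of the block to be processed, wherein the time-domain offset vector is used to determine a corresponding sub-block of a sub-block of the block to be processed; and a motion vector obtaining module, configured to determine a motion vector of the sub-block of the block to be processed according to a motion vector of the corresponding sub-block, wherein, when the motion vector of the corresponding sub-block is unavailable, the motion vector of the sub-block of the block to be processed is obtained according to a first preset motion vector.
- The apparatus according to claim 28, wherein the offset obtaining module is specifically configured to: check, in a preset order, whether the motion vectors of spatial neighboring blocks at a plurality of first preset positions are available, until the motion vector of the first spatial neighboring block in the preset order whose motion vector is available is obtained; and use the motion vector of the first spatial neighboring block in the preset order whose motion vector is available as the time-domain offset vector.
- The apparatus according to claim 29, wherein the offset obtaining module is specifically configured to: when none of the motion vectors of the spatial neighboring blocks at the plurality of first preset positions is available, use a second preset motion vector as the time-domain offset vector.
- The apparatus according to claim 30, wherein the second preset motion vector is a zero motion vector.
- The apparatus according to claim 28, wherein the offset obtaining module is specifically configured to: obtain a motion vector and a reference frame of a spatial neighboring block at a second preset position, wherein the motion vector of the spatial neighboring block at the second preset position is available; and use the motion vector of the spatial neighboring block at the second preset position as the time-domain offset vector.
- The apparatus according to claim 32, wherein the offset obtaining module is specifically configured to: when the motion vector of the spatial neighboring block at the second preset position is unavailable, use a third preset motion vector as the time-domain offset vector.
- The apparatus according to claim 33, wherein the third preset motion vector is a zero motion vector.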
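- The apparatus according to any one of claims 32 to 34, wherein the motion vector of the spatial neighboring block at the second preset position comprises a first-direction motion vector based on the first reference frame list, the reference frame of the spatial neighboring block at the second preset position comprises a first-direction reference frame corresponding to the first-direction motion vector, and the offset obtaining module is specifically configured to: when the first-direction reference frame is the same as the image frame in which the corresponding sub-block is located, use the first-direction motion vector as the time-domain offset vector.
- The apparatus according to claim 35, wherein, when the first-direction reference frame differs from the image frame in which the corresponding sub-block is located, the offset obtaining module is specifically configured to: use the third preset motion vector as the time-domain offset vector.
- The apparatus according to claim 35, wherein, when the spatial neighboring block at the second preset position uses bidirectional prediction, the motion vector of the spatial neighboring block at the second preset position further comprises a second-direction motion vector based on the second reference frame list, the reference frame of the spatial neighboring block at the second preset position comprises a second-direction reference frame corresponding to the second-direction motion vector, and when the first-direction reference frame differs from the image frame in which the time-domain corresponding block of the block to be processed is located, the offset obtaining module is specifically configured to: when the second-direction reference frame is the same as the image frame in which the corresponding sub-block is located, use the second-direction motion vector as the time-domain offset vector; and when the second-direction reference frame differs from the image frame in which the corresponding sub-block is located, use the third preset motion vector as the time-domain offset vector.
- The apparatus according to any one of claims 32 to 34, wherein, when the spatial neighboring block at the second preset position uses bidirectional prediction, the motion vector of the spatial neighboring block at the second preset position comprises a first-direction motion vector based on the first reference frame list and a second-direction motion vector based on the second reference frame list, the reference frame of the spatial neighboring block at the second preset position comprises a first-direction reference frame corresponding to the first-direction motion vector and a second-direction reference frame corresponding to the second-direction motion vector, and the offset obtaining module is specifically configured to: when the image frame in which the corresponding sub-block is located is obtained from the second reference frame list: when the second-direction reference frame is the same as the image frame in which the corresponding sub-block is located, use the second-direction motion vector as the time-domain offset vector; and when the second-direction reference frame differs from the image frame in which the corresponding sub-block is located and the first-direction reference frame is the same as the image frame in which the corresponding sub-block is located, use the first-direction motion vector as the time-domain offset vector; and when the image frame in which the corresponding sub-block is located is obtained from the first reference frame list: when the first-direction reference frame is the same as the image frame in which the corresponding sub-block is located, use the first-direction motion vector as the time-domain offset vector; and when the first-direction reference frame differs from the image frame in which the corresponding sub-block is located and the second-direction reference frame is the same as the image frame in which the corresponding sub-block is located, use the second-direction motion vector as the time-domain offset vector.
- The apparatus according to claim 38, wherein the offset obtaining module is specifically configured to: when the image frame in which the corresponding sub-block is located is obtained from the second reference frame list and the display order of all reference frames in the reference frame list of the block to be processed is before the image frame in which the block to be processed is located: when the second-direction reference frame is the same as the image frame in which the corresponding sub-block is located, use the second-direction motion vector as the time-domain offset vector; and when the second-direction reference frame differs from the image frame in which the corresponding sub-block is located and the first-direction reference frame is the same as the image frame in which the corresponding sub-block is located, use the first-direction motion vector as the time-domain offset vector; and when the image frame in which the corresponding sub-block is located is obtained from the first reference frame list or the display order of at least one reference frame in the reference frame list of the block to be processed is after the image frame in which the block to be processed is located: when the first-direction reference frame is the same as the image frame in which the corresponding sub-block is located, use the first-direction motion vector as the time-domain offset vector; and when the first-direction reference frame differs from the image frame in which the corresponding sub-block is located and the second-direction reference frame is the same as the image frame in which the corresponding sub-block is located, use the second-direction motion vector as the time-domain offset vector.
- The apparatus according to claim 38 or 39, wherein, when the second-direction reference frame differs from the image frame in which the corresponding sub-block is located and the first-direction reference frame differs from the image frame in which the corresponding sub-block is located, the offset obtaining module is specifically configured to: use the third preset motion vector as the time-domain offset vector.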
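- The apparatus according to any one of claims 37 to 40, wherein the index, in the reference frame list of the spatial neighboring block of the block to be processed, of the image frame in which the corresponding sub-block is located is obtained by parsing the code stream.
- The apparatus according to any one of claims 29 to 41, wherein the condition that the motion vector of the spatial neighboring block is unavailable comprises one or a combination of the following: the spatial neighboring block has not been encoded/decoded; or, the spatial neighboring block uses intra prediction or intra block copy mode; or, the spatial neighboring block does not exist; or, the spatial neighboring block and the block to be processed are located in different coding regions.
- The apparatus according to claim 42, wherein the coding region comprises: a picture, a slice, a tile, or a tile group.
- The apparatus according to any one of claims 28 to 43, further comprising: a judging module, configured to determine whether the motion vector corresponding to a preset in-block position of the corresponding sub-block is available; and correspondingly, the motion vector obtaining module is specifically configured to: when the motion vector corresponding to the preset in-block position is available, obtain the motion vector of the sub-block of the block to be processed according to the motion vector corresponding to the preset in-block position; and when the motion vector corresponding to the preset in-block position is unavailable, obtain the motion vector of the sub-block of the block to be processed according to the first preset motion vector.
- The apparatus according to claim 44, wherein the preset in-block position is the geometric center position of the corresponding sub-block.
- The apparatus according to claim 44 or 45, wherein, when the prediction unit in which the preset in-block position is located uses intra prediction or intra block copy mode, the motion vector corresponding to the preset in-block position is unavailable; and when the prediction unit in which the preset in-block position is located uses inter prediction, the motion vector corresponding to the preset in-block position is available.
- The apparatus according to any one of claims 44 to 46, wherein the motion vector obtaining module is specifically configured to: use the first preset motion vector as the motion vector of the sub-block of the block to be processed.
- The apparatus according to any one of claims 28 to 47, wherein the first preset motion vector is a zero motion vector.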
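- The apparatus according to any one of claims 44 to 48, wherein the motion vector of the sub-block comprises a first-direction sub-block motion vector based on a first reference frame list and/or a second-direction sub-block motion vector based on a second reference frame list, and when the motion vector corresponding to the preset intra-block position is unavailable, the motion vector obtaining module is specifically configured to: determine that the sub-block of the to-be-processed block uses unidirectional prediction based on the first-direction sub-block motion vector, and obtain the first-direction sub-block motion vector of the sub-block of the to-be-processed block based on the first preset motion vector; or determine that the sub-block of the to-be-processed block uses unidirectional prediction based on the second-direction sub-block motion vector, and obtain the second-direction sub-block motion vector of the sub-block of the to-be-processed block based on the first preset motion vector.
- The apparatus according to claim 49, wherein, when the motion vector corresponding to the preset intra-block position is unavailable, the motion vector obtaining module is specifically configured to: when the prediction type of the coding region in which the to-be-processed block is located is B-type prediction, determine that the sub-block of the to-be-processed block uses bidirectional prediction, and obtain, based on the first preset motion vector, both the first-direction sub-block motion vector and the second-direction sub-block motion vector of the sub-block of the to-be-processed block; and when the prediction type of the coding region in which the to-be-processed block is located is P-type prediction, determine that the sub-block of the to-be-processed block uses unidirectional prediction, and obtain the first-direction sub-block motion vector of the sub-block of the to-be-processed block based on the first preset motion vector.
- The apparatus according to any one of claims 44 to 50, wherein the motion vector obtaining module is specifically configured to: scale the motion vector corresponding to the preset intra-block position based on the ratio of a first temporal distance difference to a second temporal distance difference, to obtain the motion vector of the sub-block of the to-be-processed block, wherein the first temporal distance difference is the picture order count (POC) difference between the picture frame in which the to-be-processed block is located and the reference frame of the to-be-processed block, and the second temporal distance difference is the POC difference between the picture frame in which the corresponding sub-block is located and the reference frame of the corresponding sub-block.
- The apparatus according to claim 51, wherein an index of the reference frame of the to-be-processed block in the reference frame list of the to-be-processed block is obtained by parsing the bitstream.
- The apparatus according to claim 51 or 52, wherein the index of the reference frame of the to-be-processed block in the reference frame list of the to-be-processed block is 0.
- The apparatus according to any one of claims 28 to 53, further comprising: a motion compensation module, configured to perform motion compensation on the sub-block of the to-be-processed block based on the motion vector of the sub-block of the to-be-processed block and the reference frame of the to-be-processed block, to obtain a prediction value of the sub-block of the to-be-processed block.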
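- A video encoder, wherein the video encoder is configured to encode a picture block and comprises: the inter prediction apparatus according to any one of claims 28 to 54, wherein the inter prediction apparatus is configured to predict motion information of a currently encoded picture block based on target candidate motion information, and to determine predicted pixel values of the currently encoded picture block based on the motion information of the currently encoded picture block; an entropy encoding module, configured to encode an index identifier of the target candidate motion information into a bitstream, the index identifier indicating the target candidate motion information used for the currently encoded picture block; and a reconstruction module, configured to reconstruct the currently encoded picture block based on the predicted pixel values.
- A video decoder, wherein the video decoder is configured to decode a picture block from a bitstream and comprises: an entropy decoding module, configured to decode an index identifier from the bitstream, the index identifier indicating target candidate motion information of a currently decoded picture block; the inter prediction apparatus according to any one of claims 28 to 54, wherein the inter prediction apparatus is configured to predict motion information of the currently decoded picture block based on the target candidate motion information indicated by the index identifier, and to determine predicted pixel values of the currently decoded picture block based on the motion information of the currently decoded picture block; and a reconstruction module, configured to reconstruct the currently decoded picture block based on the predicted pixel values.
- A video coding device, comprising: a non-volatile memory and a processor that are coupled to each other, wherein the processor invokes program code stored in the memory to perform the method according to any one of claims 1 to 27.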
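- An inter prediction method, wherein a to-be-processed block comprises one or more sub-blocks, and the method comprises: obtaining a spatial neighboring block of the to-be-processed block; and obtaining a temporal offset vector based on the spatial neighboring block, the temporal offset vector being used to determine a corresponding sub-block of a sub-block of the to-be-processed block, wherein, when the spatial neighboring block has a first-direction reference frame located in a first reference frame list and the picture frame in which the corresponding sub-block is located is the same as the first-direction reference frame, the temporal offset vector is a first-direction motion vector of the spatial neighboring block, the first-direction motion vector corresponding to the first-direction reference frame.
- The method according to claim 58, wherein, when the spatial neighboring block does not have a first-direction reference frame located in the first reference frame list, or the picture frame in which the corresponding sub-block is located differs from the first-direction reference frame, the method further comprises: when the spatial neighboring block has a second-direction reference frame located in a second reference frame list and the picture frame in which the corresponding sub-block is located is the same as the second-direction reference frame, the temporal offset vector is a second-direction motion vector of the spatial neighboring block, the second-direction motion vector corresponding to the second-direction reference frame.
- The method according to claim 58 or 59, wherein obtaining the spatial neighboring block of the to-be-processed block comprises: checking whether the spatial neighboring block is available; and when the spatial neighboring block is available, obtaining the spatial neighboring block.
- The method according to any one of claims 58 to 60, wherein the picture frame in which the corresponding sub-block is located being the same as the first-direction reference frame comprises: the POC of the picture frame in which the corresponding sub-block is located is the same as the POC of the first-direction reference frame.
- The method according to any one of claims 59 to 61, wherein the picture frame in which the corresponding sub-block is located being the same as the second-direction reference frame comprises: the POC of the picture frame in which the corresponding sub-block is located is the same as the POC of the second-direction reference frame.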
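- The method according to any one of claims 58 to 62, further comprising: parsing a bitstream to obtain index information of the picture frame in which the corresponding sub-block is located.
- The method according to any one of claims 58 to 62, further comprising: using a picture frame that has a preset relationship with the to-be-processed block as the picture frame in which the corresponding sub-block is located.
- The method according to claim 64, wherein the preset relationship comprises: the picture frame in which the corresponding sub-block is located is adjacent, in decoding order, to the picture frame in which the to-be-processed block is located, and is decoded earlier than the picture frame in which the to-be-processed block is located.
- The method according to claim 64, wherein the preset relationship comprises: the picture frame in which the corresponding sub-block is located is the reference frame whose reference frame index is 0 in the first-direction reference frame list or the second-direction reference frame list of the to-be-processed block.
- The method according to any one of claims 58 to 66, wherein, when the spatial neighboring block does not have a second-direction reference frame located in the second reference frame list, or the picture frame in which the corresponding sub-block is located differs from the second-direction reference frame, the method further comprises: using a zero motion vector as the temporal offset vector.
- A video coding device, comprising: a non-volatile memory and a processor that are coupled to each other, wherein the processor invokes program code stored in the memory to perform the method according to any one of claims 58 to 67.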
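
For illustration only, the following Python sketch shows one way to read the offset-vector selection in claims 58, 59, 61, 62 and 67, together with the availability fallback and POC-ratio scaling in claims 44, 48 and 51. It is not the applicant's implementation. The names `Neighbor`, `temporal_offset_vector`, `scale_mv` and `sub_block_mv` are hypothetical. Three behaviours are assumptions: an unavailable neighboring block falls back to the zero vector, scaling uses plain rounding rather than the fixed-point arithmetic of a real codec, and a zero POC distance leaves the vector unscaled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MV = Tuple[int, int]          # (horizontal, vertical) motion vector components
ZERO_MV: MV = (0, 0)          # the zero motion vector used as the preset fallback


@dataclass
class Neighbor:
    """Hypothetical container for a spatial neighboring block's motion data."""
    mv_l0: Optional[MV] = None         # first-direction MV (first reference frame list)
    ref_poc_l0: Optional[int] = None   # POC of the first-direction reference frame
    mv_l1: Optional[MV] = None         # second-direction MV (second reference frame list)
    ref_poc_l1: Optional[int] = None   # POC of the second-direction reference frame


def temporal_offset_vector(nb: Optional[Neighbor], col_poc: int) -> MV:
    """Select the temporal offset vector.

    col_poc is the POC of the picture frame in which the corresponding
    sub-block is located; frames are compared by POC (claims 61 and 62).
    """
    if nb is None:
        # Assumption: an unavailable neighbor yields the zero vector.
        return ZERO_MV
    # Claim 58: first-direction reference frame matches the collocated frame.
    if nb.mv_l0 is not None and nb.ref_poc_l0 == col_poc:
        return nb.mv_l0
    # Claim 59: otherwise try the second-direction reference frame.
    if nb.mv_l1 is not None and nb.ref_poc_l1 == col_poc:
        return nb.mv_l1
    # Claim 67: neither direction matches, so use the zero motion vector.
    return ZERO_MV


def scale_mv(mv: MV, cur_poc: int, cur_ref_poc: int,
             col_poc: int, col_ref_poc: int) -> MV:
    """Scale mv by the ratio of the two POC differences (claim 51)."""
    tb = cur_poc - cur_ref_poc      # first temporal distance difference
    td = col_poc - col_ref_poc      # second temporal distance difference
    if td == 0 or tb == td:
        # Assumption: no scaling when the ratio is undefined or equal to 1.
        return mv
    # Simplification: floating-point ratio with Python's built-in rounding.
    return (round(mv[0] * tb / td), round(mv[1] * tb / td))


def sub_block_mv(col_mv: Optional[MV], cur_poc: int, cur_ref_poc: int,
                 col_poc: int, col_ref_poc: int) -> MV:
    """Derive a sub-block MV from the MV at the preset intra-block position.

    col_mv is None when that motion vector is unavailable, for example
    because the position is intra coded (claims 44, 46 and 48).
    """
    if col_mv is None:
        return ZERO_MV
    return scale_mv(col_mv, cur_poc, cur_ref_poc, col_poc, col_ref_poc)
```

With `Neighbor(mv_l0=(3, -1), ref_poc_l0=8, mv_l1=(5, 2), ref_poc_l1=16)`, calling `temporal_offset_vector(nb, 16)` skips the first-direction vector, whose reference frame has POC 8, and returns the second-direction vector `(5, 2)`.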
Priority Applications (17)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201980015456.6A CN112243586A (zh) | 2019-05-17 | 2019-08-15 | 一种帧间预测的方法和装置 |
| KR1020257017410A KR20250078641A (ko) | 2019-05-17 | 2019-08-15 | 인터 예측 방법 및 장치 |
| EP19929484.4A EP3955576B1 (en) | 2019-05-17 | 2019-08-15 | Inter-frame prediction method and device |
| PL19929484.4T PL3955576T3 (pl) | 2019-05-17 | 2019-08-15 | Sposób i urządzenie do predykcji międzyobrazowej |
| EP24201023.9A EP4518316A3 (en) | 2019-05-17 | 2019-08-15 | Inter prediction method and apparatus |
| BR112021023118A BR112021023118A2 (pt) | 2019-05-17 | 2019-08-15 | Método e aparelho de interpredição |
| KR1020217038708A KR102814842B1 (ko) | 2019-05-17 | 2019-08-15 | 인터 예측 방법 및 장치 |
| ES19929484T ES3010207T3 (en) | 2019-05-17 | 2019-08-15 | Inter-frame prediction method and device |
| MX2021013316A MX2021013316A (es) | 2019-05-17 | 2019-08-15 | Metodo y aparato de inter prediccion. |
| JP2021568601A JP7318007B2 (ja) | 2019-05-17 | 2019-08-15 | インター予測方法および装置 |
| DK19929484.4T DK3955576T3 (da) | 2019-05-17 | 2019-08-15 | Fremgangsmåde og anordning til interframe-forudsigelse |
| FIEP19929484.4T FI3955576T3 (fi) | 2019-05-17 | 2019-08-15 | Kehysten välinen ennustusmenetelmä ja -laite |
| MX2025003391A MX2025003391A (es) | 2019-05-17 | 2021-10-29 | Metodo y aparato de inter prediccion |
| US17/529,106 US12108046B2 (en) | 2019-05-17 | 2021-11-17 | Inter prediction method and apparatus |
| JP2023117744A JP7547574B2 (ja) | 2019-05-17 | 2023-07-19 | インター予測方法および装置 |
| US18/882,668 US12519948B2 (en) | 2019-05-17 | 2024-09-11 | Inter prediction method and apparatus |
| US19/407,477 US20260106992A1 (en) | 2019-05-17 | 2025-12-03 | Inter prediction method and apparatus |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910414914.5 | 2019-05-17 | | |
| CN201910414914.5A CN111953995A (zh) | 2019-05-17 | 2019-05-17 | 一种帧间预测的方法和装置 |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/529,106 Continuation US12108046B2 (en) | 2019-05-17 | 2021-11-17 | Inter prediction method and apparatus |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2020232845A1 (zh) | 2020-11-26 |
Family
ID=73336807
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2019/100751 Ceased WO2020232845A1 (zh) | 2019-05-17 | 2019-08-15 | 一种帧间预测的方法和装置 |
Country Status (14)
| Country | Link |
|---|---|
| US (3) | US12108046B2 (zh) |
| EP (2) | EP3955576B1 (zh) |
| JP (2) | JP7318007B2 (zh) |
| KR (2) | KR102814842B1 (zh) |
| CN (2) | CN111953995A (zh) |
| BR (1) | BR112021023118A2 (zh) |
| DK (1) | DK3955576T3 (zh) |
| ES (1) | ES3010207T3 (zh) |
| FI (1) | FI3955576T3 (zh) |
| HU (1) | HUE069769T2 (zh) |
| MX (2) | MX2021013316A (zh) |
| PL (1) | PL3955576T3 (zh) |
| PT (1) | PT3955576T (zh) |
| WO (1) | WO2020232845A1 (zh) |
Families Citing this family (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN120751158A (zh) * | 2019-06-27 | 2025-10-03 | 三星电子株式会社 | 视频编码方法和视频解码方法 |
| US20230059035A1 (en) * | 2021-08-23 | 2023-02-23 | Netflix, Inc. | Efficient encoding of film grain noise |
| US12238294B2 (en) * | 2022-04-18 | 2025-02-25 | Tencent America LLC | Sub-block based temporal motion vector predictor with an motion vector offset |
| US12425602B2 (en) * | 2022-05-25 | 2025-09-23 | Tencent America LLC | Subblock level temporal motion vector prediction with multiple displacement vector predictors and an offset |
| WO2024005456A1 (ko) * | 2022-06-27 | 2024-01-04 | 현대자동차주식회사 | 영상 부호화/복호화 방법, 장치 및 비트스트림을 저장한 기록 매체 |
| CN119654864A (zh) | 2022-08-18 | 2025-03-18 | 三星电子株式会社 | 使用ai的图像解码设备和图像编码设备以及利用该设备的方法 |
| WO2024151335A1 (en) * | 2023-01-09 | 2024-07-18 | Tencent America LLC | Subblock based motion vector predictor with mv offset in amvp mode |
| US12464133B2 (en) | 2023-02-02 | 2025-11-04 | Tencent America LLC | Smooth sub-block motion vector prediction |
| US12563217B2 (en) * | 2023-04-14 | 2026-02-24 | Tencent America LLC | Systems and methods for candidate list construction |
| JP7853591B2 (ja) * | 2023-09-21 | 2026-04-30 | サミー株式会社 | ぱちんこ遊技機 |
| CN118337962B (zh) * | 2024-06-12 | 2024-09-03 | 湖南中泓汇智智能科技有限公司 | 一种用于超视距远程驾驶平台的5g网络数据传输方法 |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102917197A (zh) * | 2011-08-04 | 2013-02-06 | 想象技术有限公司 | 运动估计系统中的外部矢量 |
| WO2017138352A1 (en) * | 2016-02-08 | 2017-08-17 | Sharp Kabushiki Kaisha | Systems and methods for transform coefficient coding |
Family Cites Families (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| TW201436531A (zh) * | 2012-09-28 | 2014-09-16 | Vid Scale Inc | 多層視訊編碼適應性升取樣 |
| US11477477B2 (en) | 2015-01-26 | 2022-10-18 | Qualcomm Incorporated | Sub-prediction unit based advanced temporal motion vector prediction |
| US20160337662A1 (en) | 2015-05-11 | 2016-11-17 | Qualcomm Incorporated | Storage and signaling resolutions of motion vectors |
| CN116567220A (zh) * | 2016-08-11 | 2023-08-08 | Lx 半导体科技有限公司 | 图像编码/解码设备和图像数据的发送设备 |
| CN109587479B (zh) | 2017-09-29 | 2023-11-10 | 华为技术有限公司 | 视频图像的帧间预测方法、装置及编解码器 |
| CN119922329A (zh) | 2017-11-01 | 2025-05-02 | 交互数字Vc控股公司 | 用于合并模式的解码器侧运动矢量细化和子块运动导出 |
- 2019
  - 2019-05-17 CN CN201910414914.5A patent/CN111953995A/zh active Pending
  - 2019-08-15 KR KR1020217038708A patent/KR102814842B1/ko active Active
  - 2019-08-15 MX MX2021013316A patent/MX2021013316A/es unknown
  - 2019-08-15 FI FIEP19929484.4T patent/FI3955576T3/fi active
  - 2019-08-15 HU HUE19929484A patent/HUE069769T2/hu unknown
  - 2019-08-15 ES ES19929484T patent/ES3010207T3/es active Active
  - 2019-08-15 DK DK19929484.4T patent/DK3955576T3/da active
  - 2019-08-15 CN CN201980015456.6A patent/CN112243586A/zh active Pending
  - 2019-08-15 EP EP19929484.4A patent/EP3955576B1/en active Active
  - 2019-08-15 BR BR112021023118A patent/BR112021023118A2/pt unknown
  - 2019-08-15 EP EP24201023.9A patent/EP4518316A3/en active Pending
  - 2019-08-15 WO PCT/CN2019/100751 patent/WO2020232845A1/zh not_active Ceased
  - 2019-08-15 KR KR1020257017410A patent/KR20250078641A/ko active Pending
  - 2019-08-15 JP JP2021568601A patent/JP7318007B2/ja active Active
  - 2019-08-15 PL PL19929484.4T patent/PL3955576T3/pl unknown
  - 2019-08-15 PT PT199294844T patent/PT3955576T/pt unknown
- 2021
  - 2021-10-29 MX MX2025003391A patent/MX2025003391A/es unknown
  - 2021-11-17 US US17/529,106 patent/US12108046B2/en active Active
- 2023
  - 2023-07-19 JP JP2023117744A patent/JP7547574B2/ja active Active
- 2024
  - 2024-09-11 US US18/882,668 patent/US12519948B2/en active Active
- 2025
  - 2025-12-03 US US19/407,477 patent/US20260106992A1/en active Pending
Non-Patent Citations (2)
| Title |
|---|
| LIN, JIANLIANG ET AL.: "Motion Vector Coding in the HEVC Standard", IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, vol. 7, no. 6, 31 December 2013 (2013-12-31), XP055238578, DOI: 20200109090229X * |
| SHINOBU, KUDO ET AL.: "Motion Vector Prediction Methods Considering Prediction Continuity in HEVC", 2016 PICTURE CODING SYMPOSIUM, 31 December 2016 (2016-12-31), XP033086849, DOI: 20200109090237X * |
Cited By (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2024508193A (ja) * | 2021-03-19 | 2024-02-22 | 杭州海康威視数字技術股▲フン▼有限公司 | 復号方法、符号化方法、装置、デバイスおよび記憶媒体 |
| JP7494403B2 (ja) | 2021-03-19 | 2024-06-03 | 杭州海康威視数字技術股▲フン▼有限公司 | 復号方法、符号化方法、装置、デバイスおよび記憶媒体 |
| JP2024098086A (ja) * | 2021-03-19 | 2024-07-19 | 杭州海康威視数字技術股▲フン▼有限公司 | 復号方法、符号化方法、装置、デバイスおよび記憶媒体 |
| US12063386B2 (en) | 2021-03-19 | 2024-08-13 | Hangzhou Hikvision Digital Technology Co., Ltd. | Methods, apparatuses, devices, and storage media for encoding or decoding |
| JP7655474B2 (ja) | 2021-03-19 | 2025-04-02 | 杭州海康威視数字技術股▲フン▼有限公司 | 復号方法、符号化方法、装置、デバイスおよび記憶媒体 |
| US12574543B2 (en) | 2021-03-19 | 2026-03-10 | Hangzhou Hikvision Digital Technology Co., Ltd. | Methods, apparatuses, devices, and storage media for encoding or decoding |
| JP2025506045A (ja) * | 2022-02-15 | 2025-03-05 | エルジー エレクトロニクス インコーポレイティド | 画像符号化/復号化方法、ビットストリームを伝送する方法、及びビットストリームを保存した記録媒体 |
| JP7741336B2 (ja) | 2022-02-15 | 2025-09-17 | エルジー エレクトロニクス インコーポレイティド | 画像符号化/復号化方法、ビットストリームを伝送する方法、及びビットストリームを保存した記録媒体 |
Also Published As
| Publication number | Publication date |
|---|---|
| CN111953995A (zh) | 2020-11-17 |
| JP7547574B2 (ja) | 2024-09-09 |
| DK3955576T3 (da) | 2025-02-10 |
| KR102814842B1 (ko) | 2025-05-28 |
| HUE069769T2 (hu) | 2025-04-28 |
| BR112021023118A2 (pt) | 2022-01-04 |
| US20250080748A1 (en) | 2025-03-06 |
| JP2022532670A (ja) | 2022-07-15 |
| US20260106992A1 (en) | 2026-04-16 |
| KR20250078641A (ko) | 2025-06-02 |
| US20220078441A1 (en) | 2022-03-10 |
| EP3955576A4 (en) | 2022-07-27 |
| PT3955576T (pt) | 2025-02-10 |
| EP4518316A2 (en) | 2025-03-05 |
| JP2023156315A (ja) | 2023-10-24 |
| MX2025003391A (es) | 2025-05-02 |
| PL3955576T3 (pl) | 2025-03-03 |
| ES3010207T3 (en) | 2025-04-01 |
| EP3955576A1 (en) | 2022-02-16 |
| US12519948B2 (en) | 2026-01-06 |
| CN112243586A (zh) | 2021-01-19 |
| MX2021013316A (es) | 2021-12-10 |
| EP4518316A3 (en) | 2025-05-28 |
| EP3955576B1 (en) | 2024-11-06 |
| US12108046B2 (en) | 2024-10-01 |
| FI3955576T3 (fi) | 2025-02-12 |
| JP7318007B2 (ja) | 2023-07-31 |
| KR20220003037A (ko) | 2022-01-07 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US12519948B2 (en) | Inter prediction method and apparatus | |
| CN113545040B (zh) | 用于多假设编码的加权预测方法及装置 | |
| CN115243039B (zh) | 一种视频图像预测方法及装置 | |
| CN111953997B (zh) | 候选运动矢量列表获取方法、装置及编解码器 | |
| CN111788833B (zh) | 帧间预测方法、装置以及相应的编码器和解码器 | |
| WO2020182194A1 (zh) | 帧间预测的方法及相关装置 | |
| US12531975B2 (en) | Picture prediction method and apparatus, and computer-readable storage medium | |
| CN111526362B (zh) | 帧间预测方法和装置 | |
| WO2020259567A1 (zh) | 视频编码器、视频解码器及相应方法 | |
| CN114270847B (zh) | 融合候选运动信息列表的构建方法、装置及编解码器 | |
| AU2020261145B2 (en) | Picture prediction method and apparatus, and computer-readable storage medium | |
| CN112153389B (zh) | 一种帧间预测的方法和装置 | |
| CN111432219B (zh) | 一种帧间预测方法及装置 | |
| CN113366850B (zh) | 视频编码器、视频解码器及相应方法 | |
| CN112135137B (zh) | 视频编码器、视频解码器及相应方法 | |
| WO2020155791A1 (zh) | 帧间预测方法和装置 | |
| CN113615191A (zh) | 图像显示顺序的确定方法、装置和视频编解码设备 | |
| WO2020187062A1 (zh) | 用于融合运动矢量差技术的优化方法、装置及编解码器 | |
| WO2020186882A1 (zh) | 基于三角预测单元模式的处理方法及装置 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 19929484; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2021568601; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 20217038708; Country of ref document: KR; Kind code of ref document: A |
| | REG | Reference to national code | Ref country code: BR; Ref legal event code: B01A; Ref document number: 112021023118; Country of ref document: BR |
| | ENP | Entry into the national phase | Ref document number: 2019929484; Country of ref document: EP; Effective date: 20211111 |
| | ENP | Entry into the national phase | Ref document number: 112021023118; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20211117 |
| | WWG | WIPO information: grant in national office | Ref document number: MX/A/2021/013316; Country of ref document: MX |
| | WWD | WIPO information: divisional of initial PCT application | Ref document number: 1020257017410; Country of ref document: KR |
| | WWP | WIPO information: published in national office | Ref document number: 1020257017410; Country of ref document: KR |





