WO2020181507A1 - Image processing method and apparatus - Google Patents

Image processing method and apparatus

Info

Publication number
WO2020181507A1
Authority
WO
WIPO (PCT)
Prior art keywords
image block
motion vector
sub
cpmv
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2019/077894
Other languages
English (en)
Chinese (zh)
Inventor
孟学苇
郑萧桢
王苫社
马思伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
SZ DJI Technology Co Ltd
Original Assignee
Peking University
SZ DJI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University, SZ DJI Technology Co Ltd
Priority to CN201980005232.7A (CN111247804B)
Priority to PCT/CN2019/077894 (WO2020181507A1)
Publication of WO2020181507A1
Anticipated expiration
Ceased

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/107 Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/13 Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/56 Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N19/82 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop

Definitions

  • This application relates to the field of image processing, and more specifically, to an image processing method and device.
  • inter-frame prediction uses the time-domain correlation between adjacent frames of the video: the previously coded reconstructed frame is used as a reference frame, and the current frame is predicted through motion estimation and motion compensation, thereby removing the temporal redundancy of the video.
  • the general process of inter-frame prediction includes Motion Estimation (ME) and Motion Compensation (MC).
  • for the current coding block of the current frame, the most similar block in the reference frame is searched for and used as the prediction block of the current block; the relative displacement between the current block and this similar block is the motion vector (MV).
  • the process of motion estimation is the process of obtaining a motion vector after searching and comparing the current coding block of the current frame in the reference frame.
  • Motion compensation is the process of obtaining prediction frames using MV and reference frames.
  • the predicted frame obtained by motion compensation may differ from the original current frame. Therefore, in addition to the MV and the reference frame information, the difference (residual) between the predicted frame and the current frame needs to be transmitted to the decoder after transformation, quantization, etc.
  • the decoder can reconstruct the current frame through the MV, the reference frame, and the difference between the predicted frame and the current frame.
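  • As a minimal sketch of the motion estimation, motion compensation, and residual steps described above, the following Python code performs a full search over a small window. The sum of absolute differences (SAD) cost, the search range, and all function names are assumptions made for illustration; the text does not prescribe a particular search strategy.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n current block at (bx, by)
    and the reference block displaced by (dx, dy)."""
    return sum(
        abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
        for j in range(n)
        for i in range(n)
    )


def motion_estimation(cur, ref, bx, by, n, search=2):
    """Full search: return the integer-pixel MV (dx, dy) with the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue  # skip candidates that fall outside the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv


def motion_compensation(ref, bx, by, mv, n):
    """Copy the reference block pointed to by the MV as the prediction block."""
    dx, dy = mv
    return [[ref[by + dy + j][bx + dx + i] for i in range(n)] for j in range(n)]


def residual(cur, pred, bx, by, n):
    """Difference between the current block and its prediction block."""
    return [[cur[by + j][bx + i] - pred[j][i] for i in range(n)] for j in range(n)]


# Toy data: the current frame is the reference frame shifted one pixel to the
# right, so the best match for a block lies one pixel to the left in the reference.
ref_frame = [[x * x + 10 * y for x in range(8)] for y in range(8)]
cur_frame = [[ref_frame[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]

mv = motion_estimation(cur_frame, ref_frame, bx=3, by=3, n=2)
pred = motion_compensation(ref_frame, 3, 3, mv, 2)
res = residual(cur_frame, pred, 3, 3, 2)
```

  • In this toy case the search finds a perfect match, so the residual is all zeros and only the MV would need to be transmitted; real content generally leaves a non-zero residual.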
  • the motion vector of the object between two adjacent frames may not be exactly an integer number of pixel units.
  • sub-pixel accuracy is proposed.
  • HEVC: High Efficiency Video Coding.
  • motion vectors with 1/4 pixel precision are used for motion estimation of luminance components.
  • there is no sample value at a fractional pixel position in digital video.
  • the values of these fractional pixels must be obtained approximately by interpolation; that is, K-fold interpolation is performed on the reference frame in the row direction and the column direction, and the prediction block is then searched for in the interpolated reference frame.
  • in the interpolation process, the pixels in the current block and the pixels in the adjacent area need to be used.
  • affine motion compensation prediction (Affine).
  • the affine motion field of the image block can be derived from the motion vector of two control points (four parameters) or three control points (six parameters).
  • the image processing unit of the Affine technology is a sub-CU (which can be referred to as a sub-block), and the size of the sub-CU is 4×4 (unit: pixel), which causes the Affine technology to generate greater bandwidth pressure.
  • the present application provides an image processing method and device, which can reduce the bandwidth pressure caused by the Affine prediction technology to a certain extent.
  • in a first aspect, an image processing method is provided, which includes: obtaining the control point motion vector (CPMV) of an image block; and obtaining a motion vector of a sub-image block in the image block according to the CPMV of the image block, the motion vector having integer pixel accuracy.
  • in a second aspect, an image processing device is provided, comprising: a first acquisition unit, configured to acquire the CPMV of an image block; and a second acquisition unit, configured to acquire, according to the CPMV of the image block acquired by the first acquisition unit, the motion vector of a sub-image block in the image block, the motion vector having integer pixel accuracy.
  • in a third aspect, an image processing device is provided. The device includes a memory and a processor; the memory is used to store instructions, the processor is used to execute the instructions stored in the memory, and execution of the instructions stored in the memory causes the processor to execute the method provided in the first aspect.
  • in a fourth aspect, a chip is provided. The chip includes a processing module and a communication interface; the processing module is configured to control the communication interface to communicate with the outside, and the processing module is also configured to implement the method provided in the first aspect.
  • a computer-readable storage medium is provided, on which a computer program is stored. When the computer program is executed by a computer, the computer implements the method in the first aspect or any possible implementation manner of the first aspect.
  • a computer program product containing instructions is provided, which when executed by a computer causes the computer to implement the method provided in the first aspect.
  • because the motion vector of the sub-image block has integer pixel accuracy, the motion compensation process of the sub-image block does not involve sub-pixels, which can reduce the bandwidth pressure generated by the Affine prediction technology to a certain extent.
  • Figure 1 is a schematic diagram of a video coding architecture.
  • Figure 2 is a schematic diagram of 1/4 pixel interpolation.
  • Figures 3(a) and 3(b) are schematic diagrams of the four-parameter Affine model and the six-parameter Affine model, respectively.
  • Figure 4 is a schematic diagram of the Affine motion vector field.
  • Fig. 5 is a comparison diagram of reference pixels required by the Affine mode and the HEVC mode in the prior art.
  • Fig. 6 is a schematic flowchart of an image processing method according to an embodiment of the present application.
  • Fig. 7 is another schematic flowchart of an image processing method according to an embodiment of the present application.
  • Fig. 8 is another schematic flowchart of the image processing method according to an embodiment of the present application.
  • Fig. 9 is a schematic flowchart of an image processing apparatus according to an embodiment of the present application.
  • Fig. 10 is another schematic flowchart of an image processing apparatus according to an embodiment of the present application.
  • the video coding framework mainly includes intra-frame prediction, inter-frame prediction, transformation, quantization, entropy coding, and loop filtering.
  • This application is mainly aimed at improving the inter-frame prediction (inter prediction) part.
  • inter-frame prediction uses the temporal correlation between adjacent frames of the video: a reconstructed frame is used as the reference frame, and the current frame is predicted through Motion Estimation (ME) and Motion Compensation (MC), so as to remove the temporal redundancy of the video.
  • the current frame mentioned in this document means, in the encoding scenario, the frame currently being encoded, and in the decoding scenario, the frame currently being decoded.
  • the reconstructed frame mentioned in this document means, in the encoding scenario, a previously encoded frame, and in the decoding scenario, a previously decoded frame.
  • the entire frame of image is not directly processed in the encoding process, and the entire frame of image is usually divided into image blocks for processing.
  • CTU: Coding Tree Unit.
  • the size of the CTU is 64×64 or 128×128 (unit: pixels).
  • the CTU can be further divided into square or rectangular Coding Units (CUs).
  • all image block sizes mentioned in this document are in pixels.
  • Motion estimation refers to the process of obtaining a motion vector after searching and comparing the current block of the current frame in the reference frame.
  • Motion compensation refers to the process of obtaining a prediction block using a reference block and a motion vector obtained by motion estimation.
  • the prediction block obtained by the inter-frame prediction process may be different from the original current block. Therefore, it is necessary to calculate the difference between the prediction block and the current block, and the difference may be called the residual. After performing transformation, quantization, entropy coding and other processing on the residual, the coded bit stream is obtained.
  • the bit stream and encoding mode information are stored or sent to the decoding end.
  • at the decoding end, after the entropy-coded bitstream is obtained, entropy decoding is first performed on the bitstream to obtain the corresponding residual; then the prediction block is obtained according to the decoded coding mode information such as the motion vector; finally, the value of each pixel in the current block is obtained from the residual and the prediction block, that is, the current block is reconstructed, and in the same way the current frame is reconstructed.
  • steps such as inverse quantization and inverse transformation may also be included.
  • Inverse quantization refers to the process opposite to the quantization process.
  • Inverse transformation refers to the process opposite to the transformation process.
  • Inter-frame prediction mainly includes forward prediction, backward prediction, bi-prediction and so on.
  • forward prediction is to use the previous reconstructed frame of the current frame (may be called the historical frame) to predict the current frame.
  • Backward prediction is to use frames after the current frame (may be called a future frame) to predict the current frame.
  • Bi-prediction may be bi-directional prediction, that is, both "historical frames” and "future frames” are used to predict the current frame.
  • Bi-prediction can also be prediction in two directions, for example, using two "historical frames” to predict the current frame, or using two "future frames” to predict the current frame.
  • the motion vector of the object between two adjacent frames may not be exactly an integer number of pixels. Therefore, the accuracy of motion estimation needs to be improved to the sub-pixel level (also called 1/K pixel accuracy). For example, in the HEVC standard, motion vectors with 1/4 pixel accuracy are used for motion estimation of the luminance component.
  • the process of 1/4 pixel interpolation is shown in Figure 2.
  • the 3 pixels on the left and the 4 pixels on the right outside the image block are used to generate the pixel values of the interpolation points.
  • a0,0 and d0,0 are 1/4 pixels
  • b0,0 and h0,0 are half pixels
  • c0,0 and n0,0 are 3/4 pixels.
  • the current block is a 2×2 block
  • A0,0 to A1,0 and A0,0 to A0,1 enclose the 2×2 block.
  • to calculate the interpolation points in this 2×2 block, some points outside the 2×2 block need to be used, including 3 on the left, 4 on the right, 3 on the top, and 4 on the bottom.
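  • To illustrate why 3 samples on one side and 4 on the other are needed, the following Python sketch interpolates one half-pixel sample along a row with an 8-tap filter. The tap values are the HEVC luma half-sample coefficients as commonly cited; they, the 8-bit clipping range, and the function name are assumptions that do not come from this text.

```python
# HEVC luma half-sample filter taps (assumed; the taps sum to 64).
HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def half_pel(row, i):
    """Interpolate the half-pixel sample between row[i] and row[i + 1].

    The filter covers row[i - 3] .. row[i + 4]: three integer samples to the
    left of row[i] and four to its right, so samples outside the block are read.
    """
    if i - 3 < 0 or i + 4 >= len(row):
        raise ValueError("not enough neighboring samples for the 8-tap filter")
    window = row[i - 3 : i + 5]
    acc = sum(c * s for c, s in zip(HALF_PEL_TAPS, window))
    value = (acc + 32) >> 6         # divide by 64 with rounding
    return min(max(value, 0), 255)  # clip to the 8-bit sample range


flat = half_pel([100] * 8, 3)                # constant signal stays constant
ramp = half_pel(list(range(0, 80, 10)), 3)   # midpoint between 30 and 40
```

  • For a two-dimensional block the same filter is applied along rows and columns, which is where the 3 extra rows on top and 4 on the bottom come from.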
  • Affine motion compensated prediction (hereinafter referred to as Affine).
  • Affine is an inter-frame prediction technology.
  • the motion field of the Affine mode can be derived from the motion vectors of two control points (four parameters) (as shown in Figure 3(a)) or three control points (six parameters) (as shown in Figure 3(b)).
  • the MV of a control point is called a control point motion vector (CPMV).
  • the processing unit of Affine is not a CU, but a sub-block (sub-CU) obtained by dividing the CU, and the size of each sub-CU is 4×4.
  • each sub-CU has one MV. Unlike an ordinary CU, an Affine-mode CU does not have only one MV: it has as many MVs as it has sub-CUs.
  • the MV of each sub-CU in a CU is derived by calculation from the CPMVs of the two or three control points shown in FIG. 3.
  • for the four-parameter model, the MV of the sub-CU at the (x, y) position is calculated by the following formula:
  • for the six-parameter model, the MV of the sub-CU at the (x, y) position is calculated by the following formula:
  • (mv0x, mv0y) is the MV of the upper left control point
  • (mv1x, mv1y) is the MV of the upper right control point
  • (mv2x, mv2y) is the MV of the lower left control point.
  • W in the above formula represents the width of the CU where the sub-CU is located
  • H represents the height of the CU where the sub-CU is located.
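  • The formula images did not survive in this text. For reference, the widely used four-parameter and six-parameter affine motion models, written with the symbols defined above, take the following form; that they correspond exactly to formula (1) and formula (2) of this application is an assumption.

```latex
% Four-parameter model (two control points)
\begin{aligned}
mv_x &= \frac{mv_{1x} - mv_{0x}}{W}\,x - \frac{mv_{1y} - mv_{0y}}{W}\,y + mv_{0x} \\
mv_y &= \frac{mv_{1y} - mv_{0y}}{W}\,x + \frac{mv_{1x} - mv_{0x}}{W}\,y + mv_{0y}
\end{aligned}

% Six-parameter model (three control points)
\begin{aligned}
mv_x &= \frac{mv_{1x} - mv_{0x}}{W}\,x + \frac{mv_{2x} - mv_{0x}}{H}\,y + mv_{0x} \\
mv_y &= \frac{mv_{1y} - mv_{0y}}{W}\,x + \frac{mv_{2y} - mv_{0y}}{H}\,y + mv_{0y}
\end{aligned}
```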
  • each square represents a sub-CU with a size of 4×4.
  • the MVs of all sub-CUs will be converted into 1/16 pixel precision representation, that is to say, the highest precision of the sub-CU MV is 1/16 pixel.
  • the prediction block of each sub-CU is obtained through the process of motion compensation.
  • the size of the sub-CU is 4×4 for both the chrominance component and the luminance component, and the motion vector of a 4×4 chrominance block is obtained by averaging the motion vectors of the corresponding four 4×4 luminance blocks.
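  • The derivation of sub-CU motion vectors from the CPMVs can be sketched in Python as follows. The sketch assumes the common four-parameter and six-parameter affine models, evaluates each 4×4 sub-block at its center, and uses floating-point arithmetic for readability; these choices and all names are illustrative assumptions rather than the procedure of this application.

```python
def affine_sub_block_mv(cpmvs, width, height, x, y):
    """Return the MV (mv_x, mv_y) at position (x, y) inside a width x height block.

    cpmvs holds two control-point MVs (upper left, upper right) for the
    four-parameter model, or three (plus lower left) for the six-parameter one.
    """
    (mv0x, mv0y), (mv1x, mv1y) = cpmvs[0], cpmvs[1]
    a = (mv1x - mv0x) / width  # change of mv_x per pixel in the x direction
    b = (mv1y - mv0y) / width  # change of mv_y per pixel in the x direction
    if len(cpmvs) == 2:
        # Four-parameter model: the y-direction gradients follow from a and b.
        c, d = -b, a
    else:
        mv2x, mv2y = cpmvs[2]
        c = (mv2x - mv0x) / height  # change of mv_x per pixel in the y direction
        d = (mv2y - mv0y) / height  # change of mv_y per pixel in the y direction
    return (a * x + c * y + mv0x, b * x + d * y + mv0y)


def sub_block_mv_field(cpmvs, width, height, sub=4):
    """MV of every sub x sub sub-block, sampled at the sub-block center
    (using the center as the sample position is an assumption)."""
    return [
        [
            affine_sub_block_mv(cpmvs, width, height, bx + sub / 2, by + sub / 2)
            for bx in range(0, width, sub)
        ]
        for by in range(0, height, sub)
    ]


# A 16x16 block whose control points all move by the same amount is a pure
# translation, so every one of its 16 sub-blocks gets that same MV.
field = sub_block_mv_field([(5, -3), (5, -3)], 16, 16)
```

  • When the control points differ, the resulting sub-block MVs are generally fractional even if the CPMVs themselves are integers, which is the situation the rest of this description addresses.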
  • CPMV information is written in the code stream, and there is no need to write the MV information of each sub-CU.
  • AMVR: Adaptive Motion Vector Resolution.
  • AMVR technology can make the CU have motion vectors with full pixel precision and sub-pixel precision.
  • the integer pixel accuracy can be, for example, 1-pixel accuracy, 2-pixel accuracy, or the like.
  • the sub-pixel accuracy can be, for example, 1/2 pixel accuracy, 1/4 pixel accuracy, 1/8 pixel accuracy, or 1/16 pixel accuracy.
  • the corresponding MV accuracy is adaptively decided at the encoding end, and the result of the decision is written into the code stream and passed to the decoding end.
  • the whole pixel accuracy or sub-pixel accuracy mentioned in Affine AMVR technology refers to the pixel accuracy of CPMV, not the pixel accuracy of sub-CU.
  • even when the motion estimation process of the CU is an integer-pixel process, the MV of the sub-CU obtained from formula (1) or formula (2) above may have 1/4 pixel accuracy or another sub-pixel accuracy.
  • the motion compensation process of the sub-CU therefore involves sub-pixels, and since the size of the sub-CU is 4×4, this causes the Affine prediction process to generate greater bandwidth pressure.
  • the simulation result is shown in Figure 5.
  • the box on the right represents the 4×4 bidirectional inter-frame prediction CU in the worst case (1/16 and 1/4 pixel precision MV) in the Affine mode of VVC.
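  • A rough way to quantify this bandwidth pressure is to count the reference samples fetched per predicted pixel. The Python sketch below assumes an 8-tap interpolation filter (7 extra rows and columns for a sub-pixel MV) and ignores caching and memory alignment; it is an illustrative estimate, not the simulation behind Figure 5.

```python
def reference_samples(width, height, sub_pixel, taps=8):
    """Reference samples fetched for one width x height prediction block.

    With a sub-pixel MV, an interpolation filter with `taps` taps needs
    taps - 1 extra rows and columns; with an integer-pixel MV the block is
    read directly.
    """
    extra = taps - 1 if sub_pixel else 0
    return (width + extra) * (height + extra)


def samples_per_pixel(width, height, sub_pixel, bi_prediction=False):
    """Reference samples fetched per predicted pixel."""
    refs = 2 if bi_prediction else 1
    return refs * reference_samples(width, height, sub_pixel) / (width * height)


sub_4x4 = samples_per_pixel(4, 4, sub_pixel=True)                         # 121 / 16
sub_4x4_bi = samples_per_pixel(4, 4, sub_pixel=True, bi_prediction=True)  # 242 / 16
int_4x4 = samples_per_pixel(4, 4, sub_pixel=False)                        # 16 / 16
sub_8x8 = samples_per_pixel(8, 8, sub_pixel=True)                         # 225 / 64
```

  • Under these assumptions a 4×4 block with a sub-pixel MV reads about 7.6 reference samples per predicted pixel, against 1 for an integer-pixel MV, and the overhead shrinks as the block grows.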
  • this application proposes an image processing method and device, which can reduce the bandwidth pressure generated by the Affine technology to a certain extent.
  • This application is suitable for the field of digital video coding technology, and is specifically used for the inter-frame prediction part of a video codec.
  • This application can be applied to codecs that comply with the international video coding standard H.264/HEVC and the Chinese AVS2 standard, as well as codecs that comply with the next-generation video coding standard VVC or AVS3.
  • This application can be applied to the inter-frame prediction part of a video codec, that is to say, the image processing method according to the embodiment of this application can be executed by an encoding device or a decoding device.
  • FIG. 6 is a schematic flowchart of an image processing method 600 provided by this application.
  • the method 600 includes the following steps.
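  • Step 610: the control point motion vector (CPMV) of the image block is obtained.
  • Step 620: according to the CPMV of the image block, the motion vector of the sub-image block in the image block is obtained, and the pixel accuracy of the motion vector of the sub-image block is made to be an integer pixel accuracy.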
  • the sub-image block mentioned in this application represents a processing unit of image processing or video processing.
  • the width and/or height of the sub-image block may be less than 8 pixels.
  • the size of the sub-image block is 4×4 (pixels).
  • the sub image block may be a block obtained by dividing the image block. It can be understood that if the size of the image block and the sub-image block are the same, the sub-image block can be regarded as the image block itself.
  • the sub-image block may be a square block, for example, a block with a size of 4×4 or 8×8, or a rectangular block, for example, a block with a size of 2×4 or 4×8.
  • the size of the image block mentioned in this application can be 16×16, 16×8, 16×4, 8×16, 4×8, 8×8, 8×4, 4×8 and other sizes.
  • the motion vector of the sub-image block as the processing unit has an integer pixel accuracy. Therefore, the motion compensation process of the sub-image block does not involve sub-pixels, which can reduce the bandwidth pressure generated by the video inter-frame prediction process.
  • the process of obtaining the motion vector of the sub-image block in the image block may include: calculating the motion vector of the sub-image block according to the motion vectors of the two or three control points of the image block, and making the pixel accuracy of the obtained motion vector of the sub-image block an integer pixel accuracy.
  • the motion vector of the sub-image block can be calculated according to formula (1) or formula (2) described above.
  • if the calculated motion vector already has integer pixel accuracy, this motion vector is the motion vector of the sub-image block to be obtained in this application.
  • an algorithm is used to calculate the motion vector of the sub-image block according to the CPMV of the image block.
  • the algorithm can ensure that the calculated pixel accuracy of the motion vector of the sub-image block is an integer pixel.
  • if the calculated pixel accuracy of the motion vector of the sub-image block is sub-pixel accuracy, for example, 1/4 pixel accuracy, 1/8 pixel accuracy, or 1/16 pixel accuracy, the calculated motion vector also needs to be processed to change it from sub-pixel accuracy to integer pixel accuracy.
  • step 620 includes the following steps 1) and 2).
  • the first motion vector of the sub-image block is calculated based on CPMV, and the pixel accuracy of the calculated first motion vector is sub-pixel.
  • step 2) the second motion vector is obtained according to the first motion vector of the sub-image block, so that the end point of the second motion vector is the whole pixel point closest to the end point of the first motion vector.
  • the closest whole pixel point may be the whole pixel point above, below, left or right of the end point of the first motion vector.
  • the following formula is used to calculate the second motion vector (MV2x, MV2y) of the sub-image block according to the first motion vector (MV1x, MV1y) of the sub-image block.
  • MV2x = ((MV1x + (1 << (shift - 1))) >> shift) << shift;
  • MV2y = ((MV1y + (1 << (shift - 1))) >> shift) << shift;
  • the value of shift is related to the storage accuracy of the motion vector in the coding software platform.
  • the storage accuracy of the motion vector is 1/16 accuracy, and the value of shift can be set to 4.
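  • The rounding of the first motion vector to the nearest integer pixel can be sketched in Python as follows. The sketch assumes motion vectors stored as integers in units of 1/16 pixel (shift = 4), as in the example above; the function names are illustrative.

```python
def round_to_integer_pel(mv, shift=4):
    """Round one MV component, stored in units of 1 / 2**shift pixel, to the
    nearest integer-pixel position (the result stays in the same units)."""
    return ((mv + (1 << (shift - 1))) >> shift) << shift


def round_mv(mv1, shift=4):
    """Derive the second motion vector (MV2x, MV2y) from the first (MV1x, MV1y)."""
    mv1x, mv1y = mv1
    return (round_to_integer_pel(mv1x, shift), round_to_integer_pel(mv1y, shift))


# 23/16 = 1.4375 pixels rounds to 1 pixel (16); -25/16 = -1.5625 pixels
# rounds to -2 pixels (-32).
mv2 = round_mv((23, -25))
```

  • Python's `>>` on negative integers rounds toward negative infinity, so exact half-pixel values are rounded upward (8 becomes 16, and -8 becomes 0). A C implementation relies on arithmetic right shift of signed values to get the same result.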
  • the following formula is used to obtain the second motion vector (MV2x, MV2y) of the sub-image block according to the first motion vector (MV1x, MV1y) of the sub-image block.
  • this application does not limit the manner in which the pixel accuracy of the motion vector is converted from the sub-pixel level to the entire pixel level.
  • the pixel accuracy of the CPMV of the image block may be a whole pixel or a sub-pixel.
  • if the pixel accuracy of the CPMV of the image block is sub-pixel, the pixel accuracy of the motion vector of the sub-image block calculated according to the CPMV of the image block is also sub-pixel; if the pixel accuracy of the CPMV of the image block is integer pixel, the pixel accuracy of the motion vector of the sub-image block calculated according to the CPMV of the image block may still be sub-pixel.
  • the pixel accuracy of the motion vector of the sub-image block calculated according to formula (1) or formula (2) may be sub-pixel.
  • the pixel accuracy of the motion vector of the sub-image block, that is, of the processing unit, may be sub-pixel, which causes the motion compensation process to involve sub-pixels and increases the bandwidth pressure of the Affine technology.
  • because the motion vector of the sub-image block has integer pixel accuracy, the motion compensation process of the sub-image block does not involve sub-pixels, which can reduce the bandwidth pressure generated by the Affine prediction technology to a certain extent.
  • the bandwidth pressure problem can also be relieved to a certain extent, but this will reduce the image compression performance.
  • by processing the motion vector of the sub-image block, as the processing unit, into integer pixel accuracy, integer-pixel motion compensation can be ensured, so that on the one hand the bandwidth pressure problem can be addressed, and on the other hand better image compression performance can be ensured.
  • the existing Affine technology can be improved according to the solution provided by the present application, that is, the motion vector of the sub-CU in the Affine mode is processed to an integer pixel accuracy, so that the bandwidth pressure generated by the Affine technology can be reduced.
  • the solution provided in this application can also be applied to other similar technologies that may appear in the future.
  • in such technologies, the pixel accuracy of the motion vector includes integer pixel accuracy and sub-pixel accuracy, and the size of the image processing unit is small, for example, 4×4.
  • the method provided in the embodiment of the present application further includes: processing the CPMV of the image block to integer pixel accuracy.
  • This embodiment can ensure that the CPMV of the image block has an integer pixel accuracy.
  • step 610 includes the following step 611, step 612, and step 613.
  • the motion vectors of the spatial and/or temporal neighboring blocks of the image block are acquired, and based on the motion vectors of these neighboring blocks, a motion information candidate list of the image block is constructed.
  • the aforementioned formula (3) or formula (4) can be used to process the motion vector in the motion information candidate list into integer pixel accuracy.
  • the neighboring block refers to the neighboring block used to construct the motion information candidate list of the image block, for example, the neighboring block in the temporal and/or spatial domain. This application does not limit the manner of determining neighboring blocks.
  • Affine inter prediction modes can be divided into Affine merge mode and Affine inter mode.
  • the embodiment shown in FIG. 7 can be applied to the Affine inter mode and can also be applied to the Affine merge mode.
  • the inter-frame prediction mode of the image block is the Affine merge mode.
  • a CPMV can be selected from the motion information candidate list directly as the CPMV of the image block. That is, step 613 includes: selecting a CPMV from the motion information candidate list of the image block as the CPMV of the image block.
  • selecting CPMV from the motion information candidate list directly as the CPMV of the image block can ensure that the CPMV of the image block is an integer pixel.
  • the general process of inter prediction in Affine merge mode includes the following steps.
  • the following takes the case where the image block is a CU as an example.
  • Step 1-1: obtain the motion vector (MV) of the neighboring block from the spatial neighboring block and/or the temporal neighboring block.
  • the MV of the neighboring block in the Affine mode and the MV of the neighboring block in the traditional mode are obtained, and CPMVs are obtained according to the MV combination of these neighboring blocks, and the motion information candidate list of the CU is constructed from these CPMVs.
  • Step 1-2: process the motion vectors in the motion information candidate list of the CU into integer pixel accuracy.
  • Step 1-3: select a combination from the motion information candidate list (the combination may contain two or three CPMVs, corresponding to two or three control points) as the CPMVs of the CU.
  • the CPMVs selected in the motion information candidate list are used as the CPMVs of the current CU, no motion estimation is required, and there is no concept of MVD as in the Affine inter mode (described below). That is, in the Affine merge mode, only the index of the CPMVs selected from the motion information candidate list (one CU only needs to write one index) is written into the code stream, and there is no need to transmit an MVD.
  • the inter prediction mode of the neighboring block can be the traditional inter prediction mode or the affine mode. Therefore, the MV obtained from the neighboring block may be of integer pixel accuracy or sub-pixel accuracy.
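  • Steps 1-2 and 1-3 of the Affine merge flow can be sketched in Python as follows. The candidate values, the 1/16-pixel storage unit, and the function names are hypothetical; how the candidate list is actually built from neighboring blocks (step 1-1) is not modeled here.

```python
SHIFT = 4  # assumes MVs are stored as integers in units of 1/16 pixel


def to_integer_pel(v, shift=SHIFT):
    """Round one MV component to the nearest integer-pixel position."""
    return ((v + (1 << (shift - 1))) >> shift) << shift


def round_candidate_list(candidates):
    """Step 1-2: process every CPMV of every candidate combination."""
    return [
        [(to_integer_pel(x), to_integer_pel(y)) for (x, y) in combo]
        for combo in candidates
    ]


def affine_merge_cpmvs(candidates, merge_index):
    """Steps 1-2 and 1-3: round the list, then take the signaled combination."""
    return round_candidate_list(candidates)[merge_index]


# Hypothetical candidate list: each entry is a combination of two or three CPMVs.
candidate_list = [
    [(18, -7), (35, 9)],           # two control points (four-parameter model)
    [(0, 0), (24, 40), (-9, 15)],  # three control points (six-parameter model)
]
cpmvs = affine_merge_cpmvs(candidate_list, merge_index=1)
```

  • Only `merge_index` would be written to the code stream in this mode; because the list is rounded before selection, the selected CPMVs are multiples of one pixel.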
  • the embodiment shown in FIG. 7 can also be applied to the Affine inter mode.
  • the general flow of the Affine Inter mode will be described first.
  • the general process of Affine Inter mode includes the following steps.
  • the image block is a CU as an example.
  • Step 2-1 Obtain motion vectors of neighboring blocks from spatial neighboring blocks and/or temporal neighboring blocks.
  • the motion vector of the neighboring block in the Affine mode and the motion vector of the neighboring block in the traditional mode are obtained;
  • CPMVs are obtained by combining the obtained motion vectors, and the motion information candidate list of the CU is constructed from these CPMVs.
  • Step 2-2 select a combination from the motion information candidate list constructed in step 2-1 (the combination may contain two or three CPMV, representing two control points and three control points CPMV), as the current CU MV (Motion vector prediction, MVP) (that is, the predicted CPMVs of the current CU).
  • the combination may contain two or three CPMV, representing two control points and three control points CPMV
  • the current CU MV Motion vector prediction, MVP
  • Step 2-3: Perform motion estimation with the entire current CU as a unit to obtain the CPMVs of the current CU.
  • Step 2-4: Calculate the difference between the CPMVs selected in step 2-2 and the CPMVs obtained by motion estimation in step 2-3 to obtain a motion vector difference (MVD).
  • The index of the selected CPMVs and the MVD need to be written into the code stream.
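  • The relationship between the predicted CPMVs of step 2-2, the CPMVs estimated in step 2-3, and the MVD of step 2-4 can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, motion vectors are assumed to be stored as integers in 1/16-pixel units, and the sign convention (estimated minus predicted) is assumed rather than taken from the application.

```python
from typing import List, Tuple

# (horizontal, vertical) components, assumed to be stored in 1/16-pixel units
MV = Tuple[int, int]


def compute_mvds(predicted: List[MV], estimated: List[MV]) -> List[MV]:
    """Encoder side: one MVD per control point (estimated minus predicted)."""
    if len(predicted) != len(estimated):
        raise ValueError("predicted and estimated CPMV counts must match")
    return [(e[0] - p[0], e[1] - p[1]) for p, e in zip(predicted, estimated)]


def reconstruct_cpmvs(predicted: List[MV], mvds: List[MV]) -> List[MV]:
    """Decoder side: add the transmitted MVD back to the predicted CPMVs."""
    if len(predicted) != len(mvds):
        raise ValueError("predicted CPMV and MVD counts must match")
    return [(p[0] + d[0], p[1] + d[1]) for p, d in zip(predicted, mvds)]
```

  • With two control points, for example, the encoder would write the candidate index plus two MVDs, and the decoder would recover the same CPMVs by calling `reconstruct_cpmvs` on the candidate selected by that index.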
  • The motion estimation process is performed in units of CUs (corresponding to the image block in the embodiments of this application), and the motion compensation process is performed in units of 4×4 sub-CUs (corresponding to the sub-image block in the embodiments of this application).
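  • To illustrate how per-sub-CU motion vectors follow from the CPMVs of a CU, the sketch below evaluates the widely used 4-parameter affine model at the centre of each 4×4 sub-block. The model, the choice of the top-left and top-right corners as control points, the centre sampling position, and the function name are assumptions made for illustration; the formulas actually used by the application are given elsewhere in the description.

```python
from typing import Dict, Tuple

MVf = Tuple[float, float]  # (horizontal, vertical) in pixel units


def sub_block_mvs(cpmv0: MVf, cpmv1: MVf, width: int, height: int,
                  sub: int = 4) -> Dict[Tuple[int, int], MVf]:
    """Derive one MV per sub-block from two CPMVs (4-parameter affine model).

    cpmv0 is the MV at the top-left corner and cpmv1 the MV at the
    top-right corner of a width x height block (assumed layout).
    """
    a = (cpmv1[0] - cpmv0[0]) / width  # scaling/rotation term
    b = (cpmv1[1] - cpmv0[1]) / width  # rotation term
    mvs = {}
    for y in range(0, height, sub):
        for x in range(0, width, sub):
            cx = x + sub / 2  # sub-block centre, horizontal
            cy = y + sub / 2  # sub-block centre, vertical
            mvs[(x, y)] = (a * cx - b * cy + cpmv0[0],
                           b * cx + a * cy + cpmv0[1])
    return mvs
```

  • The resulting vectors are generally fractional, which is why a later processing step is needed if the sub-block motion vectors are required to have integer-pixel accuracy.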
  • The inter prediction mode of a neighboring block can be the traditional inter prediction mode or the affine mode. Therefore, the MV obtained from a neighboring block may have integer-pixel accuracy or sub-pixel accuracy.
  • The encoder may select among different pixel precisions for the motion vector of the CU. This process can be called the adaptive motion vector resolution (AMVR) decision.
  • The pixel accuracy decided by AMVR is essentially the pixel accuracy of the MVD, that is, the pixel accuracy of the CPMVs of the CU, not the pixel accuracy of the MVs of the sub-CUs.
  • The range of pixel accuracies for the AMVR decision includes but is not limited to: 1/16-pixel, 1/8-pixel, 1/4-pixel, 1/2-pixel, 1-pixel, 2-pixel, and 4-pixel accuracy.
  • The CU can have multiple CPMVs with different pixel accuracies.
  • For example, the CU can have three different CPMVs with integer-pixel, 1/4-pixel, and 1/16-pixel accuracy.
  • When the inter prediction mode of the image block is the Affine Inter mode, step 610 may include the following steps.
  • Step 611: Obtain the motion information candidate list of the image block.
  • Step 612: Process the motion vectors in the motion information candidate list to integer-pixel accuracy.
  • Step 613: Select the predicted CPMV of the image block from the motion information candidate list of the image block, obtain the MVD of the image block, and obtain the CPMV of the image block from the predicted CPMV and the MVD of the image block.
  • Step 610 may further include step 614: performing a motion vector accuracy decision of N-pixel accuracy for the image block, where N is a positive integer (that is, an AMVR decision restricted to integer-pixel accuracy).
  • In this way, the pixel accuracy of the MVD of the image block can be guaranteed to be integer-pixel, and the pixel accuracy of the CPMV of the image block can also be guaranteed to be integer-pixel. This ensures that no sub-pixels are involved in the motion estimation process of the image block, thereby reducing the bandwidth pressure to a certain extent.
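  • The reasoning that an integer-pixel predicted CPMV plus an integer-pixel MVD gives an integer-pixel CPMV can be made concrete with a small sketch. The 1/16-pixel storage unit (`shift=4`), the round-to-nearest rule, and all function names are assumptions for illustration; the application's own rounding is given by its formulas (3) and (4), which are not reproduced here.

```python
from typing import List, Tuple

MV = Tuple[int, int]  # assumed to be stored in 1/16-pixel units


def to_integer_pel(v: int, shift: int = 4) -> int:
    """Round one component to the nearest integer-pixel position."""
    half = 1 << (shift - 1)
    magnitude = ((abs(v) + half) >> shift) << shift
    return magnitude if v >= 0 else -magnitude


def derive_cpmvs(candidate_list: List[List[MV]], index: int,
                 mvds: List[MV], shift: int = 4) -> List[MV]:
    """Round the candidate list, pick the predictor, and add the MVD."""
    # Process every candidate to integer-pixel accuracy (cf. step 612)
    rounded = [[(to_integer_pel(x, shift), to_integer_pel(y, shift))
                for x, y in cand] for cand in candidate_list]
    # Select the predictor and add the integer-pel MVD (cf. step 613)
    predicted = rounded[index]
    return [(p[0] + d[0], p[1] + d[1]) for p, d in zip(predicted, mvds)]
```

  • If every MVD component is itself a multiple of `1 << shift`, which is what an N-pixel accuracy decision implies under these assumptions, every returned CPMV component is also a multiple of `1 << shift`.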
  • When Affine AMVR is used to make the motion vector accuracy decision, it does not decide among all pixel accuracies but skips the 1/M-pixel (M > 1) accuracies; that is, only N-pixel accuracies are considered in the decision.
  • When the motion vector precision index is written into the code stream, the number of bits is reduced correspondingly because there are fewer pixel accuracy options, and it may even be unnecessary to write the index indicating the motion vector precision at all.
  • For example, suppose the pixel accuracy options include three types: integer pixel, 1/4 pixel, and 1/16 pixel. Codewords of up to 2 bits are required to indicate these three accuracies. For example, "0" indicates 1/4 pixel, "10" indicates 1/16 pixel, and "11" indicates integer pixel.
  • With only integer-pixel accuracy available, "0" can be used to represent integer pixel, so only 1 bit needs to be written into the code stream. Alternatively, integer-pixel precision can be fixed by agreement between encoder and decoder, so that the motion vector precision index does not need to be written into the code stream at all. This saves signaling overhead and also reduces bandwidth pressure.
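  • The signaling saving described above can be shown with the codewords quoted in the text. The table names and the `implicit` flag are hypothetical; only the codewords "0", "10", and "11" come from the description.

```python
# Codewords for the three-option case, as quoted in the description
FULL_TABLE = {"1/4": "0", "1/16": "10", "1": "11"}

# Single-option case: one bit suffices for integer-pixel accuracy
INTEGER_ONLY_TABLE = {"1": "0"}


def precision_index_bits(precision: str, table: dict,
                         implicit: bool = False) -> str:
    """Return the bits written to the code stream for a precision choice.

    When the precision is fixed by agreement (implicit=True), nothing
    is written at all.
    """
    if implicit:
        return ""
    return table[precision]
```

  • Under these tables, signaling integer-pixel accuracy costs 2 bits with three options, 1 bit with the single-option table, and 0 bits when the precision is agreed in advance.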
  • Performing the N-pixel (N is a positive integer) motion vector accuracy decision on the image block can be implemented in combination with the embodiment shown in FIG. 8, or independently of the embodiment shown in FIG. 8.
  • When the inter prediction mode of the image block is the Affine inter mode, step 610 includes: obtaining the CPMV of the image block, and performing a motion vector precision decision of N-pixel accuracy on the image block, where N is a positive integer.
  • In this way, the CPMV of the image block can be guaranteed to have integer-pixel accuracy.
  • One way to process the CPMV of the image block to integer-pixel accuracy is to process the motion vectors in the motion information candidate list to integer-pixel accuracy.
  • Another way to process the CPMV of the image block to integer-pixel accuracy is to process the motion vectors in the motion information candidate list to integer-pixel accuracy and also perform an AMVR decision with integer-pixel accuracy on the image block.
  • A further way to process the CPMV of the image block to integer-pixel accuracy is to perform an AMVR decision with integer-pixel accuracy on the image block.
  • The method shown in formula (3) or formula (4) above can be used to process the motion vectors of neighboring blocks to integer-pixel accuracy. Other feasible algorithms or methods that convert from sub-pixel to integer-pixel accuracy may also be used to process the motion vectors of neighboring blocks. This application does not limit this.
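  • As one example of an "other feasible algorithm" for converting a sub-pixel motion vector to integer-pixel accuracy, the sketch below rounds each component to the nearest integer-pixel position, with ties rounded away from zero. It is not a reproduction of formula (3) or formula (4); the 1/16-pixel storage unit (`shift=4`) and the function name are assumptions.

```python
from typing import Tuple

MV = Tuple[int, int]  # assumed to be stored in 1/16-pixel units


def round_to_integer_pel(mv: MV, shift: int = 4) -> MV:
    """Round both components to the nearest multiple of 2**shift.

    Ties are rounded away from zero, so positive and negative vectors
    are treated symmetrically.
    """
    half = 1 << (shift - 1)

    def round_component(v: int) -> int:
        magnitude = ((abs(v) + half) >> shift) << shift
        return magnitude if v >= 0 else -magnitude

    return (round_component(mv[0]), round_component(mv[1]))
```

  • Rounding the magnitude rather than the signed value avoids the bias toward negative infinity that a plain arithmetic right shift would introduce for negative components.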
  • When the size of the image block is smaller than a threshold, the CPMV of the image block is processed to integer-pixel accuracy.
  • The threshold can be determined according to actual needs.
  • For example, the threshold is 16 pixels.
  • For example, when the height and/or width of the image block is less than 16 pixels, the CPMV of the image block is processed to integer-pixel accuracy.
  • In the Affine Inter mode, motion estimation is performed in units of image blocks. When the height and width of the image block are both equal to or greater than 16 pixels, even a motion estimation process with sub-pixel accuracy does not cause large bandwidth pressure. In this case, the CPMV of the image block need not be processed to integer-pixel accuracy.
  • When the height and/or width of the image block is less than 16 pixels, a motion estimation process with sub-pixel accuracy may cause large bandwidth pressure.
  • In this case, the CPMV of the image block can be processed to integer-pixel accuracy.
  • For example, the prediction mode of the image block is the Affine Inter mode, and the height and/or width of the image block is less than 16 pixels.
  • In this case, the method according to the embodiment of the present application further includes: performing an AMVR decision with integer-pixel accuracy on the image block.
  • This embodiment ensures that the motion estimation process uses integer-pixel accuracy, which avoids large bandwidth pressure.
  • When the motion vector accuracy index of an image block whose height and/or width is less than 16 pixels is written into the code stream, the number of bits written can be reduced because there are fewer pixel accuracy options.
  • For example, in the general case the AMVR pixel accuracy can be selected from three options: integer, 1/4, and 1/16 pixel, where "0" represents 1/4 pixel, "10" represents 1/16 pixel, and "11" represents integer pixel. For CUs with a height and/or width less than 16 pixels, there is only one AMVR pixel accuracy option, so there is no need to write the AMVR pixel accuracy index into the code stream. For example, integer-pixel accuracy can be adopted by agreement.
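  • The size-dependent rule can be sketched as a small decision function. The 16-pixel threshold and the three accuracy options come from the examples in the text; the function names and the string labels for the accuracies are hypothetical.

```python
from typing import List


def allowed_amvr_precisions(width: int, height: int,
                            threshold: int = 16) -> List[str]:
    """Return the AMVR accuracy options available to a block.

    Blocks whose height and/or width is below the threshold are
    restricted to integer-pixel accuracy.
    """
    if width < threshold or height < threshold:
        return ["1"]
    return ["1/4", "1/16", "1"]


def needs_precision_index(width: int, height: int,
                          threshold: int = 16) -> bool:
    """An index is only written when more than one option exists."""
    return len(allowed_amvr_precisions(width, height, threshold)) > 1
```

  • For an 8×32 block, for example, only integer-pixel accuracy is available and no index is written, whereas a 16×16 block keeps all three options and therefore needs the index.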
  • the embodiments of the present application can be applied to different kinds of inter-frame prediction methods, for example, forward prediction, backward prediction, or bi-prediction.
  • the inter-frame prediction mode of the sub-image block mentioned in the embodiment of the present application may be any of the following: forward prediction, backward prediction, and bi-prediction.
  • For example, the motion vector of the sub-image block obtained in the forward prediction process is processed to integer-pixel accuracy.
  • For example, the motion vector of the sub-image block obtained in the backward prediction process is processed to integer-pixel accuracy.
  • For example, the motion vectors of the sub-image block obtained in the bi-prediction process are processed to integer-pixel accuracy.
  • In some embodiments, the inter-frame prediction mode of the sub-image block is bi-prediction, but the method provided in the embodiments of the present application is used to process the motion vector of the sub-image block to integer-pixel accuracy for only one of the two prediction processes in the bi-prediction.
  • In this case, the CPMV of the image block is the CPMV of the image block obtained by forward prediction in the bi-prediction process, or the CPMV of the image block obtained by backward prediction in the bi-prediction process.
  • That is, the motion vector of the sub-image block obtained in one prediction process of the bi-prediction is processed to integer-pixel accuracy.
  • This prediction process may be the forward prediction process or the backward prediction process in the bi-prediction.
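  • The variant in which only one direction of a bi-predicted sub-image block is processed to integer-pixel accuracy can be sketched as follows. The rounding rule, the 1/16-pixel storage unit, the function names, and the `integer_direction` parameter are all assumptions made for illustration.

```python
from typing import Tuple

MV = Tuple[int, int]  # assumed to be stored in 1/16-pixel units


def round_mv(mv: MV, shift: int = 4) -> MV:
    """Round both components to the nearest integer-pixel position."""
    half = 1 << (shift - 1)

    def round_component(v: int) -> int:
        magnitude = ((abs(v) + half) >> shift) << shift
        return magnitude if v >= 0 else -magnitude

    return (round_component(mv[0]), round_component(mv[1]))


def process_bi_prediction(mv_forward: MV, mv_backward: MV,
                          integer_direction: str = "forward"
                          ) -> Tuple[MV, MV]:
    """Round the MV of exactly one prediction direction."""
    if integer_direction == "forward":
        return round_mv(mv_forward), mv_backward
    if integer_direction == "backward":
        return mv_forward, round_mv(mv_backward)
    raise ValueError("integer_direction must be 'forward' or 'backward'")
```

  • Leaving the other direction at sub-pixel accuracy trades part of the bandwidth saving for prediction quality, which matches the compromise between bandwidth pressure and compression performance described in the text.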
  • the motion estimation with integer pixel accuracy can be guaranteed, which helps reduce bandwidth pressure.
  • the solution provided by the present application can reduce the bandwidth pressure caused by the inter-frame prediction process, and at the same time can ensure a certain compression performance.
  • an embodiment of the present application provides an image processing apparatus 900, which includes the following units.
  • The first acquiring unit 910 is configured to acquire the control point motion vector (CPMV) of the image block.
  • The second acquiring unit 920 is configured to acquire a motion vector of a sub-image block in the image block according to the CPMV of the image block acquired by the first acquiring unit 910, where the motion vector has integer-pixel accuracy.
  • Because the motion vector of the sub-image block has integer-pixel accuracy, the motion compensation process of the sub-image block does not involve sub-pixels, which can reduce, to a certain extent, the bandwidth pressure caused by the Affine prediction technology.
  • The second acquiring unit 920 is configured to: calculate a first motion vector of the sub-image block according to the CPMV of the image block, where the first motion vector has sub-pixel accuracy; and process the first motion vector into a second motion vector with integer-pixel accuracy.
  • The second acquiring unit 920 is configured to obtain the second motion vector from the first motion vector of the sub-image block such that the end point of the second motion vector is the integer pixel closest to the end point of the first motion vector.
  • The second acquiring unit 920 is configured to process the first motion vector into a second motion vector with integer-pixel accuracy through formula (3) or formula (4).
  • the height and/or width of the sub-image block is 4 pixels.
  • The first acquiring unit 910 is configured to: obtain a motion information candidate list of the image block, and process the motion vectors in the motion information candidate list to integer-pixel accuracy; and obtain the CPMV of the image block according to the motion vectors in the motion information candidate list that have been processed to integer-pixel accuracy.
  • the device 900 further includes: a processing unit 930, configured to make a motion vector accuracy decision of N pixels for the image block, where N is a positive integer.
  • the height and/or width of the image block is less than 16 pixels.
  • the inter-frame prediction mode of the sub-image block is any one of the following: forward prediction, backward prediction, and bi-prediction.
  • The inter-frame prediction mode of the sub-image block is bi-prediction, wherein the CPMV of the image block is the CPMV of the image block obtained by forward prediction in the bi-prediction process, or the CPMV of the image block obtained by backward prediction in the bi-prediction process.
  • the image processing apparatus 900 of this embodiment may be an encoder, and the apparatus 900 may also include functional modules for implementing video encoding related processes.
  • the image processing apparatus 900 of this embodiment may be a decoder, and the apparatus 900 may further include functional modules for implementing video decoding related processes.
  • an embodiment of the present invention also provides an image processing apparatus 1000.
  • the device 1000 includes a processor 1010 and a memory 1020.
  • the memory 1020 is used to store instructions.
  • The processor 1010 is used to execute the instructions stored in the memory 1020. Execution of the instructions stored in the memory 1020 causes the processor 1010 to perform the method of the above method embodiments.
  • The apparatus 1000 further includes a communication interface 1030 for exchanging signals with external devices.
  • the image processing apparatus 1000 in this embodiment is an encoder, and the communication interface 1030 is used to receive image or video data to be processed from an external device.
  • the communication interface 1030 is also used to send a coded stream to the decoding end.
  • the image processing apparatus 1000 in this embodiment is a decoder, and the communication interface 1030 is used to receive an encoded bitstream from an encoding end.
  • the embodiment of the present invention also provides a computer storage medium on which a computer program is stored.
  • When the computer program is executed by a computer, the computer performs the method of the above method embodiments.
  • An embodiment of the present invention also provides a computer program product containing instructions, which is characterized in that, when the instructions are executed by a computer, the computer executes the method of the above method embodiment.
  • The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • When implemented in software, they can be implemented in whole or in part in the form of a computer program product.
  • The computer program product includes one or more computer instructions.
  • When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present invention are generated in whole or in part.
  • the computer can be a general-purpose computer, a dedicated computer, a computer network, or other programmable devices.
  • Computer instructions can be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • For example, computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave).
  • a computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or data center integrated with one or more available media. Available media may be magnetic media (for example, floppy disk, hard disk, tape), optical media (for example, digital video disc (DVD)), or semiconductor media (for example, solid state disk (SSD)), etc.
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • The division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • each unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to an image processing method and apparatus. The method comprises: obtaining a control point motion vector (CPMV) of an image block; and obtaining motion vectors (MVs) of sub-image blocks in the image block according to the CPMV of the image block, the MVs having integer-pixel accuracy. Because the MVs of the sub-image blocks, which serve as the image processing units, have integer-pixel accuracy, the motion compensation process of the sub-image blocks does not involve sub-pixels, which reduces, to a certain extent, the bandwidth pressure caused by affine prediction technology.
PCT/CN2019/077894 2019-03-12 2019-03-12 Procédé et appareil de traitement d'image Ceased WO2020181507A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201980005232.7A CN111247804B (zh) 2019-03-12 2019-03-12 图像处理的方法与装置
PCT/CN2019/077894 WO2020181507A1 (fr) 2019-03-12 2019-03-12 Procédé et appareil de traitement d'image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/077894 WO2020181507A1 (fr) 2019-03-12 2019-03-12 Procédé et appareil de traitement d'image

Publications (1)

Publication Number Publication Date
WO2020181507A1 true WO2020181507A1 (fr) 2020-09-17

Family

ID=70865988

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/077894 Ceased WO2020181507A1 (fr) 2019-03-12 2019-03-12 Procédé et appareil de traitement d'image

Country Status (2)

Country Link
CN (1) CN111247804B (fr)
WO (1) WO2020181507A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106303544A (zh) * 2015-05-26 2017-01-04 华为技术有限公司 一种视频编解码方法、编码器和解码器
CN106534858A (zh) * 2015-09-10 2017-03-22 展讯通信(上海)有限公司 真实运动估计方法及装置
CN109005407A (zh) * 2015-05-15 2018-12-14 华为技术有限公司 视频图像编码和解码的方法、编码设备和解码设备
CN109218733A (zh) * 2017-06-30 2019-01-15 华为技术有限公司 一种确定运动矢量预测值的方法以及相关设备
WO2019032765A1 (fr) * 2017-08-09 2019-02-14 Vid Scale, Inc. Conversion-élévation de fréquence de trame à complexité réduite

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3414900B1 (fr) * 2016-03-15 2025-08-06 HFI Innovation Inc. Procédé et appareil de codage vidéo avec compensation de mouvement affine
CN109391814B (zh) * 2017-08-11 2023-06-06 华为技术有限公司 视频图像编码和解码的方法、装置及设备
CN107277506B (zh) * 2017-08-15 2019-12-03 中南大学 基于自适应运动矢量精度的运动矢量精度选择方法及装置

Also Published As

Publication number Publication date
CN111247804A (zh) 2020-06-05
CN111247804B (zh) 2023-10-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19919041

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19919041

Country of ref document: EP

Kind code of ref document: A1