WO2025077156A1 - Video encoding method, video decoding method, and apparatus - Google Patents
- Publication number
- WO2025077156A1 (PCT/CN2024/090205)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- segmentation
- image block
- target image
- pixel
- point
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/513—Processing of motion vectors
- H04N19/517—Processing of motion vectors by encoding
- H04N19/52—Processing of motion vectors by encoding by predictive encoding
Definitions
- Some embodiments of the present disclosure relate to the technical field of video encoding and decoding, and more specifically, to a video encoding method, a video decoding method and a device.
- the reconstructed target image block is obtained according to the two reconstructed sub-image blocks.
- some embodiments of the present disclosure provide a video encoding device, including:
- the processor is configured to, when calling a computer program, enable the video encoding device to implement the video encoding method described in the first aspect.
- some embodiments of the present disclosure provide a video decoding device, including:
- a memory configured to store a computer program
- the processor is configured to, when calling a computer program, enable the video decoding device to implement the video decoding method described in the second aspect.
- some embodiments of the present disclosure provide a computer-readable storage medium having a computer program stored thereon.
- when the computer program is executed by a computing device, the computing device implements the video encoding method described in the first aspect or the video decoding method described in the second aspect.
- some embodiments of the present disclosure provide a computer program product, which, when executed on a computer, enables the computer to implement the video encoding method described in the first aspect or the video decoding method described in the second aspect.
- FIG1 shows a block diagram of a video decoding system in some embodiments of the present disclosure
- FIG3 is a schematic diagram showing the structure of a video decoder in some embodiments of the present disclosure.
- FIG5 shows a second flowchart of the video encoding method in some embodiments of the present disclosure
- FIG6 is a schematic diagram showing non-maximum suppression in some embodiments of the present disclosure.
- FIG7 is a schematic diagram showing a set of image regions in some embodiments of the present disclosure.
- FIG8 is a schematic diagram showing a merging solution in some embodiments of the present disclosure.
- FIG9 is a schematic diagram showing spatially adjacent pixel points at four edges of a target image area in some embodiments of the present disclosure.
- FIG10 is a schematic diagram showing pixel value step points in some embodiments of the present disclosure.
- FIG11 is a schematic diagram showing a third-order Bezier curve in some embodiments of the present disclosure.
- FIG12 is a schematic diagram showing a first pixel and a second pixel in some other embodiments of the present disclosure.
- FIG14 is a schematic diagram showing target pixel points in other embodiments of the present disclosure.
- FIG15 is a schematic diagram showing sampling results of segmentation curves in other embodiments of the present disclosure.
- FIG16 is a schematic diagram showing two sub-image blocks in some embodiments of the present disclosure.
- FIG17 is a flowchart showing the steps of a video decoding method in some embodiments of the present disclosure.
- FIG18 shows a third flow chart of the steps of the video encoding method in some embodiments of the present disclosure.
- FIG19 shows a fourth flowchart of the video encoding method in some embodiments of the present disclosure.
- FIG20 is a schematic diagram showing non-maximum suppression in some embodiments of the present disclosure.
- FIG21 is a schematic diagram showing a set of image regions in some embodiments of the present disclosure.
- FIG22 is a schematic diagram showing a merging solution in some embodiments of the present disclosure.
- FIG24 is a schematic diagram showing a second-order Bezier curve in some embodiments of the present disclosure.
- FIG25 is a schematic diagram showing a third-order Bezier curve in some embodiments of the present disclosure.
- FIG26 is a schematic diagram showing a sub-curve in some other embodiments of the present disclosure.
- FIG27 is a schematic diagram showing two sub-image blocks in some embodiments of the present disclosure.
- FIG28 is a flowchart showing the steps of a video decoding method in some embodiments of the present disclosure.
- FIG29 shows a fifth flow chart of the steps of the video encoding method in some embodiments of the present disclosure
- FIG30 is a schematic diagram showing an initial prediction block and a boundary line in some embodiments of the present disclosure.
- FIG31 is a schematic diagram showing the distance from a pixel point to a boundary line in some embodiments of the present disclosure
- FIG32 shows a sixth flow chart of the steps of the video encoding method in some embodiments of the present disclosure
- FIG33 is a schematic diagram showing a first region of an initial prediction block in some embodiments of the present disclosure.
- FIG34 is a flowchart showing the steps of a video decoding method in some embodiments of the present disclosure.
- FIG35 is a flowchart showing the steps of a video encoding method in some embodiments of the present disclosure.
- FIG36 is a flowchart showing the steps of a video encoding method in some other embodiments of the present disclosure.
- FIG37 is a schematic diagram showing the structure of a motion vector prediction model in some embodiments of the present disclosure.
- FIG39 is a schematic diagram showing a horizontal motion vector matrix in some embodiments of the present disclosure.
- FIG40 is a schematic diagram showing a vertical motion vector matrix in some embodiments of the present disclosure.
- FIG41 is a flowchart showing the steps of a video decoding method in some embodiments of the present disclosure.
- FIG42 is a flowchart showing the steps of a video encoding method in some embodiments of the present disclosure.
- FIG43 is a schematic diagram showing an encoded block for generating a candidate motion vector of a target image block in some embodiments of the present disclosure
- FIG44 is a schematic diagram showing a first segmentation mode in some embodiments of the present disclosure.
- FIG45 is a schematic diagram showing a second segmentation mode in some embodiments of the present disclosure.
- FIG46 is a schematic diagram showing a third segmentation mode in some embodiments of the present disclosure.
- FIG48 is a schematic diagram showing a fifth segmentation mode in some embodiments of the present disclosure.
- FIG49 is a schematic diagram showing a sixth segmentation mode in some embodiments of the present disclosure.
- FIG50 is a schematic diagram showing a seventh segmentation mode in some embodiments of the present disclosure.
- FIG51 is a schematic diagram showing an eighth segmentation mode in some embodiments of the present disclosure.
- FIG52 is a schematic diagram showing a ninth segmentation mode in some embodiments of the present disclosure.
- FIG53 is a schematic diagram showing a tenth segmentation mode in some embodiments of the present disclosure.
- FIG54 is a schematic diagram showing an eleventh segmentation mode in some embodiments of the present disclosure.
- FIG55 is a schematic diagram showing a twelfth segmentation mode in some embodiments of the present disclosure.
- FIG56 is a schematic diagram showing a thirteenth segmentation mode in some embodiments of the present disclosure.
- FIG57 is a schematic diagram showing a fourteenth segmentation mode in some embodiments of the present disclosure.
- FIG58 is a schematic diagram showing a fifteenth segmentation mode in some embodiments of the present disclosure.
- FIG59 is a schematic diagram showing a sixteenth segmentation mode in some embodiments of the present disclosure.
- FIG. 60 is a flowchart showing the steps of a video decoding method in some embodiments of the present disclosure.
- the embodiments of the present disclosure relate to the technical field of video coding and decoding.
- the following first describes a video coding and decoding framework for executing the video coding method and the video decoding method provided by the embodiments of the present disclosure.
- Video can be regarded as a sequence of multiple video frames (images).
- Video playback can be regarded as the display of video frames at a preset speed (for example, 24 frames per second, 30 frames per second, and 60 frames per second) in the order of the sequence.
- the amount of video data is positively correlated with the resolution of the video frame. The higher the resolution of the video frame, the larger the amount of video data. If the pixel data of each pixel of all video frames is directly saved in the video file, the amount of video data will be very large, which will make the video difficult to store and transmit.
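- As a rough illustration of this point, the sketch below estimates the size of uncompressed video. The function name and the default of 1.5 bytes per pixel (8-bit YUV 4:2:0 sampling) are assumptions made for the example and are not taken from the disclosure.

```python
def raw_video_bytes(width, height, fps, seconds, bytes_per_pixel=1.5):
    """Size in bytes of uncompressed video.

    bytes_per_pixel=1.5 assumes 8-bit YUV 4:2:0 (an illustrative default).
    """
    return int(width * height * bytes_per_pixel * fps * seconds)


# One minute of 1920x1080 video at 30 frames per second.
one_minute_1080p = raw_video_bytes(1920, 1080, fps=30, seconds=60)
print(one_minute_1080p / 1e9, "GB")
```

- Under these assumptions, one minute of 1920x1080 video at 30 frames per second occupies roughly 5.6 GB before compression.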
- Video encoding and decoding was proposed to alleviate this problem.
- Video coding mainly includes video encoding and video decoding. Video encoding can be understood as the process of compressing the original video frames, and video decoding can be understood as the process of reconstructing the video frames from the compressed video data.
- the video decoding system 100 includes: a source device 10 and a destination device 20.
- the source device 10 can obtain original video data through a video source 101, encode the original video frames through a video encoder 102 to obtain video encoding data, and provide the video encoding data output by the video encoder 102 to the destination device 20 through an output interface 103.
- the destination device 20 can obtain the video encoding data provided by the source device 10 through an input interface 201, and decode the video encoding data through a video decoder 202 to obtain video decoding data, and input the video decoding data into a player 203 to play the video.
- the source device 10 and the destination device 20 may each be any of a wide range of devices, such as a personal computer (PC), a notebook computer, a tablet computer, a set-top box, a mobile phone, a television, a camera, a display, a digital media player, a video game console, a video streaming device, etc.
- the video source 101 of the source device 10 may be a video shooting device, such as a camera.
- the video source 101 may be a component capable of generating video based on computer graphics, such as a screen recording component, an animation generation component, etc.
- the destination device 20 may receive the video encoding data provided by the source device 10 via a computer-readable medium.
- the computer-readable medium may include any type of medium or device capable of moving the video encoding data from the source device 10 to the destination device 20.
- the computer-readable medium may include a communication medium.
- the communication medium may modulate the video encoding data according to a communication standard (e.g., a wireless communication protocol) and transmit it to the destination device 20.
- the communication medium may include any wireless or wired communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
- the communication medium may form part of a packet network (e.g., a local area network, a wide area network, or a global network, such as the Internet).
- the communication medium may include a router, a switch, a base station, or any other device that can be used to facilitate communication from the source device 10 to the destination device 20.
- the video encoding data may be output from the output interface 103 of the source device 10 to a storage device. Accordingly, the video encoding data may be accessed from the storage device by the input interface 201 of the destination device 20.
- the storage device may include any of a variety of distributed or locally accessed data storage media, such as a hard drive, a Blu-ray disc, a DVD, a CD-ROM, a flash memory, a volatile or non-volatile memory, or any other suitable digital storage medium for storing video encoding data.
- the storage device may be a server or an intermediate storage device for storing the video encoding data generated by the source device 10.
- the destination device 20 may obtain the stored video encoding data from the storage device via streaming or downloading.
- the file server may be any type of computer capable of storing encoded video data and transmitting the encoded video data to the destination device 20.
- the file server includes a network server (e.g., for a website), an FTP server, a network attached storage device, or a local disk drive.
- Destination device 20 can access the encoded video data through any standard data connection (including an Internet connection). This may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., DSL, cable modem, etc.), or a combination of both that are suitable for accessing the encoded video data stored on the file server.
- the transmission of the encoded video data from the storage device may be a streaming transmission, a download transmission, or a combination thereof.
- the video coding standard has gradually evolved from ISO/IEC MPEG-1 (International Organization for Standardization/International Electrotechnical Commission Moving Picture Experts Group-1), through ISO/IEC MPEG-2, ISO/IEC MPEG-4, Advanced Video Coding (AVC), High Efficiency Video Coding (HEVC), etc. to Versatile Video Coding (VVC).
- the video encoder 200 includes: a segmentation module 21.
- the segmentation module 21 is used to divide the video frame to be encoded into multiple rectangular image blocks and to decide, according to the actual encoding situation, whether to segment the rectangular image blocks further. Specifically, because larger coding blocks are more conducive to coding efficiency, rectangular image blocks with relatively flat pixel values can be segmented fewer times, giving a shallower segmentation depth. Conversely, rectangular image blocks with more complex textures, or located at the edge of the video frame, can be segmented more times, giving a deeper segmentation depth.
- the segmentation module 21 will also segment at least one rectangular image block into two irregular sub-image blocks based on a segmentation curve that fits the real texture edge in the rectangular image block.
- the video encoder 200 provided in some embodiments of the present disclosure further includes a motion estimation module 22.
- the motion estimation module 22 is used to perform motion estimation (ME) on each coding block (including rectangular image blocks that are not further divided and irregular sub-image blocks) to obtain a reference block (RB) corresponding to each coding block.
- the video encoder 200 provided in some embodiments of the present disclosure further includes a motion compensation module 23.
- the motion compensation module 23 is used to perform motion compensation (MC) on the reference block to obtain a prediction block and a motion vector (MV) corresponding to each coding block.
- the video encoder 200 provided in some embodiments of the present disclosure further includes a residual calculation module 24.
- the residual calculation module 24 is used to calculate the residual between the prediction block and the corresponding coding block to obtain a prediction residual.
- the video encoder 200 provided in some embodiments of the present disclosure further includes a quantization transformation module 25.
- the quantization transformation module 25 is used to transform and quantize the prediction residual to obtain a transformation coefficient.
- the video encoder 200 provided in some embodiments of the present disclosure further includes an entropy coding module 26.
- the entropy coding module 26 is used to perform entropy coding on information such as segmentation curves and transformation coefficients to obtain code stream data of a video frame to be encoded.
- the above encoding process only describes some steps in the video encoding process.
- the video encoding process also includes other steps.
- the prediction mode, transform mode and other information are input into the entropy coding module 26, and the entropy coding results of the motion vector index number in the motion vector derivation table, the prediction mode, the transform mode and other information are added to the coded data of the video frame.
- a video decoder 300 provided by some embodiments of the present disclosure includes an entropy decoding module 31.
- the entropy decoding module 31 is used to perform entropy decoding on the bitstream data of the video frame to be reconstructed and to obtain, according to the entropy decoding result, at least one of the following: a segmentation curve for segmenting a rectangular image block of the video frame to be reconstructed into two irregular sub-image blocks, motion vectors of the two sub-image blocks, transformation coefficients of the two sub-image blocks, motion vectors of the rectangular image block, and transformation coefficients of the rectangular image block.
- the video decoder 300 provided in some embodiments of the present disclosure further includes a motion estimation module 32.
- the motion estimation module 32 is used to determine a reference block corresponding to each block to be reconstructed (including rectangular image blocks and irregular sub-image blocks that are not further divided).
- the video decoder 300 provided in some embodiments of the present disclosure further includes a motion compensation module 33.
- the motion compensation module 33 is used to perform motion compensation on the reference block according to the motion vector to obtain a prediction block corresponding to the block to be reconstructed.
- the video decoder 300 provided in some embodiments of the present disclosure further includes: an inverse quantization transformation module 34.
- the inverse quantization transformation module 34 is used to perform inverse quantization and inverse transformation operations on transformation coefficients to obtain prediction residuals.
- the video decoder 300 provided in some embodiments of the present disclosure further includes a fusion module 35.
- the fusion module 35 is used to add and fuse the prediction residual and the prediction block to obtain the reconstructed data of the block to be reconstructed.
- the video decoder 300 provided in some embodiments of the present disclosure further includes a reconstruction module 36.
- the reconstruction module 36 is used to reconstruct each block to be reconstructed according to the reconstruction data of the block to be reconstructed to obtain a reconstructed video frame.
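- The relationship between the residual calculation, quantization, inverse quantization and fusion modules described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the transform stage is omitted, a uniform scalar quantizer with step `qstep` is assumed, and all function names are hypothetical.

```python
def encode_residual(block, prediction, qstep):
    """Residual = block - prediction, followed by uniform scalar quantization.

    The transform stage is deliberately omitted (simplifying assumption).
    """
    return [[round((b - p) / qstep) for b, p in zip(brow, prow)]
            for brow, prow in zip(block, prediction)]


def reconstruct(levels, prediction, qstep):
    """Dequantize the levels and add them to the prediction (the fusion step)."""
    return [[p + lvl * qstep for lvl, p in zip(lrow, prow)]
            for lrow, prow in zip(levels, prediction)]


block = [[52, 55], [61, 59]]
prediction = [[50, 50], [60, 60]]
levels = encode_residual(block, prediction, qstep=2)
recon = reconstruct(levels, prediction, qstep=2)
print(levels, recon)
```

- With this quantizer, each reconstructed sample differs from the original by at most half the quantization step, and a step of 1 reproduces integer blocks exactly.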
- the embodiment of the present disclosure provides a video encoding method. As shown in FIG. 4 , the video encoding method includes the following steps S41 to S46:
- the target image block is a rectangular image block obtained by dividing the video frame into blocks.
- After receiving a video frame to be encoded, the encoder first divides the video frame into multiple image blocks and encodes these image blocks as the minimum coding units.
- the encoding task for the image block can be further refined.
- the image block obtained by dividing the video frame to be encoded is called a coding tree unit (CTU), and the coding tree unit can be further divided into smaller square coding units (CU) by using a quadtree division method.
- the coding tree unit supports a maximum size of 64×64 and a minimum size of 16×16.
- the VVC standard expands the maximum size of the CTU to 128×128, and further division is no longer limited to quadtree division but also supports binary tree and ternary tree division.
- the target image block in the embodiment of the present disclosure can be a coding tree unit, or a coding unit obtained by further dividing the coding tree unit, which is not limited in the embodiment of the present disclosure.
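- The quadtree division of a coding tree unit into smaller square coding units can be sketched as below. The split criterion (pixel variance above a threshold) and the values of `min_size` and `threshold` are assumptions chosen for illustration; the disclosure only states that flatter blocks are divided less and more complex blocks are divided more.

```python
def variance(block):
    """Population variance of all pixel values in a 2D block."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def quadtree_split(block, x=0, y=0, min_size=8, threshold=100.0):
    """Return the leaf coding units of a square block as (x, y, size) tuples.

    A block is split into four quadrants while it is larger than min_size
    and its variance exceeds threshold (hypothetical criterion).
    """
    size = len(block)
    if size <= min_size or variance(block) <= threshold:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            sub = [row[dx:dx + half] for row in block[dy:dy + half]]
            leaves += quadtree_split(sub, x + dx, y + dy, min_size, threshold)
    return leaves


flat = [[10] * 16 for _ in range(16)]
edge = [[0] * 8 + [255] * 8 for _ in range(16)]
print(quadtree_split(flat))  # a flat block stays whole
print(quadtree_split(edge))  # a block with a strong edge is divided
```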
- S42 Acquire a boundary line of the target image block according to a texture edge in the target image block.
- the boundary line is used to divide the target image block into two image areas.
- the above step S42 (obtaining the boundary line of the target image block according to the texture edge in the target image block) includes: dividing the target image block into two image areas based on the texture edge in the target image block, and obtaining the boundary line of the two image areas as the boundary line of the target image block.
- the embodiment of the present disclosure may not further divide the sub-image blocks, but directly encode such image blocks as a coding unit.
- when the texture edges in an image block divide the image block into more than two image areas, the multiple image areas can first be merged into two image areas, and the boundary line of the two image areas can then be obtained as the boundary line of the target image block.
- S43 Obtain the two endpoints of the segmentation curve according to the encoded video content, to obtain a segmentation start point and a segmentation end point.
- the video content that has been encoded in the embodiment of the present disclosure refers to the video content that has been encoded before encoding the target image block.
- the video content that has completed video encoding can be the content in a video frame that is close to the current video frame in time domain, or it can be the content in the current video frame that is close to the target image block in spatial domain.
- S44 Acquire the segmentation curve and control information of the segmentation curve according to the segmentation starting point, the segmentation end point, and the dividing line.
- the disclosed embodiment does not limit the type of segmentation curve.
- different types of segmentation curve have different types of control information. Therefore, the disclosed embodiment does not limit the control information of the segmentation curve, provided that the segmentation curve can be reconstructed according to the segmentation starting point, the segmentation end point, and the control information of the segmentation curve.
- the segmentation curve is a second-order Bezier curve
- the control information of the segmentation curve is the position information of the control points of the second-order Bezier curve.
- the control information of the segmentation curve is the position information of the first control point and the position information of the second control point of the third-order Bezier curve.
- the segmentation curve is a continuous curve.
- the continuous segmentation curve may pass through the interior of a pixel point, which is the smallest unit of a digital image and cannot itself be segmented. The segmentation curve therefore cannot be applied directly to the segmentation of a digital image. When the target image block is segmented into two sub-image blocks based on its segmentation curve, the segmentation curve must first be sampled into a discrete curve, and the target image block is then segmented into two sub-image blocks along the sampled curve.
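- The sampling of a continuous segmentation curve into a discrete curve can be sketched as follows for a second-order Bezier curve. This is only an illustrative sketch under stated assumptions: the curve is assumed to run from the left edge of the block (x = 0) to the right edge (x = width - 1) and to cross each pixel column once, and the function names and the labeling rule (pixels above the curve get label 0, pixels on or below it get label 1) are hypothetical.

```python
def bezier2(p0, p1, p2, t):
    """Point on a second-order Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    return (u * u * p0[0] + 2 * u * t * p1[0] + t * t * p2[0],
            u * u * p0[1] + 2 * u * t * p1[1] + t * t * p2[1])


def sample_curve(p0, p1, p2, width, steps=256):
    """Sample the curve into one integer boundary row per pixel column."""
    boundary = [None] * width
    for i in range(steps + 1):
        x, y = bezier2(p0, p1, p2, i / steps)
        col = min(width - 1, max(0, int(round(x))))
        if boundary[col] is None:
            boundary[col] = int(round(y))
    # Fill any column the sampling missed with its left neighbor's value.
    for col in range(1, width):
        if boundary[col] is None:
            boundary[col] = boundary[col - 1]
    return boundary


def split_mask(boundary, height):
    """Label each pixel 0 (above the sampled curve) or 1 (on or below it)."""
    return [[0 if row < boundary[col] else 1 for col in range(len(boundary))]
            for row in range(height)]


# Start point (0, 2), control point (3.5, 7), end point (7, 2) in an 8x8 block.
boundary = sample_curve((0, 2), (3.5, 7), (7, 2), width=8)
for mask_row in split_mask(boundary, height=8):
    print(mask_row)
```

- Only the control point would need to be signaled in such a scheme if the start point and end point can be derived on both sides, which is the saving the method aims for.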
- S46 Acquire code stream data corresponding to the target image block according to the control information of the segmentation curve and the two sub-image blocks.
- when encoding the target image block obtained by dividing the video frame into blocks, the video encoding method first obtains, according to the texture edge in the target image block, the boundary line used to divide the target image block into two image areas. It then obtains the two endpoints of the segmentation curve according to the video content that has already been encoded, giving the segmentation start point and the segmentation end point. Next, it obtains the segmentation curve and the control information of the segmentation curve according to the segmentation start point, the segmentation end point and the boundary line, and divides the target image block into two sub-image blocks based on the segmentation curve. Finally, it obtains the code stream data corresponding to the target image block according to the control information of the segmentation curve and the two sub-image blocks.
- the decoding end can also obtain the segmentation start point and the segmentation end point according to the video content that has already been encoded, so the encoding end only needs to add the control information of the segmentation curve to the code stream data corresponding to the target image block.
- the decoding end can recover the complete segmentation curve from the control information, so the embodiment of the present disclosure avoids adding the segmentation start point and segmentation end point of the segmentation curve to the code stream data corresponding to the target image block, thereby reducing the bit rate overhead of representing the segmentation curve.
- the embodiment of the present disclosure provides another video encoding method.
- the video encoding method includes the following steps S501 to S515:
- the target image block is a rectangular image block obtained by dividing the video frame into blocks.
- S502 Segment the target image block into two image areas based on a texture edge in the target image block, and obtain a boundary line of the two image areas as a boundary line of the target image block.
- segmenting the target image block into two image regions based on a texture edge in the target image block comprises the following steps a to g:
- Step a converting the target image block into a grayscale image to obtain a grayscale image block corresponding to the target image block.
- the color space of the target image block is a YUV color space
- the pixel value of each pixel of the target image block is composed of a brightness signal Y representing a brightness value and two chrominance signals U and V representing color values.
- the implementation method of converting the target image block into a grayscale image may include: deleting the chrominance signals U and V in the pixel value of each pixel and retaining only the brightness signal Y, thereby converting the target image block into a grayscale image.
- the embodiment of the present disclosure first converts the target image block into a grayscale image and filters out the chromaticity signal in the target image block, thereby reducing the amount of data processing in the subsequent edge detection process.
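- Step a can be sketched as below, assuming for illustration that each pixel is stored as a (Y, U, V) tuple; the storage layout and the function name are assumptions, not part of the disclosure.

```python
def to_grayscale(block_yuv):
    """Keep only the brightness signal Y of each (Y, U, V) pixel."""
    return [[y for (y, _u, _v) in row] for row in block_yuv]


block_yuv = [[(16, 128, 128), (235, 90, 240)],
             [(100, 110, 120), (50, 128, 128)]]
print(to_grayscale(block_yuv))
```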
- Step b filtering the grayscale image block based on a Gaussian kernel of a preset size.
- the Gaussian kernel of the preset size may be a 5*5 Gaussian kernel.
- the Gaussian kernel G_σ may be defined as shown in the following formula (1):
- σ is the standard deviation
- x and y are the distance values from the center of the Gaussian kernel in the X-axis direction and the Y-axis direction respectively.
- the grayscale image block is represented as I
- the grayscale image block after filtering is represented as I_σ
- the Gaussian kernel of the preset size is represented as G_σ.
- Gaussian kernels of other sizes can also be used to filter the grayscale image block.
- For example, a 7*7 Gaussian kernel may be selected to filter the grayscale image block.
- As another example, a 3*3 Gaussian kernel may be selected to filter the grayscale image block.
- the embodiment of the present disclosure does not limit the size of the Gaussian kernel.
- the Gaussian kernel can be selected according to the resolution of the grayscale image block. That is, when the resolution of the grayscale image block is large, a Gaussian kernel with a larger size can be selected, and when the resolution of the grayscale image block is small, a Gaussian kernel with a smaller size can be selected.
- by filtering the grayscale image block through the above step b, the embodiment of the present disclosure can filter out noise in the target image block, thereby improving the accuracy of subsequent texture edge detection.
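- Step b can be sketched as follows. Formula (1) is not reproduced in this text, so the sketch assumes the usual two-dimensional Gaussian, proportional to exp(-(x² + y²) / (2σ²)), and normalizes the kernel so that its entries sum to 1. The border handling (replicating edge pixels) and the function names are likewise assumptions.

```python
import math


def gaussian_kernel(size=5, sigma=1.0):
    """Normalized size x size Gaussian kernel; x, y are offsets from the center."""
    half = size // 2
    kernel = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
               for x in range(-half, half + 1)]
              for y in range(-half, half + 1)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def gaussian_filter(image, kernel):
    """Filter a 2D image with the kernel, replicating border pixels."""
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for ky, krow in enumerate(kernel):
                for kx, kv in enumerate(krow):
                    rr = min(h - 1, max(0, r + ky - half))
                    cc = min(w - 1, max(0, c + kx - half))
                    acc += kv * image[rr][cc]
            out[r][c] = acc
    return out


noisy = [[10, 10, 10, 10, 10],
         [10, 10, 10, 10, 10],
         [10, 10, 90, 10, 10],
         [10, 10, 10, 10, 10],
         [10, 10, 10, 10, 10]]
smoothed = gaussian_filter(noisy, gaussian_kernel(5, 1.0))
print(round(smoothed[2][2], 2))  # the isolated spike is strongly attenuated
```

- A larger kernel or a larger σ smooths more aggressively, which matches the remark above that the kernel size can be chosen according to the resolution of the grayscale image block.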
- Step c performing texture edge detection on the target image block to obtain at least one texture edge of the target image block.
- the texture edge of the target image block refers to an image region in the target image block where the brightness changes significantly.
- the edge detection algorithm used for texture edge detection of the target image block is not limited; any algorithm capable of obtaining the texture edges of the target image block may be used.
- the texture edge detection of the target image block can be performed using edge detection algorithms such as Sobel, Prewitt, Roberts, Canny, and Marr-Hildreth.
- step c is described below by using the Canny edge detection algorithm to perform texture edge detection on the grayscale image block.
- Using the Canny edge detection algorithm to perform texture edge detection on the grayscale image block includes the following steps c1 to c3:
- Step c1 calculating the gradient value and gradient direction of the grayscale image block after filtering.
- the gradient value and gradient direction of the grayscale image block refer to the global gradient value and global gradient direction of the grayscale image block.
- the global gradient value includes the gradient value of each pixel in the grayscale image block
- the global gradient direction includes the gradient direction of each pixel in the grayscale image block.
- calculating the gradient value and gradient direction of the grayscale image block after filtering may include: calculating the gradient value and gradient direction of the grayscale image block after filtering based on a Sobel operator, a Roberts operator, a Prewitt operator, or a Laplacian operator.
- step c1 is described below by taking the calculation of the gradient value and gradient direction of the grayscale image block after filtering based on the Sobel operator as an example.
- G is the gradient value of the grayscale image block after filtering
- θ is the gradient direction of the grayscale image block after filtering
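- A minimal sketch of step c1 with the Sobel operator. The formulas used here are the usual ones (magnitude as the square root of the sum of squared X and Y gradients, direction as the arctangent of their ratio); the function name, the list-of-lists image representation and the zeroed border are assumptions of this sketch, not details taken from the disclosure.

```python
import math

# Standard 3x3 Sobel kernels for the X-axis and Y-axis directions.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_gradient(img):
    """Return (G, theta) for a 2-D list of grayscale values.

    G[y][x] is the gradient value and theta[y][x] the gradient direction
    in degrees within [0, 360). Border pixels are left at zero (assumption).
    """
    h, w = len(img), len(img[0])
    G = [[0.0] * w for _ in range(h)]
    theta = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    p = img[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * p
                    gy += SOBEL_Y[dy][dx] * p
            G[y][x] = math.hypot(gx, gy)
            theta[y][x] = math.degrees(math.atan2(gy, gx)) % 360.0
    return G, theta
```

For a block containing a vertical step edge, the gradient direction at the edge is 0° (pointing across the edge toward the brighter side).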
- Step c2 performing non-maximum suppression on the gradient value of the grayscale image block after filtering according to the gradient value and gradient direction of the grayscale image block after filtering.
- the texture edge directly obtained based on the gradient value is too rough, so it is necessary to perform non-maximum suppression on the gradient value to refine the edge.
- the above-mentioned step c2 (performing non-maximum suppression on the gradient value of the grayscale image block after filtering according to the gradient value and gradient direction of the grayscale image block after filtering) includes: traversing the pixel points of the grayscale image block after filtering, and determining whether the gradient value of the current pixel point is the maximum value among the surrounding pixel points with similar gradient direction; if so, retaining the current pixel point as a candidate edge point, if not, deleting the current pixel point from the candidate edge points.
- the implementation method for determining whether the gradient value of the current pixel is the maximum among the surrounding pixels with similar gradient directions may be: discretizing the gradient direction θ into eight directions (0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315°); traversing each pixel belonging to the texture edge; and obtaining the gradient value of the nearest pixel in the corresponding first direction and the gradient value of the nearest pixel in the corresponding second direction. If the gradient value of the current pixel is greater than both of these gradient values, the current pixel is retained as a pixel belonging to the texture edge; otherwise, the current pixel is suppressed and removed from the texture edge.
- the first direction corresponding to the pixel is the direction with the smallest angle with the gradient direction of the pixel among the eight directions
- the second direction corresponding to the pixel is the opposite direction of the first direction corresponding to the pixel.
- for example, the gradient direction of pixel a is θ, and among the eight discretized gradient directions the one with the smallest angle to θ is the 45° direction. The pixel closest to pixel a in the 45° direction is pixel b; the opposite of the 45° direction is the 225° direction, and the pixel closest to pixel a in the 225° direction is pixel c. The gradient value G_b of pixel b and the gradient value G_c of pixel c are therefore obtained. If the gradient value G_a of pixel a is greater than both G_b and G_c, pixel a is retained as a pixel belonging to the texture edge; if G_a is less than G_b and/or G_a is less than G_c, pixel a is suppressed and removed from the texture edge.
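- The non-maximum suppression of step c2 can be sketched as follows. The function name, the (dx, dy) offset table with the y axis pointing downward, and the zeroed border are assumptions made for illustration.

```python
def non_max_suppress(G, theta):
    """Keep a pixel only if its gradient value exceeds both neighbors along
    its discretized gradient direction. theta is given in degrees."""
    # Offsets (dx, dy) for 0, 45, 90, ..., 315 degrees, with y growing downward.
    offsets = [(1, 0), (1, 1), (0, 1), (-1, 1),
               (-1, 0), (-1, -1), (0, -1), (1, -1)]
    h, w = len(G), len(G[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # First direction: the discrete direction closest to theta.
            k = int(round((theta[y][x] % 360.0) / 45.0)) % 8
            dx, dy = offsets[k]
            g_first = G[y + dy][x + dx]
            g_second = G[y - dy][x - dx]  # second (opposite) direction
            if G[y][x] > g_first and G[y][x] > g_second:
                out[y][x] = G[y][x]
    return out
```

Suppressed pixels are set to zero here, so the result can be passed directly to a thresholding stage.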
- Step c3 determining the texture edge of the target image block based on the first threshold and the second threshold.
- the first threshold is smaller than the second threshold.
- the above step c3 (determining the texture edge of the target image block based on the first threshold and the second threshold) includes: determining that a pixel point with a gradient value greater than the second threshold is a strong edge point, which belongs to the texture edge; determining that a pixel point with a gradient value less than the first threshold does not belong to the texture edge; and determining that a pixel point with a gradient value between the first threshold and the second threshold is a weak edge point, and further determining whether the weak edge point belongs to the texture edge based on whether it is connected to a strong edge point.
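- A minimal sketch of the double-threshold decision in step c3. It assumes 8-connectivity and lets a weak edge point join the edge through a chain of other weak points, as in the usual Canny hysteresis; the disclosure only states that the weak edge point must be connected to the strong edge, so both choices are assumptions.

```python
from collections import deque


def hysteresis(G, low, high):
    """Return a boolean edge map from gradient values G.

    G > high        -> strong edge point (always kept)
    low <= G <= high -> weak edge point (kept only if connected to a strong one)
    G < low         -> discarded
    """
    h, w = len(G), len(G[0])
    edge = [[False] * w for _ in range(h)]
    queue = deque()
    for y in range(h):
        for x in range(w):
            if G[y][x] > high:
                edge[y][x] = True
                queue.append((y, x))
    # Grow the edge from strong points into neighboring weak points.
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and not edge[ny][nx] and G[ny][nx] >= low):
                    edge[ny][nx] = True
                    queue.append((ny, nx))
    return edge
```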
- Step d performing image region segmentation on the target image block based on the at least one texture edge to obtain an image region set corresponding to the target image block.
- Step e obtaining the number of image regions in the image region set.
- that is, the number of image regions obtained by segmenting the target image block along its texture edges is determined.
- Step f1 obtaining at least two merging schemes of the image region set.
- the image area set includes: image area A, image area B and image area C, so there are three merging schemes.
- Merging scheme 1 is: merging image area A and image area B to obtain image area D, thereby merging the image areas in the image area set into two image areas, image area D and image area C;
- merging scheme 2 is: merging image area A and image area C to obtain image area E, thereby merging the image areas in the image area set into two image areas, image area E and image area B;
- merging scheme 3 is: merging image area B and image area C to obtain image area F, thereby merging the image areas in the image area set into two image areas, image area F and image area A.
- Step f2 obtain the difference of each of the at least two merging schemes.
- the difference of any merging scheme is the absolute difference between the first similarity and the second similarity of the merging scheme; the first similarity and the second similarity of any merging scheme are respectively the sum of the absolute differences in the grayscale values of each pixel point of the two image areas and the corresponding co-located blocks under the merging scheme.
- Step f21 respectively obtain the co-located blocks corresponding to the two image regions under the merging scheme.
- Coding structures include: All Intra (AI), Low Delay (LD), Low Latency (LP), and Random Access (RA).
- the co-located block corresponding to any image region refers to the image region in the adjacent coded video frame of the current video frame that has the same position as the image region.
- Step f22 respectively calculate the sum of the absolute differences of the grayscale values of each pixel of the two image regions under the merging scheme and the corresponding co-located blocks to obtain the first similarity and the second similarity of the merging scheme.
- the first similarity of merging scheme j is expressed as ⁇ j1 , and the first similarity ⁇ j1 of merging scheme j can be calculated by the following formula (9):
- n1 is the number of pixels in the first image region under merging scheme j, which is also the number of pixels in the co-located block corresponding to the first image region; the two remaining terms of formula (9) are, respectively, the brightness value of pixel i in the first image area and the brightness value of pixel i in the co-located block corresponding to the first image area.
- the second similarity of merging scheme j is expressed as ⁇ j2 , and the second similarity ⁇ j2 of merging scheme j can be calculated by the following formula (10):
- Step f23 Calculate the absolute difference between the first similarity and the second similarity of the merging scheme to obtain the difference of the merging scheme.
- Step f3 merge the image regions in the image region set into two image regions by using the merging scheme with the largest difference among the at least two merging schemes.
- for example, the differences of merging scheme 1, merging scheme 2, and merging scheme 3 are D1, D2, and D3 respectively. If D1 > D2 and D1 > D3, the image regions in the image region set are merged into two image regions, image region D and image region C, through merging scheme 1. If D2 > D1 and D2 > D3, the image regions in the image region set are merged into two image regions, image region E and image region B, through merging scheme 2. If D3 > D1 and D3 > D2, the image regions in the image region set are merged into two image regions, image region F and image region A, through merging scheme 3.
- each merging scheme merges the image areas in the image area set into two image areas
- the larger the difference Dj of a merging scheme, the greater the difference in grayscale values between the two image areas obtained by that merging scheme, and the more likely the two image areas are to belong to different image content objects. Therefore, by merging the image areas in the image area set into two image areas through the merging scheme with the largest difference among the at least two merging schemes, the target image block can be more reasonably divided into two image areas.
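- Steps f1 to f3 can be sketched as follows. The sketch enumerates every way of splitting the image regions into two non-empty groups and ignores whether the merged regions are adjacent, which is an assumption; the function names and the dictionary representation of regions are also hypothetical.

```python
from itertools import combinations


def region_sad(pixels, cur, ref):
    """Sum of absolute grayscale differences between a region in the current
    frame (cur) and its co-located block in the reference frame (ref)."""
    return sum(abs(cur[y][x] - ref[y][x]) for (y, x) in pixels)


def best_merging_scheme(regions, cur, ref):
    """regions: dict name -> list of (y, x) pixel positions.

    Returns (group1, group2, difference) for the merging scheme whose two
    similarities differ the most, or None if fewer than two regions exist.
    """
    names = sorted(regions)
    first, rest = names[0], names[1:]  # pin one region to avoid mirrored duplicates
    best = None
    for k in range(len(rest)):         # group 2 must stay non-empty
        for extra in combinations(rest, k):
            g1 = (first,) + extra
            g2 = tuple(n for n in rest if n not in extra)
            s1 = sum(region_sad(regions[n], cur, ref) for n in g1)  # first similarity
            s2 = sum(region_sad(regions[n], cur, ref) for n in g2)  # second similarity
            diff = abs(s1 - s2)
            if best is None or diff > best[2]:
                best = (g1, g2, diff)
    return best
```

With three regions A, B and C this enumerates exactly the three merging schemes described above.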
- the purpose of step f is to find and construct an optimal edge line among multiple edges.
- in the above embodiment, the image areas of the target image block are merged to achieve this purpose, but the embodiments of the present disclosure are not limited to this.
- Other solutions that can solve the problem of selecting multiple edge lines can also be used as variant solutions, such as: using a path-finding algorithm to find the longest path.
- for example, the A* (A-star) graph search algorithm can be used to find the optimal dividing line, yielding a discrete curve whose starting point and end point are both on the boundaries of the coding block.
- S503 Determine whether spatially adjacent pixel points at four edges of the target image block can be obtained in the encoded video content.
- the premise for obtaining the spatially adjacent pixels on the four edges of the target image block in the encoded video content is that the target image block is not located at the edge of the current video frame and the image content around the target image block has completed video encoding.
- FIG. 9 takes the target image block 90 with a resolution of M*N as an example to determine whether the spatially adjacent pixels of the four edges of the target image block 90 can be obtained in the encoded video content, that is, to determine whether the M pixels 91 directly above the target image block, the M pixels 92 directly below the target image block, the N pixels 93 directly to the left of the target image block, and the N pixels 94 directly to the right of the target image block can be obtained in the encoded video content.
- in the above step S503, if the spatially adjacent pixel points of the four edges of the target image block can be obtained in the encoded video content, the following step S504 is performed:
- S504 Acquire a segmentation start point and a segmentation end point of a segmentation curve according to spatially adjacent pixel points at four edges of the target image block.
- the above step S504 (obtaining the segmentation start point and segmentation end point of the segmentation curve according to the spatially adjacent pixel points of the four edges of the target image block) includes the following steps 1 to 3:
- Step 1 Obtain the gradient values of spatially adjacent pixel points at four edges of the target image block in the direction of the corresponding edges.
- the directions of the four edges of the target image block are horizontal or vertical, so obtaining the gradient values of spatially adjacent pixel points at the four edges of the target image block in the direction of the corresponding edges can be simplified to the difference in pixel grayscale values in the horizontal or vertical direction.
- Step 2 Obtain at least two pixel value step points according to the gradient values of spatially adjacent pixel points at four edges of the target image block in the direction of the corresponding edges.
- the pixel value step point is a pixel point having the largest gradient value and a gradient value greater than a threshold gradient value among spatially adjacent pixel points at any edge of the target image block.
- an implementation method for obtaining at least two pixel value step points may include: traversing the spatially adjacent pixels of the four edges of the target image block respectively, and determining whether the gradient value of the current pixel is greater than the current maximum gradient value; if so, updating the current maximum gradient value to the gradient value of the current pixel; if not, determining whether the gradient value of the next pixel is greater than the current maximum gradient value; after completing the traversal of the spatially adjacent pixels of the edge, determining whether the maximum gradient value is greater than the threshold gradient value; if so, determining the pixel corresponding to the maximum gradient value as a pixel value step point; if not, determining that no pixel value step point exists in the spatially adjacent pixels of the current edge.
- since the target image block has four edges and the spatially adjacent pixel points of each edge contain either one pixel value step point or none, at most four and at least zero pixel value step points can be obtained from the gradient values of the spatially adjacent pixel points of the four edges in the direction of the corresponding edges.
- when the number of pixel value step points obtained is 0 or 1, there is no obvious boundary inside the target image block, and continuing with special-shaped coding block division will not bring a significant compression performance improvement.
- in that case the encoder is also very unlikely to select this mode during later coding mode selection, so evaluating it would only reduce coding efficiency.
- therefore, when the number of pixel value step points is 0 or 1, the special-shaped coding block mode is terminated early; when the number is 2, the two pixel value step points are directly determined as the segmentation starting point and segmentation end point of the segmentation curve; and when the number is 3 or 4, the following step 3 is further performed to determine the segmentation starting point and segmentation end point of the segmentation curve.
- FIG. 10 illustrates the above step 3, taking as an example the case where four pixel value step points are obtained according to the gradient values of the spatially adjacent pixels of the four edges of the target image block in the direction of the corresponding edges.
- the four pixel value step points are respectively pixel point 101 in the spatially adjacent pixels of the upper edge, pixel point 102 in the spatially adjacent pixels of the lower edge, pixel point 103 in the spatially adjacent pixels of the left edge, and pixel point 104 in the spatially adjacent pixels of the right edge.
- the gradient value of pixel point 101 is e1
- the gradient value of pixel point 102 is e2
- the gradient value of pixel point 103 is e3
- the gradient value of pixel point 104 is e4 .
- the two pixel value step points with the largest gradient values among e1, e2, e3, and e4 are selected, and the selected two pixel value step points are determined as the segmentation starting point and the segmentation end point.
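- Steps 1 to 3 of S504 can be sketched as follows. The gradient along an edge is taken as the absolute difference between consecutive neighboring pixels, which follows the simplification described above; the function names, the dictionary of edges and the returned (edge, index) pairs are assumptions of this sketch.

```python
def edge_step_point(values, threshold):
    """Return (index, gradient) of the pixel value step point on one edge,
    or None if the largest gradient does not exceed the threshold.

    values: grayscale values of the spatially adjacent pixels of one edge.
    """
    best_idx, best_grad = None, 0
    for i in range(1, len(values)):
        grad = abs(values[i] - values[i - 1])
        if grad > best_grad:
            best_idx, best_grad = i, grad
    if best_idx is not None and best_grad > threshold:
        return best_idx, best_grad
    return None


def select_endpoints(edges, threshold):
    """edges: dict edge name -> neighboring pixel values.

    Returns the two step points with the largest gradients as (edge, index)
    pairs, or None when fewer than two step points exist (early termination).
    """
    points = []
    for name, values in edges.items():
        found = edge_step_point(values, threshold)
        if found is not None:
            points.append((found[1], name, found[0]))
    if len(points) < 2:
        return None
    points.sort(key=lambda p: p[0], reverse=True)
    return [(name, idx) for _, name, idx in points[:2]]
```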
- S505 Obtain the best matching block of the target image block in the encoded video content, and obtain the segmentation start point and the segmentation end point according to the edge pixel points of the best matching block of the target image block.
- the above step S505 (obtaining the best matching block of the target image block in the encoded video content, and obtaining the segmentation start point and the segmentation end point according to the edge pixel points of the best matching block of the target image block) includes the following steps 1 to 4:
- the implementation of obtaining the best matching block of the target image block may include: performing motion estimation for the target image block in a reference video frame of the current video frame to obtain the best matching block of the target image block.
- Coding structures include: All Intra (AI), Low Delay (LD), Low Latency (LP), and Random Access (RA).
- the commonly used method for motion estimation is the block matching method, which focuses on the block matching criteria and search methods.
- SAD Sum of Absolute Difference
- MSE Mean Squared Error
- NCCF Normalized Cross Correlation Function
- M×N is the resolution of the target image block
- f(x,y) is the pixel value at (x,y) of the target image block
- g(x,y) is the reconstructed pixel value at (x,y).
- Search methods include full search, cross search, two-dimensional logarithmic search and other fast search algorithms.
- the full search algorithm is extremely complex and cannot meet the requirements of real-time coding.
- the TZ search algorithm is a commonly used fast search algorithm. Compared with the full search algorithm, the performance is slightly reduced, while the search time is greatly reduced.
- the motion estimation in the disclosed embodiment can use the TZ search fast algorithm to perform a diamond search, select SAD as the matching criterion, and finally obtain an optimal integer pixel motion vector.
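- To illustrate block matching with the SAD criterion, the sketch below uses an exhaustive full search over a small window rather than the TZ search named above, purely for brevity. The function names, the (row, column) ordering of positions and motion vectors, and the skipping of windows that fall outside the reference frame are assumptions.

```python
def sad(block, ref, oy, ox):
    """Sum of absolute differences between block and the same-size window
    of ref whose top-left corner is at row oy, column ox."""
    return sum(
        abs(block[y][x] - ref[oy + y][ox + x])
        for y in range(len(block))
        for x in range(len(block[0]))
    )


def full_search(block, ref, pos, search_range):
    """Return (motion_vector, cost) minimizing SAD within +/- search_range
    of pos = (row, column). The motion vector is (d_row, d_column)."""
    n, m = len(block), len(block[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            oy, ox = pos[0] + dy, pos[1] + dx
            if oy < 0 or ox < 0 or oy + n > len(ref) or ox + m > len(ref[0]):
                continue  # window falls outside the reference frame
            cost = sad(block, ref, oy, ox)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost
```

A fast algorithm such as TZ search visits only a subset of these candidate positions, trading a small loss in matching quality for a much shorter search time.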
- Step 2 obtaining at least two pixel value step points according to the gradient values of the pixel points on the four edges of the best matching block of the target image block in the direction of the corresponding edges.
- the pixel value step point is a pixel point having the largest gradient value and a gradient value greater than a threshold gradient value among the pixel points on any edge of the best matching block of the target image block.
- the implementation of step 2 and step 3 here is similar to that of step 2 and step 3 of step S504 above, and is not repeated.
- Step 4 determining the segmentation starting point and the segmentation end point according to the two pixel value step points with the largest gradient values and the motion vector of the best matching block of the target image block respectively.
- determining the segmentation starting point and the segmentation end point in the target image block according to the two pixel value step points with the largest gradient values and the motion vector of the best matching block includes: subtracting the motion vector of the best matching block from the position coordinates of the two pixel value step points with the largest gradient values, in the horizontal and vertical directions respectively, to obtain the position coordinates of the segmentation starting point and the segmentation end point.
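- A small sketch of step 4. It assumes that the motion vector is the displacement from the target image block to its best matching block, so that subtracting it maps a point on the matching block back onto the target image block; the function name and the (x, y) tuple layout are hypothetical.

```python
def map_to_target(points, mv):
    """Shift step points found on the best matching block onto the target block.

    points: [(x, y), ...] in frame coordinates.
    mv: (mv_x, mv_y), displacement from the target block to the matching block.
    """
    return [(x - mv[0], y - mv[1]) for (x, y) in points]
```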
- t ∈ [0,1] is the endpoint interpolation variable of the Bezier curve function.
- the boundary line is fitted with a third-order Bezier curve, that is, the coordinates of a group of points on the boundary line (x0, y0), ..., (xm, ym) are substituted into the following formula (14), and the coordinates of the first control point p1 and the second control point p2 are solved according to the coefficients of the Bezier curve obtained when the value of formula (14) is minimum:
- the dividing line is shown as line 111 in FIG. 11
- the third-order Bezier curve obtained by fitting the dividing line, with the segmentation starting point and the segmentation end point as its two endpoints, is shown as line 112 in FIG. 11.
- the starting point of the fitted third-order Bezier curve is the segmentation starting point s
- the end point of the fitted third-order Bezier curve is the segmentation end point e
- the first control point and the second control point of the fitted third-order Bezier curve are p1 and p2 respectively.
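- A least-squares fit of this kind can be sketched as follows. The endpoints are fixed to the first and last points of the dividing line, and each point is assigned a parameter t by chord length; the chord-length parameterization, the function names and the closed-form 2x2 solve are assumptions, since the disclosure does not state how t is assigned or how the minimum is found. At least four points with distinct positions are required.

```python
import math


def chord_params(points):
    """Chord-length parameters t_i in [0, 1] for an ordered point list."""
    d = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        d.append(d[-1] + math.hypot(x1 - x0, y1 - y0))
    return [v / d[-1] for v in d]


def fit_cubic_bezier(points):
    """Return control points (p1, p2) of the cubic Bezier curve that starts at
    points[0], ends at points[-1] and minimizes the squared distance to points."""
    s, e = points[0], points[-1]
    saa = sab = sbb = 0.0
    sar = [0.0, 0.0]
    sbr = [0.0, 0.0]
    for q, t in zip(points, chord_params(points)):
        a = 3 * (1 - t) ** 2 * t   # weight of p1
        b = 3 * (1 - t) * t ** 2   # weight of p2
        saa += a * a
        sab += a * b
        sbb += b * b
        for k in range(2):
            # Residual after removing the fixed endpoint contributions.
            r = q[k] - (1 - t) ** 3 * s[k] - t ** 3 * e[k]
            sar[k] += a * r
            sbr[k] += b * r
    det = saa * sbb - sab * sab   # normal-equation determinant
    p1 = tuple((sbb * sar[k] - sab * sbr[k]) / det for k in range(2))
    p2 = tuple((saa * sbr[k] - sab * sar[k]) / det for k in range(2))
    return p1, p2
```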
- S507 Determine whether the first control point and the second control point are located on the same side of a straight line connecting the segmentation start point and the segmentation end point.
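- One way to perform the check in S507 is to compare the signs of two cross products. The sketch below also returns the Bezier order that the result implies (second order when both control points are on the same side, third order otherwise). Treating a control point that lies exactly on the line as "not on the same side" is an assumption, as are the function names.

```python
def side(s, e, p):
    """Cross product (e - s) x (p - s): positive on one side of the line
    through s and e, negative on the other, zero on the line itself."""
    return (e[0] - s[0]) * (p[1] - s[1]) - (e[1] - s[1]) * (p[0] - s[0])


def bezier_order(s, e, p1, p2):
    """Return 2 if p1 and p2 lie strictly on the same side of line s-e, else 3."""
    return 2 if side(s, e, p1) * side(s, e, p2) > 0 else 3
```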
- if the first control point and the second control point of the fitted third-order Bezier curve are on the same side of the straight line connecting the segmentation starting point and the segmentation end point, the segmentation curve is a single arch, so a second-order Bezier curve is selected for fitting to further reduce the bit rate overhead of representing the segmentation curve. If the two control points are on opposite sides of that straight line, the segmentation curve is a double arch and a second-order Bezier curve would fit it poorly, so the third-order Bezier curve is selected to improve the fitting accuracy of the segmentation curve. That is, if it is determined in the above step S507 that the first control point and the second control point are not on the same side of the straight line connecting the segmentation starting point and the segmentation end point, the following step S508 is executed:
- the above step S508 (obtaining the segmentation curve according to the segmentation starting point, the segmentation end point, the first control point and the second control point) includes the following steps 5081 to 5083:
- Step 5081 Determine a first pixel point and a second pixel point according to the segmentation starting point, the segmentation end point, the first control point, and the second control point.
- the first pixel point is the pixel point that is located on the one-third perpendicular line of the line segment with the segmentation starting point and the segmentation end point as endpoints (the line perpendicular to the segment through the point one third of the way along it) and is closest to the first control point
- the second pixel point is the pixel point that is located on the two-thirds perpendicular line of the same line segment (the line perpendicular to the segment through the point two thirds of the way along it) and is closest to the second control point.
- the line segment with the segmentation starting point s and the segmentation end point e as endpoints is line segment 121
- the one-third perpendicular line of line segment 121 is straight line 122
- the pixel point located on straight line 122 and closest to the first control point p1 is pixel a
- the two-thirds perpendicular line of line segment 121 is straight line 123
- the pixel point located on straight line 123 and closest to the second control point p2 is pixel b. Therefore, the first pixel point a and the second pixel point b can be determined according to the segmentation starting point s, the segmentation end point e, the first control point p1 and the second control point p2.
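- Step 5081 can be sketched as an orthogonal projection of each control point onto the corresponding perpendicular line. Measuring the fractions 1/3 and 2/3 from the segmentation starting point s, and rounding the projection to the nearest integer position to obtain a pixel, are assumptions of this sketch; the function name is hypothetical.

```python
def pixel_on_perpendicular(s, e, p, frac):
    """Pixel on the line perpendicular to segment s-e through the point at
    fraction `frac` along it (measured from s) that is closest to point p."""
    dx, dy = e[0] - s[0], e[1] - s[1]
    ax, ay = s[0] + frac * dx, s[1] + frac * dy   # point on s-e at the given fraction
    nx, ny = -dy, dx                              # direction of the perpendicular line
    # Scalar projection of (p - a) onto the perpendicular direction.
    k = ((p[0] - ax) * nx + (p[1] - ay) * ny) / (nx * nx + ny * ny)
    return round(ax + k * nx), round(ay + k * ny)
```

Calling the same function with `frac=0.5` projects a point onto the perpendicular bisector of the segment.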
- Quantizing the position coordinates of the first pixel point and the position coordinates of the second pixel point can reduce the bit rate overhead caused by representing the control information of the segmentation curve.
- Step 5083 Generate the segmentation curve according to the segmentation starting point, the segmentation end point, the first quantization control point, the second quantization control point and the third-order Bezier curve formula.
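- Generating the curve from its two endpoints and two control points amounts to evaluating the third-order Bezier polynomial. The sketch below evaluates it and samples it at evenly spaced values of t, rounding to pixel positions; the number of samples and the removal of consecutive duplicate pixels are assumptions, and the function names are hypothetical.

```python
def cubic_bezier(s, p1, p2, e, t):
    """Point on the cubic Bezier curve with endpoints s, e and control
    points p1, p2 at parameter t in [0, 1]."""
    u = 1.0 - t
    return tuple(
        u ** 3 * s[k] + 3 * u ** 2 * t * p1[k] + 3 * u * t ** 2 * p2[k] + t ** 3 * e[k]
        for k in range(2)
    )


def sample_curve(s, p1, p2, e, n):
    """Sample the curve at t = 0, 1/n, ..., 1 and round to pixel positions."""
    pts = []
    for i in range(n + 1):
        x, y = cubic_bezier(s, p1, p2, e, i / n)
        pix = (round(x), round(y))
        if not pts or pts[-1] != pix:   # drop consecutive duplicates
            pts.append(pix)
    return pts
```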
- in the above step S507, if it is determined that the first control point and the second control point are located on the same side of the straight line connecting the segmentation starting point and the segmentation end point, the following step S509 is performed:
- t ∈ [0,1] is the endpoint interpolation variable of the Bezier curve function.
- a second-order Bezier curve is fitted to the boundary line, that is, the coordinates of a group of points on the boundary line (x0, y0), ..., (xm, ym) are substituted into the following formula (16), and the coordinates of the control point p are solved according to the coefficients of the Bezier curve obtained when the value of formula (16) is minimum:
- Bix and Biy are the x-coordinate and y-coordinate of the i-th point after fitting
- Xi and Yi are the x-coordinate and y-coordinate of the i-th point on the dividing line.
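- The second-order fit can be sketched in the same least-squares manner: with both endpoints fixed, the single control point p has a closed-form solution. As with the third-order case, the chord-length assignment of t and the function name are assumptions rather than details from the disclosure, and at least three points are required.

```python
import math


def fit_quadratic_bezier(points):
    """Return the control point p of the quadratic Bezier curve that starts at
    points[0], ends at points[-1] and minimizes the sum of squared distances
    between the fitted points (Bix, Biy) and the given points (Xi, Yi)."""
    s, e = points[0], points[-1]
    # Chord-length parameters t_i in [0, 1].
    d = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        d.append(d[-1] + math.hypot(x1 - x0, y1 - y0))
    ts = [v / d[-1] for v in d]
    saa = 0.0
    sar = [0.0, 0.0]
    for q, t in zip(points, ts):
        a = 2 * (1 - t) * t   # weight of the control point
        saa += a * a
        for k in range(2):
            # Residual after removing the fixed endpoint contributions.
            sar[k] += a * (q[k] - (1 - t) ** 2 * s[k] - t ** 2 * e[k])
    return (sar[0] / saa, sar[1] / saa)
```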
- the above step S510 (obtaining the segmentation curve according to the segmentation starting point, the segmentation end point and the control point) includes the following steps 5101 to 5103:
- Step 5101 determine the target pixel point according to the segmentation starting point, the segmentation end point and the control point.
- the line segment with the segmentation starting point s and the segmentation end point e as endpoints is line segment 141
- the perpendicular bisector of line segment 141 is straight line 142
- the pixel point located on straight line 142 and closest to the control point p is pixel point c. Therefore, the target pixel point c can be determined based on the segmentation starting point s, the segmentation end point e and the control point p.
- Step 5102 quantize the position coordinates of the target pixel point to obtain a quantized control point.
- after obtaining the segmentation curve through the above step S508 or S510, the embodiment of the present disclosure continues to perform the following step S511:
- S512 Divide the target image block into two sub-image blocks according to the sampling result of the segmentation curve.
- the target image block may be segmented into two sub-image blocks, namely, a sub-image block 161 and a sub-image block 162 .
- encoding the control information of the segmentation curve of the target image block to obtain the first bitstream data includes: entropy encoding the position information of the control point of the second-order Bezier curve (that is, the position information of the quantized control point) to obtain the first bitstream data.
- encoding a sub-image block to obtain code stream data of the sub-image block may include: first, performing motion estimation on the sub-image block to obtain a reference block corresponding to the sub-image block, then performing motion compensation on the reference block to obtain a prediction block and motion vector corresponding to each coding block, and calculating the residual between the prediction block and the sub-image block to obtain a prediction residual, and finally transforming and quantizing the prediction residual to obtain a transform coefficient, and entropy encoding the motion vector and transform coefficient to obtain the code stream data of the sub-image block.
- S515 Generate code stream data corresponding to the target image block according to the first code stream data, the second code stream data, and the third code stream data.
- obtaining the control information of the segmentation curve according to the first bitstream data includes: performing entropy decoding on the first bitstream data to obtain position information of a first control point and position information of a second control point of a third-order Bezier curve.
- S174 Acquire the segmentation curve according to the control information of the segmentation curve, the segmentation starting point, and the segmentation end point.
- the step S174 includes: generating the segmentation curve according to the position information of the first control point, the position information of the second control point, the segmentation start point, the segmentation end point and a third-order Bezier curve.
- the target image block is a rectangular image block obtained by dividing the video frame into blocks.
- the VVC standard expands the maximum size of CTU to 128×128, and the further division method is no longer limited to quadtree division, but also supports binary tree and ternary tree division.
- the target image block in the embodiment of the present disclosure can be a coding tree unit, or a coding unit obtained by further dividing the coding tree unit, which is not limited in the embodiment of the present disclosure.
- the texture edge of the target image block refers to an image region in the target image block where the brightness changes significantly.
- the edge detection algorithm used for performing texture edge detection on the target image block is not limited; any edge detection algorithm capable of obtaining the texture edge of the target image block may be used.
- the texture edge detection of the target image block can be performed using edge detection algorithms such as Sobel, Prewitt, Roberts, Canny, and Marr-Hildreth.
- the above step S184 (performing curve fitting on the dividing line to obtain the segmentation curve of the target image block) includes: performing Bezier curve fitting on the dividing line, and determining the curve obtained by performing Bezier curve fitting on the dividing line as the segmentation curve of the target image block.
- the video encoding method when encoding a target image block obtained by block-dividing a video frame, first performs texture edge detection on the target image block to obtain at least one texture edge of the target image block, then divides the target image block into two image areas based on the at least one texture edge, and obtains a boundary line between the two image areas, then performs curve fitting on the boundary line to obtain a segmentation curve of the target image block, and divides the target image block into two sub-image blocks based on the segmentation curve of the target image block, and finally obtains code stream data corresponding to the target image block according to the segmentation curve of the target image block and the two sub-image blocks.
- when the color space of the target image block is a YUV color space, the pixel value of each pixel of the target image block is composed of a brightness signal Y representing a brightness value and two chrominance signals U and V representing color values.
- the implementation method of converting the target image block into a grayscale image may include: deleting the chrominance signals U and V in the pixel value of each pixel and retaining only the brightness signal Y, thereby converting the target image block into a grayscale image.
- when the color space of the target image block is an RGB color space, the implementation method of converting the target image block into a grayscale image may include: converting the color space of the target image block from the RGB color space to the YUV color space, then deleting the chrominance signals U and V in the pixel value of each pixel point and retaining only the brightness signal Y, thereby converting the target image block into a grayscale image.
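- Both conversions can be sketched as follows. The disclosure does not name a conversion matrix for RGB to YUV; the BT.601 luma weights used here are an assumption, as are the function names and the tuple-per-pixel representation.

```python
def yuv_to_gray(block):
    """block: 2-D list of (Y, U, V) tuples -> 2-D list keeping only Y."""
    return [[y for (y, _u, _v) in row] for row in block]


def rgb_to_gray(block):
    """block: 2-D list of (R, G, B) tuples -> 2-D list of luma values.

    Computes only the Y component of the RGB-to-YUV conversion, since U and V
    would be discarded anyway. The weights are BT.601 (assumption).
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in block
    ]
```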
- the Gaussian kernel of the preset size may be a 5*5 Gaussian kernel.
- the grayscale image block is represented as I
- the grayscale image block after filtering is represented as I_σ
- the Gaussian kernel of the preset size is represented as G_σ.
- because the embodiment of the present disclosure also filters the target image block through the above step S1903, noise in the target image block can be filtered out, thereby improving the accuracy of subsequent texture edge detection.
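- A minimal sketch of filtering the grayscale image block with a 5*5 Gaussian kernel. The standard deviation of 1.0, the replication of border pixels and the function names are assumptions; the disclosure specifies only the kernel size.

```python
import math


def gaussian_kernel(size=5, sigma=1.0):
    """Return a size x size Gaussian kernel normalized to sum to 1."""
    half = size // 2
    k = [
        [math.exp(-(x * x + y * y) / (2 * sigma * sigma))
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]


def gaussian_filter(img, kernel):
    """Filter img with kernel, replicating border pixels (assumption)."""
    h, w = len(img), len(img[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j, row in enumerate(kernel):
                for i, kv in enumerate(row):
                    yy = min(max(y + j - half, 0), h - 1)  # clamp to the block
                    xx = min(max(x + i - half, 0), w - 1)
                    acc += kv * img[yy][xx]
            out[y][x] = acc
    return out
```

Because the kernel is normalized, flat areas keep their brightness while isolated noisy pixels are averaged with their neighbors.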
- Step a Calculate the gradient value and gradient direction of the grayscale image block after filtering.
- calculating the gradient value and gradient direction of the grayscale image block after filtering may include: calculating the gradient value and gradient direction of the grayscale image block after filtering based on a Sobel operator, a Roberts operator, a Prewitt operator, or a Laplacian operator.
- step a is described below by taking the calculation of the gradient value and gradient direction of the grayscale image block after filtering based on the Sobel operator as an example.
- the gradient value and gradient direction of the grayscale image block after filtering are calculated according to the gradient values of the grayscale image block after filtering in the X-axis direction and the Y-axis direction and the following formulas (23) and (24):
- G is the gradient value of the grayscale image block after filtering
- θ is the gradient direction of the grayscale image block after filtering
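- Formulas (23) and (24) are not reproduced in this text. As a hedged sketch, the following assumes the usual Sobel definitions, namely a gradient value equal to the square root of Gx² + Gy² and a gradient direction equal to the arctangent of Gy/Gx; the kernels, the border handling, and the function names are illustrative assumptions.

```python
import math

# Standard 3x3 Sobel kernels for the X-axis and Y-axis directions.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply3(img, kernel, y, x):
    """Apply a 3x3 kernel centred on (y, x), replicating border pixels."""
    h, w = len(img), len(img[0])
    acc = 0.0
    for ky in range(3):
        for kx in range(3):
            yy = min(max(y + ky - 1, 0), h - 1)
            xx = min(max(x + kx - 1, 0), w - 1)
            acc += kernel[ky][kx] * img[yy][xx]
    return acc


def sobel_gradient(img):
    """Return (gradient value, gradient direction in degrees) for every pixel."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = apply3(img, SOBEL_X, y, x)
            gy = apply3(img, SOBEL_Y, y, x)
            mag[y][x] = math.hypot(gx, gy)             # assumed form of formula (23)
            ang[y][x] = math.degrees(math.atan2(gy, gx))  # assumed form of formula (24)
    return mag, ang


# A vertical step edge: the gradient points along the X axis.
step = [[0, 0, 10, 10] for _ in range(3)]
mag, ang = sobel_gradient(step)
```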
- Step b performing non-maximum suppression on the gradient value of the grayscale image block after filtering according to the gradient value and gradient direction of the grayscale image block after filtering.
- the texture edge directly obtained based on the gradient value is too rough, so it is necessary to perform non-maximum suppression on the gradient value to refine the edge.
- the above step b (performing non-maximum suppression on the gradient value of the grayscale image block after filtering according to the gradient value and gradient direction of the grayscale image block after filtering) includes: traversing the pixel points of the grayscale image block after filtering, and determining whether the gradient value of the current pixel point is the maximum value among the surrounding pixel points with similar gradient direction; if so, retaining the current pixel point as a candidate edge point, if not, deleting the current pixel point from the candidate edge points.
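- The traversal in step b can be sketched as follows, assuming the gradient direction is discretised into eight directions at 45° intervals and each pixel is compared with its nearest neighbour along that direction and along the opposite direction. The offset table and function name are hypothetical.

```python
# Offsets (dy, dx) for the eight discretised directions, indexed by angle / 45.
# Image rows grow downwards, so the 90 degree direction points to the next row.
OFFSETS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]


def non_max_suppression(mag, ang):
    """Return a boolean map that is True where a pixel stays a candidate edge point."""
    h, w = len(mag), len(mag[0])
    keep = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            k = int(round(ang[y][x] / 45.0)) % 8
            dy, dx = OFFSETS[k]
            neighbours = []
            # Nearest pixel along the direction and along the opposite direction.
            for sy, sx in ((dy, dx), (-dy, -dx)):
                yy, xx = y + sy, x + sx
                if 0 <= yy < h and 0 <= xx < w:
                    neighbours.append(mag[yy][xx])
            keep[y][x] = mag[y][x] > 0 and all(mag[y][x] > g for g in neighbours)
    return keep


# One row of gradient values, all with a horizontal (0 degree) gradient direction.
example_mag = [[1.0, 5.0, 2.0]]
example_ang = [[0.0, 0.0, 0.0]]
example_keep = non_max_suppression(example_mag, example_ang)
```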
- the gradient direction of pixel a is θ
- among the eight directions obtained by discretizing the gradient direction, the direction forming the smallest angle with the gradient direction θ of pixel a is the 45° direction
- the pixel closest to pixel a in the 45° direction is pixel b
- the opposite direction of the 45° direction is the 225° direction
- the pixel closest to pixel a in the 225° direction is pixel c
- the gradient value G b of pixel b and the gradient value G c of pixel c are obtained, if the gradient value G a of pixel a is greater than the gradient value G b of pixel b, and the gradient value G a of pixel a is greater than the gradient value G c of pixel c, then pixel a is retained as a pixel belonging to the texture edge, and if the gradient value G a of pixel a is less than the gradient value G b of pixel b and/or the gradient value G a of
- the first threshold is smaller than the second threshold.
- the weak edge point is classified as a texture edge, recorded as r(x,y), and the above operation is repeated with r(x,y) as the starting point, so as to obtain the texture edge of the current image block.
- S1905 Perform image region segmentation on the target image block based on the at least one texture edge to obtain an image region set corresponding to the target image block.
- the target image block is likely to have more than one texture edge, which means that the target image block may be divided into more than two image areas. Considering the bit-rate cost, too many sub-image areas do not bring a corresponding performance improvement. Therefore, when the number of image areas is greater than 2, the embodiment of the present disclosure first merges the image areas into two image areas; when the number of image areas is equal to 2, no merging is required. Specifically: if the number of image areas in the image area set is less than 2, the target image block is directly encoded as a whole; if it is equal to 2, jump to step S1908; if it is greater than 2, execute the following step S1907:
- the above step S1907 (merging the image regions in the image region set into two image regions to divide the target image block into two image regions) includes the following steps 1 to 3:
- Step 1 Obtain at least two merging schemes for the image region set.
- the at least two merging schemes include the various possible schemes for merging the image areas in the image area set into two image areas.
- the merging schemes of the image region set may be based on the connectivity of the image regions in the image region set. If two image regions have a common boundary, the two image regions may be merged into one image region.
- the image area set includes: image area A, image area B and image area C, so there are three merging schemes.
- Merging scheme 1 is: merging image area A with image area B to obtain image area D, thereby merging the image areas in the image area set into two image areas, image area D and image area C;
- merging scheme 2 is: merging image area A with image area C to obtain image area E, thereby merging the image areas in the image area set into two image areas, image area E and image area B;
- merging scheme 3 is: merging image area B with image area C to obtain image area F, thereby merging the image areas in the image area set into two image areas, image area F and image area A.
- Step 2 Obtain the difference of each of the at least two merging schemes.
- the difference of any merging scheme is the absolute difference between the first similarity and the second similarity of the merging scheme; the first similarity and the second similarity of any merging scheme are respectively the sum of the absolute differences in the grayscale values of each pixel point of the two image areas and the corresponding co-located blocks under the merging scheme.
- the difference of the merging scheme can be obtained through the following steps 21 to 23:
- Step 21 respectively obtain the same-position blocks corresponding to the two image regions under the merging scheme.
- Step 22 respectively calculate the sum of the absolute differences of the grayscale values of each pixel of the two image regions under the merging scheme and the corresponding co-located blocks to obtain the first similarity and the second similarity of the merging scheme.
- the first similarity of the merge scheme j is expressed as ⁇ j1 , and the first similarity ⁇ j1 of the merge scheme j can be calculated by the following formula (25):
- n1 is the number of pixels in the first image region under the merging scheme j, and n1 is also the number of pixels in the same-position block corresponding to the first image region under the merging scheme j; is the brightness value of pixel i in the first image area; is the brightness value of pixel i in the same-position block corresponding to the first image area.
- the second similarity of the merging scheme j is expressed as ⁇ j2 , and the second similarity ⁇ j2 of the merging scheme j can be calculated by the following formula (26):
- n2 is the number of pixels in the second image area under the merging scheme j, and n2 is also the number of pixels in the same-position block corresponding to the second image area under the merging scheme j; is the brightness value of pixel i in the second image area; is the brightness value of pixel i in the same-position block corresponding to the second image area.
- Step 23 Calculate the absolute difference between the first similarity and the second similarity of the merging scheme to obtain the difference of the merging scheme.
- each merging scheme merges the image areas in the image area set into two image areas
- the larger the difference D j of the merging scheme the greater the difference in the grayscale values of the two image partitions obtained by the merging scheme, and the two image partitions obtained by the merging scheme are more likely to belong to different image content objects. Therefore, by merging the image areas in the image area set into two image areas through the merging scheme with the largest difference among the at least two merging schemes, the target image block can be more reasonably divided into two image areas.
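- Steps 21 to 23 and the choice of the merging scheme with the largest difference can be sketched as below. Each merging scheme is represented here as a pair of pixel-coordinate lists, and `current` and `colocated` are 2-D lists of grayscale values; this data layout and all function names are assumptions made for the illustration.

```python
def sad(region_pixels, current, colocated):
    """Sum of absolute grayscale differences between a region and its co-located block."""
    return sum(abs(current[y][x] - colocated[y][x]) for (y, x) in region_pixels)


def scheme_difference(scheme, current, colocated):
    """Absolute difference between the first and second similarity of one scheme."""
    region_1, region_2 = scheme
    return abs(sad(region_1, current, colocated) - sad(region_2, current, colocated))


def best_merging_scheme(schemes, current, colocated):
    """Pick the merging scheme whose difference is the largest."""
    return max(schemes, key=lambda s: scheme_difference(s, current, colocated))


# Three single-pixel image areas A, B and C in a 1x3 block.
A, B, C = [(0, 0)], [(0, 1)], [(0, 2)]
current = [[10, 20, 30]]
colocated = [[9, 17, 20]]          # per-area absolute differences: 1, 3, 10
scheme_1 = (A + B, C)              # difference |4 - 10| = 6
scheme_2 = (A + C, B)              # difference |11 - 3| = 8
scheme_3 = (B + C, A)              # difference |13 - 1| = 12
best = best_merging_scheme([scheme_1, scheme_2, scheme_3], current, colocated)
```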
- step S1907 finds and constructs an optimal edge line among multiple edges.
- each image area of the target image block is merged to achieve this purpose, but the disclosed embodiment is not limited to this.
- Other solutions that can solve the problem of selecting multiple edge lines can also be used as variant solutions, such as: using a path-finding algorithm to find the longest path.
- the A* graph search algorithm can be used to find the optimal dividing line to obtain a discrete curve whose starting point and end point are both on the boundaries of the coding block.
- the target image block is divided into two image regions, image region D and image region E, so a boundary line 230 between image region D and image region E can be obtained.
- the above step S1909 (performing Bezier curve fitting on the boundary line, and determining the curve obtained by performing Bezier curve fitting on the boundary line as the segmentation curve of the target image block) includes the following steps 1 to 4:
- Step 1 performing second-order Bezier curve fitting on the boundary line to obtain a first fitting result.
- the second-order Bezier curve is a curve defined by a starting point s, an end point e and a control point p. Therefore, the second-order Bezier curve fitting is performed on the dividing line, that is, the starting point of the dividing line is used as the starting point s of the second-order Bezier curve, the end point of the dividing line is used as the end point e of the second-order Bezier curve, and the control point p is obtained.
- FIG. 24 shows an example in which the dividing line is line 241 and the first fitting result obtained by performing second-order Bezier curve fitting on the dividing line is line 242.
- the starting point of the dividing line and the first fitting result obtained by performing second-order Bezier curve fitting on the dividing line are both point s
- the end point of the dividing line and the first fitting result obtained by performing second-order Bezier curve fitting on the dividing line are both point e
- the first fitting result is a single arch curve
- the shape of the first fitting result is controlled by the control point p.
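- One way to obtain the control point p is a least-squares fit with the starting point s and the end point e fixed to the ends of the dividing line. The source does not say how p is computed, so the least-squares criterion, the uniform parameterisation of the dividing-line points, and the function names below are all assumptions.

```python
def quad_bezier(s, p, e, t):
    """Evaluate the second-order Bezier curve B(t) = (1-t)^2 s + 2t(1-t) p + t^2 e."""
    a, b, c = (1 - t) ** 2, 2 * t * (1 - t), t ** 2
    return (a * s[0] + b * p[0] + c * e[0], a * s[1] + b * p[1] + c * e[1])


def fit_quad_control_point(points):
    """Least-squares control point for an ordered list of at least 3 (x, y) points."""
    s, e = points[0], points[-1]
    n = len(points) - 1
    num_x = num_y = den = 0.0
    for k, (qx, qy) in enumerate(points):
        t = k / n                       # uniform parameterisation (assumed)
        a, b, c = (1 - t) ** 2, 2 * t * (1 - t), t ** 2
        num_x += b * (qx - a * s[0] - c * e[0])
        num_y += b * (qy - a * s[1] - c * e[1])
        den += b * b
    return (num_x / den, num_y / den)


# Points sampled from a known curve; the fit should recover its control point.
line_points = [quad_bezier((0, 0), (4, 8), (8, 0), k / 8) for k in range(9)]
p_fit = fit_quad_control_point(line_points)
```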
- Step 2 performing third-order Bezier curve fitting on the boundary line to obtain a second fitting result.
- the dividing line is shown as line 251 in FIG25, and the dividing line is fitted with a third-order Bezier curve to obtain a second fitting result as shown in FIG25.
- the starting points of the dividing line and the second fitting result obtained by fitting the dividing line with a third-order Bezier curve are both point s
- the end points of the dividing line and the second fitting result obtained by fitting the dividing line with a third-order Bezier curve are both point e
- the shape of the second fitting result is controlled by the first control point p1 and the second control point p2.
- Step 3 respectively obtain a first fitting accuracy of the first fitting result and a second fitting accuracy of the second fitting result.
- the second fitting accuracy is greater than or equal to the first fitting accuracy
- the bitstream overhead of the third-order Bezier curve is greater than the bitstream overhead of the second-order Bezier curve. Therefore, the first fitting result or the second fitting result can be selected as the segmentation curve of the target image block based on the bitstream overhead brought by the third-order Bezier curve and the fitting accuracy improved by the third-order Bezier curve. For example: the difference between the second fitting accuracy and the first fitting accuracy is calculated, and it is determined whether the difference between the second fitting accuracy and the first fitting accuracy is greater than a preset threshold. If the difference between the second fitting accuracy and the first fitting accuracy is greater than the preset threshold, the second fitting result is selected as the segmentation curve of the target image block.
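- otherwise, if the difference between the second fitting accuracy and the first fitting accuracy is less than or equal to the preset threshold, the first fitting result is selected as the segmentation curve of the target image block.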
- the preset threshold can be set according to the difference between the bitstream overhead brought by the third-order Bezier curve and the bitstream overhead brought by the second-order Bezier curve.
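- The selection rule between the two fitting results reduces to a single comparison, sketched below. How the fitting accuracy itself is measured is not specified in this text, so the accuracies are taken as given inputs; the function name and the sample values are hypothetical.

```python
def select_segmentation_curve(first_fit, second_fit,
                              first_accuracy, second_accuracy, threshold):
    """Choose the third-order fit only if its accuracy gain exceeds the threshold."""
    if second_accuracy - first_accuracy > threshold:
        return second_fit
    return first_fit


# Hypothetical accuracies: a gain of 0.02 does not justify the extra control point,
# whereas a gain of 0.15 does when the preset threshold is 0.05.
small_gain = select_segmentation_curve("second-order", "third-order", 0.90, 0.92, 0.05)
large_gain = select_segmentation_curve("second-order", "third-order", 0.80, 0.95, 0.05)
```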
- the second-order Bezier curve fitting and the third-order Bezier curve fitting of the dividing line are used as an example for explanation, but the embodiment of the present disclosure is not limited thereto, and a higher-order Bezier curve can also be used to fit the dividing line.
- fitting the boundary line with a higher-order Bezier curve improves the fitting accuracy, but the amount of calculation it brings is also greater. Therefore, in practice, the amount of calculation can be balanced against the image quality improvement brought by the fitting accuracy to select a Bezier curve of appropriate order to fit the boundary line.
- because the Bezier curve (segmentation curve) generated in the above step S1909 is a continuous curve and cannot be directly applied to the division of digital images, the segmentation curve is sampled with integer pixel accuracy to obtain a discrete sampling result, and the target image block is then segmented according to that sampling result.
- sampling the segmentation curve of the target image block with integer pixel accuracy may include: determining a boundary of pixels through which the segmentation curve passes as a sampling result of the segmentation curve.
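- Integer-pixel sampling of a segmentation curve can be sketched as follows for the second-order case. The dense parameter sweep, the `steps` value, and the use of `floor` to map a curve point to the pixel it passes through are assumptions; the source only states that the pixels through which the curve passes define the sampling result.

```python
import math


def sample_curve_pixels(s, p, e, steps=200):
    """Return the ordered list of integer pixels crossed by a second-order Bezier curve."""
    pixels = []
    for k in range(steps + 1):
        t = k / steps
        a, b, c = (1 - t) ** 2, 2 * t * (1 - t), t ** 2
        x = a * s[0] + b * p[0] + c * e[0]
        y = a * s[1] + b * p[1] + c * e[1]
        px = (int(math.floor(x)), int(math.floor(y)))
        # Record a pixel only when the curve enters a new one.
        if not pixels or pixels[-1] != px:
            pixels.append(px)
    return pixels


# A degenerate curve whose control point lies on the chord, i.e. a straight diagonal.
diagonal = sample_curve_pixels((0, 0), (2, 2), (4, 4))
```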
- the target image block may be segmented into two sub-image blocks, namely, a sub-image block 271 and a sub-image block 272 .
- S1912 Encode the information of the segmentation curve of the target image block to obtain first code stream data.
- the segmentation curve is a second-order Bezier curve obtained by fitting the dividing line with a second-order Bezier curve
- encoding the information of the segmentation curve of the target image block to obtain the first code stream data includes: encoding the position information of the starting point, the position information of the end point, and the position information of the control point of the second-order Bezier curve to obtain the first code stream data.
- the segmentation curve is a third-order Bezier curve obtained by fitting the dividing line with a third-order Bezier curve
- encoding the information of the segmentation curve of the target image block to obtain the first code stream data includes: encoding the position information of the starting point, the position information of the end point, the position information of the first control point, and the position information of the second control point of the third-order Bezier curve to obtain the first code stream data.
- S1914 Generate code stream data corresponding to the target image block according to the first code stream data, the second code stream data, and the third code stream data.
- the present disclosure also provides a video decoding method. As shown in FIG. 28 , the video decoding method includes the following steps S281 to S285:
- the target image block is a rectangular image block obtained by dividing a video frame into blocks;
- the bitstream data corresponding to the target image block includes: first bitstream data, second bitstream data and third bitstream data;
- the first bitstream data is bitstream data obtained by encoding information of a segmentation curve used to divide the target image block into two sub-image blocks;
- the second bitstream data and the third bitstream data are bitstream data obtained by encoding the two sub-image blocks respectively.
- the first bitstream data includes: position information of the starting point, the terminal point, and the control point of the second-order Bezier curve
- obtaining the segmentation curve according to the first bitstream data includes: reconstructing the second-order Bezier curve according to the position information of the starting point, the terminal point, and the control point of the second-order Bezier curve to obtain the segmentation curve
- the first bitstream data includes: position information of the starting point of the third-order Bezier curve, position information of the terminal point, position information of the first control point, and position information of the second control point.
- the acquiring the segmentation curve according to the first bitstream data includes: reconstructing the third-order Bezier curve according to the position information of the starting point of the third-order Bezier curve, position information of the terminal point, position information of the first control point, and position information of the second control point to acquire the segmentation curve.
- the above step S283 (segmenting the target image block based on the segmentation curve to obtain the two sub-image blocks) includes the following steps 2831 and 2832:
- Step 2832 divide the target image block into two sub-image blocks according to the sampling result of the segmentation curve.
- step 2831 and step 2832 may refer to the implementation of steps S1910 and S1911 in the above embodiments, and will not be repeated here to avoid redundancy.
- S284 Reconstruct the two sub-image blocks according to the second code stream data and the third code stream data.
- the second code stream data and the third code stream data respectively include prediction residuals and motion vectors of the two sub-image blocks
- reconstructing the two sub-image blocks according to the second code stream data and the third code stream data includes: respectively determining reference blocks corresponding to the two sub-image blocks, respectively performing motion compensation on the reference blocks of the two sub-image blocks according to the motion vectors of the two sub-image blocks, so as to respectively obtain prediction blocks corresponding to the to-be-reconstructed blocks of the two sub-image blocks; and respectively adding and fusing the prediction residuals of the two sub-image blocks with the prediction blocks, so as to obtain the reconstructed two sub-image blocks.
- the two reconstructed sub-image blocks are spliced into the reconstructed target image block.
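- The reconstruction and splicing described above can be sketched as below. It assumes a per-pixel `mask` (0 for the first sub-image block, 1 for the second) derived from the sampled segmentation curve, prediction blocks already produced by motion compensation, and 8-bit samples clipped to [0, 255]; all of these representational choices are assumptions.

```python
def reconstruct_block(mask, pred_1, pred_2, resid_1, resid_2):
    """Add each sub-block's residual to its prediction and splice the two results."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x] == 0:
                value = pred_1[y][x] + resid_1[y][x]
            else:
                value = pred_2[y][x] + resid_2[y][x]
            out[y][x] = min(max(value, 0), 255)   # clip to 8-bit range (assumed)
    return out


# A 1x2 block: the left pixel belongs to sub-block 1, the right pixel to sub-block 2.
rebuilt = reconstruct_block(mask=[[0, 1]],
                            pred_1=[[100, 0]], pred_2=[[0, 250]],
                            resid_1=[[-5, 0]], resid_2=[[0, 10]])
```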
- After obtaining the bitstream data corresponding to the target image block, the video decoding method provided by the embodiment of the present disclosure can obtain the segmentation curve according to the first bitstream data and segment the target image block based on the segmentation curve to obtain the two sub-image blocks, because the bitstream data corresponding to the target image block includes the first bitstream data obtained by encoding the information of the segmentation curve used to segment the target image block into two sub-image blocks.
- the embodiment of the present disclosure can also reconstruct the two sub-image blocks according to the second bitstream data and the third bitstream data, and obtain the reconstructed target image block according to the reconstructed two sub-image blocks. Therefore, the embodiment of the present disclosure can realize normal decoding of the video while improving the performance of video encoding.
- the block partitioning scheme supports partitioning blocks into asymmetric and irregularly shaped coding blocks, which can make the division of coding blocks as close to the real image edge as possible, thereby improving the coding performance of the video.
- these prediction blocks need to be merged to perform residual operations on the image blocks.
- the synthesized prediction block boundary pixels will be uneven, which will cause irregular jumps in the residual during compensation. If not properly processed, the boundary area of the reconstructed image may appear discontinuous, uneven or obvious.
- the embodiment of the present disclosure provides a video encoding method. As shown in FIG. 29 , the video encoding method includes the following steps S291 to S298:
- S292 Divide the target image block into a first sub-image block and a second sub-image block based on a texture edge in the target image block.
- texture edge detection may be performed on the target image block using edge detection algorithms such as Sobel, Prewitt, Roberts, Canny, and Marr-Hildreth to obtain texture edges in the target image block.
- the texture edge in the target image block is generally not a horizontal or vertical straight line
- the first sub-image block and the second sub-image block are probably not rectangular image blocks, but irregularly shaped image blocks.
- the above step S292 (segmenting the target image block into a first sub-image block and a second sub-image block based on a texture edge in the target image block) includes the following steps 1 to 4:
- Step 1 Segment the target image block into two image areas based on a texture edge in the target image block, and obtain a boundary line between the two image areas.
- the brightness of the pixels in the image block is relatively flat, and texture edge detection on the image block cannot obtain a texture edge that can divide the image block into two image regions.
- the embodiment of the present disclosure may not further divide the sub-image blocks, but directly encode the image block as a coding unit.
- the texture edges obtained by texture edge detection on the image block will divide the image block into more than two image regions. For such image blocks, the multiple image regions can first be merged into two image regions, and then the subsequent video encoding steps can be performed.
- Step 2 Perform curve fitting on the boundary line to obtain a segmentation curve of the target image block.
- the boundary line between two image areas has no function expression or the function expression is very complex.
- Directly expressing the boundary line between the two image areas in the bitstream data will greatly increase the overhead of the bitstream data. Therefore, the embodiment of the present disclosure performs curve fitting on the boundary line between the two image areas to simplify the function expression of the boundary line, thereby reducing the overhead caused by the boundary line.
- Step 3 Sampling the segmentation curve with integer pixel accuracy to obtain a sampling result of the segmentation curve.
- step a may refer to the above step 1, and will not be described in detail here to avoid redundancy.
- Step b obtaining two endpoints of the segmentation curve according to the encoded video content.
- obtaining two endpoints of a segmentation curve according to video content that has been encoded includes: determining whether spatially adjacent pixel points of four edges of the target image block can be obtained in the video content that has been encoded; if so, obtaining the segmentation start point and the segmentation end point according to the spatially adjacent pixel points of the four edges of the target image block; if not, obtaining the best matching block of the target image block in the video content that has been encoded, and obtaining the segmentation start point and the segmentation end point according to the edge pixel points of the best matching block of the target image block.
- Step c determining the segmentation curve according to the two endpoints of the segmentation curve and the dividing line.
- Step d sampling the segmentation curve with integer pixel accuracy to obtain a sampling result of the segmentation curve.
- Step e dividing the target image block into the first sub-image block and the second sub-image block based on the sampling result of the segmentation curve.
- S293 Obtain a first prediction block of the first sub-image block and a second prediction block of the second sub-image block.
- obtaining a first prediction block of the first sub-image block and a second prediction block of the second sub-image block includes: performing motion estimation on the first sub-image block to obtain a first reference block and a first motion vector; performing motion compensation on the first reference block based on the first motion vector to obtain the first prediction block; performing motion estimation on the second sub-image block to obtain a second reference block and a second motion vector; and performing motion compensation on the second reference block based on the second motion vector to obtain the second prediction block.
- the first prediction block 301 and the second prediction block 302 are combined to obtain an initial prediction block 303 , and a boundary line 3000 between the first prediction block 301 and the second prediction block 302 in the initial prediction block 303 is obtained.
- the filtering strength of each pixel in the initial prediction block is negatively correlated with the minimum distance from each pixel in the initial prediction block to the boundary line. That is, for any pixel in the initial prediction block, if the minimum distance from the pixel to the boundary line is smaller, the filtering strength of the pixel is greater; conversely, if the minimum distance from the pixel to the boundary line is larger, the filtering strength of the pixel is smaller.
- the filtering strength of each pixel point in the initial prediction block is negatively correlated with the minimum distance from each pixel point in the initial prediction block to the boundary line. Therefore, the embodiment of the present disclosure can adaptively filter the image according to the content and needs of the image, so that for pixels with stronger noise, stronger filtering is applied to remove noise, and for cleaner pixels, weaker filtering is applied to retain authenticity.
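- The negative correlation between filtering strength and distance to the boundary line can be illustrated as below. The exact mapping of formula (34) is not reproduced in this text, so the mapping w = 1 / (1 + d) is purely an assumed example (it gives strength 1 on the boundary line and decreases with distance); the Euclidean distance and the function names are likewise assumptions.

```python
import math


def min_distance_to_boundary(h, w, boundary):
    """Minimum Euclidean distance from every pixel (y, x) to the boundary pixels."""
    return [[min(math.hypot(y - by, x - bx) for (by, bx) in boundary)
             for x in range(w)]
            for y in range(h)]


def filter_strength(dist):
    """Assumed example mapping: strength falls as the distance grows."""
    return [[1.0 / (1.0 + d) for d in row] for row in dist]


# A 3x5 initial prediction block whose boundary line runs down column 2.
boundary = [(y, 2) for y in range(3)]
dist = min_distance_to_boundary(3, 5, boundary)
strength = filter_strength(dist)
```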
- calculating the residual between the target image block and the fused prediction block includes: calculating the difference between the pixel values of each pixel point of the target image block and the co-located pixel point in the fused prediction block to obtain the residual data of the target image block.
- After the first prediction block and the second prediction block are combined to obtain an initial prediction block, the video encoding method provided by the embodiment of the present disclosure obtains a boundary line between the first prediction block and the second prediction block in the initial prediction block, determines a filtering strength of each pixel in the initial prediction block according to a minimum distance from each pixel in the initial prediction block to the boundary line, and filters each pixel in the initial prediction block based on its filtering strength. Therefore, the video encoding method provided in the embodiment of the present disclosure can filter uneven pixels in the prediction block, thereby avoiding uneven pixels at the boundary of the prediction block.
- the target image block is a rectangular image block obtained by dividing the video frame into blocks.
- S3202 Divide the target image block into a first sub-image block and a second sub-image block based on a texture edge in the target image block.
- S3203 Perform motion estimation on the first sub-image block to obtain a first reference block and a first motion vector.
- the first area is an area composed of pixel points adjacent to the boundary line.
- the pixels of the initial prediction block 303 that are adjacent to the boundary line 500 include pixels with pixel coordinates: (4,13), (4,14), (4,15), (4,16), (5,13), (5,14), (5,15), (5,16), (6,13), (6,14), (7,7), (7,8), (7,9)..., so the area composed of these pixels is determined as the first area 3031 of the initial prediction block 303.
- the preset filtering strength is 1.
- the first region is a region composed of pixels adjacent to the boundary line
- the second region is a region composed of pixels not adjacent to the boundary line
- the set of pixels in the first region and the set of pixels in the second region are complementary to each other.
- the second region includes other regions in the initial prediction block 303 except the first region 3031.
- the pixel distance between the pixel points in the second area and the pixel points in the first area can be calculated by the following formula (32):
- S3210 Calculate the filtering strength of each pixel in the second area of the initial prediction block according to the minimum distance from each pixel in the second area to the boundary line.
- the above step S3209 (calculating the filtering strength of each pixel in the second area of the initial prediction block according to the minimum distance from each pixel in the second area to the boundary line) includes:
- the minimum distance from the pixel point (i, j) in the second area to the boundary line is expressed as d i,j
- the filtering strength of the pixel point (i, j) in the second area is expressed as w i,j .
- the filtering strength of each pixel point in the first area and the filtering strength of each pixel point in the second area are determined through the above steps, and the first area and the second area can be combined into an initial prediction block, so the filtering strength of each pixel point in the initial prediction block is determined.
- the filtering strength of each pixel in the initial prediction block can be expressed as the following formula (34):
- d i,j is the minimum distance from the pixel point (i, j) in the initial prediction block to the boundary line
- w i,j is the filtering strength of the pixel point (i, j).
- the to-be-filtered pixel point set is a set of pixel points in the initial prediction block whose filtering strength is greater than a strength threshold.
- the disclosed embodiment compares the filtering strength w i,j of each pixel in the initial prediction block with the strength threshold.
- if w i,j is greater than or equal to the strength threshold, the pixel is classified into the set of pixels to be filtered, and the pixels not classified into the set of pixels to be filtered are not processed. Therefore, the above embodiment can reduce the number of pixels to be processed, thereby reducing the computational complexity and saving computing power consumption.
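- Building the set of pixels to be filtered is a simple thresholding pass, sketched below. The comparison uses "greater than or equal to", following the sentence above; the function name and the sample strengths are hypothetical.

```python
def pixels_to_filter(strength, threshold):
    """Collect the coordinates (i, j) whose filtering strength reaches the threshold."""
    return [(i, j)
            for i, row in enumerate(strength)
            for j, w in enumerate(row)
            if w >= threshold]


# Only the two pixels closest to the boundary line are kept for filtering.
selected = pixels_to_filter([[1.0, 0.5, 0.2]], threshold=0.5)
```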
- the target filter in the embodiment of the present disclosure may be a weighted mean filter, a Gaussian two-dimensional filter, an adaptive median filter, a bilateral filter, a Laplace filter, or the like.
- When the standard deviation is small (usually less than 1.0), the Gaussian 2D filter has a small smoothing effect and is suitable for tasks that need to retain image details and are sensitive to noise, such as noise removal before edge detection.
- When the standard deviation is medium (usually between 1.0 and 4.0), the Gaussian 2D filter is usually used for general smoothing operations, which can remove noise to a certain extent and retain the main features of the image.
- When the standard deviation is large, the Gaussian 2D filter has a stronger smoothing effect, removes more details, and is suitable for tasks that require strong smoothing of images to remove a lot of noise.
- before obtaining the filter matrix of each pixel point in the initial prediction block according to the filter parameters of each pixel point in the set of pixel points to be filtered and the response characteristics of the target filter, the method further includes: determining the size of the filter matrix of each pixel point in the set of pixel points to be filtered according to the filter parameters of each pixel point in the set of pixel points to be filtered.
- the target filter is a Gaussian two-dimensional filter
- a Gaussian kernel filter matrix of size 3x3
- a Gaussian kernel of size 5x5 will be selected
- a Gaussian kernel of size 7x7 or larger will be selected.
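- The choice of Gaussian kernel size can be sketched as a lookup on the standard deviation. The conditions attached to the 3x3, 5x5, and 7x7 sizes are not preserved in this text, so pairing them with the small (below 1.0), medium (1.0 to 4.0), and large standard-deviation ranges is an assumption, as is the function name.

```python
def gaussian_kernel_size(sigma):
    """Map a standard deviation to a kernel size (range boundaries are assumed)."""
    if sigma < 1.0:
        return 3        # small standard deviation: light smoothing
    if sigma <= 4.0:
        return 5        # medium standard deviation: general smoothing
    return 7            # large standard deviation: 7x7 (or larger in practice)


sizes = [gaussian_kernel_size(s) for s in (0.5, 2.0, 6.0)]
```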
- encoding data of the target image block is generated based on the residual data of the target image block, including: transforming, entropy encoding, and other operations on the residual data of the target image block, information of a segmentation curve for segmenting the target image block into the first sub-image block and the second sub-image block, the motion vector of the first sub-image block, and the motion vector of the second sub-image block, so as to obtain the encoding data of the target image block.
- Step 1 perform noise point detection on the fused prediction block based on a preset noise detection algorithm.
- the preset noise detection algorithm may be a square root algorithm, an inter-frame difference detection algorithm, an autocorrelation algorithm, a spectrum analysis algorithm, a wavelet transform algorithm, a threshold processing algorithm, or other noise detection algorithms.
- noise point detection is performed on the fused prediction block based on threshold processing, including: calculating the gradient amplitude of each pixel in the fused prediction block, and determining the pixel whose gradient amplitude is higher than the threshold gradient amplitude as a noise point.
- the gradient amplitude of each pixel point in the fused prediction block may be calculated by using a Prewitt operator, a Roberts operator, a Sobel operator, a Laplacian operator, a Scharr operator, or the like.
- the Prewitt operator is used to calculate the gradient amplitude of each pixel in the horizontal and vertical directions, including:
- the horizontal Prewitt kernel P x and the vertical Prewitt kernel P y are used to convolve the fused prediction blocks respectively.
- the image's horizontal gradient magnitude G x and vertical gradient magnitude G y are obtained. Specifically, they can be expressed as the following formulas (37) and (38):
- the gradient magnitude of each pixel point of the fused prediction block is calculated according to the gradient magnitude G x in the horizontal direction and the gradient magnitude G y in the vertical direction of each pixel point of the fused prediction block.
- calculating the gradient amplitude of each pixel of the fused prediction block according to the gradient amplitude G x in the horizontal direction and the gradient amplitude G y in the vertical direction of each pixel of the fused prediction block may include: calculating the gradient amplitude GM of each pixel of the fused prediction block according to the following formula (39):
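Formulas (37) to (39) are not reproduced in this text. The sketch below shows one common way to carry out the threshold-based detection described here, assuming the standard 3x3 Prewitt kernels and the Euclidean magnitude sqrt(G x^2 + G y^2); both are assumptions, and the function names are hypothetical.

```python
import numpy as np

# Standard Prewitt kernels (assumed; the disclosure's kernels are not shown here).
PREWITT_X = np.array([[-1, 0, 1],
                      [-1, 0, 1],
                      [-1, 0, 1]], dtype=np.float64)
PREWITT_Y = PREWITT_X.T


def correlate3x3(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide a 3x3 kernel over the image with edge replication.

    This is correlation rather than flipped convolution; for Prewitt
    kernels the difference is only a sign, which the magnitude removes.
    """
    padded = np.pad(image.astype(np.float64), 1, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def gradient_magnitude(image: np.ndarray) -> np.ndarray:
    """Per-pixel gradient magnitude from horizontal and vertical responses."""
    gx = correlate3x3(image, PREWITT_X)
    gy = correlate3x3(image, PREWITT_Y)
    return np.sqrt(gx ** 2 + gy ** 2)


def detect_noise_points(image: np.ndarray, threshold: float) -> np.ndarray:
    """Boolean mask of pixels whose gradient magnitude exceeds the threshold."""
    return gradient_magnitude(image) > threshold
```

Note that an isolated bright pixel produces a zero Prewitt response at its own position and large responses at its neighbours, so a practical detector may need to dilate the resulting mask; that refinement is outside this sketch.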
- the method for obtaining the filtering strength of each pixel point in the fused prediction block can be similar to the method for obtaining the filtering strength of each pixel point in the initial prediction block in the above embodiment. To avoid redundancy, it will not be described in detail here.
- Step 3 filter each pixel in the fused prediction block based on the filtering strength of each pixel in the fused prediction block.
- the filter used to filter each pixel in the fused prediction block may be the same as the filter (target filter) used to filter the pixel in the initial prediction block, or may be different from the filter used to filter the pixel in the initial prediction block.
- the present disclosure also provides a video decoding method. As shown in FIG. 34 , the video decoding method includes the following steps S341 to S346:
- obtaining the encoding data of the target image block includes: receiving video stream data of a to-be-played video sent by a media resource server, and extracting the encoding data of the target image block from the video stream data.
- the encoding data of the target image block includes: first encoding data, second encoding data, third encoding data and fourth encoding data;
- the first encoding data is encoding data obtained by encoding information of a segmentation curve used to segment the target image block into a first sub-image block and a second sub-image block;
- the second encoding data is encoding data obtained by encoding residual data of the target image block, and
- the third encoding data and the fourth encoding data are encoding data obtained by encoding motion vectors of the first sub-image block and the second sub-image block, respectively.
- the information of the segmentation curve for segmenting the target image block into the first sub-image block and the second sub-image block includes: the position information of the starting point, the end point and the control point of the second-order Bezier curve.
- the information of the segmentation curve for segmenting the target image block into the first sub-image block and the second sub-image block includes: the position information of the starting point, the end point, the first control point and the second control point of the third-order Bezier curve.
- the information of the segmentation curve for segmenting the target image block into the first sub-image block and the second sub-image block includes: position information of control points of the second-order Bezier curve.
- the information of the segmentation curve for segmenting the target image block into the first sub-image block and the second sub-image block includes: position information of the first control point and the position information of the second control point of the third-order Bezier curve.
- the first prediction block and the second prediction block are prediction blocks of a first sub-image block and a second sub-image block respectively, and the first sub-image block and the second sub-image block are two sub-image blocks obtained by segmenting the target image block based on a texture edge in the target image block.
- Acquiring the first prediction block and the second prediction block according to the encoding data of the target image block includes: performing entropy decoding, inverse transformation, inverse quantization and other operations on the first encoding data, the third encoding data and the fourth encoding data to obtain the segmentation curve, the motion vector of the first sub-image block and the motion vector of the second sub-image block, segmenting the target image block based on the segmentation curve, and determining the reference blocks corresponding to the two sub-image blocks according to the motion vector of the first sub-image block and the motion vector of the second sub-image block, and performing motion compensation on the reference blocks of the two sub-image blocks according to the motion vectors of the two sub-image blocks, so as to obtain the first prediction block and the second prediction block respectively.
- S343 Combine the first prediction block and the second prediction block to obtain an initial prediction block, and obtain a boundary line between the first prediction block and the second prediction block in the initial prediction block.
- S344 Determine the filtering strength of each pixel in the initial prediction block according to the minimum distance from each pixel in the initial prediction block to the boundary line.
- the video decoding method provided by the embodiment of the present disclosure can reconstruct the target image block based on the encoded data of the target image block obtained by the video encoding provided by the above embodiment to obtain the reconstructed image block of the target image block. Therefore, the embodiment of the present disclosure can achieve normal decoding of the video while improving the performance of video encoding.
- Motion estimation refers to dividing the video into several blocks, trying to search for the best matching block of each block in the adjacent video frame as a reference block, and deriving the relative offset between the two in spatial position as a motion vector (Motion Vector, MV).
- the process of obtaining the prediction block of the current coding block based on the motion vector and the reference block is called motion compensation.
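Motion estimation and motion compensation as defined above can be sketched with an exhaustive block-matching search. This is a minimal illustration assuming the sum of absolute differences (SAD) as the matching criterion and integer-pixel motion vectors; the function names and the search range are hypothetical and are not taken from the disclosure.

```python
import numpy as np


def full_search(cur_block, ref_frame, top, left, search_range):
    """Return the (dy, dx) offset with the smallest SAD, and that SAD.

    (top, left) is the position of the current block in its own frame.
    Candidates that fall outside the reference frame are skipped.
    """
    h, w = cur_block.shape
    cur = cur_block.astype(np.int64)
    best_mv, best_sad = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + h > ref_frame.shape[0] or x + w > ref_frame.shape[1]:
                continue
            cand = ref_frame[y:y + h, x:x + w].astype(np.int64)
            sad = int(np.abs(cur - cand).sum())
            if best_sad is None or sad < best_sad:
                best_mv, best_sad = (dy, dx), sad
    return best_mv, best_sad


def motion_compensate(ref_frame, top, left, mv, shape):
    """Fetch the prediction block pointed to by the motion vector."""
    y, x = top + mv[0], left + mv[1]
    return ref_frame[y:y + shape[0], x:x + shape[1]]


rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(16, 16))
cur_block = ref[5:9, 6:10]          # content that moved by (1, 2) relative to (4, 4)
mv, sad = full_search(cur_block, ref, top=4, left=4, search_range=3)
pred = motion_compensate(ref, 4, 4, mv, cur_block.shape)
```

The residual that is subsequently transformed and entropy coded is the difference between the current block and `pred`.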
- the current mainstream video coding standards are based on the coding blocks obtained by segmenting the video frame rather than the complete video frame when performing predictive coding. Different coding parameters will be used between different coding blocks. The more accurate the coding block division is, the smaller the difference between it and the prediction block is, and the corresponding bit rate overhead is also smaller. Therefore, appropriate block division is crucial.
- the present disclosure provides a video encoding method, as shown in FIG. 35 , the video encoding method includes the following steps:
- the target image block is a rectangular image block obtained by dividing the target video frame into blocks.
- the above step S351 (obtaining a segmentation curve of the target image block based on a texture edge in the target image block) may include the following steps 1 to 3:
- Step 1 Perform texture edge detection on the target image block to obtain at least one texture edge of the target image block.
- edge detection algorithms such as Sobel, Prewitt, Roberts, Canny, and Marr-Hildreth can be used to perform texture edge detection on the target image block to obtain a texture edge in the target image block.
- Step 2 Segment the target image block into two image areas based on the at least one texture edge, and obtain a boundary line between the two image areas.
- since the texture edge in the target image block is generally not a horizontal or vertical straight line, the two image regions are likely to be irregularly shaped regions rather than rectangular regions.
- dividing the target image block into two image areas based on the at least one texture edge includes: performing image area segmentation on the target image block based on the at least one texture edge to obtain an image area set corresponding to the target image block; obtaining the number of image areas in the image area set; if the number of image areas in the image area set is greater than 2, merging the image areas in the image area set into two image areas to divide the target image block into two image areas.
- merging the image areas in the image area set into two image areas includes: obtaining at least two merging schemes for the image area set, the at least two merging schemes being different schemes for merging the image areas in the image area set into two image areas; obtaining the difference of each of the at least two merging schemes, the difference of any merging scheme being the absolute difference between a first similarity and a second similarity of the merging scheme, where the first similarity and the second similarity of any merging scheme are, respectively, the sums of the absolute differences between the grayscale values of the pixel points of the two image areas under the merging scheme and those of the corresponding co-located blocks; and merging the image areas in the image area set into two image areas by using the merging scheme with the largest difference among the at least two merging schemes.
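The selection among merging schemes can be sketched as an enumeration of two-way groupings. This is a simplified illustration under stated assumptions: regions are given as an integer label map, the co-located block has the same size as the target image block, and spatial adjacency of merged regions is not checked. The function name is hypothetical.

```python
from itertools import combinations

import numpy as np


def best_merge(labels, block, colocated):
    """Pick the two-way grouping with the largest |SAD_1 - SAD_2|.

    labels    : integer array, one region id per pixel
    block     : grayscale values of the target image block
    colocated : grayscale values of the co-located block
    Returns the set of region ids forming the first area and the difference.
    """
    ids = sorted(int(i) for i in np.unique(labels))
    abs_diff = np.abs(block.astype(np.int64) - colocated.astype(np.int64))
    sad = {i: int(abs_diff[labels == i].sum()) for i in ids}
    total = sum(sad.values())

    # Fix the first region in group one so each grouping is visited once;
    # r < len(rest) keeps the second group non-empty.
    first, rest = ids[0], ids[1:]
    best_group, best_gap = None, -1
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            group = {first, *extra}
            s1 = sum(sad[i] for i in group)
            gap = abs(s1 - (total - s1))
            if gap > best_gap:
                best_group, best_gap = group, gap
    return best_group, best_gap


labels = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [2, 2, 2, 2],
                   [2, 2, 2, 2]])
colocated = np.zeros((4, 4), dtype=np.int64)
block = np.array([1, 10, 2])[labels]     # per-region offsets, made up for the demo
group, gap = best_merge(labels, block, colocated)
```

The number of groupings grows exponentially with the number of regions, so this exhaustive form is only practical for small region sets.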
- Step 3 Perform curve fitting on the boundary line to obtain a segmentation curve of the target image block.
- performing curve fitting on the boundary line to obtain the segmentation curve of the target image block includes: performing second-order or third-order Bezier curve fitting on the boundary line, and determining the Bezier curve obtained by the fitting as the segmentation curve of the target image block.
- the second-order Bezier curve is a smooth single-arch curve. It is only necessary to add the starting point, end point and control point of the second-order Bezier curve to the bitstream data so that the decoding end can losslessly decode the second-order Bezier curve, and then reconstruct the target image block according to the decoded second-order Bezier curve.
- the third-order Bezier curve is a concave-convex double-arch curve. It only needs to add the starting point, end point, first control point and second control point of the third-order Bezier curve to the bitstream data so that the decoding end can losslessly decode the third-order Bezier curve, and then reconstruct the target image block according to the decoded third-order Bezier curve.
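Because a Bezier curve is fully determined by its start point, end point and control points, the decoder can rebuild the same curve from those few coordinates. The sketch below evaluates the standard second-order and third-order Bernstein forms; the function names and the sample coordinates are illustrative only.

```python
import numpy as np


def bezier2(p0, p1, p2, t):
    """Second-order Bezier: start p0, control p1, end p2, parameter t in [0, 1]."""
    t = np.asarray(t, dtype=np.float64)[:, None]
    p0, p1, p2 = (np.asarray(p, dtype=np.float64) for p in (p0, p1, p2))
    return (1 - t) ** 2 * p0 + 2 * (1 - t) * t * p1 + t ** 2 * p2


def bezier3(p0, p1, p2, p3, t):
    """Third-order Bezier: start p0, controls p1 and p2, end p3."""
    t = np.asarray(t, dtype=np.float64)[:, None]
    p0, p1, p2, p3 = (np.asarray(p, dtype=np.float64) for p in (p0, p1, p2, p3))
    return ((1 - t) ** 3 * p0 + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2 + t ** 3 * p3)


# Points are (x, y); the curve always passes through its start and end points.
quad = bezier2((0, 0), (4, 8), (8, 0), [0.0, 0.5, 1.0])
cubic = bezier3((0, 0), (0, 8), (8, 8), (8, 0), [0.5])
```

When the start and end points can be derived at the decoder (as in the variants that signal only control points), only the control point coordinates need to be written to the bitstream.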
- the above step S351 (obtaining a segmentation curve of the target image block based on a texture edge in the target image block) includes the following steps a to d:
- steps a and b may refer to the above steps 1 and 2, and will not be described in detail here to avoid redundancy.
- Step c obtaining a segmentation start point and a segmentation end point of a segmentation curve according to the encoded video content.
- obtaining a segmentation start point and a segmentation end point of a segmentation curve according to video content that has been encoded includes: determining whether spatially adjacent pixel points of four edges of the target image block can be obtained in the video content that has been encoded; if so, obtaining the segmentation start point and the segmentation end point according to the spatially adjacent pixel points of the four edges of the target image block; if not, obtaining the best matching block of the target image block in the video content that has been encoded, and obtaining the segmentation start point and the segmentation end point according to the edge pixel points of the best matching block of the target image block.
- the acquiring the segmentation start point and the segmentation end point according to the edge pixel points of the best matching block of the target image block comprises: acquiring the gradient amplitudes of the pixel points on four edges of the best matching block of the target image block in the direction of the corresponding edge; acquiring at least two pixel value step points according to the gradient amplitudes of the pixel points on four edges of the best matching block of the target image block in the direction of the corresponding edge, wherein the pixel value step points are pixel points with the largest gradient amplitude and a gradient amplitude greater than a threshold gradient amplitude among the pixel points on any edge of the best matching block of the target image block; acquiring two pixel value step points with the largest gradient amplitude among the at least two pixel value step points; and determining the segmentation start point and the segmentation end point according to the two pixel value step points with the largest gradient amplitude and the motion vector of the best matching block of the target image block, respectively.
- Step d determining the segmentation curve according to the segmentation starting point, the segmentation end point and the dividing line.
- S352 Divide the target image block into a first sub-image block and a second sub-image block based on the segmentation curve.
- since the segmentation curve is a continuous curve, the continuous segmentation curve may pass through the smallest unit (pixel point) of the digital image, and the smallest unit of the digital image cannot be further segmented and encoded, the segmentation curve cannot be directly applied to the segmentation of the digital image.
- the embodiment of the present disclosure first samples the segmentation curve with an integer pixel accuracy to sample the segmentation curve into a discrete curve, and then segments the target image block into two sub-image blocks through the sampled curve (the sampling result of the segmentation curve), thereby avoiding segmenting different parts of a pixel point into different sub-image blocks.
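Integer-pixel sampling of the segmentation curve can be sketched as follows. This is a simplified illustration assuming a second-order Bezier curve that runs from the left edge to the right edge of the block, so that each column is split once; pixels on the sampled curve are assigned to the second sub-image block by convention. The function names and that convention are assumptions.

```python
import numpy as np


def sample_curve_integer(p0, p1, p2, n=200):
    """Sample a second-order Bezier curve and round to integer pixel positions."""
    t = np.linspace(0.0, 1.0, n)[:, None]
    p0, p1, p2 = (np.asarray(p, dtype=np.float64) for p in (p0, p1, p2))
    pts = (1 - t) ** 2 * p0 + 2 * (1 - t) * t * p1 + t ** 2 * p2
    pix = np.rint(pts).astype(int)
    # Drop consecutive duplicates produced by rounding.
    keep = np.ones(len(pix), dtype=bool)
    keep[1:] = np.any(pix[1:] != pix[:-1], axis=1)
    return pix[keep]


def split_block(width, height, curve_pixels):
    """Return a mask that is True for the first sub-image block.

    In each column, pixels strictly above the topmost curve pixel belong to
    the first sub-image block; the rest belong to the second. Columns the
    curve never reaches are assigned entirely to the first sub-image block.
    """
    boundary_y = np.full(width, height, dtype=int)
    for x, y in curve_pixels:
        if 0 <= x < width:
            boundary_y[x] = min(boundary_y[x], y)
    rows = np.arange(height)[:, None]
    return rows < boundary_y[None, :]


curve = sample_curve_integer((0, 2), (4, 7), (7, 3))
mask = split_block(8, 8, curve)      # mask -> first sub-block, ~mask -> second
```

Since every pixel is either in `mask` or in its complement, no pixel point is shared between the two sub-image blocks.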
- S353 Obtain a motion vector of each pixel in the target image block according to the image block that has been encoded in the target video frame.
- the motion vector of each pixel point of the image block that has not been encoded in the target video frame is predicted according to the motion vector of each pixel point of the image block that has been encoded in the target video frame, and the motion vector of each pixel point of the target image block (current block) is selected from the prediction result.
- S354 Obtain a first motion vector of the first sub-image block and a second motion vector of the second sub-image block according to the motion vector of each pixel in the target image block.
- the motion vector of each pixel in the target image block is obtained, and according to the motion vector of each pixel in the first sub-image block, the first motion vector of the first sub-image block is obtained;
- the motion vector of each pixel in the target image block is obtained, and according to the motion vector of each pixel in the second sub-image block, the second motion vector of the second sub-image block is obtained.
- Step 1 obtaining a prediction block of the first sub-image block according to the first motion vector and a reference block of the first sub-image block.
- Step 3 Combine the prediction block of the first sub-image block and the prediction block of the second sub-image block to obtain the prediction block of the target image block.
- Step 4 Calculate the residual between the target image block and the prediction block of the target image block to obtain residual data of the target image block.
- generating the encoding data of the target image block also includes: encoding information such as the segmentation curve, motion vector prediction mode, etc. of the target image block, and encapsulating the encoding results of the segmentation curve, motion vector prediction mode, etc. of the target image block into the encoding data of the target image block.
- the video encoding method when encoding the target image block obtained by block division of the target video frame, first obtains the segmentation curve of the target image block based on the texture edge in the target image block, and divides the target image block into a first sub-image block and a second sub-image block based on the segmentation curve, then obtains the motion vector of each pixel in the target image block according to the image block that has been encoded in the target video frame, and then obtains the first motion vector of the first sub-image block and the second motion vector of the second sub-image block according to the motion vector of each pixel in the target image block, and encodes the target image block according to the first motion vector and the second motion vector.
- S3602 Divide the target image block into a first sub-image block and a second sub-image block based on the segmentation curve.
- steps S3601 and S3602 may refer to the implementation of steps S351 and S352 in the above embodiment, and will not be described in detail here to avoid redundancy.
- mask processing is performed on brightness components, horizontal motion vectors, and vertical motion vectors of pixels in at least one image block among multiple image blocks, including: randomly sampling multiple image blocks obtained by block division using a sampling strategy that obeys uniform distribution, and then masking the brightness components, horizontal motion vectors, and vertical motion vectors of pixels in unsampled image blocks.
- a vertical motion vector matrix corresponding to the sample video frame is generated according to the vertical motion vector of each pixel point of the sample video frame; the dimension of the vertical motion vector matrix corresponding to the sample video frame is W*H, and the elements in the vertical motion vector matrix corresponding to the sample video frame are respectively the vertical motion vector of a pixel point of the sample video frame.
- the decoding module 372 is used to predict the feature information of the masked image block according to the encoded representation of the model input output by the encoding module 371. That is, when the motion vector prediction model is trained, a complete video frame is divided into multiple image blocks, and a part of the image blocks is randomly masked (the information of that part of the image blocks is set to empty); the information of the image blocks that are not masked (brightness component, horizontal motion vector, vertical motion vector) is then input to the encoding module 371, which extracts the features of those image blocks. Then, the features extracted by the encoding module 371 and the masked image area are input into the decoding module 372 together to generate complete image information (brightness components, horizontal motion vectors, and vertical motion vectors of all pixels of the video frame).
- the encoded image block can be reconstructed according to the encoded data of the encoded image block, and the brightness value, horizontal motion vector and vertical motion vector of each pixel point of the reconstructed image block can be obtained to obtain the brightness value, horizontal motion vector and vertical motion vector of each pixel point of the encoded image block in the target video frame.
- the elements in the brightness component matrix respectively correspond to the pixel points of the target video frame; constructing the brightness component matrix according to the brightness values of each pixel point of the image block that has been encoded in the target video frame includes: setting the values of the elements in the brightness component matrix corresponding to the pixel points of the image block that has been encoded to the brightness values of the corresponding pixel points, and setting the values of the elements in the brightness component matrix corresponding to the pixel points of the image block that has not been encoded to the first preset value.
- FIG. 38 takes the first preset value as 0 and the size of the target video frame 600 as 8*8 as an example. Since the elements in the brightness component matrix 3800 correspond to the pixel points of the target video frame respectively, the brightness component matrix 3800 is an 8*8 matrix. The values of the elements corresponding to the pixel points of the image block that has been encoded in the brightness component matrix 3800 are the brightness values of the corresponding pixel points.
- the pixel point in the 5th row and 4th column of the target video frame 600 is a pixel point of the uncoded image block, so the value of the element in the 5th row and 4th column of the brightness component matrix 3800 is 0.
- the pixel point corresponding to the element in the 8th row and 6th column of the brightness component matrix 3800 is the pixel point in the 8th row and 6th column of the target video frame 600
- the pixel point in the 8th row and 6th column of the target video frame 600 is a pixel point of the uncoded image block, so the value of the element in the 8th row and 6th column of the brightness component matrix 3800 is 0.
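Constructing such a matrix amounts to copying values at encoded pixel points and writing the preset value elsewhere. The sketch below assumes a hypothetical 8*8 frame whose first four rows have been encoded; the actual layout of FIG. 38 is not reproduced here, and the function name is an assumption. The same helper applies to the horizontal and vertical motion vector matrices with their own preset values.

```python
import numpy as np


def build_component_matrix(values, encoded_mask, preset):
    """Keep values at encoded pixel points and fill the preset value elsewhere."""
    return np.where(encoded_mask, values, preset)


# Hypothetical 8*8 frame: brightness values 1..64, first four rows already encoded.
luma = np.arange(1, 65).reshape(8, 8)
encoded = np.zeros((8, 8), dtype=bool)
encoded[:4, :] = True

brightness_matrix = build_component_matrix(luma, encoded, preset=0)
# Row 5, column 4 (1-based) is index [4, 3]; it lies in an uncoded block.
print(brightness_matrix[4, 3])
```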
- Step (3) constructing a horizontal motion vector matrix according to the horizontal motion vectors of each pixel of the image block that has been encoded in the target video frame.
- the elements in the horizontal motion vector matrix and the vertical motion vector matrix correspond to the pixels of the target video frame respectively; constructing the horizontal motion vector matrix according to the horizontal motion vectors of the pixels of the image blocks that have been encoded in the target video frame includes: setting the values of the elements in the horizontal motion vector matrix corresponding to the pixels of the image blocks that have been encoded to the horizontal motion vectors of the corresponding pixels, and setting the values of the elements in the horizontal motion vector matrix corresponding to the pixels of the image blocks that have not been encoded to the second preset value.
- FIG. 39 takes the second preset value as x0 and the size of the target video frame 600 as 8*8 as an example. Since the elements in the horizontal motion vector matrix 3900 correspond to the pixels of the target video frame respectively, the horizontal motion vector matrix 3900 is an 8*8 matrix. The values of the elements corresponding to the pixels of the image blocks that have been encoded in the horizontal motion vector matrix 3900 are set to the horizontal motion vectors of the corresponding pixels.
- the pixel corresponding to the element in the 1st row and 3rd column of the horizontal motion vector matrix 3900 is the pixel in the 1st row and 3rd column of the target video frame 600, so the value of the element in the 1st row and 3rd column of the horizontal motion vector matrix 3900 is the horizontal motion vector x13 of the pixel in the 1st row and 3rd column of the target video frame 600.
- the pixel corresponding to the element in the 2nd row and 4th column of the vertical motion vector matrix 4000 is the pixel in the 2nd row and 4th column of the target video frame 600, so the value of the element in the 2nd row and 4th column of the vertical motion vector matrix 4000 is the vertical motion vector x24 of the pixel in the 2nd row and 4th column of the target video frame 600.
- the pixel corresponding to the element in the 3rd row and 7th column of the vertical motion vector matrix 4000 is the pixel in the 3rd row and 7th column of the target video frame 600, so the value of the element in the 3rd row and 7th column of the vertical motion vector matrix 4000 is the vertical motion vector x37 of the pixel in the 3rd row and 7th column of the target video frame 600.
- the value of the element in the brightness component matrix corresponding to the pixel of the image block that has not been encoded is x0.
- the horizontal motion vector and the vertical motion vector of each pixel point of the target video frame can also be saved, and when encoding subsequent image blocks of the target video frame, the motion vector of each pixel point in the next image block can be obtained based on the horizontal motion vector and the vertical motion vector of each pixel point of the target video frame.
- the motion vector of each pixel in the target image block is obtained according to the horizontal motion vector and the vertical motion vector of each pixel of the target video frame output by the motion vector prediction model, including: extracting the horizontal motion vector and the vertical motion vector of each pixel in the target image block from the horizontal motion vector and the vertical motion vector of each pixel of the target video frame output by the motion vector prediction model, synthesizing the horizontal motion vector and the vertical motion vector of each pixel in the target image block, and obtaining the motion vector of each pixel in the target image block.
- the motion vectors of the pixels of the second sub-image block are added together and then divided by the number of pixels of the second sub-image block, and the resulting average motion vector is used as the second motion vector of the second sub-image block.
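The averaging described above can be sketched as follows, assuming the per-pixel motion vectors are stored as separate horizontal and vertical arrays and the sub-image block is given as a boolean mask; the function name and the sample values are illustrative only.

```python
import numpy as np


def sub_block_motion_vector(mv_x, mv_y, mask):
    """Average the per-pixel motion vectors over the pixels selected by mask."""
    count = int(mask.sum())
    return float(mv_x[mask].sum() / count), float(mv_y[mask].sum() / count)


# Made-up 4x4 example: the upper half moves by (2, -1), the lower half by (6, 3).
mv_x = np.vstack([np.full((2, 4), 2.0), np.full((2, 4), 6.0)])
mv_y = np.vstack([np.full((2, 4), -1.0), np.full((2, 4), 3.0)])
first_mask = np.zeros((4, 4), dtype=bool)
first_mask[:2, :] = True

first_mv = sub_block_motion_vector(mv_x, mv_y, first_mask)
second_mv = sub_block_motion_vector(mv_x, mv_y, ~first_mask)
```

In practice the mask would come from the integer-sampled segmentation curve rather than from a straight split.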
- encoding the target image block according to the first motion vector and the second motion vector includes:
- the rate-distortion cost corresponding to each motion vector prediction mode in the motion vector prediction mode set is calculated, and the target image block is encoded according to the motion vector of the first sub-image block and the motion vector of the second sub-image block predicted by the motion vector prediction mode with the smallest corresponding rate-distortion cost in the motion vector prediction mode set.
- the motion vector prediction mode set includes a target motion vector prediction mode for obtaining the first motion vector and the second motion vector; the rate-distortion cost corresponding to any motion vector prediction mode is the rate-distortion cost of encoding the target image block according to the motion vector of the first sub-image block and the motion vector of the second sub-image block predicted by the motion vector prediction mode.
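Selecting the mode with the smallest rate-distortion cost can be sketched as below. The text does not give the cost formula, so the commonly used Lagrangian form J = D + lambda * R is assumed; the mode names, distortion values, bit counts and lambda are made-up illustration values.

```python
def select_mode(candidates, lam):
    """Return the mode with the smallest cost J = D + lam * R, and that cost.

    candidates maps a mode name to (distortion, bits).
    """
    costs = {mode: d + lam * r for mode, (d, r) in candidates.items()}
    best = min(costs, key=costs.get)
    return best, costs[best]


# Hypothetical measurements for three motion vector prediction modes.
candidates = {
    "target": (120.0, 10),   # mode based on the predicted per-pixel motion vectors
    "merge": (150.0, 3),
    "mmvd": (110.0, 20),
}
best_mode, best_cost = select_mode(candidates, lam=5.0)
```

With these numbers the costs are 170.0, 165.0 and 210.0, so `merge` would be chosen; a different lambda can change the outcome.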
- Merge mode is an inter-frame prediction mode that allows the encoder to directly use the motion information of adjacent blocks (reference frame index and motion vector) as the motion parameters of the current image block during the encoding process.
- the encoder first establishes a motion vector candidate list, then traverses the list in a certain order, selects the motion vector with the lowest bit rate cost as the optimal motion vector, and encodes its index value.
- the candidate list of Merge mode can be divided into two types: spatial domain and temporal domain.
- the spatial domain candidate list includes B0, B1, B2, A0 and A1, where B0 represents the coding block to the upper right of the current coding block, B1 represents the coding block above the current coding block, B2 represents the coding block to the upper left of the current coding block, A0 represents the coding block to the lower left of the current coding block, and A1 represents the coding block to the left of the current coding block.
- the temporal candidate list is established using the coding blocks at the same position of the current coding block in the adjacent encoded video frames.
- MMVD mode: Merge mode with MVD (motion vector difference).
- the identification code corresponding to the motion vector prediction mode of the target image block can be obtained based on the syntax tree structure, and the identification code corresponding to the motion vector prediction mode of the target image block can be added to the encoded data of the target image block so that the decoding end can decode the target image block normally.
- the target image block is a rectangular image block obtained by dividing the target video frame into blocks.
- the encoding data of the target image block may be the encoded data obtained by encoding the target image block according to any of the above-mentioned video encoding methods.
- S412 Acquire a segmentation curve of the target image block according to the encoded data of the target image block.
- the information of the segmentation curve for segmenting the target image block into the first sub-image block and the second sub-image block includes: position information of the starting point, position information of the end point and position information of the control point of the second-order Bezier curve.
- the information of the segmentation curve for segmenting the target image block into the first sub-image block and the second sub-image block includes: the position information of the starting point, the end point, the first control point and the second control point of the third-order Bezier curve.
- the information of the segmentation curve for segmenting the target image block into the first sub-image block and the second sub-image block includes: position information of control points of the second-order Bezier curve.
- the information of the segmentation curve for segmenting the target image block into the first sub-image block and the second sub-image block includes: position information of the first control point and the position information of the second control point of the third-order Bezier curve.
- step S414 may refer to the implementation of step S351 in the above embodiment, and will not be repeated here to avoid redundancy.
- S416 Reconstruct the target image block according to the first motion vector and the second motion vector.
- the target image block is reconstructed according to the first motion vector and the second motion vector, including: obtaining residual data of the target image block according to the encoded data of the target image block, obtaining a prediction block corresponding to the first sub-image block and a prediction block corresponding to the second sub-image block according to the first motion vector of the first sub-image block and the second motion vector of the second sub-image block, combining the prediction block corresponding to the first sub-image block and the prediction block corresponding to the second sub-image block to obtain the prediction block corresponding to the target image block, and obtaining the reconstructed target image block according to the residual data of the target image block and the prediction block corresponding to the target image block.
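The reconstruction step can be sketched as combining the two prediction blocks through the segmentation mask and then adding the decoded residual. This is a minimal illustration: the function name is hypothetical, the mask is assumed to come from the decoded segmentation curve, and clipping to the 8-bit sample range is an assumption that the text does not state.

```python
import numpy as np


def reconstruct(pred_first, pred_second, first_mask, residual, max_value=255):
    """Combine two prediction blocks by mask, add the residual, clip to range."""
    prediction = np.where(first_mask, pred_first, pred_second).astype(np.int32)
    recon = prediction + residual.astype(np.int32)
    return np.clip(recon, 0, max_value)


# Made-up 4x4 example: left half from the first sub-block, right half from the second.
pred_first = np.full((4, 4), 100)
pred_second = np.full((4, 4), 200)
first_mask = np.zeros((4, 4), dtype=bool)
first_mask[:, :2] = True
residual = np.full((4, 4), 60)

block = reconstruct(pred_first, pred_second, first_mask, residual)
```

Here the left half becomes 160 and the right half is clipped from 260 to 255.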
- the video decoding method provided by the embodiment of the present disclosure can reconstruct the target image block based on the encoded data of the target image block obtained by the video encoding provided by the above embodiment to obtain the reconstructed image block of the target image block. Therefore, the embodiment of the present disclosure can improve the efficiency of video encoding while realizing normal decoding of the video.
- Predictive coding technology is an indispensable part of video coding. It uses the temporal and spatial correlations in the video sequence to remove redundant information in the video, thereby achieving the purpose of video compression.
- Predictive coding technology includes: intra-frame prediction and inter-frame prediction.
- Intra-frame prediction mainly uses already encoded and reconstructed pixel values, either directly or along a certain angle, to predict the coding block to be coded, so as to eliminate spatially redundant information, and only the prediction residual is transformed.
- Inter-frame prediction mainly involves two aspects: motion estimation and motion compensation.
- Motion estimation refers to dividing the video into several blocks, trying to search for the best matching block of each block in the adjacent video frame as a reference block, and deriving the relative offset between the two in spatial position as a motion vector (Motion Vector, MV).
- the process of obtaining the prediction block of the current coding block based on the motion vector and the reference block is called motion compensation.
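- The following sketch illustrates motion estimation and motion compensation as described above, using an exhaustive block search with the sum of absolute differences (SAD) as the matching criterion. The SAD criterion, the full-search strategy, and all function names are assumptions chosen for illustration; the disclosure does not prescribe a particular search method.

```python
def crop(frame, top, left, size):
    """Return the size x size block whose top-left corner is (left, top)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def estimate_motion(cur_frame, ref_frame, top, left, size, search_range):
    """Full search: return the motion vector (dx, dy) with the smallest SAD."""
    cur_block = crop(cur_frame, top, left, size)
    best_mv, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + size > len(ref_frame) or x + size > len(ref_frame[0]):
                continue
            cost = sad(cur_block, crop(ref_frame, y, x, size))
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


def compensate(ref_frame, top, left, size, mv):
    """Motion compensation: fetch the prediction block the vector points to."""
    dx, dy = mv
    return crop(ref_frame, top + dy, left + dx, size)


# Toy frames: a 2x2 patch sits at (1, 1) in the reference and at (3, 2) now.
ref_frame = [[0] * 6 for _ in range(6)]
cur_frame = [[0] * 6 for _ in range(6)]
patch = [[10, 20], [30, 40]]
for j in range(2):
    for i in range(2):
        ref_frame[1 + j][1 + i] = patch[j][i]
        cur_frame[2 + j][3 + i] = patch[j][i]

mv, cost = estimate_motion(cur_frame, ref_frame, top=2, left=3, size=2, search_range=2)
prediction = compensate(ref_frame, 2, 3, 2, mv)
print(mv, cost)  # (-2, -1) 0
```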
- the current mainstream video coding standards are based on the coding blocks obtained by segmenting the video frame rather than the complete video frame when performing predictive coding. Different coding parameters will be used between different coding blocks. The more accurate the coding block division is, the smaller the difference between it and the prediction block is, and the corresponding bit rate overhead is also smaller. Therefore, appropriate block division is crucial.
- H.266/VVC introduces a geometric partitioning mode, which uses an inclined straight line as a dividing line to divide the coding block into two non-rectangular sub-partitions.
- the motion vectors of the two sub-partitions need to be encoded separately, which greatly increases the codeword consumption, thereby affecting the coding efficiency of the video.
- the present disclosure provides the following technical solutions:
- the present disclosure provides a video encoding method, as shown in FIG. 42 , the video encoding method includes the following steps:
- the target image block is a rectangular image block obtained by dividing the target video frame into blocks.
- After receiving a video frame to be encoded, the encoder first divides the video frame into multiple image blocks, and encodes the image blocks obtained by the division as the minimum coding units.
- the encoding task for the image block can be further refined.
- the image block obtained by dividing the video frame to be encoded is called a coding tree unit (CTU), and the coding tree unit can be further divided into smaller square coding units (CU) by using a quadtree division method.
- the coding tree unit supports a maximum size of 64×64 and a minimum size of 16×16.
- the VVC standard expands the maximum size of the CTU to 128×128, and the further division method is no longer limited to quadtree division, but also supports binary tree and ternary tree division.
- the target image block in the embodiment of the present disclosure can be a coding tree unit, or a coding unit obtained by further dividing the coding tree unit, which is not limited in the embodiment of the present disclosure.
- the above step S421 (obtaining a segmentation curve of the target image block based on a texture edge in the target image block) includes the following steps 1 to 3:
- Step 1 Perform texture edge detection on the target image block to obtain the texture edge in the target image block.
- texture edge detection may be performed on the target image block using edge detection algorithms such as Sobel, Prewitt, Roberts, Canny, Marr-Hildreth, etc., to obtain texture edges in the target image block.
- Step 2 Segment the target image block into two image areas based on the texture edge in the target image block, and obtain a boundary line between the two image areas.
- the embodiment of the present disclosure may not further divide the sub-image blocks, but directly encode the image block as a coding unit.
- the texture edge obtained by performing texture edge detection on the image block may divide the image block into more than two image regions. For such image blocks, the multiple image regions can first be merged into two image regions, and then the subsequent video encoding steps can be performed.
- Step 3 Perform curve fitting on the boundary line to obtain a segmentation curve of the target image block.
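- Step 1 above can be sketched with the Sobel operator, one of the edge detection algorithms named in the text. The example below is only an illustration: the L1 magnitude |gx| + |gy|, the zero-valued border, the threshold value, and the function names are assumptions, and steps 2 and 3 (region merging and curve fitting) are not shown.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(block):
    """Approximate gradient magnitude |gx| + |gy| for interior pixels.

    Border pixels are left at 0 because the 3x3 kernel does not fit there.
    """
    height, width = len(block), len(block[0])
    magnitude = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = sum(SOBEL_X[j][i] * block[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * block[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            magnitude[y][x] = abs(gx) + abs(gy)
    return magnitude


def edge_map(block, threshold):
    """Mark pixels whose gradient magnitude exceeds the threshold."""
    return [[m > threshold for m in row] for row in sobel_magnitude(block)]


# 5x5 block with a vertical texture edge between columns 1 and 2.
block = [[0, 0, 100, 100, 100] for _ in range(5)]
magnitude = sobel_magnitude(block)
edges = edge_map(block, threshold=200)
print(magnitude[2])  # [0, 400, 400, 0, 0]
```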
- the above step S421 (obtaining a segmentation curve of the target image block based on a texture edge in the target image block) includes the following steps a to e:
- Step b dividing the target image block into two image areas based on the texture edge in the target image block, and obtaining a boundary line between the two image areas.
- Step c obtaining two endpoints of the segmentation curve according to the encoded video content.
- obtaining two endpoints of the segmentation curve according to the encoded video content includes: determining whether the spatially adjacent pixel points of the four edges of the target image block can be obtained in the encoded video content; if so, obtaining the segmentation start point and the segmentation end point according to the spatially adjacent pixel points of the four edges of the target image block; if not, obtaining the best matching block of the target image block in the encoded video content, and obtaining the segmentation start point and the segmentation end point according to the edge pixel points of the best matching block of the target image block.
- obtaining the segmentation starting point and the segmentation end point based on the spatially adjacent pixel points at the four edges of the target image block includes: obtaining the gradient amplitude of the spatially adjacent pixel points at the four edges of the target image block in the direction of the corresponding edge; obtaining at least two pixel value step points based on the gradient amplitude of the spatially adjacent pixel points at the four edges of the target image block in the direction of the corresponding edge, the pixel value step point being a pixel point with the largest gradient amplitude and a gradient amplitude greater than a threshold gradient amplitude among the spatially adjacent pixel points at any edge of the target image block; and respectively determining the two pixel value step points with the largest gradient amplitude among the at least two pixel value step points as the segmentation starting point and the segmentation end point.
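- The selection of the segmentation starting point and end point from the neighboring pixels can be sketched as follows. This is a hedged illustration: the gradient along an edge is approximated by the absolute difference of consecutive neighbors, a step point is reported as the index of the pixel just before the largest jump, and the edge names, threshold and function names are hypothetical.

```python
def gradient_along(pixels):
    """Absolute difference between consecutive pixels along one edge."""
    return [abs(pixels[i + 1] - pixels[i]) for i in range(len(pixels) - 1)]


def step_point(pixels, threshold):
    """Return (index, amplitude) of the largest gradient on one edge,
    or None when that gradient does not exceed the threshold."""
    grads = gradient_along(pixels)
    if not grads:
        return None
    idx = max(range(len(grads)), key=grads.__getitem__)
    return (idx, grads[idx]) if grads[idx] > threshold else None


def segmentation_endpoints(edge_neighbors, threshold):
    """Pick the two step points with the largest gradient amplitude.

    edge_neighbors maps an edge name to the pixel values adjacent to it.
    Returns ((edge, index), (edge, index)) or None if fewer than two exist.
    """
    candidates = []
    for edge, pixels in edge_neighbors.items():
        found = step_point(pixels, threshold)
        if found is not None:
            idx, amplitude = found
            candidates.append((amplitude, edge, idx))
    if len(candidates) < 2:
        return None
    candidates.sort(key=lambda c: c[0], reverse=True)
    (_, edge0, idx0), (_, edge1, idx1) = candidates[0], candidates[1]
    return (edge0, idx0), (edge1, idx1)


neighbors = {
    "top": [10, 10, 10, 90, 90, 90, 90, 90],
    "bottom": [10, 10, 10, 10, 10, 80, 80, 80],
    "left": [10] * 8,
    "right": [90, 90, 88, 80, 80, 80, 80, 80],
}
endpoints = segmentation_endpoints(neighbors, threshold=20)
print(endpoints)  # (('top', 2), ('bottom', 4))
```

- Because this procedure only uses already encoded pixels, a decoder could repeat it and arrive at the same two endpoints without receiving them in the bitstream.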
- the acquiring the segmentation start point and the segmentation end point according to the edge pixel points of the best matching block of the target image block comprises: acquiring the gradient amplitudes of the pixel points on four edges of the best matching block of the target image block in the direction of the corresponding edge; acquiring at least two pixel value step points according to the gradient amplitudes of the pixel points on four edges of the best matching block of the target image block in the direction of the corresponding edge, wherein the pixel value step points are pixel points with the largest gradient amplitude and a gradient amplitude greater than a threshold gradient amplitude among the pixel points on any edge of the best matching block of the target image block; acquiring two pixel value step points with the largest gradient amplitude among the at least two pixel value step points; and determining the segmentation start point and the segmentation end point according to the two pixel value step points with the largest gradient amplitude and the motion vector of the best matching block of the target image block, respectively.
- the segmentation curve is determined according to the two endpoints of the segmentation curve and the dividing line, including: fitting the dividing line with a third-order Bezier curve with the segmentation starting point and the segmentation end point as the two endpoints of the third-order Bezier curve to obtain a first control point and a second control point; determining whether the first control point and the second control point are located on the same side of a straight line connecting the segmentation starting point and the segmentation end point; if not, obtaining the segmentation curve according to the segmentation starting point, the segmentation end point, the first control point and the second control point; if yes, fitting the dividing line with a second-order Bezier curve with the segmentation starting point and the segmentation end point as the two endpoints of the second-order Bezier curve to obtain control points, and obtaining the segmentation curve according to the segmentation starting point, the segmentation end point and the control points.
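- The choice between a third-order and a second-order Bezier curve described above depends on whether the two fitted control points lie on the same side of the line through the endpoints. The sketch below shows that side test with a 2D cross product, together with the standard Bezier evaluation formulas. The curve fitting itself is omitted, and the control point coordinates are made-up values for illustration.

```python
def cross_side(point, start, end):
    """Cross product (end - start) x (point - start); its sign tells on
    which side of the line start-end the point lies (0 means on the line)."""
    return ((end[0] - start[0]) * (point[1] - start[1])
            - (end[1] - start[1]) * (point[0] - start[0]))


def same_side(c1, c2, start, end):
    """True when both control points lie strictly on the same side."""
    return cross_side(c1, start, end) * cross_side(c2, start, end) > 0


def cubic_bezier(p0, p1, p2, p3, t):
    """Point on a third-order Bezier curve at parameter t in [0, 1]."""
    u = 1 - t
    return tuple(u ** 3 * a + 3 * u * u * t * b + 3 * u * t * t * c + t ** 3 * d
                 for a, b, c, d in zip(p0, p1, p2, p3))


def quadratic_bezier(p0, p1, p2, t):
    """Point on a second-order Bezier curve at parameter t in [0, 1]."""
    u = 1 - t
    return tuple(u * u * a + 2 * u * t * b + t * t * c
                 for a, b, c in zip(p0, p1, p2))


start, end = (0, 0), (8, 8)
c1, c2 = (2, 6), (6, 2)  # hypothetical fitted control points

if same_side(c1, c2, start, end):
    # Same side: the text refits with a second-order curve (fit not shown).
    midpoint = quadratic_bezier(start, c1, end, 0.5)
else:
    # Opposite sides: keep the third-order curve.
    midpoint = cubic_bezier(start, c1, c2, end, 0.5)
print(midpoint)  # (4.0, 4.0)
```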
- the decoding end can also obtain the segmentation starting point and the segmentation end point according to the video content that has been encoded. Therefore, the encoding end only needs to add the control information of the segmentation curve to the bitstream data corresponding to the target image block, and the decoding end can obtain the complete segmentation curve information. Therefore, the embodiment of the present disclosure can avoid adding the segmentation starting point and the segmentation end point of the segmentation curve to the bitstream data corresponding to the target image block, thereby reducing the bit rate overhead caused by representing the segmentation curve.
- the number of pixels included in the first sub-image block is greater than the number of pixels included in the second sub-image block.
- segmenting the target image block into a first sub-image block and a second sub-image block using the segmentation curve includes: sampling the segmentation curve with integer pixel accuracy, obtaining the sampling result of the segmentation curve, and segmenting the target image block into the first sub-image block and the second sub-image block based on the sampling result of the segmentation curve.
- the segmentation curve is a continuous curve
- the continuous segmentation curve may pass through the smallest unit (pixel point) of the digital image, and the smallest unit of the digital image cannot be segmented and encoded, so the segmentation curve cannot be directly applied to the segmentation of the digital image.
- the embodiment of the present disclosure first samples the segmentation curve with integer pixel accuracy to sample the segmentation curve into a discrete curve, and then segments the target image block into two sub-image blocks through the sampled curve (the sampling result of the segmentation curve), thereby avoiding segmenting a pixel point into different sub-image blocks.
- the embodiment of the present disclosure first divides the target image block into two sub-image blocks through the segmentation curve, and defines the sub-image block containing more pixels among the two sub-image blocks as the first sub-image block, and defines the sub-image block containing fewer pixels among the two sub-image blocks as the second sub-image block.
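- The integer-pixel sampling and the naming of the two sub-image blocks can be sketched as follows. This simplified example assumes the curve runs from the top edge to the bottom edge and crosses each pixel row once, assigns pixels lying on the sampled curve to the right-hand region, and uses a straight line as a stand-in curve so the result is easy to check; these choices and the function names are assumptions, not part of the disclosure.

```python
import math


def sample_curve(curve, num_samples):
    """Sample curve(t) for t in [0, 1] and round each point to a pixel."""
    pixels = []
    for i in range(num_samples + 1):
        x, y = curve(i / num_samples)
        pixel = (math.floor(x + 0.5), math.floor(y + 0.5))
        if not pixels or pixels[-1] != pixel:
            pixels.append(pixel)  # drop consecutive duplicates
    return pixels


def split_block(width, height, curve_pixels):
    """Split the block along the sampled curve, one boundary column per row.

    Returns (first, second): the region with more pixels is the first
    sub-image block, the other one is the second sub-image block.
    """
    boundary = {}
    for x, y in curve_pixels:
        boundary[y] = min(x, boundary.get(y, width))
    left, right = [], []
    for y in range(height):
        for x in range(width):
            # Whole pixels go to one side, so no pixel is ever split.
            (left if x < boundary.get(y, width) else right).append((x, y))
    return (left, right) if len(left) >= len(right) else (right, left)


# Stand-in curve: a straight line from (2, 0) to (5.5, 7) in an 8x8 block.
curve = lambda t: (2 + 3.5 * t, 7 * t)
curve_pixels = sample_curve(curve, 64)
first, second = split_block(8, 8, curve_pixels)
print(len(first), len(second))  # 36 28
```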
- each segmentation mode uniquely corresponds to a candidate motion vector derivation table.
- a correspondence table including each segmentation mode and the candidate motion vector derivation table can be established in advance, and the candidate motion vector derivation table corresponding to the segmentation mode can be determined based on the segmentation mode of the target image block and the correspondence table.
- the target motion vector group is selected from the candidate motion vector derivation table based on the coding cost, including: calculating the coding cost corresponding to each motion vector group in the candidate motion vector derivation table, and determining the motion vector group with the smallest coding cost in the candidate motion vector derivation table as the target motion vector group.
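- Selecting the target motion vector group amounts to taking the minimum over the table. In the sketch below the coding cost values are invented placeholders; in practice the cost would come from the encoder's own cost measure, which the text does not specify.

```python
def select_target_group(table, cost_fn):
    """Return the (index, group) entry with the smallest coding cost.

    table is a list of (index_number, motion_vector_group) pairs.
    """
    return min(table, key=lambda entry: cost_fn(entry[1]))


# Hypothetical table excerpt and made-up coding costs.
table = [(0, ("t", "b2")), (1, ("t", "a1")), (2, ("a0", "b2"))]
costs = {("t", "b2"): 120, ("t", "a1"): 95, ("a0", "b2"): 140}

target_index, target_group = select_target_group(table, costs.__getitem__)
print(target_index, target_group)  # 1 ('t', 'a1')
```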
- generating the encoding data of the target image block according to the target motion vector group and the index number of the target motion vector group includes:
- the second sub-image block determines the segmentation mode of the target image block according to the position information of the endpoints of the segmentation curve and the position information of the first sub-image block, and obtain the candidate motion vector derivation table corresponding to the segmentation mode, and finally determine the target motion vector group to be selected from the candidate motion vector derivation table according to the coding cost, and generate the coding data of the target image block according to the target motion vector group and the index number of the target motion vector group.
- the video coding method provided by the embodiment of the present disclosure can represent, through a single index number in the coding data, the motion vectors of the two sub-image blocks obtained by segmenting the target image block with the segmentation curve. Compared with the related art, which uses two index numbers in the coding data to respectively represent the motion vectors of the two sub-image blocks, the embodiment of the present disclosure reduces the number of index numbers representing the motion vectors, thereby reducing the bit rate of the video and improving the coding efficiency of the video.
- the coded image blocks adjacent to the target image block 4300 in the spatial domain include: the coded image block A 0 located at the lower left of the current image block, the coded image block A 1 located at the left side of the current image block, the coded image block B 2 located at the upper left of the current image block, the coded image block B 1 located above the current image block, and the coded image block B 0 located at the upper right of the current image block.
- the coded image block adjacent to the current image block in the spatial domain is the coded image block T, so the candidate motion vector of the target image block includes: the first motion vector t obtained according to the coded image block T, the second motion vector a0 of the coded image block A 0 , the third motion vector b0 of the coded image block B 0 , the fourth motion vector b2 of the coded image block B 2 , the fifth motion vector a1 of the coded image block A 1 , and the sixth motion vector b1 of the coded image block B 1 .
- the following describes in detail how to determine the segmentation mode of the target image block based on the position information of the endpoints of the segmentation curve and the position information of the first sub-image block, and how to derive a candidate motion vector table corresponding to the segmentation mode.
- the segmentation mode of the target image block is determined to be the first segmentation mode.
- the first candidate motion vector set is {t, a0, b0}; the second candidate motion vector set is {b2, a1, b1}, and the motion vector of the first sub-image block selected from {t, a0, b0} and the motion vector of the second sub-image block selected from {b2, a1, b1} are combined, and the obtained multiple motion vector groups include: (t, b2), (t, a1), (t, b1), (a0, b2), (a0, a1), (a0, b1), (b0, b2), (b0, a1), (b0, b1).
- obtaining a candidate motion vector derivation table corresponding to the segmentation mode according to the candidate motion vector of the target image block further includes: after obtaining the multiple motion vector groups, sorting the multiple motion vector groups in descending order according to the selection probabilities of the multiple motion vector groups to obtain the sorting results of the multiple motion vector groups, and assigning index numbers to the multiple motion vector groups in order from small to large according to the sorting results of the multiple motion vector groups to generate the candidate motion vector derivation table corresponding to the segmentation mode, wherein the selection probability of any motion vector group is used to represent the probability of selecting the motion vector group as the target motion vector group.
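- The construction of a candidate motion vector derivation table can be sketched as a Cartesian product followed by a sort. The text gives no numeric selection probabilities, so the integer weights below are invented solely to reproduce a plausible ordering for the first segmentation mode; the function name is likewise hypothetical.

```python
from itertools import product


def build_derivation_table(first_set, second_set, selection_probability):
    """Combine the two candidate sets, sort the groups by descending
    selection probability, and assign index numbers from small to large."""
    groups = list(product(first_set, second_set))
    # sorted() is stable, so groups with equal probability keep their order.
    ranked = sorted(groups, key=selection_probability, reverse=True)
    return dict(enumerate(ranked))


first_set = ["t", "a0", "b0"]    # candidates for the first sub-image block
second_set = ["b2", "a1", "b1"]  # candidates for the second sub-image block

# Made-up weights standing in for the selection probabilities.
first_weight = {"t": 6, "a0": 2, "b0": 2}
second_weight = {"b2": 5, "a1": 3, "b1": 3}


def probability(group):
    return first_weight[group[0]] * second_weight[group[1]]


table = build_derivation_table(first_set, second_set, probability)
for index, group in table.items():
    print(index, group)
```

- With these weights the most likely pair (t, b2) receives index 0, so a variable-length code can spend the fewest bits on it.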
- the selection probabilities of (t, a1) and (t, b1) are the same, the selection probabilities of (a0, b2) and (b0, b2) are the same, and the selection probabilities of (a0, a1), (a0, b1), (b0, a1), and (b0, b1) are the same. Because the order of motion vector groups with the same selection probability can be adjusted arbitrarily, the order and index numbers of (t, a1) and (t, b1), the order and index numbers of (a0, b2) and (b0, b2), and the order and index numbers of (a0, a1), (a0, b1), (b0, a1), and (b0, b1) can be adjusted to generate candidate motion vector derivation tables corresponding to other forms of the first segmentation mode.
- the probability that the first sub-image block Z A selects the fourth motion vector b2 as the motion vector is the highest, followed by the fifth motion vector a1 and the sixth motion vector b1.
- the probability that the second sub-image block Z B selects the first motion vector t as the motion vector is the highest, followed by the second motion vector a0 and the third motion vector b0. Because the number of pixels included in the first sub-image block is greater than the number of pixels included in the second sub-image block, the first sub-image block has a greater impact on the encoding cost.
- the probability of the first sub-image block Z A selecting the second motion vector a0 as the motion vector is the highest, followed by the fourth motion vector b2 and the fifth motion vector a1.
- the probability of the second sub-image block Z B selecting the third motion vector b0 as the motion vector is the highest, followed by the sixth motion vector b1. Because the number of pixels included in the first sub-image block is greater than the number of pixels included in the second sub-image block, the first sub-image block has a greater impact on the encoding cost.
- the order and index numbers of (b2, b0) and (a1, b0) can also be adjusted to generate candidate motion vector derivation tables corresponding to other forms of third segmentation modes.
- the segmentation mode of the target image block is determined to be the fourth segmentation mode.
- the probability of the first sub-image block Z A selecting the third motion vector b0 as the motion vector is the highest, followed by the sixth motion vector b1 and the fourth motion vector b2.
- the probability of the second sub-image block Z B selecting the second motion vector a0 as the motion vector is the highest, followed by the fifth motion vector a1 and the fourth motion vector b2. Because the number of pixels included in the first sub-image block is greater than the number of pixels included in the second sub-image block, the first sub-image block has a greater impact on the encoding cost.
- the multiple motion vector groups are ranked from small to large.
- the candidate motion vector derivation table corresponding to the fourth segmentation mode generated by assigning index numbers to the motion vector group can be shown in the following Table 6:
- the selection probabilities of (b0, b2) and (b0, a1) are the same, the selection probabilities of (b1, a0) and (b2, a0) are the same, and the selection probabilities of (b2, b2), (b2, a1), (b1, b2), and (b1, a1) are the same, the order and index numbers of (b0, b2) and (b0, a1), the order and index numbers of (b1, a0) and (b2, a0), and the order and index numbers of (b2, b2), (b2, a1), (b1, b2), and (b1, a1) can also be adjusted to generate candidate motion vector derivation tables corresponding to other forms of the fourth segmentation mode.
- the segmentation mode of the target image block is determined to be the fifth segmentation mode.
- the motion vector selected by the first sub-image block from the fifth candidate motion vector set and the motion vector selected by the second sub-image block from the sixth candidate motion vector set are combined to obtain the multiple motion vector groups.
- the fifth candidate motion vector set includes: the third motion vector, the fourth motion vector and the sixth motion vector;
- the sixth candidate motion vector set includes: the second motion vector and the fifth motion vector.
- the probability of the first sub-image block Z A selecting the third motion vector b0 as the motion vector is the highest, followed by the fourth motion vector b2 and the sixth motion vector b1.
- the probability of the second sub-image block Z B selecting the second motion vector a0 as the motion vector is the highest, followed by the fifth motion vector a1. Because the number of pixels included in the first sub-image block is greater than the number of pixels included in the second sub-image block, the first sub-image block has a greater impact on the encoding cost.
- the motion vector selected by the first sub-image block from the seventh candidate motion vector set and the motion vector selected by the second sub-image block from the fifth candidate motion vector set are combined to obtain the plurality of motion vector groups.
- the fifth candidate motion vector set includes: the third motion vector, the fourth motion vector and the sixth motion vector; and the seventh candidate motion vector set includes: the first motion vector, the second motion vector and the fourth motion vector.
- the seventh candidate motion vector set is {t, a0, b2}; the fifth candidate motion vector set is {b0, b2, b1}, and the motion vector of the first sub-image block selected from {t, a0, b2} and the motion vector of the second sub-image block selected from {b0, b2, b1} are combined, and the obtained multiple motion vector groups include: (t, b0), (t, b2), (t, b1), (a0, b0), (a0, b2), (a0, b1), (b2, b0), (b2, b2), (b2, b1).
- the order and index numbers of (a0, b1), (a0, b2), the order and index numbers of (b2, b0), (t, b0), and the order and index numbers of (b2, b1), (b2, b2), (t, b1), and (t, b2) can also be adjusted to generate candidate motion vector derivation tables corresponding to other forms of the sixth segmentation mode.
- the segmentation mode of the target image block is determined to be the seventh segmentation mode.
- the motion vector selected by the first sub-image block from the second candidate motion vector set and the motion vector selected by the second sub-image block from the first candidate motion vector set are combined to obtain the plurality of motion vector groups.
- the first candidate motion vector set includes: the first motion vector, the second motion vector and the third motion vector;
- the second candidate motion vector set includes: the fourth motion vector, the fifth motion vector and the sixth motion vector.
- the first candidate motion vector set is {t, a0, b0};
- the second candidate motion vector set is {b2, a1, b1}, and the motion vector of the first sub-image block selected from {b2, a1, b1} and the motion vector of the second sub-image block selected from {t, a0, b0} are combined, and the obtained multiple motion vector groups include: (b2, t), (b2, a0), (b2, b0), (a1, t), (a1, a0), (a1, b0), (b1, t), (b1, a0), (b1, b0).
- the probability that the first sub-image block Z A selects the fourth motion vector b2 as the motion vector is the highest, followed by the fifth motion vector a1 and the sixth motion vector b1.
- the probability that the second sub-image block Z B selects the first motion vector t as the motion vector is the highest, followed by the second motion vector a0 and the third motion vector b0. Because the number of pixels included in the first sub-image block is greater than the number of pixels included in the second sub-image block, the first sub-image block has a greater impact on the encoding cost.
- the order and index numbers of (b2, b0) and (b2, a0), the order and index numbers of (b1, t) and (a1, t), and the order and index numbers of (b1, b0), (b1, a0), (a1, b0), and (a1, a0) can also be adjusted to generate candidate motion vector derivation tables corresponding to other forms of the seventh segmentation mode.
- the first candidate motion vector set is {t, a0, b0}; the second candidate motion vector set is {b2, a1, b1},
- the motion vector of the first sub-image block selected from {t, a0, b0} and the motion vector of the second sub-image block selected from {b2, a1, b1} are combined, and the obtained multiple motion vector groups include: (t, b2), (t, a1), (t, b1), (a0, b2), (a0, a1), (a0, b1), (b0, b2), (b0, a1), (b0, b1).
- the selection probabilities of (t, a1) and (t, b1) are the same, the selection probabilities of (a0, b2) and (b0, b2) are the same, and the selection probabilities of (a0, a1), (a0, b1), (b0, a1), and (b0, b1) are the same. Because the order of motion vector groups with the same selection probability can be adjusted arbitrarily, the order and index numbers of (t, a1) and (t, b1), the order and index numbers of (a0, b2) and (b0, b2), and the order and index numbers of (a0, a1), (a0, b1), (b0, a1), and (b0, b1) can also be adjusted to generate candidate motion vector derivation tables corresponding to other forms of the eighth segmentation mode.
- the segmentation mode of the target image block is determined to be the ninth segmentation mode.
- the motion vector selected by the first sub-image block from the fifth candidate motion vector set and the motion vector selected by the second sub-image block from the eighth candidate motion vector set are combined to obtain the plurality of motion vector groups.
- the fifth candidate motion vector set includes: the third motion vector, the fourth motion vector and the sixth motion vector; and the eighth candidate motion vector set includes: the first motion vector, the second motion vector and the fifth motion vector.
- the segmentation mode of the target image block is determined to be the tenth segmentation mode.
- the motion vector selected by the first sub-image block from the fifth candidate motion vector set and the motion vector selected by the second sub-image block from the eighth candidate motion vector set are combined to obtain the plurality of motion vector groups.
- the fifth candidate motion vector set includes: the third motion vector, the fourth motion vector and the sixth motion vector; and the eighth candidate motion vector set includes: the first motion vector, the second motion vector and the fifth motion vector.
- the fifth candidate motion vector set is {b0, b1, b2};
- the eighth candidate motion vector set is {t, a1, a0}, and the motion vector of the first sub-image block selected from {b0, b1, b2} and the motion vector of the second sub-image block selected from {t, a1, a0} are combined, and the obtained multiple motion vector groups include: (b0, t), (b0, a1), (b0, a0), (b1, t), (b1, a1), (b1, a0), (b2, t), (b2, a1), (b2, a0).
- the probability that the first sub-image block Z A selects the fourth motion vector b2 as the motion vector is the highest, followed by the third motion vector b0 and the sixth motion vector b1.
- the probability that the second sub-image block Z B selects the first motion vector t as the motion vector is the highest, followed by the second motion vector a0 and the fifth motion vector a1. Because the number of pixels included in the first sub-image block is greater than the number of pixels included in the second sub-image block, the first sub-image block has a greater impact on the encoding cost.
- the segmentation mode of the target image block is determined to be the eleventh segmentation mode.
- the order of motion vector groups with the same selection probability can be adjusted arbitrarily, the order and index numbers of (a0, b1) and (a0, b2), the order and index numbers of (a1, b0) and (t, b0), and the order and index numbers of (a1, b1), (a1, b2), (t, b1), and (t, b2) can also be adjusted to generate candidate motion vector derivation tables corresponding to other forms of the eleventh segmentation mode.
- the segmentation mode of the target image block is determined to be the twelfth segmentation mode.
- the segmentation mode of the target image block is determined to be the thirteenth segmentation mode.
- the probability of the first sub-image block Z A selecting the second motion vector a0 as the motion vector is the highest, followed by the fourth motion vector b2 and the fifth motion vector a1.
- the probability of the second sub-image block Z B selecting the third motion vector b0 as the motion vector is the highest, followed by the first motion vector t and the sixth motion vector b1. Because the number of pixels included in the first sub-image block is greater than the number of pixels included in the second sub-image block, the first sub-image block has a greater impact on the encoding cost.
- because the order of motion vector groups with the same selection probability can be adjusted arbitrarily, the order and index numbers of (a0, b1) and (a0, t), the order and index numbers of (a1, b0) and (b2, b0), and the order and index numbers of (a1, b1), (a1, t), (b2, b1), and (b2, t) can also be adjusted to generate candidate motion vector derivation tables corresponding to other forms of the thirteenth segmentation mode.
- the segmentation mode of the target image block is determined to be the fourteenth segmentation mode.
- the order of motion vector groups with the same selection probability can be adjusted arbitrarily, the order and index numbers of (b2, a1) and (b2, b1), the order and index numbers of (a1, t) and (b1, t), and the order and index numbers of (a1, b0), (a1, b1), (b1, b0), and (b1, b1) can also be adjusted to generate candidate motion vector derivation tables corresponding to other forms of the fourteenth segmentation mode.
- the segmentation mode of the target image block is determined to be the fifteenth segmentation mode.
- the fifth candidate motion vector set is {b0, b2, b1};
- the third candidate motion vector set is {a0, b2, a1}, and the motion vector of the first sub-image block selected from {b0, b2, b1} and the motion vector of the second sub-image block selected from {a0, b2, a1} are combined, and the obtained multiple motion vector groups include: (b0, a0), (b0, b2), (b0, a1), (b2, a0), (b2, b2), (b2, a1), (b1, a0), (b1, b2), (b1, a1).
- the probability of the first sub-image block Z A selecting the third motion vector b0 as the motion vector is the highest, followed by the fourth motion vector b2 and the sixth motion vector b1.
- the probability of the second sub-image block Z B selecting the second motion vector a0 as the motion vector is the highest, followed by the fourth motion vector b2 and the fifth motion vector a1. Because the number of pixels included in the first sub-image block is greater than the number of pixels included in the second sub-image block, the first sub-image block has a greater impact on the encoding cost.
- the motion vector selected by the first sub-image block from the ninth candidate motion vector set and the motion vector selected by the second sub-image block from the second candidate motion vector set are combined to obtain the plurality of motion vector groups.
- the second candidate motion vector set includes: the fourth motion vector, the fifth motion vector and the sixth motion vector;
- the ninth candidate motion vector set includes: the first motion vector, the third motion vector and the sixth motion vector.
- generating the encoding data of the target image block according to the target motion vector group and the index number of the target motion vector group includes: performing adaptive variable length encoding (Adaptive Length Encoding, ALE for short) on the index number of the target motion vector group.
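- The text does not define the codewords used by the adaptive variable length encoding. Purely as a stand-in to show why small index numbers are cheap to signal, the sketch below uses a truncated unary code, which is a different and much simpler scheme; the function names and the choice of code are assumptions.

```python
def truncated_unary(index, max_index):
    """Encode index as `index` ones followed by a terminating zero.

    The terminating zero is omitted for the largest index, because the
    decoder already knows no larger value can follow.
    """
    if not 0 <= index <= max_index:
        raise ValueError("index out of range")
    return "1" * index + ("0" if index < max_index else "")


def decode_truncated_unary(bits, max_index):
    """Count leading ones, stopping at the first zero or at max_index."""
    index = 0
    while index < max_index and bits[index] == "1":
        index += 1
    return index


# A table with nine motion vector groups has index numbers 0..8.
codewords = [truncated_unary(i, 8) for i in range(9)]
print(codewords[0], codewords[3])  # 0 1110
```

- Under such a code the group with index 0 costs one bit, which is why the derivation tables assign the smallest index numbers to the groups with the highest selection probability.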
- the present disclosure also provides a video decoding method; as shown in FIG. 60, the video decoding method includes the following steps:
- the encoding data of the target image block may be encoding data obtained by encoding the target image block according to any of the above-mentioned video encoding methods.
- obtaining the encoding data of the target image block includes: receiving video stream data of a to-be-played video sent by a media resource server, and extracting the encoding data of the target image block from the video stream data.
- the encoding data of the target image block includes: encoding data obtained by encoding information of a segmentation curve used to segment the target image block into a first sub-image block and a second sub-image block, encoding data obtained by encoding residual data of the target image block, and encoding data obtained by encoding index numbers of target motion vector groups of the first sub-image block and the second sub-image block.
- the information of the segmentation curve for segmenting the target image block into the first sub-image block and the second sub-image block includes: the position information of the starting point, the end point, the first control point and the second control point of the third-order Bezier curve.
- the information of the segmentation curve for segmenting the target image block into the first sub-image block and the second sub-image block includes: position information of the first control point and the position information of the second control point of the third-order Bezier curve.
- S604 Determine a segmentation mode of the target image block according to the position information of the endpoints of the segmentation curve and the position information of the first sub-image region.
- the method of obtaining the candidate motion vector derivation table corresponding to the segmentation mode may include: receiving the candidate motion vector derivation table corresponding to each segmentation mode sent by the encoding end device.
- the processor is configured to enable the video encoding device to implement the video encoding method described in any of the above embodiments when calling the computer program.
- some embodiments of the present disclosure provide a video decoding device, the video decoding device comprising:
- some embodiments of the present disclosure provide a computer-readable storage medium having a computer program stored thereon.
- the computing device implements the video encoding method described in any of the above embodiments or the video decoding method described in any of the above embodiments.
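The way candidate motion vectors are combined into motion vector groups in the fifteenth-segmentation-mode example above can be sketched as a Cartesian product of the two candidate sets. This is only an illustrative sketch: the names `first_block_candidates`, `second_block_candidates` and `build_motion_vector_groups` are hypothetical, and treating a group's position in the resulting list as its index number is an assumption rather than something the disclosure states.

```python
from itertools import product

# Candidate sets quoted from the example above, each listed in decreasing
# order of selection probability.
first_block_candidates = ["b0", "b2", "b1"]   # first sub-image block
second_block_candidates = ["a0", "b2", "a1"]  # second sub-image block


def build_motion_vector_groups(first, second):
    """Pair every candidate of the first sub-block with every candidate of the second.

    The first sub-block's candidate varies slowest, so groups that use its
    most probable candidate come first in the returned list.
    """
    return list(product(first, second))


groups = build_motion_vector_groups(first_block_candidates, second_block_candidates)

# Assumption: the index number of a group is its position in this list.
for index, group in enumerate(groups):
    print(index, group)
```

With the two three-element sets from the example, the sketch produces nine groups in the same order as the enumeration in the text, from (b0, a0) to (b1, a1). Placing the first sub-image block in the outer position mirrors the remark that the larger sub-image block has the greater impact on the encoding cost.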
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Some embodiments of the present disclosure relate to the technical field of video encoding and decoding, and concern a video encoding method, a video decoding method, and an apparatus. The video encoding method comprises the following steps: acquiring a target image block, the target image block being a rectangular image block obtained by performing block division on a video frame; on the basis of a texture edge in the target image block, acquiring a boundary line of the target image block, the boundary line being used to segment the target image block into two image regions; on the basis of encoded video content, acquiring two end sections of a segmentation curve, so as to acquire a segmentation start point and a segmentation end point; on the basis of the segmentation start point, the segmentation end point, and the boundary line, acquiring the segmentation curve and control information of the segmentation curve; on the basis of the segmentation curve of the target image block, segmenting the target image block into two sub-image blocks; and, on the basis of the control information of the segmentation curve and the two sub-image blocks, acquiring bitstream data corresponding to the target image block. Some embodiments of the present disclosure are used to reduce the bit rate overhead caused by a segmentation curve.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202480022010.7A CN121040069A (zh) | 2023-10-08 | 2024-04-26 | 视频编码方法、视频解码方法及装置 |
Applications Claiming Priority (10)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202311292356.2A CN119788866A (zh) | 2023-10-08 | 2023-10-08 | 视频编码方法、视频解码方法及装置 |
| CN202311292356.2 | 2023-10-08 | ||
| CN202311295060.6A CN119788867A (zh) | 2023-10-08 | 2023-10-08 | 视频编码方法、视频解码方法及装置 |
| CN202311295060.6 | 2023-10-08 | ||
| CN202311658808.4A CN120111232A (zh) | 2023-12-05 | 2023-12-05 | 视频编码方法、视频解码方法装置 |
| CN202311658808.4 | 2023-12-05 | ||
| CN202410047683.X | 2024-01-12 | ||
| CN202410047683.XA CN120358355A (zh) | 2024-01-12 | 2024-01-12 | 视频编码方法、视频解码方法及装置 |
| CN202410243313.3A CN120639980A (zh) | 2024-03-04 | 2024-03-04 | 视频编码方法、视频解码方法及装置 |
| CN202410243313.3 | 2024-03-04 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025077156A1 (fr) | 2025-04-17 |
Family
ID=95396424
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2024/090205 Pending WO2025077156A1 (fr) | 2023-10-08 | 2024-04-26 | Video encoding method, video decoding method, and apparatus |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN121040069A (fr) |
| WO (1) | WO2025077156A1 (fr) |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5091976A (en) * | 1989-02-22 | 1992-02-25 | Ricoh Company, Ltd. | Image coding method for extracting, segmenting, and coding image contours |
| JP2012230668A (ja) * | 2011-04-11 | 2012-11-22 | Canon Inc | 画像処理装置、画像処理方法、及びプログラム |
| US20190182505A1 (en) * | 2016-08-12 | 2019-06-13 | Mediatek Inc. | Methods and apparatuses of predictor-based partition in video processing system |
| US20210218977A1 (en) * | 2018-10-01 | 2021-07-15 | Op Solutions, Llc | Methods and systems of exponential partitioning |
| US20210360271A1 (en) * | 2019-01-28 | 2021-11-18 | Op Solutions, Llc | Inter prediction in exponential partitioning |
| CN116472709A (zh) * | 2020-11-24 | 2023-07-21 | 现代自动车株式会社 | 用于视频编码和解码的装置和方法 |
2024
- 2024-04-26 WO PCT/CN2024/090205 patent/WO2025077156A1/fr active Pending
- 2024-04-26 CN CN202480022010.7A patent/CN121040069A/zh active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| CN121040069A (zh) | 2025-11-28 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| CN115665408B (zh) | 用于跨分量线性模型预测的滤波方法和装置 | |
| CN114175661A (zh) | 具有补充信息消息的点云压缩 | |
| TWI904088B (zh) | 用於視訊寫碼之以梯度為基礎之預測精細化 | |
| TW201933874A (zh) | 使用局部照明補償之視訊寫碼 | |
| CN111133476A (zh) | 点云压缩 | |
| CN114631315B (zh) | 图像编码方法和设备以及图像解码方法和设备 | |
| CN112771870A (zh) | 视频解码器和方法 | |
| CN108141607A (zh) | 视频编码和解码方法、视频编码和解码装置 | |
| KR20240093874A (ko) | 인터 및 인트라 예측을 사용한 동적 메시 압축 | |
| CN104811728B (zh) | 一种视频内容自适应的运动搜索方法 | |
| CN116918324A (zh) | 通用样本补偿 | |
| CN118631994A (zh) | 用于在视频编解码中进行滤波的方法和设备 | |
| CN116848843A (zh) | 可切换的密集运动向量场插值 | |
| CN116980596A (zh) | 一种帧内预测方法、编码器、解码器及存储介质 | |
| CN120111232A (zh) | 视频编码方法、视频解码方法装置 | |
| CN116325735B (zh) | 对参考帧重新排序的方法、计算机设备、装置及存储介质 | |
| JP2023105074A (ja) | 運動ベクトルインタ予測のための大域的運動モデル | |
| KR20240024921A (ko) | 이미지 또는 비디오를 인코딩/디코딩하기 위한 방법들 및 장치들 | |
| JP2024545531A (ja) | 予測方法と装置、デバイス、システム、及び記憶媒体 | |
| CN117616751A (zh) | 动态图像组的视频编解码 | |
| Saberi et al. | The best and most efficient video compression methods | |
| WO2025077156A1 (fr) | Procédé de codage vidéo, procédé de décodage vidéo, et appareil | |
| CN120615301A (zh) | 生成式面部视频的sei消息 | |
| CN117201796B (zh) | 视频编码方法、装置、计算设备和存储介质 | |
| TW202518897A (zh) | 跨分量預測 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 24875983; Country of ref document: EP; Kind code of ref document: A1 |