WO2023092256A1 - Video encoding method and related apparatus - Google Patents
- Publication number
- WO2023092256A1 (PCT/CN2021/132307, CN2021132307W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image block
- image
- video
- video file
- block
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/129—Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/188—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a video data packet, e.g. a network abstraction layer [NAL] unit
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
- H04N19/33—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/513—Processing of motion vectors
- H04N19/517—Processing of motion vectors by encoding
- H04N19/52—Processing of motion vectors by encoding by predictive encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
Definitions
- The present application relates to the field of video encoding and decoding, and in particular to a video encoding method and related apparatus.
- Video coding (video encoding and decoding) is widely used in digital video applications, such as broadcast digital TV, video transmission over the Internet and mobile networks, real-time conversational applications such as video chat and video conferencing, digital video discs (DVDs) and Blu-ray discs, video content capture and editing systems, and security applications of camcorders.
- Video coding standards include MPEG-1 Video, MPEG-2 Video, ITU-T H.262/MPEG-2, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), ITU-T H.265/High Efficiency Video Coding (HEVC), and extensions of such standards, such as scalability and/or 3D (three-dimensional) extensions.
- The present application provides a video encoding method, a video decoding method, and related apparatus, which reduce the storage resources required for storing video and the bandwidth required for video transmission, provided that the complete second image block can still be restored.
- the present application provides a video encoding method, the method comprising:
- acquiring a first encoding parameter of a first image block and a second encoding parameter of a second image block, where the first image block is an image block in a first video file, the second image block is an image block in a second video file, and the first encoding parameter and the second encoding parameter include a motion vector and/or a residual;
- That is, the first encoding parameter and the second encoding parameter may each include a motion vector, may each include a residual, or may each include both a motion vector and a residual.
- After decoding, whether the first video and the second video have similar or even repeated content may be determined by comparing image features, or by comparing the encoding parameters of the image blocks obtained after decoding.
- The encoding parameters compared may be residuals, motion vectors, discrete cosine transform (DCT) coefficients, and so on. The first image block and the second image block are then selected (or the first image frame and the second image frame are selected, where the first image frame includes the first image block and the second image frame includes the second image block).
- For the packaged first video file and second video file, whether the first video and the second video have similar or even repeated content may be determined by comparing the subtitle information and/or audio information in the two files (image features and/or encoding parameters can be compared afterwards). The first image block and the second image block are then selected (or the first image frame and the second image frame are selected, where the first image frame includes the first image block and the second image frame includes the second image block).
- Image frames may first be screened by comparing the subtitle information and/or audio information in the first video file and the second video file.
- Image frames with similar subtitle information and/or audio information can first be selected as candidates, and the image features and/or encoding parameters are then compared to select the first image block and the second image block (or to select the first image frame and the second image frame, where the first image frame includes the first image block and the second image frame includes the second image block).
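The two-stage selection described above (screen by subtitle similarity first, compare image features or encoding parameters afterwards) can be illustrated with a minimal sketch of the first stage. The frame representation, the function names, the use of `difflib.SequenceMatcher` as the text similarity measure, and the threshold of 0.8 are all assumptions made for illustration; the source does not specify how subtitle similarity is measured.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# Hypothetical representation: (frame index, subtitle text shown on that frame).
Frame = Tuple[int, str]


def subtitle_similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two subtitle strings."""
    return SequenceMatcher(None, a, b).ratio()


def select_candidate_pairs(
    first_frames: List[Frame],
    second_frames: List[Frame],
    threshold: float = 0.8,  # assumed value, not taken from the source
) -> List[Tuple[int, int]]:
    """Return (first index, second index) pairs whose subtitles are similar.

    Only these candidate pairs would go on to the more expensive comparison
    of image features and/or encoding parameters.
    """
    pairs = []
    for i, text_a in first_frames:
        for j, text_b in second_frames:
            if subtitle_similarity(text_a, text_b) > threshold:
                pairs.append((i, j))
    return pairs
```

Screening on cheap metadata first keeps the number of block-level comparisons small when most frames of the two files are unrelated.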
- For the packaged first video file and second video file, the determination may be based on the source of the two files (for example, judging whether the first video file and the second video file are same-source videos, such as whether both were edited from the same video file, with the first video file merely having subtitles, special effects, or beautification added). The first image block and the second image block are then selected (or the first image frame and the second image frame are selected, where the first image frame includes the first image block and the second image frame includes the second image block).
- For the packaged first video file and second video file, whether the two files contain image frames with similar or even repeated content may be determined by comparing the image features of each (or some) of the image frames in the first video file and the second video file.
- For the packaged first video file and second video file, whether the two files contain image blocks with similar or even repeated content may be determined from the image features of the image blocks in each (or some) of the image frames in the first video file and the second video file.
- The first video file and the second video file can be video files with similar or repeated content, where similar content can be understood as the video files containing image frames (or more than a certain number or proportion of image frames) whose pixel values and distributions are similar.
- The first video file and the second video file can be different video files, or different parts of the same video file (for example, some video files contain a large number of repeated segments, in which case the first video file and the second video file may be two repeated segments of that video file), which is not limited here.
- The first video file and the second video file may have the same video content but belong to different video package files.
- The first video file may include a first image frame, and the second video file may include a second image frame.
- The first image frame may include a plurality of image blocks (including the first image block), and the second image frame may include a plurality of image blocks (including the second image block), where the first image block and the second image block may be block units such as an MB, a prediction block (partition), a CU, a PU, or a TU, which is not limited here.
- The similarity of image features between the first image block and the second image block is relatively high, where the image features of an image block can be one or more of its color features, texture features, shape features, and so on.
- the color feature and texture feature are used to describe the surface properties of the object corresponding to the image block.
- the shape features include contour features and region features, the contour features include the outer boundary features of the object, and the region features include the shape region features of the object.
- the encoding parameters between the first image block and the second image block have a high similarity
- the encoding parameters may be at least one of residuals, DCT coefficients, and motion vectors.
- The first encoding parameter of the first image block and the second encoding parameter of the second image block can be obtained. The first encoding parameter and the second encoding parameter may be motion vectors, may be residuals, or may be both motion vectors and residuals.
- The residual may be calculated from the image block and the prediction block. For example, the residual may be obtained in the sample domain by subtracting the sample values of the prediction block from the sample values of the image block, sample by sample (pixel by pixel).
- the residual here may be a residual code.
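As a minimal sketch of the sample-domain residual described above, the following computes the residual as image block minus prediction block, sample by sample. The list-of-rows block representation and the function names are assumptions for illustration only.

```python
from typing import List

Block = List[List[int]]  # rows of sample (pixel) values; hypothetical layout


def compute_residual(image_block: Block, prediction_block: Block) -> Block:
    """Subtract the prediction block from the image block, sample by sample."""
    if len(image_block) != len(prediction_block):
        raise ValueError("blocks must have the same number of rows")
    residual = []
    for img_row, pred_row in zip(image_block, prediction_block):
        if len(img_row) != len(pred_row):
            raise ValueError("rows must have the same number of samples")
        residual.append([s - p for s, p in zip(img_row, pred_row)])
    return residual


def reconstruct_block(prediction_block: Block, residual: Block) -> Block:
    """Inverse operation: prediction block plus residual gives the image block."""
    return [
        [p + r for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction_block, residual)
    ]
```

For example, an image block `[[10, 12], [8, 9]]` with prediction `[[9, 12], [10, 9]]` gives the residual `[[1, 0], [-2, 0]]`.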
- The encoding parameters may also include syntax information. For example, the syntax information may be, but is not limited to, any one or all of inter prediction parameters, intra prediction parameters, loop filter parameters, and/or other (decoded) syntax elements.
- obtaining difference information according to the first encoding parameter and the second encoding parameter, where the difference information is used to indicate a difference between the first encoding parameter and the second encoding parameter;
- A subtraction operation may be performed on the first encoding parameter and the second encoding parameter to obtain the difference information. Alternatively, the difference information may be obtained by any other operation that quantifies the difference between the first encoding parameter and the second encoding parameter, which is not limited here.
- the difference information may be a difference picture.
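The subtraction described above can be sketched as follows for encoding parameters that carry both a motion vector and a residual. The class and field names are hypothetical, and the sign convention (second minus first, so that the decoder can restore by addition) is an assumption; the source only states that a subtraction is performed.

```python
from dataclasses import dataclass
from typing import List, Tuple

MotionVector = Tuple[int, int]  # (dx, dy); hypothetical layout


@dataclass(frozen=True)
class EncodingParams:
    motion_vector: MotionVector
    residual: List[List[int]]


@dataclass(frozen=True)
class DifferenceInfo:
    mv_diff: MotionVector            # "first difference information"
    residual_diff: List[List[int]]   # "second difference information"


def compute_difference(first: EncodingParams, second: EncodingParams) -> DifferenceInfo:
    """Assumed convention: difference = second parameter - first parameter."""
    mv_diff = (
        second.motion_vector[0] - first.motion_vector[0],
        second.motion_vector[1] - first.motion_vector[1],
    )
    residual_diff = [
        [b - a for a, b in zip(row_first, row_second)]
        for row_first, row_second in zip(first.residual, second.residual)
    ]
    return DifferenceInfo(mv_diff, residual_diff)
```

When the two blocks are similar, most entries of the difference are zero or close to zero, which is what makes the encoded difference smaller than the encoded second parameter.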
- The first image block and the second image block may be image blocks with relatively similar image features in the first image frame and the second image frame, and some or all of the image blocks in the first image frame and the second image frame may be processed in the same way as the first image block and the second image block, so that difference information for multiple groups of image blocks is obtained. The difference information is encoded to obtain encoded data, which the decoding side may use to restore the second encoding parameter.
- The difference information may be used as the encoding parameter of the second image block (replacing the original second encoding parameter), and the data of the second video, including this encoding parameter of the second image block, is then encoded.
- The first image frame and the second image frame may be image frames with relatively similar image features in the first video and the second video, and the image blocks in some or all of the image frames of the first video and the second video may be processed in the same way as the first image block and the second image block.
- An embodiment of the present application provides a video encoding method, the method comprising: acquiring a first encoding parameter of a first image block and a second encoding parameter of a second image block, where the first image block is an image block in a first video file, the second image block is an image block in a second video file, the first video file and the second video file are different video files, and the first encoding parameter and the second encoding parameter include motion vectors and/or residuals; obtaining difference information according to the first encoding parameter and the second encoding parameter, where the difference information is used to indicate the difference between the first encoding parameter and the second encoding parameter; and encoding the difference information to obtain encoded data, where the encoded data is used by the decoding side to restore the second encoding parameter.
- The difference information replaces the original second encoding parameter. Since the similarity of the image features of the first image block and the second image block is greater than the threshold, the bitstream data obtained by encoding the difference information is smaller than the bitstream data obtained by encoding the second encoding parameter, and the second encoding parameter can be restored from the difference information and the first encoding parameter. This reduces the storage resources required for storing the video and the bandwidth required for video transmission, provided that the complete second image block can be recovered.
- Acquiring the first encoding parameter of the first image block and the second encoding parameter of the second image block includes: decoding the first video file and the second video file to obtain the first encoding parameter of the first image block and the second encoding parameter of the second image block.
- the similarity between the image features of the first image block and the second image block is greater than a threshold; and/or the similarity between the first encoding parameter and the second encoding parameter is greater than a threshold; and/or the similarity of the subtitle information and/or audio information in the first video file and the second video file is greater than a threshold; and/or the similarity between the DCT coefficients of the first image block and the DCT coefficients of the second image block is greater than a threshold.
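The threshold conditions above all take the form "similarity greater than a threshold". A minimal sketch of one such test is given below, using cosine similarity over a feature vector (for example, a color histogram or a flattened set of DCT coefficients). The choice of cosine similarity, the function names, and the default threshold of 0.9 are assumptions; the source does not name a specific similarity measure or threshold value.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_similar(
    features_first: Sequence[float],
    features_second: Sequence[float],
    threshold: float = 0.9,  # assumed value, not taken from the source
) -> bool:
    """Return True when the similarity exceeds the threshold."""
    return cosine_similarity(features_first, features_second) > threshold
```

A pair of blocks passing this test would be treated as the first image block and the second image block in the method.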
- The first encoding parameter may include a first motion vector, the second encoding parameter may include a second motion vector, and the difference information includes first difference information, where the first difference information is the difference between the first motion vector and the second motion vector.
- The first encoding parameter may include a first residual, the second encoding parameter may include a second residual, and the difference information includes second difference information, where the second difference information is the difference between the first residual and the second residual.
- The first encoding parameter may include a first motion vector and a first residual, the second encoding parameter may include a second motion vector and a second residual, and the difference information includes first difference information and second difference information, where the first difference information is the difference between the first motion vector and the second motion vector, and the second difference information is the difference between the first residual and the second residual.
- The first encoding parameter includes a first residual, the second encoding parameter includes a second residual, the difference information includes second difference information used to indicate the difference between the first residual and the second residual, and encoding the difference information includes: encoding the second difference information by lossless compression coding or lossy compression coding.
- The first difference information represents the difference between motion vectors, and motion vectors are used in inter-frame prediction. If lossy compression is used to compress the first difference information, the inter-frame prediction result may be degraded (for example, artifacts appear). Using lossless compression to compress the first difference information preserves the accuracy and effect of inter-frame prediction.
- The first encoding parameter includes a first motion vector, the second encoding parameter includes a second motion vector, the difference information includes first difference information used to indicate the difference between the first motion vector and the second motion vector, and encoding the difference information includes: encoding the first difference information by lossless compression coding.
- The first difference information (which indicates the difference between the first motion vector and the second motion vector) may be encoded by lossless compression coding.
- The second difference information (which indicates the difference between the first residual and the second residual) may be encoded by lossless compression coding or lossy compression coding.
- Lossless compression may include transformation (retaining all coefficients), scanning, and entropy coding.
- Lossy compression may include transformation (retaining the low-frequency coefficients), quantization, scanning, and entropy coding.
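The contrast between the two modes can be illustrated with a deliberately simplified sketch. It omits the transform and scan stages, uses `zlib` as a stand-in for the entropy coder, and models the lossy path with uniform scalar quantization only; all function names and the quantization step are assumptions for illustration and do not reproduce any codec's actual pipeline.

```python
import struct
import zlib
from typing import List


def encode_lossless(values: List[int]) -> bytes:
    """Pack signed 16-bit values and compress them (zlib stands in for entropy coding)."""
    return zlib.compress(struct.pack(f"<{len(values)}h", *values))


def decode_lossless(data: bytes) -> List[int]:
    """Exact inverse of encode_lossless."""
    raw = zlib.decompress(data)
    return list(struct.unpack(f"<{len(raw) // 2}h", raw))


def encode_lossy(values: List[int], step: int) -> bytes:
    """Quantize with a uniform step before coding; information is discarded here."""
    quantized = [round(v / step) for v in values]
    return encode_lossless(quantized)


def decode_lossy(data: bytes, step: int) -> List[int]:
    """Dequantize; the result approximates the input to within step / 2."""
    return [q * step for q in decode_lossless(data)]
```

Motion vector differences would go through the lossless path, since any error there shifts the prediction; residual differences can tolerate the bounded error of the lossy path in exchange for a smaller payload.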
- The encoded data and the first encoding parameter may be sent to the decoding side, so that the decoding side can obtain the second encoding parameter; alternatively, the encoded data may be stored locally for later transmission or retrieval.
- The encoded data may be encapsulated (optionally, if the second video file also includes other information such as audio information and subtitle information, that information may also be encapsulated so that the decoding side can restore the second video file). Correspondingly, the encapsulated encoded data may be sent to the decoding side, so that the decoding side obtains the second encoding parameter from it; alternatively, the encapsulated encoded data may be stored locally for later transmission or retrieval.
- First indication information may also be encoded. The first indication information may indicate that there is an association between the first image block and the second image block (optionally, it may indicate that the difference information is obtained from the difference between the first image block and the second image block). Based on the first indication information, the decoding side can obtain the first encoding parameter of the first image block and the difference information, and then obtain the second encoding parameter from the first encoding parameter and the difference information.
- The first indication information includes an identifier of the first image block and an identifier of the second image block; and/or the first indication information includes an identifier of the first video file and an identifier of the second video file.
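One way to picture the first indication information is as a small record that links a block of the second video file to its reference block in the first video file. The sketch below is an assumption made for illustration: the class name, field names, identifier types, and the dictionary-based lookup are not taken from the source, which only states which identifiers the indication information may carry.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class IndicationInfo:
    first_video_id: str    # identifier of the first video file (hypothetical type)
    first_block_id: int    # identifier of the first image block
    second_video_id: str   # identifier of the second video file
    second_block_id: int   # identifier of the second image block


def build_lookup(
    infos: List[IndicationInfo],
) -> Dict[Tuple[str, int], Tuple[str, int]]:
    """Map each (second video, second block) to its (first video, first block) reference."""
    return {
        (info.second_video_id, info.second_block_id): (
            info.first_video_id,
            info.first_block_id,
        )
        for info in infos
    }
```

With such a lookup, a decoder that reaches a block of the second video can find which block of the first video supplies the first encoding parameter.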
- the present application provides a video decoding method, the method comprising:
- the first encoding parameter and the second encoding parameter include motion vectors and/or residuals; and the second encoding parameter is obtained according to the first encoding parameter and the difference information.
- The indication information is obtained by decoding the encoded data; optionally, the indication information is stored on an object associated with the second video file, so that when the second video file is decoded, the indication information needs to be decoded only once.
- the indication information may include an identifier indicating the first image block and an identifier indicating the second image block.
- the encoded data of the second video and the encoding parameters of the first video may be acquired.
- the encoded data of the first video may be decapsulated and decoded to obtain encoding parameters of the first video (for example, may include first encoding parameters of the first image block).
- the first encoding parameter and difference information may be summed to obtain the second encoding parameter.
- The first encoding parameter may include a first motion vector, the second encoding parameter may include a second motion vector, and the difference information includes first difference information used to indicate the difference between the first motion vector and the second motion vector. The first motion vector and the first difference information may be summed to obtain the second motion vector.
- The first encoding parameter may include a first residual, the second encoding parameter may include a second residual, and the difference information includes second difference information used to indicate the difference between the first residual and the second residual. The first residual and the second difference information may be summed to obtain the second residual.
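The restoration by summation described above can be sketched as follows. The function names and data layout are hypothetical, and the sketch assumes the difference information was formed as second parameter minus first parameter, which is the convention under which plain addition restores the second parameter.

```python
from typing import List, Tuple

MotionVector = Tuple[int, int]  # (dx, dy); hypothetical layout


def restore_motion_vector(first_mv: MotionVector, mv_diff: MotionVector) -> MotionVector:
    """Second motion vector = first motion vector + first difference information."""
    return (first_mv[0] + mv_diff[0], first_mv[1] + mv_diff[1])


def restore_residual(
    first_residual: List[List[int]], residual_diff: List[List[int]]
) -> List[List[int]]:
    """Second residual = first residual + second difference information, per sample."""
    return [
        [a + d for a, d in zip(res_row, diff_row)]
        for res_row, diff_row in zip(first_residual, residual_diff)
    ]
```

For example, a first motion vector `(3, -1)` with difference `(1, 0)` restores the second motion vector `(4, -1)`.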
- the restored second encoding parameters may be encoded and packaged to obtain original encoded data of the second video.
- An embodiment of the present application provides a video decoding method, the method comprising: acquiring encoded data; decoding the encoded data to obtain difference information; decoding the first image block according to indication information to obtain a first encoding parameter, where the indication information is used to indicate that there is an association between the first image block and the second image block, the first image block belongs to a first video file, the second image block belongs to a second video file, and the first video file and the second video file are different video files; and obtaining a second encoding parameter according to the first encoding parameter and the difference information, where the first encoding parameter and the second encoding parameter include motion vectors and/or residuals.
- The difference information replaces the original second encoding parameter. The encoded data obtained by encoding the difference information is much smaller than the encoded data obtained by encoding the second encoding parameter, and the second encoding parameter can be restored from the difference information and the first encoding parameter. This reduces the storage resources required for storing the video and the bandwidth required for video transmission, provided that the complete second image block can be recovered.
- the similarity between the image features of the first image block and the second image block is greater than a threshold; and/or,
- the degree of similarity between the first encoding parameter and the second encoding parameter is greater than a threshold; and/or,
- the similarity of the subtitle information and/or audio information in the first video file and the second video file is greater than a threshold; and/or,
- the similarity between the DCT coefficients of the first image block and the DCT coefficients of the second image block is greater than a threshold.
- the image features include at least one of the following:
- the first image block is included in a first image frame, the second image block is included in a second image frame, and the first image frame and the second image frame are different image frames.
- the first image block is included in a first image frame in a first video, the second image block is included in a second image frame in a second video, and the first video and the second video are different video files.
- the present application provides a video encoding method, the method comprising:
- acquiring a first image block in a first video file and a second image block in a second video file, where the first video file and the second video file are different video files;
- The encoded data of the first video and the second video may be decoded by a decoder to obtain a video signal of the first video and a video signal of the second video, where the video signal of the first video may include the first image block and the video signal of the second video may include the second image block.
- the decoder may decapsulate the video files of the first video and the second video.
- After decoding, whether the first video and the second video have similar or even repeated content may be determined by comparing image features, or by comparing the encoding parameters of the image blocks obtained after decoding.
- The encoding parameters compared may be residuals, motion vectors, discrete cosine transform (DCT) coefficients, and so on. The first image block and the second image block are then selected (or the first image frame and the second image frame are selected, where the first image frame includes the first image block and the second image frame includes the second image block).
- For the packaged first video file and second video file, whether the first video and the second video have similar or even repeated content may be determined by comparing the subtitle information and/or audio information in the two files (image features and/or encoding parameters can be compared afterwards). The first image block and the second image block are then selected (or the first image frame and the second image frame are selected, where the first image frame includes the first image block and the second image frame includes the second image block).
- Image frames may first be screened by comparing the subtitle information and/or audio information in the first video file and the second video file.
- Image frames with similar subtitle information and/or audio information can first be selected as candidates, and the image features and/or encoding parameters are then compared to select the first image block and the second image block (or to select the first image frame and the second image frame, where the first image frame includes the first image block and the second image frame includes the second image block).
- for the encapsulated first video file and second video file, the first image block and the second image block may be selected based on the source of the two files (for example, by judging whether the first video file and the second video file are same-source videos, such as whether they were edited from the same video file, e.g. the first video file only has subtitles, special effects, or beauty treatment added) (or the first image frame and the second image frame are selected, where the first image frame includes the first image block and the second image frame includes the second image block).
- the first video may include a first image frame
- the second video may include a second image frame
- the first image frame may include a plurality of image blocks (including the first image block)
- the second image frame may include a plurality of image blocks (including the second image block), where the first image block and the second image block may be block units such as MB, prediction block (partition), CU, PU, TU, etc., which are not limited here.
- the similarity of image features between the first image block and the second image block is relatively high, where the image features of an image block can be one or more of its color features, texture features, shape features, etc.
- the color feature and texture feature are used to describe the surface properties of the object corresponding to the image block.
- the shape features include contour features and region features, the contour features include the outer boundary features of the object, and the region features include the shape region features of the object.
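As one hedged illustration of comparing a color feature against a threshold, the sketch below uses a normalized luma histogram and histogram intersection. The application does not prescribe this measure; the function names, the 8-bin histogram, and the assumption of 8-bit samples stored as nested lists are choices made for the example only.

```python
def luma_histogram(block, bins=8):
    """Normalized histogram of 8-bit sample values in a 2-D block."""
    counts = [0] * bins
    total = 0
    for row in block:
        for sample in row:
            counts[min(sample * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def histogram_similarity(block_a, block_b, bins=8):
    """Histogram intersection in [0, 1]; 1.0 means identical distributions."""
    hist_a = luma_histogram(block_a, bins)
    hist_b = luma_histogram(block_b, bins)
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


def features_similar(block_a, block_b, threshold=0.9):
    """True when the color-feature similarity exceeds the (assumed) threshold."""
    return histogram_similarity(block_a, block_b) > threshold
```

A histogram ignores where samples sit inside the block, so a practical system would likely combine it with texture or shape features as the text suggests.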
- the first image block may be an image block in an I frame
- the second image block may be an image block in a P frame or a B frame
- alternatively, the first image block may be an image block in a P frame
- the second image block is an image block in a P frame or a B frame.
- using the first image block as a reference block of the second image block, determining difference information, where the difference information is used to indicate a difference between the first image block and the second image block;
- encoding the difference information to obtain encoded data, where the encoded data may be used by the decoding side to restore the second image block.
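The two steps above can be sketched as a sample-domain round trip. This is a simplified illustration under stated assumptions: blocks are equally sized nested lists of integers, the reference block is co-located (no motion search), and the difference is taken as second block minus first block; the function names are hypothetical.

```python
def block_difference(reference_block, current_block):
    """Difference information: current block minus reference block, per sample."""
    return [
        [cur - ref for ref, cur in zip(ref_row, cur_row)]
        for ref_row, cur_row in zip(reference_block, current_block)
    ]


def reconstruct_block(reference_block, difference):
    """Decoder side: add the difference back onto the reference block."""
    return [
        [ref + diff for ref, diff in zip(ref_row, diff_row)]
        for ref_row, diff_row in zip(reference_block, difference)
    ]
```

When the two blocks are similar, the difference contains mostly small values, which is what makes it cheaper to encode than the second block itself.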
- the video signal of the second video may be encoded based on the video signal of the first video.
- the encoding parameter of the second image block may be determined by using the first image block as a reference block of the second image block.
- the acquiring the first image block in the first video file and the second image block in the second video file includes:
- the coding parameters include a motion vector, a residual and/or DCT coefficients; and/or
- the image blocks (all or some) in the first video file may be traversed, and from the traversed image blocks an image block whose image features are highly similar to those of the second image block may be selected (or an image block whose encoding parameters are highly similar to those of the second image block, or an image block whose corresponding subtitle information and/or audio information is highly similar to that of the second image block).
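A minimal sketch of such a traversal is shown below, assuming the sum of absolute differences (SAD) as the comparison. SAD is a distance, so "similarity greater than a threshold" is expressed here as "SAD not larger than `max_sad`". The block identifiers, the dictionary layout, and the function names are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute sample differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def find_best_reference(first_video_blocks, second_block, max_sad):
    """Traverse candidate blocks and return (block_id, cost) of the closest one.

    first_video_blocks maps a hypothetical block identifier to a block.
    Returns (None, best_cost) when even the closest block exceeds max_sad.
    """
    best_id, best_cost = None, None
    for block_id, block in first_video_blocks.items():
        cost = sad(block, second_block)
        if best_cost is None or cost < best_cost:
            best_id, best_cost = block_id, cost
    if best_cost is not None and best_cost <= max_sad:
        return best_id, best_cost
    return None, best_cost
```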
- the first image frame in the first video and the second image frame in the second video may be acquired, where the similarity of the image features of the first image frame and the second image frame is greater than a threshold and the first video and the second video are different videos; the first image frame is used as a reference frame of the second image frame to determine the first encoding parameter of the second image frame.
- the first image block may be an image block of an I frame in the first video
- the second image block may be an image block of a P frame or a B frame in the second video.
- the first image block may be an image block of a P frame in the first video
- the second image block may be an image block of a P frame or a B frame in the second video.
- the first encoding parameter may be residual, motion vector and other syntax information, which is not limited here.
- the first image block may be used as a reference block, and the first image frame containing the reference block may be used as a reference frame to predict the second image block in the current frame (that is, the second image frame including the second image block). Optionally, the first image frame may be temporally after the current frame, or the current frame may be temporally located between a previous reference frame that appears before it in the video sequence and a subsequent reference frame that appears after it (the first image frame can be either the previous reference frame or the subsequent reference frame).
- the image content similarity between the first image frame and the second image frame can be relatively high.
- an image block with higher image content similarity is used as a reference block when encoding the second image frame.
- the information about the reference block that the second image block originally used within its group of pictures (GOP) need not be encoded into the encoded data, which reduces the size of the encoded data, the storage space required for storing the second video, and the bandwidth required for transmitting the second video.
- the coding module is specifically used for:
- the second indication information is used to indicate that there is an association between the first image block and the second image block (optionally, the second indication information may indicate that the first image block is used as a reference block for the second image block).
- the second indication information includes the identifier of the first image block and the identifier of the second image block; and/or the first indication information includes the identifier of the first video file and the identifier of the second video file.
- the acquiring module is further configured to acquire the third image block in the third video file and the fourth image block in the second video file, where the second image block and the fourth image block belong to the same image frame in the second video file, the similarity of the image features of the third image block and the fourth image block is greater than a threshold, and the third video file and the second video file are different video files;
- the difference determination module is also used for:
- the difference information is determined by using the first image block as a reference block of the second image block and using the third image block as a reference block of the fourth image block.
- the present application provides a video decoding method, the method comprising: obtaining encoded data; decoding the encoded data to obtain a first encoding parameter of a second image block in a second video; decoding the first video file according to indication information (such as the second indication information described in the third aspect) to obtain the first image block, where the indication information indicates that there is an association between the first image block and the second image block (optionally, it may indicate that the reference block of the second image block is the first image block in the first video) and the first video and the second video are different video files; and reconstructing the second image block according to the first encoding parameter and the first image block.
- the information about the reference block that the second image block originally used within its group of pictures (GOP) need not be encoded into the encoded data, which reduces the size of the encoded data, the storage space required for storing the second video, and the bandwidth required for transmitting the second video.
- the similarity between the image features of the first image block and the second image block is greater than a threshold; and/or,
- the degree of similarity between the first encoding parameter and the second encoding parameter is greater than a threshold; and/or,
- the similarity of the subtitle information and/or audio information in the first video file and the second video file is greater than a threshold; and/or,
- the similarity between the DCT coefficients of the first image block and the DCT coefficients of the second image block is greater than a threshold.
- the indication information includes the identifier of the first image block and the identifier of the second image block; and/or, the indication information includes the identifier of the first video file and the identifier of the second video file.
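The decoding flow, driven by indication information that carries video-file and image-block identifiers, can be sketched as follows. This is an assumed data layout, not the bitstream syntax of the application: `BlockId`, `IndicationInfo`, the nested-list store of decoded videos, and the omission of motion compensation (the residual is added directly to the referenced block) are all simplifications for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockId:
    """Hypothetical identifier of an image block inside a video file."""
    video_id: str
    frame_index: int
    block_index: int


@dataclass(frozen=True)
class IndicationInfo:
    """Associates the block to rebuild (target) with its reference block."""
    reference: BlockId  # first image block, in the first video file
    target: BlockId     # second image block, in the second video file


def reconstruct_second_block(indication, residual, decoded_videos):
    """Fetch the reference block named by the indication and add the residual.

    decoded_videos maps video_id -> list of frames -> list of blocks.
    """
    ref = indication.reference
    reference_block = decoded_videos[ref.video_id][ref.frame_index][ref.block_index]
    return [
        [r + d for r, d in zip(ref_row, res_row)]
        for ref_row, res_row in zip(reference_block, residual)
    ]
```

Because the reference lives in a different file, the decoder needs the first video file to be available and decodable before the second video can be reconstructed.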
- the first image block is an image block in an I frame
- the second image block is an image block in a P frame or a B frame; or,
- the first image block is an image block in a P frame
- the second image block is an image block in a P frame or a B frame.
- the method further includes: decoding the encoded data to obtain a second encoding parameter of a fourth image block in the second video; obtaining the third image block according to the second indication information, where the second indication information is used to indicate that the reference block of the fourth image block is the third image block in the third video (or that there is an association between the image blocks) and the third video and the second video are different video files; and reconstructing the fourth image block according to the second encoding parameter and the third image block.
- the method also includes:
- the reference block of the block is the fifth image block in the first video (or there is an association between the image blocks), where the first image block and the fifth image block belong to the same or different image frames in the first video; and reconstructing the sixth image block according to the third encoding parameter and the fifth image block.
- the present application provides a video encoding method, the method comprising: obtaining a first video file and a second video file, where the first video file and the second video file are different video files; decoding the first video file and the second video file to obtain first information of the first image block in the first video file and second information of the second image block in the second video file; obtaining difference information according to the first information and the second information, where the difference information is used to indicate the difference between the first information and the second information; and encoding the difference information to obtain encoded data.
- the encoded data may be used by the decoding side to restore and obtain the second information.
- this application can replace the decoding result of the original second video with the difference information. Because the bitstream obtained by encoding the difference information is smaller than the bitstream obtained by encoding the decoding result of the second video, this reduces the storage resources required for storing the video and the bandwidth required for transmitting it, provided that the complete second video file can still be recovered.
- the similarity between image features of the first image block and the second image block is greater than a threshold.
- the original second encoding parameter is replaced with the difference information. Since the similarity between the first image block and the second image block is greater than the threshold, the encoded data obtained by encoding the difference information is much smaller than the encoded data obtained by encoding the second information, and the second information can be restored from the difference information and the first information. This reduces the storage resources required for storing the video and the bandwidth required for transmitting it, provided that the complete second video can still be recovered.
- the similarity between the first information and the second information is greater than a threshold.
- the original second encoding parameter is replaced with the difference information. Since the similarity between the first information and the second information is greater than the threshold, the encoded data obtained by encoding the difference information is much smaller than the encoded data obtained by encoding the second information, and the second information can be restored from the difference information and the first information. This reduces the storage resources required for storing the video and the bandwidth required for transmitting it, provided that the complete second video can still be recovered.
- the threshold here may be a numerical value indicating a high similarity between image blocks (this application does not limit the specific magnitude of the numerical value).
- the first information includes a first encoding parameter of a first image block
- the second information includes a second encoding parameter of a second image block
- the second encoding parameters include motion vectors and/or residuals.
- the first video may include a first image frame
- the second video may include a second image frame
- the first image frame may include a plurality of image blocks (including the first image block)
- the second image frame may include a plurality of image blocks (including the second image block), where the first image block and the second image block may be block units such as macroblocks (MB), prediction blocks (partition), coding units (CU), prediction units (PU), and transform units (TU), which are not limited here.
- the similarity of image features between the first image block and the second image block is relatively high, where the image features of an image block can be one or more of its color features, texture features, shape features, etc.
- the color feature and texture feature are used to describe the surface properties of the object corresponding to the image block.
- the shape features include contour features and region features, the contour features include the outer boundary features of the object, and the region features include the shape region features of the object.
- the first encoding parameter of the first image block and the second encoding parameter of the second image block can be obtained; the first encoding parameter and the second encoding parameter may be motion vectors, may be residuals, or may be both motion vectors and residuals.
- the residual may be calculated based on the image block and the prediction block; for example, the residual may be obtained in the sample domain by subtracting the sample values of the prediction block from the sample values of the image block, sample by sample (pixel by pixel).
- the coding parameters may also include syntax information; for example, the syntax information may be, but is not limited to, any or all of inter prediction parameters, intra prediction parameters, loop filter parameters, and/or other syntax elements (decoded).
- a subtraction operation may be performed on the first encoding parameter and the second encoding parameter to obtain the difference information; alternatively, the difference information may be calculated using another operation that quantifies the difference between the first encoding parameter and the second encoding parameter, which is not limited here.
- the first image block and the second image block may be image blocks with relatively similar image features in the first image frame and the second image frame, and the above processing for the first image block and the second image block may be performed on some or all of the image blocks in the first image frame and the second image frame. In this way, difference information for multiple groups of image blocks can be obtained.
- the difference information may be used as the encoding parameter of the second image block (replacing the original second encoding parameter), and the data of the second video including the encoding parameter of the second image block is encoded to obtain the bitstream of the second video.
- the first image block and the second image block may be image blocks with relatively similar image features in the first image frame and the second image frame, and the above processing for the first image block and the second image block may be performed on some or all of the image blocks in the first image frame and the second image frame.
- the first image frame and the second image frame may be image frames with relatively similar image features in the first video and the second video, and the above processing for the first image block and the second image block may be performed on the image blocks in some or all of the image frames in the first video and the second video.
- the first image block is included in a first image frame
- the second image block is included in a second image frame
- the first image frame and the second image frame are different image frames.
- the first encoding parameter may include a first motion vector
- the second encoding parameter may include a second motion vector
- the difference information includes first difference information
- the first difference information is the difference between the first motion vector and the second motion vector.
- the first coding parameter may include a first residual
- the second coding parameter may include a second residual
- the difference information includes second difference information, and the second difference information is the difference between the first residual and the second residual.
- the first coding parameter may include a first motion vector and a first residual
- the second coding parameter may include a second motion vector and a second residual
- the difference information includes first difference information and second difference information
- the first difference information is the difference between the first motion vector and the second motion vector
- the second difference information is the difference between the first residual and the second residual.
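The parameter-level difference can be sketched as below. The sign convention (second minus first), the dictionary keys `mv` and `residual`, and the function names are assumptions for illustration; the text only states that the difference information is the difference between the corresponding parameters.

```python
def parameter_difference(first_params, second_params):
    """Compute first (motion vector) and second (residual) difference information.

    Each params dict has hypothetical keys:
      'mv'       -> (dx, dy) motion vector
      'residual' -> 2-D list of residual samples
    """
    mv1, mv2 = first_params["mv"], second_params["mv"]
    mv_diff = (mv2[0] - mv1[0], mv2[1] - mv1[1])
    residual_diff = [
        [b - a for a, b in zip(row1, row2)]
        for row1, row2 in zip(first_params["residual"], second_params["residual"])
    ]
    return {"mv_diff": mv_diff, "residual_diff": residual_diff}


def restore_second_params(first_params, difference):
    """Decoder side: rebuild the second encoding parameters from the first."""
    mv1 = first_params["mv"]
    mv = (mv1[0] + difference["mv_diff"][0], mv1[1] + difference["mv_diff"][1])
    residual = [
        [a + d for a, d in zip(row1, diff_row)]
        for row1, diff_row in zip(first_params["residual"], difference["residual_diff"])
    ]
    return {"mv": mv, "residual": residual}
```

For near-duplicate content the motion vectors and residuals of the two blocks tend to agree, so both difference components are mostly zeros.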
- the first coding parameter includes a first residual
- the second coding parameter includes a second residual
- the difference information includes second difference information
- the second difference information is used to indicate the difference between the first residual and the second residual
- encoding the difference information includes: encoding the second difference information by lossless compression coding or lossy compression coding.
- the first difference information represents the difference between motion vectors, and motion vectors are used in inter-frame prediction; if lossy compression were used to compress the first difference information, the inter-frame prediction result could be degraded (for example, artifacts could appear), so lossless compression is used to compress the first difference information, which preserves the accuracy and effect of inter-frame prediction.
- the first encoding parameter includes a first motion vector
- the second encoding parameter includes a second motion vector
- the difference information includes first difference information
- the first difference information is used to indicate the difference between the first motion vector and the second motion vector
- encoding the difference information includes: encoding the first difference information by lossless compression coding.
- the first difference information (which indicates the difference between the first motion vector and the second motion vector) may be encoded by lossless compression coding.
- the second difference information (which indicates the difference between the first residual and the second residual) may be encoded by lossless compression coding or lossy compression coding.
- lossless compression may be lossless compression including transformation (retaining all coefficients), scanning, and entropy coding
- lossy compression may be lossy compression including transformation (retaining low-frequency coefficients), quantization, scanning, and entropy coding.
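The contrast between the two paths can be sketched on a one-dimensional residual with an orthonormal DCT-II. This is only an illustration of the transform and quantization stages: scanning and entropy coding are omitted, the `keep` and `step` parameters are hypothetical, and the "lossless" path here relies on floating-point precision plus rounding back to integers, whereas a real codec would use an integer transform or bypass the transform.

```python
import math


def dct(samples):
    """Orthonormal 1-D DCT-II."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        coeffs.append(scale * sum(
            samples[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        ))
    return coeffs


def idct(coeffs):
    """Inverse of dct() (orthonormal 1-D DCT-III)."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * coeffs[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        samples.append(total)
    return samples


def lossless_round_trip(residual):
    """Keep every coefficient and skip quantization."""
    return [round(v) for v in idct(dct(residual))]


def lossy_round_trip(residual, keep, step):
    """Keep only the first `keep` (low-frequency) coefficients, quantized by `step`."""
    coeffs = dct(residual)
    levels = [round(c / step) if k < keep else 0 for k, c in enumerate(coeffs)]
    return [round(v) for v in idct([level * step for level in levels])]
```

With `keep=1` and `step=4`, a slightly varying residual collapses to its rounded mean, which shows why the lossy path is reserved for residual differences rather than motion vector differences.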
- the encoded data and the first encoding parameter may be sent to the decoding side, so that the decoding side can obtain the second encoding parameter; alternatively, the encoded data may be stored locally for later transmission or retrieval.
- the above encoded data may be encapsulated, and correspondingly, the encapsulated encoded data may be sent to the decoding side, so that the decoding side may obtain the second encoding parameter according to the encapsulated encoded data; or the encapsulated encoded data can be stored locally for later transmission or retrieval.
- the first indication information may also be encoded; the first indication information may indicate that there is an association between the first image block and the second image block, so that the decoding side can, based on the first indication information, obtain the first encoding parameter of the first image block and the difference information, and obtain the second encoding parameter based on the first encoding parameter and the difference information.
- the first indication information includes the identifier of the first image block and the identifier of the second image block; and/or the first indication information includes the identifier of the first video file and the identifier of the second video file.
- the first information includes a first image block
- the second information includes a second image block
- obtaining the difference information according to the first information and the second information includes:
- the difference information includes a third encoding parameter of the second image block.
- the coded data of the first video and the second video may be decoded by a decoder to obtain a video signal of the first video and a video signal of the second video, where the video signal of the first video may include the first image block and the video signal of the second video may include the second image block.
- the decoder may decapsulate the video files of the first video and the second video.
- the first video and the second video can be different videos or different parts of the same video (for example, some videos contain many repeated segments, in which case the first video and the second video can be two repeated segments of that video), which is not limited here.
- the similarity of image features between the first image block and the second image block is relatively high, where the image features of an image block can be one or more of its color features, texture features, shape features, etc.
- the color feature and texture feature are used to describe the surface properties of the object corresponding to the image block.
- the shape features include contour features and region features, the contour features include the outer boundary features of the object, and the region features include the shape region features of the object.
- the first image block may be an image block in an I frame
- the second image block may be an image block in a P frame or a B frame
- alternatively, the first image block may be an image block in a P frame
- the second image block is an image block in a P frame or a B frame.
- the video signal of the second video may be encoded based on the video signal of the first video.
- the third encoding parameter of the second image block may be determined by using the first image block as a reference block of the second image block.
- the first image frame in the first video and the second image frame in the second video may be acquired, where the similarity of the image features of the first image frame and the second image frame is greater than a threshold and the first video and the second video are different videos; the first image frame is used as a reference frame of the second image frame to determine a third encoding parameter of the second image frame.
- the third coding parameter may be residual, motion vector and other syntax information, which is not limited here.
- the first image block may be used as a reference block, and the first image frame containing the reference block may be used as a reference frame to predict the second image block in the current frame (that is, the second image frame including the second image block). Optionally, the first image frame may be temporally after the current frame, or the current frame may be temporally located between a previous reference frame that appears before it in the video sequence and a subsequent reference frame that appears after it (the first image frame can be either the previous reference frame or the subsequent reference frame).
- the image content similarity between the first image frame and the second image frame can be relatively high.
- an image block with higher image content similarity is used as a reference block when encoding the second image frame.
- the information about the reference block that the second image block originally used within its group of pictures (GOP) need not be encoded into the encoded data, which reduces the size of the encoded data, the storage space required for storing the second video, and the bandwidth required for transmitting the second video.
- in order to enable the decoder to obtain a complete and accurate video signal of the second video, second indication information also needs to be encoded into the encoded data, where the second indication information is used to indicate that the reference block of the second image block in the second video is the first image block.
- the decoder may know that the reference block of the second image block is the first image block when decoding.
- the second indication information may include an identifier of the first image block and an identifier of the second image block.
- the second indication information may indicate that the reference frame of the second image frame is the first image frame; the first image frame can then be acquired according to the second indication information, and the second image frame can be reconstructed according to the encoding parameters of the second image frame and the first image frame.
- the method also includes:
- the difference information includes a fourth encoding parameter of the fourth image block.
- image blocks in the above-mentioned other image frames that have higher image content similarity with the second image frame can be used as reference blocks when encoding the second image frame.
- the second image frame may also include image blocks other than the second image block (such as the fourth image block); when encoding the fourth image block, an image block in the third video may be used as the reference block (the third image block) of the fourth image block.
- the second encoding parameter and second indication information may be encoded, where the second indication information is used to indicate that the reference block of the fourth image block is the third image block.
- the method also includes:
- the similarity of the image features of the sixth image block is greater than a threshold, and the first image block and the fifth image block belong to the same or different image frames in the first video file;
- the difference information includes a fifth encoding parameter of the fifth image block.
- the image block with a high image content similarity between the other image frame and the second image frame may be used as a reference block when encoding the second image frame.
- the second image frame may also include image blocks other than the second image block (such as the sixth image block); when encoding the sixth image block, an image block in the first video may be used as the reference block (the fifth image block) of the sixth image block. The reference block may be an image block in the first image frame (the image frame where the first image block is located) or another image block in the first video.
- the image features include at least one of the following:
- the present application provides a video decoding method, the method comprising:
- the first video file is decoded to obtain the first information; the indication information is used to indicate that there is an association between the first information and the second information (optionally, the indication information is used to indicate that the difference information is obtained from the difference between the first information and the second information), the first information corresponds to the first image block in the first video file, the second information corresponds to the second image block in the second video file, and the first video file and the second video file are different video files;
- the indication information can also be encoded; the indication information can indicate that the difference information is obtained from the difference between the first image block and the second image block, so that the decoding side can, based on the indication information, obtain the first encoding parameter of the first image block and the difference information, and obtain the second encoding parameter based on the first encoding parameter and the difference information.
- the indication information may include an identifier indicating the first image block and an identifier indicating the second image block.
- this application can replace the decoding result of the original second video with the difference information. Because the bitstream obtained by encoding the difference information is smaller than the bitstream obtained by encoding the decoding result of the second video, this reduces the storage resources required for storing the video and the bandwidth required for transmitting it, provided that the complete second video file can still be recovered.
- the first information includes a first encoding parameter of a first image block
- the second information includes a second encoding parameter of a second image block
- the second encoding parameters include motion vectors and/or residuals.
- the first image block is included in a first image frame
- the second image block is included in a second image frame
- the first image frame and the second image frame are different image frames.
- the first information includes a first image block
- the second information includes a second image block
- the indication information is used to indicate that the reference block of the second image block is the first image block.
- the relevant information of the reference frame of the second image frame in the group of pictures (GOP) where the second image frame is located may not be encoded into the encoded data, reducing the size of the encoded data, the storage space required for storing the second video, and the bandwidth required for transmitting the second video.
- the first image block is an image block in an I frame
- the second image block is an image block in a P frame or a B frame
- the first image block is an image block in a P frame
- the second image block is an image block in a P frame or a B frame.
- the similarity of the image features of the first image block and the second image block is greater than a threshold, and the image features include at least one of the following:
- the present application provides a video coding method, the method comprising: acquiring a first image frame in a first video and a second image frame in a second video (optionally, the similarity of the image features of the first image frame and the second image frame is greater than a threshold), where the first video and the second video are different video files; using the first image frame as a reference frame of the second image frame to determine difference information, where the difference information is used to indicate the difference between the first image frame and the second image frame; and encoding the difference information to obtain encoded data.
- the coded data may be used by the decoding side to restore and obtain the second image frame.
- the relevant information of the reference frame of the second image frame in the group of pictures (GOP) where the second image frame is located may not be encoded into the encoded data, reducing the size of the encoded data, the storage space required for storing the second video, and the bandwidth required for transmitting the second video.
- the acquiring the first image frame in the first video file and the second image frame in the second video file includes:
- determining, from multiple image frames, an image frame whose encoding parameters have a similarity to the encoding parameters of the second image frame greater than a threshold as the first image frame, where the encoding parameters include motion vectors, residuals and/or DCT coefficients; and/or,
- determining, from the multiple image frames, an image frame whose corresponding subtitle information and/or audio information has a similarity to that of the second image frame greater than a threshold as the first image frame.
- the image features include at least one of the following:
- the first image frame is an I frame
- the second image frame is a P frame or a B frame
- the first image frame is a P frame
- the second image frame is a P frame or a B frame.
- the first encoding parameter includes a first motion vector
- the second encoding parameter includes a second motion vector
- the difference information includes first difference information
- the first difference information indicating a difference between the first motion vector and the second motion vector
- the encoding of the difference information includes:
- the first difference information is encoded by lossless compression encoding.
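The motion-vector case above can be illustrated with a minimal sketch. It assumes motion vectors are integer `(dx, dy)` pairs and uses zlib as a stand-in lossless coder; the function names and sample values are hypothetical and are not taken from the application.

```python
import struct
import zlib

# Hypothetical motion vectors (dx, dy) of matching blocks in two video files.
first_mvs = [(4, -2), (4, -2), (3, 0), (0, 1)]
second_mvs = [(4, -2), (5, -2), (3, 0), (0, 0)]


def mv_difference(first, second):
    """First difference information: second motion vector minus first."""
    return [(sx - fx, sy - fy) for (fx, fy), (sx, sy) in zip(first, second)]


def encode_lossless(diff):
    """Pack differences as signed 16-bit integers, then compress losslessly."""
    flat = [component for mv in diff for component in mv]
    return zlib.compress(struct.pack(f"<{len(flat)}h", *flat))


def decode_lossless(data):
    """Invert encode_lossless and regroup the values into (dx, dy) pairs."""
    raw = zlib.decompress(data)
    flat = struct.unpack(f"<{len(raw) // 2}h", raw)
    return list(zip(flat[0::2], flat[1::2]))


def restore(first, diff):
    """Decoder side: second motion vector = first motion vector + difference."""
    return [(fx + dx, fy + dy) for (fx, fy), (dx, dy) in zip(first, diff)]


encoded = encode_lossless(mv_difference(first_mvs, second_mvs))
restored = restore(first_mvs, decode_lossless(encoded))
```

Because the coding is lossless, `restored` equals `second_mvs` exactly, which matters for motion vectors since a small error would point the decoder at the wrong reference region.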
- the first coding parameter includes a first residual
- the second coding parameter includes a second residual
- the difference information includes second difference information
- the second difference information indicating a difference between the first residual and the second residual
- the encoding of the difference information includes:
- the second difference information is encoded by lossless compression coding or lossy compression coding.
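As a rough illustration of the lossy option for residuals, the sketch below quantizes the residual difference with a uniform step. The quantizer, the step size and the sample values are assumptions made for the example; the application does not prescribe a particular lossy scheme.

```python
def residual_difference(first, second):
    """Second difference information: second residual minus first residual."""
    return [s - f for f, s in zip(first, second)]


def quantize(diff, step):
    """Lossy when step > 1: keep only the nearest quantization level."""
    return [round(d / step) for d in diff]


def dequantize(levels, step):
    """Map quantization levels back to approximate difference values."""
    return [level * step for level in levels]


# Hypothetical residual samples of two similar image blocks.
first_residual = [10, -3, 0, 7, 2]
second_residual = [12, -3, 1, 2, 2]
step = 4  # hypothetical quantization step

diff = residual_difference(first_residual, second_residual)
levels = quantize(diff, step)
approx_second = [f + d for f, d in zip(first_residual, dequantize(levels, step))]
```

With `step = 1` the quantizer is the identity and the scheme is lossless; with a larger step each restored residual sample is off by at most half a step.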
- the encoding the difference information includes:
- the first indication information includes an identifier of the first image frame and an identifier of the second image frame.
- the present application also provides a video decoding method, the method comprising: acquiring encoded data;
- the first video file is decoded to obtain the first image frame;
- the indication information is used to indicate that there is an association between the first image frame and the second image frame, the first image frame belongs to a first video file, the second image frame belongs to a second video file, and the first video file and the second video file are different video files;
- the similarity between the image features of the first image frame and the second image frame is greater than a threshold; and/or,
- the degree of similarity between the first encoding parameter and the second encoding parameter is greater than a threshold; and/or,
- the similarity of the subtitle information and/or audio information in the first video file and the second video file is greater than a threshold; and/or,
- the similarity between the DCT coefficients of the first image frame and the DCT coefficients of the second image frame is greater than a threshold.
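The threshold conditions listed above can be sketched as a simple selection step. The example below uses cosine similarity over flattened DCT coefficients; the metric, the threshold value and the sample coefficients are illustrative assumptions, since the application does not fix a specific similarity measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equally sized coefficient lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pick_reference(target, candidates, threshold):
    """Index of the most similar candidate above the threshold, else None."""
    best_index, best_score = None, threshold
    for index, candidate in enumerate(candidates):
        score = cosine_similarity(target, candidate)
        if score > best_score:
            best_index, best_score = index, score
    return best_index


# Hypothetical DCT coefficients of a frame in the second video and of two
# candidate frames in the first video.
target = [120, 8, -3, 0]
candidates = [[10, -40, 25, 5], [118, 9, -2, 1]]
chosen = pick_reference(target, candidates, threshold=0.95)
```

Here `chosen` is `1`, the candidate whose coefficients nearly match the target; if no candidate exceeds the threshold the function returns `None` and ordinary in-file prediction would be used instead.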
- the first image frame is an I frame
- the second image frame is a P frame or a B frame
- the first image frame is a P frame
- the second image frame is a P frame or a B frame.
- the present application provides a video encoding device, the device comprising:
- An acquisition module configured to acquire a first encoding parameter of a first image block and a second encoding parameter of a second image block, the first image block being an image block in the first video file, and the second image block being an image block in the second video file, the first video file and the second video file are different video files, and the first encoding parameters and the second encoding parameters include motion vectors and/or residuals;
- a difference determination module configured to obtain difference information according to the first encoding parameter and the second encoding parameter, where the difference information is used to indicate the difference between the first encoding parameter and the second encoding parameter;
- An encoding module configured to encode the difference information to obtain encoded data.
- the encoded data may be used by the decoding side to restore and obtain the second encoding parameter.
- the original second encoding parameter can be replaced by the difference information. Since the similarity between the image features of the first image block and the second image block is greater than a threshold, the encoded data obtained by encoding the difference information is much smaller than the encoded data obtained by encoding the second encoding parameter, and the second encoding parameter can be restored based on the difference information and the first encoding parameter. This reduces the storage resources required for storing the video and the bandwidth required for video transmission, while the complete second image block can still be recovered.
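The size argument can be checked with a toy experiment. The sketch below treats encoding parameters as raw bytes, makes the second set differ from the first in only three positions, and compares the compressed sizes; zlib, the byte-wise modular difference and the generated data are all assumptions chosen for illustration, not the coder used by the application.

```python
import random
import zlib

# Hypothetical first encoding parameters: 1024 pseudo-random bytes.
rng = random.Random(0)
first_params = bytes(rng.randrange(256) for _ in range(1024))

# Hypothetical second encoding parameters: nearly identical to the first.
second = bytearray(first_params)
for position in (10, 200, 777):
    second[position] = (second[position] + 1) % 256
second_params = bytes(second)

# Difference information: byte-wise difference modulo 256 (mostly zeros).
diff = bytes((s - f) % 256 for f, s in zip(first_params, second_params))

size_direct = len(zlib.compress(second_params))  # encode second parameters
size_diff = len(zlib.compress(diff))             # encode difference only

# Decoder side: first parameters + difference gives back the second set.
restored = bytes((f + d) % 256 for f, d in zip(first_params, diff))
```

Since the difference is almost all zeros it compresses far better than the second parameters themselves, while `restored` still equals `second_params` byte for byte. The gain shrinks as the two blocks become less similar, which is why the similarity threshold is applied first.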
- the acquiring module is specifically used for:
- the similarity between the image features of the first image block and the second image block is greater than a threshold; and/or,
- the degree of similarity between the first encoding parameter and the second encoding parameter is greater than a threshold; and/or,
- the similarity of the subtitle information and/or audio information in the first video file and the second video file is greater than a threshold; and/or,
- the similarity between the DCT coefficients of the first image block and the DCT coefficients of the second image block is greater than a threshold.
- the first image block is included in a first image frame
- the second image block is included in a second image frame
- the first image frame and the second image frame are different image frames.
- the first image block includes a first image frame in a first video
- the second image block includes a second image frame in a second video
- the first video and the second video are different video files.
- the first encoding parameter may include a first motion vector
- the second encoding parameter may include a second motion vector
- the difference information includes first difference information
- the first difference information is a difference between the first motion vector and the second motion vector.
- the first coding parameter may include a first residual
- the second coding parameter may include a second residual
- the difference information includes second difference information, and the second difference information is a difference between the first residual and the second residual.
- the first coding parameter may include a first motion vector and a first residual
- the second coding parameter may include a second motion vector and a second residual
- the difference information includes first difference information and second difference information
- the first difference information is the difference between the first motion vector and the second motion vector
- the second difference information is the difference between the first residual and the second residual.
- the first coding parameter includes a first residual
- the second coding parameter includes a second residual
- the difference information includes second difference information
- the second difference information is used to indicate the difference between the first residual and the second residual
- the coding module is specifically used for:
- the second difference information is encoded by lossless compression coding or lossy compression coding.
- the first encoding parameter includes a first motion vector
- the second encoding parameter includes a second motion vector
- the difference information includes first difference information
- the first difference information is used to indicate the difference between the first motion vector and the second motion vector
- the coding module is specifically used to:
- the first difference information is encoded by lossless compression encoding.
- the coding module is specifically used for:
- the device also includes:
- a sending module configured to send the encoded data to a decoding side, so that the decoding side can obtain the second encoding parameters according to the encoded data;
- a storage module configured to store the encoded data.
- the present application provides a video decoding device, the device comprising:
- An acquisition module used to acquire encoded data
- a decoding module configured to decode the encoded data to obtain difference information
- the acquisition module is further configured to decode the first image block according to the indication information to obtain the first encoding parameter, wherein the indication information is used to indicate that there is an association between the first image block and the second image block, the first image block belongs to a first video file, the second image block belongs to a second video file, and the first video file and the second video file are different video files;
- a coding parameter restoring module configured to obtain the second coding parameter according to the first coding parameter and the difference information, where the first coding parameter and the second coding parameter include a motion vector and/or a residual.
- the similarity between the image features of the first image block and the second image block is greater than a threshold; and/or,
- the degree of similarity between the first encoding parameter and the second encoding parameter is greater than a threshold; and/or,
- the similarity of the subtitle information and/or audio information in the first video file and the second video file is greater than a threshold; and/or,
- the similarity between the DCT coefficients of the first image block and the DCT coefficients of the second image block is greater than a threshold.
- the image features include at least one of the following:
- the first image block is included in a first image frame
- the second image block is included in a second image frame
- the first image frame and the second image frame are different image frames.
- the first image block includes a first image frame in a first video
- the second image block includes a second image frame in a second video
- the first video and the second video are different video files.
- the present application provides a video encoding device, the device comprising:
- An acquisition module configured to acquire a first image block in a first video file and a second image block in a second video file, where the first video file and the second video file are different video files;
- a difference determination module configured to use the first image block as a reference block of the second image block and determine difference information, where the difference information is used to indicate the difference between the first image block and the second image block;
- An encoding module configured to encode the difference information to obtain encoded data.
- the encoded data is used by the decoding side to restore and obtain the second image block.
- the relevant information of the block originally used as the reference block of the second image block in the group of pictures (GOP) where the second image block is located may not be encoded into the encoded data, reducing the size of the encoded data, the storage space required for storing the second video, and the bandwidth required for transmitting the second video.
- the acquiring module is specifically used for:
- the coding parameters include a motion vector, a residual and/or DCT coefficients; and/or
- the first image block is an image block in an I frame
- the second image block is an image block in a P frame or a B frame; or,
- the first image block is an image block in a P frame
- the second image block is an image block in a P frame or a B frame.
- the coding module is specifically used for:
- the acquiring module is further configured to acquire the third image block in the third video file and the fourth image block in the second video file, where the second image block and the fourth image block belong to the same image frame in the second video file, the similarity of the image features of the third image block and the fourth image block is greater than a threshold, and the third video file and the second video file are different video files;
- the difference determination module is also used for:
- the difference information is determined by using the first image block as a reference block of the second image block and using the third image block as a reference block of the fourth image block.
- the present application also provides a video decoding device, the device comprising:
- An acquisition module used to acquire encoded data
- a decoding module configured to decode the encoded data to obtain difference information
- the acquisition module is further configured to decode the first video file according to the indication information to obtain the first image block;
- the indication information is used to indicate that there is an association between the first image block and the second image block
- the first image block belongs to a first video file
- the second image block belongs to a second video file
- the first video file and the second video file are different video files;
- a reconstruction module configured to obtain the second image block according to the first image block and the difference information.
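A minimal sketch of the decoder-side flow described by these modules follows. It assumes blocks are small 2-D lists of pixel values, that the first video has already been decoded into a lookup table, and that the difference is a plain pixel-wise subtraction; `BlockId`, `reconstruct_second_block` and the sample data are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockId:
    """Hypothetical identifier of an image block inside a video file."""
    video: str
    frame: int
    block: int


def reconstruct_second_block(reference_id, difference, decoded_blocks):
    """Second image block = referenced first image block + difference."""
    reference = decoded_blocks[reference_id]
    return [
        [r + d for r, d in zip(reference_row, difference_row)]
        for reference_row, difference_row in zip(reference, difference)
    ]


# Blocks obtained by decoding the first video file (hypothetical values).
decoded_first_video = {BlockId("first_video", 0, 3): [[100, 102], [98, 101]]}

# Difference information and indication obtained from the encoded data.
difference = [[1, 0], [-2, 3]]
indicated_reference = BlockId("first_video", 0, 3)

second_block = reconstruct_second_block(
    indicated_reference, difference, decoded_first_video
)
```

The key point is that the reference is looked up in a different video file than the one being reconstructed, so the decoder needs access to the first video file before it can restore the second.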
- the relevant information of the reference frame of the second image frame in the group of pictures (GOP) where the second image frame is located may not be encoded into the encoded data, reducing the size of the encoded data, the storage space required for storing the second video, and the bandwidth required for transmitting the second video.
- the similarity between the image features of the first image block and the second image block is greater than a threshold; and/or,
- the degree of similarity between the first encoding parameter and the second encoding parameter is greater than a threshold; and/or,
- the similarity of the subtitle information and/or audio information in the first video file and the second video file is greater than a threshold; and/or,
- the similarity between the DCT coefficients of the first image block and the DCT coefficients of the second image block is greater than a threshold.
- the indication information includes an identifier of the first image block and an identifier of the second image block.
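One possible way to carry the two identifiers is a small fixed-size record. The layout below (video id, frame index and block index for each block, as six unsigned 32-bit integers) is purely an assumption for illustration; the application only states that the indication information includes the two identifiers.

```python
import struct
from typing import NamedTuple


class BlockRef(NamedTuple):
    """Hypothetical identifier: which video, which frame, which block."""
    video_id: int
    frame_index: int
    block_index: int


_FORMAT = "<6I"  # two BlockRef records, little-endian unsigned 32-bit fields


def pack_indication(first: BlockRef, second: BlockRef) -> bytes:
    """Serialize the identifiers of the first and second image blocks."""
    return struct.pack(_FORMAT, *first, *second)


def unpack_indication(data: bytes):
    """Recover (first, second) block identifiers on the decoding side."""
    values = struct.unpack(_FORMAT, data)
    return BlockRef(*values[:3]), BlockRef(*values[3:])


first_ref = BlockRef(video_id=1, frame_index=0, block_index=3)
second_ref = BlockRef(video_id=2, frame_index=5, block_index=3)
payload = pack_indication(first_ref, second_ref)
decoded_first, decoded_second = unpack_indication(payload)
```

The record costs 24 bytes per association in this layout; a real bitstream would likely use a more compact, entropy-coded syntax element.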
- the first image frame is an I frame
- the second image frame is a P frame or a B frame
- the first image frame is a P frame
- the second image frame is a P frame or a B frame.
- the decoding module is further configured to: decode the encoded data to obtain the fourth image block and second indication information in the second video, where the second indication information is used to indicate that the reference block of the fourth image block is the third image block in the third video (or that there is an association between the image blocks), and the third video and the second video are different video files; the acquiring module is further configured to acquire the third image block according to the second indication information; the reconstruction module is further configured to reconstruct the fourth image block according to the second encoding parameter and the third image block.
- the method also includes:
- the third indication information is used to indicate that the reference block of the sixth image block is the fifth image block in the first video (or that there is an association between the image blocks), wherein the first image block and the fifth image block belong to the same or different image frames in the first video; acquiring the fifth image block according to the third indication information; and reconstructing the sixth image block according to the third encoding parameter and the fifth image block.
- the present application provides a video encoding device, the device comprising:
- An acquisition module configured to acquire a first video file and a second video file, where the first video file and the second video file are different video files;
- a decoding module configured to decode the first video file and the second video file to obtain the first information of the first image block in the first video file and the second information of the second image block in the second video file;
- a difference determination module configured to obtain difference information according to the first information and the second information, where the difference information is used to indicate a difference between the first information and the second information;
- An encoding module configured to encode the difference information to obtain encoded data.
- the encoded data is used by the decoding side to restore and obtain the second information.
- This application can replace the decoding result of the original second video file with the difference information. Because the bitstream obtained by encoding the difference information is smaller than the bitstream obtained by encoding the decoding result of the second video file, this reduces the storage resources required for storing the video and the bandwidth required for video transmission, while the complete second video file can still be recovered.
- the similarity between image features of the first image block and the second image block is greater than a threshold.
- the difference information replaces the original second encoding parameter. Since the similarity between the first image block and the second image block is greater than the threshold, the encoded data obtained by encoding the difference information is much smaller than the encoded data obtained by encoding the second information, and the second information can be restored based on the difference information and the first information. This reduces the storage resources required for storing the video and the bandwidth required for video transmission, while the complete second video can still be recovered.
- the similarity between the first information and the second information is greater than a threshold.
- the difference information replaces the original second encoding parameter. Since the similarity between the first information and the second information is greater than the threshold, the encoded data obtained by encoding the difference information is much smaller than the encoded data obtained by encoding the second information, and the second information can be restored based on the difference information and the first information. This reduces the storage resources required for storing the video and the bandwidth required for video transmission, while the complete second video can still be recovered.
- the first information includes a first encoding parameter of a first image block
- the second information includes a second encoding parameter of a second image block
- the second encoding parameters include motion vectors and/or residuals.
- the first video may include a first image frame
- the second video may include a second image frame
- the first image frame may include a plurality of image blocks (including the first image block)
- the second image frame may include a plurality of image blocks (including the second image block), wherein the first image block and the second image block may be block units such as a macroblock (MB), prediction block (partition), coding unit (CU), prediction unit (PU) or transform unit (TU), which are not limited here.
- the first image block is included in a first image frame
- the second image block is included in a second image frame
- the first image frame and the second image frame are different image frames.
- the first encoding parameter includes a first motion vector
- the second encoding parameter includes a second motion vector
- the difference information includes first difference information
- the first difference information indicating a difference between the first motion vector and the second motion vector
- the encoding module is specifically used for:
- the first difference information is encoded by lossless compression encoding.
- the first coding parameter includes a first residual
- the second coding parameter includes a second residual
- the difference information includes second difference information
- the second difference information indicating a difference between the first residual and the second residual
- the encoding module is specifically used for:
- the second difference information is encoded by lossless compression coding or lossy compression coding.
- the device also includes:
- a sending module configured to send the encoded data to a decoding side, so that the decoding side can obtain the second encoding parameters according to the encoded data;
- a storage module configured to store the encoded data.
- the coding module is specifically used for:
- the first indication information may indicate that there is an association between the first image block and the second image block.
- the first indication information includes an identifier of the first image block and an identifier of the second image block.
- the image features include at least one of the following:
- the first information includes a first image block
- the second information includes a second image block
- the coded data of the first video and the second video may be decoded by a decoder to obtain a video signal of the first video and a video signal of the second video, wherein the video signal of the first video may include the first image block, and the video signal of the second video may include the second image block.
- the decoder may decapsulate the video files of the first video and the second video.
- the first video and the second video can be different videos, or different parts of the same video (for example, some videos contain a large number of repeated video segments, in which case the first video and the second video can be two repeated video clips in that video), which is not limited here.
- the first video may include a first image frame
- the second video may include a second image frame
- the first image frame may include a plurality of image blocks (including the first image block)
- the second image frame may include a plurality of image blocks (including the second image block), wherein the first image block and the second image block may be block units such as a macroblock (MB), prediction block (partition), coding unit (CU), prediction unit (PU) or transform unit (TU), which are not limited here.
- the similarity of image features between the first image block and the second image block is relatively high, wherein the image features of an image block can be one or more of the color features, texture features, shape features, etc. of the image block.
- the color feature and texture feature are used to describe the surface properties of the object corresponding to the image block.
- the shape features include contour features and region features, the contour features include the outer boundary features of the object, and the region features include the shape region features of the object.
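As one concrete example of a color feature, the sketch below compares coarse intensity histograms of two blocks using histogram intersection. The bin count, the intersection metric and the sample pixel values are assumptions for illustration; texture and shape features would need their own descriptors.

```python
def gray_histogram(block, bins=4, max_value=256):
    """Normalized intensity histogram of a block given as rows of pixels."""
    histogram = [0] * bins
    width = max_value // bins
    for row in block:
        for pixel in row:
            histogram[min(pixel // width, bins - 1)] += 1
    total = sum(histogram)
    return [count / total for count in histogram]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means the normalized histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical 2x2 blocks: a and b look alike, c is a flat mid-gray block.
block_a = [[10, 20], [200, 210]]
block_b = [[12, 25], [195, 220]]
block_c = [[100, 100], [100, 100]]

similar = histogram_intersection(gray_histogram(block_a), gray_histogram(block_b))
dissimilar = histogram_intersection(gray_histogram(block_a), gray_histogram(block_c))
```

With these values `similar` is `1.0` and `dissimilar` is `0`, so only `block_b` would pass a similarity threshold against `block_a`. A histogram ignores pixel positions, so in practice it would be combined with other features before choosing a reference block.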
- the first image block may be an image block in an I frame
- the second image block may be an image block in a P frame or a B frame
- the first image block may be an image block in a P frame
- the second image block is an image block in a P frame or a B frame.
- the video signal of the second video may be encoded based on the video signal of the first video.
- the third encoding parameter of the second image block may be determined by using the first image block as a reference block of the second image block.
- the first image frame in the first video and the second image frame in the second video may be acquired, where the similarity of the image features of the first image frame and the second image frame is greater than a threshold, the first video and the second video are different videos, and the first image frame is used as a reference frame of the second image frame to determine a third encoding parameter of the second image frame.
- the third coding parameter may be residual, motion vector and other syntax information, which is not limited here.
- the first image block may be used as a reference block, and the first image frame containing the reference block may be used as a reference frame to predict the second image block in the current frame (that is, the second image frame including the second image block). Optionally, the first image frame may be temporally after the current frame, or the current frame may be temporally located between a previous reference frame that appears before the current frame in the video sequence and a subsequent reference frame that appears after the current frame in the video sequence (the first image frame can be either the previous reference frame or the subsequent reference frame).
- the image content similarity between the first image frame and the second image frame can be relatively high.
- the image block with the higher similarity is used as a reference block when encoding the second image frame.
- the relevant information of the block originally used as the reference block of the second image block in the group of pictures (GOP) where the second image block is located may not be encoded into the encoded data, reducing the size of the encoded data, the storage space required for storing the second video, and the bandwidth required for transmitting the second video.
- in order to enable the decoder to decode and obtain a complete and accurate video signal of the second video, it is also necessary to encode second indication information into the encoded data, where the second indication information is used to indicate that the reference block of the second image block in the second video is the first image block.
- the decoder may know that the reference block of the second image block is the first image block when decoding.
- the second indication information may include an identifier of the first image block and an identifier of the second image block.
- the second indication information may indicate that the reference frame of the second image frame is the first image frame; further, the first image frame is acquired according to the second indication information, and the second image frame is reconstructed according to the encoding parameters of the second image frame and the first image frame.
- the acquisition module is also used to:
- the decoding module is further configured to decode the second video file to obtain a fourth image block
- the difference determination module is further configured to use the third image block as a reference block of the fourth image block to determine the difference information, where the difference information includes the encoding parameters of the fourth image block.
- the decoding module is further configured to decode the first video file to obtain a fifth image block
- the similarity of the image features of the sixth image block is greater than a threshold, and the first image block and the fifth image block belong to the same or different image frames in the first video file;
- the difference determination module is further configured to use the fifth image block as a reference block of the sixth image block to determine the difference information, where the difference information includes the encoding parameters of the fifth image block.
- the present application provides a video decoding device, the device comprising:
- An acquisition module used to acquire encoded data
- a decoding module configured to decode the coded data to obtain difference information and indication information, where the indication information is used to indicate that there is an association between the first information and the second information (optionally, the indication information is used to indicate that the difference information is obtained according to the difference between the first information and the second information), the first information corresponds to the first image block in the first video file, the second information corresponds to the second image block in the second video file, and the first video file and the second video file are different video files;
- the acquisition module is further configured to decode the first video file according to the indication information to obtain the first information
- a reconstruction module configured to obtain the second information according to the first information and the difference information.
- the first information includes a first encoding parameter of a first image block
- the second information includes a second encoding parameter of a second image block
- the second encoding parameters include motion vectors and/or residuals.
- the first image block is included in a first image frame
- the second image block is included in a second image frame
- the first image frame and the second image frame are different image frames.
- the indication information includes an identifier of the first image block and an identifier of the second image block.
- the first information includes a first image block
- the second information includes a second image block
- the indication information is used to indicate that the reference block of the second image block is the first image block.
- the first image block is an image block in an I frame
- the second image block is an image block in a P frame or a B frame
- the first image block is an image block in a P frame
- the second image block is an image block in a P frame or a B frame.
- the similarity of the image features of the first image block and the second image block is greater than a threshold, and the image features include at least one of the following:
- the embodiment of the present application provides a computing device, which may include a memory, a processor, and a bus system, wherein the memory is used to store programs, and the processor is used to execute the programs in the memory, so as to perform any optional method of the first aspect, any optional method of the second aspect, any optional method of the third aspect, any optional method of the fourth aspect, any optional method of the fifth aspect, any optional method of the sixth aspect, any optional method of the seventh aspect, or any optional method of the eighth aspect.
- the embodiment of the present application provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when it runs on a computer, the computer executes any optional method of the first aspect, any optional method of the second aspect, any optional method of the third aspect, any optional method of the fourth aspect, any optional method of the fifth aspect, any optional method of the sixth aspect, any optional method of the seventh aspect, or any optional method of the eighth aspect.
- the embodiment of the present application provides a computer program product, including code which, when executed, is used to implement any optional method of the first aspect, any optional method of the second aspect, any optional method of the third aspect, any optional method of the fourth aspect, any optional method of the fifth aspect, any optional method of the sixth aspect, any optional method of the seventh aspect, or any optional method of the eighth aspect.
- the present application provides a chip system
- the chip system includes a processor, used to support the execution device or training device in realizing the functions involved in the above aspects, for example, sending or processing the data and/or information involved in the above methods.
- the chip system further includes a memory, and the memory is used for storing necessary program instructions and data of the execution device or the training device.
- the system-on-a-chip may consist of chips, or may include chips and other discrete devices.
- An embodiment of the present application provides a video encoding method, the method comprising: acquiring a first encoding parameter of a first image block and a second encoding parameter of a second image block, where the first image block is an image block in a first video file, the second image block is an image block in a second video file, the first video file and the second video file are different video files, and the first encoding parameter and the second encoding parameter include motion vectors and/or residuals; obtaining difference information according to the first encoding parameter and the second encoding parameter, where the difference information is used to indicate the difference between the first encoding parameter and the second encoding parameter; and encoding the difference information to obtain encoded data.
- the encoded data is used by the decoding side to restore and obtain the second encoding parameters.
- the original second encoding parameter is replaced by the difference information. Since the similarity of the image features of the first image block and the second image block is greater than the threshold, the bitstream data obtained by encoding the difference information will be smaller than the bitstream data obtained by encoding the second encoding parameter, and the second encoding parameter can be restored based on the difference information and the first encoding parameter. This is equivalent to reducing the storage resources required for storing the video and the bandwidth required for video transmission, on the premise that the complete second image block can be recovered.
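The encoder-side idea can be sketched as below. The flat integer lists standing in for motion vectors and the plain element-wise subtraction are assumptions for illustration; the source does not specify how the difference is computed or entropy coded.

```python
def difference_info(first_params, second_params):
    """Element-wise difference between second and first encoding parameters."""
    if len(first_params) != len(second_params):
        raise ValueError("parameter lists must have equal length")
    return [s - f for f, s in zip(first_params, second_params)]


def restore_second(first_params, diff):
    """Decoder-side inverse: first encoding parameters plus the difference."""
    return [f + d for f, d in zip(first_params, diff)]


# Assumed toy motion vectors (x, y pairs flattened) of two similar blocks
# taken from two different video files.
first_mv = [12, -3, 12, -3, 11, -4]
second_mv = [12, -3, 13, -3, 11, -4]

diff = difference_info(first_mv, second_mv)
zero_count = diff.count(0)
restored = restore_second(first_mv, diff)
```

When the two blocks are similar, most entries of `diff` are zero, which is the kind of input an entropy coder typically represents compactly; the exact saving depends on the coder actually used.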
- FIG. 1A is a block diagram of an example of a video decoding system for implementing an embodiment of the present invention
- FIG. 1B is a block diagram of another example of a video decoding system for implementing an embodiment of the present invention.
- FIG. 2 is a block diagram of an example structure of an encoder for implementing an embodiment of the present invention
- FIG. 3 is a block diagram of an example structure of a decoder for implementing an embodiment of the present invention
- FIG. 4 is a block diagram of an example of a video decoding device for implementing an embodiment of the present invention.
- FIG. 5A is a block diagram of another example of a video decoding device for implementing an embodiment of the present invention.
- FIG. 5B is a schematic flowchart of a video encoding method according to an embodiment of the present application.
- FIG. 5C is a schematic flowchart of a video decoding method according to an embodiment of the present application.
- FIG. 6 is a schematic flowchart of a video encoding method according to an embodiment of the present application.
- FIG. 7 is a schematic flowchart of a video encoding method according to an embodiment of the present application.
- FIG. 8 is a schematic flowchart of a video coding method according to an embodiment of the present application.
- FIG. 9 is a schematic diagram of a video coding according to an embodiment of the present application.
- FIG. 10 is a schematic diagram of a video coding according to an embodiment of the present application.
- FIG. 11 is a schematic flowchart of a video encoding method according to an embodiment of the present application.
- FIG. 12 is a schematic flowchart of a video decoding method according to an embodiment of the present application.
- FIG. 13 is a schematic flowchart of a video encoding method according to an embodiment of the present application.
- FIG. 14 is a schematic flowchart of a video encoding method according to an embodiment of the present application.
- FIG. 15 is a schematic flowchart of a video decoding method according to an embodiment of the present application.
- FIG. 16a is a schematic flowchart of a video decoding method according to an embodiment of the present application.
- FIG. 16b is a schematic flowchart of a video encoding method according to an embodiment of the present application.
- FIG. 17 is a schematic structural diagram of a video encoding device according to an embodiment of the present application.
- FIG. 18 is a schematic structural diagram of a video decoding device according to an embodiment of the present application.
- FIG. 19 is a schematic structural diagram of a video encoding device according to an embodiment of the present application.
- FIG. 20 is a schematic structural diagram of a video decoding device according to an embodiment of the present application.
- FIG. 21 is a schematic structural diagram of a video encoding device according to an embodiment of the present application.
- FIG. 22 is a schematic structural diagram of a video decoding device according to an embodiment of the present application.
- the corresponding device may include one or more units, such as functional units, to perform the described one or more method steps (for example, one unit performs one or more steps , or a plurality of units, each of which performs one or more of the plurality of steps), even if such one or more units are not explicitly described or illustrated in the drawing.
- the corresponding method may include a step to perform the functionality of one or more units (for example, a step to perform one or more units functionality, or a plurality of steps, each of which performs the functionality of one or more of the plurality of units), even if such one or more steps are not explicitly described or illustrated in the drawing.
- “at least one” means one or more, and “multiple” means two or more.
- “And/or” describes the association relationship of associated objects, indicating that there may be three types of relationships, for example, A and/or B, which can mean: A exists alone, A and B exist at the same time, and B exists alone, where A, B can be singular or plural.
- the character “/” generally indicates that the contextual objects are an “or” relationship.
- “At least one of the following” or similar expressions refer to any combination of these items, including any combination of single or plural items.
- At least one item (piece) of a, b, or c can represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, c can be single or multiple.
- Video coding generally refers to the processing of sequences of pictures that form a video or video sequence.
- In the field of video coding, the terms “picture”, “frame” or “image” may be used as synonyms.
- Video coding as used herein means video encoding or video decoding.
- Video encoding is performed on the source side and typically involves processing (eg, by compressing) raw video pictures to reduce the amount of data needed to represent the video pictures for more efficient storage and/or transmission.
- Video decoding is performed at the destination and typically involves inverse processing relative to the encoder to reconstruct the video picture.
- the “coding” of video pictures involved in the embodiments should be understood as involving “encoding” or “decoding” of video sequences.
- the combination of encoding part and decoding part is also called codec (encoding and decoding).
- a video sequence includes a series of pictures (pictures), the pictures are further divided into slices (slices), and the slices are further divided into blocks (blocks).
- Video coding is coded in units of blocks, and in some new video coding standards, the concept of blocks is further expanded. For example, there is a macroblock (macroblock, MB) in the H.264 standard, and the macroblock can be further divided into multiple predictive blocks (partitions) that can be used for predictive coding.
- In the high-efficiency video coding (HEVC) standard, the basic concepts of coding unit (CU), prediction unit (PU) and transform unit (TU) are adopted.
- a variety of block units are divided and described using a new tree-based structure.
- a CU can be divided into smaller CUs according to a quadtree, and the smaller CUs can be further divided to form a quadtree structure.
- a CU is a basic unit for dividing and encoding a coded image.
- PU can correspond to a prediction block and is the basic unit of predictive coding.
- the CU is further divided into multiple PUs according to the division mode.
- a TU can correspond to a transform block and is a basic unit for transforming a prediction residual.
- Whether CU, PU or TU, they all belong in essence to the concept of a block (or image block).
- the embodiment of the present application involves the concept of an image block (for example, a first image block and a second image block), for details, reference may be made to the description here.
- a CTU is split into multiple CUs by using a quadtree structure represented as a coding tree.
- the decision whether to encode a region of a picture using inter-picture (temporal) or intra-picture (spatial) prediction is made at the CU level.
- Each CU can be further split into one, two or four PUs according to the PU split type.
- the same prediction process is applied within a PU and relevant information is transferred to the decoder on a PU basis.
- the CU can be partitioned into transform units (TUs) according to other quadtree structures similar to the coding tree used for the CU.
- In more recent video compression technology, quadtree plus binary tree (QTBT) partitioning is used to divide coding blocks.
- a CU can be square or rectangular in shape.
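The recursive quadtree division of a CU described above can be sketched as follows. The split rule passed as `should_split` is a hypothetical stand-in for the encoder's real decision (for example, a rate-distortion check), and only square quadtree splits are modelled, not the binary splits of QTBT.

```python
def quadtree_split(x, y, size, min_size, should_split):
    """Return the leaf blocks (x, y, size) of a quadtree rooted at a square block."""
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]
    half = size // 2
    leaves = []
    # Visit the four quadrants in raster order: top-left, top-right,
    # bottom-left, bottom-right.
    for dy in (0, half):
        for dx in (0, half):
            leaves.extend(
                quadtree_split(x + dx, y + dy, half, min_size, should_split)
            )
    return leaves


# Hypothetical rule: keep splitting only the block anchored at the top-left
# corner of a 64x64 root until blocks reach 16x16.
leaves = quadtree_split(0, 0, 64, 16, lambda x, y, size: x == 0 and y == 0)
```

The leaves tile the root block exactly, so their areas always sum to the area of the root.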
- the image block to be processed in the currently coded image can be referred to as the current block.
- In encoding, the image block to be processed refers to the block currently being encoded; in decoding, the image block to be processed refers to the block currently being decoded.
- the decoded image block used to predict the current block in the reference image is called a reference block, that is, the reference block is a block that provides a reference signal for the current block, wherein the reference signal represents a pixel value in the image block.
- a block that provides a prediction signal for the current block in the reference image may be a prediction block, where the prediction signal represents a pixel value or a sample value or a sample signal in the prediction block. For example, after traversing multiple reference blocks, the best reference block is found, and this best reference block will provide prediction for the current block, and this block may be called a prediction block.
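Traversing several reference blocks to find the best one can be sketched as below. The sum of absolute differences (SAD) is used here as the matching cost purely as an assumption; the source does not name the criterion.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2D blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def best_reference(current, candidates):
    """Return (index, cost) of the candidate block with the smallest SAD."""
    costs = [sad(current, candidate) for candidate in candidates]
    best = min(range(len(candidates)), key=costs.__getitem__)
    return best, costs[best]


# Assumed toy data: a 2x2 current block and three candidate reference blocks.
current = [[10, 12], [11, 13]]
candidates = [
    [[0, 0], [0, 0]],
    [[10, 12], [11, 14]],
    [[9, 9], [9, 9]],
]
index, cost = best_reference(current, candidates)
```

The selected candidate plays the role of the prediction block, and its remaining mismatch with the current block is what ends up in the residual.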
- In the case of lossless video coding, the original video picture can be reconstructed, i.e. the reconstructed video picture has the same quality as the original video picture (assuming no transmission loss or other data loss during storage or transmission).
- In the case of lossy video coding, further compression is performed by, for example, quantization, to reduce the amount of data required to represent a video picture, and the video picture cannot be fully reconstructed at the decoder side, i.e. the quality of the reconstructed video picture is lower or poorer than that of the original video picture.
- Several video coding standards since H.261 belong to the group of "lossy hybrid video codecs" (i.e., they combine spatial and temporal prediction in the sample domain with 2D transform coding for applying quantization in the transform domain).
- Each picture of a video sequence is usually partitioned into a non-overlapping set of blocks, usually coded at the block level.
- the encoder side usually processes, i.e. encodes, the video at the block (video block) level, for example by generating a prediction block through spatial (intra-picture) prediction and temporal (inter-picture) prediction, subtracting the prediction block from the current block (the block currently processed or to be processed) to obtain a residual block, and transforming the residual block and quantizing it in the transform domain to reduce the amount of data to be transmitted (compressed), while the decoder side applies the inverse processing relative to the encoder to the encoded or compressed block to reconstruct the current block for representation. Additionally, the encoder replicates the decoder processing loop such that the encoder and decoder generate the same predictions (e.g., intra and inter predictions) and/or reconstructions for processing, i.e. encoding, subsequent blocks.
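A heavily simplified sketch of this block-level loop is given below. It assumes a uniform scalar quantizer with step `qstep` and omits the transform stage entirely, so it illustrates only the predict, subtract, quantize and reconstruct sequence rather than a real codec.

```python
def encode_block(current, prediction, qstep):
    """Subtract the prediction and quantize the residual (no transform here)."""
    residual = [c - p for c, p in zip(current, prediction)]
    return [round(r / qstep) for r in residual]


def decode_block(levels, prediction, qstep):
    """Inverse processing: dequantize the levels and add the prediction back."""
    return [p + level * qstep for p, level in zip(prediction, levels)]


# Assumed toy sample values for one four-sample block.
prediction = [98, 100, 99, 99]
current = [101, 104, 98, 90]
qstep = 4

levels = encode_block(current, prediction, qstep)
# The encoder runs the same reconstruction as the decoder, so both sides
# predict subsequent blocks from identical data.
reconstructed = decode_block(levels, prediction, qstep)
max_error = max(abs(c - r) for c, r in zip(current, reconstructed))
```

The reconstruction differs slightly from the original block, which is the lossy part of the scheme; a smaller `qstep` reduces the error at the cost of larger levels.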
- FIG. 1A exemplarily shows a schematic block diagram of a video encoding and decoding system 10 applied in an embodiment of the present invention.
- a video encoding and decoding system 10 may include a source device 12 and a destination device 14. The source device 12 generates encoded video data, and thus the source device 12 may be referred to as a video encoding device.
- Destination device 14 may decode encoded video data generated by source device 12, and thus, destination device 14 may be referred to as a video decoding device.
- Various implementations of source device 12, destination device 14, or both may include one or more processors and memory coupled to the one or more processors.
- Source device 12 and destination device 14 may include a variety of devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, in-vehicle computers, wireless communication devices, or the like.
- Although FIG. 1A depicts source device 12 and destination device 14 as separate devices, device embodiments may also include both source device 12 and destination device 14 or the functionality of both, i.e. source device 12 or corresponding functionality and destination device 14 or corresponding functionality.
- In such embodiments, source device 12 or corresponding functionality and destination device 14 or corresponding functionality may be implemented using the same hardware and/or software, or using separate hardware and/or software, or any combination thereof.
- a communicative connection may be made between source device 12 and destination device 14 via link 13 via which destination device 14 may receive encoded video data from source device 12 .
- Link 13 may include one or more media or devices capable of moving encoded video data from source device 12 to destination device 14 .
- link 13 may include one or more communication media that enable source device 12 to transmit encoded video data directly to destination device 14 in real time.
- source device 12 may modulate the encoded video data according to a communication standard, such as a wireless communication protocol, and may transmit the modulated video data to destination device 14 .
- the one or more communication media may include wireless and/or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
- the one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (eg, the Internet).
- the one or more communication media may include routers, switches, base stations, or other devices that facilitate communication from source device 12 to destination device 14 .
- the source device 12 includes an encoder 20 , and optionally, the source device 12 may also include a picture source 16 , a picture preprocessor 18 , and a communication interface 22 .
- the encoder 20 , picture source 16 , picture preprocessor 18 , and communication interface 22 may be hardware components in the source device 12 or software programs in the source device 12 . They are described as follows:
- Picture source 16 may comprise or be any kind of picture capture device, for example for capturing real-world pictures, and/or any kind of picture or comment generating device (for screen content coding, some text on the screen is also considered part of the picture or image to be encoded), such as a computer graphics processor for generating computer-animated pictures, or any kind of device for acquiring and/or providing real-world pictures or computer-animated pictures (for example, screen content or virtual reality (VR) pictures), and/or any combination thereof (e.g., augmented reality (AR) pictures).
- the picture source 16 may be a camera for capturing pictures or a memory for storing pictures, and the picture source 16 may also include any kind of interface (internal or external) that stores previously captured or generated pictures and/or acquires or receives pictures.
- When the picture source 16 is a camera, the picture source 16 can be, for example, a local camera or a camera integrated in the source device; when the picture source 16 is a memory, the picture source 16 can be, for example, a local memory or a memory integrated in the source device.
- the interface can be, for example, an external interface that receives pictures from an external video source, where the external video source is, for example, an external picture capture device such as a camera, an external memory, or an external picture generation device such as an external computer graphics processor, computer or server.
- the interface can be any kind of interface according to any proprietary or standardized interface protocol, eg wired or wireless interface, optical interface.
- the picture can be regarded as a two-dimensional array or matrix of pixel points (picture element).
- the pixel points in the array can also be referred to as sampling points.
- the number of samples in the array or picture in the horizontal and vertical directions (or axes) defines the size and/or resolution of the picture.
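- To represent color, three color components are usually used, that is, a picture can be represented as or contain three sample arrays.
- In RGB format or color space, a picture includes corresponding arrays of red, green and blue samples.
- In video coding, however, each pixel is usually expressed in a luma/chroma format or color space, such as YUV, which comprises a luma component indicated by Y and two chroma components indicated by U and V.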
- the luminance (luma) component Y represents luminance or grayscale intensity (eg, both are the same in a grayscale picture), while the two chroma (chroma) components U and V represent chrominance or color information components.
- a picture in YUV format includes a luma sample array of luma sample values (Y), and two chroma sample arrays of chroma values (U and V).
- a picture in RGB format can be converted or transformed to YUV format and vice versa; this process is also called color conversion or color transformation. If the picture is black and white, the picture may only include an array of luma samples.
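A per-pixel color conversion can be sketched as follows. The coefficients are the commonly used BT.601 luma weights with the classic analog U/V scaling; the source does not say which conversion matrix applies, so they are an assumption of this example.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV (assumed BT.601 weights)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


def yuv_to_rgb(y, u, v):
    """Inverse of rgb_to_yuv under the same assumed coefficients."""
    r = y + v / 0.877
    b = y + u / 0.492
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


# A gray pixel carries no color information, so both chroma values vanish.
gray = rgb_to_yuv(128, 128, 128)
# Converting there and back should return the original pixel up to rounding.
round_trip = yuv_to_rgb(*rgb_to_yuv(200, 50, 25))
```

For a grayscale pixel only the luma value is non-zero, which matches the remark that a black and white picture may contain only a luma sample array.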
- the pictures transmitted from the picture source 16 to the picture processor may also be referred to as original picture data 17 .
- the picture preprocessor 18 is configured to receive the original picture data 17 and perform preprocessing on the original picture data 17 to obtain a preprocessed picture 19 or preprocessed picture data 19 .
- preprocessing performed by picture preprocessor 18 may include retouching, color format conversion (eg, from RGB format to YUV format), color grading, or denoising.
- Encoder 20 is used to receive the preprocessed picture data 19 and process the preprocessed picture data 19 using relevant prediction modes, thereby providing encoded picture data 21 (the structural details of the encoder 20 are described further below with reference to FIG. 2, FIG. 4 or FIG. 5A).
- the encoder 20 can be used to implement various embodiments described later, so as to realize the video encoding method described in the present invention.
- a communication interface 22 operable to receive encoded picture data 21 and transmit the encoded picture data 21 via link 13 to destination device 14 or any other device (such as a memory) for storage or direct reconstruction; the other device can be any device for decoding or storage.
- the communication interface 22 may be used, for example, to package the encoded picture data 21 into a suitable format, such as a data packet, for transmission over the link 13 .
- the destination device 14 includes a decoder 30 , and optionally, the destination device 14 may also include a communication interface 28 , a picture post-processor 32 and a display device 34 . They are described as follows:
- a communication interface 28 may be used to receive encoded picture data 21 from the source device 12 or any other source, such as a storage device, for example an encoded picture data storage device. Communication interface 28 may be used to transmit or receive encoded picture data 21 over link 13 between source device 12 and destination device 14 or over any kind of network, where link 13 is, for example, a direct wired or wireless connection, and the network is, for example, a wired or wireless network or any combination thereof, or any kind of private or public network, or any combination thereof. The communication interface 28 may be used, for example, to decapsulate the data packets transmitted by the communication interface 22 to obtain the encoded picture data 21.
- Both communication interface 28 and communication interface 22 may be configured as a one-way communication interface or a two-way communication interface, and may be used, for example, to send and receive messages to establish a connection, and to acknowledge and exchange any other information related to the communication link and/or the data transmission, such as the transmission of encoded picture data.
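Packaging encoded picture data into data packets on the sending side and decapsulating them on the receiving side can be sketched as below. The six-byte header (sequence number and payload length) is a hypothetical format chosen for the example; the source does not define a packet layout.

```python
import struct

# Hypothetical header: sequence number (uint32) and payload length (uint16),
# both in network byte order.
HEADER = struct.Struct("!IH")


def packetize(data: bytes, max_payload: int = 8):
    """Split encoded data into packets, each prefixed with the header."""
    packets = []
    for seq, start in enumerate(range(0, len(data), max_payload)):
        payload = data[start:start + max_payload]
        packets.append(HEADER.pack(seq, len(payload)) + payload)
    return packets


def depacketize(packets):
    """Strip the headers and reassemble the payloads in sequence order."""
    parts = {}
    for packet in packets:
        seq, length = HEADER.unpack_from(packet)
        parts[seq] = packet[HEADER.size:HEADER.size + length]
    return b"".join(parts[seq] for seq in sorted(parts))


encoded_picture_data = b"encoded picture data 21"
packets = packetize(encoded_picture_data)
# Packets may arrive out of order; the sequence number restores the order.
received = depacketize(list(reversed(packets)))
```

A real transport would also deal with lost packets and acknowledgements, which this sketch leaves out.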
- Decoder 30 is used to receive encoded picture data 21 and provide decoded picture data 31 or decoded picture 31 (the structural details of the decoder 30 are described further below with reference to FIG. 3, FIG. 4 or FIG. 5A).
- the decoder 30 can be used to implement various embodiments described later, so as to realize the video decoding method described in the present invention.
- the picture post-processor 32 is configured to perform post-processing on the decoded picture data 31 (also referred to as reconstructed picture data) to obtain post-processed picture data 33 .
- the post-processing performed by the picture post-processor 32 may include color format conversion (for example, from YUV format to RGB format), color correction, retouching or resampling, or any other processing, and the picture post-processor 32 may also be used to transmit the post-processed picture data 33 to the display device 34.
- the display device 34 may be or include any kind of display for presenting the reconstructed picture, eg, an integrated or external display or monitor.
- the display may include a liquid crystal display (LCD), an organic light emitting diode (OLED) display, a plasma display, a projector, a micro LED display, a liquid crystal on silicon (LCoS) display, a digital light processor (DLP) or any other kind of display.
- Although FIG. 1A depicts source device 12 and destination device 14 as separate devices, device embodiments may also include both source device 12 and destination device 14 or the functionality of both, i.e., source device 12 or corresponding functionality and destination device 14 or corresponding functionality.
- In such embodiments, source device 12 or corresponding functionality and destination device 14 or corresponding functionality may be implemented using the same hardware and/or software, or using separate hardware and/or software, or any combination thereof.
- Source device 12 and destination device 14 may comprise any of a variety of devices, including any kind of handheld or stationary device, such as notebook or laptop computers, mobile phones, smartphones, tablets or tablet computers, video cameras, desktop computers, set-top boxes, televisions, cameras, in-vehicle devices, display devices, digital media players, video game consoles, video streaming devices (such as content service servers or content distribution servers), broadcast receiver devices, broadcast transmitter devices, etc., and may use no operating system or any kind of operating system.
- Both encoder 20 and decoder 30 may be implemented as any of a variety of suitable circuits, for example, one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof.
- When the techniques are implemented partly in software, a device may store instructions for the software in a suitable non-transitory computer-readable storage medium and may execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) may be considered as one or more processors.
- the video encoding and decoding system 10 shown in FIG. 1A is only an example; in other examples, data may be retrieved from local storage, streamed over a network, etc.
- a video encoding device may encode and store data to memory, and/or a video decoding device may retrieve and decode data from memory.
- encoding and decoding are performed by devices that do not communicate with each other but merely encode data to memory and/or retrieve data from memory and decode data.
- the encoder 20 can be deployed on the terminal device or a server on the cloud side
- the decoder 30 can be deployed on the terminal device or a server on the cloud side
- the encoder 20 and decoder 30 can also be deployed together on the terminal device or on the server on the cloud side.
- the encoder 20 and the decoder 30 can be deployed on the terminal device; the encoder 20 can encode and compress the video on the terminal device, perform secondary compression (also called incremental storage compression) based on the video encoding method provided in the embodiment of this application, and store the compressed data; when the video needs to be played, the stored compressed data can be decoded.
- the encoder 20 and the decoder 30 can be deployed on multiple terminal devices; the encoder 20 can encode and compress the video on one terminal device, perform secondary compression (also called incremental storage compression) based on the video encoding method provided in the embodiment of this application, and transmit the compressed data to other terminal devices, and the other terminal devices can decode the received compressed data when they need to play the video.
- the encoder 20 and the decoder 30 can be deployed on the terminal device and the server on the cloud side; the encoder 20 can encode and compress the video on the terminal device, perform secondary compression (also called incremental storage compression) based on the video encoding method provided in the embodiment of this application, and transmit the compressed data to the server.
- FIG. 1B is an illustrative diagram of an example of a video coding system 40 including encoder 20 of FIG. 2 and/or decoder 30 of FIG. 3 , according to an exemplary embodiment.
- the video decoding system 40 can implement a combination of various techniques in the embodiments of the present invention.
- video coding system 40 may include imaging device 41, encoder 20, decoder 30 (and/or a video encoder/decoder implemented by logic circuit 47 of processing unit 46), antenna 42, one or more processors 43, one or more memories 44 and/or a display device 45.
- imaging device 41 , antenna 42 , processing unit 46 , logic circuit 47 , encoder 20 , decoder 30 , processor 43 , memory 44 and/or display device 45 can communicate with each other.
- Although video coding system 40 is illustrated with both encoder 20 and decoder 30, video coding system 40 may include only encoder 20 or only decoder 30 in different examples.
- antenna 42 may be used to transmit or receive the encoded data of the video data.
- display device 45 may be used to present video data.
- logic circuitry 47 may be implemented by processing unit 46 .
- the processing unit 46 may include application-specific integrated circuit (application-specific integrated circuit, ASIC) logic, a graphics processor, a general-purpose processor, and the like.
- the video coding system 40 may also include an optional processor 43, and the optional processor 43 may similarly include application-specific integrated circuit (ASIC) logic, a graphics processor, a general-purpose processor, and the like.
- the logic circuit 47 may be implemented by hardware, such as dedicated hardware for video encoding, etc., and the processor 43 may be implemented by general-purpose software, an operating system, and the like.
- the memory 44 can be any type of memory, such as a volatile memory (for example, static random access memory (SRAM), dynamic random access memory (DRAM), etc.) or a non-volatile memory (for example, flash memory, etc.).
- memory 44 may be implemented by cache memory.
- logic circuitry 47 may access memory 44 (eg, to implement an image buffer).
- logic circuitry 47 and/or processing unit 46 may include memory (eg, cache, etc.) for implementing an image buffer or the like.
- encoder 20 implemented by logic circuitry may include an image buffer (eg, implemented by processing unit 46 or memory 44 ) and a graphics processing unit (eg, implemented by processing unit 46 ).
- a graphics processing unit may be communicatively coupled to the image buffer.
- Graphics processing unit may contain encoder 20 implemented by logic circuitry 47 to implement the various modules discussed with reference to FIG. 2 and/or any other encoder system or subsystem described herein.
- Logic circuits may be used to perform the various operations discussed herein.
- decoder 30 may be implemented by logic circuitry 47 in a similar manner to implement the various modules discussed with reference to decoder 30 of FIG. 3 and/or any other decoder system or subsystem described herein.
- decoder 30 implemented by logic circuitry may include an image buffer (e.g., implemented by processing unit 46 or memory 44) and a graphics processing unit (e.g., implemented by processing unit 46).
- a graphics processing unit may be communicatively coupled to the image buffer.
- the graphics processing unit may contain decoder 30 implemented by logic circuitry 47 to implement the various modules discussed with reference to FIG. 3 and/or any other decoder system or subsystem described herein.
- antenna 42 may be used to receive encoded video data.
- the encoded data may include data related to encoded video frames, indicators, index values, mode selection data, etc., as discussed herein, such as data related to coding partitions (e.g., transform coefficients or quantized transform coefficients, optional indicators (as discussed), and/or data defining the coding partitions).
- Video coding system 40 may also include decoder 30, which is coupled to antenna 42 and used to decode the encoded data.
- a display device 45 is used to present video frames.
- the decoder 30 may be used to perform a reverse process.
- the decoder 30 may be configured to receive and parse such syntax elements and decode the associated video data accordingly.
- encoder 20 may entropy encode the syntax elements into the encoded video data. In such instances, decoder 30 may parse such syntax elements and decode the related video data accordingly.
- Fig. 2 shows a schematic/conceptual block diagram of an example of an encoder 20 for implementing an embodiment of the present invention.
- the encoder 20 includes a residual calculation unit 204, a transform processing unit 206, a quantization unit 208, an inverse quantization unit 210, an inverse transform processing unit 212, a reconstruction unit 214, a buffer 216, a loop filter unit 220 , a decoded picture buffer (decoded picture buffer, DPB) 230 , a prediction processing unit 260 and an entropy encoding unit 270 .
- Prediction processing unit 260 may include inter prediction unit 244 , intra prediction unit 254 , and mode selection unit 262 .
- Inter prediction unit 244 may include a motion estimation unit and a motion compensation unit (not shown).
- the encoder 20 shown in FIG. 2 may also be called a hybrid video encoder or a video encoder according to a hybrid video codec.
- the residual calculation unit 204, the transform processing unit 206, the quantization unit 208, the prediction processing unit 260, and the entropy encoding unit 270 form the forward signal path of the encoder 20, while, for example, the inverse quantization unit 210, the inverse transform processing unit 212, the reconstruction unit 214, the buffer 216, the loop filter 220, the decoded picture buffer (DPB) 230, and the prediction processing unit 260 form the backward signal path of the encoder, wherein the backward signal path of the encoder corresponds to the signal path of the decoder (see decoder 30 in FIG. 3).
- the encoder 20 receives, eg via an input 202, a picture 201 or an image block 203 of a picture 201, eg a picture in a sequence of pictures forming a video or a video sequence.
- the image block 203 can also be called the current coding block or the image block to be processed
- the picture 201 can be called the current picture or the picture to be coded (especially in video coding, when the current picture is distinguished from other pictures, for example previously encoded and/or decoded pictures of the same video sequence, that is, the video sequence that also includes the current picture).
- Embodiments of the encoder 20 may include a partitioning unit (not shown in FIG. 2 ) for partitioning the picture 201 into a plurality of blocks such as the image block 203 , usually into a plurality of non-overlapping blocks.
- the partitioning unit can be used to apply the same block size to all pictures of a video sequence, together with a corresponding grid defining the block size, or to change the block size between pictures, subsets of pictures, or groups of pictures, and to partition each picture into the corresponding blocks.
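The partitioning of a picture into non-overlapping blocks on a fixed grid can be sketched as follows. This is a minimal illustration only: the function name `split_into_blocks`, the list-of-rows picture representation, and the handling of edge blocks (simply left smaller) are assumptions, not details given in this application.

```python
def split_into_blocks(picture, block_size):
    """Yield (top, left, block) for non-overlapping tiles of a picture.

    `picture` is a list of rows of sample values. Tiles at the right and
    bottom edges are smaller when the picture size is not a multiple of
    `block_size` (an assumption made for this sketch).
    """
    height, width = len(picture), len(picture[0])
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            block = [row[left:left + block_size]
                     for row in picture[top:top + block_size]]
            yield top, left, block


# A 4x6 picture split on a 4x4 grid gives one full block and one edge block.
picture = [[x + 10 * y for x in range(6)] for y in range(4)]
blocks = list(split_into_blocks(picture, 4))
```

Every sample of the picture lands in exactly one block, which is what "non-overlapping" means here.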
- prediction processing unit 260 of encoder 20 may be configured to perform any combination of the partitioning techniques described above.
- the image block 203 is also or can be regarded as a two-dimensional array or matrix of sampling points with sample values, although its size is smaller than that of the picture 201 .
- the image block 203 may include, for example, one sample array (e.g., a luma array in the case of a black and white picture 201), three sample arrays (e.g., a luma array and two chrominance arrays in the case of a color picture), or any other number and/or category of arrays depending on the color format being applied.
- the number of sampling points in the horizontal and vertical directions (or axes) of the image block 203 defines the size of the image block 203 .
- the encoder 20 shown in FIG. 2 is used to encode a picture 201 block by block, eg, perform encoding and prediction on each image block 203 .
- the residual calculation unit 204 is configured to calculate the residual block 205 based on the picture image block 203 and the prediction block 265 (further details of the prediction block 265 are provided below), for example, by subtracting the sample values of the prediction block 265 from the sample values of the picture image block 203 on a sample-by-sample (pixel-by-pixel) basis to obtain the residual block 205 in the sample domain.
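The sample-by-sample subtraction performed by the residual calculation unit can be sketched as below. The function name and the list-of-rows block representation are illustrative assumptions.

```python
def residual_block(current, prediction):
    """Subtract the prediction from the current block, sample by sample."""
    return [[c - p for c, p in zip(current_row, prediction_row)]
            for current_row, prediction_row in zip(current, prediction)]


current = [[52, 55], [61, 59]]
prediction = [[50, 50], [60, 60]]
residual = residual_block(current, prediction)
```

A good prediction leaves a residual whose values are small and often zero, which is what makes the later transform and quantization stages effective.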
- the transform processing unit 206 is configured to apply a transform such as a discrete cosine transform (discrete cosine transform, DCT) or a discrete sine transform (discrete sine transform, DST) on the sample values of the residual block 205 to obtain transform coefficients 207 in the transform domain .
- the transform coefficients 207 may also be referred to as transform residual coefficients and represent the residual block 205 in the transform domain.
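As a rough sketch of the transform step, the following applies a floating-point orthonormal 2-D DCT-II to a square residual block and inverts it. Real codecs such as HEVC/H.265 use scaled integer approximations instead; the helper names here are assumptions chosen for illustration.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(r) for r in zip(*m)]


def forward_dct(block):
    """Transform a square block to the transform domain: C * B * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def inverse_dct(coeffs):
    """Return to the sample domain: C^T * X * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)
```

For a flat block, all of the energy ends up in the single DC coefficient, which illustrates why smooth residuals compress well after the transform.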
- the transform processing unit 206 may be configured to apply an integer approximation of DCT/DST, such as the transform specified for HEVC/H.265. Such integer approximations are usually scaled by some factor compared to the orthogonal DCT transform. In order to maintain the norm of the forward and inverse transformed residual blocks, an additional scaling factor is applied as part of the transformation process.
- the scaling factor is usually chosen based on certain constraints, for example, the scaling factor is a power of 2 for the shift operation, the bit depth of the transform coefficients, the trade-off between accuracy and implementation cost, etc.
- a specific scaling factor may be specified for the inverse transform, e.g. by the inverse transform processing unit 312 at the decoder 30 side (and for the corresponding inverse transform by the inverse transform processing unit 212 at the encoder 20 side), and correspondingly, a corresponding scaling factor may be specified for the forward transform by the transform processing unit 206 at the encoder 20 side.
- a quantization unit 208 is configured to quantize the transform coefficients 207 , eg by applying scalar quantization or vector quantization, to obtain quantized transform coefficients 209 .
- Quantized transform coefficients 209 may also be referred to as quantized residual coefficients 209 .
- the quantization process may reduce the bit depth associated with some or all of the transform coefficients 207 .
- n-bit transform coefficients may be rounded down to m-bit transform coefficients during quantization, where n is greater than m.
- the degree of quantization can be modified by adjusting a quantization parameter (quantization parameter, QP). For example, with scalar quantization, different scales can be applied to achieve finer or coarser quantization.
- a suitable quantization step size can be indicated by a quantization parameter (QP).
- a quantization parameter may be an index to a predefined set of suitable quantization step sizes.
- a smaller quantization parameter may correspond to fine quantization (smaller quantization step size)
- a larger quantization parameter may correspond to coarse quantization (larger quantization step size)
- Quantization may involve division by a quantization step size, and the corresponding dequantization, e.g. performed by the inverse quantization unit 210, may involve multiplication by the quantization step size.
- Embodiments according to some standards such as HEVC may use quantization parameters to determine the quantization step size.
- the quantization step size can be calculated based on the quantization parameter using a fixed-point approximation of an equation involving division.
- An additional scaling factor can be introduced for quantization and dequantization to recover the norm of the residual block that might have been modified by the scale used in the fixed-point approximation of the equations for the quantization step size and quantization parameter.
- the scaling of the inverse transform and the scaling of the inverse quantization may be combined.
- a custom quantization table can be used and signaled from the encoder to the decoder in e.g. encoded data.
- Quantization is a lossy operation, where the larger the quantization step size, the greater the loss.
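The relationship between the quantization parameter, the step size, and the resulting loss can be illustrated with a small scalar quantizer. The mapping `2 ** ((qp - 4) / 6)` is the commonly cited approximate relationship for H.264/HEVC-style codecs (the step size doubles every 6 QP); it is used here as an assumption, since actual encoders use fixed-point tables and shifts rather than this floating-point form.

```python
def quantization_step(qp):
    """Approximate step size: doubles for every increase of 6 in QP."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeffs, qp):
    """Divide each transform coefficient by the step size and round."""
    step = quantization_step(qp)
    return [int(round(c / step)) for c in coeffs]


def dequantize(levels, qp):
    """Multiply each quantized level by the same step size."""
    step = quantization_step(qp)
    return [level * step for level in levels]


coeffs = [101.0, -37.0, 13.0, 3.0]
levels = quantize(coeffs, 22)            # step size 8
restored = dequantize(levels, 22)
errors = [abs(c - r) for c, r in zip(coeffs, restored)]
```

With QP 22 the step size is 8, so each restored coefficient can differ from the original by up to half a step; with QP 4 the step size is 1 and these integer coefficients are restored exactly.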
- the inverse quantization unit 210 is configured to apply the inverse quantization of the quantization unit 208 to the quantized coefficients to obtain the dequantized coefficients 211, e.g., by applying the inverse of the quantization scheme applied by the quantization unit 208, based on or using the same quantization step size as the quantization unit 208.
- the dequantized coefficients 211 may also be referred to as dequantized residual coefficients 211 and correspond to the transform coefficients 207, although, because of the loss caused by quantization, they are generally not identical to the transform coefficients.
- the inverse transform processing unit 212 is configured to apply the inverse of the transform applied by the transform processing unit 206, for example, an inverse discrete cosine transform (DCT) or an inverse discrete sine transform (DST), to obtain the inverse transform block 213 in the sample domain.
- the inverse transform block 213 may also be referred to as the inverse transform dequantized block 213 or the inverse transform residual block 213 .
- the reconstruction unit 214 (e.g. summer 214) is used to add the inverse transform block 213 (i.e. the reconstructed residual block 213) to the prediction block 265 to obtain the reconstructed block 215 in the sample domain, e.g. by adding the sample values of the reconstructed residual block 213 to the sample values of the prediction block 265.
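The addition performed by the reconstruction unit can be sketched as follows. Clipping the sum to the valid sample range is a common practice that is assumed here; it is not stated in the text above, and the function name and `bit_depth` parameter are illustrative.

```python
def reconstruct_block(prediction, residual, bit_depth=8):
    """Add the reconstructed residual to the prediction, sample by sample.

    The result is clipped to [0, 2**bit_depth - 1] (an assumption of this
    sketch) so that it stays a valid sample value.
    """
    max_value = (1 << bit_depth) - 1
    return [[min(max(p + r, 0), max_value) for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, residual)]


reconstructed = reconstruct_block([[50, 250], [3, 60]], [[2, 10], [-5, -1]])
```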
- a buffer unit 216 (or simply "buffer" 216), such as a line buffer 216, is used to buffer or store the reconstructed block 215 and the corresponding sample values, e.g. for intra prediction.
- the encoder may be configured to use the unfiltered reconstructed blocks and/or the corresponding sample values stored in the buffer unit 216 for any kind of estimation and/or prediction, such as intra prediction.
- an embodiment of encoder 20 may be configured such that buffer unit 216 is used not only for storing reconstructed blocks 215 for intra prediction 254, but also for the loop filter unit 220 (not shown in FIG. 2), and/or such that buffer unit 216 and decoded picture buffer unit 230 form one buffer.
- Other embodiments may be used to use filtered blocks 221 and/or blocks or samples from decoded picture buffer 230 (neither shown in FIG. 2 ) as input or basis for intra prediction 254 .
- a loop filter unit 220 (or "loop filter" 220 for short) is used to filter the reconstructed block 215 to obtain a filtered block 221, in order to smooth pixel transitions or improve video quality.
- the loop filter unit 220 is intended to represent one or more loop filters, such as a deblocking filter, a sample-adaptive offset (SAO) filter, or other filters, such as a bilateral filter, an adaptive loop filter (ALF), a sharpening or smoothing filter, or a collaborative filter.
- although loop filter unit 220 is shown in FIG. 2 as an in-loop filter, in other configurations loop filter unit 220 may be implemented as a post-loop filter.
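To give a feel for what a deblocking-style loop filter does, the toy sketch below smooths the two samples on either side of a vertical block edge when the step across the edge is small. This is not the deblocking filter of any standard; the function name, the threshold, and the quarter-step correction are all assumptions made for illustration.

```python
def deblock_row(row, edge, threshold=8):
    """Soften a small step across a block edge in one row of samples.

    `edge` is the index of the first sample of the right-hand block.
    A small step is treated as a coding artifact and reduced; a large
    step is treated as real image content and left untouched.
    """
    out = list(row)
    p0, q0 = row[edge - 1], row[edge]
    if abs(p0 - q0) < threshold:
        delta = int((q0 - p0) / 4)   # truncates toward zero
        out[edge - 1] = p0 + delta
        out[edge] = q0 - delta
    return out


smoothed = deblock_row([100, 100, 104, 104], edge=2)
kept = deblock_row([100, 100, 200, 200], edge=2)
```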
- the filtered block 221 may also be referred to as a filtered reconstructed block 221 .
- the decoded picture buffer 230 may store the reconstructed encoded block after the loop filter unit 220 performs a filtering operation on the reconstructed encoded block.
- Embodiments of the encoder 20 may be used to output loop filter parameters (e.g., SAO information), for example, directly or after entropy encoding by the entropy encoding unit 270 or any other entropy encoding unit, so that, for example, the decoder 30 can receive and apply the same loop filter parameters for decoding.
- a decoded picture buffer (DPB) 230 may be a reference picture memory storing reference picture data for the encoder 20 to encode video data.
- the DPB 230 can be formed by any one of various memory devices, such as dynamic random access memory (DRAM) (including synchronous DRAM (synchronous DRAM, SDRAM), magnetoresistive RAM (magnetoresistive RAM, MRAM), resistive RAM ( resistive RAM, RRAM)) or other types of memory devices.
- DPB 230 and buffer 216 may be provided by the same memory device or separate memory devices.
- a decoded picture buffer (DPB) 230 is used to store the filtered block 221 .
- the decoded picture buffer 230 may further be used to store other previously filtered blocks, such as previously reconstructed and filtered blocks 221, of the same current picture or of a different picture such as a previously reconstructed picture, and may provide complete previously reconstructed, i.e. decoded, pictures (and corresponding reference blocks and samples) and/or a partially reconstructed current picture (and corresponding reference blocks and samples), e.g. for inter prediction.
- a decoded picture buffer (DPB) 230 is used to store the reconstructed block 215 if the reconstructed block 215 is reconstructed without in-loop filtering.
- Prediction processing unit 260, also referred to as block prediction processing unit 260, is adapted to receive or acquire the image block 203 (the current image block 203 of the current picture 201) and reconstructed picture data, e.g. reference samples of the same (current) picture from buffer 216 and/or reference picture data 231 of one or more previously decoded pictures from the decoded picture buffer 230, and to process such data for prediction, i.e. to provide a prediction block 265, which may be an inter prediction block 245 or an intra prediction block 255.
- the mode selection unit 262 may be used to select a prediction mode (such as an intra or inter prediction mode) and/or the corresponding prediction block 245 or 255 used as the prediction block 265 for computing the residual block 205 and reconstructing the reconstructed block 215.
- Embodiments of the mode selection unit 262 may be used to select a prediction mode (e.g., from those supported by the prediction processing unit 260) that provides the best match or the smallest residual (minimum residual means better compression in transmission or storage), or provide minimal signaling overhead (minimum signaling overhead means better compression in transmission or storage), or consider or balance both of the above.
- the mode selection unit 262 can be used to determine the prediction mode based on rate distortion optimization (RDO), that is, to select the prediction mode that provides the minimum rate distortion cost, or to select a prediction mode whose associated rate distortion at least satisfies the prediction mode selection criterion.
- the encoder 20 is used to determine or select the best or optimal prediction mode from a (predetermined) set of prediction modes.
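Rate distortion optimization is usually formulated as minimizing a Lagrangian cost J = D + λ·R over the candidate modes. The sketch below assumes that formulation; the mode names, distortion values, and bit counts are made-up illustrative numbers, not figures from this application.

```python
def select_mode(candidates, lagrange_multiplier):
    """Pick the candidate with the lowest cost J = D + lambda * R.

    Each candidate is a tuple (name, distortion, rate_in_bits).
    """
    return min(candidates, key=lambda c: c[1] + lagrange_multiplier * c[2])


candidates = [
    ("intra_dc", 400.0, 20),
    ("inter_merge", 450.0, 6),
    ("inter_amvp", 300.0, 40),
]
best_at_low_rate = select_mode(candidates, lagrange_multiplier=10.0)
best_at_high_quality = select_mode(candidates, lagrange_multiplier=1.0)
```

A large multiplier penalizes signaling overhead and favors the cheap merge candidate, while a small multiplier favors the candidate with the smallest distortion, which mirrors the trade-off between minimal residual and minimal signaling overhead described above.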
- the set of prediction modes may include, for example, intra prediction modes and/or inter prediction modes.
- the set of intra prediction modes may include a variety of different intra prediction modes, for example, non-directional modes such as DC (or mean) mode and planar mode, or directional modes as defined in H.265, or may include 67 different intra prediction modes, for example, non-directional modes such as DC (or mean) mode and planar mode, or directional modes as defined in H.266 (under development).
- the set of inter prediction modes depends on the available reference pictures (i.e., for example, the aforementioned at least partially decoded pictures stored in DPB 230) and other inter prediction parameters, for example on whether the entire reference picture or only a part of the reference picture, e.g. a search window area surrounding the area of the current block, is used to search for the best matching reference block, and/or on whether pixel interpolation such as half-pixel and/or quarter-pixel interpolation is applied.
- the inter prediction mode set may include, for example, an Advanced Motion Vector Prediction (AMVP) mode and a merge mode.
- the inter prediction mode set may include the prediction modes based on the affine motion model described in the embodiments of the present invention, such as the advanced motion vector prediction mode based on the affine motion model (Affine AMVP mode) or the merge mode based on the affine motion model (Affine Merge mode); specifically, the control-point-based AMVP mode (inherited control point motion vector prediction method or constructed control point motion vector prediction method) and the control-point-based merge mode (inherited control point motion vector prediction method or constructed control point motion vector prediction method); as well as the advanced temporal motion vector prediction (ATMVP) method, the PLANAR method, and the like; or the sub-block based merging mode formed by combining the above affine-motion-model-based merge mode with the ATMVP and/or PLANAR methods, and the like.
- inter prediction of the image block to be processed can be applied to unidirectional prediction (forward or backward), bidirectional prediction (forward and backward), or multi-frame prediction; when applied to bidirectional prediction, block-level generalized bi-prediction (GBi) or a weighted prediction method can be adopted.
- the intra-frame prediction unit 254 can be used to perform any combination of the inter-frame prediction techniques described below .
- the embodiment of the present invention may also apply skip mode and/or direct mode.
- the prediction processing unit 260 may further be used to divide the image block 203 into smaller block partitions or sub-blocks, for example, by iteratively using quad-tree (quad-tree, QT) segmentation, binary-tree (binary-tree, BT) segmentation or ternary-tree (triple-tree, TT) partitioning, or any combination thereof, and for performing prediction, for example, for each of the block partitions or sub-blocks, wherein mode selection includes selection of the tree structure of the partitioned image block 203 and selection of the application prediction mode for each of the block partitions or sub-blocks.
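Iterative quad-tree partitioning can be sketched as a recursive split of a square block into four quadrants. In this sketch the split decision is based on the range of sample values in the block, which is a stand-in assumption; an actual encoder would normally decide by comparing the cost of splitting against the cost of not splitting. The function name and parameters are illustrative.

```python
def quadtree_split(block, top=0, left=0, min_size=2, threshold=100):
    """Return leaf partitions as (top, left, size) tuples.

    A square block (side a power of two) is split into four quadrants
    while its sample range exceeds `threshold` and it is larger than
    `min_size`.
    """
    size = len(block)
    samples = [v for row in block for v in row]
    if size <= min_size or max(samples) - min(samples) <= threshold:
        return [(top, left, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            sub = [row[dx:dx + half] for row in block[dy:dy + half]]
            leaves += quadtree_split(sub, top + dy, left + dx,
                                     min_size, threshold)
    return leaves


# An 8x8 block that is flat except for one bright sample in the corner.
block = [[0] * 8 for _ in range(8)]
block[0][0] = 200
leaves = quadtree_split(block)
```

Only the quadrant containing the detail is split further, so smooth areas are covered by a few large partitions and detailed areas by many small ones.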
- the inter prediction unit 244 may include a motion estimation (motion estimation, ME) unit (not shown in FIG. 2 ) and a motion compensation (motion compensation, MC) unit (not shown in FIG. 2 ).
- the motion estimation unit is adapted to receive or acquire the picture image block 203 (the current picture image block 203 of the current picture 201) and a decoded picture 231, or at least one or more previously reconstructed blocks, e.g. reconstructed blocks of one or more other/different previously decoded pictures 231, in order to perform motion estimation based on the determined inter prediction mode.
- the video sequence may comprise the current picture and the previously decoded picture 231, or in other words, the current picture and the previously decoded picture 231 may be part of, or form, the sequence of pictures forming the video sequence.
- the encoder 20 can be used to select a reference block from multiple reference blocks of the same picture or of different pictures among multiple other pictures (reference pictures), and to provide the reference picture and/or the offset (spatial offset) between the position of the reference block (X, Y coordinates) and the position of the current block as an inter prediction parameter.
- This offset is also called a motion vector (MV).
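The search for a reference block and the resulting motion vector can be illustrated with an exhaustive block-matching search inside a search window. The sum of absolute differences (SAD) matching criterion, the function names, and the integer-pixel search are assumptions of this sketch; the text above does not prescribe a particular search strategy.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def get_block(picture, top, left, size):
    return [row[left:left + size] for row in picture[top:top + size]]


def motion_search(current, reference, top, left, size, search_range):
    """Return ((dx, dy), cost) of the best match within +/- search_range."""
    target = get_block(current, top, left, size)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block would fall outside the picture
            cost = sad(target, get_block(reference, y, x, size))
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best


# A 2x2 object sits at (row 2, col 3) in the reference picture and at
# (row 4, col 4) in the current picture.
reference = [[0] * 8 for _ in range(8)]
current = [[0] * 8 for _ in range(8)]
for r, row in enumerate([[90, 80], [70, 60]]):
    for c, value in enumerate(row):
        reference[2 + r][3 + c] = value
        current[4 + r][4 + c] = value

motion_vector, cost = motion_search(current, reference, top=4, left=4,
                                    size=2, search_range=2)
```

The returned offset points from the position of the current block to the position of the best matching reference block, which is the spatial offset referred to above as the motion vector.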
- the motion compensation unit is used to obtain inter prediction parameters, and perform inter prediction based on or using the inter prediction parameters to obtain inter prediction blocks 245 .
- Motion compensation performed by a motion compensation unit may involve fetching or generating a predictive block (predictor) based on motion/block vectors determined by motion estimation (possibly performing interpolation to sub-pixel precision). Interpolation filtering can generate additional pixel samples from known pixel samples, potentially increasing the number of candidate predictive blocks that can be used to encode a picture block.
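Interpolation to sub-pixel precision can be sketched in one dimension with a simple bilinear half-sample filter. This is only an illustration: video coding standards typically use longer interpolation filters, and the function name and rounding rule below are assumptions.

```python
def interpolate_half_pixel_row(row):
    """Insert a rounded-average half-sample between neighbouring samples."""
    out = []
    for a, b in zip(row, row[1:]):
        out.append(a)                  # integer-position sample
        out.append((a + b + 1) // 2)   # half-position sample
    out.append(row[-1])
    return out


upsampled = interpolate_half_pixel_row([10, 20, 40])
```

The generated samples at half positions give the motion search additional candidate positions between the known pixel samples.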
- motion compensation unit 246 may locate the predictive block to which the motion vector points in a reference picture list. Motion compensation unit 246 may also generate syntax elements associated with blocks and video slices for use by decoder 30 in decoding picture blocks of the video slice.
- the above-mentioned inter prediction unit 244 may transmit syntax elements to the entropy coding unit 270; the syntax elements include, for example, inter prediction parameters (such as indication information of the inter prediction mode selected for predicting the current block after traversing multiple inter prediction modes) and the index number of the candidate motion vector list, and optionally also a GBi index number, a reference frame index, and the like.
- the inter prediction unit 244 can be configured to perform any combination of inter prediction techniques.
- the intra prediction unit 254 is configured to obtain, eg receive, the picture block 203 (current picture block) of the same picture and one or more previously reconstructed blocks, eg reconstructed adjacent blocks, for intra estimation.
- the encoder 20 may be configured to select an intra prediction mode from a plurality of (predetermined) intra prediction modes.
- Embodiments of the encoder 20 may be used to select the intra prediction mode based on optimization criteria, such as based on the smallest residual (eg, the intra prediction mode that provides the most similar prediction block 255 to the current picture block 203 ) or the smallest rate-distortion.
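Selecting an intra prediction mode by smallest residual can be sketched with three simple modes built from the reconstructed neighbouring samples above and to the left of the block. The restriction to DC, vertical, and horizontal modes, the SAD criterion, and all names are assumptions of this sketch.

```python
def predict_intra(mode, above, left_col, size):
    """Build a size x size prediction block from neighbouring samples."""
    if mode == "dc":          # rounded mean of all neighbouring samples
        mean = (sum(above) + sum(left_col) + size) // (2 * size)
        return [[mean] * size for _ in range(size)]
    if mode == "vertical":    # copy the row above downwards
        return [list(above) for _ in range(size)]
    if mode == "horizontal":  # copy the left column rightwards
        return [[left_col[y]] * size for y in range(size)]
    raise ValueError(mode)


def select_intra_mode(block, above, left_col):
    """Return the mode whose prediction has the smallest SAD to the block."""
    size = len(block)

    def cost(mode):
        prediction = predict_intra(mode, above, left_col, size)
        return sum(abs(b - p)
                   for b_row, p_row in zip(block, prediction)
                   for b, p in zip(b_row, p_row))

    return min(("dc", "vertical", "horizontal"), key=cost)


mode_a = select_intra_mode([[10, 20], [10, 20]], above=[10, 20], left_col=[15, 15])
mode_b = select_intra_mode([[5, 5], [9, 9]], above=[7, 7], left_col=[5, 9])
```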
- the intra-prediction unit 254 is further configured to determine an intra-prediction block 255 based on the intra-prediction parameters of the selected intra-prediction mode. In any case, after selecting the intra prediction mode for the block, the intra prediction unit 254 is also configured to provide the intra prediction parameters to the entropy coding unit 270, i.e., to provide information indicating the intra prediction mode selected for the block. In one example, intra prediction unit 254 may be configured to perform any combination of intra prediction techniques.
- the above-mentioned intra prediction unit 254 may transmit syntax elements to the entropy encoding unit 270, where the syntax elements include intra prediction parameters (for example, indication information of the intra prediction mode selected for predicting the current block after traversing multiple intra prediction modes).
- the decoder 30 may directly use the default prediction mode for decoding.
- the entropy coding unit 270 is used to apply an entropy coding algorithm or scheme (for example, a variable length coding (VLC) scheme, a context adaptive VLC (CAVLC) scheme, an arithmetic coding scheme, context adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding method or technique) to one or all (or none) of the quantized residual coefficients 209, inter prediction parameters, intra prediction parameters, and/or loop filter parameters, to obtain the encoded picture data 21, which can be output, for example, in the form of encoded data 21.
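As one concrete example of a variable length coding scheme, the sketch below implements the unsigned order-0 Exp-Golomb code, in which small values receive short codewords. It is chosen here purely for illustration; the text above does not state which VLC scheme is used, and the bit-string representation and function names are assumptions.

```python
def exp_golomb_encode(value):
    """Unsigned order-0 Exp-Golomb codeword for `value`, as a bit string."""
    code = bin(value + 1)[2:]              # binary form of value + 1
    return "0" * (len(code) - 1) + code    # prefix of leading zeros


def exp_golomb_decode(bits):
    """Decode a concatenation of well-formed codewords into values."""
    values, pos = [], 0
    while pos < len(bits):
        zeros = 0
        while bits[pos] == "0":            # count the zero prefix
            zeros += 1
            pos += 1
        values.append(int(bits[pos:pos + zeros + 1], 2) - 1)
        pos += zeros + 1
    return values


stream = "".join(exp_golomb_encode(v) for v in [0, 1, 2, 3])
decoded = exp_golomb_decode(stream)
```

Because the length of each codeword is announced by its zero prefix, the decoder can split the stream back into values without any separators.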
- the encoded data may be transmitted to the decoder 30 or
- encoder 20 may directly quantize the residual signal without the transform processing unit 206 for certain blocks or frames.
- encoder 20 may have quantization unit 208 and inverse quantization unit 210 combined into a single unit.
- encoder 20 may be used to encode video streams.
- for some image blocks or image frames, the encoder 20 can directly quantize the residual signal without processing by the transform processing unit 206, and correspondingly without processing by the inverse transform processing unit 212; or, for some image blocks or image frames, the encoder 20 does not generate residual data, and accordingly no processing by the transform processing unit 206, the quantization unit 208, the inverse quantization unit 210, and the inverse transform processing unit 212 is needed; or, the encoder 20 can store the reconstructed image block directly as a reference block without processing by the filter 220; or, the quantization unit 208 and the inverse quantization unit 210 in the encoder 20 can be combined.
- the loop filter 220 is optional, and in the case of lossless compression coding, the transform processing unit 206, the quantization unit 208, the inverse quantization unit 210 and the inverse transform processing unit 212 are optional. It should be understood that, according to different application scenarios, the inter prediction unit 244 and the intra prediction unit 254 may be selectively enabled.
- Fig. 3 shows a schematic/conceptual block diagram of an example of a decoder 30 for implementing an embodiment of the present invention.
- the decoder 30 is configured to receive encoded picture data (eg, encoded coded data) 21 encoded by the encoder 20 to obtain a decoded picture 231 .
- decoder 30 receives video data, such as encoded video encoding data representing picture blocks of an encoded video slice, and associated syntax elements from encoder 20 .
- the decoder 30 includes an entropy decoding unit 304, an inverse quantization unit 310, an inverse transform processing unit 312, a reconstruction unit 314 (such as a summer 314), a buffer 316, a loop filter 320, a decoded picture buffer 330, and a prediction processing unit 360.
- Prediction processing unit 360 may include inter prediction unit 344 , intra prediction unit 354 , and mode selection unit 362 .
- decoder 30 may perform a decoding pass that is substantially the inverse of the encoding pass described with reference to encoder 20 of FIG. 2 .
- the entropy decoding unit 304 is configured to perform entropy decoding on the encoded picture data 21 to obtain, for example, quantized coefficients 309 and/or decoded encoding parameters (not shown in FIG. 3), for example, any or all of the (decoded) inter prediction parameters, intra prediction parameters, loop filter parameters, and/or other syntax elements.
- the entropy decoding unit 304 is further configured to forward the inter prediction parameters, intra prediction parameters and/or other syntax elements to the prediction processing unit 360 .
- Decoder 30 may receive syntax elements at the video slice level and/or the video block level.
- the inverse quantization unit 310 may be functionally the same as the inverse quantization unit 210
- the inverse transform processing unit 312 may be functionally the same as the inverse transform processing unit 212
- the reconstruction unit 314 may be functionally the same as the reconstruction unit 214
- the buffer 316 may be functionally the same as the buffer 216
- loop filter 320 may be functionally the same as loop filter 220
- decoded picture buffer 330 may be functionally the same as decoded picture buffer 230 .
- the prediction processing unit 360 may include an inter prediction unit 344 and an intra prediction unit 354, wherein the inter prediction unit 344 may be functionally similar to the inter prediction unit 244, and the intra prediction unit 354 may be functionally similar to the intra prediction unit 254 .
- the prediction processing unit 360 is generally configured to perform block prediction and/or obtain a prediction block 365 from the encoded data 21, and to receive or obtain prediction related parameters and/or information about the selected prediction mode.
- intra prediction unit 354 of prediction processing unit 360 is used to generate the prediction block 365 for a picture block of the current video slice based on the signaled intra prediction mode and data from previously decoded blocks of the current frame or picture.
- inter prediction unit 344 (e.g., a motion compensation unit)
- the predicted block can be generated from one reference picture within one reference picture list.
- Decoder 30 may construct reference frame lists: List 0 and List 1 based on the reference pictures stored in DPB 330 using default construction techniques.
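As a rough illustration of what a default list construction could look like, the sketch below orders reference pictures by picture order count (POC), nearest first, in the manner commonly used by HEVC-style codecs. The function name `build_reference_lists` and the representation of pictures as plain POC integers are assumptions made for illustration; the text above does not specify the construction technique.

```python
def build_reference_lists(dpb_pocs, current_poc):
    """Return (list0, list1) of reference POCs for the current picture.

    Assumption: pictures are identified only by their POC value.
    """
    # Pictures that precede the current picture, nearest first.
    before = sorted((p for p in dpb_pocs if p < current_poc), reverse=True)
    # Pictures that follow the current picture, nearest first.
    after = sorted(p for p in dpb_pocs if p > current_poc)
    list0 = before + after  # List 0 favours past pictures
    list1 = after + before  # List 1 favours future pictures
    return list0, list1
```

For a picture with POC 6 and a DPB holding POCs 0, 4, 8 and 16, this ordering places POC 4 first in List 0 and POC 8 first in List 1.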
- the prediction processing unit 360 is configured to determine prediction information for a video block of the current video slice by parsing motion vectors and other syntax elements, and use the prediction information to generate a prediction block for the current video block being decoded.
- prediction processing unit 360 uses some of the received syntax elements to determine the prediction mode (e.g., intra or inter prediction), the inter prediction slice type (e.g., B slice, P slice, or GPB slice), construction information for one or more of the reference picture lists for the slice, the motion vector of each inter-coded video block of the slice, the inter prediction status of each inter-coded video block of the slice, and other information, in order to decode the video blocks of the current video slice.
- the syntax elements received by the decoder 30 from the encoded data include syntax elements in one or more of an adaptive parameter set (APS), a sequence parameter set (SPS), a picture parameter set (PPS), or a slice header.
- Inverse quantization unit 310 may be used to inverse quantize (i.e., dequantize) the quantized transform coefficients provided in the encoded data and decoded by entropy decoding unit 304.
- the inverse quantization process may include using quantization parameters calculated by encoder 20 for each video block in a video slice to determine the degree of quantization that should be applied and likewise determine the degree of inverse quantization that should be applied.
- An inverse transform processing unit 312 is used to apply an inverse transform (e.g., an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process) to the transform coefficients to generate a residual block in the pixel domain.
- Reconstruction unit 314 (e.g., summer 314) is used to add inverse transform block 313 (i.e., reconstructed residual block 313) to prediction block 365 to obtain reconstructed block 315 in the sample domain, e.g., by adding the sample values of the reconstructed residual block 313 to the sample values of the prediction block 365.
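The sample-wise addition performed by the reconstruction unit can be sketched as follows. This is a minimal sketch, assuming blocks are lists of rows of integers; the function name `reconstruct_block` is hypothetical, and the clipping to the valid sample range for the given bit depth is a common practice added here as an assumption, not something stated in the text above.

```python
def reconstruct_block(residual, prediction, bit_depth=8):
    """Add a reconstructed residual block to a prediction block, sample by sample."""
    max_val = (1 << bit_depth) - 1  # 255 for 8-bit samples
    return [
        # Assumption: results are clipped to [0, max_val].
        [min(max(r + p, 0), max_val) for r, p in zip(res_row, pred_row)]
        for res_row, pred_row in zip(residual, prediction)
    ]
```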
- a loop filter unit 320 is used (either during the encoding loop or after the encoding loop) to filter the reconstructed block 315 to obtain a filtered block 321 to smooth pixel transitions or improve video quality.
- loop filter unit 320 may be configured to perform any combination of the filtering techniques described below.
- the loop filter unit 320 is intended to represent one or more loop filters, such as a deblocking filter, a sample-adaptive offset (SAO) filter, or other filters, such as a bilateral filter, an adaptive loop filter (ALF), a sharpening or smoothing filter, or a collaborative filter.
- although loop filter unit 320 is shown in FIG. 3 as an in-loop filter, in other configurations loop filter unit 320 may be implemented as a post-loop filter.
- the decoded video blocks 321 in a given frame or picture are then stored in a decoded picture buffer 330 that stores reference pictures for subsequent motion compensation.
- the decoder 30 is configured to output the decoded picture 31, e.g., via an output 332, for presentation to or viewing by a user.
- decoder 30 may be used to decode compressed encoded data.
- decoder 30 may generate an output video stream without loop filter unit 320 .
- the non-transform based decoder 30 may directly inverse quantize the residual signal without the inverse transform processing unit 312 for certain blocks or frames.
- decoder 30 may have inverse quantization unit 310 and inverse transform processing unit 312 combined into a single unit.
- the decoder 30 can be used to implement the inter-frame prediction method described in the embodiment of FIG. 11A later.
- the processing result of a given stage may be further processed before being output to the next stage, for example, after stages such as interpolation filtering, motion vector derivation, or loop filtering.
- operations such as clip or shift may be further performed on the processing result of the corresponding stage.
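The clip and shift operations mentioned above can be sketched as follows. This is an illustrative sketch only: the helper names `clip3`, `round_shift` and `constrain_mv` are hypothetical, and the 18-bit motion vector range is an assumed example value that the text does not specify.

```python
def clip3(lo, hi, value):
    """Clamp value to the inclusive range [lo, hi]."""
    return max(lo, min(hi, value))


def round_shift(value, shift):
    """Right-shift with rounding to nearest (ties rounded toward +infinity)."""
    if shift == 0:
        return value
    return (value + (1 << (shift - 1))) >> shift


def constrain_mv(mv, bit_depth=18):
    """Clip each motion vector component to a signed range of bit_depth bits.

    Assumption: bit_depth=18 is only an example value.
    """
    lo = -(1 << (bit_depth - 1))
    hi = (1 << (bit_depth - 1)) - 1
    return tuple(clip3(lo, hi, c) for c in mv)
```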
- FIG. 4 is a schematic structural diagram of a video coding device 400 (for example, a video encoding device 400 or a video decoding device 400) provided by an embodiment of the present invention.
- the video coding device 400 is suitable for implementing the embodiments described herein.
- the video coding device 400 may be a video decoder (such as the decoder 30 of FIG. 1A) or a video encoder (such as the encoder 20 of FIG. 1A).
- the video coding device 400 may be one or more components of the decoder 30 of FIG. 1A or the encoder 20 of FIG. 1A described above.
- the video coding device 400 includes: an ingress port 410 and a receiver unit (Rx) 420 for receiving data; a processor, logic unit, or central processing unit (CPU) 430 for processing data; a transmitter unit (Tx) 440 and an egress port 450 for transmitting data; and a memory 460 for storing data.
- the video coding device 400 may also include optical-to-electrical conversion components and electrical-to-optical (EO) components coupled to the ingress port 410, the receiver unit 420, the transmitter unit 440, and the egress port 450 for the egress or ingress of optical or electrical signals.
- the processor 430 is realized by hardware and software.
- Processor 430 may be implemented as one or more CPU chips, cores (e.g., multi-core processors), FPGAs, ASICs, and DSPs.
- Processor 430 is in communication with ingress port 410, receiver unit 420, transmitter unit 440, egress port 450, and memory 460.
- the processor 430 includes a coding module 470 (e.g., an encoding module 470 or a decoding module 470).
- the encoding/decoding module 470 implements the embodiments disclosed herein to implement the chrominance block prediction method provided by the embodiment of the present invention.
- the encoding/decoding module 470 implements, processes or provides various encoding operations.
- the encoding/decoding module 470 is implemented as instructions stored in the memory 460 and executed by the processor 430.
- Memory 460 includes one or more magnetic disks, tape drives, and solid-state drives, and may be used as an overflow data storage device to store programs when those programs are selected for execution, and to store instructions and data that are read during program execution.
- Memory 460 may be volatile and/or nonvolatile, and may be read-only memory (ROM), random-access memory (RAM), ternary content-addressable memory (TCAM), and/or static random-access memory (SRAM).
- FIG. 5A is a simplified block diagram of an apparatus 500 that may be used as either or both of source device 12 and destination device 14 in FIG. 1A , according to an exemplary embodiment.
- Apparatus 500 may implement the techniques of the present invention.
- FIG. 5A is a schematic block diagram of an implementation manner of an encoding device or a decoding device (referred to as a decoding device 500 for short) according to an embodiment of the present invention.
- the decoding device 500 may include a processor 510 , a memory 530 and a bus system 550 .
- the processor and the memory are connected through a bus system, the memory is used for storing instructions, and the processor is used for executing the instructions stored in the memory.
- the memory of the decoding device stores program codes, and the processor can invoke the program codes stored in the memory to execute various video encoding or decoding methods described in the present invention. To avoid repetition, no detailed description is given here.
- the processor 510 may be a central processing unit (CPU), or the processor 510 may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
- a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
- the memory 530 may include a read only memory (ROM) device or a random access memory (RAM) device. Any other suitable type of storage device may also be used as memory 530 .
- Memory 530 may include code and data 531 accessed by processor 510 using bus 550 .
- the memory 530 may further include an operating system 533 and an application program 535 including at least one program allowing the processor 510 to execute the video encoding or decoding method described in the present invention.
- the application program 535 may include applications 1 to N, which further include a video encoding or decoding application (referred to as a video decoding application) for executing the video encoding or decoding method described in the present invention.
- the bus system 550 may include not only a data bus, but also a power bus, a control bus, and a status signal bus. However, for clarity of illustration, the various buses are labeled as bus system 550 in the figure.
- the decoding device 500 may also include one or more output devices, such as a display 570 .
- the display 570 may be a touch-sensitive display that combines a display with a touch-sensing unit operable to sense touch input.
- Display 570 may be connected to processor 510 via bus 550 .
- although processor 510 and memory 530 of device 500 are shown in FIG. 5A as being integrated in a single unit, other configurations may also be used. Operation of processor 510 may be distributed among multiple machines that can be directly coupled, each having one or more processors, or distributed across a local area network or other network.
- the memory 530 may be distributed among multiple machines, such as a network-based memory or memory among multiple machines running the apparatus 500 . Although only a single bus is shown here, the bus 550 of the device 500 may be formed by multiple buses.
- the secondary memory 530 may be directly coupled to other components of the apparatus 500 or may be accessed through a network, and may include a single integrated unit, such as a memory card, or multiple units, such as multiple memory cards. Accordingly, apparatus 500 may be implemented in a variety of configurations.
- FIG. 5B is a schematic flowchart of a video coding method provided by the embodiment of the present application.
- a video coding method provided by the embodiment of the present application may include:
- Step 501: for the description of step 501, reference may be made to the description of step 601 or the description of step 1301, which will not be repeated here.
- the similarity between image features of the first image block and the second image block is greater than a threshold.
- the difference information replaces the original second encoding parameter. Since the similarity between the first image block and the second image block is greater than the threshold, the encoded data obtained by encoding the difference information will be much smaller than the encoded data obtained by encoding the original second encoding parameter.
- the second encoding parameter can be restored based on the difference information and the first encoding parameter. This is equivalent to reducing the storage resources required for storing the video and the bandwidth required for video transmission, on the premise that the complete second video can be recovered.
- the similarity between the first information and the second information is greater than a threshold.
- the difference information replaces the original second encoding parameter. Since the similarity between the first information and the second information is greater than the threshold, the encoded data obtained by encoding the difference information will be much smaller than the encoded data obtained by encoding the second information.
- the second information can be restored based on the difference information and the first information. This is equivalent to reducing the storage resources required for storing the video and the bandwidth required for video transmission, on the premise that the complete second video can be recovered.
- the threshold here may be a numerical value indicating a high similarity between image blocks (this application does not limit the specific magnitude of the numerical value).
- Step 503: for the description of step 503, reference may be made to the description of step 602 or the description of step 1302, which will not be repeated here.
- Step 504: for the description of step 504, reference may be made to the description of step 603 or the description of step 1303, which will not be repeated here.
- the first information includes a first encoding parameter of a first image block
- the second information includes a second encoding parameter of a second image block
- the second encoding parameters include motion vectors and/or residuals.
- the first video may include a first image frame
- the second video may include a second image frame
- the first image frame may include a plurality of image blocks (including the first image block)
- the second image frame may include a plurality of image blocks (including the second image block), wherein the first image block and the second image block may be block units such as a macroblock (MB), a prediction block (partition), a CU, a PU, or a TU, which are not limited here.
- the similarity of image features between the first image block and the second image block is relatively high, wherein, the image features of the image block can be one or more of the color features, texture features, shape features, etc. of the image block .
- the color feature and texture feature are used to describe the surface properties of the object corresponding to the image block.
- Shape features include outline features and area features, outline features include the outer boundary features of the object, and area features include the shape area features of the object.
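One possible way to compare the image features of two blocks against a threshold is sketched below using a normalized intensity histogram as a simple stand-in for a color feature. The histogram-intersection measure, the function names, the bin count and the threshold value of 0.9 are all assumptions chosen for illustration; the text above does not prescribe a particular feature or similarity measure.

```python
def color_histogram(block, bins=8, max_val=256):
    """Normalized histogram of sample values in a block (list of rows)."""
    hist = [0] * bins
    count = 0
    for row in block:
        for sample in row:
            hist[sample * bins // max_val] += 1
            count += 1
    return [h / count for h in hist]


def histogram_similarity(block_a, block_b, bins=8):
    """Histogram intersection in [0, 1]; 1 means identical histograms."""
    hist_a = color_histogram(block_a, bins)
    hist_b = color_histogram(block_b, bins)
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


def is_similar(block_a, block_b, threshold=0.9):
    """Assumption: threshold=0.9 is an arbitrary example value."""
    return histogram_similarity(block_a, block_b) > threshold
```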
- the first encoding parameter of the first image block and the second encoding parameter of the second image block can be obtained. The first encoding parameter and the second encoding parameter may both be motion vectors, may both be residuals, or may each include a motion vector and a residual.
- the residual may be calculated based on the image block and the prediction block; for example, the residual may be obtained in the sample domain by subtracting the sample values of the prediction block from the sample values of the image block, sample by sample (pixel by pixel).
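The sample-wise residual calculation can be sketched as below, following the usual convention that the residual is the image block minus the prediction block. The function name `compute_residual` and the list-of-rows block representation are assumptions for illustration.

```python
def compute_residual(image_block, prediction_block):
    """Residual = image block minus prediction block, sample by sample."""
    return [
        [sample - pred for sample, pred in zip(image_row, pred_row)]
        for image_row, pred_row in zip(image_block, prediction_block)
    ]
```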
- the coding parameters may also include syntax information; for example, the syntax information may be, but is not limited to, any one or all of (decoded) inter prediction parameters, intra prediction parameters, loop filter parameters, and/or other syntax elements.
- a subtraction operation may be performed on the first encoding parameter and the second encoding parameter to obtain the difference information; alternatively, the difference information may be obtained by another operation that quantifies the difference between the first encoding parameter and the second encoding parameter, which is not limited here.
- the first image block and the second image block may be image blocks with relatively similar image features in the first image frame and the second image frame, and the above processing for the first image block and the second image block may be performed on some or all of the image blocks in the first image frame and the second image frame. In this way, difference information for multiple groups of image blocks can be obtained.
- the difference information may be used as the encoding parameter of the second image block (replacing the original second encoding parameter), and the data of the second video, including the encoding parameter of the second image block, is encoded to obtain the bitstream of the second video.
- the first image frame and the second image frame may be image frames with relatively similar image features in the first video and the second video, and the above processing for the first image block and the second image block may be performed on the image blocks in some or all of the image frames in the first video and the second video.
- the first image block is included in a first image frame
- the second image block is included in a second image frame
- the first image frame and the second image frame are different image frames.
- the first encoding parameter may include a first motion vector
- the second encoding parameter may include a second motion vector
- the difference information includes first difference information
- the first difference information is a difference between the first motion vector and the second motion vector.
- the first coding parameter may include a first residual
- the second coding parameter may include a second residual
- the difference information includes second difference information, and the second difference information is a difference between the first residual and the second residual.
- the first coding parameter may include a first motion vector and a first residual
- the second coding parameter may include a second motion vector and a second residual
- the difference information includes first difference information and second difference information
- the first difference information is the difference between the first motion vector and the second motion vector
- the second difference information is the difference between the first residual and the second residual.
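The computation of the first and second difference information, and the restoration of the second encoding parameters from them, can be sketched as follows. The sign convention (second minus first) and all function names are assumptions for illustration; the text only states that a difference between the two parameters is taken.

```python
def mv_difference(first_mv, second_mv):
    """First difference information: motion vector difference (second - first)."""
    return (second_mv[0] - first_mv[0], second_mv[1] - first_mv[1])


def residual_difference(first_res, second_res):
    """Second difference information: sample-wise residual difference."""
    return [
        [s - f for f, s in zip(first_row, second_row)]
        for first_row, second_row in zip(first_res, second_res)
    ]


def restore_second(first_mv, first_res, mv_diff, res_diff):
    """Recover the second motion vector and residual from the first ones."""
    second_mv = (first_mv[0] + mv_diff[0], first_mv[1] + mv_diff[1])
    second_res = [
        [f + d for f, d in zip(first_row, diff_row)]
        for first_row, diff_row in zip(first_res, res_diff)
    ]
    return second_mv, second_res
```

When the two blocks are similar, most entries of the difference are zero or close to zero, which is what makes the difference information cheaper to encode than the original second encoding parameters.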
- the first coding parameter includes a first residual
- the second coding parameter includes a second residual
- the difference information includes second difference information
- the second difference information is used to indicate the difference between the first residual and the second residual
- encoding the difference information includes: encoding the second difference information by lossless compression coding or lossy compression coding.
- the first difference information represents the difference between motion vectors, and motion vectors are used in inter-frame prediction. If lossy compression is used to compress the first difference information, the inter-frame prediction result may be degraded (for example, artifacts may appear).
- using lossless compression to compress the first difference information can therefore improve the accuracy and effect of inter-frame prediction.
- the first encoding parameter includes a first motion vector
- the second encoding parameter includes a second motion vector
- the difference information includes first difference information
- the first difference information is used to indicate the difference between the first motion vector and the second motion vector
- encoding the difference information includes: encoding the first difference information through lossless compression coding.
- the first difference information (which indicates the difference between the first motion vector and the second motion vector) may be encoded by lossless compression coding.
- the second difference information (which indicates the difference between the first residual and the second residual) may be encoded by lossless compression coding or lossy compression coding.
- lossless compression may be lossless compression including transformation (retaining all coefficients), scanning, and entropy coding
- lossy compression may be lossy compression including transformation (retaining low-frequency coefficients), quantization, scanning, and entropy coding.
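The practical difference between the two paths can be illustrated with a simplified sketch. Here `zlib` is used only as a stand-in for the scanning and entropy coding stages, and a uniform quantizer stands in for the lossy stage; the transform is omitted entirely. These substitutions, the function names, the 16-bit value range and the step size of 4 are all assumptions for illustration, not the coding tools of the described method.

```python
import struct
import zlib


def encode_lossless(values):
    """Pack signed 16-bit integers and compress them without loss."""
    raw = struct.pack(f"<{len(values)}h", *values)
    return zlib.compress(raw)


def decode_lossless(data):
    raw = zlib.decompress(data)
    return list(struct.unpack(f"<{len(raw) // 2}h", raw))


def encode_lossy(values, step=4):
    """Quantize with a uniform step (information is lost), then compress."""
    quantized = [int(round(v / step)) for v in values]
    return encode_lossless(quantized)


def decode_lossy(data, step=4):
    return [q * step for q in decode_lossless(data)]
```

In this sketch the lossless path returns the input exactly, which suits motion vector differences, while the lossy path returns values that differ from the input by at most half the quantization step, which may be acceptable for residual differences.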
- the encoded data and the first encoding parameter may be sent to the decoding side, so that the decoding side can obtain the second encoding parameter; alternatively, the encoded data may be stored locally for later transmission or retrieval.
- the above encoded data may be encapsulated, and correspondingly, the encapsulated encoded data may be sent to the decoding side, so that the decoding side can obtain the second encoding parameter according to the encapsulated encoded data; alternatively, the encapsulated encoded data may be stored locally for later transmission or retrieval.
- first indication information may also be encoded. The first indication information indicates that the difference information is obtained according to the difference between the first image block and the second image block, so that the decoding side can obtain, based on the first indication information, the first encoding parameter of the first image block and the difference information, and obtain the second encoding parameter based on the first encoding parameter and the difference information.
- the first indication information may include an identifier indicating the first image block and an identifier indicating the second image block.
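A possible in-memory shape for the first indication information on the encoding side is sketched below. The classes `BlockId` and `FirstIndication`, their fields, and the function `encode_second_block` are hypothetical names introduced for illustration; the text only states that the indication information may contain an identifier of each of the two image blocks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockId:
    """Hypothetical block identifier: video, frame index, block index."""
    video: str
    frame: int
    block: int


@dataclass(frozen=True)
class FirstIndication:
    first_block: BlockId   # block whose parameters serve as the base
    second_block: BlockId  # block whose parameters are stored as a difference


def encode_second_block(first_id, first_mv, second_id, second_mv):
    """Return the indication and the motion vector difference to be encoded."""
    indication = FirstIndication(first_id, second_id)
    # Assumption: the difference is taken as second minus first.
    mv_diff = (second_mv[0] - first_mv[0], second_mv[1] - first_mv[1])
    return indication, mv_diff
```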
- the first information includes a first image block
- the second information includes a second image block
- the similarity of image features between the first image block and the second image block is greater than a threshold
- the obtaining difference information according to the first information and the second information includes:
- the difference information includes a third encoding parameter of the second image block.
- the coded data of the first video and the second video may be decoded by a decoder to obtain a video signal of the first video and a video signal of the second video, wherein the video signal of the first video may include the first image block, and the video signal of the second video may include the second image block.
- the decoder may decapsulate the video files of the first video and the second video.
- the first video and the second video may be different videos, or different parts of the same video (for example, some videos contain a large number of repeated video segments, in which case the first video and the second video may be two repeated video clips in that video), which is not limited here.
- the first video may include a first image frame
- the second video may include a second image frame
- the first image frame may include a plurality of image blocks (including the first image block)
- the second image frame may include a plurality of image blocks (including the second image block), wherein the first image block and the second image block may be block units such as a macroblock (MB), a prediction block (partition), a CU, a PU, or a TU, which are not limited here.
- the similarity of image features between the first image block and the second image block is relatively high, wherein, the image features of the image block can be one or more of the color features, texture features, shape features, etc. of the image block .
- the color feature and texture feature are used to describe the surface properties of the object corresponding to the image block.
- the shape features include contour features and region features, the contour features include the outer boundary features of the object, and the region features include the shape region features of the object.
- the first image block may be an image block in an I frame
- the second image block may be an image block in a P frame or a B frame
- the first image block may be an image block in a P frame
- the second image block is an image block in a P frame or a B frame.
- the video signal of the second video may be encoded based on the video signal of the first video.
- the third encoding parameter of the second image block may be determined by using the first image block as a reference block of the second image block.
- the first image frame in the first video and the second image frame in the second video may be acquired, where the similarity of the image features of the first image frame and the second image frame is greater than a threshold and the first video and the second video are different videos; the first image frame is used as a reference frame of the second image frame to determine a third encoding parameter of the second image frame.
- the third coding parameter may be residual, motion vector and other syntax information, which is not limited here.
- the first image block may be used as a reference block, and the first image frame containing the reference block may be used as a reference frame to predict the second image block in the current frame (that is, the second image frame including the second image block). Optionally, the first image frame may be temporally after the current frame, or the current frame may be temporally located between a previous reference frame that appears before the current frame in the video sequence and a subsequent reference frame that appears after the current frame in the video sequence (the first image frame may be either the previous reference frame or the subsequent reference frame).
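Using an image block from another video as the reference block can be sketched with a small full-search block-matching routine that derives a motion vector and a residual for the current block. The sum-of-absolute-differences cost, the search range, and the function names are assumptions for illustration; the text does not specify how the reference block is located or how the third encoding parameter is computed.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for ra, rb in zip(block_a, block_b) for a, b in zip(ra, rb))


def get_block(frame, top, left, size):
    return [row[left:left + size] for row in frame[top:top + size]]


def motion_search(current_block, reference_frame, top, left, search_range=2):
    """Find the best match in reference_frame around (top, left).

    Returns ((dx, dy), residual). Assumes the co-located position
    (top, left) lies fully inside the reference frame.
    """
    size = len(current_block)
    height, width = len(reference_frame), len(reference_frame[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate falls outside the reference frame
            candidate = get_block(reference_frame, y, x, size)
            cost = sad(current_block, candidate)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy), candidate)
    _, mv, prediction = best
    residual = [
        [c - p for c, p in zip(cur_row, pred_row)]
        for cur_row, pred_row in zip(current_block, prediction)
    ]
    return mv, residual
```

Here `reference_frame` would be the decoded first image frame from the first video, and `current_block` the second image block from the second video.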
- since the image content similarity between the first image frame and the second image frame can be relatively high, an image block with high similarity in the first image frame is used as a reference block when encoding the second image frame.
- information about the reference block originally used for the second image block within the group of pictures (GOP) in which the second image block is located need not be encoded into the encoded data, which reduces the size of the encoded data, the storage space required for storing the second video, and the bandwidth required for transmitting the second video.
- in order to enable the decoder to decode and obtain a complete and accurate video signal of the second video, second indication information also needs to be encoded into the encoded data, where the second indication information is used to indicate that the reference block of the second image block is the first image block.
- the decoder may know that the reference block of the second image block is the first image block when decoding.
- the second indication information may include an identifier of the first image block and an identifier of the second image block.
- the second indication information may indicate that the reference frame of the second image frame is the first image frame; the first image frame is then acquired according to the second indication information, and the second image frame is reconstructed according to the encoding parameters of the second image frame and the first image frame.
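On the decoding side, reconstruction from a cross-video reference can be sketched as follows: the indication information selects the reference frame, the motion vector locates the reference block, and the residual is added to it. The dictionary layout, the key name `reference_frame_id`, and the function name are hypothetical and introduced only for illustration.

```python
def reconstruct_block_from_reference(decoded_frames, indication, top, left, mv, residual):
    """Rebuild a block of the second image frame from a frame of another video.

    decoded_frames: dict mapping a (hypothetical) frame id to a decoded frame.
    indication: dict holding "reference_frame_id" (assumed field name).
    """
    reference_frame = decoded_frames[indication["reference_frame_id"]]
    dx, dy = mv
    size = len(residual)
    # Motion-compensated prediction taken from the reference frame.
    prediction = [
        row[left + dx:left + dx + size]
        for row in reference_frame[top + dy:top + dy + size]
    ]
    return [
        [p + r for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]
```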
- the second video file may also be decoded to obtain a fourth image block; a third video file is obtained, where the third video file and the second video file are different video files; the third video file is decoded to obtain a third image block in the third video file, where the similarity of the image features of the third image block and the fourth image block is greater than a threshold, and the second image block and the fourth image block belong to the same image frame in the second video file; the third image block is used as a reference block of the fourth image block to determine the difference information, where the difference information includes a fourth encoding parameter of the fourth image block.
- image blocks in the above-mentioned other image frames that have high image content similarity with the second image frame can be used as reference blocks when encoding the second image frame.
- the second image frame may also include other image blocks (such as the fourth image block) besides the second image block, and in the process of encoding the fourth image block, an image block in the third video may be used as a reference block (the third image block) of the fourth image block.
- the second encoding parameter and second indication information may be encoded, where the second indication information is used to indicate that the reference block of the fourth image block is the third image block.
- the first video file may also be decoded to obtain a fifth image block; the second video file may be decoded to obtain a sixth image block; the second image block and the sixth image block belong to the same image frame in the second video file, the similarity of the image features of the fifth image block and the sixth image block is greater than a threshold, and the first image block and the fifth image block belong to the same image frame or to different image frames in the first video file; the fifth image block is used as the reference block of the sixth image block to determine the difference information, where the difference information includes a fifth encoding parameter of the sixth image block.
- the image block with a high image content similarity between the other image frame and the second image frame may be used as a reference block when encoding the second image frame.
- the second image frame may also include other image blocks (such as the sixth image block) besides the second image block, and in the process of encoding the sixth image block, an image block in the first video may be used as the reference block (the fifth image block) of the sixth image block; the reference block may be an image block in the first image frame (the image frame where the first image block is located), or the reference block may be an image block in the first video
- the image features include at least one of the following:
- FIG. 5C is a schematic flowchart of a video decoding method provided by the embodiment of the present application.
- a video decoding method provided by the embodiment of the present application may include:
- Step 505: for the description of step 505, reference may be made to the description of step 1201 or the description of step 1501, which will not be repeated here.
- Step 506: for the description of step 506, reference may be made to the description of step 1202 or the description of step 1502, which will not be repeated here.
- Step 507: for the description of step 507, reference may be made to the description of step 1201 or the description of step 1503, which will not be repeated here.
- Step 508: for the description of step 508, reference may be made to the description of step 1203 or the description of step 1505, which will not be repeated here.
- the indication information can also be encoded; the indication information can indicate that the difference information is obtained according to the difference between the first image block and the second image block, so that the decoding side can obtain the first encoding parameter of the first image block and the difference information based on the indication information, and obtain the second encoding parameter based on the first encoding parameter and the difference information.
- the indication information may include an identifier indicating the first image block and an identifier indicating the second image block.
- the first information includes a first encoding parameter of a first image block
- the second information includes a second encoding parameter of a second image block
- the second encoding parameters include motion vectors and/or residuals.
- the relevant information of the reference frame of the second image frame in the group of pictures (GOP) where the second image frame is located may not be encoded into the encoded data, which reduces the size of the encoded data, the storage space required for storing the second video, and the bandwidth required for transmitting the second video.
- the first image block is included in a first image frame
- the second image block is included in a second image frame
- the first image frame and the second image frame are different image frames.
- the indication information includes an identifier of the first image block and an identifier of the second image block.
- the first information includes a first image block
- the second information includes a second image block
- the indication information is used to indicate that the reference block of the second image block is the first image block.
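- as a minimal sketch of how such indication information could be represented, assuming a hypothetical identifier made of a file name, a frame index, and a block index (the `BlockId` and `IndicationInfo` names and their fields are illustrative and do not come from the application):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockId:
    # Hypothetical identifier: which file, which frame, which block.
    video_file: str
    frame_index: int
    block_index: int


@dataclass(frozen=True)
class IndicationInfo:
    # Identifier of the first image block (the reference block).
    first_block: BlockId
    # Identifier of the second image block (encoded as difference information).
    second_block: BlockId


def reference_of(block, indications):
    """Return the reference block recorded for `block`, or None if there is none."""
    for info in indications:
        if info.second_block == block:
            return info.first_block
    return None


first = BlockId("video1.mp4", 0, 3)
second = BlockId("video2.mp4", 7, 3)
indications = [IndicationInfo(first_block=first, second_block=second)]
```

- with such a record, the decoding side can look up which first image block a given second image block refers to before restoring its encoding parameters.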
- the first image block is an image block in an I frame
- the second image block is an image block in a P frame or a B frame
- the first image block is an image block in a P frame
- the second image block is an image block in a P frame or a B frame.
- the similarity of the image features of the first image block and the second image block is greater than a threshold, and the image features include at least one of the following:
- FIG. 6 is a schematic flowchart of a video encoding method provided by an embodiment of the present application.
- a video encoding method provided by an embodiment of the present application may include:
- a first encoding parameter of a first image block and a second encoding parameter of a second image block; the first image block is an image block in a first video file
- the second image block is an image block in a second video file
- the first video file and the second video file are different video files
- the first encoding parameter and the second encoding parameter include motion vectors and/or residuals.
- "motion vectors and/or residuals" can be understood as follows: the first encoding parameter and the second encoding parameter include motion vectors; or the first encoding parameter and the second encoding parameter include residuals; or
- the first encoding parameter and the second encoding parameter include both motion vectors and residuals.
- the first encoding parameter of the first image block can be obtained by decoding the encoded data (or code stream) of the first video, and the second encoding parameter of the second image block can be obtained by decoding the encoded data (or code stream) of the second video.
- the encoding parameters of the first image frame can be obtained by decoding the encoded data (or code stream) of the first video
- the encoding parameters of the second image frame can be obtained by decoding the encoded data (or code stream) of the second video
- the encoded data (or code stream) of the second video is decoded to obtain encoding parameters of the second image frame (including the second image block).
- a video signal of the first video may be input into an encoder.
- the video signal of the first video may be an uncompressed video file stored in memory.
- the first video may be captured by a video capture device, such as a video camera, and encoded to support playback of the video.
- a video file may include an audio component and a video component.
- the video component includes a series of image frames that, when viewed in sequence, create the visual effect of motion. These frames include pixels represented in terms of light (which may be called a luma component) and color (which may be called a chrominance component).
- the frames may also include depth values to support three-dimensional viewing.
- the video can be split into chunks afterwards. Segmentation involves subdividing the pixels in each frame into square and/or rectangular blocks for compression. For example, an encoding tree can be employed to partition blocks and then recursively subdivide the blocks until a configuration is achieved that supports further encoding. For example, the luma component of a frame may be subdivided until the individual blocks include relatively uniform luma values. Furthermore, the chrominance components of a frame can be subdivided until the individual blocks comprise relatively uniform color values. Therefore, the segmentation mechanism differs according to the content of the video frame.
- Inter prediction aims to exploit the fact that objects in a common scene tend to appear in consecutive frames. Therefore, there is no need to repeat the description of the block depicting the object in the reference frame in adjacent frames.
- an object such as a table
- a pattern matching mechanism may be employed to match objects across multiple frames.
- a moving object may be represented by multiple frames due to object movement or camera movement, etc.
- a video can show a car moving across the screen across multiple frames. This movement can be described using motion vectors.
- a motion vector is a two-dimensional vector that provides the offset from an object's coordinates in one frame to the object's coordinates in a reference frame.
- inter prediction can encode an image block in the current frame as a set of motion vectors, indicating an offset relative to the corresponding block in a reference frame. Any differences between the current image block and the reference block are stored in the residual block.
- the residual block can be transformed to further compress the file.
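- the relationship between a motion vector, a reference block, and a residual block can be sketched as follows; this is a minimal illustration on tiny integer frames, in which the function names, the `(dy, dx)` vector layout, and the sample values are assumptions rather than anything specified in the application:

```python
def predict_block(reference, top, left, mv, size):
    """Copy the size x size block of the reference frame that the motion vector points to."""
    dy, dx = mv
    return [row[left + dx:left + dx + size]
            for row in reference[top + dy:top + dy + size]]


def residual_block(current, predicted):
    """Sample-by-sample difference between the current block and its prediction."""
    return [[c - p for c, p in zip(c_row, p_row)]
            for c_row, p_row in zip(current, predicted)]


def reconstruct(predicted, residual):
    """Add the residual back onto the prediction to recover the current block."""
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(predicted, residual)]


# Reference frame: a bright 2x2 object at rows 1-2, columns 1-2.
reference = [[10, 10, 10, 10],
             [10, 50, 60, 10],
             [10, 70, 80, 10],
             [10, 10, 10, 10]]

# Current block at (top=1, left=2): the object moved one column to the right
# and two of its samples changed slightly.
current_block = [[51, 60],
                 [70, 82]]

mv = (0, -1)  # offset from the current block position to the reference block
predicted = predict_block(reference, top=1, left=2, mv=mv, size=2)
residual = residual_block(current_block, predicted)
```

- here only the motion vector `(0, -1)` and the small residual `[[1, 0], [0, 2]]` need to be stored for the block, and `reconstruct(predicted, residual)` returns the original samples.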
- Intra prediction encodes blocks in a common frame. Intra prediction exploits the fact that luma and chroma components tend to cluster in a frame. For example, a patch of green in one part of a tree tends to be adjacent to a similar patch of green. Intra-frame prediction adopts multiple directional prediction modes (for example, 33 modes in HEVC), planar mode and direct current (direct current, DC) mode. The direction mode indicates that the current block is similar/identical to samples of neighboring blocks in the corresponding direction. Planar mode indicates that a series of blocks on a row/column (eg, plane) can be interpolated based on neighboring blocks at row edges.
- a row/column eg, plane
- planar mode indicates a smooth transition of light/color across rows/columns by taking a relatively constant slope of varying values.
- DC mode is used for boundary smoothing, indicating that the block is similar/identical to the mean value associated with samples of all neighboring blocks associated with the angular direction of the directional prediction mode. Therefore, an intra prediction block may represent an image block as various relational prediction mode values instead of actual values.
- an inter prediction block may represent image blocks as motion vector values rather than actual values. In either case, the predicted block may not accurately represent the image block in some cases. All differences are stored in the residual block. Residual blocks can be transformed to further compress the file.
- filters are applied according to an in-loop filtering scheme.
- the block-based prediction discussed above makes it possible to create blocky images at the decoder.
- block-based prediction schemes can encode blocks and then reconstruct the encoded blocks for later use as reference blocks.
- the in-loop filtering scheme iteratively applies noise suppression filters, deblocking filters, adaptive loop filters and sample adaptive offset (SAO) filters to blocks/frames.
- SAO sample adaptive offset
- the encoded data includes the data discussed above as well as any signaling data needed to support proper video signal reconstruction at the decoder.
- data may include segmentation data, prediction data, residual blocks, and various flags that provide encoding instructions to the decoder.
- the encoded data may be stored in memory for transmission to the decoder on demand.
- the encoded data may also be broadcast and/or multicast to multiple decoders.
- the creation of the encoded data is an iterative process.
- the encoded data of the first video and the second video can be obtained, and then the encoded data of the first video and the second video can be decoded by a decoder.
- the video files of the first video and the second video can be decapsulated, and each video can have its own format (for example: .MP4, .RMVB, .AVI, .FLV, etc.).
- These formats represent encapsulation formats.
- The encapsulation format, often simply called the video format, is also known as a container.
- the video encapsulation format is a specification for packaging video data and audio data into one file.
- the function of decapsulation is to separate the input data in the encapsulation format into audio stream compression encoding data and video stream compression encoding data. For example, after the data in FLV format is decapsulated, encoded video data and audio encoded data are output.
- the decoder can receive the above-mentioned encoded data and start the decoding process. Specifically, the decoder converts the coded data into corresponding syntax data and video data using an entropy decoding scheme. The decoder may employ syntax data from the encoded data to determine partitioning of frames. The segmentation should match the block segmentation results from the encoding process described above.
- the decoder decodes the encoded data of the first video to obtain encoding parameters related to the first video (for example, these may include the first encoding parameter of the first image block), and the decoder decodes the encoded data of the second video to obtain encoding parameters related to the second video (for example, these may include the second encoding parameter of the second image block).
- the decoder decodes the coded data of the first video to obtain syntax data (for example, this may include the motion vector of the first image block), and decodes the coded data of the second video to obtain syntax data (for example, this may include the motion vector of the second image block).
- the decoder may perform block decoding. Specifically, the decoder uses inverse transformation to generate a residual block (for example, may include the residual of the first image block and the residual of the second image block).
- image features may be compared (that is, whether the first video and the second video have similar or even repeated content is determined by comparing encoding parameters of the image blocks obtained after decoding).
- the coding parameters may be residuals, motion vectors, discrete cosine transform (DCT) coefficients, etc., and the first image block and the second image block are then selected (or the first image frame and the second image frame are selected, where the first image frame includes the first image block and the second image frame includes the second image block).
- for the encapsulated first video file and second video file, whether the first video and the second video have similar or even repeated content may be determined by comparing the subtitle information and/or audio information in the first video file and the second video file (the comparison of image features and/or encoding parameters can be performed afterwards), and the first image block and the second image block are then selected (or the first image frame and the second image frame are selected, where the first image frame includes the first image block and the second image frame includes the second image block).
- image processing may be performed based on comparing subtitle information and/or audio information in the first video file and the second video file.
- image frames with similar subtitle information and/or audio information can be selected first as candidates, and then the image features and/or encoding parameters are compared to select the first image block and the second image block (or select the first image frame and the second image frame, the first image frame includes the first image block, and the second image frame includes the second image block).
- for the encapsulated first video file and second video file, the first image block and the second image block may be selected based on the sources of the first video file and the second video file (for example, by judging whether the first video file and the second video file are same-source videos, such as whether they were edited from the same video file, for instance where the first video file only has subtitles, special effects, or beauty treatment added); alternatively, the first image frame and the second image frame are selected, where the first image frame includes the first image block and the second image frame includes the second image block.
- whether the first video and the second video have similar or even repeated content can be determined based on a comparison of the image features of each (or some) of the image frames in the first video and the second video.
- for the encapsulated first video and second video, whether they have similar or even repeated content as described above can be determined based on the sources of the first video and the second video (for example, judging whether the first video and the second video are videos of the same source).
- whether the first video and the second video contain image frames with similar or even repeated content can be determined based on a comparison of the image features of each (or some) of the image frames in the first video and the second video.
- for the encapsulated first video and second video, whether the first video and the second video contain image blocks with similar or even repeated content may be determined based on a comparison of the image features of the image blocks in each (or some) of the image frames in the first video and the second video.
- the first video and the second video can be videos with similar or repeated content, where similar content can be understood as the existence (or the existence of more than a certain number or proportion) of image frames with similar pixel values and distributions between the videos.
- first video and the second video can be different video files, and can also be different parts of the same video file (for example, there will be a large number of repeated video segments in some videos, then the first video and the second video can be two video clips repeated in the video), it is not limited here.
- first video and the second video may be videos with the same video content, but belong to different video package files.
- the first video may include a first image frame
- the second video may include a second image frame
- the first image frame may include a plurality of image blocks (including the first image block)
- the second image frame may include a plurality of image blocks (including the second image block), wherein the first image block and the second image block may be block units such as MB, prediction block (partition), CU, PU, TU, etc., which are not limited here.
- MB macroblock
- CU coding unit
- PU prediction unit
- TU transform unit
- the similarity of image features between the first image block and the second image block is relatively high, wherein the image features of an image block can be one or more of the color features, texture features, shape features, etc. of the image block.
- the color feature and texture feature are used to describe the surface properties of the object corresponding to the image block.
- Shape features include outline features and area features, outline features include the outer boundary features of the object, and area features include the shape area features of the object.
- the first encoding parameter of the first image block and the second encoding parameter of the second image block can be obtained; the first encoding parameter and the second encoding parameter can be motion vectors, the first encoding parameter and the second encoding parameter can be residuals, or the first encoding parameter and the second encoding parameter can be motion vectors and residuals.
- the residual may be calculated based on the image block and the prediction block; for example, the residual may be obtained in the sample domain by subtracting the sample values of the prediction block from the sample values of the image block of the picture, sample by sample (pixel by pixel).
- the coding parameters may also include syntax information; for example, the syntax information may be, but is not limited to, any one or all of inter prediction parameters, intra prediction parameters, loop filter parameters, and/or other (decoded) syntax elements.
- syntax information may be, but is not limited to, any one or all of inter prediction parameters, intra prediction parameters, loop filter parameters, and/or other (decoded) syntax elements.
- the first image block as the macroblock of the first image frame in the first video
- the second image block as the macroblock of the second image frame in the second video
- the coding parameters (such as motion vectors and residuals): in the figure, V1_MV1, V1_MV2, ..., V1_MVn and
- V2_MV1, V2_MV2, ..., V2_MVn are, respectively, the motion vectors of macroblock 1, macroblock 2, ..., macroblock n of the first image frame and of the second image frame after video decoding;
- V1_CAVLC1, V1_CAVLC2, ..., V1_CAVLCm are, respectively, the residuals of macroblock 1, macroblock 2, ..., macroblock m of the P frame after decoding the video file V1;
- V2_CAVLC1, V2_CAVLC2, ..., V2_CAVLCm are, respectively, the residuals of macroblock 1, macroblock 2, ..., macroblock m of the P frame after decoding the video file V2.
- the first image block and the second image block may be image blocks with relatively similar image features in the first image frame and the second image frame, and the above processing for the first image block and the second image block may be performed on part or all of the image blocks in the first image frame and the second image frame.
- the first image frame and the second image frame may be image frames with relatively similar image features in the first video and the second video, and the above processing for the first image block and the second image block may be performed on the image blocks in part or all of the image frames in the first video and the second video.
- the first encoding parameter may include a first motion vector
- the second encoding parameter may include a second motion vector
- the difference information includes first difference information
- the first difference information is used to indicate a difference between the first motion vector and the second motion vector.
- the first coding parameter may include a first residual
- the second coding parameter may include a second residual
- the difference information includes second difference information
- the second difference information is used to indicate a difference between the first residual and the second residual.
- the first coding parameter may include a first motion vector and a first residual
- the second coding parameter may include a second motion vector and a second residual
- the difference information includes first difference information and second difference information
- the first difference information is used to indicate the difference between the first motion vector and the second motion vector
- the second difference information is used to indicate the difference between the first residual and the second residual.
- a subtraction operation may be performed on the first encoding parameter and the second encoding parameter to obtain the difference information; alternatively, the difference information may be obtained by another operation that quantifies the difference between the first encoding parameter and the second encoding parameter, which is not limited here.
- the first coding parameter may include a first motion vector
- the second coding parameter may include a second motion vector
- the difference information includes the first difference information, and the first difference information is a difference between the first motion vector and the second motion vector.
- the first coding parameter may include a first residual
- the second coding parameter may include a second residual
- the difference information includes second difference information, and the second difference information is a difference between the first residual and the second residual.
- the first coding parameter may include a first motion vector and a first residual
- the second coding parameter may include a second motion vector and a second residual
- the difference information includes first difference information and second difference information
- the first difference information is the difference between the first motion vector and the second motion vector
- the second difference information is the difference between the first residual and the second residual.
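- the subtraction described above can be sketched as follows; the sign convention (second minus first, so that the decoding side can restore the second encoding parameter by summation) and the tuple and nested-list layouts are assumptions made for illustration:

```python
def mv_difference(first_mv, second_mv):
    """First difference information: component-wise second MV minus first MV."""
    return (second_mv[0] - first_mv[0], second_mv[1] - first_mv[1])


def residual_difference(first_res, second_res):
    """Second difference information: sample-wise second residual minus first residual."""
    return [[s - f for f, s in zip(f_row, s_row)]
            for f_row, s_row in zip(first_res, second_res)]


# Hypothetical encoding parameters of two similar image blocks.
first_mv, second_mv = (3, -2), (3, -1)
first_res = [[4, -1], [0, 2]]
second_res = [[4, 0], [0, 2]]

first_difference = mv_difference(first_mv, second_mv)
second_difference = residual_difference(first_res, second_res)
```

- because the two blocks are similar, both results consist mostly of zeros, which is what makes them cheaper to encode than the original second encoding parameter.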
- FIG. 8 is a schematic diagram of difference calculation, where the difference may be a difference between encoding parameters of image blocks of an image frame.
- the first image block and the second image block may be image blocks with relatively similar image features in the first image frame and the second image frame, and the above processing for the first image block and the second image block may be performed on part or all of the image blocks in the first image frame and the second image frame. Furthermore, difference information of multiple groups of image blocks can be obtained.
- Figure 9 and Figure 10 are schematic diagrams of difference information calculation, wherein Figure 9 shows a schematic diagram of calculating difference information based on all image blocks of two image frames; specifically,
- the image block 1 of the first image frame and the image block 1 of the second image frame can be calculated to obtain the difference information 1
- the image block 2 of the first image frame and the image block 2 of the second image frame can be calculated to obtain the difference information 2
- the image block 3 of the first image frame and the image block 3 of the second image frame can be calculated to obtain the difference information 3
- the image block 16 of the first image frame and the image block 16 of the second image frame can be calculated to obtain the difference information 16.
- FIG. 10 shows a schematic diagram of calculating difference information based on some image blocks of two image frames.
- image block 1 of the first image frame and image block 1 of the second image frame can be calculated to obtain difference information 1
- image block 2 of the first image frame and image block 2 of the second image frame can be calculated to obtain difference information 2
- image block 3 of the first image frame and image block 3 of the second image frame can be calculated to obtain difference information 3
- the image block 4 of the first image frame and the image block 4 of the second image frame can be calculated to obtain the difference information 4
- other image blocks of the first image frame and the second image frame do not participate in the difference information calculation.
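- the partial case of FIG. 10 can be sketched as below, assuming each frame is held as a hypothetical mapping from block index to motion vector and that the set of participating blocks has already been chosen (for example by an image-feature comparison):

```python
def block_differences(first_frame, second_frame, selected):
    """Difference information only for the selected block indices."""
    diffs = {}
    for index in selected:
        fy, fx = first_frame[index]
        sy, sx = second_frame[index]
        diffs[index] = (sy - fy, sx - fx)  # second minus first (assumed convention)
    return diffs


# Hypothetical motion vectors per block index; block 5 differs too much to be paired.
first_frame = {1: (2, 0), 2: (2, 1), 3: (0, 0), 4: (1, 1), 5: (9, 9)}
second_frame = {1: (2, 0), 2: (3, 1), 3: (0, 0), 4: (1, 0), 5: (-4, 6)}

diffs = block_differences(first_frame, second_frame, selected=[1, 2, 3, 4])
```

- blocks outside `selected` keep their original encoding parameters and produce no difference information.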
- the encoded data may be used by the decoding side to restore and obtain the second encoding parameter.
- the difference information may be encoded into encoded data, where the encoded data may be encoded data of the second video.
- the first difference information (which is used to indicate the difference between the first motion vector and the second motion vector) may be encoded by lossless compression coding.
- the first difference information represents the difference between motion vectors
- the motion vector is used in inter-frame prediction
- if lossy compression is used to compress the first difference information, the result of inter-frame prediction at decoding may deviate
- (for example, artifacts appear)
- lossless compression is therefore used to compress the first difference information, which can increase the accuracy and effect of inter-frame prediction.
- the second difference information (which is used to indicate the difference between the first residual and the second residual) may be encoded into the encoded data by lossless compression or lossy compression.
- lossless compression may be lossless compression including transformation (retaining all coefficients), scanning, and entropy coding
- lossy compression may be lossy compression including transformation (reserving low-frequency coefficients), quantization, scanning, and entropy coding.
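- the distinction between retaining all transform coefficients and retaining only low-frequency coefficients can be sketched with a one-dimensional orthonormal DCT written out by hand; the sample values and the choice of keeping four coefficients are assumptions, and the scanning and entropy coding stages are omitted:

```python
import math


def dct(x):
    """Orthonormal one-dimensional DCT-II."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out


def idct(c):
    """Inverse of the orthonormal DCT-II above."""
    n = len(c)
    out = []
    for i in range(n):
        value = c[0] * math.sqrt(1 / n)
        value += sum(c[k] * math.sqrt(2 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                     for k in range(1, n))
        out.append(value)
    return out


def keep_low(coeffs, count):
    """Zero every coefficient after the first `count` (low-frequency) ones."""
    return coeffs[:count] + [0.0] * (len(coeffs) - count)


residual_diff = [3, 1, 0, 0, -1, 0, 2, 0]  # hypothetical second difference information
coeffs = dct(residual_diff)

# Retaining all coefficients: the integer samples are recovered exactly after rounding.
lossless = [round(v) for v in idct(coeffs)]

# Retaining only low-frequency coefficients: the reconstruction is an approximation.
lossy = idct(keep_low(coeffs, 4))
lossy_error = sum((a - b) ** 2 for a, b in zip(residual_diff, lossy))
```

- the squared error of the lossy path equals the energy of the discarded coefficients, which is the price paid for having fewer values to encode.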
- the difference information may be used as the encoding parameter of the second image block (replacing the original second encoding parameter), and the data of the second video including the encoding parameter of the second image block is encoded to obtain the encoded data of the second video.
- a transformation may be applied to the second difference information, such as a discrete cosine transform (DCT), a discrete sine transform (DST), or a conceptually similar transformation, thereby generating a video block including the transform coefficient values of the second difference information.
- DCT discrete cosine transform
- DST discrete sine transform
- the transformation can convert the second difference information from the pixel value domain to the transformation domain, such as the frequency domain.
- a scaling transformation based on frequency etc. may also be used for the second difference information. This scaling involves applying a scaling factor to the second difference information in order to quantize different frequency information at different granularities, which may affect the final visual quality of the reconstructed video.
- The transform coefficients can also be quantized to further reduce the bit rate.
- the quantization process can reduce the bit depth associated with some or all coefficients.
- the degree of quantization can be modified by adjusting the quantization parameter.
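- a minimal sketch of uniform quantization follows; it uses a hypothetical quantization step directly instead of mapping a quantization parameter to a step size as a real codec would, and the coefficient values are made up:

```python
def quantize(coeffs, step):
    """Map each coefficient to an integer level; a larger step gives coarser levels."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    """Approximate the original coefficients from the integer levels."""
    return [level * step for level in levels]


coeffs = [52.0, -13.0, 7.0, 3.0, -1.0]  # hypothetical transform coefficients

fine = quantize(coeffs, step=4)     # small step: more levels survive
coarse = quantize(coeffs, step=16)  # large step: most levels become zero
```

- the coarse levels contain more zeros and smaller magnitudes, so they need fewer bits after entropy coding, while `dequantize(coarse, 16)` is further from the original coefficients than `dequantize(fine, 4)`.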
- the transform scaling and quantization component can then scan the matrix including the quantized transform coefficients.
- the quantized transform coefficients are forwarded to encoding components (such as header formatting and CABAC components) for encoding in encoded data.
- the first difference information may be sent to encoding compression components (eg, header formatting and CABAC components) for encoding in the encoded data.
- the first difference information and the second difference information may be directly sent to a coding compression component (such as a header formatting and CABAC component).
- the code compression component can receive data from various components of the codec system and encode such data into coded data for transmission to the decoder.
- Such data can be encoded by employing entropy coding.
- CAVLC context-adaptive variable length coding
- CABAC context-adaptive binary arithmetic coding
- SBAC syntax-based context-adaptive binary arithmetic coding
- PIPE probability interval partitioning entropy coding
- the encoded data can be transmitted to another device (eg, a video decoder) or archived for later transmission or retrieval.
- the encoded data and the first encoding parameter may be sent to the decoding side, so that the decoding side can obtain the second encoding parameter; alternatively, the encoded data may be stored locally for later transmission or retrieval.
- the above encoded data may be encapsulated, and correspondingly, the encapsulated encoded data may be sent to the decoding side, so that the decoding side may obtain the second encoding parameter according to the encapsulated encoded data; or, the encapsulated encoded data can be stored locally for later transmission or retrieval.
- the above encoded data obtained by encoding based on difference information and the first encoding parameter of the first image block may be obtained.
- the encoded data of the second video and the encoding parameters of the first video may be acquired.
- the encoded data of the first video may be decapsulated and decoded to obtain encoding parameters of the first video (for example, may include first encoding parameters of the first image block).
- the coded data of the second video may be decapsulated and decoded to obtain difference information.
- difference information: for a description of the difference information, reference may be made to the description in the foregoing embodiments, and details are not repeated here.
- the first encoding parameter and difference information may be summed to obtain the second encoding parameter.
- the first coding parameter may include a first motion vector
- the second coding parameter may include a second motion vector
- the difference information includes first difference information
- the first difference information is used to indicate the difference between the first motion vector and the second motion vector
- the first motion vector and the first difference information may be summed to obtain the second motion vector.
- the first encoding parameter may include a first residual
- the second encoding parameter may include a second residual
- the difference information includes second difference information
- the second difference information is used to indicate the difference between the first residual and the second residual
- the first residual and the second difference information may be summed to obtain the second residual
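- the restoration on the decoding side can be sketched as follows, assuming the difference information was formed as second minus first (an assumption; the text above only states that summation restores the second encoding parameter) and using made-up values:

```python
def restore_mv(first_mv, first_difference):
    """Second motion vector = first motion vector + first difference information."""
    return (first_mv[0] + first_difference[0], first_mv[1] + first_difference[1])


def restore_residual(first_res, second_difference):
    """Second residual = first residual + second difference information, sample by sample."""
    return [[f + d for f, d in zip(f_row, d_row)]
            for f_row, d_row in zip(first_res, second_difference)]


# Hypothetical decoded inputs: parameters of the first image block plus difference information.
first_mv = (3, -2)
first_res = [[4, -1], [0, 2]]
first_difference = (0, 1)
second_difference = [[0, 1], [0, 0]]

second_mv = restore_mv(first_mv, first_difference)
second_res = restore_residual(first_res, second_difference)
```

- as long as the difference information was compressed losslessly, the restored values match the original second encoding parameter exactly.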
- the restored second encoding parameters may be encoded and packaged to obtain original encoded data of the second video.
- FIG. 11 is a schematic diagram of restoring the second video, where the difference may be the difference between encoding parameters of image blocks of the image frame.
- An embodiment of the present application provides a video encoding method, the method comprising: acquiring a first encoding parameter of a first image block and a second encoding parameter of a second image block, where the similarity of the image features of the first image block and the second image block is greater than a threshold, and the first encoding parameter and the second encoding parameter include motion vectors and/or residuals;
- obtaining difference information according to the first encoding parameter and the second encoding parameter, where the difference information is used to indicate the difference between the first encoding parameter and the second encoding parameter; and encoding the difference information to obtain encoded data.
- the original second encoding parameter is replaced with the difference information.
- the bitstream data obtained by encoding the difference information will be much smaller than the bitstream data obtained by encoding the second encoding parameter, and the second encoding parameter can be restored based on the difference information and the first encoding parameter. This reduces the storage resources required for storing the video and the bandwidth required for video transmission, on the premise that the complete second image block can be recovered.
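The difference-and-restore relationship described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `CodingParams` container, the function names, and the sample values are all hypothetical, and the residual is assumed to be a flat list of integer samples.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CodingParams:
    """Hypothetical container for the encoding parameters of one image block."""
    motion_vector: Tuple[int, int]  # (dy, dx)
    residual: List[int]             # flattened residual samples


def difference_info(first: CodingParams, second: CodingParams) -> CodingParams:
    """Encoder side: difference = second - first, component by component."""
    mv = (second.motion_vector[0] - first.motion_vector[0],
          second.motion_vector[1] - first.motion_vector[1])
    res = [s - f for f, s in zip(first.residual, second.residual)]
    return CodingParams(mv, res)


def restore_second(first: CodingParams, diff: CodingParams) -> CodingParams:
    """Decoder side: second = first + difference."""
    mv = (first.motion_vector[0] + diff.motion_vector[0],
          first.motion_vector[1] + diff.motion_vector[1])
    res = [f + d for f, d in zip(first.residual, diff.residual)]
    return CodingParams(mv, res)


# Two similar blocks yield a difference that is mostly zeros,
# which is cheaper to entropy-code than the full second parameter.
first = CodingParams((3, -2), [10, 12, 9, 11])
second = CodingParams((4, -2), [10, 13, 9, 10])
diff = difference_info(first, second)
restored = restore_second(first, diff)
```

When the difference is stored without loss, the restored parameter equals the original second encoding parameter exactly.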
- the encoded data of the second video may be obtained.
- step 1201 may be performed after, before or at the same time as step 1202, which is not limited here.
- the coded data of the second video may be decapsulated and decoded to obtain difference information.
- For a description of the difference information, reference may be made to the description in the foregoing embodiments, and details are not repeated here.
- the similarity of the image features of the first image block and the second image block is greater than a threshold, and the image features include at least one of the following: color features, texture features, shape features, and spatial relationship features.
- the first image block is included in a first image frame
- the second image block is included in a second image frame
- the first image frame and the second image frame are different image frames.
- the first image block is included in a first image frame in a first video
- the second image block is included in a second image frame in a second video
- the first video and the second video are different video files.
- the indication information is used to indicate that there is an association between the first image block and the second image block, where the first image block belongs to a first video file, the second image block belongs to a second video file, and the first video file and the second video file are different video files.
- the first encoding parameter and difference information may be summed to obtain the second encoding parameter.
- the first encoding parameter may include a first motion vector
- the second encoding parameter may include a second motion vector
- the difference information includes first difference information
- the first difference information is used to indicate the difference between the first motion vector and the second motion vector
- the first motion vector and the first difference information may be summed to obtain the second motion vector.
- the first encoding parameter may include a first residual
- the second encoding parameter may include a second residual
- the difference information includes second difference information
- the second difference information is used to indicate the difference between the first residual and the second residual
- the first residual and the second difference information may be summed to obtain the second residual
- the restored second encoding parameters may be encoded and packaged to obtain original encoded data of the second video.
- the difference information may be the exact difference between the first encoding parameter of the first image block and the second encoding parameter of the second image block; if the difference information is compressed using lossy coding, there may be a certain error between the difference information and the exact difference between the first encoding parameter of the first image block and the second encoding parameter of the second image block, and consequently a certain error between the second encoding parameter obtained from the difference information and the real encoding parameter of the second image block.
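The effect of lossy compression on the restored parameter can be illustrated with a small sketch. Uniform scalar quantization is used here purely as an assumed stand-in for "lossy coding" (the source does not name a specific scheme), and the sample values and the step size are hypothetical.

```python
def quantize(diff, step):
    """Map each difference sample to an integer level (lossy when step > 1)."""
    return [round(d / step) for d in diff]


def dequantize(levels, step):
    """Recover an approximate difference from the quantization levels."""
    return [q * step for q in levels]


first_residual = [10, 12, 9, 11]
second_residual = [13, 7, 9, 18]

# Exact difference between the two residuals.
exact_diff = [s - f for f, s in zip(first_residual, second_residual)]

# Lossy path: the decoder only sees the dequantized difference.
step = 4
lossy_diff = dequantize(quantize(exact_diff, step), step)
restored = [f + d for f, d in zip(first_residual, lossy_diff)]

# The restoration error is bounded by half the quantization step.
max_error = max(abs(r - s) for r, s in zip(restored, second_residual))

# Lossless path (step of 1 on integer samples): the difference survives intact.
lossless_diff = dequantize(quantize(exact_diff, 1), 1)
```

With this scheme, a larger step shrinks the encoded difference further but widens the gap between the restored and the real second encoding parameter.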
- FIG. 11 is a schematic diagram of restoring the second video, where the difference may be the difference between encoding parameters of image blocks of the image frame.
- An embodiment of the present application provides a video decoding method, the method comprising: acquiring encoded data; decoding the encoded data to obtain difference information; decoding the first image block according to indication information to obtain a first encoding parameter, where the indication information is used to indicate that there is an association between the first image block and the second image block, the first image block belongs to a first video file, the second image block belongs to a second video file, and the first video file and the second video file are different video files; and obtaining a second encoding parameter according to the first encoding parameter and the difference information, where the first encoding parameter and the second encoding parameter include motion vectors and/or residuals.
- the original second encoding parameter is replaced with the difference information. Since the similarity of the image features of the first image block and the second image block is greater than the threshold, the encoded data obtained by encoding the difference information will be much smaller than the encoded data obtained by encoding the second encoding parameter, and the second encoding parameter can be restored based on the difference information and the first encoding parameter. This reduces the storage resources required for storing the video and the bandwidth required for video transmission, on the premise that the complete second image block can be recovered.
- FIG. 13 is a schematic flowchart of a video coding method provided by the embodiment of the present application.
- a video coding method provided by the embodiment of the present application may include:
- the coded data of the first video and the second video may be decoded by a decoder to obtain a video signal of the first video and a video signal of the second video, where the video signal of the first video may include the first image block, and the video signal of the second video may include the second image block.
- the decoder may decapsulate the video files of the first video and the second video.
- the first video and the second video can be different videos, or different parts of the same video (for example, some videos contain a large number of repeated video segments, in which case the first video and the second video can be two repeated video clips in that video), which is not limited here.
- the first video may include a first image frame
- the second video may include a second image frame
- the first image frame may include a plurality of image blocks (including the first image block)
- the second image frame may include a plurality of image blocks (including the second image block), where the first image block and the second image block may be block units such as an MB, a prediction block (partition), a CU, a PU, or a TU, which is not limited here.
- the similarity of image features between the first image block and the second image block is relatively high, where the image features of an image block can be one or more of the color features, texture features, shape features, etc. of the image block.
- the color feature and texture feature are used to describe the surface properties of the object corresponding to the image block.
- the shape features include contour features and region features, the contour features include the outer boundary features of the object, and the region features include the shape region features of the object.
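As one possible way to make the "similarity greater than a threshold" test concrete, the sketch below compares two blocks by a color (intensity) histogram. The source does not specify how similarity is measured; histogram intersection, the bin count, and the 0.8 threshold are assumptions chosen only for illustration.

```python
def color_histogram(block, bins=4, max_value=256):
    """Normalized intensity histogram of a 2-D block of 8-bit samples."""
    hist = [0] * bins
    for row in block:
        for value in row:
            hist[value * bins // max_value] += 1
    total = sum(hist)
    return [count / total for count in hist]


def histogram_similarity(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))


THRESHOLD = 0.8  # hypothetical value; the source only says "a threshold"

block_a = [[10, 20], [200, 210]]
block_b = [[12, 25], [198, 220]]    # similar content, e.g. from another video file
block_c = [[100, 110], [120, 130]]  # different content

sim_ab = histogram_similarity(color_histogram(block_a), color_histogram(block_b))
sim_ac = histogram_similarity(color_histogram(block_a), color_histogram(block_c))

is_candidate_ab = sim_ab > THRESHOLD  # block_a may serve as a reference for block_b
is_candidate_ac = sim_ac > THRESHOLD  # block_a is not a useful reference for block_c
```

Texture, shape, or spatial relationship features could be compared in the same pattern by swapping in a different feature extractor.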
- the first image block may be an image block in an I frame
- the second image block may be an image block in a P frame or a B frame
- the first image block may be an image block in a P frame
- the second image block is an image block in a P frame or a B frame.
- the video signal of the second video may be encoded based on the video signal of the first video.
- the encoding parameter of the second image block may be determined by using the first image block as a reference block of the second image block.
- the first image frame in the first video and the second image frame in the second video may be acquired, where the similarity of the image features of the first image frame and the second image frame is greater than a threshold and the first video and the second video are different videos, and the first image frame is used as a reference frame of the second image frame to determine the first encoding parameter of the second image frame.
- the first image block may be an image block of an I frame in the first video
- the second image block may be an image block of a P frame or a B frame in the second video.
- the first image block may be an image block of a P frame in the first video
- the second image block may be an image block of a P frame or a B frame in the second video.
- the first encoding parameter may be residual, motion vector and other syntax information, which is not limited here.
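A minimal sketch of determining a motion vector and a residual for the second image block when the reference frame comes from the first video is given below. Full-search block matching with a sum of absolute differences (SAD) cost is an assumed, commonly used technique, not something the source prescribes; the frames, the block size, and the search range are hypothetical.

```python
def get_block(frame, top, left, size):
    """Extract a size x size block from a 2-D frame."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def estimate_params(ref_frame, cur_frame, top, left, size, search):
    """Full search in ref_frame around (top, left); returns (motion_vector, residual)."""
    cur = get_block(cur_frame, top, left, size)
    height, width = len(ref_frame), len(ref_frame[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue  # candidate falls outside the reference frame
            cand = get_block(ref_frame, t, l, size)
            cost = sad(cur, cand)
            if best is None or cost < best[0]:
                best = (cost, (dy, dx), cand)
    _, mv, ref = best
    residual = [[c - r for c, r in zip(rc, rr)] for rc, rr in zip(cur, ref)]
    return mv, residual


# Reference frame taken from the first video, current frame from the second video.
first_video_frame = [[0, 0, 0, 0], [50, 60, 0, 0], [70, 80, 0, 0], [0, 0, 0, 0]]
second_video_frame = [[0, 0, 0, 0], [0, 50, 61, 0], [0, 70, 80, 0], [0, 0, 0, 0]]

mv, residual = estimate_params(first_video_frame, second_video_frame,
                               top=1, left=1, size=2, search=1)
```

Because the two blocks are nearly identical, the residual is almost all zeros, which is what makes the cross-video reference worthwhile.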
- the first image block may be used as a reference block, and the first image frame containing the reference block may be used as a reference frame to predict the second image block in the current frame (that is, the second image frame including the second image block). Optionally, the first image frame may be temporally located after the current frame, or the current frame may be temporally located between a previous reference frame that appears before the current frame in the video sequence and a subsequent reference frame that appears after the current frame in the video sequence (the first image frame can be either the previous reference frame or the subsequent reference frame).
- a temporal distance (TD) may indicate the amount of time between a current frame and a reference frame in a video sequence, and may be measured in frames.
- the prediction information of the current block may refer to a reference frame and/or a reference block through a reference index indicating a direction and a temporal distance between frames.
- objects in the current block move from one location in the current frame to another location in the reference frame (e.g., the location of the reference block).
- an object may move along a motion trajectory, which is the direction in which the object moves over time.
- Motion vectors describe the direction and magnitude an object moves within a TD along a motion trajectory.
- the encoded motion vectors and reference blocks provide sufficient information to reconstruct and position the current block in the current frame.
- the current block is matched with a previous reference block in a previous reference frame and a subsequent reference block in a subsequent reference frame. This match indicates that, during the course of the video sequence, the object moved along the motion trajectory and via the current block from a position at a previous reference block to a position at a subsequent reference block.
- the current frame is spaced from a previous reference frame by a previous temporal distance (TD0), and from a subsequent reference frame by a subsequent temporal distance (TD1).
- TD0 indicates the amount of time in frames between the previous reference frame and the current frame in the video sequence.
- TD1 indicates the amount of time in frames between the current frame and the subsequent reference frame in the video sequence.
- the object moves from the previous reference block to the current block along the motion trajectory within the time period indicated by TD0.
- the object also moves along the motion trajectory from the current block to the subsequent reference block within the time period indicated by TD1.
- the prediction information of the current block may refer to a previous reference frame and/or a previous reference block and a subsequent reference frame and/or a subsequent reference block through a pair of reference indices indicating a direction and a temporal distance between frames.
- a previous motion vector (MV0) describes the direction and magnitude of an object's movement along a motion trajectory within TD0 (e.g., between a previous reference frame and the current frame).
- the subsequent motion vector (MV1) describes the direction and magnitude of the object's movement along the motion trajectory within TD1 (e.g., between the current frame and the subsequent reference frame). Therefore, in bi-directional inter prediction, a current block can be encoded and reconstructed by employing previous and/or subsequent reference blocks, MV0 and MV1.
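The bi-directional case can be sketched as follows: the current block is predicted from one block in the previous reference frame (located by MV0) and one block in the subsequent reference frame (located by MV1). Combining the two with an equal-weight rounded average is an assumption made for illustration; the frames and vectors are hypothetical, with TD0 and TD1 both equal to one frame.

```python
def fetch_block(frame, top, left, size):
    """Extract a size x size block from a 2-D frame."""
    return [row[left:left + size] for row in frame[top:top + size]]


def bi_predict(prev_frame, next_frame, top, left, size, mv0, mv1):
    """Predict the current block at (top, left).

    mv0 and mv1 are (dy, dx) displacements from the current block position
    into the previous and the subsequent reference frame, respectively.
    """
    b0 = fetch_block(prev_frame, top + mv0[0], left + mv0[1], size)
    b1 = fetch_block(next_frame, top + mv1[0], left + mv1[1], size)
    # Equal-weight average with rounding (an assumed combination rule).
    return [[(a + b + 1) // 2 for a, b in zip(r0, r1)] for r0, r1 in zip(b0, b1)]


# A 2x2 object of intensity 100 moves one sample to the right per frame.
prev_frame = [[0, 0, 0, 0], [100, 100, 0, 0], [100, 100, 0, 0], [0, 0, 0, 0]]
next_frame = [[0, 0, 0, 0], [0, 0, 100, 100], [0, 0, 100, 100], [0, 0, 0, 0]]

# In the current frame the object sits at rows 1-2, columns 1-2.
prediction = bi_predict(prev_frame, next_frame, top=1, left=1, size=2,
                        mv0=(0, -1), mv1=(0, 1))
```

The encoder would then only need to code MV0, MV1, the reference indices, and whatever residual remains between this prediction and the actual current block.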
- the first encoding parameter can be encoded into the encoded data of the second video; for a specific description of this encoding, reference may be made to the description of the encoder in the foregoing embodiments, and details are not repeated here.
- the image content similarity between the first image frame and the second image frame can be relatively high.
- an image block with higher image content similarity is used as a reference block when encoding the second image frame.
- image blocks with higher image content similarity in the other image frames and the second image frame may be used as reference blocks when encoding the second image frame.
- the method further includes: acquiring a third image block in a third video and a fourth image block in the second video, where the similarity of the image features of the third image block and the fourth image block is greater than a threshold, the third video and the second video are different video files, and the second image block and the fourth image block belong to the same image frame in the second video; using the third image block as a reference block of the fourth image block to determine a second encoding parameter of the fourth image block; and encoding the second encoding parameter.
- the image blocks with higher image content similarity in the above-mentioned other image frames and the second image frame can be used as reference blocks when encoding the second image frame.
- the second image frame may also include image blocks other than the second image block (such as the fourth image block), and in the process of encoding the fourth image block, an image block in the third video may be used as the reference block (the third image block) of the fourth image block.
- the second encoding parameter and second indication information may be encoded, where the second indication information is used to indicate that the reference block of the fourth image block is the third image block.
- the fifth image block in the first video and the sixth image block in the second video may be acquired, where the second image block and the sixth image block belong to the same image frame in the second video, and the similarity of the image features of the fifth image block and the sixth image block is greater than a threshold; the first image block and the fifth image block belong to the same image frame or to different image frames in the first video; the fifth image block is used as a reference block of the sixth image block to determine a third encoding parameter of the sixth image block; and the third encoding parameter is encoded.
- the image block with a high image content similarity between the other image frame and the second image frame may be used as a reference block when encoding the second image frame.
- the second image frame may also include image blocks other than the second image block (such as the sixth image block), and in the process of encoding the sixth image block, an image block in the first video may be used as the reference block (the fifth image block) of the sixth image block; the reference block can be an image block in the first image frame (the image frame where the first image block is located), and the reference block can also be an image block in another image frame of the first video.
- the third encoding parameter and third indication information may also be encoded, where the third indication information is used to indicate that the reference block of the sixth image block is the fifth image block.
- the relevant information of the image block originally used as the reference block of the second image block in the group of pictures (GOP) where the second image block is located need not be encoded into the encoded data, which reduces the size of the encoded data, the storage space required for storing the second video, and the bandwidth required for transmitting the second video.
- in order for the decoder to decode and obtain a complete and accurate video signal of the second video, first indication information also needs to be encoded into the encoded data, where the first indication information is used to indicate that the reference block of the second image block is the first image block.
- the decoder may know that the reference block of the second image block is the first image block when decoding.
- FIG. 14 is a schematic diagram of an encoding process, in which some frames in the second video (belonging to the same group of pictures, GOP) can refer to an I frame in the first video; that is, the I frame in the first video serves as a reference frame for some frames in the second video.
- the encoded data can be obtained and decoded to obtain the first encoding parameter of the second image block and first indication information, where the first indication information is used to indicate that the second image block is associated with the first image block (optionally, the first indication information is used to indicate that the reference block of the second image block is the first image block in the first video), and the first video and the second video are different videos; the first image block is obtained according to the first indication information, and the second image block is reconstructed according to the first encoding parameter and the first image block.
- the first indication information may indicate that the reference frame of the second image frame is the first image frame; further, the first image frame is acquired according to the first indication information, and the second image frame is reconstructed according to the encoding parameters of the second image frame and the first image frame.
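The decoder-side use of the first indication information can be sketched as a lookup followed by a reconstruction. The dictionary layout of the indication information, the video identifiers, and the frame store are hypothetical; the source only states that the indication information identifies the reference block (or frame) in a different video file.

```python
def reconstruct_block(decoded_videos, indication, residual):
    """Rebuild an image block from a reference block in another video plus a residual.

    decoded_videos: {video_id: {frame_index: 2-D frame}} of already decoded frames.
    indication: hypothetical first indication information naming the reference block.
    """
    frame = decoded_videos[indication["ref_video"]][indication["ref_frame"]]
    top, left = indication["ref_top"], indication["ref_left"]
    size = len(residual)
    reference = [row[left:left + size] for row in frame[top:top + size]]
    return [[r + d for r, d in zip(ref_row, res_row)]
            for ref_row, res_row in zip(reference, residual)]


# The first video has been decoded already and is available to the decoder.
decoded_videos = {
    "video_1": {0: [[0, 0, 0, 0], [50, 60, 0, 0], [70, 80, 0, 0], [0, 0, 0, 0]]},
}

# Parsed from the encoded data of the second video (assumed layout).
first_indication = {"ref_video": "video_1", "ref_frame": 0, "ref_top": 1, "ref_left": 0}
residual = [[0, 1], [0, 0]]

second_block = reconstruct_block(decoded_videos, first_indication, residual)
```

A consequence of this design is that the second video cannot be decoded in isolation: the referenced frames of the first video must be retrievable whenever the second video is played back.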
- An embodiment of the present application provides a video encoding method, the method comprising: acquiring a first image block in a first video and a second image block in a second video, where the similarity of the image features of the first image block and the second image block is greater than a threshold, and the first video and the second video are different video files; using the first image block as a reference block of the second image block to determine a first encoding parameter of the second image block; and encoding the first encoding parameter into the encoded data.
- the relevant information of the image block originally used as the reference block of the second image block in the group of pictures (GOP) where the second image block is located need not be encoded into the encoded data, which reduces the size of the encoded data, the storage space required for storing the second video, and the bandwidth required for transmitting the second video.
- an embodiment of the present application also provides a video decoding method, the method comprising: acquiring encoded data; decoding the encoded data to obtain a first encoding parameter of a second image frame in a second video and first indication information, where the first indication information is used to indicate that the reference frame of the second image frame is a first image frame in a first video, and the first video and the second video are different video files; acquiring the first image frame according to the first indication information; and reconstructing the second image frame according to the first encoding parameter and the first image frame.
- the image features include at least one of the following: color features, texture features, shape features, and spatial relationship features.
- the first image frame is an I frame
- the second image frame is a P frame or a B frame
- the first image frame is a P frame
- the second image frame is a P frame or a B frame.
- FIG. 15 is a flowchart of a video decoding method provided by an embodiment of the present application. As shown in FIG. 15, the method includes:
- the first indication information is used to indicate that the reference block of the second image block is the first image block. Furthermore, the decoder may know that the reference block of the second image block is the first image block when decoding.
- the similarity of the image features of the first image block and the second image block is greater than a threshold, and the image features include at least one of the following: color features, texture features, shape features, and spatial relationship features.
- the first image block is an image block in an I frame
- the second image block is an image block in a P frame or a B frame
- the first image block is an image block in a P frame
- the second image block is an image block in a P frame or a B frame.
- the indication information is used to indicate that there is an association between the first image block and the second image block, where the first image block belongs to the first video file, the second image block belongs to the second video file, and the first video file and the second video file are different video files.
- the first indication information may indicate that the reference frame of the second image frame is the first image frame; further, the first image frame is acquired according to the first indication information, and the second image frame is reconstructed according to the encoding parameters of the second image frame and the first image frame.
- the present application provides a video encoding method, the method comprising: acquiring a first image frame in a first video and a second image frame in a second video, where the similarity of the image features of the first image frame and the second image frame is greater than a threshold, and the first video and the second video are different video files; using the first image frame as a reference frame of the second image frame to determine a first encoding parameter of the second image frame; and encoding the first encoding parameter into the encoded data.
- the relevant information of the image block originally used as the reference block of the second image block in the group of pictures (GOP) where the second image block is located need not be encoded into the encoded data, which reduces the size of the encoded data, the storage space required for storing the second video, and the bandwidth required for transmitting the second video.
- FIG. 16b is a schematic flowchart of a video encoding method in this application.
- an embodiment of the present application also provides a video decoding method, the method comprising: acquiring encoded data; decoding the encoded data to obtain a first encoding parameter of a second image frame in a second video and first indication information, where the first indication information is used to indicate that the reference frame of the second image frame is a first image frame in a first video, and the first video and the second video are different video files; acquiring the first image frame according to the first indication information; and reconstructing the second image frame according to the first encoding parameter and the first image frame.
- the image features include at least one of the following: color features, texture features, shape features, and spatial relationship features.
- the first image frame is an I frame
- the second image frame is a P frame or a B frame
- the first image frame is a P frame
- the second image frame is a P frame or a B frame.
- the video coding apparatus 1700 of the embodiment of the present application will be introduced below with reference to FIG. 17 . It should be understood that the video coding apparatus shown in FIG. 17 can execute the steps in the video coding method of the embodiment of the present application. In order to avoid unnecessary repetition, repeated descriptions are appropriately omitted when introducing the video encoding apparatus according to the embodiment of the present application below.
- Fig. 17 is a schematic block diagram of a video encoding device according to an embodiment of the present application.
- the video encoding device 1700 shown in FIG. 17 may include:
- An acquisition module 1701 configured to acquire a first encoding parameter of a first image block and a second encoding parameter of a second image block, where the first image block is an image block in a first video file, the second image block is an image block in a second video file, the first video file and the second video file are different video files, and the first encoding parameter and the second encoding parameter include motion vectors and/or residuals;
- a difference determination module 1702 configured to obtain difference information according to the first encoding parameter and the second encoding parameter, where the difference information is used to indicate the difference between the first encoding parameter and the second encoding parameter;
- An encoding module 1703 configured to encode the difference information to obtain encoded data.
- the acquiring module is specifically used for:
- the image features include at least one of the following:
- the first image block is included in a first image frame
- the second image block is included in a second image frame
- the first image frame and the second image frame are different image frames.
- the first image block is included in a first image frame in a first video
- the second image block is included in a second image frame in a second video
- the first video and the second video are different video files.
- the first encoding parameter may include a first motion vector
- the second encoding parameter may include a second motion vector
- the difference information includes first difference information
- the first difference information is a difference between the first motion vector and the second motion vector.
- the first coding parameter may include a first residual
- the second coding parameter may include a second residual
- the difference information includes second difference information, and the second difference information is a difference between the first residual and the second residual.
- the first coding parameter may include a first motion vector and a first residual
- the second coding parameter may include a second motion vector and a second residual
- the difference information includes first difference information and second difference information
- the first difference information is the difference between the first motion vector and the second motion vector
- the second difference information is the difference between the first residual and the second residual.
- the first coding parameter includes a first residual
- the second coding parameter includes a second residual
- the difference information includes second difference information
- the second difference information is used to indicate the difference between the first residual and the second residual
- the coding module is specifically used for:
- the second difference information is encoded by lossless compression coding or lossy compression coding.
- the first encoding parameter includes a first motion vector
- the second encoding parameter includes a second motion vector
- the difference information includes first difference information
- the first difference information is used to indicate the difference between the first motion vector and the second motion vector
- the coding module is specifically used to:
- the first difference information is encoded by lossless compression encoding.
- the device also includes:
- a sending module configured to send the encoded data and the first encoding parameter to a decoding side, so that the decoding side obtains the second encoding parameter according to the encoded data and the first encoding parameter;
- a storage module configured to store the encoded data.
- FIG. 18 is a schematic block diagram of a video decoding device according to an embodiment of the present application.
- the video decoding device 1800 shown in FIG. 18 may include:
- a decoding module 1802 configured to decode the encoded data to obtain difference information
- For the description of the decoding module 1802, reference may be made to the description of step 1202 in the foregoing embodiment, and details are not repeated here.
- the obtaining module 1801 is further configured to obtain indication information, where the indication information is used to indicate that the difference information is obtained according to the difference between the first encoding parameter of the first image block and the second encoding parameter of the second image block
- the first image block belongs to a first video file
- the second image block belongs to a second video file
- the first video file and the second video file are different video files
- the second encoding parameters include motion vectors and/or residuals
- a coding parameter restoration module 1803, configured to obtain the second coding parameter according to the first coding parameter and the difference information.
- the similarity of the image features of the first image block and the second image block is greater than a threshold, and the image features include at least one of the following:
- the first image block is included in a first image frame
- the second image block is included in a second image frame
- the first image frame and the second image frame are different image frames.
- the first image block is included in a first image frame in a first video
- the second image block is included in a second image frame in a second video
- the first video and the second video are different video files.
- Fig. 19 is a schematic block diagram of a video encoding device according to an embodiment of the present application.
- the video encoding device 1900 shown in FIG. 19 may include:
- An acquisition module 1901 configured to acquire a first image block in a first video and a second image block in a second video, where the first video and the second video are different video files;
- An encoding parameter determination module 1902 configured to use the first image block as a reference block of the second image block, and determine a first encoding parameter of the second image block;
- An encoding module 1903 configured to encode the first encoding parameter.
- the image features include at least one of the following:
- the first image block is an image block in an I frame
- the second image block is an image block in a P frame or a B frame; or,
- the first image block is an image block in a P frame
- the second image block is an image block in a P frame or a B frame.
- the coding module is specifically used for:
- the acquisition module is also used to:
- the similarity of the image features of the third image block and the fourth image block is greater than a threshold, the third video and the second video are different video files, and the second image block and the fourth image block belong to the same image frame in the second video;
- the encoding parameter determination module is further configured to use the third image block as a reference block of the fourth image block to determine a second encoding parameter of the fourth image block;
- the encoding module is further configured to encode the second encoding parameters.
- the encoding module is further configured to encode the second encoding parameter and second indication information, where the second indication information is used to indicate that the reference block of the fourth image block is the third image block.
- the acquisition module is also used to:
- the encoding parameter determination module is further configured to use the fifth image block as a reference block of the sixth image block to determine a third encoding parameter of the sixth image block; the encoding module is further configured to encode the third encoding parameter.
- the encoding module is further configured to encode the third encoding parameter and third indication information, where the third indication information is used to indicate that the reference block of the sixth image block is the fourth image block.
- the embodiment of the present application also provides a video encoding device, the device comprising:
- An acquisition module configured to acquire a first image frame in the first video and a second image frame in the second video, where the similarity of the image features of the first image frame and the second image frame is greater than a threshold, and the first video and the second video are different video files;
- An encoding parameter determination module configured to use the first image frame as a reference frame of the second image frame, and determine a first encoding parameter of the second image frame;
- An encoding module configured to encode the first encoding parameter.
- the image features include at least one of the following:
- the first image frame is an I frame
- the second image frame is a P frame or a B frame; or,
- the first image frame is a P frame
- the second image frame is a P frame or a B frame.
- the coding module is specifically used for:
- Fig. 20 is a schematic block diagram of a video decoding device according to an embodiment of the present application.
- the video decoding device 2000 shown in FIG. 20 may include:
- An acquisition module 2001 configured to acquire coded data
- for the description of the acquisition module 2001, reference may be made to the descriptions of step 1501 and step 1503 in the above embodiment; details are not repeated here.
- a decoding module 2002 configured to decode the encoded data to obtain difference information
- for the description of the decoding module 2002, reference may be made to the description of step 1502 in the above embodiment; details are not repeated here.
- the acquisition module 2001 is further configured to acquire indication information, where the indication information indicates that the difference information is obtained according to the difference between the first image block and the second image block, the first image block belongs to a first video file, the second image block belongs to a second video file, and the first video file and the second video file are different video files;
- a reconstruction module 2003 configured to reconstruct the second image block according to the first encoding parameter and the first image block.
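The reconstruction step of device 2000 can be sketched as follows. This is a minimal illustration, assuming that decoded frames are 2D lists of samples, that the indication information names the reference file and frame, and that the encoding parameter consists of a motion vector and a residual. The dictionary keys and the function names are hypothetical.

```python
def get_block(frame, x, y, size):
    """Return the size x size block whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in frame[y:y + size]]


def reconstruct_block(reference_frame, x, y, motion_vector, residual):
    """Rebuild a block of the second video from a block of the first video."""
    size = len(residual)
    dx, dy = motion_vector
    reference = get_block(reference_frame, x + dx, y + dy, size)
    return [[r + e for r, e in zip(ref_row, res_row)]
            for ref_row, res_row in zip(reference, residual)]


def reconstruct_from_indication(decoded_videos, indication, x, y,
                                motion_vector, residual):
    """Resolve the cross-file reference named by the indication information.

    `decoded_videos` maps a file identifier to its list of decoded frames;
    `indication` is assumed to carry "reference_file" and "reference_frame".
    """
    frames = decoded_videos[indication["reference_file"]]
    reference_frame = frames[indication["reference_frame"]]
    return reconstruct_block(reference_frame, x, y, motion_vector, residual)
```

The point of the sketch is that the reference block is fetched from a different file than the one being reconstructed, which is why the decoder needs the indication information in addition to the encoding parameter.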
- the similarity of the image features of the first image block and the second image block is greater than a threshold, and the image features include at least one of the following:
- the first image block is an image block in an I frame
- the second image block is an image block in a P frame or a B frame; or,
- the first image block is an image block in a P frame
- the second image block is an image block in a P frame or a B frame.
- the decoding module is further configured to: decode the encoded data to obtain a second encoding parameter and second indication information of the fourth image block in the second video, where the second indication information is used to indicate that the reference block of the fourth image block is a third image block in a third video, and the third video and the second video are different video files;
- the acquiring module is further configured to acquire the third image block according to the second indication information
- the reconstruction module is further configured to reconstruct the fourth image block according to the second encoding parameter and the third image block.
- the decoding module is also used for:
- the acquiring module is further configured to acquire the fifth image block according to the third indication information
- the reconstruction module is further configured to reconstruct the sixth image block according to the third encoding parameter and the fifth image block.
- the embodiment of the present application also provides a video decoding device, the device comprising:
- An acquisition module used to acquire encoded data
- a decoding module configured to decode the encoded data to obtain a first encoding parameter and first indication information of the second image frame in the second video, where the first indication information is used to indicate that the reference frame of the second image frame is a first image frame in the first video, and the first video and the second video are different video files;
- the image features include at least one of the following: color features, texture features, shape features, and spatial relationship features.
- the first image frame is an I frame
- the second image frame is a P frame or a B frame
- the first image frame is a P frame
- the second image frame is a P frame or a B frame.
- FIG. 21 is a schematic block diagram of a video encoding device according to an embodiment of the present application.
- the video encoding device 2100 shown in FIG. 21 may include:
- An acquisition module 2101 configured to acquire a first video file and a second video file, where the first video file and the second video file are different video files;
- for the description of the acquisition module 2101, reference may be made to the description of step 501 in the above embodiment; details are not repeated here.
- a decoding module 2102 configured to decode the first video file and the second video file to obtain first information of the first image block in the first video file and second information of the second image block in the second video file;
- a difference determination module 2103 configured to obtain difference information according to the first information and the second information, where the difference information is used to indicate the difference between the first information and the second information;
- An encoding module 2104 configured to encode the difference information to obtain encoded data.
- the similarity between image features of the first image block and the second image block is greater than a threshold.
- the difference information replaces the original second encoding parameter. Because the similarity between the first image block and the second image block is greater than the threshold, the encoded data obtained by encoding the difference information is much smaller than the encoded data obtained by encoding the second information, and the second information can be restored from the difference information and the first information. This reduces the storage resources required for storing the video and the bandwidth required for transmitting it, while the complete second video can still be recovered.
- the similarity between the first information and the second information is greater than a threshold.
- the difference information replaces the original second encoding parameter. Because the similarity between the first information and the second information is greater than the threshold, the encoded data obtained by encoding the difference information is much smaller than the encoded data obtained by encoding the second information, and the second information can be restored from the difference information and the first information. This reduces the storage resources required for storing the video and the bandwidth required for transmitting it, while the complete second video can still be recovered.
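The size argument above can be checked with a toy experiment. The sketch below is illustrative only: it assumes the "information" is a flat list of integers, uses JSON plus zlib as a stand-in for a real entropy coder, and builds synthetic data in which the second information differs from the first in one sample out of every hundred. None of these choices come from the application.

```python
import json
import random
import zlib


def encode(values):
    """Losslessly encode a list of integers (JSON text compressed with zlib)."""
    return zlib.compress(json.dumps(values).encode("utf-8"))


def decode(blob):
    """Inverse of encode()."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


def difference(first, second):
    """Element-wise difference: second minus first."""
    return [s - f for f, s in zip(first, second)]


def restore(first, diff):
    """Recover the second information from the first and the difference."""
    return [f + d for f, d in zip(first, diff)]


# Hypothetical data: highly similar first and second information.
rng = random.Random(0)
first_info = [rng.randint(-255, 255) for _ in range(2000)]
second_info = [v + 1 if i % 100 == 0 else v for i, v in enumerate(first_info)]

diff_blob = encode(difference(first_info, second_info))   # what gets stored
plain_blob = encode(second_info)                           # baseline
restored = restore(first_info, decode(diff_blob))
```

Because the difference is almost entirely zeros, `diff_blob` compresses far better than `plain_blob`, while `restored` is identical to `second_info`.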
- the first information includes a first encoding parameter of a first image block
- the second information includes a second encoding parameter of a second image block
- the second encoding parameters include motion vectors and/or residuals.
- the first video may include a first image frame
- the second video may include a second image frame
- the first image frame may include a plurality of image blocks (including the first image block)
- the second image frame may include a plurality of image blocks (including the second image block), where the first image block and the second image block may be block units such as an MB, a prediction block (partition), a CU, a PU, or a TU, which are not limited here.
- the similarity of image features between the first image block and the second image block is relatively high, where the image features of an image block can be one or more of its color features, texture features, shape features, and so on.
- the color feature and texture feature are used to describe the surface properties of the object corresponding to the image block.
- the shape features include contour features and region features, the contour features include the outer boundary features of the object, and the region features include the shape region features of the object.
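One simple way to compare a color feature against a threshold is histogram intersection. The sketch below is an assumption made for illustration: the application does not specify a similarity measure, and the function names, the 8-bin histogram, and the 0.9 threshold are all hypothetical. Blocks are flat lists of 8-bit sample values.

```python
def histogram(pixels, bins=8):
    """Normalized histogram of 8-bit sample values."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // 256] += 1
    total = len(pixels)
    return [c / total for c in counts]


def color_similarity(block_a, block_b, bins=8):
    """Histogram intersection in [0, 1]; 1 means identical distributions."""
    hist_a = histogram(block_a, bins)
    hist_b = histogram(block_b, bins)
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


def is_candidate_reference(block_a, block_b, threshold=0.9):
    """True when the color similarity exceeds the (assumed) threshold."""
    return color_similarity(block_a, block_b) > threshold
```

Texture or shape features would need their own descriptors; the thresholding step would stay the same.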
- the first image block is included in a first image frame
- the second image block is included in a second image frame
- the first image frame and the second image frame are different image frames.
- the first encoding parameter includes a first motion vector
- the second encoding parameter includes a second motion vector
- the difference information includes first difference information
- the first difference information indicating a difference between the first motion vector and the second motion vector
- the encoding module is specifically used for:
- the first difference information is encoded by lossless compression encoding.
- the first coding parameter includes a first residual
- the second coding parameter includes a second residual
- the difference information includes second difference information
- the second difference information indicating a difference between the first residual and the second residual
- the encoding module is specifically used for:
- the second difference information is encoded by lossless compression coding or lossy compression coding.
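The distinction between the two kinds of difference information can be sketched as follows: a motion vector difference has to survive exactly, so it is coded losslessly, whereas a residual difference may tolerate quantization. The code is a hedged illustration; the byte layout, the use of zlib, the uniform quantizer, and all function names are assumptions rather than the application's method.

```python
import struct
import zlib


def encode_mv_difference(first_mv, second_mv):
    """Losslessly encode the difference between two motion vectors."""
    dx = second_mv[0] - first_mv[0]
    dy = second_mv[1] - first_mv[1]
    return zlib.compress(struct.pack("<2h", dx, dy))


def decode_second_mv(first_mv, blob):
    """Recover the second motion vector exactly."""
    dx, dy = struct.unpack("<2h", zlib.decompress(blob))
    return (first_mv[0] + dx, first_mv[1] + dy)


def quantize_residual_difference(first_residual, second_residual, step):
    """Lossy option: uniform quantization of the residual difference."""
    return [round((s - f) / step)
            for f, s in zip(first_residual, second_residual)]


def approximate_second_residual(first_residual, levels, step):
    """Rebuild the second residual; each sample is off by at most step / 2."""
    return [f + q * step for f, q in zip(first_residual, levels)]
```

With `step = 1` the residual path becomes lossless as well, which matches the "lossless or lossy" wording for the second difference information.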
- the device also includes:
- a sending module configured to send the encoded data and the first encoding parameter to a decoding side, so that the decoding side obtains the second encoding parameter according to the encoded data and the first encoding parameter;
- a storage module configured to store the encoded data.
- the coding module is specifically used for:
- the first indication information includes an identifier of the first image block and an identifier of the second image block.
- the image features include at least one of the following:
- the first information includes a first image block
- the second information includes a second image block
- the difference determination module is specifically used for:
- the difference information includes a third encoding parameter of the second image block.
- the first image block is an image block in an I frame
- the second image block is an image block in a P frame or a B frame
- the first image block is an image block in a P frame
- the second image block is an image block in a P frame or a B frame.
- the coding module is specifically used for:
- the acquisition module is also used to:
- the decoding module is further configured to decode the second video file to obtain a fourth image block
- the difference determination module is further configured to use the third image block as a reference block of the fourth image block to determine the difference information, where the difference information includes the fourth encoding parameter of the fourth image block.
- the decoding module is further configured to decode the first video file to obtain a fifth image block
- the similarity of the image features of the sixth image block is greater than a threshold, and the first image block and the fifth image block belong to the same or different image frames in the first video file;
- the difference determination module is further configured to use the fifth image block as a reference block of the sixth image block to determine the difference information, where the difference information includes the fifth encoding parameter of the fifth image block.
- the image features include at least one of the following:
- FIG. 22 is a schematic block diagram of a video decoding device according to an embodiment of the present application.
- the video decoding device 2200 shown in FIG. 22 may include:
- An acquisition module 2201 configured to acquire coded data
- for the description of the acquisition module 2201, reference may be made to the descriptions of step 505 and step 507 in the above embodiment; details are not repeated here.
- a decoding module 2202 configured to decode the encoded data to obtain difference information
- for the description of the decoding module 2202, reference may be made to the description of step 506 in the above embodiment; details are not repeated here.
- the acquisition module 2201 is further configured to acquire indication information, where the indication information indicates that there is an association between the first information and the second information (optionally, the indication information indicates that the difference information is obtained according to the difference between the first information and the second information), the first information corresponds to the first image block in the first video file, the second information corresponds to the second image block in the second video file, and the first video file and the second video file are different video files;
- the reconstruction module 2203 is configured to obtain the second information according to the first information and the difference information.
- the indication information can also be encoded. The indication information can indicate that the difference information is obtained according to the difference between the first image block and the second image block, so that the decoding side can obtain the first encoding parameter of the first image block and the difference information based on the indication information, and then obtain the second encoding parameter based on the first encoding parameter and the difference information.
- the indication information may include an identifier indicating the first image block and an identifier indicating the second image block.
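A decoder-side sketch of how such identifiers might be used is given below. It assumes that the indication information is a pair of block identifiers, that already decoded first encoding parameters are kept in a dictionary keyed by block identifier, and that each parameter holds a motion vector and a flat residual. The class `IndicationInfo`, the identifier strings, and the dictionary layout are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndicationInfo:
    """Pairs the reference block with the block to be restored."""
    first_block_id: str
    second_block_id: str


def restore_second_parameter(indication, difference, first_parameters):
    """Add the decoded difference to the first encoding parameter.

    Returns the identifier of the second image block together with its
    restored encoding parameter.
    """
    first = first_parameters[indication.first_block_id]
    motion_vector = tuple(
        a + d for a, d in zip(first["motion_vector"], difference["motion_vector"]))
    residual = [a + d for a, d in zip(first["residual"], difference["residual"])]
    restored = {"motion_vector": motion_vector, "residual": residual}
    return indication.second_block_id, restored
```

Keeping both identifiers in the indication information lets the decoder locate the reference data and label the result without any further side channel.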
- the first information includes a first encoding parameter of a first image block
- the second information includes a second encoding parameter of a second image block
- the second encoding parameters include motion vectors and/or residuals.
- the relevant information of the reference frame of the second image frame in the group of pictures (GOP) where the second image frame is located may not be encoded into the encoded data, which reduces the size of the encoded data, the storage space required for storing the second video, and the bandwidth required for transmitting the second video.
- the first image block is included in a first image frame
- the second image block is included in a second image frame
- the first image frame and the second image frame are different image frames.
- the indication information includes an identifier of the first image block and an identifier of the second image block.
- the first information includes a first image block
- the second information includes a second image block
- the indication information is used to indicate that the reference block of the second image block is the first image block.
- the first image block is an image block in an I frame
- the second image block is an image block in a P frame or a B frame
- the first image block is an image block in a P frame
- the second image block is an image block in a P frame or a B frame.
- the similarity of the image features of the first image block and the second image block is greater than a threshold, and the image features include at least one of the following:
- the device embodiments described above are only illustrative. The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; they can be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
- the connection relationship between the modules indicates that they have communication connections, which can be specifically implemented as one or more communication buses or signal lines.
- the essence of the technical solution of this application, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, USB flash drive, removable hard disk, ROM, RAM, magnetic disk, or optical disc, and includes several instructions that cause a computer device (which can be a personal computer, training device, network device, etc.) to execute the methods described in the embodiments of the present application.
- all or part of them may be implemented by software, hardware, firmware or any combination thereof.
- when implemented using software, it may be implemented in whole or in part in the form of a computer program product.
- the computer program product includes one or more computer instructions.
- the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable devices.
- the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave).
- the computer-readable storage medium may be any available medium that can be stored by a computer, or a data storage device such as a training device or a data center integrated with one or more available media.
- the available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a DVD), or a semiconductor medium (such as a solid state disk (Solid State Disk, SSD)), etc.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Claims (35)
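- A video encoding method, characterized in that the method comprises: obtaining a first encoding parameter of a first image block and a second encoding parameter of a second image block, wherein the first image block is an image block in a first video file, the second image block is an image block in a second video file, the first video file and the second video file are different video files, and the first encoding parameter and the second encoding parameter comprise a motion vector or a residual; obtaining difference information according to the first encoding parameter and the second encoding parameter, wherein the difference information is used to indicate a difference between the first encoding parameter and the second encoding parameter; and encoding the difference information to obtain encoded data.
- The method according to claim 1, characterized in that the obtaining a first encoding parameter of a first image block and a second encoding parameter of a second image block comprises: decoding the first video file and the second video file to obtain the first encoding parameter of the first image block and the second encoding parameter of the second image block.
- The method according to claim 1 or 2, characterized in that a similarity between image features of the first image block and the second image block is greater than a threshold; or a similarity between the first encoding parameter and the second encoding parameter is greater than a threshold; or a similarity of subtitle information or audio information in the first video file and the second video file is greater than a threshold; or a similarity between DCT coefficients of the first image block and DCT coefficients of the second image block is greater than a threshold.
- The method according to any one of claims 1 to 3, characterized in that the image features comprise at least one of the following: a color feature, a texture feature, a shape feature, and a spatial relationship feature.
- The method according to any one of claims 1 to 4, characterized in that the first encoding parameter comprises a first motion vector, and the second encoding parameter comprises a second motion vector; the difference information comprises first difference information, and the first difference information is used to indicate a difference between the first motion vector and the second motion vector; and the encoding the difference information comprises: encoding the first difference information through lossless compression coding.
- The method according to any one of claims 1 to 5, characterized in that the first encoding parameter comprises a first residual, and the second encoding parameter comprises a second residual; the difference information comprises second difference information, and the second difference information is used to indicate a difference between the first residual and the second residual; and the encoding the difference information comprises: encoding the second difference information through lossless compression coding or lossy compression coding.
- The method according to any one of claims 1 to 6, characterized in that the encoding the difference information comprises: encoding the difference information and first indication information, wherein the first indication information is used to indicate that there is an association between the first image block and the second image block.
- The method according to claim 7, characterized in that the first indication information comprises an identifier of the first image block and an identifier of the second image block; or the first indication information comprises an identifier of the first video file and an identifier of the second video file.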
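- A video encoding method, characterized in that the method comprises: obtaining a first image block in a first video file and a second image block in a second video file, wherein the first video file and the second video file are different video files; using the first image block as a reference block of the second image block to determine difference information, wherein the difference information is used to indicate a difference between the first image block and the second image block; and encoding the difference information to obtain encoded data.
- The method according to claim 9, characterized in that the obtaining a first image block in a first video file and a second image block in a second video file comprises: obtaining a plurality of image blocks in the first video file and the second image block in the second video file; and determining, from the plurality of image blocks, an image block whose image features have a similarity to those of the second image block greater than a threshold as the first image block; or determining, from the plurality of image blocks, an image block whose encoding parameters have a similarity to those of the second image block greater than a threshold as the first image block, wherein the encoding parameters comprise a motion vector, a residual, or DCT coefficients; or determining, from the plurality of image blocks, an image block whose corresponding subtitle information or audio information has a similarity to that of the second image block greater than a threshold as the first image block.
- The method according to claim 9 or 10, characterized in that the obtaining a first image block in a first video file and a second image block in a second video file comprises: decoding the first video file and the second video file to obtain the first image block in the first video file and the second image block in the second video file.
- The method according to any one of claims 9 to 11, characterized in that the first image block is an image block in an I frame, and the second image block is an image block in a P frame or a B frame; or the first image block is an image block in a P frame, and the second image block is an image block in a P frame or a B frame.
- The method according to any one of claims 9 to 12, characterized in that the encoding the difference information comprises: encoding the difference information and second indication information, wherein the second indication information is used to indicate that there is an association between the first image block and the second image block.
- The method according to any one of claims 9 to 13, characterized in that the method further comprises: obtaining a third image block in a third video file and a fourth image block in the second video file, wherein the second image block and the fourth image block belong to the same image frame in the second video file, a similarity between image features of the third image block and the fourth image block is greater than a threshold, and the third video file and the second video file are different video files; and the using the first image block as a reference block of the second image block to determine difference information comprises: using the first image block as the reference block of the second image block, and using the third image block as a reference block of the fourth image block, to determine the difference information.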
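- A video decoding method, characterized in that the method comprises: obtaining encoded data; decoding the encoded data to obtain difference information; decoding the first image block according to indication information to obtain a first encoding parameter, wherein the indication information is used to indicate that there is an association between a first image block and a second image block, the first image block belongs to a first video file, the second image block belongs to a second video file, and the first video file and the second video file are different video files; and obtaining the second encoding parameter according to the first encoding parameter and the difference information, wherein the first encoding parameter and the second encoding parameter comprise a motion vector or a residual.
- The method according to claim 15, characterized in that the indication information comprises an identifier of the first image block and an identifier of the second image block.
- A video decoding method, characterized in that the method comprises: obtaining encoded data; decoding the encoded data to obtain difference information; decoding the first video file according to indication information to obtain a first image block, wherein the indication information is used to indicate that there is an association between the first image block and a second image block, the first image block belongs to a first video file, the second image block belongs to a second video file, and the first video file and the second video file are different video files; and obtaining the second image block according to the first image block and the difference information.
- The method according to claim 17, characterized in that the indication information is used to indicate that the reference block of the second image block is the first image block.
- The method according to claim 17 or 18, characterized in that the indication information comprises an identifier of the first image block and an identifier of the second image block; or the indication information comprises an identifier of the first video file and an identifier of the second video file.
- The method according to any one of claims 17 to 19, characterized in that the first image block is an image block in an I frame, and the second image block is an image block in a P frame or a B frame; or the first image block is an image block in a P frame, and the second image block is an image block in a P frame or a B frame.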
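- [Corrected under Rule 91, 24.11.2021]
A video encoding device, characterized in that the device comprises: an acquisition module configured to obtain a first encoding parameter of a first image block and a second encoding parameter of a second image block, wherein the first image block is an image block in a first video file, the second image block is an image block in a second video file, the first video file and the second video file are different video files, and the first encoding parameter and the second encoding parameter comprise a motion vector or a residual; a difference determination module configured to obtain difference information according to the first encoding parameter and the second encoding parameter, wherein the difference information is used to indicate a difference between the first encoding parameter and the second encoding parameter; and an encoding module configured to encode the difference information to obtain encoded data.
- The device according to claim 20, characterized in that the acquisition module is specifically configured to: decode the first video file and the second video file to obtain the first encoding parameter of the first image block and the second encoding parameter of the second image block.
- The device according to claim 21, characterized in that a similarity between image features of the first image block and the second image block is greater than a threshold; or a similarity between the first encoding parameter and the second encoding parameter is greater than a threshold; or a similarity of subtitle information or audio information in the first video file and the second video file is greater than a threshold; or a similarity between DCT coefficients of the first image block and DCT coefficients of the second image block is greater than a threshold.
- The device according to claim 22, characterized in that the encoding module is specifically configured to: encode the difference information and first indication information, wherein the first indication information is used to indicate that there is an association between the first image block and the second image block.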
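- A video encoding device, characterized in that the device comprises: an acquisition module configured to obtain a first image block in a first video file and a second image block in a second video file, wherein the first video file and the second video file are different video files; a difference determination module configured to use the first image block as a reference block of the second image block to determine difference information, wherein the difference information is used to indicate a difference between the first image block and the second image block; and an encoding module configured to encode the difference information to obtain encoded data.
- The device according to claim 24, characterized in that the acquisition module is specifically configured to: obtain a plurality of image blocks in the first video file and the second image block in the second video file; and determine, from the plurality of image blocks, an image block whose image features have a similarity to those of the second image block greater than a threshold as the first image block; or determine, from the plurality of image blocks, an image block whose encoding parameters have a similarity to those of the second image block greater than a threshold as the first image block, wherein the encoding parameters comprise a motion vector, a residual, or DCT coefficients; or determine, from the plurality of image blocks, an image block whose corresponding subtitle information or audio information has a similarity to that of the second image block greater than a threshold as the first image block.
- The device according to claim 24 or 25, characterized in that the first image block is an image block in an I frame, and the second image block is an image block in a P frame or a B frame; or the first image block is an image block in a P frame, and the second image block is an image block in a P frame or a B frame.
- The device according to any one of claims 24 to 26, characterized in that the encoding module is specifically configured to: encode the difference information and second indication information, wherein the second indication information is used to indicate that there is an association between the first image block and the second image block.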
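- A video decoding device, characterized in that the device comprises: an acquisition module configured to obtain encoded data; a decoding module configured to decode the encoded data to obtain difference information; the acquisition module being further configured to decode the first image block according to indication information to obtain the first encoding parameter, wherein the indication information is used to indicate that there is an association between a first image block and a second image block, the first image block belongs to a first video file, the second image block belongs to a second video file, and the first video file and the second video file are different video files; and an encoding parameter restoration module configured to obtain the second encoding parameter according to the first encoding parameter and the difference information, wherein the first encoding parameter and the second encoding parameter comprise a motion vector or a residual.
- A video decoding device, characterized in that the device comprises: an acquisition module configured to obtain encoded data; a decoding module configured to decode the encoded data to obtain difference information; the acquisition module being further configured to decode the first video file according to indication information to obtain a first image block, wherein the indication information is used to indicate that there is an association between the first image block and a second image block, the first image block belongs to a first video file, the second image block belongs to a second video file, and the first video file and the second video file are different video files; and a reconstruction module configured to obtain the second image block according to the first image block and the difference information.
- The device according to claim 29, characterized in that the indication information is used to indicate that the reference block of the second image block is the first image block.
- The device according to claim 29 or 30, characterized in that the first image block is an image block in an I frame, and the second image block is an image block in a P frame or a B frame; or the first image block is an image block in a P frame, and the second image block is an image block in a P frame or a B frame.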
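- A computing device, characterized in that the computing device comprises a memory and a processor; the memory stores code, and the processor is configured to obtain the code and perform the method according to any one of claims 1 to 20.
- A computer storage medium, characterized in that the computer storage medium stores one or more instructions that, when executed by one or more computers, cause the one or more computers to perform the method according to any one of claims 1 to 20.
- A computer program product comprising code, characterized in that the code, when executed, is used to implement the method according to any one of claims 1 to 20.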
Priority Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202180104336.0A CN118216148A (zh) | 2021-11-23 | 2021-11-23 | Video encoding method and related apparatus |
| EP21965000.9A EP4422176A4 (en) | 2021-11-23 | 2021-11-23 | VIDEO CODING METHOD AND ASSOCIATED APPARATUS |
| PCT/CN2021/132307 WO2023092256A1 (zh) | 2021-11-23 | 2021-11-23 | Video encoding method and related apparatus |
| US18/671,301 US20240314326A1 (en) | 2021-11-23 | 2024-05-22 | Video Coding Method and Related Apparatus Thereof |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2021/132307 WO2023092256A1 (zh) | 2021-11-23 | 2021-11-23 | Video encoding method and related apparatus |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US18/671,301 Continuation US20240314326A1 (en) | 2021-11-23 | 2024-05-22 | Video Coding Method and Related Apparatus Thereof |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023092256A1 (zh) | 2023-06-01 |
Family
ID=86538554
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2021/132307 Ceased WO2023092256A1 (zh) | Video encoding method and related apparatus | 2021-11-23 | 2021-11-23 |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US20240314326A1 (zh) |
| EP (1) | EP4422176A4 (zh) |
| CN (1) | CN118216148A (zh) |
| WO (1) | WO2023092256A1 (zh) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116530979A (zh) * | 2023-07-04 | 2023-08-04 | 清华大学 | 基于震动传感器的监测装置 |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119676434A (zh) * | 2024-12-24 | 2025-03-21 | 北京达佳互联信息技术有限公司 | 视频帧处理方法、装置、电子设备和存储介质 |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103609125A (zh) * | 2011-04-19 | 2014-02-26 | 三星电子株式会社 | 用于对多视点视频的运动矢量进行编码和解码的方法和设备 |
| CN103765899A (zh) * | 2011-06-15 | 2014-04-30 | 韩国电子通信研究院 | 用于编码和解码可伸缩视频的方法以及使用其的设备 |
| JP2014168150A (ja) * | 2013-02-28 | 2014-09-11 | Mitsubishi Electric Corp | 画像符号化装置、画像復号装置、画像符号化方法、画像復号方法及び画像符号化復号システム |
| CN104798375A (zh) * | 2012-11-16 | 2015-07-22 | 联发科技股份有限公司 | 在3d视频编码中约束视差向量推导的方法及装置 |
| JP2017079485A (ja) * | 2011-01-24 | 2017-04-27 | ソニー株式会社 | 画像符号化装置と画像符号化方法およびプログラム |
| CN111357290A (zh) * | 2019-01-03 | 2020-06-30 | 北京大学 | 视频图像处理方法与装置 |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103098484A (zh) * | 2010-06-14 | 2013-05-08 | 汤姆森许可贸易公司 | 用于封装编码多组件视频的方法和装置 |
| US9769230B2 (en) * | 2010-07-20 | 2017-09-19 | Nokia Technologies Oy | Media streaming apparatus |
| US10708627B2 (en) * | 2019-03-04 | 2020-07-07 | Intel Corporation | Volumetric video compression with motion history |
| US10854241B2 (en) * | 2019-05-03 | 2020-12-01 | Citrix Systems, Inc. | Generation of media diff files |
2021
- 2021-11-23 EP EP21965000.9A patent/EP4422176A4/en active Pending
- 2021-11-23 CN CN202180104336.0A patent/CN118216148A/zh active Pending
- 2021-11-23 WO PCT/CN2021/132307 patent/WO2023092256A1/zh not_active Ceased
2024
- 2024-05-22 US US18/671,301 patent/US20240314326A1/en active Pending
Patent Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2017079485A (ja) * | 2011-01-24 | 2017-04-27 | ソニー株式会社 | 画像符号化装置と画像符号化方法およびプログラム |
| CN103609125A (zh) * | 2011-04-19 | 2014-02-26 | 三星电子株式会社 | 用于对多视点视频的运动矢量进行编码和解码的方法和设备 |
| CN103765899A (zh) * | 2011-06-15 | 2014-04-30 | 韩国电子通信研究院 | 用于编码和解码可伸缩视频的方法以及使用其的设备 |
| CN104798375A (zh) * | 2012-11-16 | 2015-07-22 | 联发科技股份有限公司 | 在3d视频编码中约束视差向量推导的方法及装置 |
| JP2014168150A (ja) * | 2013-02-28 | 2014-09-11 | Mitsubishi Electric Corp | 画像符号化装置、画像復号装置、画像符号化方法、画像復号方法及び画像符号化復号システム |
| CN111357290A (zh) * | 2019-01-03 | 2020-06-30 | 北京大学 | 视频图像处理方法与装置 |
Non-Patent Citations (1)
| Title |
|---|
| Y. HE (INTERDIGITAL), Y. HE, A. HAMZA (INTERDIGITAL): "AHG8: On inter-layer motion vector prediction", 16. JVET MEETING; 20191001 - 20191011; GENEVA; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), 24 September 2019 (2019-09-24), XP030216325 * |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116530979A (zh) * | 2023-07-04 | 2023-08-04 | 清华大学 | 基于震动传感器的监测装置 |
| CN116530979B (zh) * | 2023-07-04 | 2023-10-17 | 清华大学 | 基于震动传感器的监测装置 |
Also Published As
| Publication number | Publication date |
|---|---|
| EP4422176A4 (en) | 2024-11-20 |
| US20240314326A1 (en) | 2024-09-19 |
| EP4422176A1 (en) | 2024-08-28 |
| CN118216148A (zh) | 2024-06-18 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN115243039B (zh) | 一种视频图像预测方法及装置 | |
| CN110891176B (zh) | 基于仿射运动模型的运动矢量预测方法及设备 | |
| CN111385572B (zh) | 预测模式确定方法、装置及编码设备和解码设备 | |
| CN111355951B (zh) | 视频解码方法、装置及解码设备 | |
| CN112823518A (zh) | 用于译码块的三角划分块的帧间预测的装置及方法 | |
| CN111277828B (zh) | 视频编解码方法、视频编码器和视频解码器 | |
| CN111416981B (zh) | 视频图像解码、编码方法及装置 | |
| WO2020182194A1 (zh) | 帧间预测的方法及相关装置 | |
| KR102616713B1 (ko) | 이미지 예측 방법, 장치 및 시스템, 디바이스 및 저장 매체 | |
| CN111698515B (zh) | 帧间预测的方法及相关装置 | |
| CN115243048B (zh) | 视频图像解码、编码方法及装置 | |
| CN111953997A (zh) | 候选运动矢量列表获取方法、装置及编解码器 | |
| CN112055200A (zh) | Mpm列表构建方法、色度块的帧内预测模式获取方法及装置 | |
| WO2020006969A1 (zh) | 运动矢量预测方法以及相关装置 | |
| US20240314326A1 (en) | Video Coding Method and Related Apparatus Thereof | |
| CN112118447B (zh) | 融合候选运动信息列表的构建方法、装置及编解码器 | |
| CN111866502A (zh) | 图像预测方法、装置和计算机可读存储介质 | |
| CN111432219A (zh) | 一种帧间预测方法及装置 | |
| CN111372086A (zh) | 视频图像解码方法及装置 | |
| CN112135149B (zh) | 语法元素的熵编码/解码方法、装置以及编解码器 | |
| CN111901593A (zh) | 一种图像划分方法、装置及设备 | |
| WO2020114393A1 (zh) | 变换方法、反变换方法以及视频编码器和视频解码器 | |
| WO2020134817A1 (zh) | 预测模式确定方法、装置及编码设备和解码设备 | |
| CN111327894A (zh) | 块划分方法、视频编解码方法、视频编解码器 | |
| CN112055211B (zh) | 视频编码器及qp设置方法 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21965000; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 202180104336.0; Country of ref document: CN |
| | WWE | Wipo information: entry into national phase | Ref document number: 2021965000; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2021965000; Country of ref document: EP; Effective date: 20240522 |
| | NENP | Non-entry into the national phase | Ref country code: DE |