WO2022228375A1 - Video encoding method, apparatus and electronic device - Google Patents

Video encoding method, apparatus and electronic device

Info

Publication number
WO2022228375A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
bits
encoding
frame
ratio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2022/088950
Other languages
English (en)
French (fr)
Inventor
张勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vivo Mobile Communication Co Ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vivo Mobile Communication Co Ltd
Priority to EP22794845.2A (patent EP4333433A4)
Priority to JP2023564189A (patent JP7682297B2)
Priority to KR1020237034885A (patent KR20230155002A)
Publication of WO2022228375A1 (zh)
Priority to US18/485,487 (patent US12413738B2)
Anticipated expiration
Ceased (current legal status)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/14 Coding unit complexity, e.g. amount of activity or edge presence estimation
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/115 Selection of the code volume for a coding unit prior to coding
    • H04N19/124 Quantisation
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/149 Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
    • H04N19/15 Data rate or code amount at the encoder output by monitoring actual compressed data size at the memory before deciding storage at the transmission buffer
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • H04N19/177 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a group of pictures [GOP]
    • H04N19/184 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being bits, e.g. of the compressed video stream
    • H04N19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/196 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters

Definitions

  • The present application belongs to the field of communication technologies, and in particular relates to a video encoding method, apparatus and electronic device.
  • Video coding is a data compression method for digital video. Its goal is to remove the redundancy in the original video images, to save storage and transmission costs, and to improve the quality of the encoded video.
  • Video can be encoded with the JVT-G012 rate control algorithm of the H.264/AVC video coding standard. The JVT-G012 rate control algorithm implements three-level rate control at the GOP (Group of Pictures) level, the frame level, and the macroblock level, and its control functions are comprehensive.
  • However, the JVT-G012 rate control algorithm distributes bits evenly among the P-frame images of a picture group. That is, it does not consider frame-level coding complexity when allocating bits, which may cause the peak signal-to-noise ratio (PSNR) curve of the frame images in the GOP to fluctuate and lower the average PSNR of the entire video sequence. As a result, the quality of the encoded video is poor.
  • The purpose of the embodiments of the present application is to provide a video encoding method, apparatus and electronic device that can solve the problem that coding complexity is not considered at the frame level, which lowers the average PSNR of the encoded video and thereby makes the quality of the encoded video poor.
  • In a first aspect, an embodiment of the present application provides a video encoding method, the method comprising: determining a second number of bits for encoding a first image according to a first ratio, a first number of bits, and a first number; and encoding the first image based on the second number of bits. The first ratio is the ratio of the predicted coding complexity of the first image to the actual coding complexity of M frames of second images; the first image is the first uncoded frame image in the target image group, and the M frames of second images are coded images in the target image group; the first number of bits is the number of bits remaining in the target image group; the first number is the number of uncoded images in the target image group; and M is an integer greater than 1.
  • In a second aspect, an embodiment of the present application provides a video encoding apparatus, the apparatus comprising a determination module and an encoding module. The determination module is used for determining a second number of bits for encoding a first image according to a first ratio, a first number of bits, and a first number. The encoding module is used for encoding the first image based on the second number of bits determined by the determination module. The first ratio is the ratio of the predicted coding complexity of the first image to the actual coding complexity of M frames of second images; the first image is the first uncoded frame image in the target image group, and the M frames of second images are coded images in the target image group; the first number of bits is the number of bits remaining in the target image group; the first number is the number of uncoded images in the target image group; and M is an integer greater than 1.
  • In a third aspect, an embodiment of the present application provides an electronic device, the electronic device comprising a processor, a memory, and a program or instructions stored in the memory and executable on the processor; when executed by the processor, the program or instructions implement the steps of the method according to the first aspect.
  • In a fourth aspect, an embodiment of the present application provides a readable storage medium on which a program or instructions are stored; when executed by a processor, the program or instructions implement the steps of the method according to the first aspect.
  • In a fifth aspect, an embodiment of the present application provides a chip, the chip comprising a processor and a communication interface coupled to the processor, the processor being configured to run a program or instructions to implement the method according to the first aspect.
  • In the embodiments of the present application, the second number of bits for encoding the first image may be determined according to the first ratio, the first number of bits, and the first number, and the first image is encoded based on the second number of bits. The first ratio is the ratio of the predicted coding complexity of the first image to the actual coding complexity of M frames of second images, where the first image is the first uncoded frame image in the target image group and the M frames of second images are images of the target image group that have already been encoded. The first number of bits is the number of bits remaining in the target image group, the first number is the number of uncoded images in the target image group, and M is an integer greater than 1.
  • Because the first ratio indicates the coding complexity of the first image relative to the M coded second images in the target image group, the video encoding method provided by the embodiments of the present application can allocate bits to the image to be coded according to this relative coding complexity, the number of bits remaining in the target image group, and the number of frames remaining in the target image group. Coding bits can therefore be saved on images with low coding complexity and spent on images with high coding complexity, so that the average coding rate stays close to the target code rate while the fluctuation of the PSNR curve of the frame images in the image group is reduced, thereby improving the quality of the encoded video.
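
The allocation idea described above can be sketched in Python. This is only an illustration: this section does not give the exact formula, so the sketch assumes that the actual complexity of the M coded frames is summarized by their mean, and that the target is the even share of the remaining bits scaled by the first ratio. The function and parameter names are hypothetical.

```python
def allocate_frame_bits(predicted_complexity, coded_complexities,
                        remaining_bits, remaining_frames):
    """Return a bit budget for the next uncoded frame of a picture group.

    Assumption (not taken from the source): the "first ratio" divides the
    predicted complexity by the MEAN complexity of the M coded frames, and
    the budget is ratio * remaining_bits / remaining_frames.
    """
    if len(coded_complexities) < 2:
        raise ValueError("M must be an integer greater than 1")
    if remaining_frames < 1:
        raise ValueError("at least one uncoded frame is required")
    reference = sum(coded_complexities) / len(coded_complexities)
    first_ratio = predicted_complexity / reference
    return first_ratio * remaining_bits / remaining_frames


# A frame predicted to be 1.5x as complex as the coded frames receives
# 1.5x the even share of the remaining bits.
budget = allocate_frame_bits(15.0, [10.0, 10.0], 100_000, 10)
```

With an even split every remaining frame would receive 10,000 bits; under the assumed rule the more complex frame receives more, and simpler frames correspondingly less.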
  • FIG. 1 is a basic framework diagram of rate control in video coding.
  • FIG. 3 is a general structure diagram of the rate control algorithm.
  • FIG. 4 is a flowchart of a video encoding method provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a video encoding apparatus provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 7 is a schematic hardware diagram of an electronic device provided by an embodiment of the present application.
  • The terms "first", "second" and the like in the description and claims of the present application are used to distinguish similar objects, and are not used to describe a specific order or sequence. It is to be understood that data so used may be interchanged under appropriate circumstances, so that embodiments of the application can be practiced in sequences other than those illustrated or described herein.
  • The objects distinguished by "first", "second", etc. are usually of one type, and the number of objects is not limited; for example, the first object may be one or more than one.
  • In addition, "and/or" in the description and claims indicates at least one of the connected objects, and the character "/" generally indicates that the associated objects are in an "or" relationship.
  • Basic unit (BU): a set of one or more macroblocks (MBs).
  • The number of MBs contained in a frame of image should be divisible by the number of MBs contained in a BU. For example, in a video sequence in QCIF format, a frame of image contains 99 MBs, so a BU of the image can contain 99, 33, 11, 9, 3, or 1 MBs, and the image then contains 1, 3, 9, 11, 33, or 99 BUs respectively.
  • Accordingly, one BU can consist of one MB, one slice, one field, or one frame of picture.
  • All macroblocks in one basic unit BU are coded using the same quantization parameter (QP).
  • A larger BU is usually selected; for example, all MBs in one row of an image form a basic unit BU, or a whole frame of image is used as a basic unit BU.
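
The divisibility constraint on BU sizes can be checked with a few lines of Python. The sketch below simply enumerates the divisors of the per-frame MB count; the function name is hypothetical.

```python
def valid_bu_sizes(mbs_per_frame):
    """List every BU size (in MBs) that divides the frame's MB count evenly."""
    return [size for size in range(mbs_per_frame, 0, -1)
            if mbs_per_frame % size == 0]


# QCIF example from the text: 99 MBs per frame.
sizes = valid_bu_sizes(99)
bu_counts = [99 // size for size in sizes]
print(sizes)      # [99, 33, 11, 9, 3, 1]
print(bu_counts)  # [1, 3, 9, 11, 33, 99]
```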
  • Traffic round-trip model: used to calculate the target bits allocated to the current frame image, that is, the number of bits allocated to the current frame image. In this model:
  • $N$ is the number of images included in a GOP of the video sequence, and $N$ is an integer greater than 1;
  • $A(n_{i,j})$ is the actual number of bits generated by encoding image $n_{i,j}$;
  • $u(n_{i,j-1})$ is the instantaneous channel bandwidth before image $n_{i,j}$ is encoded;
  • $F_r$ is the encoding frame rate;
  • $B_s$ is the size of the buffer, and the maximum occupancy of the buffer is determined by the profile and level;
  • $a_0$ is a constant, usually 8.
  • Buffer: also known as a buffer register, it temporarily stores the data sent by a peripheral (such as an encoder) so that the data can be transmitted at the channel bandwidth.
  • The buffer area referred to in the embodiments of the present application is the storage area of this buffer.
  • MAD linear prediction model: used to predict the MAD of the j-th frame image from the actual MAD of the (j-1)-th frame image, or to predict the MAD of the basic unit at the corresponding position in the j-th frame image from the MAD of a basic unit in the (j-1)-th frame image, where j is an integer greater than 1.
  • The MAD linear prediction model can be expressed by the following formula (2):
  • $\mathrm{MAD}_{cb} = a_1 \cdot \mathrm{MAD}_{pb} + a_2$ (2)
  • Here $\mathrm{MAD}_{cb}$ is the predicted MAD of the current frame image or basic unit, and $\mathrm{MAD}_{pb}$ is the actual MAD of the previous frame image or of the basic unit at the corresponding position in it. $a_1$ and $a_2$ are the two parameters of the MAD linear prediction model; their initial values are 1 and 0 respectively, and they are updated after each BU is encoded. Note that $a_1$ and $a_2$ can be updated according to the difference between the predicted MAD and the actual MAD; the specific method can be chosen according to actual requirements and is not limited here.
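
A minimal Python sketch of formula (2) follows. The prediction step and the initial values 1 and 0 come from the text. The `refit` method is an assumption for illustration only: the source leaves the update method open, and an ordinary least-squares fit over recent (previous MAD, actual MAD) pairs is just one possible choice. All names are hypothetical.

```python
class MadLinearPredictor:
    """Predict the MAD of the current unit from the co-located previous unit."""

    def __init__(self):
        # Initial parameter values stated in the text.
        self.a1 = 1.0
        self.a2 = 0.0

    def predict(self, previous_mad):
        # Formula (2): MAD_cb = a1 * MAD_pb + a2
        return self.a1 * previous_mad + self.a2

    def refit(self, previous_mads, actual_mads):
        """Assumed update rule: least-squares fit of actual vs. previous MAD."""
        n = len(previous_mads)
        if n != len(actual_mads) or n < 2:
            return  # not enough data, keep the current parameters
        mean_x = sum(previous_mads) / n
        mean_y = sum(actual_mads) / n
        var_x = sum((x - mean_x) ** 2 for x in previous_mads)
        if var_x == 0:
            return  # degenerate data, keep the current parameters
        cov = sum((x - mean_x) * (y - mean_y)
                  for x, y in zip(previous_mads, actual_mads))
        self.a1 = cov / var_x
        self.a2 = mean_y - self.a1 * mean_x


predictor = MadLinearPredictor()
first_guess = predictor.predict(4.0)   # uses a1 = 1, a2 = 0
predictor.refit([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])
refined_guess = predictor.predict(4.0)
```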
  • MAD of an image: the mean absolute difference between the YUV values (e.g., the Y values) of the current frame image and the YUV values (e.g., the Y values) of the frame image preceding it (which should be a P-frame or I-frame image).
  • In YUV, Y represents brightness (luminance, or luma), and U and V represent chrominance (chroma); U and V describe the color and saturation of the image.
  • The MAD of a basic unit BU is the mean absolute difference between the YUV values of one BU and the YUV values of another BU, where the other BU lies in the frame preceding the frame that contains the first BU (for example, the first BU is in the j-th frame image and the other BU is in the (j-1)-th frame image), the coordinates of the first BU in the j-th frame image are the same as the coordinates of the other BU in the (j-1)-th frame image, the j-th and (j-1)-th frame images belong to the same image group, and j is an integer greater than 1.
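
The MAD definition above can be sketched as follows. The sketch assumes the luma (Y) samples of the two co-located frames or basic units are supplied as flat, equal-length sequences; the function name is hypothetical.

```python
def mean_absolute_difference(current_luma, previous_luma):
    """Mean absolute difference between co-located luma samples."""
    if len(current_luma) != len(previous_luma) or not current_luma:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(c - p) for c, p in zip(current_luma, previous_luma))
    return total / len(current_luma)


# Four co-located Y samples from frame j and frame j-1.
mad = mean_absolute_difference([10, 20, 30, 40], [12, 18, 30, 44])
print(mad)  # 2.0
```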
  • The transmission bandwidth available to a video signal is usually limited.
  • Rate control selects appropriate encoding parameters, such as the quantization parameter QP, and encodes each image according to its quantization parameter, so that the bit rate of the encoded video signal satisfies the bandwidth limit while the coding distortion is as small as possible.
  • Rate control is thus a typical rate-distortion optimization problem with multiple constraints and multiple objectives.
  • $N$ is the number of images contained in a video sequence;
  • $D_i$ is the coding distortion of the i-th frame image in the video sequence;
  • $R_i$ is the number of encoded bits of the i-th frame image in the video sequence;
  • the optimal encoding parameters (i.e., the quantization parameters QP) are determined for each frame image in the video sequence, namely the optimal encoding parameter of the first frame image, the optimal encoding parameter of the second frame image, ..., and the optimal encoding parameter of the N-th frame image;
  • $R_c$ is the target number of coding bits of the video sequence.
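
The optimization problem these symbols describe is not reproduced in this text. One common way to write such a constrained rate-distortion problem is shown below as a hedged reconstruction; the symbols $Q_i$ and $Q_i^*$ for the per-frame quantization parameter and its optimum are assumptions introduced here, not notation taken from the source.

```latex
% Assumed notation: Q_i is the QP of frame i, Q_i^* its optimal value.
\left(Q_1^{*}, Q_2^{*}, \ldots, Q_N^{*}\right)
  = \arg\min_{(Q_1, \ldots, Q_N)} \sum_{i=1}^{N} D_i
  \quad \text{subject to} \quad \sum_{i=1}^{N} R_i \le R_c
```

Read this way, rate control chooses one QP per frame so that total distortion is minimized while the total number of coded bits stays within the target $R_c$.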
  • A video sequence can be encoded by an encoder, and the resulting bit stream usually needs to be transmitted over a communication channel. Most communication channels in practical applications are constant bit rate (CBR) channels, whereas most bit streams output by an encoder are variable bit rate (VBR) streams. Therefore, to transmit a VBR stream effectively over a CBR channel, a buffer can be placed at the output of the encoder.
  • FIG. 2 is a schematic diagram of a buffer. In FIG. 2:
  • a represents the encoded bit stream output from the video encoder to the buffer;
  • Bs represents the size of the buffer;
  • Bc (the filled area in FIG. 2) is the number of bits in the buffer waiting to be sent;
  • Cb is the channel bandwidth;
  • Fr is the encoding frame rate;
  • Cb/Fr is the amount of data transmitted by the communication channel during the time in which the encoder encodes one frame of image.
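
The buffer behaviour in FIG. 2 can be sketched as a simple per-frame simulation: each encoded frame adds its bits to the buffer, and the channel drains Cb/Fr bits per frame interval. The clamping to the range [0, Bs] and the order "write, then drain" are simplifying assumptions made for this sketch, and all names are hypothetical.

```python
def simulate_buffer(frame_bits, channel_bandwidth, frame_rate,
                    buffer_size, initial_occupancy=0.0):
    """Return the buffer occupancy Bc after each encoded frame."""
    drained_per_frame = channel_bandwidth / frame_rate  # Cb / Fr
    occupancy = initial_occupancy
    history = []
    for bits in frame_bits:
        occupancy = occupancy + bits - drained_per_frame
        # Assumption: occupancy is clamped instead of modelling
        # overflow (dropped bits) or underflow (idle channel) explicitly.
        occupancy = max(0.0, min(occupancy, float(buffer_size)))
        history.append(occupancy)
    return history


# A large frame followed by two small ones on a 60 kbit/s channel at 30 fps.
levels = simulate_buffer([4000, 1000, 1000], 60_000, 30, 10_000)
print(levels)  # [2000.0, 1000.0, 0.0]
```

The example shows occupancy rising after a large frame and falling again while smaller frames are encoded.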
  • Bit allocation means allocating limited resources (that is, bits) to picture units such as groups of pictures, frames, and macroblocks.
  • Quantization parameter estimation means estimating, from the number of bits allocated to a picture unit, the optimal coding parameter corresponding to that number of bits, so as to minimize the distortion of the encoded video.
  • On the one hand, the rate control algorithm requires that the encoded bit stream be suitable for transmission on a band-limited channel (such as a CBR channel); on the other hand, it requires better video quality under the limited channel transmission bandwidth.
  • When the quality of an encoded video is evaluated, two aspects are usually considered. One is the average PSNR of all frames in the sequence: a video sequence with a higher average PSNR has better quality. The other is how the PSNR curve changes while the video sequence is encoded: a video sequence with a smoother PSNR curve has better quality.
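
Both quality criteria can be computed with a short Python sketch. The PSNR formula used is the standard one, 10·log10(peak² / MSE), with a peak of 255 for 8-bit samples. Using the population standard deviation as the measure of PSNR-curve smoothness is an assumption of this sketch, not something the source specifies; all names are hypothetical.

```python
import math
import statistics


def psnr(original, decoded, peak=255):
    """PSNR in dB between two equal-length sequences of samples."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)


def summarize_psnr(per_frame_psnr):
    """Return (average PSNR, fluctuation) for a sequence of frame PSNRs."""
    # Assumed smoothness measure: population standard deviation.
    return statistics.mean(per_frame_psnr), statistics.pstdev(per_frame_psnr)


average, fluctuation = summarize_psnr([30.0, 32.0, 34.0])
```

Under these criteria, a higher `average` and a lower `fluctuation` both indicate better quality.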
  • The two problems above, bit allocation and quantization parameter estimation, are studied at three levels: the GOP layer, the frame layer, and the BU layer.
  • This "three layers and two steps" rate control is usually performed in units of GOPs, as shown in FIG. 3.
  • A GOP usually starts with an I frame coded with intra-frame prediction, followed by several P frames and/or B frames coded with inter-frame prediction.
  • The I frame is the key frame of the GOP and uses intra-frame compression. The picture of an I frame is preserved completely, and only the data of that frame is needed to decode it.
  • A P frame is a forward-predicted frame, also known as a difference frame, and uses inter-frame compression. An encoded P frame represents the difference between the current frame and the I frame or P frame that precedes it. To decode a P frame, the previously buffered picture of that I frame or P frame is superimposed with the difference information carried by the current frame to reconstruct the picture of the current frame.
  • A B frame is a bidirectional difference frame; that is, an encoded B frame records the differences between the current frame and both the preceding and the following frames. In other words, decoding a B frame requires not only the previously buffered picture but also the decoded picture of the following frame; the picture of the current frame is reconstructed from the preceding and following pictures together with the encoded data of the current frame.
  • The amount of data generated by encoding an I frame is much larger than that generated by encoding a P frame or a B frame. Therefore, the buffer occupancy Bc reaches a high level after an I frame is encoded and gradually decreases while the P frames and B frames that follow it are encoded. After all images in one GOP have been encoded, the buffer occupancy can return to the level it had before the GOP was encoded.
  • The rate control algorithm allocates coding resources from top to bottom and determines the quantization parameter QP according to the number of available coded bits.
  • The main task of GOP-layer rate control is to allocate coded bits to the entire GOP, based on the number of frames in the current GOP, the occupancy of the encoder output buffer, and the channel bandwidth. The QP of the I frame that starts the GOP must then be calculated; calculating the I-frame QP amounts to dividing coding resources between the intra-predicted frame and the inter-predicted frames.
  • The I-frame QP of each GOP is calculated from the average QP of all P frames in the previous GOP; for the first GOP, the I-frame QP can be selected empirically.
  • Frame-layer rate control is an important part of video coding; both GOP-layer rate control and BU-layer rate control are organized around it.
  • In frame-layer rate control, the coded bits are distributed as target bits among the P frames of the GOP, and the QP of the current frame is then estimated from the number of coded bits allocated to it.
  • In BU-layer rate control, the main task is to make the actual number of bits generated by encoding match the target number of bits by setting an appropriate QP for each MB in the frame.
  • The following takes the JVT-F086 and JVT-G012 rate control algorithms recommended for H.264/AVC video coding as examples to illustrate the rate control methods of the traditional technology.
  • The JVT-F086 rate control algorithm is based on the MPEG-2 TM5 rate model; it allocates bits according to the buffer state and tries to ensure that the buffer neither overflows nor underflows.
  • Before a frame image is encoded, the number of bits required to encode it is first estimated; a QP is then assumed in advance according to the feedback from the buffer, and the frame image is encoded with that QP. Then, according to the actual encoding result of the current frame image, it is judged whether the assumed QP needs to be adjusted.
  • If it does, the QP is adjusted first and the frame image is re-encoded with the adjusted QP. That is, in the JVT-F086 rate control algorithm, encoding each frame image requires deciding whether to assign a new QP and re-encode the frame image with it, so the computational complexity of JVT-F086 is relatively high.
  • The JVT-F086 rate control algorithm controls the bit rate in terms of buffer fullness. It controls the buffer well and the buffer occupancy changes smoothly, but the quality of the encoded video fluctuates greatly.
  • The JVT-G012 rate control algorithm inherits the idea of the MPEG-4 VM8 rate control algorithm, follows the quadratic rate-distortion model, and can adjust the model parameters in time according to the characteristics of the source.
  • The key technologies of the JVT-G012 rate control algorithm include the traffic round-trip model, the MAD linear prediction model, and the quadratic rate-distortion model.
  • The JVT-G012 rate control algorithm first allocates target coding bits to the current frame according to the predefined bit rate, the frame rate, the buffer fullness, and the target buffer level, then uses linear tracking theory to predict the MAD of the current frame image, and finally uses the rate-distortion model to calculate the QP of the current frame image.
  • The JVT-G012 rate control algorithm uses MAD prediction to resolve the QP paradox. Compared with the JVT-F086 rate control algorithm, it needs to encode each frame image only once, so its computational complexity is low. Further, the JVT-G012 rate control algorithm implements three-level rate control at the GOP layer, the frame layer, and the macroblock layer, and its control functions are relatively comprehensive.
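
The last step, deriving a quantization step from the target bits and the predicted MAD, can be sketched as follows. This section names the quadratic rate-distortion model but does not give its formula, so the sketch assumes the commonly cited form T = c1·MAD/Qstep + c2·MAD/Qstep², where T is the number of texture bits. The coefficients `c1` and `c2`, the omission of header bits, and the function name are all assumptions made for illustration.

```python
import math


def qstep_from_target(target_bits, predicted_mad, c1, c2):
    """Solve the assumed quadratic R-D model for the quantization step.

    Assumed model: target_bits = c1*MAD/Qstep + c2*MAD/Qstep**2.
    """
    if target_bits <= 0 or predicted_mad <= 0:
        raise ValueError("target_bits and predicted_mad must be positive")
    if c2 == 0:
        # The model degenerates to a first-order one.
        return c1 * predicted_mad / target_bits
    # With x = 1/Qstep: c2*MAD*x^2 + c1*MAD*x - target_bits = 0
    a = c2 * predicted_mad
    b = c1 * predicted_mad
    discriminant = b * b + 4 * a * target_bits
    if discriminant < 0:
        raise ValueError("model has no real solution for these inputs")
    x = (-b + math.sqrt(discriminant)) / (2 * a)
    if x <= 0:
        raise ValueError("model yields a non-positive quantization step")
    return 1.0 / x


qstep = qstep_from_target(target_bits=2.0, predicted_mad=1.0, c1=1.0, c2=1.0)
```

A real encoder would additionally map the resulting step to an integer QP and clamp it to the valid range; that mapping is outside this sketch.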
  • $T_r(n_{i,0})$ represents the number of available (remaining) bits of the i-th GOP after the 0th frame of the i-th GOP is encoded;
  • $u(n_{i,1})$ represents the available channel transmission rate before the first frame image of the i-th GOP is encoded;
  • $N_i$ is the number of image frames included in the i-th GOP;
  • $B_s$ is the size of the buffer;
  • $B_c(n_{i-1,N_i})$ represents the actual occupancy of the buffer after the (i-1)-th GOP is encoded;
  • $F_r$ represents the encoding frame rate.
  • $T_r(n_{i,j})$ represents the remaining available bits in the $i$-th GOP after encoding the image $n_{i,j}$
  • $u(n_{i,j})$ represents the available channel transmission rate before encoding the image $n_{i,j}$, and $u(n_{i,j-1})$ represents the available channel transmission rate before encoding the image $n_{i,j-1}$
  • $A(n_{i,j})$ is the actual number of encoded bits of the image $n_{i,j}$
  • $i$ is a positive integer
  • $j$ is an integer greater than 1.
  • the above formula (5) can be simplified into formula (6):
  • the process of allocating the number of bits of the ith GOP is the process of performing GOP layer rate control on the ith GOP.
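The GOP-layer bookkeeping described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the formulas of the application: it assumes the usual JVT-G012 form, in which the initial budget is the channel rate divided by the frame rate, times the number of frames in the GOP, and in which the budget is updated after each frame by subtracting the bits actually spent. The function names and the abstract `buffer_correction` term are hypothetical.

```python
def initial_gop_budget(channel_rate, frame_rate, num_frames, buffer_correction=0.0):
    """Bits available to a GOP before its first frame is encoded.

    buffer_correction is a hypothetical stand-in for the buffer-state term
    carried over from the previous GOP (assumption; 0 for the first GOP).
    """
    return channel_rate / frame_rate * num_frames - buffer_correction


def update_gop_budget(remaining_bits, bits_spent, frames_left,
                      rate_now, rate_prev, frame_rate):
    """Remaining GOP bits after one more frame has been encoded.

    The first correction accounts for a change in channel rate over the
    frames still to be coded; with a constant channel rate it vanishes and
    the update reduces to remaining_bits - bits_spent (the simplified form).
    """
    rate_change = (rate_now - rate_prev) / frame_rate * frames_left
    return remaining_bits + rate_change - bits_spent
```

With a constant channel rate, the second function only subtracts the bits consumed by the frame just encoded, which corresponds to the simplified form mentioned above.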
  • the initial quantization parameter of the ith GOP needs to be determined.
  • the initial quantization parameters of the I frame and the first P frame of other GOPs except the first GOP in the video sequence can be calculated by the following formula (7):
  • $QP_{st}(i)$ represents the initial quantization parameter of the $i$-th GOP
  • $Sum_{PQP}(i-1)$ represents the sum of the quantization parameters of all P frames in the $(i-1)$-th GOP
  • $N_{(i-1)p}$ represents the number of P frames in the $(i-1)$-th GOP
  • $T_r(n_{i-1,N_i})$ represents the number of bits still available in the $(i-1)$-th GOP after encoding the last frame of the $(i-1)$-th GOP
  • $T_r(n_{i,0})$ represents the number of bits available in the $i$-th GOP after encoding the 0th frame of the $i$-th GOP
  • $N_{i-1}$ represents the number of image frames contained in the $(i-1)$-th GOP.
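For reference, the symbols defined above fit the form in which the JVT-G012 proposal gives the initial quantization parameter. The expression below is a reconstruction from those symbol definitions and should be read as an assumption; in particular, the constants 1, 8 and 15 are taken from JVT-G012 rather than from this text.

```latex
QP_{st}(i) = \frac{Sum_{PQP}(i-1)}{N_{(i-1)p}} - 1
             - \frac{8\,T_r(n_{i-1,N_i})}{T_r(n_{i,0})}
             - \frac{N_{i-1}}{15}
```

Read this way, the initial QP of a GOP starts from the average P-frame QP of the previous GOP and is lowered when the previous GOP left a large share of its bit budget unused.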
  • Frame-level rate control includes two stages: a pre-encoding stage and a post-encoding stage.
  • the main task of the pre-encoding stage is to calculate quantization parameters for all coded frames, including P frames and B frames. Since a B frame is usually not used as a reference frame, its QP can be obtained by simple linear interpolation of the QPs of the adjacent frames. A P frame serves as the reference frame of subsequent frames, so its QP must be calculated accurately. Therefore, the calculation methods of the quantization parameters of different frame types should be considered separately.
  • the quantization parameter of the $i$-th B frame is obtained as follows, according to the following two cases:
  • $Tbl(n_{i,j})$ is the target buffer level of the $j$-th P frame image in the $i$-th GOP, and the two averaged terms are the average coding complexities of P frames and B frames, respectively
  • $u(n_{i,j})$ represents the available channel transmission rate before encoding the $j$-th frame image of the $i$-th GOP
  • $B_s$ is the buffer size
  • $N_{p(i-1)}$ is the number of P frames in the $(i-1)$-th GOP.
  • the coding complexity of the image can be calculated by formula (13):
  • the number of bits allocated for the jth frame of the ith GOP is determined by the target occupancy of the buffer, the encoding frame rate, the available channel bandwidth and the actual occupancy of the buffer:
  • the coefficient is a constant whose value is 0.25 when B frames are inserted in the GOP and 0.75 otherwise
  • $u(n_{i,j})$ represents the available channel transmission rate before encoding the $j$-th frame image of the $i$-th GOP
  • $F_r$ is the encoding frame rate
  • $Tbl(n_{i,j})$ is the target buffer level of the $j$-th frame image in the $i$-th GOP
  • $B_c(n_{i,j})$ represents the actual occupancy of the buffer after encoding the $j$-th frame image in the $i$-th GOP.
  • the number of bits allocated to the $j$-th frame image also takes into account the bits remaining in the $i$-th GOP:
  • N p,r (j-1) and N b,r (j-1) respectively represent the number of remaining uncoded P frames and B frames in the current GOP.
  • the MAD value of the current frame is predicted from the actual MAD of the previous frame through a linear prediction model, and the quantization parameter of the $j$-th frame image $n_{i,j}$ in the $i$-th GOP is then calculated according to the quadratic rate-distortion model, where $i$ and $j$ are both positive integers.
  • $f(n_{i,j})$ is the number of bits allocated to the $j$-th frame image in the $i$-th GOP, $d_1$ and $d_2$ are constants, $MAD_{predict}(n_{i,j})$ is the predicted MAD value, and the quantization step size calculated by the rate-distortion model can then be converted into the quantization parameter QP.
  • the difference between the quantization parameters of two adjacent frames should not be greater than 2, so the quantization parameter of image $n_{i,j}$ is adjusted to
  • $Q_{pp}$ is the quantization parameter of the previous frame image $n_{i,j-1}$ in the $i$-th GOP; the final quantization parameter of image $n_{i,j}$ is limited to:
  • the parameters in the linear prediction model and the parameters in the quadratic rate distortion model may be updated according to the error between the predicted MAD value of the image n i, j and the actual MAD value of the image n i, j.
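The chain from allocated bits to a clamped QP can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the formulas of the application: the rate model is assumed to have the form f = d1·MAD/Q + d2·MAD/Q², the conversion from step size to QP uses the H.264 property that the step size doubles every 6 QP with a step of 1.0 at QP 4, and the final range of 0 to 51 is the valid H.264 QP range. The function names and the default predictor coefficients are hypothetical.

```python
import math


def predict_mad(prev_actual_mad, a1=1.0, a2=0.0):
    """Linear MAD prediction; a1 and a2 are illustrative starting values."""
    return a1 * prev_actual_mad + a2


def quantization_step(target_bits, mad, d1, d2):
    """Solve target_bits = d1*mad/q + d2*mad/q**2 for the step size q > 0."""
    if target_bits <= 0 or mad <= 0:
        raise ValueError("target_bits and mad must be positive")
    if d2 == 0:
        return d1 * mad / target_bits
    # Quadratic in x = 1/q: d2*mad*x**2 + d1*mad*x - target_bits = 0.
    disc = (d1 * mad) ** 2 + 4 * d2 * mad * target_bits
    x = (-d1 * mad + math.sqrt(disc)) / (2 * d2 * mad)
    return 1.0 / x


def step_to_qp(q_step):
    """Approximate H.264 mapping: step doubles every 6 QP, step 1.0 at QP 4."""
    return round(6 * math.log2(q_step) + 4)


def clamp_qp(qp, prev_qp, max_delta=2, qp_min=0, qp_max=51):
    """Limit the change from the previous frame, then limit the range."""
    qp = min(prev_qp + max_delta, max(prev_qp - max_delta, qp))
    return min(qp_max, max(qp_min, qp))
```

For example, a model output of QP 10 following a frame coded at QP 5 would be limited to 7 by the adjacent-frame constraint.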
  • the predicted occupancy of the buffer (also called the new occupancy of the buffer) is determined by the number of bits $A(n_{i,j})$ actually generated by the image $n_{i,j}$, the current occupancy of the buffer, and the amount of data that the channel can transmit within the time the encoder takes to encode one frame.
  • a frame skipping technique needs to be adopted to avoid excessive new occupancy or even overflow of the buffer.
  • the number of skipped frames N post is initialized to 0, and then increases continuously until the following conditions are met:
  • the left-hand term of the condition represents the predicted occupancy of the buffer after the image is encoded; $j$ represents the frame number at which frame skipping starts; $j+N_{post}$ represents the image frame that needs to be discarded.
  • the occupancy of the buffer can be calculated by the following formula (22):
  • j represents the frame sequence number of the start frame skipping
  • l is a positive integer
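The frame-skipping loop can be sketched in Python as follows. The stopping condition is not reproduced in this text, so the sketch assumes the JVT-G012 convention that frames are skipped until the predicted buffer occupancy falls below 80% of the buffer size, with the buffer draining by the channel rate divided by the frame rate for each skipped frame. The function name, the `threshold` default and the `max_skip` guard are hypothetical.

```python
def frames_to_skip(buffer_occupancy, channel_rate, frame_rate, buffer_size,
                   threshold=0.8, max_skip=1000):
    """Return (n_post, predicted_occupancy) after skipping n_post frames.

    threshold=0.8 is an assumption taken from JVT-G012; max_skip only
    guards against a non-draining channel in this illustration.
    """
    drain_per_frame = channel_rate / frame_rate  # bits sent per frame period
    n_post = 0
    while buffer_occupancy >= threshold * buffer_size and n_post < max_skip:
        buffer_occupancy -= drain_per_frame  # one skipped frame adds no bits
        n_post += 1
    return n_post, buffer_occupancy
```

Each skipped frame contributes no new bits, so the predicted occupancy only decreases while the channel keeps transmitting.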
  • the JVT-G012 rate control algorithm does not consider the differences in coding complexity among P frames when performing bit allocation at the frame level. That is, it assumes that every P frame in the same GOP has the same coding complexity and allocates coding resources evenly to each P frame. However, in an actual video sequence, the coding complexity of each frame varies with the magnitude and amount of motion it contains. An even allocation strategy not only causes fluctuations in the PSNR curve of the frames within the GOP, but also lowers the average PSNR of the entire sequence, reducing the quality of the encoded video as a whole.
  • an embodiment of the present application proposes a video coding method based on coding complexity, which optimizes the step of calculating the number of bits of P frames in the frame-level rate control of the JVT-G012 method.
  • bit allocation at the frame layer is performed according to coding complexity: coding bits are saved from low-complexity frames and used for high-complexity frames. This can reduce the fluctuation of the PSNR curve of the frames in the group of pictures while keeping the average encoding bit rate close to the target bit rate, thereby improving the quality of the encoded video.
  • An embodiment of the present application provides a video encoding method. As shown in FIG. 4 , the method may include the following steps 101 and 102 . The method is exemplarily described below by taking a video encoding device as an execution subject as an example.
  • Step 101 The video encoding apparatus determines the second number of bits for encoding the first image according to the first ratio, the first number of bits and the first number.
  • Step 102 The video encoding apparatus encodes the first image based on the second number of bits.
  • the first ratio is the ratio of the predicted coding complexity of the first image to the actual coding complexity of M frames of the second image.
  • the first image is the first uncoded frame image in the target image group, and the above-mentioned M frames of second images are the coded images in the target image group; the first number of bits is the number of bits remaining in the target image group, the first number is the number of uncoded images in the target image group, and M can be an integer greater than 1.
  • the second number of bits is the number of bits configured by the video encoding apparatus for the first image, that is, the second number of bits is the target number of bits of the first image.
  • the first ratio may be used to represent the relative coding complexity of the first image to be coded relative to the M frames of the coded second images in the target image group.
  • the first image, the M second images and the first number are determined according to the encoding progress of the target image group.
  • for example, the target image group includes 10 frames of images, namely image 1 to image 10, and image 3 is the most recently encoded image. In this case, the first image is image 4, the M frames of second images are image 1 to image 3, and the first number is 7 (image 4 to image 10).
  • after image 4 is encoded, image 5 becomes the first uncoded frame image in the target image group, so the video encoding apparatus can treat image 5 as the new first image and re-execute the above steps 101 and 102, and so on, until the encoding of image 10 is completed.
  • the video encoding apparatus can then proceed to encode the next GOP.
  • the first ratio can indicate the relative coding complexity between the first image and the M frames of coded second images in the target image group. That is, the method can determine the number of bits for encoding the image to be encoded according to the relative coding complexity between that image and the coded images in the target image group, the number of bits remaining in the target image group, and the number of frames remaining in the target image group. It is therefore possible to save coding bits from images with low coding complexity in the target GOP and use the saved bits for images with high coding complexity, so that, on the premise that the average coding rate stays close to the target rate, the fluctuation of the PSNR curve of the frames in the group of pictures is reduced and the quality of the encoded video is improved.
  • step 101 may be specifically implemented by the following steps 101a and 101b.
  • Step 101a the video encoding apparatus determines a weighting parameter corresponding to the first ratio according to the first ratio.
  • the target picture group is the ith GOP in the video to be encoded
  • the first picture is the jth frame picture in the target picture group
  • the first ratio is $MAD_{ratio}(n_{i,j})$
  • $a$ and $b$ are two encoding parameters set according to the available channel resources (for example, the available channel transmission rate before encoding the first image) and the coding complexity of the target image group; $a$ represents the average coding complexity of the target image group, and $b$ is the magnitude of the adjustment applied to the weighting parameter $W_{MAD}(n_{i,j})$.
  • $W_{MAD}(n_{i,j}) = \min\{S_{high}, \max\{S_{low}, W_{MAD}(n_{i,j})\}\}$;
  • S high represents the upper boundary of the adjustment range of the buffer, which is used to prevent high-complexity frames from excessively occupying coding resources.
  • S low represents the lower boundary of the adjustment range of the buffer, which is used to avoid video quality degradation caused by too few coding resources occupied by low-complexity frames.
  • Step 101b The video encoding apparatus determines the second number of bits for encoding the first image according to the weighting parameter, the first number of bits and the first number.
  • $T_r(n_{i,j})$ is the number of remaining available bits in the target image group before encoding the first image (the first number of bits)
  • $G(n_{i,j})$ is the number of uncoded images in the target image group before encoding the first image (the first number)
  • $W_{MAD}(n_{i,j})$ is the weighting parameter corresponding to the first ratio.
  • the weighting parameter corresponding to the first ratio, which represents the relative coding complexity of the first image with respect to the coded images in the target image group, can be determined first, and the number of bits for encoding the first image can then be determined according to the weighting parameter, the number of remaining bits and the number of uncoded images. That is, the number of bits for encoding an image is determined based on the relative coding complexity among the frames in the group of images.
  • therefore, compared with the method of average allocation, the video encoding method provided by the embodiments of the present application can better suppress the fluctuation of video quality between frames after encoding.
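The complexity-weighted allocation of steps 101a and 101b can be sketched in Python as follows. The mapping from the first ratio to the unclamped weight (which uses the parameters a and b) is not reproduced in this text, so the sketch takes the unclamped weight as an input. It further assumes that the allocation multiplies the clamped weight by the average number of bits per remaining frame. The function names and the default bounds for S_low and S_high are hypothetical.

```python
def clamp_weight(weight, s_low, s_high):
    """W = min(S_high, max(S_low, W)), as in the clamping step above."""
    return min(s_high, max(s_low, weight))


def complexity_bits(weight, remaining_bits, remaining_frames,
                    s_low=0.5, s_high=2.0):
    """Bits for the next frame from the relative-complexity point of view.

    Assumes bits = W * T_r / G, i.e. the clamped weight scales the average
    share per uncoded frame; s_low and s_high are illustrative values.
    """
    if remaining_frames <= 0:
        raise ValueError("remaining_frames must be positive")
    w = clamp_weight(weight, s_low, s_high)
    return w * remaining_bits / remaining_frames
```

A weight of 1 reproduces the even split of JVT-G012, a weight above 1 gives a more complex frame a larger share, and the bounds keep any single frame from draining or starving the GOP budget.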
  • the number of bits for the image to be encoded (for example, the above-mentioned first image) may be determined according to the relative coding complexity between the image to be encoded and the coded images, the number of remaining bits, the number of remaining frames, and the buffer status, so as to avoid buffer overflow and underflow.
  • step 101 may be specifically implemented by the following step 101c.
  • Step 101c The video encoding apparatus determines the second number of bits for encoding the first image according to the first ratio, the first number of bits, the first number and the target parameter.
  • the target parameters include the estimated occupancy of the buffer, the actual occupancy of the buffer, the encoding frame rate, and the available channel transmission rate before encoding the first image.
  • the transmission rate of the available channels before encoding each frame is the same.
  • since the second number of bits for encoding the first image can be determined according to the first ratio, the first number of bits, the first number, and the target parameter, fluctuations in inter-frame encoding quality can be suppressed, and overflow or underflow of the buffer occupancy can be avoided. In this way, the quality of the encoded video can be further improved.
  • step 101c may be specifically implemented through the following steps A, B and C.
  • Step A The video encoding apparatus determines the third number of bits according to the first ratio, the first number of bits and the first number.
  • in step A, the number of bits for encoding the first image is determined based on the relative coding complexity among the images in the target image group.
  • the video encoding apparatus may first determine the weighting parameter corresponding to the first ratio according to the first ratio, and then determine the third number of bits according to the weighting parameter, the first number of bits, and the first number (see formula (25)). For details, refer to the related descriptions of step 101a and step 101b, which are not repeated here.
  • Step B The video encoding apparatus determines the fourth bit number according to the target parameter.
  • the fourth number of bits is the number of bits for encoding the first image determined based on the buffer occupancy of the encoder.
  • that is, the fourth number of bits for encoding the current frame can be determined from the perspective of the buffer occupancy of the encoder.
  • $u(n_{i,j})$ represents the available channel transmission rate before encoding the first image
  • $F_r$ is the encoding frame rate
  • $\gamma_1$ is a constant whose value is 0.75.
  • Step C the video encoding apparatus performs weighted summation of the third bit number and the fourth bit number to obtain the second bit number.
  • $T_c(n_{i,j})$ is the number of bits for encoding the first image determined from the perspective of relative coding complexity (refer to the above formula (25) for details)
  • $\beta_1$ is a weighting parameter that determines how much weight each of the two aspects receives when determining the number of bits for encoding the image.
  • $\beta_1$ is a constant within a preset value range.
  • the third bit number for encoding the first image can be determined from the perspective of relative coding complexity
  • the fourth bit number for encoding the first image can be determined from the perspective of buffer occupancy
  • the weighted sum of the third bit number and the fourth bit number is used as the final number of bits for encoding the first image. This not only improves the quality of high-complexity images after encoding, but also smooths the PSNR curve of the frames in the target image group and reduces its fluctuation, so that the average PSNR of the entire encoded video sequence, and hence the quality of the encoded video, can be improved.
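Steps B and C can be sketched together in Python. This is a minimal sketch under stated assumptions: the buffer-based term is assumed to follow the JVT-G012 form (channel rate divided by frame rate, plus a constant times the gap between the target buffer level and the actual occupancy), and the final allocation is assumed to be a convex combination of the two candidates. The function names, the `gamma` and `beta` defaults, and the choice of which term `beta` multiplies are hypothetical.

```python
def buffer_bits(channel_rate, frame_rate, target_level, actual_occupancy,
                gamma=0.75):
    """Candidate bit count from the buffer point of view (step B).

    gamma=0.75 follows the constant mentioned in the text; the overall form
    is an assumption based on JVT-G012.
    """
    return channel_rate / frame_rate + gamma * (target_level - actual_occupancy)


def blend_bits(complexity_candidate, buffer_candidate, beta=0.5):
    """Weighted sum of the two candidates (step C); beta is illustrative."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return beta * complexity_candidate + (1.0 - beta) * buffer_candidate
```

When the buffer is fuller than its target level, the buffer-based candidate shrinks and pulls the blended allocation down, which is how the combination guards against overflow while still favouring complex frames.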
  • step 102 may be specifically implemented by the following steps 102a and 102b.
  • Step 102a the video encoding apparatus determines a quantization parameter of the first image (hereinafter referred to as the target quantization parameter) through a quadratic rate-distortion model, based on the second number of bits and the predicted coding complexity of the first image.
  • Step 102b the video encoding apparatus encodes the first image according to the target quantization parameter.
  • the predicted coding complexity of the first image is represented by the predicted MAD value of the first image, which is predicted by the linear prediction model from the actual MAD value of the previous frame image of the first image (hereinafter referred to as the third image). The target quantization parameter is then obtained through the quadratic rate-distortion model according to the predicted coding complexity of the first image and the actual coding complexity of the third image.
  • the target quantization parameter can be predicted by the following formula (28).
  • f(n i,j ) is the number of bits to encode the first image
  • d 1 and d 2 are the parameters of the quadratic rate-distortion model
  • d 1 and d 2 are both constants
  • $MAD_{predict}(n_{i,j})$ represents the predicted coding complexity of the first image.
  • Q pp is the quantization parameter of the third image (which can be obtained after encoding the third image).
  • the target quantization parameter is finally limited to:
  • when the basic unit (BU) used in encoding differs, the method by which the video encoding apparatus encodes the first image according to the target quantization parameter may also differ.
  • in one case (for example, when the basic unit is a whole frame), the video encoding apparatus may directly use the target quantization parameter to encode the first image.
  • in another case (the basic unit BU is at least one macroblock, and the number of macroblocks in the BU is less than the number of macroblocks included in one frame of image), the video encoding apparatus needs to perform basic unit layer (that is, BU layer) rate control.
  • the method for performing the BU layer rate control by the video coding apparatus will be exemplarily described below.
  • the main object of the rate control of the BU layer is the P frame in the GOP.
  • the BU layer rate control algorithm can include the following five steps:
  • Step 1 Calculate the target number of bits of the BU to be coded, that is, allocate a number of bits to the BU to be coded.
  • let the number of bits remaining in the $i$-th frame image be $f_{rb}(n_{i,j})$, and the number of remaining (uncoded) BUs be $N_{ub}$;
  • the initial values of $f_{rb}(n_{i,j})$ and $N_{ub}$ are $f(n_{i,j})$ and $N_{unit}$, respectively
  • $f(n_{i,j})$ is the total number of bits allocated to the $i$-th frame image
  • $N_{unit}$ is the number of all BUs in the $i$-th frame image
  • the number of bits allocated to the first uncoded BU in the $i$-th frame image is $f_{rb}/N_{ub}$.
  • Step 2 Calculate the estimated header bit number m h of the c th BU in the ith frame image, where c is a positive integer, and the c th BU is the first uncoded BU in the ith frame image.
  • Step 3 Calculate the residual coefficient coding bit number R i (c) of the c-th BU in the i-th frame image:
  • Step 4 According to the MAD linear prediction model, the MAD value of the $c$-th BU in the $i$-th frame image (that is, the predicted MAD value of the $c$-th BU) is predicted from the MAD value of the target BU. The target BU is the BU in the $(i-1)$-th frame image whose position corresponds to that of the $c$-th BU in the $i$-th frame image, and the target BU has already been encoded. Then, according to the predicted MAD value of the $c$-th BU, the quadratic rate-distortion model is used to calculate the quantization step size for encoding, where the quadratic rate-distortion model is:
  • $\tilde{\sigma}_i(c)$ is the predicted MAD value of the $c$-th BU
  • $Q_{step,i}(c)$ is the quantization step size calculated by the quadratic rate-distortion model.
  • the quantization step size can be converted into a quantization parameter QP, which can be specifically determined according to actual usage requirements.
  • Step 5 According to the calculated quantization parameter, rate-distortion-optimized coding is performed on all macroblocks in the $c$-th BU. After the coding is completed, the remaining bits of the $i$-th frame image, the parameters of the MAD linear prediction model, and the parameters of the quadratic rate-distortion model are updated. For details, reference may be made to the relevant descriptions in the foregoing embodiments.
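The bit bookkeeping of the BU layer (steps 1, 3 and 5) can be sketched in Python as follows. The sketch assumes that the texture (residual) bits of a BU are its target bits minus the estimated header bits, floored at zero, which follows the JVT-G012 approach but is not spelled out in this text. The function name and the use of a fixed header estimate and a list of actual bit counts are hypothetical simplifications.

```python
def allocate_bu_bits(frame_bits, actual_bits_per_bu, header_bits):
    """Walk through the BUs of one frame and record each BU's allocation.

    actual_bits_per_bu stands in for the bits each BU really consumed once
    encoded (hypothetical input; a real encoder would measure this).
    Returns ([(target_bits, texture_bits), ...], bits_left_in_frame).
    """
    remaining = float(frame_bits)
    num_bus = len(actual_bits_per_bu)
    allocations = []
    for c, actual in enumerate(actual_bits_per_bu):
        bus_left = num_bus - c                       # N_ub before coding BU c
        target = remaining / bus_left                # step 1: f_rb / N_ub
        texture = max(target - header_bits, 0.0)     # step 3 (assumed form)
        allocations.append((target, texture))
        remaining -= actual                          # step 5: update f_rb
    return allocations, remaining
```

Because the remaining budget is re-divided among the uncoded BUs at every step, a BU that overspends automatically reduces the targets of the BUs that follow it.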
  • the video encoding method provided in the embodiment of the present application may further include the following step 103.
  • Step 103 The video coding apparatus determines the first ratio according to the predicted coding complexity of the first image and the average coding complexity of the M frames of second images.
  • the predicted coding complexity of the first image is represented by the predicted MAD value of the first image
  • the average coding complexity of M frames of second images may be represented by the average MAD value of M second images.
  • the first ratio MAD ratio (j) can be calculated by the following formula (26):
  • $MAD_{ratio}(j)$ is the MAD ratio value of the $j$-th P frame in the current GOP
  • $MAD_{predict}(j)$ is the MAD value of the $j$-th P frame predicted by the MAD linear prediction model
  • $MAD_{actual}(o)$ is the actual MAD value calculated after encoding the $o$-th frame in the current GOP (for example, the target group of pictures).
  • when the first ratio of an image is determined, the average coding complexity of the coded images in the GOP where the image is located is used as the reference. This keeps the encoded quality of the images within the same GOP close to one another, which reduces the fluctuation of the peak signal-to-noise ratio curve of the frames in the same GOP and improves the quality of the encoded video.
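The first ratio described in step 103 can be sketched in Python as the predicted MAD of the frame to be encoded divided by the mean of the actual MAD values of the frames already encoded in the same GOP. The function name is hypothetical, and the sketch assumes a plain arithmetic mean over all coded frames.

```python
def mad_ratio(predicted_mad, actual_mads):
    """Predicted MAD of the next frame over the mean actual MAD so far.

    actual_mads holds the actual MAD values of the frames already encoded
    in the current GOP (the M frames of second images).
    """
    if not actual_mads:
        raise ValueError("at least one coded frame is required")
    mean_actual = sum(actual_mads) / len(actual_mads)
    if mean_actual <= 0:
        raise ValueError("mean actual MAD must be positive")
    return predicted_mad / mean_actual
```

A ratio above 1 marks the next frame as more complex than the GOP average so far, and a ratio below 1 marks it as simpler.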
  • the execution body may be a video coding apparatus, or a control module in the video coding apparatus for executing the video coding method.
  • the video encoding device provided by the embodiments of the present application is described by taking the video encoding method performed by the video encoding device as an example.
  • FIG. 5 is a schematic structural diagram of a possible structure for implementing a video encoding apparatus provided by an embodiment of the present application.
  • the video encoding apparatus 50 may include: a determination module 51 and an encoding module 52 .
  • the determination module 51 can be used to determine the second bit number for encoding the first image according to the first ratio, the first bit number and the first number; the encoding module 52 can be used to encode the first image based on the second bit number determined by the determination module 51. The first ratio may be the ratio of the predicted coding complexity of the first image to the actual coding complexity of M frames of second images; the first image is the first uncoded frame image in the target image group, and the M frames of second images are coded images in the target image group; the first number of bits is the number of bits remaining in the target image group, the first number is the number of uncoded images in the target image group, and M is an integer greater than 1.
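  • the determining module 51 may be specifically configured to determine the weighting parameter corresponding to the first ratio according to the first ratio, and to determine the second bit number for encoding the first image according to the weighting parameter, the first number of bits and the first number.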
  • the determining module 51 may be specifically configured to determine the second number of bits to encode the first image according to the first ratio, the first number of bits, the first number and the target parameter; wherein the target parameter Including: the estimated occupancy of the buffer, the actual occupancy of the buffer, the encoding frame rate and the transmission rate of the available channel before encoding the first image.
  • the above-mentioned determination module 51 may include a first determination sub-module and a processing sub-module; the first determination sub-module may be configured to determine the third bit number according to the first ratio, the first number of bits and the first number, and to determine the fourth bit number according to the target parameter; the processing sub-module may be configured to perform a weighted summation of the third bit number and the fourth bit number determined by the first determination sub-module to obtain the second bit number.
  • the encoding module 52 may include a second determination submodule and an encoding submodule;
  • the second determination submodule can be used to determine the quantization parameter of the first image through a quadratic rate-distortion model, based on the second bit number and the predicted coding complexity of the first image;
  • the encoding sub-module may be configured to encode the first image according to the quantization parameter determined by the second determining sub-module.
  • the determining module 51 may also be configured to, before determining the second number of bits for encoding the first image according to the first ratio, the first number of bits and the first number, determine the first ratio according to the predicted coding complexity of the first image and the average coding complexity of the M frames of second images.
  • the first ratio can indicate the relative coding complexity between the first image and the M frames of coded second images in the target image group. That is, the video encoding method provided by the embodiment of the present application can determine the number of bits for encoding the image to be encoded according to the relative coding complexity between that image and the coded images in the target image group, the number of bits remaining in the target image group, and the number of frames remaining in the target image group. It is therefore possible to save coding bits from images with low coding complexity in the target GOP and use the saved bits for images with high coding complexity, so that, on the premise that the average coding rate stays close to the target rate, the fluctuation of the PSNR curve of the frames in the group of pictures is reduced and the quality of the encoded video is improved.
  • the video encoding apparatus in this embodiment of the present application may be an apparatus, or may be a component, an integrated circuit, or a chip in a terminal.
  • the apparatus may be a mobile electronic device or a non-mobile electronic device.
  • the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, an in-vehicle electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA).
  • the non-mobile electronic device may be a network attached storage (NAS) device, a personal computer (PC), a television (TV), a teller machine, a self-service machine, or the like, which is not specifically limited in the embodiments of the present application.
  • the video encoding apparatus in this embodiment of the present application may be an apparatus having an operating system.
  • the operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
  • the video encoding apparatus provided in the embodiments of the present application can implement each process implemented by the method embodiments in FIG. 1 to FIG. 4 , and to avoid repetition, details are not repeated here.
  • an embodiment of the present application further provides an electronic device 200, including a processor 202, a memory 201, and a program or instruction stored in the memory 201 and executable on the processor 202. When the program or instruction is executed by the processor 202, each process of the above video encoding method embodiments can be implemented, and the same technical effect can be achieved. To avoid repetition, details are not repeated here.
  • the electronic devices in the embodiments of the present application include the aforementioned mobile electronic devices and non-mobile electronic devices.
  • FIG. 7 is a schematic diagram of a hardware structure of an electronic device implementing an embodiment of the present application.
  • the electronic device 1000 includes but is not limited to: a radio frequency unit 1001, a network module 1002, an audio output unit 1003, an input unit 1004, a sensor 1005, a display unit 1006, a user input unit 1007, an interface unit 1008, a memory 1009, and components such as the processor 1010.
  • the electronic device 1000 may also include a power source (such as a battery) for supplying power to the various components. The power source may be logically connected to the processor 1010 through a power management system, so that charging, discharging and power consumption management are handled through the power management system.
  • the structure of the electronic device shown in FIG. 7 does not constitute a limitation on the electronic device.
  • the electronic device may include more or fewer components than those shown, combine some components, or arrange the components differently, which will not be repeated here.
  • the processor 1010 may be configured to determine the second number of bits for encoding the first image according to the first ratio, the first number of bits and the first number, and to encode the first image based on the second number of bits. The first ratio may be the ratio of the predicted coding complexity of the first image to the actual coding complexity of M frames of second images, where the first image is the first uncoded frame image in the target image group, and the M frames of second images are the coded images in the target image group; the first number of bits is the number of bits remaining in the target image group, the first number is the number of uncoded images in the target image group, and M is an integer greater than 1.
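  • the processor 1010 may be specifically configured to determine a weighting parameter corresponding to the first ratio according to the first ratio, and to determine the second bit number for encoding the first image according to the weighting parameter, the first number of bits and the first number.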
  • the processor 1010 may be specifically configured to determine the second number of bits to encode the first image according to the first ratio, the first number of bits, the first number and the target parameter; wherein the target parameter Including: the estimated occupancy of the buffer, the actual occupancy of the buffer, the encoding frame rate and the transmission rate of the available channel before encoding the first image.
  • the processor 1010 may be configured to determine the third number of bits according to the first ratio, the first number of bits and the first number; and determine the fourth number of bits according to the target parameter; and The third bit number and the fourth bit number are weighted and summed to obtain the second bit number.
  • the processor 1010 may be configured to determine the quantization parameter of the first image through a second rate distortion model based on the second number of bits and the predictive coding complexity of the first image; and according to this The quantization parameter encodes the first image.
  • the processor 1010 may be further configured to, before determining the second number of bits to encode the first image according to the first ratio, the first number of bits, and the first number, perform an encoding process according to the number of bits of the first image.
  • the predicted coding complexity and the average coding complexity of the M frames of the second image are used to determine the first ratio.
  • The first ratio indicates the relative coding complexity between the first image and the M coded frames of second images in the target group of pictures. The video encoding method provided by the embodiments of the present application can therefore determine the number of bits for the image to be encoded from that relative coding complexity, the number of bits remaining in the target group of pictures, and the number of frames remaining in the target group of pictures. Coding bits are saved on images with low coding complexity and spent on images with high coding complexity, so that, while the average coding rate stays close to the target rate, the fluctuation of the per-frame PSNR curve within the group of pictures is reduced and the quality of the encoded video is improved.
  • The input unit 1004 may include a graphics processing unit (GPU) 10041 and a microphone 10042. The GPU 10041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in a video capture mode or an image capture mode.
  • the display unit 1006 may include a display panel 10061, which may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like.
  • the user input unit 1007 includes a touch panel 10071 and other input devices 10072 .
  • the touch panel 10071 is also called a touch screen.
  • the touch panel 10071 may include two parts, a touch detection device and a touch controller.
  • Other input devices 10072 may include, but are not limited to, physical keyboards, function keys (such as volume control keys, switch keys, etc.), trackballs, mice, and joysticks, which will not be repeated here.
  • Memory 1009 may be used to store software programs as well as various data, including but not limited to application programs and operating systems.
  • the processor 1010 may integrate an application processor and a modem processor, wherein the application processor mainly processes the operating system, user interface, and application programs, and the like, and the modem processor mainly processes wireless communication. It can be understood that, the above-mentioned modulation and demodulation processor may not be integrated into the processor 1010.
  • The embodiments of the present application further provide a readable storage medium on which a program or an instruction is stored. When executed by a processor, the program or instruction implements each process of the above video encoding method embodiments and achieves the same technical effect.
  • the processor is the processor in the electronic device described in the foregoing embodiments.
  • the readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disk, and the like.
  • An embodiment of the present application further provides a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or an instruction to implement the above video encoding method embodiments.
  • The chip mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Algebra (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

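The present application discloses a video encoding method and apparatus, and an electronic device, and belongs to the field of communication technologies. The method includes: determining, according to a first ratio, a first number of bits, and a first quantity, a second number of bits for encoding a first image; and encoding the first image based on the second number of bits. The first ratio is the ratio of the predicted coding complexity of the first image to the actual coding complexity of M frames of second images; the first image is the first uncoded frame in a target group of pictures, and the M frames of second images are the coded images in the target group of pictures. The first number of bits is the number of bits remaining in the target group of pictures, the first quantity is the number of uncoded images in the target group of pictures, and M is an integer greater than 1.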

Description

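Video encoding method and apparatus, and electronic device
Cross-reference to related applications
This application claims priority to Chinese Patent Application No. 202110454418.X, filed in China on April 26, 2021, the entire contents of which are incorporated herein by reference.
Technical field
The present application belongs to the field of communication technologies, and specifically relates to a video encoding method and apparatus, and an electronic device.
Background
Video encoding is a data compression method for digital video. Its goal is to remove redundancy from the original video images and so reduce storage and transmission costs, while keeping the distortion of the encoded images as low as possible at a given coding bit rate, in order to improve the quality of the encoded video.
At present, video can be encoded using the JVT-G012 rate control algorithm of the H.264/AVC video coding standard. JVT-G012 implements three levels of rate control, at the group of pictures (GOP) level, the frame level, and the macroblock level, so its control functions are fairly comprehensive.
However, when a group of pictures in a video sequence contains no B-frames (bidirectional difference frames), JVT-G012 allocates bits to the P-frames of that group evenly. In other words, JVT-G012 does not take coding complexity into account when allocating bits at the frame level. This can make the peak signal-to-noise ratio (PSNR) curve of the frames within the GOP fluctuate and lower the average PSNR of the whole video sequence, so the quality of the encoded video is poor.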
发明内容
本申请实施例的目的是提供一种视频编码方法、装置和电子设备,能够解决在帧层面没有考虑编码复杂度,使得编码后的视频的平均峰值信噪比PSNR的下降,从而导致编码后的视频的质量较差的问题。
第一方面,本申请实施例提供了一种视频编码方法,该方法包括:根据第一比值、第一比特数和第一数量,确定编码第一图像的第二比特数;基于第二比特数,编码第一图像;其中,第一比值为第一图像的预测编码复杂度与M帧第二图像的实际编码复杂度的比值,第一图像为目标图像组中未编码的第一帧图像,该M帧第二图像为目标图像组中已编码的图像;第一比特数为目标图像组中剩余的比特数,第一数量为目标图像组中未编码图像的数量,M为大于1的整数。
第二方面,本申请实施例提供了一种视频编码装置,该装置包括:确定模块和编码模块;确定模块,用于根据第一比值、第一比特数和第一数量,确定编码第一图像的第二比特数;编码模块,用于基于确定模块确定的第二比特数,编码第一图像;其中,第一比值为第一图像的预测编码复杂度与M帧第二图像的实际编码复杂度的比值,第一图像为目标图像组中未编码的第一帧图像,该M帧第二图像为目标图像组中已编码的图像;第一比特数为目标图像组中剩余的比特数,第一数量为目标图像组中未编码图像的数量,M为大于1的整数。
第三方面,本申请实施例提供了一种电子设备,该电子设备包括处理器、存储器及存储在所述存储器上并可在所述处理器上运行的程序或指令,所述程序或指令被所述处理器执行时实现如第一方面所述的方法的步骤。
第四方面,本申请实施例提供了一种可读存储介质,所述可读存储介质上存储程序或指令,所述程序或指令被处理器执行时实现如第一方面所述的方法的步骤。
第五方面,本申请实施例提供了一种芯片,所述芯片包括处理器和通信接口,所述通信接口和所述处理器耦合,所述处理器用于运行程序或指令,实现如第一方面所述的方法。
在本申请实施例中,可以根据第一比值、第一比特数和第一数量,确定编码第一图像的第二比特数;并基于第二比特数,编码第一图像;其中,第一比值为第一图像的预测编码复杂度与M帧第二图像的实际编码复杂度的比值,第一图像为目标图像组中未编码的第一帧图像,M帧第二图像为目标图像组中已编码的图像;第一比特数为目标图像组中剩余的比特数,第一数量为目标图像组中未编码的图像的数量,M为大于1的整数。通过该方案,由于第一比值可以指示第一图像与目标图像组中已编码的M帧第二图像之间的相对编码复杂度,即本申请实施例提供的视频编码方法可以根据目标图像组中待编码图像与已编码图像之间的相对编码复杂度、目标图像组中剩余的比特数和目标图像组中剩余的帧数,为待编码图像分配比特数,因此可以实现从目标图像组中的低编码复杂度的图像节省编码比特,并将节省的编码比特用于对高编码复杂度的图像的编码,从而可以在保持平均编码码率接近目标码率(平均编码码率)的前提下,减小图像组中的各帧图像PSNR曲线的波动,进而可以提高编码后的视频的质量。
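Brief description of the drawings
FIG. 1 is a diagram of the basic framework of rate control in video encoding;
FIG. 2 is a schematic diagram of a buffer;
FIG. 3 is a diagram of the general structure of a rate control algorithm;
FIG. 4 is a flowchart of the video encoding method provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of a video encoding apparatus provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of an electronic device provided by an embodiment of the present application;
FIG. 7 is a hardware schematic diagram of an electronic device provided by an embodiment of the present application.
Detailed description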
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚地描述,显然,所描述的实施例是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员获得的所有其他实施例,都属于本申请保护的范围。
本申请的说明书和权利要求书中的术语“第一”、“第二”等是用于区别类似的对象,而不用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便本申请的实施例能够以除了在这里图示或描述的那些以外的顺序实施。且“第一”、“第二”等所区分的对象通常为一类,并不限定对象的个数,例如第一对象可以是一个,也可以是多个。此外,说明书以及权利要求中“和/或”表示所连接对象的至少其中之一,字符“/”,一般表示前后关联对象是一种“或”的关系。
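The following first explains some terms used in the claims and the description of the present application.
Basic unit (BU): a set of one or more macroblocks (MBs). The number of MBs in a BU must divide the number of MBs in a frame evenly. For example, in a QCIF video sequence where a frame contains 99 MBs, a BU may contain 99, 33, 11, 9, 3, or 1 MBs, so the frame contains 1, 3, 9, 11, 33, or 99 BUs, respectively.
It follows that a BU may be a single MB, a slice, a field, or a whole frame.
For example, suppose a BU consists of at least one macroblock. If an image consists of a MBs and a BU consists of b consecutive MBs, then c = a/b, where c is the total number of BUs in the image and a, b, and c are all positive integers.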
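The divisibility rule above can be checked with a short sketch. The function names `valid_bu_sizes` and `bu_count` are illustrative only and do not come from the application.

```python
def valid_bu_sizes(mbs_per_frame):
    """Return every BU size (in macroblocks) that divides the frame evenly."""
    return [b for b in range(1, mbs_per_frame + 1) if mbs_per_frame % b == 0]


def bu_count(mbs_per_frame, mbs_per_bu):
    """Return c = a / b, the number of BUs in a frame of a MBs with b MBs per BU."""
    if mbs_per_bu <= 0 or mbs_per_frame % mbs_per_bu != 0:
        raise ValueError("BU size must be a positive divisor of the MB count")
    return mbs_per_frame // mbs_per_bu


if __name__ == "__main__":
    # QCIF example from the text: 99 macroblocks per frame.
    print(valid_bu_sizes(99))
    print(bu_count(99, 11))
```

For the 99-MB QCIF frame this lists the same six BU sizes named in the text.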
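Note that all MBs in a BU are encoded with the same quantization parameter (QP). The more MBs a BU contains, the larger the BU, the lower the computational complexity of encoding it, and the lower the control precision; the fewer MBs a BU contains, the smaller the BU, the higher the computational complexity, and the higher the control precision. Real-time applications usually choose larger BUs, for example all the MBs in one row of the image, or the whole frame.
Fluid flow traffic model: used to compute the target bits for the current frame, that is, the number of bits allocated to the current frame.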
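Specifically, let N denote the number of images in a GOP of a video sequence, where N is an integer greater than 1; let n_{i,j} (i = 1, 2, ...; j = 1, 2, ..., N) denote the j-th frame of the i-th GOP of the video sequence (hereinafter, image j); and let B_c(n_{i,j}) denote the actual occupancy of the buffer after image n_{i,j} is encoded. Then: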
Figure PCTCN2022088950-appb-000001
Figure PCTCN2022088950-appb-000002
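B_c(n_{i+1,0}) = B_c(n_{i,N})
In formula (1) above, A(n_{i,j}) is the actual number of bits produced by encoding image n_{i,j}, u(n_{i,j-1}) is the instantaneous channel bandwidth before image n_{i,j} is encoded, F_r is the encoding frame rate, and B_s is the buffer size; the maximum buffer occupancy is determined by the profile and level;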
Figure PCTCN2022088950-appb-000003
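denotes the actual buffer occupancy after the first frame of the first GOP is encoded, where a_0 is a constant, usually 8; B_c(n_{i+1,0}) = B_c(n_{i,N}) means that the actual buffer occupancy before the first frame of the (i+1)-th GOP is encoded equals the actual buffer occupancy after the last frame of the i-th GOP is encoded.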
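The image for formula (1) is not reproduced in this text, so the following is only a sketch of one per-frame buffer update that is consistent with the symbols defined above. It assumes the occupancy grows by the bits just produced, shrinks by the amount the channel drains in one frame interval (u / F_r), and is clamped to the range [0, B_s]. The clamp and the function name `update_buffer` are assumptions, not taken from the application.

```python
def update_buffer(bc, actual_bits, channel_rate, frame_rate, buffer_size):
    """Assumed update: Bc_next = min(max(0, Bc + A - u / Fr), Bs)."""
    if frame_rate <= 0:
        raise ValueError("frame_rate must be positive")
    drained = channel_rate / frame_rate  # bits sent during one frame interval
    return min(max(0.0, bc + actual_bits - drained), buffer_size)


if __name__ == "__main__":
    # 3000 bit/s channel at 30 frame/s drains 100 bits per frame interval.
    print(update_buffer(1000, 500, 3000, 30, 2000))
```

Carrying the returned value into the next call models the GOP boundary condition B_c(n_{i+1,0}) = B_c(n_{i,N}) without any extra handling.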
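Buffer: also called a buffer register. It temporarily stores data sent from a peripheral (for example, the encoder) so that the data can be transmitted over the channel bandwidth. In the embodiments of the present application, the buffer area refers to the storage area of this buffer.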
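MAD linear prediction model: used to predict the MAD of the j-th frame from the actual MAD of the (j-1)-th frame, or to predict the MAD of a basic unit in the j-th frame from the MAD of the basic unit at the corresponding position in the (j-1)-th frame, where j is an integer greater than 1.
For example, consider predicting the MAD of basic unit BU1 in the j-th frame with the linear prediction model. Let BU1 correspond to BU2 at the same position in the (j-1)-th frame, and let the MAD of BU1 be MAD_cb and the MAD of BU2 be MAD_pb. The MAD linear prediction model can then be expressed by formula (2):
MAD_cb = a_1 * MAD_pb + a_2;       (2)
In formula (2) above, a_1 and a_2 are the two parameters of the MAD linear prediction model; their initial values are 1 and 0, respectively, and they are updated after each BU is encoded. Note that a_1 and a_2 can be updated according to the difference between the predicted and actual MAD values; the specific method depends on actual requirements and is not limited here.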
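Formula (2) is a one-line linear predictor. The sketch below shows only the prediction step, with the initial parameter values 1 and 0 given in the text; the function name `predict_mad` is illustrative. The parameter update is left out because the application does not fix a method for it.

```python
def predict_mad(mad_prev, a1=1.0, a2=0.0):
    """Formula (2): predicted MAD of the current BU from the co-located BU's MAD."""
    return a1 * mad_prev + a2


if __name__ == "__main__":
    # With the initial parameters the prediction equals the previous MAD.
    print(predict_mad(4.0))
    print(predict_mad(4.0, a1=0.5, a2=1.0))
```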
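MAD of an image: the mean absolute difference between the YUV values (for example, the Y values) of the current frame and those of the preceding frame (which should be a P-frame or an I-frame).
In YUV, "Y" denotes luminance (luma), and "U" and "V" denote chrominance (chroma); U and V describe the color and saturation of the image.
MAD of a basic unit (BU): the mean absolute difference between the YUV values of one BU and those of another BU, where the other BU lies in the frame (for example, the (j-1)-th frame) preceding the frame that contains the first BU (for example, the j-th frame), the coordinates of the first BU in the j-th frame are the same as the coordinates of the other BU in the (j-1)-th frame, the two frames belong to the same group of pictures, and j is an integer greater than 1.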
下面结合附图,通过具体的实施例及其应用场景对本申请实施例提供的视频编码方法进行详细地说明。
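The bandwidth available for transmitting a video signal is usually limited. To transmit video data effectively within the channel bandwidth and transmission delay constraints while guaranteeing playback quality, the video encoding process needs rate control. Rate control means selecting suitable encoding parameters, such as the quantization parameter (QP), and encoding the corresponding image with those parameters, so that the bit rate of the encoded video signal satisfies the bandwidth limit while the coding distortion is as small as possible. Rate control is thus a typical multi-constraint, multi-objective rate-distortion optimization problem: subject to the total number of encoded bits of the video signal being less than or equal to R_c (the bit limit, or target bits), determine the optimal encoding parameter for each coding unit so that the total distortion is minimized, as expressed by formula (3):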
Figure PCTCN2022088950-appb-000004
其中,在上述公式(3)中,N为一个视频序列包含的图像的数量,D i为该视频序列中的第i帧图像的编码失真;R i为该视频序列中的第i帧图像的编码比特数,
Figure PCTCN2022088950-appb-000005
为该视频序列中各帧图像的最优编码参数(即量化参数QP),即
Figure PCTCN2022088950-appb-000006
为第1帧图像的最优编码参数、
Figure PCTCN2022088950-appb-000007
第2帧图像的最优编码参数、……,
Figure PCTCN2022088950-appb-000008
为第N帧图像的最优编码参数;Rc为该视频序列的目标编码比特数。
视频序列可以通过编码器进行编码,编码后的编码比特流通常需要通过通信信道进行传输。由于实际应用中的通信信道大多是恒定比特率CBR(Constant Bitrate,CBR)信道,而编码器输出的编码码流大多是变比特率VBR(Variable Bitrate,VBR)码流。因此,为了实现在CBR信道有效传输VBR码流,可以在编码器输出部分设置一个缓冲器,如此视频编码中码率控制基本框架如图1所示。
如图2所示,图2为缓冲器的示意图。图2中的A表示视频编码器输出至缓冲器的编码比特流,Bs表示缓冲器的缓冲区大小,Bc(即图2中的填充区域)是缓冲器的缓冲区中待发送的比特数,Cb是信道带宽,Fr是编码帧率,Cb/Fr表示在编码器编码一帧图像的时长内,通信信道传输的数据量。
下面对传统技术中的码率控制算法的原理进行示例性地说明。
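The goal of rate control is to obtain better video quality under limited bandwidth. Two problems must be solved to reach it: how to allocate the coding bits, and how to use the allocated bits to estimate the optimal encoding parameters. In other words, a rate control algorithm usually consists of two steps: bit allocation and QP estimation. Bit allocation distributes the limited resources among picture units such as GOPs, frames, and macroblocks. QP estimation takes the resources allocated to a picture unit (hereinafter, resource 0, that is, a number of bits) and estimates the optimal encoding parameter corresponding to resource 0, so that the distortion of the encoded video is minimized.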
码率控制算法一方面要求编码后的码流适合在带限信道(例如CBR信道)上传输;另一方面要求在有限的信道传输带宽下获得更好的视频质量。而判断视频质量的好坏,通常要考虑两方面:一是看整个序列所有帧的平均PSNR,平均PSNR更好的视频序列质量更好;二是看视频序列编码过程中PSNR曲线的变化情况,拥有更平滑的PSNR曲线的视频序列质量更好。
在传统的码率控制算法中,以上两个问题是在三个层次上进行研究的,这三个层次分别是:GOP层、帧层和BU层。在视频编码中,通常以GOP为单位,进行“三层两步”的码率控制,如图3所示。
一个GOP通常是由一个采用帧内预测编码的I帧开头,其后跟着若干个采用帧间预测编码的P帧和/或B帧。其中,I帧为GOP中的关键帧,属于帧内压缩,I帧的画面会完整保留,解码I帧时只需要本帧数据就可以完成。P帧为向前搜索帧,也称为差别帧或帧间压缩,P帧编码后表示的是当前帧与I帧或与当前帧之前的P帧的差别信息;解码P帧时,需要用当前帧之前的P帧或I帧缓存的画面叠加上本帧定义的编码的差别信息,重建当前帧的画面。B帧是双向差别帧,也就是说,编码后的B帧记录的是本帧(即当前帧)与前后帧的差别信息;换言之,要解码B帧,不仅要取得之前的缓存画面,还要解码之后的画 面,通过前后帧与本帧编码数据重建本帧图像。
根据上述描述可知,编码I帧产生的数据量远大于编码P帧和编码B帧所产生的数据量,因此在编码I帧之后,缓冲器的占用量Bc将达到较高水平,且该占用量Bc在编码I帧之后的P帧和B帧的过程中逐渐下降,在将一个GOP中的图像编码完成之后,缓冲器占用量可以恢复至编码该GOP之前的水平。
实际实现中,从GOP层到BU层,码率控制算法自上向下分配编码资源,并根据可用编码比特数确定量化参数QP。具体的:
GOP层码率控制的主要任务是为整个GOP分配编码比特数,分配的依据是当前GOP包含的帧数、编码器输出缓冲区占用量和信道带宽。然后需要计算出GOP起始I帧的QP;计算I帧QP的过程是在帧内预测帧与帧间预测帧之间分配编码资源的过程,在JVT-G012中,各GOP的I帧QP是根据前一GOP中的全部P帧的平均QP计算得到的,对第一个GOP,可以根据经验为第一个GOP中的I帧选择QP。
帧层码率控制是视频编码中的重要环节,不论是GOP层码率控制还是BU层码率控制,都是围绕着帧层码率控制进行的。帧层码率控制中首先要在GOP内部的各P帧之间以目标比特的形式分配编码比特,然后根据分配的编码比特数估计当前帧的QP。
在GOP层码率控制和帧层码率控制的编码比特分配中,通过设置I帧QP和各P帧的编码比特数的方式,完成I帧与P帧间、不同P帧间的编码比特数分配。在帧层码率控制的QP计算和BU层码率控制中,主要任务是通过为帧内各MB设置合适的QP,使编码产生的实际比特数和目标比特数相符。
下面以H.264/AVC视频编码推荐的JVT-F086码率控制算法和JVT-G012码率控制算法为例,对传统技术的码率控制方法进行示例性地说明。
JVT-F086码率控制算法和JVT-G012码率控制算法
I,JVT-F086码率控制算法以MPEG-2TM5码率模型为基础,根据缓冲器状态进行比特分配,尽量保证缓冲器既不上溢也不下溢。在JVT-F086码率控制算法中,首先需要在编码一帧图像前对该帧图像编码所需比特数进行估计,然后根据缓冲器的反馈预先假定一个QP,并按照该QP对该帧图像进行编码;然后再根据当前帧图像的实际编码结果,判断是否需要调整预先假定的QP,若需要调整,那么可以先调整QP,并按照调整后的QP对该帧图像进行再次编码处理;即在JVT-F086码率控制算法中,编码每一帧图像时,都需要判断是否重新给定QP,并按重新给定的QP再次编码该帧图像,从而JVT-F086的计算复杂度比较高。同时,JVT-F086码率控制算法从缓冲器饱和度方面控制码率,其对缓冲器控制的较好,缓冲器占用量变化比较平滑,但其编码后的视频质量波动比较大。
II,JVT-G012码率控制算法继承了MPEG-4VM8码率控制算法的思想,沿用二次率失真模型,可以根据信源特征及时地调整模型参数,JVT-G012码率控制算法的关键技术包括流量往返模型,MAD线性预测模型和二次率失真模型等。JVT-G012码率控制算法根据预先定义的比特率、帧率、缓冲器充盈程度和缓冲器目标线为当前帧分配目标编码比特,然后利用线性跟踪理论预测当前帧图像的MAD,最后由二次率失真模型计算当前帧图像的QP。JVT-G012码率控制算法利用预测MAD的方法解决了QP悖论问题,而且相比JVT-F086码率控制算法,仅需对待编码的每帧图像进行一次编码,因此JVT-G012码率控制算法的计算复杂度较低。进一步的,JVT-G012码率控制算法实现了GOP层、帧层、宏块层的三级码率控制,控制功能比较全面。
下面在对JVT-G012码率控制算法实现GOP层、帧层和宏块层的三级码率控制的过程进行详细地说明。
GOP层码率控制
对于第i个GOP,在编码第i个GOP中的第一帧图像前,先根据信道速率和缓冲器状态为第i个GOP分配目标比特数Tr(n i,0),Tr(n i,0)表示编码第i个GOP的第0帧图像后, 该GOP可用/剩余的比特数,即:
Figure PCTCN2022088950-appb-000009
上述公式(4)中,u(n i,1)表示编码第i个GOP的第一帧图像前的可用信道传输速率,N i为第i个GOP包含的图像帧的数量,B s为缓冲区大小,B c(n i-1,Ni)表示编码第i-1个GOP后缓冲区的实际占用量,F r表示编码帧率。
编码第i个GOP中的1帧图像后,更新一次Tr(n i,j):
Figure PCTCN2022088950-appb-000010
在上述公式(5)中,Tr(n i,j)表示编码图像n i,j后,第i个GOP中剩余可用的比特数,u(n i,j)表示编码图像n i,j前的可用信道传输速率,u(n i,j-1)表示编码图像n i,j-1前的可用信道传输速率,A(n i,j)为图像n i,j的实际编码比特数,i为正整数,j为大于1的整数。对于CBR信道,有u(n i,j)=u(n i,j-1),则上述公式(5)可以简化成公式(6):
Figure PCTCN2022088950-appb-000011
可以理解,对第i个GOP的比特数分配的过程,即为对第i个GOP进行GOP层码率控制的过程。在完成GOP层码率控制之后,还需要确定第i个GOP的初始量化参数。视频序列中的第一个GOP(即i=1)的初始量化参数是一个预定义的QP 0,且第一个GOP中的I帧和第一个P帧均使用QP 0编码。
视频序列中除第一个GOP之外的其它GOP的I帧和第一个P帧的初始量化参数可以通过下述的公式(7)计算得到:
Figure PCTCN2022088950-appb-000012
在上述公式(7)中,QP st(i)表示第i个GOP的初始量化参数,Sum PQP(i-1)表示第i-1个GOP中所有P帧的量化参数之和,N (i-1)p表示第i-1个GOP中的P帧的数量,T r(n i-1,Ni)表示编码第i-1个GOP的最后一帧图像后,第i-1个GOP中可用的比特数,T r(n i,0)表示编码第i个GOP的第0帧图像后,第i个GOP中可用的比特数,N i-1表示第i-1个GOP中包含的图像帧的数量,N (i-1)p,表示第i-1个GOP中包含的P帧的数量。
帧层码率控制
帧层码率控制包括两个阶段:编码前阶段和编码后阶段。
(一),编码前阶段
本阶段的主要任务是为所有编码帧包括P帧和B帧计算量化参数。由于B帧通常不做参考帧,它的QP可由相邻帧的QP通过简单的线性插值得到,而P帧要作为后续帧的参考帧,其QP的值要精确计算得到。因此,要分别考虑不同帧的量化参数的计算方法。
①,B帧的量化参数计算
假设相邻两个P帧之间的连续的B帧数是E(E为大于1的整数),相邻两个P帧的量化参数分别为QP 1和QP 2,那么第i个B帧的量化参数根据下面两种情况计算如下:
a,当E=1时,即在两个相邻P帧之间只有一个B帧,B帧的量化参数
Figure PCTCN2022088950-appb-000013
的计算公式为公式(8):
Figure PCTCN2022088950-appb-000014
b,当E>1时,即相邻两个P帧之间不止一个B帧,则B帧的量化参数的计算公式为公式(9):
Figure PCTCN2022088950-appb-000015
其中,在上述公式(9)中,
Figure PCTCN2022088950-appb-000016
为相邻两个P帧之间的第i(i为正整数)个B帧,α是相邻两个P帧之间的第1个B帧的量化参数和相邻两个P帧的量化参数QP之间的差值,由下式给定:
Figure PCTCN2022088950-appb-000017
在上述公式(10)中,QP 2-QP 1<-2E+1的情况只有在视频序列从一个GOP切换到另一个GOP时才会发生。结合公式(10),如下述的公式(11)所示,相邻两个P帧之间的第i个B帧最终的量化参数QB i根据H.264/AVC标准进一步调整如下:
Figure PCTCN2022088950-appb-000018
其中,公式(11)中的
Figure PCTCN2022088950-appb-000019
参见公式(10)中的
Figure PCTCN2022088950-appb-000020
②,P帧的量化参数计算
1),确定目标缓冲级别
由于一个GOP中的第一个P帧的量化参数已经由该GOP给出,因此只需要确定该GOP中的其它P帧的目标缓冲区级别。可以理解,在编码完第一个GOP中的一个P帧后,可以得到目标缓冲区的初始水平值为,Tbl(n i,2)=B c(n i,2),其中,B c(n i,2)为编码完第i个GOP中的第1个P帧之后,缓冲区的实际占用量。那么,第i个GOP中的第j(j为正整数)个P帧的目标缓冲区水平定义为:
Figure PCTCN2022088950-appb-000021
在上述公式(12)中,Tbl(n i,j)为第i个GOP中的第j帧P帧图像的目标缓冲区水平,
Figure PCTCN2022088950-appb-000022
Figure PCTCN2022088950-appb-000023
分别为P帧和B帧的平均编码复杂度,u(n i,j)表示编码第i个GOP的第j帧图像前的可用信道传输速率,B s为缓冲区大小,N p(i-1)为第i-1个GOP中的P帧的数量。图像的编码复杂度可以通过公式(13)计算:
Figure PCTCN2022088950-appb-000024
在上述公式(13)中,S p表示第i个GOP中的所有P帧实际编码产生的比特数,S b表示第i个GOP中的所有B帧实际编码产生的比特数,Q p表示第i个GOP中的所有P帧的平均量化参数,Q b表示第i个GOP中的所有B帧的平均量化参数。在两个P帧间没有B帧的情况下,上述公式(12)可以简化为下述的公式(14):
Figure PCTCN2022088950-appb-000025
由上述公式(14)容易得出,
Figure PCTCN2022088950-appb-000026
的值接近Bs/8。因此,如果缓存器的实际占用量和预先确定的缓存器占用量完全一样的话,就能保证每个GOP只用了自己的比特开销。但是由于率失真模型和MAD线性预测模型的不准确性,缓存器的实际占用量和预先确定的缓存器占用量之间经常有差值,所以需要微调来获得每一帧的目标比特数。
2),计算P帧目标比特数
根据线性跟踪理论,为第i个GOP的第j帧分配的比特数
Figure PCTCN2022088950-appb-000027
由缓存器的目标占用量、编码帧率、可用信道带宽和缓存器的实际占用量共同决定:
Figure PCTCN2022088950-appb-000028
其中,在上述公式(15)中,γ为一个常量,当GOP内插有B帧时,它的值为0.25,否则为0.75,u(n i,j)表示编码第i个GOP的第j帧图像时的可用信道传输速率,F r为编码帧率,Tbl(n i,j)为第i个GOP中的第j帧图像的目标缓冲区水平;B c(n i,j)表示编码第i个GOP中的第j帧图像后,缓冲区的实际占用量。同时,编码第i个GOP中的第j帧图像后的剩余比特
Figure PCTCN2022088950-appb-000029
也要被考虑到:
Figure PCTCN2022088950-appb-000030
其中,在上述公式(16)中,N p,r(j-1)和N b,r(j-1)分别表示当前GOP中剩余未编码的P帧和B帧数量。最终,第j帧的分配到的比特数由
Figure PCTCN2022088950-appb-000031
Figure PCTCN2022088950-appb-000032
加权相加得到:
Figure PCTCN2022088950-appb-000033
In formula (17), β is 0.9 when the i-th GOP contains B-frames and 0.5 when it does not.
3),计算P帧的量化参数QP并执行率失真RDO(Rate distortion optimization,RDO)优化
可选地,当前帧的MAD值由前一帧的实际MAD通过线性预测模型得到,然后根据二次率失真模型计算出一个GOP中的第i帧图像n i,j的量化参数
Figure PCTCN2022088950-appb-000034
i和j均为正整数。
Figure PCTCN2022088950-appb-000035
其中,在公式(18)中,f(n i,j)为第i个GOP中的第j帧图像的分配到的比特数,d 1和d 2为常数,MAD predict(n i,j)为预测的MAD值,
Figure PCTCN2022088950-appb-000036
为率失真模型计算得到的量化步长,然后可以将其转换为量化参数QP。
为了保证视频质量的连续性,相邻两帧图像的量化参数的差值应不大于2,从而将图像n i,j的量化参数调整为
Figure PCTCN2022088950-appb-000037
Figure PCTCN2022088950-appb-000038
其中,在上述公式(19)中,Q pp为第i个GOP中的第i-1帧图像n i,j的量化参数;最终图像n ij的量化参数被限制为:
Figure PCTCN2022088950-appb-000039
(二),编码后阶段
此阶段的主要任务有三个:更新线性预测模型中的参数、更新二次率失真模型中的参数,以及确定跳帧数目。
具体的,可以根据图像n i,j的预测MAD值与图像n i,j的实际MAD值之间的误差,更新线性预测模型中的参数和二次率失真模型中的参数。
在编完一帧图像(例如图像n i,j)以后,缓冲器的预测占用量(也可以称为缓冲器新的占用量,或缓冲器的预测占用量)由图像n i,j实际产生的比特数A(n i,j)和缓冲器的当前占用量,以及编码器编码一帧的时长内信道能传输的数据量确定。当遇到连续高复杂度编码帧时,需要采用跳帧技术来避免缓冲器新的占用量过高,甚至是溢出。跳帧的数量N post初始化为0,然后不断增加,直到满足下面的条件:
Figure PCTCN2022088950-appb-000040
其中,在上述公式(21)中,
Figure PCTCN2022088950-appb-000041
表示预测的编码图像
Figure PCTCN2022088950-appb-000042
后缓冲器的占用量,j表示启始跳帧的帧序号; j+Npost表示需求丢弃的图像帧。
缓冲器的占用量可以通过下述的公式(22)计算:
B c(n i,j+l+1)=B c(n i,j+l)-u(n i,j+l)/F r;1≤l<N post;    (22)
其中,上述公式(22)中,j表示启始跳帧的帧序号,l为正整数。
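In summary, formulas (12) to (17) show that when a GOP contains no B-frames, the JVT-G012 rate control algorithm does not consider the coding complexity of the individual P-frames when allocating bits at the frame level. In other words, it assumes that all P-frames in a GOP have the same coding complexity and allocates coding resources evenly among them. In real video sequences, however, the coding complexity of each frame varies with the magnitude and amount of motion it contains. An even allocation strategy therefore makes the per-frame PSNR curve within the GOP fluctuate and lowers the average PSNR of the whole sequence, reducing the overall encoding quality.
In rate control, an accurate estimate of the coding complexity of the controlled object is the basis for allocating resources reasonably and effectively. JVT-G012 assumes that all P-frames in a GOP have the same coding complexity and allocates coding resources evenly among them, whereas in real video the coding complexity of each frame varies with the magnitude and amount of motion it contains, so even allocation makes the quality of the compressed video fluctuate. To address this problem, the embodiments of the present application propose a video encoding method based on coding complexity, which optimizes the step of computing the number of bits for a P-frame in the frame-level rate control of JVT-G012.
Specifically, within a GOP of the video to be encoded, the video encoding method provided by the embodiments of the present application allocates frame-level bits according to coding complexity: it saves coding bits on low-complexity frames and spends them on high-complexity frames. While the average coding rate stays close to the target rate, this reduces the fluctuation of the per-frame PSNR curve within the group of pictures and improves the quality of the encoded video.
To bring the picture quality of the encoded frames of a video sequence closer together, each frame must be allocated a suitable number of coding bits according to its coding complexity; this allocation is usually done within each GOP. Allocating bits among the images of one GOP requires knowing their relative coding complexity, from which a weighting parameter is computed to correct the number of bits that the even allocation strategy of JVT-G012 frame-level rate control would assign.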
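An embodiment of the present application provides a video encoding method. As shown in FIG. 4, the method may include steps 101 and 102 below. The method is described with a video encoding apparatus as the executing entity.
Step 101: The video encoding apparatus determines, according to a first ratio, a first number of bits, and a first quantity, a second number of bits for encoding a first image.
Step 102: The video encoding apparatus encodes the first image based on the second number of bits.
The first ratio is the ratio of the predicted coding complexity of the first image to the actual coding complexity of M frames of second images. The first image is the first uncoded frame in the target group of pictures, and the M frames of second images are the coded images in the target group of pictures. The first number of bits is the number of bits remaining in the target group of pictures, the first quantity is the number of uncoded images in the target group of pictures, and M may be an integer greater than 1.
In the embodiments of the present application, the second number of bits is the number of bits that the video encoding apparatus allocates to the first image, that is, the target number of bits of the first image.
In the embodiments of the present application, the first ratio may represent the coding complexity of the first image to be encoded relative to the M coded frames of second images in the target group of pictures.
Note that the first image, the M second images, and the first quantity are determined by the encoding progress of the target group of pictures.
For example, suppose the target group of pictures contains 10 frames, image 1 through image 10, and image 3 is the most recently encoded image. Then the first image is image 4, the M (M = 3) second images are image 1, image 2, and image 3, and the first quantity is 7. After image 4 is encoded, image 5 becomes the first uncoded frame in the target group of pictures, so the video encoding apparatus takes image 5 as the new first image and performs steps 101 and 102 again, and so on until image 10 is encoded. The video encoding apparatus can then continue with the next group of pictures.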
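The bookkeeping in the 10-frame example can be written down directly. This is a minimal sketch; the function name `gop_progress` and its argument names are illustrative, and frames are numbered from 1 as in the text.

```python
def gop_progress(gop_size, last_encoded):
    """Return (first image index, M, first quantity) for a partly encoded GOP."""
    # M must be greater than 1, and at least one frame must remain uncoded.
    if not 2 <= last_encoded < gop_size:
        raise ValueError("need at least two coded frames and one uncoded frame")
    first_image = last_encoded + 1          # first uncoded frame
    m = last_encoded                        # number of coded second images
    first_quantity = gop_size - last_encoded  # number of uncoded frames
    return first_image, m, first_quantity


if __name__ == "__main__":
    # Example from the text: 10 frames, image 3 encoded most recently.
    print(gop_progress(10, 3))
```

Calling it again with `last_encoded` increased by one reproduces the step from image 4 to image 5 described above.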
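In the video encoding method provided by the embodiments of the present application, the first ratio indicates the relative coding complexity between the first image and the M coded frames of second images in the target group of pictures. The method can therefore determine the number of bits for the image to be encoded from that relative coding complexity, the number of bits remaining in the target group of pictures, and the number of frames remaining in the target group of pictures. Coding bits are saved on images with low coding complexity and spent on images with high coding complexity, so that, while the average coding rate stays close to the target rate, the fluctuation of the per-frame PSNR curve within the group of pictures is reduced and the quality of the encoded video is improved.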
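Optionally, in the embodiments of the present application, step 101 may be implemented through steps 101a and 101b below.
Step 101a: The video encoding apparatus determines, from the first ratio, a weighting parameter corresponding to the first ratio.
Optionally, suppose the target group of pictures is the i-th GOP of the video to be encoded, the first image is the j-th frame of the target group of pictures, and the first ratio is MAD_ratio(n_{i,j}). The weighting parameter W_MAD(n_{i,j}) can then be computed with formula (23):
W_MAD(n_{i,j}) = a + b·(MAD_ratio(n_{i,j}) - a);   (23)
In formula (23) above, a and b are two encoding parameters set according to the available channel resources (for example, the available channel transmission rate before the first image is encoded) and the coding complexity of the target group of pictures; a represents the average coding complexity of the target group of pictures, and b is the amplitude of the adjustment applied to the weighting parameter W_MAD(n_{i,j}).
Optionally, in the embodiments of the present application, a and b in formula (23) are constants, for example a = 1.1 and b = 3.5. In practice a and b may take other values, for example a = 1.1 ± 0.5 and b = 3.5 ± 1.
Given the limit imposed by the buffer size B_s, the value of the weighting parameter W_MAD(n_{i,j}) needs to be further constrained:
W_MAD(n_{i,j}) = min{S_high, max{S_low, W_MAD(n_{i,j})}};       (24)
In formula (24) above, S_high is the upper bound of the adjustment range and keeps high-complexity frames from taking too large a share of the coding resources. S_low is the lower bound of the adjustment range and keeps low-complexity frames from receiving so few coding resources that video quality drops.
If S_high is too large, high-complexity images overuse coding resources and the encoding quality of subsequent frames suffers; if S_high is too small, the resources allocated to high-complexity images are restricted and their encoding quality cannot improve. If S_low is too large, less is saved when encoding low-complexity images; if S_low is too small, some images may receive so few coding resources that their encoding quality drops sharply.
Optionally, in the embodiments of the present application, S_high and S_low may be constants, for example S_high = 1.5 and S_low = 0.45.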
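Formulas (23) and (24) combine into a single clamped linear map. The sketch below uses the example constants from the text (a = 1.1, b = 3.5, S_high = 1.5, S_low = 0.45) as defaults; the function name `mad_weight` is illustrative.

```python
def mad_weight(mad_ratio, a=1.1, b=3.5, s_low=0.45, s_high=1.5):
    """Formula (23) followed by the clamp of formula (24)."""
    w = a + b * (mad_ratio - a)            # formula (23)
    return min(s_high, max(s_low, w))      # formula (24)


if __name__ == "__main__":
    # Low, average, and high relative complexity.
    for ratio in (0.5, 1.1, 2.0):
        print(ratio, mad_weight(ratio))
```

With these constants the unclamped weight leaves the range [0.45, 1.5] quickly, so in this sketch the clamp is active for most ratios away from a.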
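Step 101b: The video encoding apparatus determines the second number of bits for encoding the first image according to the weighting parameter, the first number of bits, and the first quantity.
Optionally, in the embodiments of the present application, suppose the target group of pictures is the i-th GOP of the video to be encoded and the first image is the j-th frame of the target group of pictures. The second number of bits Tc(n_{i,j}) for encoding the first image n_{i,j} is then: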
Figure PCTCN2022088950-appb-000043
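In formula (25) above, Tr(n_{i,j}) is the number of bits still available in the target group of pictures before the first image is encoded, G(n_{i,j}) is the total number of uncoded frames in the target group of pictures before the first image is encoded, and W_MAD(n_{i,j}) is the weighting parameter corresponding to the first ratio.
In the embodiments of the present application, a weighting parameter is first determined from the first ratio, which represents the coding complexity of the first image relative to the coded images in the target group of pictures, and the number of bits for encoding the first image is then determined from this weighting parameter, the number of remaining bits, and the number of uncoded images. The number of bits for an image is thus based on the relative coding complexity of the frames in the group of pictures. Compared with even allocation, the video encoding method provided by the embodiments of the present application suppresses frame-to-frame fluctuation of the encoded video quality more effectively.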
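The image for formula (25) is not reproduced in this text. The sketch below assumes the simplest form consistent with the description, namely the even share Tr / G scaled by the weighting parameter W_MAD. That form and the function name `complexity_target_bits` are assumptions.

```python
def complexity_target_bits(remaining_bits, remaining_frames, weight):
    """Assumed formula (25): Tc = W_MAD * Tr / G."""
    if remaining_frames <= 0:
        raise ValueError("remaining_frames must be positive")
    return weight * remaining_bits / remaining_frames


if __name__ == "__main__":
    # 70000 bits left for 7 uncoded frames; a complex frame gets 1.5x the even share.
    print(complexity_target_bits(70000, 7, 1.5))
```

A weight of 1.0 reproduces even allocation, which makes the relationship to the JVT-G012 baseline easy to see.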
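Optionally, in the embodiments of the present application, the number of bits for the image to be encoded (for example, the first image) may be determined from the relative coding complexity between the image to be encoded and the coded images, the number of remaining bits, the number of remaining frames, and the buffer state, which prevents the buffer occupancy from overflowing or underflowing.
Optionally, in the embodiments of the present application, step 101 may be implemented through step 101c below.
Step 101c: The video encoding apparatus determines the second number of bits for encoding the first image according to the first ratio, the first number of bits, the first quantity, and a target parameter.
The target parameter includes the estimated occupancy of the buffer, the actual occupancy of the buffer, the encoding frame rate, and the available channel transmission rate before the first image is encoded. For a CBR channel, the available channel transmission rate is the same before every frame.
In the embodiments of the present application, because the second number of bits for encoding the first image is determined from the first ratio, the first number of bits, the first quantity, and the target parameter, fluctuation of the frame-to-frame encoding quality is suppressed and the buffer occupancy is kept from overflowing or underflowing. This further improves the quality of the encoded video.
Optionally, in the embodiments of the present application, step 101c may be implemented through steps A to C below.
Step A: The video encoding apparatus determines a third number of bits according to the first ratio, the first number of bits, and the first quantity.
Step A determines a number of bits for encoding the first image based on the relative coding complexity of the images in the target group of pictures.
In the embodiments of the present application, the video encoding apparatus may first determine the weighting parameter corresponding to the first ratio, and then determine the third number of bits from that weighting parameter, the first number of bits, and the first quantity; see formula (25) above and the description of steps 101a and 101b, which is not repeated here.
Step B: The video encoding apparatus determines a fourth number of bits according to the target parameter.
In the embodiments of the present application, the fourth number of bits is a number of bits for encoding the first image determined from the buffer occupancy.
For example, suppose the first image is the j-th frame of the i-th group of pictures. To keep the encoder buffer from overflowing or underflowing, the fourth number of bits for encoding the current frame can be determined from the perspective of encoder buffer occupancy:
Figure PCTCN2022088950-appb-000044
Figure PCTCN2022088950-appb-000045
In formula (26) above, u(n_{i,j}) is the available channel transmission rate before the first image is encoded, F_r is the encoding frame rate, and γ_1 is a constant equal to 0.75.
Step C: The video encoding apparatus computes a weighted sum of the third number of bits and the fourth number of bits to obtain the second number of bits.
Keeping the encoder buffer occupancy falling steadily conflicts with improving the quality of the encoded video, and the root of the conflict is that the frames of a video sequence differ in coding complexity. To reach the same video quality as images with lower relative coding complexity, images with higher relative coding complexity need more coding resources. Taking both buffer occupancy and encoded video quality into account, the final number of bits for encoding the first image can be determined as follows: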
Figure PCTCN2022088950-appb-000046
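In formula (27) above,
Figure PCTCN2022088950-appb-000047
is the finally determined number of bits for encoding the first image; Tc(n_{i,j}) is the number of bits for encoding the first image determined from the perspective of relative coding complexity (see formula (25) above);
Figure PCTCN2022088950-appb-000048
is the number of bits for encoding the first image determined from the perspective of encoder buffer occupancy (see formula (26) above); and β_1 is a weighting parameter that determines how much weight each of the two considerations receives when the number of bits is determined. β_1 is a constant whose value range is
Figure PCTCN2022088950-appb-000049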
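In the embodiments of the present application, the third number of bits is determined from the perspective of relative coding complexity, the fourth number of bits is determined from the perspective of buffer occupancy, and their weighted sum is used as the final number of bits for encoding the first image. This improves the quality of high-complexity images after encoding, makes the per-frame PSNR curve within the target group of pictures smoother, and reduces its fluctuation, which raises the average PSNR of the whole encoded video sequence and improves the quality of the encoded video.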
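The images for formulas (26) and (27) are not reproduced in this text. The sketch below assumes that formula (26) has the JVT-G012-style form u / F_r + γ_1·(target level - actual level), and that formula (27) weights the complexity-based bits by β_1 and the buffer-based bits by 1 - β_1 with β_1 in [0, 1]. Both forms, the β_1 range, and the function names are assumptions; only γ_1 = 0.75 comes from the text.

```python
def buffer_target_bits(channel_rate, frame_rate, target_level, actual_level,
                       gamma1=0.75):
    """Assumed formula (26): bits suggested by the buffer state."""
    if frame_rate <= 0:
        raise ValueError("frame_rate must be positive")
    return channel_rate / frame_rate + gamma1 * (target_level - actual_level)


def final_target_bits(complexity_bits, buffer_bits, beta1):
    """Assumed formula (27): weighted sum of the third and fourth numbers of bits."""
    if not 0.0 <= beta1 <= 1.0:
        raise ValueError("beta1 is assumed to lie in [0, 1]")
    return beta1 * complexity_bits + (1.0 - beta1) * buffer_bits


if __name__ == "__main__":
    t_buf = buffer_target_bits(300000, 30, target_level=4000, actual_level=6000)
    print(t_buf)
    print(final_target_bits(15000.0, t_buf, beta1=0.5))
```

A larger β_1 favors even visual quality across frames, while a smaller β_1 favors a stable buffer, which is the trade-off the paragraph above describes.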
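Optionally, in the embodiments of the present application, step 102 may be implemented through steps 102a and 102b below.
Step 102a: The video encoding apparatus determines the quantization parameter of the first image (hereinafter, the target quantization parameter) through a quadratic rate-distortion model, based on the second number of bits and the predicted coding complexity of the first image.
Step 102b: The video encoding apparatus encodes the first image according to the target quantization parameter.
Optionally, the predicted coding complexity of the first image is represented by the predicted MAD value of the first image, which is predicted with the linear prediction model from the actual MAD value of the frame preceding the first image (hereinafter, the third image). The target quantization parameter is then predicted with the quadratic rate-distortion model from the predicted coding complexity of the first image and the actual coding complexity of the third image.
Specifically, suppose the first image is the j-th frame of the i-th group of pictures of the video to be encoded, where i and j are positive integers. The target quantization parameter
Figure PCTCN2022088950-appb-000050
can then be predicted with formula (28) below.
Figure PCTCN2022088950-appb-000051
In formula (28), f(n_{i,j}) is the number of bits for encoding the first image, d_1 and d_2 are the parameters of the quadratic rate-distortion model and are both constants, and MAD_predict(n_{i,j}) is the predicted coding complexity of the first image.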
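The image for formula (28) is not reproduced in this text. The sketch below assumes the commonly used quadratic rate-distortion form f = d_1·MAD/Q + d_2·MAD/Q², where Q is the quantization step, and solves it for Q. That form and the function name `qstep_from_quadratic_model` are assumptions; converting the step size to a QP is not shown.

```python
import math


def qstep_from_quadratic_model(target_bits, mad_pred, d1, d2):
    """Solve f = d1*MAD/Q + d2*MAD/Q**2 for the quantization step Q (assumed model)."""
    if target_bits <= 0 or mad_pred <= 0:
        raise ValueError("target_bits and mad_pred must be positive")
    if d2 == 0:
        # Degenerate first-order model: f = d1*MAD/Q.
        if d1 <= 0:
            raise ValueError("d1 must be positive when d2 is zero")
        return d1 * mad_pred / target_bits
    # Substitute x = 1/Q: (d2*MAD)*x**2 + (d1*MAD)*x - f = 0.
    a = d2 * mad_pred
    b = d1 * mad_pred
    disc = b * b + 4.0 * a * target_bits
    if disc < 0:
        raise ValueError("model has no real solution for these inputs")
    x = (-b + math.sqrt(disc)) / (2.0 * a)  # the positive root when d2 > 0
    if x <= 0:
        raise ValueError("model has no positive solution for these inputs")
    return 1.0 / x


if __name__ == "__main__":
    print(qstep_from_quadratic_model(2.0, 1.0, 1.0, 1.0))
```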
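To keep the quality of the encoded video continuous, the quantization parameters of two adjacent frames should differ by no more than a0 (for example, a0 = 2), so the target quantization parameter is adjusted to
Figure PCTCN2022088950-appb-000052
Figure PCTCN2022088950-appb-000053
In formula (29) above, Q_pp is the quantization parameter of the third image (available once the third image has been encoded). The target quantization parameter is finally limited to:
Figure PCTCN2022088950-appb-000054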
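The images for formulas (29) and (30) are not reproduced in this text. The sketch below assumes that formula (29) clamps the model QP to within a0 of the previous frame's QP, and that formula (30) clamps the result to a legal QP range. The range 0 to 51 used here is the H.264 QP range for 8-bit video; the exact bounds in formula (30), and the function name `smooth_qp`, are assumptions.

```python
def smooth_qp(qp_model, qp_prev, max_delta=2, qp_min=0, qp_max=51):
    """Limit the frame-to-frame QP change, then clamp to the legal range."""
    # Assumed formula (29): stay within max_delta of the previous frame's QP.
    qp = min(qp_prev + max_delta, max(qp_prev - max_delta, qp_model))
    # Assumed formula (30): stay inside the allowed QP range.
    return min(qp_max, max(qp_min, qp))


if __name__ == "__main__":
    print(smooth_qp(35, 30))
    print(smooth_qp(20, 30))
```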
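In the embodiments of the present application, how the video encoding apparatus encodes the first image with the target quantization parameter may depend on the basic unit (BU) used for encoding. When the BU is a whole frame, the apparatus can encode the first image directly with the target quantization parameter. When the BU consists of at least one macroblock and the number of macroblocks in the BU is smaller than the number of macroblocks in a frame, the apparatus needs to perform basic-unit-level (BU-level) rate control after step 102a.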
下面对视频编码装置进行BU层码率控制的方法进行示例行地说明。
对于图像组中的I帧和B帧,一帧图像内所有的宏块MB都采用同一个量化参数进行编码,例如均采用该图像的量化参数进行编码。因此,BU层的码率控制主要对象是图像组中的P帧。
对于图像组中的每个P帧,需要先将一个P帧分配到的比特数,分配给该P帧中的每个BU。因为当前P帧中尚未编码的基本单元的MAD值(即编码复杂度)是未知的,所以可以当前P帧中剩余的可用比特数平均分配给当前P帧中未编码的基本单元。
BU层码率控制算法可以包括如下五步:
步骤1,计算待编码BU的目标比特数,即向该带编码BU分配比特数。
具体的,对于目标图像组中的第i帧图像(i为大于1的整数),令第i帧图像中剩余的比特数为f rb(n i,j),剩余的BU的数量为N ub;其中,f rb(n i,j)和N ub的初始值为f(n i,j)和N unit,f(n i,j)为第i帧图像分配到的全部比特数,N unit为第i帧图像中的全部BU的数量;那么,第i帧图像中未编码的第一个BU分配到的比特数为f rb/N ub
步骤2,计算第i帧图像中第c个BU的头估计比特数m h,c为正整数,且第c个BU为第i帧图像中未编码的第一个BU。
Figure PCTCN2022088950-appb-000055
其中,在上述公式(31)中,c=1,2...,...Nunit,
Figure PCTCN2022088950-appb-000056
为图像0中已编码的第c个BU的实际编码比特数,
Figure PCTCN2022088950-appb-000057
为图像0中,已编码前c-1个BU的平均编码比特数,c为正整数。
步骤3:计算第i帧图像中第c个BU的残差系数编码比特数R i(c):
Figure PCTCN2022088950-appb-000058
步骤4:根据MAD线性预测模型,由目标BU的MAD值预测得到第i个图像中的第c个BU的MAD值(即第c个BU的预测MAD值),目标BU为第i-1帧图像中位置与第i帧图像中的第c个BU的位置相对应的BU,且目标BU已经完成编码;再根据第c个BU的预测MAD值,利用二项式率失真模型计算出编码量化步长,其中,二项式率失真模型为:
Figure PCTCN2022088950-appb-000059
其中,在上述公式(33)中,σ i(c)为第c个BU的预测MAD值,Q step,i(j)为二项式率失真模型计算得到的量化步长。可以将量化步长转换为量化参数QP,具体可以根据实际使用需求确定。
步骤5:根据计算出的量化参数,对第c个BU中所有的宏块进行率失真优化的编码,并在编码完成之后更新第i帧图像的剩余比特数、MAD线性预测模型的参数和二项式率失真模型的参数。具体可以参考上述实施例中的相关描述。
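Optionally, in the embodiments of the present application, before step 101, the video encoding method provided by the embodiments of the present application may further include step 103 below.
Step 103: The video encoding apparatus determines the first ratio according to the predicted coding complexity of the first image and the average coding complexity of the M frames of second images.
In the embodiments of the present application, the predicted coding complexity of the first image is represented by the predicted MAD value of the first image, and the average coding complexity of the M frames of second images may be represented by the average MAD value of the M second images. The first ratio MAD_ratio(j) can therefore be computed with formula (34):
Figure PCTCN2022088950-appb-000060
In formula (34) above, MAD_ratio(j) is the MAD ratio of the j-th P-frame in the current GOP; MAD_predict(j) is the MAD value of the j-th P-frame predicted with the MAD linear prediction model; and MAD_actual(o) is the actual MAD value computed after the o-th frame of the current GOP (for example, the target group of pictures) is encoded.
Figure PCTCN2022088950-appb-000061
denotes the average coding complexity of the first j-1 coded P-frames in the target group of pictures.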
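The image for formula (34) is not reproduced in this text, but the description fixes its structure: the predicted MAD of the current frame divided by the mean of the actual MAD values of the frames already coded in the GOP. The sketch below follows that description; the function name `mad_ratio` is illustrative.

```python
def mad_ratio(mad_predicted, mad_actual_history):
    """Predicted MAD of frame j over the mean actual MAD of the coded frames."""
    if not mad_actual_history:
        raise ValueError("at least one coded frame is required")
    mean_actual = sum(mad_actual_history) / len(mad_actual_history)
    if mean_actual <= 0:
        raise ValueError("mean actual MAD must be positive")
    return mad_predicted / mean_actual


if __name__ == "__main__":
    # A frame predicted to be twice as complex as the coded frames so far.
    print(mad_ratio(6.0, [2.0, 4.0]))
```

A ratio above 1 marks the frame as more complex than the coded frames of its GOP, and a ratio below 1 marks it as simpler.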
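In the embodiments of the present application, the average coding complexity of the coded images in the same GOP is taken into account when bits are allocated to a frame. This keeps the quality of the encoded images within a GOP closer together, reduces the fluctuation of the per-frame PSNR curve within the GOP, and improves the quality of the encoded video.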
需要说明的是,本申请实施例提供的视频编码方法,执行主体可以为视频编码装置,或者该视频编码装置中的用于执行视频编码方法的控制模块。本申请实施例中以视频编码装置执行视频编码方法为例,说明本申请实施例提供的视频编码装置。
图5为实现本申请实施例提供的一种视频编码装置的可能的结构示意图,如图5所示,视频编码装置50可以包括:确定模块51和编码模块52。确定模块51,可以用于根据第一比值、第一比特数和第一数量,确定编码第一图像的第二比特数;编码模块52,可以用于基于确定模块51确定的第二比特数,编码第一图像;其中,第一比值可以为第一图像的预测编码复杂度与M帧第二图像的实际编码复杂度的比值,第一图像为目标图像组中未编码的第一帧图像,该M帧第二图像为目标图像组中已编码的图像;第一比特数为目标图像组中剩余的比特数,第一数量为目标图像组中未编码图像的数量,M为大于1的整数。
可选地,本申请实施例中,确定模块51,具体可以用于通过第一比值,确定与第一比值对应的加权参数;并根据加权参数、第一比特数和第一数量,确定编码第一图像的第二比特数。
可选地,本申请实施例中,确定模块51,具体可以用于根据第一比值、第一比特数、第一数量和目标参数,确定编码第一图像的第二比特数;其中,目标参数包括:缓冲区的估计占用量、缓冲区的实际占用量、编码帧率和编码第一图像前的可用信道传输速率。
可选地,本申请实施例中,上述确定模块51可以包括第一确定子模块和处理子模块;第一确定子模块,可以用于根据第一比值、第一比特数和第一数量,确定第三比特数;并根据目标参数,确定第四比特数;处理子模块,可以用于将第一确定子模块确定的第三比特数和第四比特数进行加权求和,得到第二比特数。
可选地,本申请实施例中,编码模块52可以包括第二确定子模块和编码子模块;
第二确定子模块,可以用于基于第二比特数和第一图像的预测编码复杂度,通过二次 率失真模型,确定第一图像的量化参数;
编码子模块,可以用于按照第二确定子模块确定的量化参数,编码第一图像。
可选地,本申请实施例中,确定模块51,还可以用于在根据第一比值、第一比特数和第一数量,确定编码第一图像的第二比特数之前,根据第一图像的预测编码复杂度和M帧第二图像的平均编码复杂度,确定第一比值。
本申请实施例提供的视频编码装置中,由于第一比值可以指示第一图像与目标图像组中已编码的所M帧第二图像之间的相对编码复杂度,即本申请实施例提供的视频编码方法可以根据目标图像组中待编码图像与已编码图像之间的相对编码复杂度、目标图像组中剩余的比特数和目标图像组中剩余的帧数,确定编码待编码图像的比特数,因此可以实现从目标图像组中的低编码复杂度的图像节省编码比特,并将节省的编码比特用于对高编码复杂度的图像的编码,从而可以在保持平均编码码率接近目标码率(平均编码码率)的前提下,减小图像组中的各帧图像PSNR曲线的波动,进而可以提高编码后的视频的质量。
本实施例中各种实现方式具有的有益效果具体可以参见上述方法实施例中相应实现方式所具有的有益效果,为避免重复,此处不再赘述。
本申请实施例中的视频编码装置可以是装置,也可以是终端中的部件、集成电路、或芯片。该装置可以是移动电子设备,也可以为非移动电子设备。示例性的,移动电子设备可以为手机、平板电脑、笔记本电脑、掌上电脑、车载电子设备、可穿戴设备、超级移动个人计算机(ultra-mobile personal computer,UMPC)、上网本或者个人数字助理(personal digital assistant,PDA)等,非移动电子设备可以为网络附属存储器(Network Attached Storage,NAS)、个人计算机(personal computer,PC)、电视机(television,TV)、柜员机或者自助机等,本申请实施例不作具体限定。
本申请实施例中的视频编码装置可以为具有操作系统的装置。该操作系统可以为安卓(Android)操作系统,可以为ios操作系统,还可以为其他可能的操作系统,本申请实施例不作具体限定。
本申请实施例提供的视频编码装置能够实现图1至图4的方法实施例实现的各个过程,为避免重复,这里不再赘述。
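As shown in FIG. 6, an embodiment of the present application further provides an electronic device 200, including a processor 202, a memory 201, and a program or an instruction that is stored in the memory 201 and can run on the processor 202. When executed by the processor 202, the program or instruction implements each process of the above video encoding method embodiments and achieves the same technical effect, which is not repeated here.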
需要注意的是,本申请实施例中的电子设备包括上述所述的移动电子设备和非移动电子设备。
图7为实现本申请实施例的一种电子设备的硬件结构示意图。
如图7所示,电子设备1000包括但不限于:射频单元1001、网络模块1002、音频输出单元1003、输入单元1004、传感器1005、显示单元1006、用户输入单元1007、接口单元1008、存储器1009、以及处理器1010等部件。
本领域技术人员可以理解,电子设备1000还可以包括给各个部件供电的电源(比如电池),电源可以通过电源管理系统与处理器1010逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。图7中示出的电子设备结构并不构成对电子设备的限定,电子设备可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置,在此不再赘述。
其中,处理器1010,可以用于根据第一比值、第一比特数和第一数量,确定编码第一图像的第二比特数;且基于第二比特数,编码第一图像;其中,第一比值可以为第一图像的预测编码复杂度与M帧第二图像的实际编码复杂度的比值,第一图像为目标图像组中未编码的第一帧图像,该M帧第二图像为目标图像组中已编码的图像;第一比特数为目标图 像组中剩余的比特数,第一数量为目标图像组中未编码图像的数量,M为大于1的整数。
可选地,本申请实施例中,处理器1010,具体可以用于通过第一比值,确定与第一比值对应的加权参数;并根据加权参数、第一比特数和第一数量,确定编码第一图像的第二比特数。
可选地,本申请实施例中,处理器1010,具体可以用于根据第一比值、第一比特数、第一数量和目标参数,确定编码第一图像的第二比特数;其中,目标参数包括:缓冲区的估计占用量、缓冲区的实际占用量、编码帧率和编码第一图像前的可用信道传输速率。
可选地,本申请实施例中,处理器1010,可以用于根据第一比值、第一比特数和第一数量,确定第三比特数;且根据目标参数,确定第四比特数;并将第三比特数和第四比特数进行加权求和,得到第二比特数。
可选地,本申请实施例中,处理器1010,可以用于基于第二比特数和第一图像的预测编码复杂度,通过二次率失真模型,确定第一图像的量化参数;且按照该量化参数,编码第一图像。
可选地,本申请实施例中,处理器1010,还可以用于在根据第一比值、第一比特数和第一数量,确定编码第一图像的第二比特数之前,根据第一图像的预测编码复杂度和M帧第二图像的平均编码复杂度,确定第一比值。
本申请实施例提供的视频编码装置中,由于第一比值可以指示第一图像与目标图像组中已编码的所M帧第二图像之间的相对编码复杂度,即本申请实施例提供的视频编码方法可以根据目标图像组中待编码图像与已编码图像之间的相对编码复杂度、目标图像组中剩余的比特数和目标图像组中剩余的帧数,确定编码待编码图像的比特数,因此可以实现从目标图像组中的低编码复杂度的图像节省编码比特,并将节省的编码比特用于对高编码复杂度的图像的编码,从而可以在保持平均编码码率接近目标码率(平均编码码率)的前提下,减小图像组中的各帧图像PSNR曲线的波动,进而可以提高编码后的视频的质量。
本实施例中各种实现方式具有的有益效果具体可以参见上述方法实施例中相应实现方式所具有的有益效果,为避免重复,此处不再赘述。
应理解的是,本申请实施例中,输入单元1004可以包括图形处理器(Graphics Processing Unit,GPU)10041和麦克风10042,图形处理器10041对在视频捕获模式或图像捕获模式中由图像捕获装置(如摄像头)获得的静态图片或视频的图像数据进行处理。显示单元1006可包括显示面板10061,可以采用液晶显示器、有机发光二极管等形式来配置显示面板10061。用户输入单元1007包括触控面板10071以及其他输入设备10072。触控面板10071,也称为触摸屏。触控面板10071可包括触摸检测装置和触摸控制器两个部分。其他输入设备10072可以包括但不限于物理键盘、功能键(比如音量控制按键、开关按键等)、轨迹球、鼠标、操作杆,在此不再赘述。存储器1009可用于存储软件程序以及各种数据,包括但不限于应用程序和操作系统。处理器1010可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主要处理无线通信。可以理解的是,上述调制解调处理器也可以不集成到处理器1010中。
本申请实施例还提供一种可读存储介质,所述可读存储介质上存储有程序或指令,该程序或指令被处理器执行时实现上述视频编码方法实施例的各个过程,且能达到相同的技术效果,为避免重复,这里不再赘述。
其中,所述处理器为上述实施例中所述的电子设备中的处理器。所述可读存储介质,包括计算机可读存储介质,如计算机只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等。
本申请实施例另提供了一种芯片,所述芯片包括处理器和通信接口,所述通信接口和所述处理器耦合,所述处理器用于运行程序或指令,实现上述视频编码方法实施例的各个过程,且能达到相同的技术效果,为避免重复,这里不再赘述。
应理解,本申请实施例提到的芯片还可以称为系统级芯片、系统芯片、芯片系统或片上系统芯片等。
需要说明的是,在本文中,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者装置不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者装置所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括该要素的过程、方法、物品或者装置中还存在另外的相同要素。此外,需要指出的是,本申请实施方式中的方法和装置的范围不限按示出或讨论的顺序来执行功能,还可包括根据所涉及的功能按基本同时的方式或按相反的顺序来执行功能,例如,可以按不同于所描述的次序来执行所描述的方法,并且还可以添加、省去、或组合各种步骤。另外,参照某些示例所描述的特征可在其他示例中被组合。
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到上述实施例方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分可以以计算机软件产品的形式体现出来,该计算机软件产品存储在一个存储介质(如ROM/RAM、磁碟、光盘)中,包括若干指令用以使得一台终端(可以是手机,计算机,服务器,或者网络设备等)执行本申请各个实施例所述的方法。
上面结合附图对本申请的实施例进行了描述,但是本申请并不局限于上述的具体实施方式,上述的具体实施方式仅仅是示意性的,而不是限制性的,本领域的普通技术人员在本申请的启示下,在不脱离本申请宗旨和权利要求所保护的范围情况下,还可做出很多形式,均属于本申请的保护之内。

Claims (17)

  1. 一种视频编码方法,所述方法包括:
    根据第一比值、第一比特数和第一数量,确定编码第一图像的第二比特数;
    基于所述第二比特数,编码所述第一图像;
    其中,所述第一比值为所述第一图像的预测编码复杂度与M帧第二图像的实际编码复杂度的比值,所述第一图像为目标图像组中未编码的第一帧图像,所述M帧第二图像为所述目标图像组中已编码的图像;所述第一比特数为所述目标图像组中剩余的比特数,所述第一数量为所述目标图像组中未编码图像的数量,M为大于1的整数。
  2. 根据权利要求1所述的方法,其中,所述根据第一比值、第一比特数和第一数量,确定编码第一图像的第二比特数,包括:
    通过所述第一比值,确定与所述第一比值对应的加权参数;
    根据所述加权参数、所述第一比特数和所述第一数量,确定编码所述第一图像的所述第二比特数。
  3. 根据权利要求1或2所述的方法,其中,
    所述根据第一比值、第一比特数和第一数量,确定编码第一图像的第二比特数,包括:
    根据所述第一比值、所述第一比特数、所述第一数量和目标参数,确定编码所述第一图像的所述第二比特数;
    其中,所述目标参数包括:缓冲区的估计占用量、所述缓冲区的实际占用量、编码帧率和编码所述第一图像前的可用信道传输速率。
  4. 根据权利要求3所述的方法,其中,所述根据所述第一比值、所述第一比特数、所述第一数量和目标参数,确定编码所述第一图像的所述第二比特数,包括:
    根据所述第一比值、所述第一比特数和所述第一数量,确定第三比特数;
    根据所述目标参数,确定第四比特数;
    将所述第三比特数和所述第四比特数进行加权求和,得到所述第二比特数。
  5. 根据权利要求1所述的方法,其中,所述基于所述第二比特数,编码所述第一图像,包括:
    基于所述第二比特数和所述第一图像的预测编码复杂度,通过二次率失真模型,确定所述第一图像的量化参数;按照所述量化参数,编码所述第一图像。
  6. 根据权利要求1所述的方法,其中,所述根据第一比值、第一比特数和第一数量,确定编码第一图像的第二比特数之前,所述方法还包括:
    根据所述第一图像的预测编码复杂度和所述M帧第二图像的平均编码复杂度,确定所述第一比值。
  7. 一种视频编码装置,所述装置包括:确定模块和编码模块;
    确定模块,用于根据第一比值、第一比特数和第一数量,确定编码第一图像的第 二比特数;
    所述编码模块,用于基于所述确定模块确定的所述第二比特数,编码所述第一图像;
    其中,所述第一比值为所述第一图像的预测编码复杂度与M帧第二图像的实际编码复杂度的比值,所述第一图像为目标图像组中未编码的第一帧图像,所述M帧第二图像为所述目标图像组中已编码的图像;所述第一比特数为所述目标图像组中剩余的比特数,所述第一数量为所述目标图像组中未编码图像的数量,M为大于1的整数。
  8. 根据权利要求7所述的装置,其中,所述确定模块,具体用于通过所述第一比值,确定与所述第一比值对应的加权参数;并根据所述加权参数、所述第一比特数和所述第一数量,确定编码所述第一图像的所述第二比特数。
  9. 根据权利要求7或8所述的装置,其中,所述确定模块,具体用于根据所述第一比值、所述第一比特数、所述第一数量和目标参数,确定编码所述第一图像的所述第二比特数;其中,所述目标参数包括:缓冲区的估计占用量、所述缓冲区的实际占用量、编码帧率和编码所述第一图像前的可用信道传输速率。
  10. 根据权利要求9所述的装置,其中,所述确定模块包括第一确定子模块和处理子模块;
    所述第一确定子模块,用于根据所述第一比值、所述第一比特数和所述第一数量,确定第三比特数;并根据所述目标参数,确定第四比特数;
    所述处理子模块,用于将所述第一确定子模块确定的所述第三比特数和所述第四比特数进行加权求和,得到所述第二比特数。
  11. 根据权利要求7所述的装置,其中,所述编码模块包括第二确定子模块和编码子模块;
    所述第二确定子模块,用于基于所述第二比特数和所述第一图像的预测编码复杂度,通过二次率失真模型,确定所述第一图像的量化参数;
    所述编码子模块,用于按照所述第二确定子模块确定的所述量化参数,编码所述第一图像。
  12. 根据权利要求7所述的装置,其中,所述确定模块,还用于在根据所述第一比值、所述第一比特数和所述第一数量,确定编码所述第一图像的所述第二比特数之前,根据所述第一图像的预测编码复杂度和所述M帧第二图像的平均编码复杂度,确定所述第一比值。
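  13. An electronic device, comprising a processor, a memory, and a program or instructions stored in the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the video encoding method according to any one of claims 1 to 6.
  14. A readable storage medium storing a program or instructions, wherein the program or instructions, when executed by a processor, implement the steps of the video encoding method according to any one of claims 1 to 6.
  15. A computer software product, wherein the computer software product is executed by at least one processor to implement the video encoding method according to any one of claims 1 to 6.
  16. An electronic device configured to perform the video encoding method according to any one of claims 1 to 6.
  17. A chip, comprising a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement the video encoding method according to any one of claims 1 to 6.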
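
As a reading aid only, and not as part of the claims, the frame-level bit allocation of claims 1, 2, 4 and 6 and the quadratic rate-distortion step of claim 5 might be sketched as below. The claims name the inputs but give no formulas, so everything here is an assumption made for illustration: the function names, the clamp used as the weighting parameter, the buffer-based formula for the fourth number of bits, the blend weight `beta`, the gain `gamma`, and the model coefficients `c1` and `c2`. The quadratic model is written in its common form R = c1*C/Q + c2*C/Q^2, and the sketch solves for a quantization step Q rather than a codec-specific quantization parameter.

```python
from math import sqrt


def first_ratio(predicted_complexity, encoded_complexities):
    """Claim 6: predicted complexity of the first image divided by the
    average actual complexity of the M already-encoded second images."""
    if len(encoded_complexities) < 2:
        raise ValueError("M must be an integer greater than 1")
    average = sum(encoded_complexities) / len(encoded_complexities)
    return predicted_complexity / average


def weighting_parameter(ratio, low=0.5, high=2.0):
    """Claim 2: map the first ratio to a weighting parameter.
    ASSUMPTION: a simple clamp; the claims do not specify the mapping."""
    return min(max(ratio, low), high)


def third_bits(ratio, remaining_bits, frames_left):
    """Claims 1, 2 and 4: complexity-weighted share of the remaining budget.
    ASSUMPTION: weight times the even share remaining_bits / frames_left."""
    if frames_left < 1:
        raise ValueError("at least one unencoded frame is required")
    return weighting_parameter(ratio) * remaining_bits / frames_left


def fourth_bits(channel_rate, frame_rate, estimated_occupancy,
                actual_occupancy, gamma=0.5):
    """Claim 4: bits derived from the target parameters.
    ASSUMPTION: per-frame channel budget plus a buffer correction term."""
    bits = channel_rate / frame_rate
    bits += gamma * (estimated_occupancy - actual_occupancy)
    return max(bits, 0.0)


def second_bits(third, fourth, beta=0.5):
    """Claim 4: weighted sum of the third and fourth numbers of bits.
    ASSUMPTION: the two weights are beta and 1 - beta."""
    return beta * third + (1.0 - beta) * fourth


def quantization_step(target_bits, complexity, c1, c2):
    """Claim 5: solve target_bits = c1*C/Q + c2*C/Q**2 for the positive Q.
    ASSUMPTION: c1 and c2 are non-negative, not both zero, and would be
    fitted from previously encoded frames."""
    if target_bits <= 0:
        raise ValueError("target_bits must be positive")
    a = c1 * complexity
    disc = a * a + 4.0 * target_bits * c2 * complexity
    return (a + sqrt(disc)) / (2.0 * target_bits)
```

A call sequence would compute `first_ratio`, then `third_bits` and `fourth_bits`, blend them with `second_bits`, and pass the result to `quantization_step`. Mapping the resulting step to a quantization parameter depends on the codec and is left out of the sketch.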
PCT/CN2022/088950 2021-04-26 2022-04-25 Video encoding method and apparatus and electronic device Ceased WO2022228375A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP22794845.2A EP4333433A4 (en) 2021-04-26 2022-04-25 VIDEO CODING METHOD AND APPARATUS, AND ELECTRONIC DEVICE
JP2023564189A JP7682297B2 (ja) 2021-04-26 2022-04-25 Video encoding method and apparatus and electronic device
KR1020237034885A KR20230155002A (ko) 2021-04-26 2022-04-25 Video encoding method and apparatus and electronic device
US18/485,487 US12413738B2 (en) 2021-04-26 2023-10-12 Video encoding method and apparatus and electronic device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110454418.XA CN113286145B (zh) 2021-04-26 2021-04-26 Video encoding method and apparatus and electronic device
CN202110454418.X 2021-04-26

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/485,487 Continuation US12413738B2 (en) 2021-04-26 2023-10-12 Video encoding method and apparatus and electronic device

Publications (1)

Publication Number Publication Date
WO2022228375A1 true WO2022228375A1 (zh) 2022-11-03

Family

ID=77275740

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/088950 Ceased WO2022228375A1 (zh) 2021-04-26 2022-04-25 Video encoding method and apparatus and electronic device

Country Status (6)

Country Link
US (1) US12413738B2 (zh)
EP (1) EP4333433A4 (zh)
JP (1) JP7682297B2 (zh)
KR (1) KR20230155002A (zh)
CN (1) CN113286145B (zh)
WO (1) WO2022228375A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
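CN113286145B (zh) 2021-04-26 2022-07-22 Vivo Mobile Communication Co Ltd Video encoding method and apparatus and electronic device
CN116248882B (zh) * 2023-02-26 2025-11-14 翱捷科技股份有限公司 Rate control method and apparatus at the I-frame image block level
CN116708934B (zh) * 2023-05-16 2024-03-22 深圳东方凤鸣科技有限公司 Video encoding processing method and apparatus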

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050175091A1 (en) * 2004-02-06 2005-08-11 Atul Puri Rate and quality controller for H.264/AVC video coder and scene analyzer therefor
US20050175092A1 (en) * 2004-02-06 2005-08-11 Atul Puri H.264/AVC coder incorporating rate and quality controller
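CN101188755A (zh) * 2007-12-14 2008-05-28 宁波中科集成电路设计中心有限公司 Method for VBR rate control of real-time video signals during AVS encoding
CN101895759A (zh) * 2010-07-28 2010-11-24 南京信息工程大学 An H.264 rate control method
CN101895758A (zh) * 2010-07-23 2010-11-24 南京信息工程大学 H.264 rate control method based on frame complexity
CN103051897A (zh) * 2012-12-26 2013-04-17 南京信息工程大学 An H264 video encoding rate control method
CN108200431A (zh) * 2017-12-08 2018-06-22 重庆邮电大学 Frame-layer bit allocation method for video encoding rate control
CN113286145A (zh) * 2021-04-26 2021-08-20 Vivo Mobile Communication Co Ltd Video encoding method and apparatus and electronic device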

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2748623B1 (fr) * 1996-05-09 1998-11-27 Thomson Multimedia Sa Variable bit rate encoder
US6522693B1 (en) * 2000-02-23 2003-02-18 International Business Machines Corporation System and method for reencoding segments of buffer constrained video streams
JP3889552B2 (ja) * 2000-06-09 2007-03-07 パイオニア株式会社 Code amount allocation apparatus and method
CN1206864C (zh) * 2002-07-22 2005-06-15 中国科学院计算技术研究所 Rate control method combined with rate-distortion optimization, and apparatus therefor
US7133448B2 (en) * 2002-11-07 2006-11-07 Silicon Integrated Systems Corp. Method and apparatus for rate control in moving picture video compression
US7095784B2 (en) * 2003-04-14 2006-08-22 Silicon Intergrated Systems Corp. Method and apparatus for moving picture compression rate control using bit allocation with initial quantization step size estimation at picture level
US7373004B2 (en) * 2003-05-23 2008-05-13 Silicon Integrated Systems Corp. Apparatus for constant quality rate control in video compression and target bit allocator thereof
US7254176B2 (en) * 2003-05-23 2007-08-07 Silicon Integrated Systems Corp. Apparatus for variable bit rate control in video compression and target bit allocator thereof
JP4908943B2 (ja) 2006-06-23 2012-04-04 キヤノン株式会社 Image encoding apparatus and image encoding method
JP2010062999A (ja) * 2008-09-05 2010-03-18 Toshiba Corp 動画像符号化装置、動画像符号化方法、及び、コンピュータプログラム
JP5871602B2 (ja) * 2011-12-19 2016-03-01 キヤノン株式会社 Encoding apparatus
CN102752591B (zh) * 2012-06-14 2014-09-17 南京信息工程大学 H.264 rate control method based on a comprehensive factor
US10819997B2 (en) * 2016-01-20 2020-10-27 Arris Enterprises Llc Encoding video data according to target decoding device decoding complexity
US10560696B2 (en) * 2018-06-25 2020-02-11 Tfi Digital Media Limited Method for initial quantization parameter optimization in video coding
CN111432222B (zh) * 2019-12-30 2022-05-31 鹏城实验室 Live video encoding method, terminal and storage medium suitable for traffic scenes

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4333433A4 *

Also Published As

Publication number Publication date
EP4333433A4 (en) 2024-09-25
US20240040127A1 (en) 2024-02-01
JP2024514348A (ja) 2024-04-01
KR20230155002A (ko) 2023-11-09
CN113286145A (zh) 2021-08-20
JP7682297B2 (ja) 2025-05-23
US12413738B2 (en) 2025-09-09
CN113286145B (zh) 2022-07-22
EP4333433A1 (en) 2024-03-06

Similar Documents

Publication Publication Date Title
Lee et al. Scalable rate control for MPEG-4 video
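CN113766226A (zh) Image encoding method, apparatus, device and storage medium
JP5318561B2 (ja) Content classification for multimedia processing
JP5351040B2 (ja) Improved video rate control for video coding standards
US12413738B2 (en) Video encoding method and apparatus and electronic device
CN101518088B (zh) Method of ρ-domain frame-level bit allocation for effective rate control and enhanced video coding quality
US20050286631A1 (en) Encoding with visual masking
CN106937112B (zh) Rate control method based on the H.264 video compression standard
CN1926863B (zh) Multi-pass video encoding method
KR20080031344A (ko) Rate control method, module, device and system for a video encoder capable of variable bit rate encoding
CN104410860B (zh) Method for real-time quality adjustment of high-definition ROI video
CN114513664B (zh) Video frame encoding method and apparatus, intelligent terminal and computer-readable storage medium
US20050063461A1 (en) H.263/MPEG video encoder for efficiently controlling bit rates and method of controlling the same
CN100574442C (zh) Rate control method based on image histogram
CN102420987A (zh) Adaptive bit allocation method for rate control based on a hierarchical B-frame structure
CN118285094A (zh) Method, apparatus and medium for video processing
CN101331773A (zh) Two-pass rate control techniques for video coding using rate-distortion characteristics
Tang et al. A low delay rate control method for screen content coding
CN117956160A (zh) Rate control method, rate control apparatus and computer storage medium
Tsai Rate control for low-delay video using a dynamic rate table
CN118590648A (zh) Video compression processing method, apparatus, device and storage medium
CN109379593B (zh) A rate control method based on look-ahead prediction
CN101340584A (zh) Video decoding method and apparatus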
Chi et al. Region-of-interest video coding by fuzzy control for H. 263+ standard
Esmaeeli et al. Methods and Criteria for Evaluating Controllability of Video Bit Rate in HEVC-SCC

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application; Ref document number: 22794845; Country of ref document: EP; Kind code of ref document: A1
ENP Entry into the national phase; Ref document number: 20237034885; Country of ref document: KR; Kind code of ref document: A
WWE Wipo information: entry into national phase; Ref document number: 1020237034885; Country of ref document: KR
WWE Wipo information: entry into national phase; Ref document number: 2023564189; Country of ref document: JP
WWE Wipo information: entry into national phase; Ref document number: 202317072284; Country of ref document: IN
WWE Wipo information: entry into national phase; Ref document number: 2022794845; Country of ref document: EP
NENP Non-entry into the national phase; Ref country code: DE
ENP Entry into the national phase; Ref document number: 2022794845; Country of ref document: EP; Effective date: 20231127