WO2023005579A1 - Video encoding and video decoding methods, apparatus, electronic device and storage medium - Google Patents

Video encoding and video decoding methods, apparatus, electronic device and storage medium

Info

Publication number
WO2023005579A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
video
weighted prediction
parameters
weighted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2022/102406
Other languages
English (en)
French (fr)
Inventor
谢绍伟
吴钊
吴平
蔡品隆
高莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to JP2023578781A (JP7698746B2)
Priority to US18/577,790 (US12457351B2)
Priority to EP22848184.2A (EP4380156A4)
Priority to KR1020247002467A (KR20240024975A)
Publication of WO2023005579A1 (zh)

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51: Motion estimation or motion compensation
    • H04N19/573: Motion compensation with multiple frame prediction using two or more reference frames in a given prediction direction
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124: Quantisation
    • H04N19/126: Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • H04N19/103: Selection of coding mode or of prediction mode
    • H04N19/105: Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/134: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136: Incoming video signal characteristics or properties
    • H04N19/14: Coding unit complexity, e.g. amount of activity or edge presence estimation
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • H04N19/174: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a slice, e.g. a line of blocks or a group of blocks
    • H04N19/42: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/46: Embedding additional information in the video signal during the compression process
    • H04N19/60: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H04N19/70: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • the present application relates to the technical field of image processing, and in particular to a video encoding and decoding method, device, electronic equipment and storage medium.
  • inter-frame prediction technology can effectively eliminate data redundancy in the time domain and greatly reduce the video transmission bit rate.
  • conventional inter-frame motion estimation and motion compensation have difficulty achieving the ideal data compression effect, and in actual coding
  • the decision result of the optimization model is usually to use intra-frame prediction coding, which greatly reduces the video coding efficiency. Therefore, in order to improve the coding effect in scenes where the content brightness changes, weighted prediction technology can be used in video coding.
  • the encoding end needs to determine the brightness change weight and offset of the current image relative to the reference image, and the corresponding weighted prediction frame is generated by the brightness compensation operation.
  • weighted prediction technology has been proposed, and there are two modes of weighted prediction in application, which are explicit weighted prediction and implicit weighted prediction.
  • in implicit weighted prediction, the model parameters are fixed, that is, the encoder and decoder agree to use the same weighted prediction parameters, and the parameters do not need to be transmitted by the encoder, which reduces the pressure of code stream transmission and improves transmission efficiency.
  • because the weighted prediction parameters in implicit mode are fixed, when the mode is applied to inter-frame unidirectional prediction, the varying distance between the current frame and the reference frame leads to unsatisfactory prediction performance with fixed weights.
  • weighted prediction involves three prediction parameters, namely the weight, the offset, and the logarithmic weight denominator (log weight denominator).
  • a series of parameter information for weighted prediction can be included in the Picture header or Slice header. It is worth noting that each luma or chrominance component of the reference image has independent weighted prediction parameters.
  • each reference image is equipped with only one set of weighted prediction parameters (that is, a set of weights and offsets that cooperate with each other).
  • if the entire current image has a complete and consistent form of brightness variation, a good prediction effect can be achieved by selecting only the nearest forward reference image and its weighted prediction parameters.
  • multiple different reference images can be selected according to existing standards, and the weights and offsets of the adaptive reference images can be manually configured for different areas of the current image, as shown in Figure 1.
  • the buffer at the decoding end can only store a small number of decoded reference images, especially for media content with a large amount of data such as ultra-high-definition video and panoramic video
  • the solution shown in Figure 1 can only be applied to some brightness change scenes in practical applications.
  • the main purpose of the embodiments of the present application is to propose a video coding method, device, electronic equipment and storage medium, aiming to realize flexible video coding in scenes with complex image brightness changes, improve video coding efficiency, and reduce the influence of image brightness changes on coding efficiency.
  • An embodiment of the present application provides a video encoding method, wherein the method includes the following steps: acquiring a video image, wherein the video image is at least one frame image of the video; performing weighted predictive encoding on the video image to generate an image code stream, wherein the weighted predictive encoding uses at least one set of weighted predictive identification information and parameters.
  • An embodiment of the present application provides a video decoding method, wherein the method includes the following steps: obtaining an image code stream, and parsing the weighted prediction identification information and parameters in the image code stream; decoding the image code stream according to the weighted prediction identification information and parameters to generate a reconstructed image.
  • the embodiment of the present application also provides a video encoding device, wherein the device includes the following modules: an image acquisition module, configured to acquire a video image, wherein the video image is at least one frame image of a video; a video encoding module, configured to perform weighted predictive encoding on the video image to generate an image code stream, wherein the weighted predictive encoding uses at least one set of weighted predictive identification information and parameters.
  • the embodiment of the present application also provides a video decoding device, wherein the device includes the following modules: a code stream acquisition module, configured to obtain an image code stream and parse the weighted prediction identification information and parameters in the image code stream; a reconstruction module, configured to decode the image code stream according to the weighted prediction identification information and parameters to generate a reconstructed image.
  • An embodiment of the present application also provides an electronic device, wherein the electronic device includes: one or more processors; a memory for storing one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any one of the embodiments of the present application.
  • the embodiment of the present application also provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the method described in any one of the embodiments of the present application is implemented.
  • by acquiring a video image, where the video image is at least one frame image in the video, and performing weighted predictive coding on the video image to generate an image code stream, wherein at least one set of weighted prediction identification information and parameters is used in the weighted predictive coding process, flexible coding of video images is realized, the video coding efficiency can be improved, and the influence of video image brightness changes on coding efficiency can be reduced.
  • Figure 1 is an example diagram of region-adaptive weighted prediction for multiple reference images in some technical solutions
  • FIG. 2 is a flow chart of a video encoding method provided by an embodiment of the present application.
  • FIG. 3 is a flow chart of another video coding method provided by an embodiment of the present application.
  • FIG. 4 is a flow chart of another video encoding method provided by an embodiment of the present application.
  • FIG. 5 is an example diagram of a video coding method provided by an embodiment of the present application.
  • FIG. 6 is an example diagram of another video coding method provided by the embodiment of the present application.
  • FIG. 7 is a flowchart of a video decoding method provided by an embodiment of the present application.
  • FIG. 8 is a flow chart of another video decoding method provided by an embodiment of the present application.
  • FIG. 9 is an example diagram of a video decoding method provided by an embodiment of the present application.
  • FIG. 10 is an example diagram of another video decoding method provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a video encoding device provided by an embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of a video decoding device provided by an embodiment of the present application.
  • FIG. 13 is a schematic structural diagram of an encoder provided in an embodiment of the present application.
  • FIG. 14 is a schematic structural diagram of a decoder provided in an embodiment of the present application.
  • FIG. 15 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • Fig. 2 is a flow chart of a video coding method provided by the embodiment of the present application.
  • the embodiment of the present application is applicable to video coding in a brightness change scene. The method can be executed by a video coding device, which can be implemented by software and/or hardware and is generally integrated in terminal equipment. Referring to Figure 2, the method provided by the embodiment of the present application specifically includes the following steps:
  • Step 110 Acquire a video image, where the video image is at least one frame image of the video.
  • the video image may be video data that needs to be transmitted, and the video image may be a certain frame of data in the video data sequence or the frame of data corresponding to a certain moment.
  • video data may be processed, and one or more frames of image data may be extracted from it for video encoding.
  • Step 120 Perform weighted predictive encoding on the video image to generate an image code stream, wherein the weighted predictive encoding uses at least one set of weighted predictive identification information and parameters.
  • the current frame can be regarded as the previous frame multiplied as a whole by a weight, plus an offset, so that video encoding can be performed based on the previous frame.
  • This process of video encoding through a weight and an offset can be called weighted predictive coding.
  • three prediction parameters are mainly involved, which are the weight, the offset, and the log weight denominator.
  • the log weight denominator avoids floating-point operations in the encoding process by scaling up the weight.
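The role of the three parameters can be illustrated with a minimal sketch. It assumes the conventional integer form of explicit uni-directional weighted prediction used in existing standards such as H.264 and HEVC (multiply, add a rounding term, shift right by the log weight denominator, add the offset, clip); the function name `weighted_predict` and the sample values are hypothetical and are not taken from this application.

```python
def weighted_predict(ref_samples, weight, offset, log_wd, bit_depth=8):
    """Apply integer weighted prediction to a list of reference samples.

    The effective gain is weight / 2**log_wd, so no floating point is needed.
    """
    max_val = (1 << bit_depth) - 1
    # Rounding term; zero when the denominator is 2**0 = 1.
    rounding = (1 << (log_wd - 1)) if log_wd >= 1 else 0
    out = []
    for s in ref_samples:
        v = ((s * weight + rounding) >> log_wd) + offset
        out.append(min(max(v, 0), max_val))  # clip to the valid sample range
    return out


# weight 80 with log_wd 6 represents a gain of 80 / 64 = 1.25.
print(weighted_predict([100, 200, 250], weight=80, offset=-5, log_wd=6))
# prints [120, 245, 255]
```

With `weight == 1 << log_wd` and `offset == 0` the prediction equals the reference, which is the "no brightness change" case.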
  • the weighted prediction identification information may be identification information of parameters used for weighted prediction, and the parameters may be specific parameters of the weighted prediction identification information, which may include at least one of weight, offset, and logarithmic weight denominator.
  • the image code stream may be data generated after video image encoding, and the image code stream may be used for transmission between terminal devices.
  • weighted predictive encoding can be performed on video images, and one or more sets of weighted prediction identification information and parameters can be used in the process of weighted predictive encoding. For example, different weighted prediction identification information and parameters can be used for video images of different frames, or different weighted prediction identification information and parameters may be used for video images in different regions within the same frame. It can be understood that, in the process of performing weighted predictive coding on video images, different weighted prediction identification information and parameters can be selected for video coding according to brightness changes of video images.
  • by acquiring a video image, where the video image is at least one frame image in the video, and performing weighted predictive coding on the video image to generate an image code stream, wherein at least one set of weighted prediction identification information and parameters is used in the weighted predictive coding process, flexible coding of video images is realized, the video coding efficiency can be improved, and the influence of video image brightness changes on coding efficiency can be reduced.
  • Fig. 3 is a flowchart of another video encoding method provided by the embodiment of the present application.
  • the embodiment of the present application is based on the above embodiment of the application. Referring to Fig. 3, the method provided by the embodiment of the present application specifically includes the following steps:
  • Step 210 Acquire a video image, where the video image is at least one frame image of the video.
  • Step 220 Determine the brightness change according to the comparison result between the video image and the reference image.
  • the reference image can be image data located before or after the currently processed video image in the video data sequence, the reference image can be used for the motion estimation operation of the currently processed video image, and the number of reference images can be one or more frames.
  • the change in brightness may be the change in brightness of the video image compared to the reference image, and the change in brightness may specifically be determined by the change in pixel values between the video image and the reference image.
  • the video image can be compared with the reference image. The comparison method can include calculating the difference between the pixel values at corresponding positions, or calculating the difference between the pixel average values of the video image and the reference image, and so on.
  • the change of brightness can be determined from the comparison result of the video image and the reference image, and may include gradually brightening, gradually darkening, no change, random change, and the like.
  • the brightness change situation includes at least one of the following: an average value of image brightness change, and a pixel point brightness change value.
  • the brightness change between the video image and the reference image can be determined by the average value of the image brightness change and/or the pixel point brightness change value, wherein the average value of the image brightness change can refer to the change between the average brightness values of the current image and the reference image, and the pixel point brightness change value may be the change between the luminance value of each pixel in the video image and the luminance value of the corresponding pixel in the reference image. It can be understood that the luminance change may also be a change in other statistical properties of the luminance values, for example, luminance variance, luminance mean square deviation, and the like.
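The brightness statistics named above can be sketched as follows. This is only an illustrative computation, assuming luma planes are given as equally sized 2-D lists of integers; the function name `brightness_change` and the returned tuple layout are hypothetical.

```python
def brightness_change(cur, ref):
    """Return (mean change, per-pixel changes, variance change) for two luma planes.

    cur and ref are equally sized 2-D lists of luma samples.
    """
    cur_flat = [p for row in cur for p in row]
    ref_flat = [p for row in ref for p in row]
    if len(cur_flat) != len(ref_flat) or not cur_flat:
        raise ValueError("planes must be non-empty and the same size")

    n = len(cur_flat)
    mean_cur = sum(cur_flat) / n
    mean_ref = sum(ref_flat) / n
    # Change of each pixel relative to the co-located reference pixel.
    pixel_diff = [c - r for c, r in zip(cur_flat, ref_flat)]
    # Change of a second-order statistic (population variance).
    var_cur = sum((p - mean_cur) ** 2 for p in cur_flat) / n
    var_ref = sum((p - mean_ref) ** 2 for p in ref_flat) / n
    return mean_cur - mean_ref, pixel_diff, var_cur - var_ref


ref = [[100, 110], [120, 130]]
cur = [[110, 120], [130, 140]]  # uniformly 10 levels brighter
print(brightness_change(cur, ref))
# prints (10.0, [10, 10, 10, 10], 0.0)
```

A uniform offset changes the mean but not the variance, whereas a multiplicative fade would change both, which is one way such statistics could help distinguish the types of change.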
  • Step 230 Perform weighted predictive encoding on the video image according to the brightness variation of the video image, wherein the weighted predictive encoding uses at least one set of weighted predictive identification information and parameters.
  • multiple sets of weighted prediction identification information and parameters can be preset, and the corresponding weighted prediction identification information and parameters can be selected according to the brightness change to perform weighted predictive encoding on video images.
  • the weighted predictive encoding can be performed on the video image according to the specific content of the brightness change.
  • different weighted prediction identification information and parameters can be selected for the weighted predictive encoding according to the brightness changes of the video images of different frames, or different weighted prediction identification information and parameters can be selected for different regions in the video image.
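One way to select among preset parameter sets can be sketched as below. The selection criterion (smallest sum of absolute differences between the current samples and the weighted reference) is an assumption made for illustration; the application does not state which criterion is used. The names `predict`, `select_parameter_set` and `PRESETS` are hypothetical.

```python
def predict(samples, weight, offset, log_wd):
    """Integer weighted prediction of 8-bit samples (conventional form)."""
    rounding = (1 << (log_wd - 1)) if log_wd >= 1 else 0
    return [min(max(((s * weight + rounding) >> log_wd) + offset, 0), 255)
            for s in samples]


def select_parameter_set(cur, ref, candidates):
    """Return (index, sad) of the candidate (weight, offset, log_wd) with the smallest SAD."""
    best_index, best_sad = None, None
    for i, (w, o, d) in enumerate(candidates):
        sad = sum(abs(c - p) for c, p in zip(cur, predict(ref, w, o, d)))
        if best_sad is None or sad < best_sad:
            best_index, best_sad = i, sad
    return best_index, best_sad


# Hypothetical presets: no change, +10 offset, 1.25x gain, 0.75x gain.
PRESETS = [(64, 0, 6), (64, 10, 6), (80, 0, 6), (48, 0, 6)]
ref = [40, 80, 120, 160]
cur = [50, 100, 150, 200]  # the reference scaled by 1.25
print(select_parameter_set(cur, ref, PRESETS))
# prints (2, 0)
```

The same routine could be run per frame or per region, matching the two granularities described above.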
  • Step 240 Write the weighted prediction identification information and parameters into the image code stream.
  • the weighted prediction identification information and parameters used in the encoding process can be written into the image code stream to facilitate the video decoding process in the subsequent process.
  • in this way, flexible encoding of video images is realized, which can improve video encoding efficiency and reduce the impact of video image brightness changes on encoding efficiency.
  • the weighted predictive coding of the video image according to the brightness change includes at least one of the following:
  • when the brightness change is that the brightness of the entire frame is consistent, weighted predictive encoding is performed on the video image as a whole;
  • when the brightness change is that the brightness of the partitioned images is consistent, it is determined to perform weighted predictive coding on the partitioned images in the video image respectively.
  • the uniform brightness of the entire frame may mean that the brightness change is the same across the entire frame of the video image.
  • the uniform brightness of the partition image may mean that there are multiple regions in the video image, and the brightness change differs from region to region.
  • when the brightness change is consistent across the entire frame, weighted predictive coding can be performed on the entire frame of the video image; when the brightness changes in different ways across image areas, weighted predictive encoding can be performed for each image area separately. It can be understood that when the brightness changes of the image areas are different, the weighted prediction identification information and parameters used in the weighted predictive encoding process can be different.
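The choice between whole-frame and per-region weighted prediction can be sketched as follows. The regular grid partition, the use of the mean luma change per region, and the `tolerance` threshold are all assumptions made for illustration; `region_mean_changes` and `choose_granularity` are hypothetical names.

```python
def region_mean_changes(cur, ref, rows, cols):
    """Mean luma change of each region in a rows x cols grid (row-major order)."""
    height, width = len(cur), len(cur[0])
    if rows > height or cols > width:
        raise ValueError("grid is finer than the image")
    changes = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * height // rows, (r + 1) * height // rows
            x0, x1 = c * width // cols, (c + 1) * width // cols
            diffs = [cur[y][x] - ref[y][x]
                     for y in range(y0, y1) for x in range(x0, x1)]
            changes.append(sum(diffs) / len(diffs))
    return changes


def choose_granularity(changes, tolerance=2.0):
    """Return 'frame' if all regions changed by about the same amount, else 'region'."""
    return "frame" if max(changes) - min(changes) <= tolerance else "region"


ref = [[100] * 4 for _ in range(2)]
cur = [[110, 110, 100, 100],
       [110, 110, 100, 100]]  # only the left half became brighter
changes = region_mean_changes(cur, ref, rows=1, cols=2)
print(changes, choose_granularity(changes))
# prints [10.0, 0.0] region
```

When the result is `'region'`, each region would get its own set of weighted prediction identification information and parameters.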
  • the video image has at least one set of the weighted prediction identification information and parameters for the reference image or a partition image of the reference image.
  • the weighted prediction identification information and parameters used in the weighted prediction coding process of the video image are information determined relative to the reference image, and the video image has one or more sets of weighted prediction identification information and parameters relative to the reference image. When the reference image of the video image has multiple frames, the video image can have one or more sets of weighted prediction identification information and parameters for each frame of reference image, and each set of weighted prediction identification information and parameters can be associated with the corresponding video image and/or reference image.
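The association between parameter sets, reference images and partition images can be pictured with a small lookup table. This is a sketch of one possible in-memory organisation, not a structure defined by the application; `WPParams`, `lookup` and the `(ref_idx, region_id)` key are hypothetical.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class WPParams(NamedTuple):
    weight: int
    offset: int
    log_wd: int


# Key: (reference image index, region id); region id None means the whole reference image.
ParamTable = Dict[Tuple[int, Optional[int]], WPParams]


def lookup(table: ParamTable, ref_idx: int, region_id: Optional[int]) -> Optional[WPParams]:
    """Prefer a region-specific set; fall back to the whole-image set, else None."""
    if (ref_idx, region_id) in table:
        return table[(ref_idx, region_id)]
    return table.get((ref_idx, None))


table: ParamTable = {
    (0, None): WPParams(64, 0, 6),   # reference 0, whole image: no change
    (0, 3): WPParams(80, -5, 6),     # reference 0, region 3: brighter
}
print(lookup(table, 0, 3))
print(lookup(table, 0, 1))
print(lookup(table, 1, 3))
```

The three calls return the region-specific set, the whole-image fallback, and `None` (no set signalled for reference 1), respectively.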
  • Fig. 4 is a flowchart of another video coding method provided by the embodiment of the present application.
  • the embodiment of the present application is based on the above embodiment of the application. Referring to Fig. 4, the method provided by the embodiment of the present application specifically includes the following steps:
  • Step 310 Acquire a video image, where the video image is at least one frame image of the video.
  • Step 320 Perform weighted predictive coding on the video image according to the pre-trained neural network model.
  • the neural network model can perform weighted predictive coding processing on the video image, and can determine the weighted prediction identification information and parameters used in the video image, and the neural network model can be generated by training with image samples marked with the weighted prediction identification information and parameters.
  • the neural network model can determine the weighted prediction identification information and parameters of the video image or directly determine the image code stream of the video image.
  • the video image may be directly or indirectly input into the neural network model, and the weighted predictive coding of the video image may be implemented through the neural network model.
  • the input layer of the neural network model can accept video images or features of video images, and the neural network model can generate weighted prediction identification information and parameters used in weighted predictive coding of video images, or directly perform weighted prediction on video images coding.
  • Step 330 Write the weighted prediction identification information and parameters into the image code stream.
  • by acquiring a video image, using a pre-trained neural network model to perform weighted predictive encoding on the video image, and writing the weighted prediction identification information and parameters used in the weighted predictive encoding into the image code stream generated by encoding the video image, flexible encoding of video images is realized, which can improve video encoding efficiency and reduce the impact of video image brightness changes on encoding efficiency.
  • the weighted prediction identification information further includes a neural network model structure and neural network model parameters.
  • the weighted prediction identification information may also include the neural network model structure and neural network model parameters, wherein the neural network model structure may be information reflecting the structure of the neural network, for example, the function used by the fully connected layer, the activation function, the loss function, etc., and the neural network model parameters may be specific values of the parameters of the neural network model, for example, the network weight values, the number of hidden layers, and the like.
  • the set of weighted prediction identification information and parameters used in the weighted predictive encoding corresponds to one frame of the video image or at least one partition image of the video image.
  • one or more sets of weighted prediction identification information and parameters can be used in the weighted predictive coding process, and each set corresponds to one frame of the video image or to one partition image of the video image. A partition image may be a part of a video image, for example, a slice image or a sub-image.
  • the specification of the partition image includes at least one of the following: Slice, Tile, Subpicture, Coding Tree Unit, Coding Unit.
  • the partition image can be one of Slice, Tile, Subpicture, Coding Tree Unit and Coding Unit, or a combination of several of them.
  • the weighted prediction identification information and the parameters are included in at least one of the following parameter sets: sequence layer parameter set, image layer parameter set, slice layer parameter set, supplementary enhancement information, video usability information, image header information, slice header information, network abstraction layer unit header information, coding tree unit, coding unit.
  • the weighted prediction identification information and parameters can be written into the image code stream, and the identification information and parameters are included in all or part of the following parameter sets: sequence layer parameter set, image layer parameter set, slice layer parameter set, supplementary enhancement information, video usability information, image header information, slice header information, network abstraction layer unit header information; they may also be carried as a new information unit, or be included in the coding tree unit or the coding unit.
  • the weighted prediction identification information and the parameters include at least one of the following information: reference image index information, weighted prediction enable control information, region-adaptive weighted prediction enable control information, and weighted prediction parameters.
  • the weighted prediction identification information and parameters may include: reference image index information, which is used to determine the reference image used for the brightness change; weighted prediction enable control information, which is used to determine whether to perform weighted predictive coding; region-adaptive weighted prediction enable control information, which is used to determine whether to perform region-wise weighted predictive coding on the video image; and weighted prediction parameters, which are the parameters used in the weighted predictive coding process and can include weight, offset, logarithmic weight denominator, etc.
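  • as an illustration only, the following minimal Python sketch shows one way a weight, an offset and a logarithmic weight denominator could be applied to a single reference sample. It assumes the integer formulation commonly used for explicit weighted prediction in H.26x-style codecs; the function name `weighted_sample` and the default 8-bit sample range are assumptions and are not defined in this application.

```python
def weighted_sample(ref, weight, offset, log2_denom, bit_depth=8):
    """Apply a weight and offset to one reference sample with integer arithmetic.

    Hypothetical helper: the weight is a fixed-point gain whose unit value
    is (1 << log2_denom), so no floating-point operation is needed.
    """
    if log2_denom >= 1:
        rounding = 1 << (log2_denom - 1)
        value = ((ref * weight + rounding) >> log2_denom) + offset
    else:
        value = ref * weight + offset
    max_val = (1 << bit_depth) - 1
    # Clip to the valid sample range for the assumed bit depth.
    return max(0, min(max_val, value))
```

  • with `log2_denom = 6`, a weight of 64 and an offset of 0 leave the sample unchanged, which shows how the logarithmic weight denominator lets fractional gains be expressed with integers only.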
  • the image code stream includes a transport stream or a media file.
  • the image code stream may be a transport stream or a media file.
  • FIG. 5 is an example diagram of a video encoding method provided in the embodiment of the present application.
  • the input of the encoding process is the image contained in the video
  • the output is the image code stream or the video code stream containing the image code stream.
  • the process of video encoding may include the following steps:
  • Step 101 Read an image, where the image may be a frame of data in a video sequence, or a frame of data corresponding to a certain moment.
  • Step 102 Detect brightness changes of the image relative to a reference image.
  • the reference image may refer to an image before the image on the timeline in the video sequence, or an image after the current image on the timeline, and the reference image is used to perform a motion estimation operation on the current image.
  • the reference image can have one frame or multiple frames;
  • brightness change can refer to the change in the mean brightness value between the current image and the reference image, the change between the brightness value of each pixel of the current image and that of the corresponding pixel of the reference image, or the change of other statistical characteristics of the brightness value;
  • when there are multiple reference frames, the brightness change of the current image needs to be detected against each reference frame.
  • Step 103 According to the brightness change detection result in step 102, it is judged whether a weighted prediction operation needs to be adopted.
  • the basis for judging includes a trade-off between improving coding efficiency and increasing coding complexity in a weighted prediction operation for a specific luminance change situation.
  • Step 104 When weighted prediction operation is required, perform weighted prediction encoding, and record the index information of the aforementioned reference image, one or more sets of weighted prediction parameters, and/or weighted prediction parameter index information.
  • the encoder records the regional characteristics of the luminance change.
  • the regional characteristics of the luminance change include that the luminance change keeps a consistent change in the entire frame image area, or the luminance change keeps a consistent change in a partial area of the image.
  • the partial area may be one or more slices (Slice), or one or more tiles (Tile), or one or more subpictures (Subpicture), or one or more coding tree units (Coding Tree Unit, CTU), or one or more coding units (Coding Unit, CU);
  • if the brightness change is consistent over the entire frame image area, the brightness change of the current image relative to the reference image can be considered uniform; otherwise it is not uniform;
  • if the brightness change of the current image is uniform, perform weighted predictive coding, and record the aforementioned reference image index information and a set of weighted prediction parameters, wherein the weighted prediction parameters include weight and offset;
  • one image region may correspond to a set of weighted prediction parameters, or multiple image regions may correspond to a set of weighted prediction parameters.
  • Step 105 Write weighted prediction identification information and parameters into the code stream. The identification information and parameters are included in all or part of the following parameter sets: sequence layer parameter set, image layer parameter set, slice layer parameter set, supplementary enhancement information, video usability information, image header information, slice header information, network abstraction layer unit header information; they may also be carried as a new information unit, or be included in the coding tree unit or the coding unit.
  • Step 106 When no weighted prediction operation is needed, directly perform image coding based on traditional methods;
  • Step 107 Output the image code stream or the transport stream or media file containing the image code stream.
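  • Steps 102 to 104 above can be sketched roughly as follows. This is a simplified illustration, assuming that the brightness change is measured as a difference of mean luma per region and that regions with similar changes share one set of parameters; the function names, the threshold and the tolerance are hypothetical and are not specified in this application.

```python
def mean_luma(samples):
    """Mean of a 2-D list of luma samples."""
    flat = [v for row in samples for v in row]
    return sum(flat) / len(flat)


def region_luma_deltas(current_regions, reference_regions):
    """Step 102: mean luma change of each region relative to the reference."""
    return [mean_luma(c) - mean_luma(r)
            for c, r in zip(current_regions, reference_regions)]


def needs_weighted_prediction(deltas, threshold=2.0):
    """Step 103: a simple stand-in for the efficiency/complexity trade-off."""
    return any(abs(d) >= threshold for d in deltas)


def group_regions(deltas, tolerance=1.0):
    """Step 104: regions with similar deltas share one parameter set."""
    groups = []  # each entry: (representative delta, [region indices])
    for idx, delta in enumerate(deltas):
        for rep, members in groups:
            if abs(delta - rep) <= tolerance:
                members.append(idx)
                break
        else:
            groups.append((delta, [idx]))
    return groups
```

  • in this sketch, a single resulting group corresponds to a uniform brightness change over the whole frame, while several groups correspond to region-wise changes that would each need their own set of weighted prediction parameters.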
  • FIG. 6 is an example diagram of a video encoding method provided by the embodiment of the present application. Referring to FIG. 6, this embodiment applies deep learning technology to region-adaptive weighted prediction operations, and the process of video encoding may include the following steps:
  • Step 201 Read an image, where the image may be a frame of data in a video sequence, or a frame of data corresponding to a certain moment.
  • Step 202 Determine whether to use a weighted prediction scheme based on deep learning technology.
  • Step 203 When using deep learning technology, use the neural network model and parameters generated by training to perform image weighted predictive coding.
  • weighted prediction parameters may include weights and offsets; wherein, a single image may use all or part of the weights and offsets in a set of values.
  • Step 204 When performing image weighted prediction encoding based on the neural network model, record the necessary weighted prediction parameters of the deep learning scheme, including but not limited to the structure and parameters of the neural network model, a set of extracted weighted prediction parameter values, and the parameter index information used in each image area.
  • Step 205 When the deep learning technology is not used, perform region-adaptive weighted predictive coding based on a traditional computing scheme, for example, based on the operation process in Embodiment 1.
  • Step 206 Write weighted prediction identification information and parameters into the code stream. The identification information and parameters are included in all or part of the following parameter sets: sequence layer parameter set, image layer parameter set, slice layer parameter set, supplementary enhancement information, video usability information, image header information, slice header information, network abstraction layer unit header information; they may also be carried as a new information unit, or be included in the coding tree unit or the coding unit.
  • Step 207 Output the image code stream or the transport stream or media file containing the image code stream.
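  • the recording described in Step 204, a set of weighted prediction parameter values plus the index used by each image area, could be organized as in the following sketch. It assumes each area has already been assigned a (weight, offset) pair by some model; the function name and the tuple layout are hypothetical.

```python
def build_parameter_index(per_region_params):
    """Deduplicate per-region (weight, offset) pairs.

    Returns the list of distinct parameter sets and, for every region,
    the index of the set it uses.
    """
    table = []
    indices = []
    for params in per_region_params:
        if params not in table:
            table.append(params)
        indices.append(table.index(params))
    return table, indices
```

  • storing an index per area instead of a full parameter set is one plausible way to keep the signalling cost low when many areas share the same brightness change.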
  • Fig. 7 is a flow chart of a video decoding method provided by the embodiment of the present application.
  • the embodiment of the present application is applicable to video decoding in a brightness change scene. The method can be executed by a video decoding device, which can be implemented by software and/or hardware and is generally integrated in a terminal device. Referring to FIG. 7, the method provided by the embodiment of the present application specifically includes the following steps:
  • Step 410 acquire the image code stream, and analyze the weighted prediction identification information and parameters in the image code stream.
  • the transmitted image code stream may be received, and weighted prediction identification information and parameters may be extracted from the image code stream, wherein the image code stream may include one or more sets of weighted prediction identification information and parameters.
  • Step 420 decode the image code stream according to the weighted prediction identification information and parameters to generate a reconstructed image.
  • weighted predictive decoding can be performed on the image code stream by using the obtained weighted prediction identification information and parameters, and the image code stream can be processed into a reconstructed image, wherein the reconstructed image can be an image generated according to the transmission code stream.
  • by acquiring the image code stream, obtaining the weighted prediction identification information and parameters in the image code stream, and processing the image code stream according to the obtained weighted prediction identification information and parameters to generate a reconstructed image, dynamic decoding of the image code stream is realized, which can improve video decoding efficiency and reduce the impact of image brightness changes on video decoding efficiency.
  • Fig. 8 is a flowchart of another video decoding method provided by the embodiment of the present application.
  • the embodiment of the present application builds on the above embodiment. Referring to FIG. 8, the method provided by the embodiment of the present application specifically includes the following steps:
  • Step 510 acquire the image code stream, and analyze the weighted prediction identification information and parameters in the image code stream.
  • Step 520 Perform weighted predictive decoding on the image code stream by using the weighted predictive identification information and parameters according to the pre-trained neural network model to generate a reconstructed image.
  • the neural network model may be a deep learning model for image code stream decoding
  • the neural network model may be generated by training on sample code streams and sample images
  • the neural network model may perform weighted predictive decoding on the image code stream.
  • the image code stream and the weighted prediction identification information and parameters can be input into the pre-trained neural network model, and the neural network model performs weighted predictive decoding processing on the image code stream, and processes the image code stream into a reconstructed image.
  • this realizes dynamic decoding of the image code stream, which can improve the video decoding efficiency and reduce the impact of image brightness changes on the video decoding efficiency.
  • the number of weighted prediction identification information and parameters in the image code stream is at least one set.
  • the image code stream may be the information generated by weighted predictive coding of video images. Depending on the weighted predictive coding method used, there may be one or more sets of weighted prediction identification information and parameters in the image code stream. For example, if the encoding end performs weighted predictive coding separately on different regions of the video image, multiple sets of weighted prediction identification information and parameters may exist in the image code stream.
  • the weighted prediction identification information and the parameters are included in at least one of the following parameter sets: sequence layer parameter set, image layer parameter set, slice layer parameter set, supplementary enhancement information, video usability information, image header information, slice header information, network abstraction layer unit header information, coding tree unit, coding unit.
  • the weighted prediction identification information and the parameters include at least one of the following information: reference image index information, weighted prediction enable control information, region-adaptive weighted prediction enable control information, and weighted prediction parameters.
  • the image code stream includes a transport stream or a media file.
  • the weighted prediction identification information and parameters also include neural network model structure and neural network model parameters.
  • FIG. 9 is an example diagram of a video decoding method provided in the embodiment of the present application.
  • the input of the decoding process is an image code stream, or a transport data stream or media file containing an image code stream
  • the output is the images that constitute a video
  • the decoding process of a video image may include:
  • Step 301 Read code stream.
  • Step 302 Parse the code stream to obtain weighted prediction identification information.
  • the decoder parses the sequence layer parameter set, the image layer parameter set and/or the slice layer parameter set to obtain weighted prediction identification information.
  • the sequence layer parameter set includes the sequence parameter set (Sequence Parameter Set, SPS), the image layer parameter set includes the picture parameter set (Picture Parameter Set, PPS) and the adaptation parameter set (Adaptation Parameter Set, APS), and the slice layer parameter set includes the APS.
  • the weighted prediction identification information in the sequence layer parameter set can be referenced by the image layer parameter set and the slice layer parameter set, and the weighted prediction identification information in the image layer parameter set can be referenced by the slice layer parameter set.
  • the weighted prediction identification information includes but is not limited to whether the sequence and/or image indicated by the current parameter set adopts unidirectional weighted prediction, bidirectional weighted prediction, and/or region-adaptive multi-weighted weighted prediction;
  • distinguishing whether to adopt the region-adaptive multi-weight weighted prediction method includes but is not limited to parsing binary identifiers, or the number of weighted prediction parameter sets (whether it is equal to or greater than 1).
  • Step 303 According to the weighted prediction identification information, determine whether the current image adopts weighted prediction decoding.
  • Step 304 When it is determined that the current image adopts weighted prediction decoding, obtain weighted prediction parameter information.
  • the decoder acquires weighted prediction parameter information from the parameter set and/or data header information according to the indication of the identification information.
  • the parameter set includes SPS, PPS, APS
  • the data header information includes image header PH, slice header (Slice Header, SH).
  • the weighted prediction parameter information includes, but is not limited to, whether each reference image in the reference image list is configured with weighted prediction parameters (weight and offset), the number of sets of weighted prediction parameters configured for each reference image, and the specific values of each set of weighted prediction parameters, etc.
  • Step 305 Perform weighted prediction decoding on the current image according to the weighted prediction identification information and parameters.
  • the decoder can perform unified weighted prediction decoding on the current complete image, or perform differentiated weighted prediction decoding on each partial content in the image.
  • Step 306 When it is determined that the current image does not use weighted predictive decoding, directly perform image decoding based on traditional methods.
  • Step 307 Generate a reconstructed image.
  • the reconstructed image can be used for display or saved directly.
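  • the prediction part of Steps 303 to 306 can be sketched as below, covering both unified weighted prediction for the whole image and differentiated weighted prediction per region. This is only an illustration of the prediction stage, not a full decoder (residual decoding and motion compensation are omitted); the function names, the region map representation and the fixed `log2_denom` are assumptions.

```python
def weighted(ref, weight, offset, log2_denom, max_val=255):
    """Integer weighted prediction of one sample (assumed 8-bit range)."""
    rounding = (1 << (log2_denom - 1)) if log2_denom >= 1 else 0
    value = ((ref * weight + rounding) >> log2_denom) + offset
    return max(0, min(max_val, value))


def predict_picture(ref, wp_enabled, region_map=None, param_sets=None,
                    log2_denom=6):
    """Return the prediction picture derived from the reference picture.

    ref:        2-D list of reference samples
    region_map: 2-D list of indices into param_sets, or None to apply
                param_sets[0] to the whole picture
    param_sets: list of (weight, offset) pairs parsed from the code stream
    """
    if not wp_enabled:
        # Step 306: no weighted prediction, the reference is used as-is.
        return [row[:] for row in ref]
    out = []
    for y, row in enumerate(ref):
        out_row = []
        for x, sample in enumerate(row):
            set_idx = region_map[y][x] if region_map is not None else 0
            weight, offset = param_sets[set_idx]
            out_row.append(weighted(sample, weight, offset, log2_denom))
        out.append(out_row)
    return out
```

  • passing `region_map=None` models the unified case with a single set of parameters, while a region map with several indices models region-adaptive prediction with multiple sets for one reference image.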
  • FIG. 10 is an example diagram of another video decoding method provided by the embodiment of the present application.
  • this embodiment applies deep learning technology to region-adaptive weighted prediction operations, and the video decoding method specifically includes the following steps:
  • Step 401 Read code stream.
  • Step 402 Parse the code stream to obtain identification information of weighted prediction based on deep learning.
  • the decoder parses the sequence layer parameter set, image layer parameter set and/or slice layer parameter set to obtain identification information of weighted prediction based on deep learning.
  • the sequence layer parameter set includes the sequence parameter set (Sequence Parameter Set, SPS), the image layer parameter set includes the picture parameter set (Picture Parameter Set, PPS) and the adaptation parameter set (Adaptation Parameter Set, APS), and the slice layer parameter set includes the APS.
  • the weighted prediction identification information in the sequence layer parameter set can be referenced by the image layer parameter set and the slice layer parameter set
  • the weighted prediction identification information in the image layer parameter set can be referenced by the slice layer parameter set.
  • the identification information of the weighted prediction based on deep learning includes but not limited to whether the sequence and/or image indicated by the current parameter set adopts the weighted prediction based on deep learning.
  • Step 403 According to the identification information of weighted prediction based on deep learning, determine whether the current image is decoded by weighted prediction based on deep learning.
  • Step 404 When it is determined that the current image adopts weighted prediction decoding based on deep learning, obtain necessary weighted prediction parameters of the deep learning scheme.
  • the decoder acquires weighted prediction parameter information from the parameter set and/or data header information according to the indication of the identification information.
  • the parameter set includes SPS, PPS, APS
  • the data header information includes image header PH, slice header (Slice Header, SH).
  • the weighted prediction parameter information includes, but is not limited to, reference image index information, neural network model structure and parameters, all or part of the weighted prediction parameter values, and weighted prediction parameter index information used in each region of the current image.
  • Step 405 Perform weighted prediction decoding based on deep learning on the current image according to the weighted prediction identification information and parameters.
  • Step 406 When it is determined that the current image does not adopt weighted predictive decoding based on deep learning, directly perform image decoding based on traditional methods.
  • Step 407 Generate a reconstructed image.
  • the reconstructed image can be used for display or saved directly.
  • the embodiment provides identification information of region-adaptive weighted prediction parameters included in the sequence layer parameter set SPS in the code stream.
  • the identification information in SPS can be referenced by PPS and APS.
  • the syntax and semantics in Table 1 are defined as follows:
  • sps_weighted_pred_flag is the enabling control information for the sequence layer to apply weighted prediction techniques to unidirectional prediction slices (P slices).
  • sps_weighted_bipred_flag is the enabling control information for the sequence layer to apply the weighted prediction technique to biprediction slices (B slices).
  • sps_wp_multi_weights_flag is the enabling control information for the sequence layer to have multiple sets of weighted prediction parameters for a single reference picture.
  • when sps_wp_multi_weights_flag is equal to 1, it indicates that a single reference picture of the picture indicated by the SPS can have multiple sets of weighted prediction parameters; conversely, when sps_wp_multi_weights_flag is equal to 0, it indicates that a single reference picture of the picture indicated by the SPS has only a single set of weighted prediction parameters.
  • this embodiment provides identification information of region-adaptive weighted prediction parameters included in the picture layer parameter set PPS in the code stream, and the identification information in the PPS can be referenced by the APS.
  • pps_weighted_pred_flag is the enabling control information for the image layer to apply weighted prediction technology to unidirectional prediction slices (P slices).
  • pps_weighted_bipred_flag is the enabling control information for the image layer to apply the weighted prediction technique to bipredictive slices (B slices).
  • pps_wp_multi_weights_flag is the enabling control information for the picture layer to have multiple sets of weighted prediction parameters with respect to a single reference picture.
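  • when pps_wp_multi_weights_flag is equal to 1, it indicates that a single reference picture of the picture indicated by the PPS can have multiple sets of weighted prediction parameters.
  • when sps_wp_multi_weights_flag is equal to 0, the value of pps_wp_multi_weights_flag should be equal to 0.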
  • when pps_no_pic_partition_flag is equal to 1, it indicates that no image partitioning is applied to any image indicated by the PPS; when pps_no_pic_partition_flag is equal to 0, it indicates that each image indicated by the PPS may be divided into multiple tiles (Tile) or slices (Slice).
  • when pps_rpl_info_in_ph_flag is equal to 1, it indicates that the reference picture list (Reference Picture List, RPL) information exists in the picture header (Picture Header, PH) syntax structure, and does not exist in slice headers indicated by the PPS that do not contain the PH syntax structure. When pps_rpl_info_in_ph_flag is equal to 0, it indicates that the RPL information does not exist in the PH syntax structure, and may exist in the slice header indicated by the PPS.
  • when pps_wp_info_in_ph_flag is equal to 1, it indicates that the weighted prediction information may exist in the PH syntax structure, and does not exist in slice headers indicated by the PPS that do not contain the PH syntax structure. When pps_wp_info_in_ph_flag is equal to 0, it indicates that the weighted prediction information does not exist in the PH syntax structure, and may exist in the slice header indicated by the PPS.
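  • two of the PPS-level rules above can be expressed as simple checks. The sketch below is illustrative only: the function names are hypothetical, and it assumes the flags have already been parsed into integers.

```python
def validate_multi_weights_flags(sps_wp_multi_weights_flag,
                                 pps_wp_multi_weights_flag):
    """Check the constraint between the SPS-level and PPS-level flags."""
    if sps_wp_multi_weights_flag == 0 and pps_wp_multi_weights_flag != 0:
        raise ValueError(
            "pps_wp_multi_weights_flag must be 0 when "
            "sps_wp_multi_weights_flag is 0")


def weighted_prediction_info_location(pps_wp_info_in_ph_flag):
    """Tell where weighted prediction information may be found."""
    if pps_wp_info_in_ph_flag == 1:
        return "picture_header"
    return "slice_header"
```

  • a parser could run such a check right after reading the PPS, so that an inconsistent code stream is rejected before any weighted prediction table is read.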
  • the embodiment provides identification information of region-adaptive weighted prediction parameters included in the picture header (PH) in the code stream.
  • the weighted prediction parameters included in the PH may be referenced by the current picture, slices in the current picture, and/or CTUs or CUs in the current picture.
  • when ph_inter_slice_allowed_flag is equal to 0, it indicates that all coded slices of the image are of the intra prediction type (I slice). When ph_inter_slice_allowed_flag is equal to 1, it indicates that the image is allowed to contain one or more slices of the unidirectional or bidirectional inter prediction type (P slice or B slice).
  • multi_pred_weights_table() is a numerical table containing weighted prediction parameters, where a single reference image can have multiple sets of weighted prediction parameters (weight + offset).
  • weighted prediction parameters can be obtained from the table.
  • pred_weight_table() is a numerical table containing weighted prediction parameters, where a single reference image has only a single set of weighted prediction parameters (weight + offset).
  • the embodiment provides identification information of region-adaptive weighted prediction parameters included in the slice header SH in the code stream.
  • the weighted prediction parameters contained in the SH can be referenced by the current slice and/or the CTU or CU in the current slice.
  • when sh_picture_header_in_slice_header_flag is equal to 1, it indicates that the PH syntax structure exists in the slice header.
  • when sh_picture_header_in_slice_header_flag is equal to 0, the PH syntax structure does not exist in the slice header, that is, the slice layer need not completely inherit the identification information of the image layer, and the encoding tools can be flexibly selected.
  • sh_slice_type indicates the coding type of the slice, which can be intra coding type (I slice), unidirectional inter coding type (P slice), and bidirectional inter coding type (B slice).
  • sh_wp_multi_weights_flag is the enable control information for the slice layer to have multiple sets of weighted prediction parameters with respect to a single reference picture.
  • when multiple sets of weighted prediction parameters are enabled, the weighted prediction parameters can be obtained from the value table multi_pred_weights_table(); if pps_wp_multi_weights_flag is equal to 0, the weighted prediction parameters can be obtained from the value table pred_weight_table().
  • the embodiment provides the syntax and semantics of the weighted prediction parameter value tables. pred_weight_table() and multi_pred_weights_table() are both value tables containing weighted prediction parameters; the difference is that the former defines that a single reference image has only a single set of weighted prediction parameters, while the latter defines that a single reference image can have multiple sets of weighted prediction parameters.
  • the syntax and semantics of pred_weight_table() can be found in the document description of the international standard H.266/VVC version 1; the syntax and semantics of multi_pred_weights_table() are given in Table 5 and its description, in which:
  • luma_log2_weight_denom and delta_chroma_log2_weight_denom are magnification factors for luma and chrominance weighting factors, respectively, to avoid floating-point operations at the encoding end.
  • num_l0_weights indicates, when pps_wp_info_in_ph_flag is equal to 1, the number of weighting factors that need to be indicated for the entries (reference pictures) in reference picture list 0 (RPL0).
  • the value range of num_l0_weights is [0, Min(15, num_ref_entries[0][RplsIdx[0]])], where num_ref_entries[listIdx][rplsIdx] indicates the number of entries in the reference image list syntax structure ref_pic_list_struct(listIdx, rplsIdx).
  • when pps_wp_info_in_ph_flag is equal to 1, the variable NumWeightsL0 is set to num_l0_weights; otherwise, when pps_wp_info_in_ph_flag is equal to 0, the variable NumWeightsL0 is set to NumRefIdxActive[0].
  • the value of NumRefIdxActive[i]-1 indicates the maximum reference index that may be used to decode the slice in the reference picture list i (RPLi). When the value of NumRefIdxActive[i] is 0, it indicates that there is no reference index in RPLi for decoding slices.
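  • the derivation of NumWeightsL0 described above can be written out as a small function. The sketch below is a hedged illustration: the function name and argument names are hypothetical, and `num_ref_entries_l0` stands for num_ref_entries[0][RplsIdx[0]].

```python
def derive_num_weights_l0(pps_wp_info_in_ph_flag, num_l0_weights,
                          num_ref_entries_l0, num_ref_idx_active_l0):
    """Return NumWeightsL0 according to where the weights are signalled."""
    if pps_wp_info_in_ph_flag == 1:
        # Range check: [0, Min(15, num_ref_entries[0][RplsIdx[0]])].
        upper = min(15, num_ref_entries_l0)
        if not 0 <= num_l0_weights <= upper:
            raise ValueError("num_l0_weights out of range")
        return num_l0_weights
    # Weights signalled per slice: one candidate per active reference index.
    return num_ref_idx_active_l0
```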
  • when luma_weight_l0_flag[i] is equal to 1, it indicates that the luma component for unidirectional prediction using the i-th entry in reference list 0 (RefPicList[0][i]) has a weighting factor (weight + offset). When luma_weight_l0_flag[i] is equal to 0, it indicates that the above weighting factor does not exist.
  • when chroma_weight_l0_flag[i] is equal to 1, it indicates that the chroma prediction value for unidirectional prediction using the i-th entry in reference list 0 (RefPicList[0][i]) has a weighting factor (weight + offset). When chroma_weight_l0_flag[i] is equal to 0, it indicates that the above weighting factor does not exist (by default).
  • num_l0_luma_pred_weights[i] indicates, when luma_weight_l0_flag[i] is equal to 1, the number of weighting factors that need to be indicated for the luma component of entry i (reference image i) in reference image list 0 (RPL0), that is, the number of weighted prediction parameters that the luma component of a single reference image i in list 0 can carry.
  • delta_luma_weight_l0[i][k] and luma_offset_l0[i][k] respectively indicate the k-th weight factor and offset value of the i-th reference image luminance component in reference image list 0.
  • num_l0_chroma_pred_weights[i][j] indicates, when chroma_weight_l0_flag[i] is equal to 1, the number of weighting factors that need to be indicated for the j-th chrominance component of entry i (reference image i) in reference image list 0 (RPL0), that is, the number of weighted prediction parameters that the j-th chrominance component of a single reference image i in list 0 can carry.
  • delta_chroma_weight_l0[i][j][k] and delta_chroma_offset_l0[i][j][k] respectively indicate the k-th weight factor and offset value of the j-th chrominance component of the i-th reference image in reference image list 0.
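  • the following sketch shows how actual luma weights might be recovered from the signalled deltas. It assumes that multi_pred_weights_table() follows the same convention as pred_weight_table() in H.266/VVC, where a weight equals (1 << luma_log2_weight_denom) plus the signalled delta and the chroma denominator equals the luma denominator plus delta_chroma_log2_weight_denom; this application does not state that derivation explicitly, and the function names are hypothetical.

```python
def luma_weights_l0(luma_log2_weight_denom, delta_luma_weight_l0):
    """Turn delta_luma_weight_l0[i][k] into weights for each reference i
    and each parameter set k (assumed derivation, see lead-in)."""
    default_weight = 1 << luma_log2_weight_denom
    return [[default_weight + delta for delta in per_ref]
            for per_ref in delta_luma_weight_l0]


def chroma_log2_weight_denom(luma_log2_weight_denom,
                             delta_chroma_log2_weight_denom):
    """Chroma denominator expressed relative to the luma denominator."""
    return luma_log2_weight_denom + delta_chroma_log2_weight_denom
```

  • coding weights as deltas around the unit gain keeps the signalled values small when the brightness change is mild.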
  • this embodiment provides the syntax and semantics of another numerical table of weighted prediction parameters.
  • Both pred_weight_table() and multi_pred_weights_table() are numerical tables containing weighted prediction parameters. The difference is that the former defines that a single reference image has only one set of weighted prediction parameters, while the latter defines that a single reference image can have multiple sets of weighted prediction parameters.
  • the syntax and semantics of pred_weight_table() can be found in the document description of the international standard H.266/VVC version 1; the syntax and semantics of multi_pred_weights_table() are given in Table 6 and its description.
  • multi_pred_weights_table() involves many entries in the reference image list, that is, multiple specified reference images need to determine whether there are weighting factors, and a single reference image with weighting factors may also have multiple sets of weighted prediction parameters (including weights and offsets).
  • this embodiment is a special case of the eleventh embodiment, in which only one reference image in each reference image list is considered; that is, when it is determined that the weighted prediction technique is applied, the multiple sets of weighted prediction parameters of that reference image are specified directly.
  • the meaning of each field in Table 6 is the same as the corresponding semantic interpretation of each field in Table 5.
  • the identification information and parameters of the region-adaptive weighted prediction are given in the coding tree unit CTU.
  • the weighted prediction parameter information in the CTU can be identified independently, can refer to other parameter sets (such as the sequence layer parameter set SPS or the picture layer parameter set PPS) or header information (such as the picture header PH or the slice header SH), can record a weighted prediction parameter difference, or can record weighted prediction parameter index information together with a difference.
  • when the weighted prediction parameter information in the CTU records a difference, or index information and a difference, a weighted prediction value is obtained from another parameter set or header information and the difference in the CTU is added to it, giving the weighted prediction parameters finally applied to the CTU or CU.
  • when the weighted prediction parameter information is carried per CTU, fine-grained brightness gradient effects at CTU granularity, such as circular and radial gradients, can be realized.
  • the specific code stream organization method can be shown in Table 7.
  • the weighted prediction parameter finally applied to the current CTU is the specific weighted prediction parameter of the reference image plus the weighted prediction parameter difference defined in coding_tree_unit().
  • the weighted prediction parameter difference includes, but is not limited to, the weighted prediction parameter difference for the luma component in RPL0 (ctu_delta_luma_weight_l0 and ctu_delta_luma_offset_l0), the weighted prediction parameter difference for the chroma component in RPL0 (ctu_delta_chroma_weight_l0[i] and ctu_delta_chroma_offset_l0[i]), the weighted prediction parameter difference for the luma component in RPL1 (ctu_delta_luma_weight_l1 and ctu_delta_luma_offset_l1), and the weighted prediction parameter difference for the chroma component in RPL1 (ctu_delta_chroma_weight_l1[i] and ctu_delta_chroma_offset_l1[i]).
  • the weighted prediction parameter index information includes, but is not limited to, the weighted prediction parameter index number for the luma component in RPL0 (ctu_index_l0_luma_pred_weights), the weighted prediction parameter index number for the chroma component in RPL0 (ctu_index_l0_chroma_pred_weights[i]), the weighted prediction parameter index number for the luma component in RPL1 (ctu_index_l1_luma_pred_weights), and the weighted prediction parameter index number for the chroma component in RPL1 (ctu_index_l1_chroma_pred_weights[i]).
  • the weighted prediction parameter finally applied to the current CTU is the specific weighted prediction parameter of the reference image plus the weighted prediction parameter difference defined in coding_tree_unit().
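The CTU-level derivation described above (an indexed parameter set of the reference image plus a CTU-level difference) can be sketched as follows. This is a minimal illustration only: `WpParams` and `ctu_luma_params` are hypothetical names, and the arguments merely mirror fields such as ctu_index_l0_luma_pred_weights, ctu_delta_luma_weight_l0 and ctu_delta_luma_offset_l0.

```python
from typing import NamedTuple, Sequence


class WpParams(NamedTuple):
    weight: int
    offset: int


def ctu_luma_params(ref_sets: Sequence[WpParams], set_index: int,
                    delta_weight: int = 0, delta_offset: int = 0) -> WpParams:
    """Pick one parameter set of the reference image, then add the CTU deltas."""
    base = ref_sets[set_index]
    return WpParams(base.weight + delta_weight, base.offset + delta_offset)
```

When no difference is signalled, both deltas default to 0 and the CTU simply reuses the indexed set.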
  • the identification information and parameters of the region-adaptive weighted prediction are given in the coding unit CU.
  • the weighted prediction parameter information in the CU can be independently identified, or it can refer to other parameter sets (such as sequence layer parameter set SPS, picture layer parameter set PPS) or header information (such as picture header PH, slice header SH) or encoding
  • the difference value of the weighted prediction parameter may also be recorded, or the index information and the difference value of the weighted prediction parameter may be recorded.
  • when the weighted prediction parameter information in the CU records a difference, or index information and a difference, a weighted prediction value is obtained from another parameter set or header information and the difference in the CU is added to it, giving the weighted prediction parameters finally applied to the CU.
  • when the weighted prediction parameter information is carried per CU, fine-grained brightness gradient effects at CU granularity, such as circular and radial gradients, can be achieved.
  • the specific code stream organization method may be as shown in Table 8.
  • cu_pred_weights_adjust_flag indicates whether to adjust the weighted prediction parameter value indexed by the current CU.
  • when cu_pred_weights_adjust_flag is equal to 1, the current CU needs to adjust the weighted prediction parameter value, that is, the weighted prediction parameter value finally applied to the current CU is the sum of the weighted prediction parameter value at the CTU level and the difference identified at the CU level; when cu_pred_weights_adjust_flag is equal to 0, the current CU directly uses the weighted prediction parameter value determined at the CTU level.
  • the weighted prediction parameter differences identified at the CU level include the weighted prediction parameter differences (cu_delta_luma_weight_l0 and cu_delta_luma_offset_l0) for the luma component in RPL0.
  • the weighted prediction parameter difference identified at the CU level also includes the weighted prediction parameter difference for each chroma component; if the current CU is bidirectionally predictive, the weighted prediction parameter difference identified at the CU level It also includes weighted prediction parameter differences for the luma component and/or each chrominance component in RPL1.
  • the weighted prediction parameters finally applied to the current CU are the specific weighted prediction parameters carried by the upper layer data (such as CTU level, Slice level, and Subpicture level) in the codec structure plus the weighted prediction parameter difference defined in coding_unit().
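A minimal sketch of the CU-level rule, under the assumption that a flag value of 1 means "add the CU-level difference" and any other value means "inherit the upper-layer parameters unchanged". The function name `cu_luma_params` and the tuple representation are illustrative and not part of Table 8.

```python
def cu_luma_params(upper_params, cu_pred_weights_adjust_flag,
                   cu_delta_luma_weight_l0=0, cu_delta_luma_offset_l0=0):
    """upper_params is the (weight, offset) pair inherited from the CTU level."""
    weight, offset = upper_params
    if cu_pred_weights_adjust_flag == 1:
        # Adjusted case: upper-layer value plus the difference carried in the CU.
        return (weight + cu_delta_luma_weight_l0, offset + cu_delta_luma_offset_l0)
    # Assumed inherit case: the CU uses the upper-layer parameters as they are.
    return (weight, offset)
```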
  • the identification information and parameters of the region-adaptive weighted prediction are given in supplemental enhancement information (SupplementalEnhancementInformation, SEI).
  • the nal_unit_type in the network abstraction layer unit header nal_unit_header() is set to 23, indicating prefix SEI information.
  • sei_rbsp() contains the related code stream sei_message(), and sei_message() contains the valid data information. It is only necessary to set payloadType to a value different from those of the other SEI messages in the current H.266/VVC version 1 (for example, 100); payload_size_byte then contains the code stream information related to region-adaptive weighted prediction.
  • the specific code stream organization method is shown in Table 9.
  • If multi_pred_weights_cancel_flag is 1, the SEI information associated with previous images is cancelled and the image does not use the related SEI function; if multi_pred_weights_cancel_flag is 0, the previous SEI information continues to be used (during decoding, if the current image carries no SEI information, the SEI information of the previous image continues to be used for decoding the current image) and the related SEI function is enabled for the image. If multi_pred_weights_persistence_flag is 1, the SEI information applies to the current image and to subsequent images of the current layer; if multi_pred_weights_persistence_flag is 0, the SEI information applies only to the current image.
  • the meanings of other fields in Table 9 are the same as the semantic explanations corresponding to each field in Table 5.
  • the identification information and parameters of the region-adaptive weighted prediction are given in the media description information.
  • the media description information includes, but is not limited to, the Media Presentation Description (MPD) information in the Dynamic Adaptive Streaming over HTTP (DASH) protocol and the Asset Descriptor information in the MPEG Media Transport (MMT) protocol.
  • The syntax and field meanings in Table 10 are the same as the semantic explanations of the corresponding fields in Table 5.
  • Fig. 11 is a schematic structural diagram of a video coding device provided by an embodiment of the present application. The device can execute the video coding method provided by any embodiment of the present application, and has the functional modules and beneficial effects corresponding to the method. The device can be implemented by software and/or hardware, and specifically includes an image acquisition module 610 and a video encoding module 620.
  • the image acquiring module 610 is configured to acquire a video image, wherein the video image is at least one frame of video.
  • the video encoding module 620 is configured to perform weighted predictive encoding on the video image to generate an image code stream, wherein the weighted predictive encoding uses at least one set of weighted predictive identification information and parameters.
  • the image acquisition module acquires a video image, which is at least one frame of the video, and the video encoding module performs weighted predictive encoding on the video image to generate an image code stream, wherein the weighted predictive encoding uses at least one set of weighted prediction identification information and parameters. This realizes flexible encoding of video images, improves video encoding efficiency, and reduces the impact of video image brightness changes on encoding efficiency.
  • the device further includes:
  • the change determination module is used to determine the brightness change according to the comparison result between the video image and the reference image.
  • the brightness change situation includes at least one of the following: an average value of image brightness change, and a pixel point brightness change value.
  • the video encoding module 620 includes:
  • An encoding processing unit configured to perform weighted predictive encoding on the video image according to the brightness change of the video image.
  • the encoding processing unit is specifically configured to: if the brightness change is consistent over the entire frame, perform weighted predictive encoding on the video image; if the brightness change is consistent within partitions, perform weighted predictive encoding separately on each partition image in the video image.
  • the set of weighted prediction identification information and parameters used in the weighted prediction encoding in the device corresponds to one frame of the video image or at least one partition image of the video image.
  • the specification of the partitioned image in the device includes at least one of the following: Slice, Tile, Subpicture, Coding Tree Unit, Coding Unit.
  • the device further includes:
  • a code stream writing module configured to write the weighted prediction identification information and parameters into the image code stream.
  • the weighted prediction identification information and the parameters in the device are included in at least one of the following parameter sets: a sequence layer parameter set, an image layer parameter set, a slice layer parameter set, supplementary enhancement information, video usability information, image header information, slice header information, network abstraction layer unit header information, coding tree unit, coding unit.
  • the weighted prediction identification information and the parameters in the device include at least one of the following information: reference image index information, weighted prediction enabling control information, region-adaptive weighted prediction enabling control information, and weighted prediction parameters.
  • the image code stream in the device includes a transport stream or a media file.
  • the video coding module 620 also includes:
  • the deep learning unit is used to perform weighted predictive coding on the video image according to the pre-trained neural network model.
  • the weighted prediction identification information in the device further includes a neural network model structure and neural network model parameters.
  • Fig. 12 is a schematic structural diagram of a video decoding device provided in an embodiment of the present application, which can execute the video decoding method provided in any embodiment of the present application, and has corresponding functional modules and beneficial effects for executing the method.
  • the device can be implemented by software and/or hardware, and specifically includes a code stream acquisition module 710 and an image reconstruction module 720.
  • the code stream obtaining module 710 is configured to obtain an image code stream, and parse the weighted prediction identification information and parameters in the image code stream.
  • An image reconstruction module 720 configured to decode the image code stream according to the weighted prediction identification information and parameters to generate a reconstructed image.
  • the code stream acquisition module obtains the image code stream and the weighted prediction identification information and parameters in it, and the image reconstruction module processes the image code stream according to the obtained weighted prediction identification information and parameters to generate a reconstructed image. This realizes dynamic decoding of the image code stream, which can improve video decoding efficiency and reduce the impact of image brightness changes on video decoding efficiency.
  • the number of weighted prediction identification information and parameters in the image code stream in the device is at least one set.
  • the weighted prediction identification information and the parameters in the device are included in at least one of the following parameter sets: a sequence layer parameter set, an image layer parameter set, a slice layer parameter set, supplementary enhancement information, video usability information, image header information, slice header information, network abstraction layer unit header information, coding tree unit, coding unit.
  • the weighted prediction identification information and the parameters in the device include at least one of the following information: reference image index information, weighted prediction enabling control information, region-adaptive weighted prediction enabling control information, and weighted prediction parameters.
  • the image code stream in the device includes a transport stream or a media file.
  • the image reconstruction module 720 includes:
  • the deep learning decoding unit is configured to use the weighted prediction identification information and parameters to perform weighted prediction decoding on the image code stream according to the pre-trained neural network model to generate a reconstructed image.
  • the weighted prediction identification information and parameters in the device also include neural network model structure and neural network model parameters.
  • FIG. 13 is a schematic structural diagram of an encoder provided in an embodiment of the present application.
  • the encoder shown in FIG. 13 is applied to an apparatus for encoding video.
  • the input of the device is the image included in the video, and the output is the image code stream or the transport stream or media file containing the image code stream.
  • the encoder is used for: Step 501: Input an image.
  • the specific operation process may be the same as the video encoding method provided in any of the foregoing embodiments.
  • Step 503 Output code stream.
  • FIG. 14 is a schematic structural diagram of a decoder provided in an embodiment of the present application.
  • the decoder shown in FIG. 14 is applied to an apparatus for decoding video.
  • the input of the device is an image code stream or a transport stream or a media file containing the image code stream, and the output is an image constituting a video.
  • the decoder is used for: Step 601: Input an image.
  • An example of a specific operation process is the video decoding method provided in any of the foregoing embodiments.
  • Step 604 the player plays the image.
  • Figure 15 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. The electronic device includes a processor 70, a memory 71, an input device 72, and an output device 73. The number of processors 70 in the electronic device can be one or more; one processor 70 is taken as an example. The processor 70, memory 71, input device 72 and output device 73 in the electronic device can be connected by a bus or in other ways; connection by a bus is taken as an example.
  • the memory 71 can be used to store software programs, computer-executable programs and modules, such as modules corresponding to the video encoding device or video decoding device in the embodiment of the present application (the image acquisition module 610 and the video encoding module 620, or, code stream acquisition module 710 and image reconstruction module 720).
  • the processor 70 executes various functional applications and data processing of the electronic device by running software programs, instructions and modules stored in the memory 71 , that is, implements the above-mentioned method.
  • the memory 71 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and at least one application required by a function; the data storage area may store data created according to the use of the electronic device, and the like.
  • the memory 71 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage devices.
  • the memory 71 may further include a memory that is remotely located relative to the processor 70, and these remote memories may be connected to the electronic device through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the input device 72 can be used to receive input numbers or character information, and generate key signal input related to user settings and function control of the electronic device.
  • the output device 73 may include a display device such as a display screen.
  • the embodiment of the present application also provides a storage medium containing computer-executable instructions, the computer-executable instructions are used to execute a video encoding method when executed by a computer processor, the method comprising:
  • the computer-executable instructions are used to perform a video decoding method when executed by a computer processor, the method comprising:
  • the present application can be realized by means of software and necessary general-purpose hardware, and of course it can also be realized by hardware, but in many cases the former is the better implementation.
  • the essence of the technical solution of this application, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as a computer floppy disk, read-only memory (ROM), random access memory (RAM), flash memory (FLASH), hard disk or optical disc, and includes several instructions that cause a computer device (which can be a personal computer, server, network device, etc.) to execute the method described in each embodiment of the present application.
  • the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be executed cooperatively by several physical components.
  • Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit.
  • Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
  • computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data.
  • Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cartridges, tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store desired information and that can be accessed by a computer.
  • communication media typically embodies computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Abstract

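Embodiments of the present application provide a video encoding method, a video decoding method, an apparatus, an electronic device and a storage medium. The video encoding method includes: acquiring a video image, where the video image is at least one frame of a video (110); and performing weighted prediction encoding on the video image to generate an image code stream, where the weighted prediction encoding uses at least one set of weighted prediction identification information and parameters (120).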

Description

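Video encoding and video decoding methods, apparatus, electronic device and storage medium
Cross-reference to related applications
This application is based on, and claims priority to, Chinese patent application No. 202110875430.8 filed on July 30, 2021, the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the technical field of image processing, and in particular to a video encoding method, a video decoding method, an apparatus, an electronic device and a storage medium.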
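Background
With the rapid development of digital media technology, video has become an important medium of communication. Communication, entertainment and learning in video form have gradually become part of everyday life and are increasingly familiar to ordinary users. Common video formats often add fade effects at the beginning or end of a piece of content, so that the picture fading in and out gives viewers a more natural and comfortable viewing experience.
To reduce the pressure that data transmission places on network bandwidth, video coding technology has become an important research topic in the multimedia field. In video encoding, inter-frame prediction effectively removes temporal redundancy and greatly reduces the transmission bit rate. However, when a video sequence contains brightness changes, such as fade-in and fade-out, lens aperture modulation, or global or local changes in the light source, conventional inter-frame motion estimation and motion compensation struggle to achieve the desired compression. In practice, when the encoder meets a local content block whose brightness changes, the optimization model usually decides on intra-frame prediction, which greatly reduces coding efficiency. Therefore, to improve coding in scenes with changing brightness, weighted prediction can be used: when a brightness change is detected, the encoder determines the brightness-change weight and offset of the current image relative to the reference image, and a brightness compensation operation generates the corresponding weighted prediction frame.
Weighted prediction was already introduced in the H.264/AVC standard, where it has two modes: explicit and implicit. In implicit weighted prediction the model parameters are fixed: the encoder and decoder agree on the same weighted prediction parameters, so the encoder does not transmit them, which relieves the code stream and improves transmission efficiency. However, because the implicit-mode parameters are fixed, applying them to uni-directional inter prediction gives poor results, since the distance between the current frame and the reference frame varies. In other words, the explicit mode suits both uni-directional and bi-directional inter prediction, whereas the implicit mode suits only bi-directional inter prediction. In explicit mode, the encoder determines the weighted prediction parameters and signals them in the data header information; the decoder reads the corresponding parameters from the code stream in order to decode the image correctly. Weighted prediction involves three prediction parameters: weight, offset, and log weight denominator. To avoid floating-point arithmetic, the encoder scales the weight up, which is why the log weight denominator is introduced, and the decoder scales the result back down by the same factor.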
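To make the role of the three parameters concrete, the sketch below applies a weight, an offset and a log weight denominator to one reference sample using integer arithmetic only. It follows the general shape of uni-directional explicit weighted prediction in H.264-style codecs (multiply, add a rounding term, shift right, add the offset, clip); the function name and the default 8-bit depth are assumptions, and details such as offset scaling at higher bit depths are omitted.

```python
def weighted_sample(ref_sample, weight, offset, log2_denom, bit_depth=8):
    """Integer-only weighted prediction of a single sample (simplified)."""
    if log2_denom >= 1:
        rounding = 1 << (log2_denom - 1)
        value = ((ref_sample * weight + rounding) >> log2_denom) + offset
    else:
        value = ref_sample * weight + offset
    # Clip to the valid sample range for the given bit depth.
    return max(0, min((1 << bit_depth) - 1, value))
```

For example, with a denominator of 6 a weight of 64 leaves the sample unchanged, while a weight of 80 scales it by 1.25 before the offset is added.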
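Taking the H.266/VVC standard as an example, the weighted prediction parameter information can be carried in the picture header or the slice header. Note that each luma or chroma component of a reference image has its own weighted prediction parameters. When the video content contains complex brightness changes, different weighted prediction parameters can be configured for different slice regions within an image; however, because slice partitioning during encoding affects network abstraction layer data transmission, among other things, slice partitioning is subject to many restrictions, and slice-level adaptive weighted prediction cannot flexibly cover the variety of brightness changes. Moreover, existing standardized technology gives each reference image only one set of weighted prediction parameters (that is, one matched weight and offset). When the whole current image has a uniform brightness change, choosing the nearest preceding reference image and its weighted prediction parameters is enough to predict well. To accommodate other forms of brightness change, inter-frame encoding can, within the existing standards, select several different reference images and manually configure, for the content of different regions of the current image, weights and offsets adapted to each reference image, as shown in Figure 1. Because the decoder buffer can usually hold only a few decoded reference images, especially for data-heavy media such as ultra-high-definition and panoramic video, the scheme of Figure 1 in practice also covers only some brightness-change scenes.
In short, when encoding complex brightness changes in video, the coding schemes of existing video coding standards are considerably limited in feasibility and flexibility.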
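Summary
The main purpose of the embodiments of the present application is to provide a video encoding method, an apparatus, an electronic device and a storage medium that encode video flexibly in scenes with complex brightness changes, improve video coding efficiency, and reduce the impact of brightness changes on coding efficiency.
An embodiment of the present application provides a video encoding method, including the following steps: acquiring a video image, where the video image is at least one frame of a video; and performing weighted prediction encoding on the video image to generate an image code stream, where the weighted prediction encoding uses at least one set of weighted prediction identification information and parameters.
An embodiment of the present application provides a video decoding method, including the following steps: obtaining an image code stream and parsing the weighted prediction identification information and parameters in the image code stream; and decoding the image code stream according to the weighted prediction identification information and parameters to generate a reconstructed image.
An embodiment of the present application further provides a video encoding apparatus, including the following modules: an image acquisition module, configured to acquire a video image, where the video image is at least one frame of a video; and a video encoding module, configured to perform weighted prediction encoding on the video image to generate an image code stream, where the weighted prediction encoding uses at least one set of weighted prediction identification information and parameters.
An embodiment of the present application further provides a video decoding apparatus, including the following modules: a code stream acquisition module, configured to obtain an image code stream and parse the weighted prediction identification information and parameters in the image code stream; and an image reconstruction module, configured to decode the image code stream according to the weighted prediction identification information and parameters to generate a reconstructed image.
An embodiment of the present application further provides an electronic device, including: one or more processors; and a memory configured to store one or more programs. When the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any embodiment of the present application.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method according to any embodiment of the present application.
In the embodiments of the present application, a video image that is at least one frame of a video is acquired and weighted prediction encoding is performed on it to generate an image code stream, with at least one set of weighted prediction identification information and parameters used during the weighted prediction encoding. This realizes flexible encoding of video images, can improve video coding efficiency, and reduces the impact of video image brightness changes on coding efficiency.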
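Brief description of the drawings
Figure 1 is an example diagram of region-adaptive weighted prediction across multiple reference images in some technical solutions;
Figure 2 is a flowchart of a video encoding method provided by an embodiment of the present application;
Figure 3 is a flowchart of another video encoding method provided by an embodiment of the present application;
Figure 4 is a flowchart of another video encoding method provided by an embodiment of the present application;
Figure 5 is an example diagram of a video encoding method provided by an embodiment of the present application;
Figure 6 is an example diagram of another video encoding method provided by an embodiment of the present application;
Figure 7 is a flowchart of a video decoding method provided by an embodiment of the present application;
Figure 8 is a flowchart of another video decoding method provided by an embodiment of the present application;
Figure 9 is an example diagram of a video decoding method provided by an embodiment of the present application;
Figure 10 is an example diagram of another video decoding method provided by an embodiment of the present application;
Figure 11 is a schematic structural diagram of a video encoding apparatus provided by an embodiment of the present application;
Figure 12 is a schematic structural diagram of a video decoding apparatus provided by an embodiment of the present application;
Figure 13 is a schematic structural diagram of an encoder provided by an embodiment of the present application;
Figure 14 is a schematic structural diagram of a decoder provided by an embodiment of the present application;
Figure 15 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.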
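Detailed description
It should be understood that the specific embodiments described here serve only to explain the present application and do not limit it.
In the following description, suffixes such as "module", "component" or "unit" used to denote elements serve only to facilitate the description of the present application and have no specific meaning in themselves. Therefore "module", "component" and "unit" may be used interchangeably.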
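Figure 2 is a flowchart of a video encoding method provided by an embodiment of the present application. The embodiment is applicable to video encoding in scenes with brightness changes. The method may be performed by a video encoding apparatus, which may be implemented in software and/or hardware and is generally integrated in a terminal device. Referring to Figure 2, the method specifically includes the following steps:
Step 110: Acquire a video image, where the video image is at least one frame of a video.
The video image may be video data that needs to be transmitted; it may be a particular frame in the video data sequence, or the frame corresponding to a particular moment.
In the embodiments of the present application, the video data may be processed to extract one or more frames of image data as the video data to be encoded.
Step 120: Perform weighted prediction encoding on the video image to generate an image code stream, where the weighted prediction encoding uses at least one set of weighted prediction identification information and parameters.
Lighting in a video scene may change gradually over time, or shadows may appear in videos of the same scene. Although the backgrounds of successive frames may be highly similar, their brightness can differ considerably, so adjacent frames exhibit a brightness change. The current frame can then be treated as the preceding frame multiplied as a whole by a weight, plus an offset, and encoded on the basis of the preceding frame. Encoding video by means of a weight and an offset in this way is called weighted prediction encoding. It mainly involves three prediction parameters: weight, offset and log weight denominator; the log weight denominator scales the weight up so that floating-point arithmetic is avoided during encoding. The weighted prediction identification information may be identification information for the parameters used in weighted prediction, and the parameters may be the specific parameters of the weighted prediction identification information, including at least one of the weight, the offset and the log weight denominator. The image code stream is the data generated by encoding the video image and can be used for transmission between terminal devices.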
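The text does not prescribe how an encoder chooses the weight and offset. One simple heuristic, shown here purely as an assumption, is to take the ratio of mean brightness as the gain, scale it by the denominator to obtain an integer weight, and let the offset absorb the rounding residue. The function name `estimate_params` and the default denominator of 6 are hypothetical.

```python
def estimate_params(cur_mean, ref_mean, log2_denom=6):
    """Estimate an integer (weight, offset) from frame mean brightness values."""
    scale = 1 << log2_denom
    # Fall back to a gain of 1.0 when the reference frame is completely black.
    weight = round(cur_mean / ref_mean * scale) if ref_mean else scale
    # The offset compensates whatever the quantised weight fails to explain.
    offset = round(cur_mean - weight * ref_mean / scale)
    return weight, offset
```

A real encoder would typically refine such an estimate through rate-distortion checks; this sketch only shows how a floating-point gain maps onto the integer weight.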
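In the embodiments of the present application, weighted prediction encoding may be performed on the video image using one or more sets of weighted prediction identification information and parameters. For example, different sets may be used for video images of different frames, or for different regions within the same frame. It will be understood that, during weighted prediction encoding, different weighted prediction identification information and parameters may be selected according to the brightness change of the video image.
In the embodiments of the present application, a video image that is at least one frame of a video is acquired and weighted prediction encoding is performed on it to generate an image code stream, with at least one set of weighted prediction identification information and parameters used during the weighted prediction encoding. This realizes flexible encoding of video images, can improve video coding efficiency, and reduces the impact of video image brightness changes on coding efficiency.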
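Figure 3 is a flowchart of another video encoding method provided by an embodiment of the present application. This embodiment is a more specific form of the above embodiment. Referring to Figure 3, the method specifically includes the following steps:
Step 210: Acquire a video image, where the video image is at least one frame of a video.
Step 220: Determine the brightness change according to the comparison result between the video image and a reference image.
The reference image may be image data located before or after the currently processed video image in the video data sequence. It can be used for the motion estimation operation on the currently processed video image, and there may be one or more reference frames. The brightness change is the change in brightness of the video image relative to the reference image, and can be determined from the change in pixel values between the two.
In the embodiments of the present application, the video image may be compared with the reference image, for example by computing the difference between pixel values at corresponding positions, or the difference between the mean pixel values of the two images. The brightness change is determined from the comparison result and may be, for example, gradual brightening, gradual darkening, no change, or random change.
Further, on the basis of the above embodiment, the brightness change includes at least one of the following: the mean image brightness change and the pixel brightness change value.
Specifically, the brightness change between the video image and the reference image may be determined from the mean image brightness change and/or the pixel brightness change values. The mean image brightness change is the change between the mean brightness of the current image and that of the reference image; the pixel brightness change value is the change between the brightness of each pixel in the video image and that of the pixel at the corresponding position in the reference image. It will be understood that the brightness change may also be a change in other statistical properties of the brightness values, such as the brightness variance or the brightness mean squared error.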
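The two measures named above can be computed as in the following sketch, which assumes both images are equally sized 2-D lists of luma samples. The function name `luma_change` is illustrative, and no motion compensation is applied before the comparison.

```python
def luma_change(cur, ref):
    """Return (mean brightness change, per-pixel brightness change map)."""
    diffs = [[c - r for c, r in zip(cur_row, ref_row)]
             for cur_row, ref_row in zip(cur, ref)]
    count = sum(len(row) for row in diffs)
    # The mean of the differences equals the difference of the two means.
    mean_change = sum(sum(row) for row in diffs) / count
    return mean_change, diffs
```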
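Step 230: Perform weighted prediction encoding on the video image according to its brightness change, where the weighted prediction encoding uses at least one set of weighted prediction identification information and parameters.
In the embodiments of the present application, multiple sets of weighted prediction identification information and parameters may be preset, and the set corresponding to the brightness change is selected for weighted prediction encoding of the video image. It will be understood that the encoding can follow the specifics of the brightness change: for example, different sets may be selected for the brightness changes of different frames, or for different regions within a video image.
Step 240: Write the weighted prediction identification information and parameters into the image code stream.
Specifically, weighted prediction encoding of the video image generates an image code stream, and the weighted prediction identification information and parameters used during encoding may be written into the image code stream for use in subsequent video decoding.
In the embodiments of the present application, a video image is acquired, the brightness change is determined from the comparison result between the video image and the reference image, weighted prediction encoding is performed on the video image according to the brightness change, and the weighted prediction identification information and parameters used are written into the image code stream. This realizes flexible encoding of video images, can improve video coding efficiency, and reduces the impact of video image brightness changes on coding efficiency.
Further, on the basis of the above embodiment, performing weighted prediction encoding on the video image according to the brightness change includes at least one of the following:
If the brightness change is consistent over the whole frame, perform weighted prediction encoding on the video image;
If the brightness change is consistent within partitions, perform weighted prediction encoding separately on each partition image within the video image.
Here, whole-frame consistency means that the brightness change is the same across the entire frame of the video image. Partition consistency means that the video image contains multiple regions whose brightness changes differ from one another.
In the embodiments of the present application, when the brightness change is consistent over the whole frame, weighted prediction encoding may be performed on the whole video image. When the brightness change is consistent only within partitions, the brightness changes in the video image are varied, and weighted prediction encoding may be performed for each image region. It will be understood that, when the brightness changes of the image regions differ, the weighted prediction identification information and parameters used during weighted prediction encoding may differ as well.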
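One possible way to separate the two cases, offered only as an assumption since the text gives no decision rule, is to compare the mean brightness change of each region and treat the frame as uniformly changing when the spread stays within a tolerance. The function name `classify_change` and the tolerance value are hypothetical.

```python
def classify_change(region_mean_changes, tolerance=1.0):
    """Return 'frame' for a frame-wide uniform change, otherwise 'region'."""
    spread = max(region_mean_changes) - min(region_mean_changes)
    return "frame" if spread <= tolerance else "region"
```

A result of "frame" would lead to one parameter set for the whole image, whereas "region" would lead to per-partition parameter sets.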
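Further, on the basis of the above embodiment, the video image has at least one set of the weighted prediction identification information and parameters with respect to the reference image or a partition image of the reference image.
In this embodiment, the weighted prediction identification information and parameters used in weighted prediction encoding of a video image are determined relative to the reference image. The video image has one or more sets with the reference image as the baseline; when the video image has multiple reference frames, it may have one or more sets for each reference frame, and each set may be associated with the corresponding video image and/or reference image.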
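Figure 4 is a flowchart of another video encoding method provided by an embodiment of the present application. This embodiment is a more specific form of the above embodiment. Referring to Figure 4, the method specifically includes the following steps:
Step 310: Acquire a video image, where the video image is at least one frame of a video.
Step 320: Perform weighted prediction encoding on the video image according to a pre-trained neural network model.
The neural network model can perform weighted prediction encoding on the video image and determine the weighted prediction identification information and parameters it uses. The model can be trained on image samples labelled with weighted prediction identification information and parameters. It may output the weighted prediction identification information and parameters of the video image, or directly output the image code stream of the video image.
In this embodiment, the video image may be fed directly or indirectly into the neural network model, which carries out the weighted prediction encoding. It will be understood that the input layer of the model may accept the video image or features of the video image, and that the model may generate the weighted prediction identification information and parameters used for weighted prediction encoding, or perform the weighted prediction encoding directly.
Step 330: Write the weighted prediction identification information and parameters into the image code stream.
In this embodiment, a video image is acquired, weighted prediction encoding is performed on it with a pre-trained neural network model, and the weighted prediction identification information and parameters used are written into the image code stream generated by the encoding. This realizes flexible encoding of video images, can improve video coding efficiency, and reduces the impact of video image brightness changes on coding efficiency.
Further, on the basis of the above embodiment, the weighted prediction identification information further includes the neural network model structure and neural network model parameters.
Specifically, the weighted prediction identification information may also include the neural network model structure and neural network model parameters. The neural network model structure is information reflecting the structure of the network, such as the functions used by the fully connected layers, the activation functions and the loss function; the neural network model parameters are the specific values of the model's parameters, such as the network weights and the number of hidden layers.
Further, on the basis of the above embodiment, one set of the weighted prediction identification information and parameters used in the weighted prediction encoding corresponds to one frame of the video image or to at least one partition image of the video image.
In the embodiments of the present application, one or more sets of weighted prediction identification information and parameters may be used during weighted prediction encoding. Each set corresponds to one frame of video image or to one partition image within the video image; the partition image may be a part of the video image, such as a slice image or a subpicture.
Further, on the basis of the above embodiment, the specification of the partition image includes at least one of the following: Slice, Tile, Subpicture, Coding Tree Unit, Coding Unit.
Specifically, when weighted prediction encoding is performed on the video image in the form of partition images, a partition image may be one or more of a Slice, a Tile, a Subpicture, a Coding Tree Unit and a Coding Unit.
Further, on the basis of the above embodiment, the weighted prediction identification information and the parameters are contained in at least one of the following parameter sets: the sequence layer parameter set, the picture layer parameter set, the slice layer parameter set, supplemental enhancement information, video usability information, picture header information, slice header information, network abstraction layer unit header information, the coding tree unit, and the coding unit.
Specifically, the weighted prediction identification information and parameters may be written into the image code stream, where they are contained in all or some of the following: the sequence layer parameter set, the picture layer parameter set, the slice layer parameter set, supplemental enhancement information, video usability information, picture header information, slice header information and network abstraction layer unit header information. They may also be carried as a new information unit, or be contained in the coding tree unit or coding unit.
Further, on the basis of the above embodiment, the weighted prediction identification information and the parameters include at least one of the following: reference image index information, weighted prediction enabling control information, region-adaptive weighted prediction enabling control information, and weighted prediction parameters.
In the embodiments of the present application, the weighted prediction identification information may be: reference image index information, which determines the reference image used for the brightness change; weighted prediction enabling control information, which determines whether weighted prediction encoding is performed; and region-adaptive weighted prediction enabling control information, which determines whether region-based weighted prediction encoding is performed on the video image. The weighted prediction parameters are the parameters used in weighted prediction encoding and may include the weight, the offset, the log weight denominator, and so on.
Further, on the basis of the above embodiment, the image code stream includes a transport stream or a media file.
Specifically, the image code stream may be a transport stream or a media file.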
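In one implementation, Figure 5 is an example diagram of a video encoding method provided by an embodiment of the present application. Referring to Figure 5, the input of the encoding process is the images contained in the video, and the output is an image code stream, or a transport data stream or media file containing the image code stream. The encoding process may include the following steps:
Step 101: Read an image, which may be a particular frame in the video sequence or the frame corresponding to a particular moment.
Step 102: Detect the brightness change of the image relative to a reference image.
The reference image may be an image that precedes the current image on the timeline of the video sequence, or one that follows it; the reference image is used for the motion estimation operation on the current image. There may be one or more reference frames;
The brightness change may be the change between the mean brightness of the current image and that of the reference image, the change between the brightness of each pixel of the current image and that of the pixel at the corresponding position in the reference image, or a change in other statistical properties of the brightness values;
If there are multiple reference frames, the brightness change of the current image must be detected against each reference frame.
Step 103: Based on the brightness change detection result of step 102, decide whether a weighted prediction operation is needed. The decision weighs, for the specific brightness change, the gain in coding efficiency from weighted prediction against the increase in coding complexity.
Step 104: When a weighted prediction operation is needed, perform weighted prediction encoding and record the index information of the reference image, one or more sets of weighted prediction parameters, and/or the weighted prediction parameter index information.
Based on the brightness change statistics of step 102, the encoder records the regional characteristics of the brightness change: either the brightness change is consistent over the whole frame, or it is consistent within parts of the image. Such a part may be one or more slices (Slice), one or more tiles (Tile), one or more subpictures (Subpicture), one or more coding tree units (Coding Tree Unit, CTU), or one or more coding units (Coding Unit, CU);
If the brightness change is consistent over the whole frame, the brightness change of the current image relative to the reference image is considered uniform; otherwise it is not uniform;
If the brightness change of the current image is uniform, perform weighted prediction encoding and record the reference image index information and one set of weighted prediction parameters, where the weighted prediction parameters include a weight and an offset;
If the brightness changes of the regions of the current image are varied, perform weighted prediction encoding region by region and record the reference image index information, multiple sets of weighted prediction parameters, and the index of the specific set used by each image region. One image region may correspond to one set of weighted prediction parameters, or several image regions may share one set.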
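The bookkeeping described for the non-uniform case, where several regions may share one parameter set and each region records an index, can be sketched as below. `build_region_index` is a hypothetical helper that assumes every region has already been assigned a (weight, offset) pair.

```python
def build_region_index(region_params):
    """Deduplicate per-region parameters into sets plus a per-region index."""
    sets, index = [], []
    for params in region_params:
        if params not in sets:
            sets.append(params)  # first occurrence defines a new parameter set
        index.append(sets.index(params))
    return sets, index
```

Regions with identical parameters end up pointing at the same set, so each distinct set is recorded only once.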
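Step 105: Write the weighted prediction identification information and parameters into the code stream, where they are contained in all or some of the following: the sequence layer parameter set, the picture layer parameter set, the slice layer parameter set, supplemental enhancement information, video usability information, picture header information, slice header information and network abstraction layer unit header information. They may also be carried as a new information unit, or be contained in the coding tree unit or coding unit.
Step 106: When no weighted prediction operation is needed, encode the image directly with the conventional method;
Step 107: Output the image code stream, or a transport stream or media file containing the image code stream.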
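In another implementation, Figure 6 is an example diagram of a video encoding method provided by an embodiment of the present application. Referring to Figure 6, this embodiment applies deep learning to the region-adaptive weighted prediction operation. The encoding process may include the following steps:
Step 201: Read an image, which may be a particular frame in the video sequence or the frame corresponding to a particular moment.
Step 202: Determine whether to use the weighted prediction scheme based on deep learning.
Step 203: When deep learning is used, perform weighted prediction encoding of the image with the trained neural network model and parameters.
In some examples, to reduce computational complexity, the scheme can be simplified to fixed weighted prediction parameters obtained through neural network learning and training. In use, suitable weighted prediction parameters are selected for the content of the different regions of the image. The weighted prediction parameters may include a weight and an offset; a single image may use all or some of the weights and offsets in a group of values.
Step 204: When weighted prediction encoding is performed with the neural network model, record the weighted prediction parameters required by the deep learning scheme, including but not limited to the neural network model structure and parameters, the extracted group of weighted prediction parameter values, and the parameter index information used by each image region.
Step 205: When deep learning is not used, perform region-adaptive weighted prediction encoding with a conventional computation scheme, for example the operation process of Embodiment 1.
Step 206: Write the weighted prediction identification information and parameters into the code stream, where they are contained in all or some of the following: the sequence layer parameter set, the picture layer parameter set, the slice layer parameter set, supplemental enhancement information, video usability information, picture header information, slice header information and network abstraction layer unit header information. They may also be carried as a new information unit, or be contained in the coding tree unit or coding unit.
Step 207: Output the image code stream, or a transport stream or media file containing the image code stream.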
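FIG. 7 is a flowchart of a video decoding method provided by an embodiment of this application. The embodiment is applicable to video decoding in scenes with luminance changes. The method may be performed by a video decoding apparatus, which may be implemented in software and/or hardware and is typically integrated in a terminal device. Referring to FIG. 7, the method includes the following steps:
Step 410: Obtain a picture bitstream and parse the weighted prediction identification information and parameters in the picture bitstream.
In this embodiment, a transmitted picture bitstream may be received and the weighted prediction identification information and parameters extracted from it. The picture bitstream may contain one or more sets of weighted prediction identification information and parameters.
Step 420: Decode the picture bitstream according to the weighted prediction identification information and parameters to generate a reconstructed picture.
Specifically, the obtained weighted prediction identification information and parameters may be used to perform weighted prediction decoding on the picture bitstream, turning it into a reconstructed picture. The reconstructed picture is the picture generated from the transmitted bitstream.
In this embodiment, a picture bitstream is obtained, the weighted prediction identification information and parameters in it are obtained, and the bitstream is processed according to them to generate a reconstructed picture. This achieves dynamic decoding of the picture bitstream, can improve video decoding efficiency, and reduces the impact of picture luminance changes on video decoding efficiency.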
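FIG. 8 is a flowchart of another video decoding method provided by an embodiment of this application, which elaborates on the preceding embodiment. Referring to FIG. 8, the method includes the following steps:
Step 510: Obtain a picture bitstream and parse the weighted prediction identification information and parameters in the picture bitstream.
Step 520: Using a pre-trained neural network model together with the weighted prediction identification information and parameters, perform weighted prediction decoding on the picture bitstream to generate a reconstructed picture.
The neural network model may be a deep learning model for decoding picture bitstreams. It may be trained on sample bitstreams and sample pictures, and it can perform weighted prediction decoding on a picture bitstream.
In this embodiment, the picture bitstream and the weighted prediction identification information and parameters may be input to the pre-trained neural network model, which performs weighted prediction decoding on the picture bitstream and turns it into a reconstructed picture.
In this embodiment, a picture bitstream is obtained, the weighted prediction identification information and parameters in it are obtained, and a pre-trained neural network model turns the bitstream into a reconstructed picture on the basis of that information. This achieves dynamic decoding of the picture bitstream, can improve video decoding efficiency, and reduces the impact of picture luminance changes on video decoding efficiency.
Further, on the basis of the above embodiments, the picture bitstream contains at least one set of weighted prediction identification information and parameters.
Specifically, the picture bitstream may be the information generated by weighted prediction coding of video pictures. Depending on how the weighted prediction coding was performed, the bitstream may contain one or more sets of weighted prediction identification information and parameters. For example, if the encoder applies weighted prediction coding separately to different regions of a video picture, the bitstream may contain multiple sets.
Further, on the basis of the above embodiments, the weighted prediction identification information and the parameters are carried in at least one of the following: sequence-level parameter set, picture-level parameter set, slice-level parameter set, supplemental enhancement information, video usability information, picture header, slice header, NAL unit header, coding tree unit, coding unit.
Further, on the basis of the above embodiments, the weighted prediction identification information and the parameters include at least one of the following: reference picture index information, weighted prediction enable control information, region-adaptive weighted prediction enable control information, weighted prediction parameters.
Further, on the basis of the above embodiments, the picture bitstream comprises a transport stream or a media file.
Further, on the basis of the above embodiments, the weighted prediction identification information and parameters also include a neural network model structure and neural network model parameters.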
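In one implementation, FIG. 9 is an example diagram of a video decoding method provided by an embodiment of this application. Referring to FIG. 9, the input to the decoding process is a picture bitstream, or a transport stream or media file containing the picture bitstream, and the output is the pictures that make up the video. The decoding process may include:
Step 301: Read the bitstream.
Step 302: Parse the bitstream and obtain the weighted prediction identification information.
The decoder parses the sequence-level, picture-level, and/or slice-level parameter sets to obtain the weighted prediction identification information. The sequence-level parameter set includes the sequence parameter set (SPS); the picture-level parameter sets include the picture parameter set (PPS) and the adaptation parameter set (APS); the slice-level parameter set includes the APS. Weighted prediction identification information in a sequence-level parameter set can be referenced by picture-level and slice-level parameter sets, and that in a picture-level parameter set can be referenced by slice-level parameter sets. The weighted prediction identification information includes, but is not limited to, whether the sequence and/or picture indicated by the current parameter set uses unidirectional weighted prediction, bidirectional weighted prediction, and/or region-adaptive multi-weight weighted prediction.
Ways of determining whether region-adaptive multi-weight weighted prediction is used include, but are not limited to, parsing a binary flag or checking the number of weighted prediction parameter sets (whether it is equal to 1 or greater than 1).
Step 303: Based on the weighted prediction identification information, determine whether the current picture uses weighted prediction decoding.
Step 304: When the current picture is determined to use weighted prediction decoding, obtain the weighted prediction parameter information.
As indicated by the identification information, the decoder obtains the weighted prediction parameter information from the parameter sets and/or header information. The parameter sets include the SPS, PPS, and APS; the header information includes the picture header (PH) and slice header (SH). The weighted prediction parameter information includes, but is not limited to, whether each reference picture in the reference picture list is configured with weighted prediction parameters (weight and offset), the number of parameter sets configured for each reference picture, and the specific values of each set.
Step 305: Perform weighted prediction decoding on the current picture according to the weighted prediction identification information and parameters.
Depending on the identification information, the decoder may apply one uniform form of weighted prediction decoding to the entire current picture, or apply differentiated weighted prediction decoding to individual parts of the picture.
Step 306: When the current picture is determined not to use weighted prediction decoding, decode the picture directly with the conventional method.
Step 307: Generate the reconstructed picture. The reconstructed picture may be displayed or saved directly.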
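Steps 303 to 306 can be illustrated with the explicit weighted prediction formula familiar from H.265/HEVC and H.266/VVC, applied here at sample bit depth for readability (the standards operate on higher-precision intermediate prediction samples, which this sketch does not reproduce). The function names and the list-based block representation are assumptions of the example.

```python
def weighted_uni_pred(pred_samples, weight, offset, log2_denom, bit_depth=8):
    """Apply ((p * w + round) >> log2_denom) + o to each motion-compensated sample."""
    max_val = (1 << bit_depth) - 1
    out = []
    for p in pred_samples:
        if log2_denom >= 1:
            v = ((p * weight + (1 << (log2_denom - 1))) >> log2_denom) + offset
        else:
            v = p * weight + offset
        out.append(min(max(v, 0), max_val))  # clip to the valid sample range
    return out


def predict_block(pred_samples, wp_enabled, param_sets,
                  region_param_idx=0, log2_denom=6):
    """Step 306 when weighted prediction is off, step 305 otherwise."""
    if not wp_enabled or not param_sets:
        return list(pred_samples)  # conventional prediction, no weighting
    # With several parameter sets per reference picture, the region's
    # index selects which (weight, offset) pair applies.
    weight, offset = param_sets[region_param_idx]
    return weighted_uni_pred(pred_samples, weight, offset, log2_denom)
```

For a picture with uniform luminance change, `param_sets` holds one pair and every block uses index 0; for region-adaptive decoding, each block passes the index signalled for its region.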
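In another implementation, FIG. 10 is an example diagram of another video decoding method provided by an embodiment of this application. Referring to FIG. 10, this embodiment applies deep learning to region-adaptive weighted prediction. The video decoding method includes the following steps:
Step 401: Read the bitstream.
Step 402: Parse the bitstream and obtain the identification information of deep-learning-based weighted prediction.
The decoder parses the sequence-level, picture-level, and/or slice-level parameter sets to obtain the identification information of deep-learning-based weighted prediction. The sequence-level parameter set includes the sequence parameter set (SPS); the picture-level parameter sets include the picture parameter set (PPS) and the adaptation parameter set (APS); the slice-level parameter set includes the APS. Weighted prediction identification information in a sequence-level parameter set can be referenced by picture-level and slice-level parameter sets, and that in a picture-level parameter set can be referenced by slice-level parameter sets. The identification information of deep-learning-based weighted prediction includes, but is not limited to, whether the sequence and/or picture indicated by the current parameter set uses deep-learning-based weighted prediction.
Step 403: Based on that identification information, determine whether the current picture uses deep-learning-based weighted prediction decoding.
Step 404: When the current picture is determined to use deep-learning-based weighted prediction decoding, obtain the weighted prediction parameters the deep learning scheme requires.
As indicated by the identification information, the decoder obtains the weighted prediction parameter information from the parameter sets and/or header information. The parameter sets include the SPS, PPS, and APS; the header information includes the picture header (PH) and slice header (SH). The weighted prediction parameter information includes, but is not limited to, reference picture index information, the neural network model structure and parameters, all or some of the weighted prediction parameter values, and the weighted prediction parameter index information used by the content of each region of the current picture.
Step 405: Perform deep-learning-based weighted prediction decoding on the current picture according to the weighted prediction identification information and parameters.
Step 406: When the current picture is determined not to use deep-learning-based weighted prediction decoding, decode the picture directly with the conventional method.
Step 407: Generate the reconstructed picture. The reconstructed picture may be displayed or saved directly.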
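In one implementation, this embodiment gives the identification information for region-adaptive weighted prediction parameters carried in the sequence-level parameter set (SPS) of the bitstream. The identification information in the SPS can be referenced by the PPS and APS. The syntax and semantics in Table 1 are defined as follows:
sps_weighted_pred_flag is the sequence-level enable control for applying weighted prediction to unidirectionally predicted slices (P slices). sps_weighted_pred_flag equal to 1 indicates that weighted prediction may be applied to the P slices referring to the SPS; sps_weighted_pred_flag equal to 0 indicates that weighted prediction is not applied to the P slices referring to the SPS.
sps_weighted_bipred_flag is the sequence-level enable control for applying weighted prediction to bidirectionally predicted slices (B slices). sps_weighted_bipred_flag equal to 1 indicates that explicit weighted prediction may be applied to the B slices referring to the SPS; sps_weighted_bipred_flag equal to 0 indicates that explicit weighted prediction is not applied to the B slices referring to the SPS.
sps_wp_multi_weights_flag is the sequence-level enable control for a single reference picture having multiple sets of weighted prediction parameters. sps_wp_multi_weights_flag equal to 1 indicates that a single reference picture of the pictures referring to the SPS may have multiple sets of weighted prediction parameters; sps_wp_multi_weights_flag equal to 0 indicates that a single reference picture of the pictures referring to the SPS has only one set.
Table 1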
Figure PCTCN2022102406-appb-000001
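In one implementation, this embodiment gives the identification information for region-adaptive weighted prediction parameters carried in the picture-level parameter set (PPS) of the bitstream. The identification information in the PPS can be referenced by the APS.
The syntax and semantics in Table 2 are defined as follows:
pps_weighted_pred_flag is the picture-level enable control for applying weighted prediction to unidirectionally predicted slices (P slices). pps_weighted_pred_flag equal to 1 indicates that weighted prediction is applied to the P slices referring to the PPS; pps_weighted_pred_flag equal to 0 indicates that it is not. When sps_weighted_pred_flag is equal to 0, the value of pps_weighted_pred_flag shall be equal to 0.
pps_weighted_bipred_flag is the picture-level enable control for applying weighted prediction to bidirectionally predicted slices (B slices). pps_weighted_bipred_flag equal to 1 indicates that explicit weighted prediction is applied to the B slices referring to the PPS; pps_weighted_bipred_flag equal to 0 indicates that it is not. When sps_weighted_bipred_flag is equal to 0, the value of pps_weighted_bipred_flag shall be equal to 0.
pps_wp_multi_weights_flag is the picture-level enable control for a single reference picture having multiple sets of weighted prediction parameters. pps_wp_multi_weights_flag equal to 1 indicates that a single reference picture of the pictures referring to the PPS has multiple sets of weighted prediction parameters; pps_wp_multi_weights_flag equal to 0 indicates that it has only one set. When sps_wp_multi_weights_flag is equal to 0, the value of pps_wp_multi_weights_flag shall be equal to 0.
pps_no_pic_partition_flag equal to 1 indicates that no picture partitioning is applied to any picture referring to the PPS; pps_no_pic_partition_flag equal to 0 indicates that each picture referring to the PPS may be partitioned into multiple tiles or slices.
pps_rpl_info_in_ph_flag equal to 1 indicates that reference picture list (RPL) information is present in the picture header (PH) syntax structure and is not present in slice headers referring to the PPS that do not contain a PH syntax structure. pps_rpl_info_in_ph_flag equal to 0 indicates that RPL information is not present in the PH syntax structure and may be present in slice headers referring to the PPS.
pps_wp_info_in_ph_flag equal to 1 indicates that weighted prediction information may be present in the PH syntax structure and is not present in slice headers referring to the PPS that do not contain a PH syntax structure. pps_wp_info_in_ph_flag equal to 0 indicates that weighted prediction information is not present in the PH syntax structure and may be present in slice headers referring to the PPS.
Table 2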
Figure PCTCN2022102406-appb-000002
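The three PPS flags above are each constrained by the corresponding SPS flag: when the SPS flag is 0, the PPS flag must also be 0. A conformance check for that rule can be sketched as below. Representing the parsed parameter sets as plain dictionaries keyed by syntax element name is an assumption of this example, as is the function name.

```python
# (SPS flag, PPS flag) pairs taken from the semantics of Tables 1 and 2.
FLAG_PAIRS = [
    ("sps_weighted_pred_flag", "pps_weighted_pred_flag"),
    ("sps_weighted_bipred_flag", "pps_weighted_bipred_flag"),
    ("sps_wp_multi_weights_flag", "pps_wp_multi_weights_flag"),
]


def pps_flag_violations(sps, pps):
    """Return the PPS flags that are set although the SPS disables the tool."""
    return [pps_name for sps_name, pps_name in FLAG_PAIRS
            if sps[sps_name] == 0 and pps[pps_name] != 0]
```

An empty result means the PPS is consistent with the SPS it refers to; a decoder or bitstream checker could treat any returned name as a conformance error.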
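In another implementation, this embodiment gives the identification information for region-adaptive weighted prediction parameters carried in the picture header (PH) of the bitstream. The weighted prediction parameters contained in the PH can be referenced by the current picture, the slices of the current picture, and/or the CTUs or CUs of the current picture.
The syntax and semantics in Table 3 are defined as follows:
ph_inter_slice_allowed_flag equal to 0 indicates that all coded slices of the picture are of the intra prediction type (I slice). ph_inter_slice_allowed_flag equal to 1 indicates that the picture is allowed to contain one or more slices of the unidirectional or bidirectional inter prediction type (P slice or B slice).
multi_pred_weights_table() is a value table containing weighted prediction parameters in which a single reference picture may have multiple sets of weighted prediction parameters (weight + offset). When inter weighted prediction is determined to be applied and the weighted prediction information may be present in the PH, the weighted prediction parameters can be obtained from this table if pps_wp_multi_weights_flag is equal to 1.
pred_weight_table() is a value table containing weighted prediction parameters in which a single reference picture has only one set of weighted prediction parameters (weight + offset). When inter weighted prediction is determined to be applied and the weighted prediction information may be present in the PH, the weighted prediction parameters can be obtained from this table if pps_wp_multi_weights_flag is equal to 0.
Table 3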
Figure PCTCN2022102406-appb-000003
Figure PCTCN2022102406-appb-000004
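In another implementation, this embodiment gives the identification information for region-adaptive weighted prediction parameters carried in the slice header (SH) of the bitstream. The weighted prediction parameters contained in the SH can be referenced by the current slice and/or the CTUs or CUs of the current slice.
The syntax and semantics in Table 4 are defined as follows:
sh_picture_header_in_slice_header_flag equal to 1 indicates that the PH syntax structure is present in the slice header. sh_picture_header_in_slice_header_flag equal to 0 indicates that the PH syntax structure is not present in the slice header; that is, the slice level need not fully inherit the identification information of the picture level and can select coding tools flexibly.
sh_slice_type indicates the coding type of the slice, which may be intra (I slice), unidirectional inter (P slice), or bidirectional inter (B slice).
sh_wp_multi_weights_flag is the slice-level enable control for a single reference picture having multiple sets of weighted prediction parameters. sh_wp_multi_weights_flag equal to 1 indicates that a single reference picture of the slice associated with the slice header (SH) has multiple sets of weighted prediction parameters; sh_wp_multi_weights_flag equal to 0 indicates that it has only one set. When pps_wp_multi_weights_flag is equal to 0, the value of sh_wp_multi_weights_flag shall be equal to 0.
When inter weighted prediction is determined to be applied and the weighted prediction information may be present in the SH, the weighted prediction parameters can be obtained from the value table multi_pred_weights_table() if pps_wp_multi_weights_flag is equal to 1, and from the value table pred_weight_table() if pps_wp_multi_weights_flag is equal to 0.
Table 4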
Figure PCTCN2022102406-appb-000005
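The choice between pred_weight_table() and multi_pred_weights_table() in the picture header and slice header can be summarised in a small helper. The presence conditions below follow the structure of H.266/VVC version 1 (weighted prediction information sits in the PH when pps_wp_info_in_ph_flag is 1, otherwise in the SH of P or B slices for which the matching PPS flag is set); since Tables 3 and 4 are only available as images, treat these conditions, the dictionary representation, and the function name as assumptions of the sketch.

```python
def select_weight_table(location, pps, inter_slice_allowed=True, slice_type=None):
    """Return the name of the weight table to parse at 'PH' or 'SH', or None."""
    wp_p = pps["pps_weighted_pred_flag"] == 1
    wp_b = pps["pps_weighted_bipred_flag"] == 1
    in_ph = pps["pps_wp_info_in_ph_flag"] == 1

    if location == "PH":
        present = inter_slice_allowed and in_ph and (wp_p or wp_b)
    elif location == "SH":
        present = (not in_ph) and ((wp_p and slice_type == "P") or
                                   (wp_b and slice_type == "B"))
    else:
        raise ValueError("location must be 'PH' or 'SH'")

    if not present:
        return None
    # One set per reference picture versus possibly several sets.
    if pps["pps_wp_multi_weights_flag"] == 1:
        return "multi_pred_weights_table"
    return "pred_weight_table"
```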
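In another implementation, this embodiment gives the syntax and semantics of the weighted prediction parameter value tables. pred_weight_table() and multi_pred_weights_table() are both value tables containing weighted prediction parameters; the difference is that the former defines a single reference picture as having only one set of weighted prediction parameters, while the latter allows a single reference picture to have multiple sets. Specifically, the syntax and semantics of pred_weight_table() are described in the international standard H.266/VVC version 1; those of multi_pred_weights_table() are given in Table 5 and its description, where:
luma_log2_weight_denom and delta_chroma_log2_weight_denom are the scaling factors for the luma and chroma weighting factors, respectively, used to avoid floating-point arithmetic at the encoder.
num_l0_weights specifies, when pps_wp_info_in_ph_flag is equal to 1, the number of weighting factors signalled for the entries (reference pictures) in reference picture list 0 (RPL0). The value of num_l0_weights is in the range [0, Min(15, num_ref_entries[0][RplsIdx[0]])], where num_ref_entries[listIdx][rplsIdx] is the number of entries in the reference picture list syntax structure ref_pic_list_struct(listIdx, rplsIdx). When pps_wp_info_in_ph_flag is equal to 1, the variable NumWeightsL0 is set to num_l0_weights; otherwise, when pps_wp_info_in_ph_flag is equal to 0, NumWeightsL0 is set to NumRefIdxActive[0]. The value of NumRefIdxActive[i] - 1 is the maximum reference index of reference picture list i (RPLi) that may be used to decode the slice. NumRefIdxActive[i] equal to 0 indicates that no reference index of RPLi is used to decode the slice.
luma_weight_l0_flag[i] equal to 1 indicates that weighting factors (weight + offset) are present for the luma component of unidirectional prediction using the i-th entry of reference list 0 (RefPicList[0][i]). luma_weight_l0_flag[i] equal to 0 indicates that these weighting factors are not present.
chroma_weight_l0_flag[i] equal to 1 indicates that weighting factors (weight + offset) are present for the chroma prediction values of unidirectional prediction using the i-th entry of reference list 0 (RefPicList[0][i]). chroma_weight_l0_flag[i] equal to 0 indicates that these weighting factors are not present (the default).
num_l0_luma_pred_weights[i] specifies, when luma_weight_l0_flag[i] is equal to 1, the number of weighting factors signalled for the luma component of entry i (reference picture i) of reference picture list 0 (RPL0), that is, the number of weighted prediction parameter sets that the luma component of a single reference picture i in list 0 can carry.
delta_luma_weight_l0[i][k] and luma_offset_l0[i][k] specify the k-th weight factor and offset value, respectively, of the luma component of the i-th reference picture in reference picture list 0.
num_l0_chroma_pred_weights[i][j] specifies, when chroma_weight_l0_flag[i] is equal to 1, the number of weighting factors signalled for the j-th chroma component of entry i (reference picture i) of reference picture list 0 (RPL0), that is, the number of weighted prediction parameter sets that the j-th chroma component of a single reference picture i in list 0 can carry.
delta_chroma_weight_l0[i][j][k] and delta_chroma_offset_l0[i][j][k] specify the k-th weight factor and offset value, respectively, of the j-th chroma component of the i-th reference picture in reference picture list 0.
When bidirectional prediction is involved, similar information is signalled for reference picture list 1 (RPL1) in addition to the list 0 weighted prediction parameter information above, as shown in Table 5.
Table 5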
Figure PCTCN2022102406-appb-000006
Figure PCTCN2022102406-appb-000007
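Two derivations from the Table 5 semantics can be made concrete. The first, NumWeightsL0, is stated directly in the text. The second, turning the signalled deltas into usable luma weights, is an assumption: it borrows the H.266/VVC version 1 rule for pred_weight_table(), where the luma weight equals (1 << luma_log2_weight_denom) plus the signalled delta, and extends it to the extra index k. The function names and the nested-list layout are hypothetical.

```python
def num_weights_l0(pps_wp_info_in_ph_flag, num_l0_weights, num_ref_idx_active_l0):
    """NumWeightsL0 as described for num_l0_weights."""
    if pps_wp_info_in_ph_flag == 1:
        return num_l0_weights
    return num_ref_idx_active_l0


def luma_weights_l0(luma_log2_weight_denom, delta_luma_weight_l0, luma_offset_l0):
    """Per reference picture i, list the (weight, offset) pairs for k = 0, 1, ...

    Assumes weight = (1 << luma_log2_weight_denom) + delta_luma_weight_l0[i][k].
    """
    base = 1 << luma_log2_weight_denom
    return [[(base + delta, offset) for delta, offset in zip(deltas_i, offsets_i)]
            for deltas_i, offsets_i in zip(delta_luma_weight_l0, luma_offset_l0)]
```

With a denominator of 6, a delta of 0 gives weight 64 (unity gain) and a delta of -32 gives weight 32 (half brightness), so small deltas keep the signalled values close to zero for mild luminance changes.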
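In another implementation, this embodiment gives an alternative syntax and semantics for the weighted prediction parameter value table. pred_weight_table() and multi_pred_weights_table() are both value tables containing weighted prediction parameters; the difference is that the former defines a single reference picture as having only one set of weighted prediction parameters, while the latter allows a single reference picture to have multiple sets. Specifically, the syntax and semantics of pred_weight_table() are described in the international standard H.266/VVC version 1; those of multi_pred_weights_table() are given in Table 6 and its description. In Embodiment 11, multi_pred_weights_table() covers many entries of the reference picture list: each of the specified reference pictures must be checked for the presence of weighting factors, and a single reference picture that has weighting factors may in turn have multiple sets of weighted prediction parameters (weights and offsets). By contrast, this embodiment is a special case of Embodiment 11 in which only one reference picture per reference picture list is considered; that is, once weighted prediction is determined to be applied, the multiple sets of weighted prediction parameters of that reference picture are specified directly. The fields in Table 6 have the same semantics as the corresponding fields in Table 5.
Table 6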
Figure PCTCN2022102406-appb-000008
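In another implementation, the identification information and parameters of region-adaptive weighted prediction are given in the coding tree unit (CTU).
The weighted prediction parameter information in a CTU may be signalled independently; it may reference other parameter sets (for example the sequence-level parameter set SPS or picture-level parameter set PPS) or header information (for example the picture header PH or slice header SH); it may record weighted prediction parameter differences; or it may record weighted prediction parameter index information together with differences. When the CTU records differences, or index information and differences, a weighting value is obtained from another parameter set or from header information and a difference from the CTU is added to it, giving the weighted prediction parameter finally applied to the CTU or CU. Carrying weighted prediction parameter information per CTU enables fine-grained luminance gradient effects at CTU granularity, such as circular or radial gradients.
When weighted prediction parameter information is signalled in the CTU, the bitstream may be organized as shown in Table 7.
In Table 7, when a single reference picture in the reference picture list (RPL) has only one set of weighted prediction parameters, for example when sh_wp_multi_weights_flag is equal to 0, the CTU can directly set the weighted prediction parameter differences. When a single reference picture in the reference picture list has multiple sets of weighted prediction parameters, for example when sh_wp_multi_weights_flag is equal to 1, the CTU needs to signal the index of the parameter set it uses as well as the weighted prediction parameter differences.
The weighted prediction parameters finally applied to the current CTU are the selected weighted prediction parameters of the reference picture plus the weighted prediction parameter differences defined in coding_tree_unit().
The weighted prediction parameter differences include, but are not limited to, those for the luma component in RPL0 (ctu_delta_luma_weight_l0 and ctu_delta_luma_offset_l0), those for the chroma components in RPL0 (ctu_delta_chroma_weight_l0[i] and ctu_delta_chroma_offset_l0[i]), those for the luma component in RPL1 (ctu_delta_luma_weight_l1 and ctu_delta_luma_offset_l1), and those for the chroma components in RPL1 (ctu_delta_chroma_weight_l1[i] and ctu_delta_chroma_offset_l1[i]).
The weighted prediction parameter index information includes, but is not limited to, the parameter index for the luma component in RPL0 (ctu_index_l0_luma_pred_weights), that for the chroma components in RPL0 (ctu_index_l0_chroma_pred_weights[i]), that for the luma component in RPL1 (ctu_index_l1_luma_pred_weights), and that for the chroma components in RPL1 (ctu_index_l1_chroma_pred_weights[i]).
In summary, the weighted prediction parameters finally applied to the current CTU are the selected weighted prediction parameters of the reference picture plus the weighted prediction parameter differences defined in coding_tree_unit().
Table 7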
Figure PCTCN2022102406-appb-000009
Figure PCTCN2022102406-appb-000010
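The CTU-level derivation described above (select a parameter set of the reference picture, by index when there are several, then add the CTU's differences) can be sketched for the RPL0 luma component as follows. The function name and the representation of the parameter sets as a list of (weight, offset) tuples are assumptions of the example; the syntax element names come from the text.

```python
def ctu_luma_params_l0(ref_param_sets, sh_wp_multi_weights_flag,
                       ctu_delta_luma_weight_l0, ctu_delta_luma_offset_l0,
                       ctu_index_l0_luma_pred_weights=0):
    """Final (weight, offset) applied to the luma component of one CTU."""
    # With a single set per reference picture no index is signalled.
    if sh_wp_multi_weights_flag == 1:
        index = ctu_index_l0_luma_pred_weights
    else:
        index = 0
    base_weight, base_offset = ref_param_sets[index]
    return (base_weight + ctu_delta_luma_weight_l0,
            base_offset + ctu_delta_luma_offset_l0)
```

Varying the deltas smoothly from one CTU to the next is what would produce the circular or radial gradient effects mentioned in the text, without signalling a full parameter set per CTU.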
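In another implementation, the identification information and parameters of region-adaptive weighted prediction are given in the coding unit (CU).
The weighted prediction parameter information in a CU may be signalled independently; it may reference other parameter sets (for example the sequence-level parameter set SPS or picture-level parameter set PPS), header information (for example the picture header PH or slice header SH), or the coding tree unit (CTU); it may record weighted prediction parameter differences; or it may record weighted prediction parameter index information together with differences. When the CU records differences, or index information and differences, a weighting value is obtained from another parameter set or from header information and a difference from the CU is added to it, giving the weighted prediction parameter finally applied to the CU. Carrying weighted prediction parameter information per CU enables fine-grained luminance gradient effects at CU granularity, such as circular or radial gradients.
When weighted prediction parameter information is signalled in the CU, the bitstream may be organized as shown in Table 8.
cu_pred_weights_adjust_flag indicates whether the weighted prediction parameter values indexed by the current CU need to be adjusted. cu_pred_weights_adjust_flag equal to 1 indicates that the current CU adjusts the weighted prediction parameter values, that is, the values finally applied to the current CU are the sum of the CTU-level weighted prediction parameter values and the differences signalled at the CU level. cu_pred_weights_adjust_flag equal to 0 indicates that the current CU directly uses the weighted prediction parameter values determined at the CTU level.
In Table 8, the weighted prediction parameter differences signalled at the CU level include those for the luma component in RPL0 (cu_delta_luma_weight_l0 and cu_delta_luma_offset_l0).
If the current CU contains chroma components, the CU-level differences also include weighted prediction parameter differences for each chroma component; if the current CU is bidirectionally predicted, the CU-level differences also include those for the luma component and/or each chroma component in RPL1.
In summary, the weighted prediction parameters finally applied to the current CU are the specific weighted prediction parameters carried by higher-level data in the coding structure (for example the CTU level, slice level, or subpicture level) plus the weighted prediction parameter differences defined in coding_unit().
Table 8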
Figure PCTCN2022102406-appb-000011
Figure PCTCN2022102406-appb-000012
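The CU-level rule is a one-step refinement of the CTU-level result, gated by cu_pred_weights_adjust_flag. A minimal sketch for the RPL0 luma component follows; the function name and the tuple representation of the CTU-level parameters are assumptions of the example.

```python
def cu_luma_params_l0(ctu_params, cu_pred_weights_adjust_flag,
                      cu_delta_luma_weight_l0=0, cu_delta_luma_offset_l0=0):
    """Final (weight, offset) for one CU, starting from the CTU-level values."""
    if cu_pred_weights_adjust_flag == 0:
        return ctu_params  # CU inherits the CTU-level parameters unchanged
    weight, offset = ctu_params
    return (weight + cu_delta_luma_weight_l0,
            offset + cu_delta_luma_offset_l0)
```

Because the deltas are only read when the flag is 1, CUs in regions of uniform luminance change cost a single flag, and only CUs that deviate from their CTU carry extra values.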
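In another implementation, the identification information and parameters of region-adaptive weighted prediction are given in supplemental enhancement information (Supplemental Enhancement Information, SEI).
In the NAL unit header nal_unit_header(), the NAL unit type is set to 23, indicating a prefix SEI message. sei_rbsp() contains the sei_message() syntax structures, and sei_message() contains the payload data. payloadType only needs to be set to a value different from those of the other SEI messages in the current H.266/VVC version 1 (for example, 100); payload_size_byte then carries the bitstream information related to region-adaptive weighted prediction. The bitstream organization is shown in Table 9.
multi_pred_weights_cancel_flag equal to 1 cancels the SEI information associated with previous pictures, and the current picture does not use the SEI function. multi_pred_weights_cancel_flag equal to 0 continues the use of the earlier SEI information (during decoding, if the current picture carries no SEI information, the SEI information of the previous picture retained in memory carries over to the decoding of the current picture), and the current picture enables the SEI function. multi_pred_weights_persistence_flag equal to 1 indicates that the SEI information applies to the current picture and to subsequent pictures of the current layer; multi_pred_weights_persistence_flag equal to 0 indicates that the SEI information applies to the current picture only. The other fields in Table 9 have the same semantics as the corresponding fields in Table 5.
Table 9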
Figure PCTCN2022102406-appb-000013
Figure PCTCN2022102406-appb-000014
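One possible reading of the cancel and persistence flags is the small state machine below. It is an interpretation, not a normative decoder behaviour: the text leaves some cases open (for instance what a message with cancel flag 0 does to previously retained information), and this sketch assumes that such a message supplies the payload for the current picture and replaces anything retained earlier. The class name, the dictionary layout, and the `payload` key are hypothetical.

```python
class MultiPredWeightsSeiState:
    """Tracks which weighted-prediction SEI payload applies to each picture."""

    def __init__(self):
        self.retained = None  # payload kept in memory for later pictures

    def on_picture(self, sei=None):
        """Return the payload used to decode this picture, or None."""
        if sei is None:
            # No SEI on this picture: retained information carries over.
            return self.retained
        if sei["multi_pred_weights_cancel_flag"] == 1:
            # Cancel earlier information; this picture does not use the feature.
            self.retained = None
            return None
        payload = sei["payload"]
        if sei["multi_pred_weights_persistence_flag"] == 1:
            self.retained = payload  # applies to following pictures as well
        else:
            self.retained = None     # applies to the current picture only
        return payload
```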
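In another implementation, the identification information and parameters of region-adaptive weighted prediction are given in media description information. The media description information includes, but is not limited to, the Media Presentation Description (MPD) of the Dynamic Adaptive Streaming over HTTP (DASH) protocol and the Asset Descriptor of the MPEG Media Transport (MMT) protocol. Taking the MMT asset descriptor as an example, the bitstream organization for region-adaptive weighted prediction information is shown in Table 10. The syntax and fields in Table 10 have the same semantics as the corresponding fields in Table 5.
Table 10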
Figure PCTCN2022102406-appb-000015
Figure PCTCN2022102406-appb-000016
Figure PCTCN2022102406-appb-000017
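FIG. 11 is a schematic structural diagram of a video encoding apparatus provided by an embodiment of this application. The apparatus can perform the video encoding method provided by any embodiment of this application and has the functional modules and beneficial effects corresponding to the method. The apparatus may be implemented in software and/or hardware and includes an image acquisition module 610 and a video encoding module 620.
The image acquisition module 610 is configured to obtain a video picture, wherein the video picture is at least one frame of a video.
The video encoding module 620 is configured to perform weighted prediction coding on the video picture to generate a picture bitstream, wherein the weighted prediction coding uses at least one set of weighted prediction identification information and parameters.
In this embodiment, the image acquisition module obtains a video picture, which is at least one frame of a video, and the video encoding module performs weighted prediction coding on it to generate a picture bitstream, using at least one set of weighted prediction identification information and parameters in the process. This achieves flexible coding of video pictures, can improve video coding efficiency, and reduces the impact of luminance changes in video pictures on coding efficiency.
Further, on the basis of the above embodiments, the apparatus also includes:
a change determination module, configured to determine the luminance change condition according to the result of comparing the video picture with a reference picture.
Further, on the basis of the above embodiments, the luminance change condition includes at least one of the following: the mean picture luminance change, and the pixel luminance change value.
Further, on the basis of the above embodiments, the video encoding module 620 includes:
a coding processing unit, configured to perform weighted prediction coding on the video picture according to the luminance change condition of the video picture.
Further, on the basis of the above embodiments, the coding processing unit is specifically configured to: if the luminance change is consistent across the whole frame, perform weighted prediction coding on the video picture; if the luminance change is consistent within partitioned pictures, determine to perform weighted prediction coding separately on each partitioned picture within the video picture.
Further, on the basis of the above embodiments, one set of the weighted prediction identification information and parameters used for weighted prediction coding in the apparatus corresponds to one frame of the video picture or to at least one partitioned picture of the video picture.
Further, on the basis of the above embodiments, the granularity of a partitioned picture in the apparatus includes at least one of the following: slice, tile, subpicture, coding tree unit, coding unit.
Further, on the basis of the above embodiments, the apparatus also includes:
a bitstream writing module, configured to write the weighted prediction identification information and parameters into the picture bitstream.
Further, on the basis of the above embodiments, the weighted prediction identification information and the parameters in the apparatus are carried in at least one of the following: sequence-level parameter set, picture-level parameter set, slice-level parameter set, supplemental enhancement information, video usability information, picture header, slice header, NAL unit header, coding tree unit, coding unit.
Further, on the basis of the above embodiments, the weighted prediction identification information and the parameters in the apparatus include at least one of the following: reference picture index information, weighted prediction enable control information, region-adaptive weighted prediction enable control information, weighted prediction parameters.
Further, on the basis of the above embodiments, the picture bitstream in the apparatus comprises a transport stream or a media file.
Further, on the basis of the above embodiments, the video encoding module 620 also includes:
a deep learning unit, configured to perform weighted prediction coding on the video picture according to a pre-trained neural network model.
Further, on the basis of the above embodiments, the weighted prediction identification information in the apparatus also includes a neural network model structure and neural network model parameters.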
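FIG. 12 is a schematic structural diagram of a video decoding apparatus provided by an embodiment of this application. The apparatus can perform the video decoding method provided by any embodiment of this application and has the functional modules and beneficial effects corresponding to the method. The apparatus may be implemented in software and/or hardware and includes a bitstream acquisition module 710 and an image reconstruction module 720.
The bitstream acquisition module 710 is configured to obtain a picture bitstream and parse the weighted prediction identification information and parameters in the picture bitstream.
The image reconstruction module 720 is configured to decode the picture bitstream according to the weighted prediction identification information and parameters to generate a reconstructed picture.
In this embodiment, the bitstream acquisition module obtains a picture bitstream and the weighted prediction identification information and parameters in it, and the image reconstruction module processes the bitstream according to them to generate a reconstructed picture. This achieves dynamic decoding of the picture bitstream, can improve video decoding efficiency, and reduces the impact of picture luminance changes on video decoding efficiency.
Further, on the basis of the above embodiments, the picture bitstream in the apparatus contains at least one set of weighted prediction identification information and parameters.
Further, on the basis of the above embodiments, the weighted prediction identification information and the parameters in the apparatus are carried in at least one of the following: sequence-level parameter set, picture-level parameter set, slice-level parameter set, supplemental enhancement information, video usability information, picture header, slice header, NAL unit header, coding tree unit, coding unit.
Further, on the basis of the above embodiments, the weighted prediction identification information and the parameters in the apparatus include at least one of the following: reference picture index information, weighted prediction enable control information, region-adaptive weighted prediction enable control information, weighted prediction parameters.
Further, on the basis of the above embodiments, the picture bitstream in the apparatus comprises a transport stream or a media file.
Further, on the basis of the above embodiments, the image reconstruction module 720 includes:
a deep learning decoding unit, configured to use a pre-trained neural network model together with the weighted prediction identification information and parameters to perform weighted prediction decoding on the picture bitstream to generate a reconstructed picture.
Further, on the basis of the above embodiments, the weighted prediction identification information and parameters in the apparatus also include a neural network model structure and neural network model parameters.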
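In some examples, FIG. 13 is a schematic structural diagram of an encoder provided by an embodiment of this application. The encoder shown in FIG. 13 is used in an apparatus that encodes video. The input of the apparatus is the pictures contained in a video, and the output is a picture bitstream, or a transport stream or media file containing the picture bitstream. The encoder performs the following. Step 501: Input a picture. Step 502: The encoder processes and encodes the picture; the specific procedure may follow the video encoding method provided by any of the above embodiments. Step 503: Output the bitstream.
In some examples, FIG. 14 is a schematic structural diagram of a decoder provided by an embodiment of this application. The decoder shown in FIG. 14 is used in an apparatus that decodes video. The input of the apparatus is a picture bitstream, or a transport stream or media file containing the picture bitstream, and the output is the pictures that make up the video. The decoder performs the following. Step 601: Input the bitstream. Step 602: The decoder parses and decodes the bitstream to obtain pictures; the specific procedure may follow the video decoding method provided by any of the above embodiments. Step 603: Output the pictures. Step 604: A player plays the pictures.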
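FIG. 15 is a schematic structural diagram of an electronic device provided by an embodiment of this application. The electronic device includes a processor 70, a memory 71, an input device 72, and an output device 73. The electronic device may have one or more processors 70; FIG. 15 takes one processor 70 as an example. The processor 70, memory 71, input device 72, and output device 73 may be connected by a bus or by other means; FIG. 15 takes a bus connection as an example.
The memory 71, as a computer-readable storage medium, can store software programs, computer-executable programs, and modules, such as the modules corresponding to the video encoding apparatus or video decoding apparatus in the embodiments of this application (the image acquisition module 610 and video encoding module 620, or the bitstream acquisition module 710 and image reconstruction module 720). The processor 70 runs the software programs, instructions, and modules stored in the memory 71 to execute the various functional applications and data processing of the electronic device, that is, to implement the methods described above.
The memory 71 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required by at least one function; the data storage area may store data created through use of the electronic device. In addition, the memory 71 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 71 may further include memory located remotely from the processor 70, which may be connected to the electronic device over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 72 can receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device. The output device 73 may include a display device such as a display screen.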
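An embodiment of this application further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform a video encoding method comprising:
obtaining a video picture, wherein the video picture is at least one frame of a video;
performing weighted prediction coding on the video picture to generate a picture bitstream, wherein the weighted prediction coding uses at least one set of weighted prediction identification information and parameters.
Alternatively, the computer-executable instructions, when executed by a computer processor, perform a video decoding method comprising:
obtaining a picture bitstream and parsing the weighted prediction identification information and parameters in the picture bitstream;
decoding the picture bitstream according to the weighted prediction identification information and parameters to generate a reconstructed picture.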
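From the above description of the implementations, those skilled in the art will clearly understand that this application can be implemented by software together with the necessary general-purpose hardware, and can of course also be implemented in hardware, although in many cases the former is the better implementation. On this understanding, the technical solution of this application, in essence or in the part contributing to the prior art, can be embodied as a software product. The computer software product can be stored in a computer-readable storage medium, such as a floppy disk, read-only memory (ROM), random access memory (RAM), flash memory (FLASH), hard disk, or optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of this application.
It is worth noting that, in the above apparatus embodiments, the units and modules are divided only according to functional logic. The division is not limited to the one described, provided the corresponding functions can be realized. In addition, the specific names of the functional units serve only to distinguish them from one another and do not limit the scope of protection of this application.
Those of ordinary skill in the art will understand that all or some of the steps of the methods disclosed above, and the functional modules/units of the systems and devices, can be implemented as software, firmware, hardware, or suitable combinations thereof.
In a hardware implementation, the division between the functional modules/units mentioned above does not necessarily correspond to the division of physical components. For example, one physical component may have several functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically contain computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
Some embodiments of this application have been described above with reference to the accompanying drawings; this does not limit the scope of the rights of this application. Any modifications, equivalent substitutions, and improvements made by those skilled in the art without departing from the scope and essence of this application shall fall within the scope of the rights of this application.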

Claims (25)

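  1. A video encoding method, comprising:
    obtaining a video picture, wherein the video picture is at least one frame of a video;
    performing weighted prediction coding on the video picture to generate a picture bitstream, wherein the weighted prediction coding uses at least one set of weighted prediction identification information and parameters.
  2. The method according to claim 1, further comprising:
    determining a luminance change condition according to a result of comparing the video picture with a reference picture.
  3. The method according to claim 2, wherein the luminance change condition comprises at least one of:
    a mean picture luminance change, and a pixel luminance change value.
  4. The method according to claim 1, wherein performing weighted prediction coding on the video picture to generate a picture bitstream comprises:
    performing weighted prediction coding on the video picture according to a luminance change condition of the video picture.
  5. The method according to claim 4, wherein performing weighted prediction coding on the video picture according to the luminance change condition comprises at least one of:
    if the luminance change condition is that luminance is consistent across the whole frame, performing weighted prediction coding on the video picture;
    if the luminance change condition is that luminance is consistent within partitioned pictures, determining to perform weighted prediction coding separately on each of the partitioned pictures within the video picture.
  6. The method according to claim 1, wherein, for the reference picture or a partitioned picture of the reference picture, the video picture has at least one set of the weighted prediction identification information and parameters.
  7. The method according to claim 1, wherein one set of the weighted prediction identification information and parameters used in the weighted prediction coding corresponds to one frame of the video picture or to at least one partitioned picture of the video picture.
  8. The method according to claim 7, wherein the granularity of the partitioned picture comprises at least one of:
    slice, tile, subpicture, coding tree unit, coding unit.
  9. The method according to claim 1, further comprising:
    writing the weighted prediction identification information and parameters into the picture bitstream.
  10. The method according to claim 9, wherein the weighted prediction identification information and the parameters are carried in at least one of: sequence-level parameter set, picture-level parameter set, slice-level parameter set, supplemental enhancement information, video usability information, picture header, slice header, NAL unit header, coding tree unit, coding unit.
  11. The method according to claim 1, wherein the weighted prediction identification information and the parameters comprise at least one of: reference picture index information, weighted prediction enable control information, region-adaptive weighted prediction enable control information, weighted prediction parameters.
  12. The method according to claim 1, wherein the picture bitstream comprises a transport stream or a media file.
  13. The method according to claim 1, wherein performing weighted prediction coding on the video picture to generate a picture bitstream comprises:
    performing weighted prediction coding on the video picture according to a pre-trained neural network model.
  14. The method according to claim 13, wherein the weighted prediction identification information further comprises a neural network model structure and neural network model parameters.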
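  15. A video decoding method, comprising:
    obtaining a picture bitstream and parsing weighted prediction identification information and parameters in the picture bitstream;
    decoding the picture bitstream according to the weighted prediction identification information and parameters to generate a reconstructed picture.
  16. The method according to claim 15, wherein the picture bitstream contains at least one set of weighted prediction identification information and parameters.
  17. The method according to claim 15, wherein the weighted prediction identification information and the parameters are carried in at least one of: sequence-level parameter set, picture-level parameter set, slice-level parameter set, supplemental enhancement information, video usability information, picture header, slice header, NAL unit header, coding tree unit, coding unit.
  18. The method according to claim 15, wherein the weighted prediction identification information and the parameters comprise at least one of: reference picture index information, weighted prediction enable control information, region-adaptive weighted prediction enable control information, weighted prediction parameters.
  19. The method according to claim 15, wherein the picture bitstream comprises a transport stream or a media file.
  20. The method according to claim 15, wherein decoding the picture bitstream according to the weighted prediction identification information and parameters to generate a reconstructed picture comprises:
    using a pre-trained neural network model together with the weighted prediction identification information and parameters to perform weighted prediction decoding on the picture bitstream to generate a reconstructed picture.
  21. The method according to claim 20, wherein the weighted prediction identification information and parameters further comprise a neural network model structure and neural network model parameters.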
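  22. A video encoding apparatus, comprising:
    an image acquisition module, configured to obtain a video picture, wherein the video picture is at least one frame of a video;
    a video encoding module, configured to perform weighted prediction coding on the video picture to generate a picture bitstream, wherein the weighted prediction coding uses at least one set of weighted prediction identification information and parameters.
  23. A video decoding apparatus, comprising:
    a bitstream acquisition module, configured to obtain a picture bitstream and parse weighted prediction identification information and parameters in the picture bitstream;
    an image reconstruction module, configured to decode the picture bitstream according to the weighted prediction identification information and parameters to generate a reconstructed picture.
  24. An electronic device, comprising:
    one or more processors;
    a memory, configured to store one or more programs; wherein
    the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-14 or 15-21.
  25. A computer-readable storage medium storing one or more programs, wherein the one or more programs are executable by one or more processors to implement the method according to any one of claims 1-14 or 15-21.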
PCT/CN2022/102406 2021-07-30 2022-06-29 视频编码、视频解码方法、装置、电子设备和存储介质 Ceased WO2023005579A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
JP2023578781A JP7698746B2 (ja) 2021-07-30 2022-06-29 ビデオ符号化、ビデオ復号化の方法、装置、電子機器及び記憶媒体
US18/577,790 US12457351B2 (en) 2021-07-30 2022-06-29 Video encoding method and apparatus, video decoding method and apparatus, and electronic device and storage medium
EP22848184.2A EP4380156A4 (en) 2021-07-30 2022-06-29 VIDEO ENCODING METHOD AND APPARATUS, VIDEO DECODING METHOD AND APPARATUS, ELECTRONIC DEVICE, AND STORAGE MEDIUM
KR1020247002467A KR20240024975A (ko) 2021-07-30 2022-06-29 비디오 인코딩, 비디오 디코딩 방법, 장치, 전자 기기 및 저장 매체

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110875430.8A CN115695812A (zh) 2021-07-30 2021-07-30 视频编码、视频解码方法、装置、电子设备和存储介质
CN202110875430.8 2021-07-30

Publications (1)

Publication Number Publication Date
WO2023005579A1 true WO2023005579A1 (zh) 2023-02-02

Family

ID=85059734

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/102406 Ceased WO2023005579A1 (zh) 2021-07-30 2022-06-29 视频编码、视频解码方法、装置、电子设备和存储介质

Country Status (6)

Country Link
US (1) US12457351B2 (zh)
EP (1) EP4380156A4 (zh)
JP (1) JP7698746B2 (zh)
KR (1) KR20240024975A (zh)
CN (1) CN115695812A (zh)
WO (1) WO2023005579A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119520491B (zh) * 2023-08-23 2026-02-17 上海交通大学 一种基于http服务器的媒体资源的发送方法、接收方法和交互方法
CN120151515A (zh) * 2023-12-12 2025-06-13 腾讯科技(深圳)有限公司 视频编解码方法、装置、计算机可读介质及电子设备
US20260113463A1 (en) * 2024-10-18 2026-04-23 Nvidia Corporation Hardware-assisted weighted prediction encoding with software-generated weight prediction parameters

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2302933A1 (en) * 2009-09-17 2011-03-30 Mitsubishi Electric R&D Centre Europe B.V. Weighted motion compensation of video
CN103458240A (zh) * 2012-05-29 2013-12-18 韩国科亚电子股份有限公司 利用自适应加权预测的影像处理方法
CN106358041A (zh) * 2016-08-30 2017-01-25 北京奇艺世纪科技有限公司 一种帧间预测编码方法及装置
CN109417619A (zh) * 2016-04-29 2019-03-01 英迪股份有限公司 用于编码/解码视频信号的方法和设备
CN112261409A (zh) * 2019-07-22 2021-01-22 中兴通讯股份有限公司 残差编码、解码方法及装置、存储介质及电子装置

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2699253C2 (ru) 2010-09-03 2019-09-04 Гуандун Оппо Мобайл Телекоммьюникейшнз Корп., Лтд. Способ и система для компенсации освещенности и перехода при кодировании и обработке видеосигнала
JP2012244353A (ja) * 2011-05-18 2012-12-10 Sony Corp 画像処理装置および方法
AU2011379258C1 (en) 2011-10-17 2015-11-26 Kabushiki Kaisha Toshiba Encoding method and decoding method
KR101974261B1 (ko) * 2016-06-24 2019-04-30 한국과학기술원 Cnn 기반 인루프 필터를 포함하는 부호화 방법과 장치 및 복호화 방법과 장치
WO2018037919A1 (ja) 2016-08-26 2018-03-01 シャープ株式会社 画像復号装置、画像符号化装置、画像復号方法、および画像符号化方法
WO2019110125A1 (en) * 2017-12-08 2019-06-13 Huawei Technologies Co., Ltd. Polynomial fitting for motion compensation and luminance reconstruction in texture synthesis
EP3562162A1 (en) 2018-04-27 2019-10-30 InterDigital VC Holdings, Inc. Method and apparatus for video encoding and decoding based on neural network implementation of cabac
WO2019205117A1 (en) * 2018-04-28 2019-10-31 Intel Corporation Weighted prediction mechanism
JP7185467B2 (ja) 2018-09-28 2022-12-07 Kddi株式会社 画像復号装置、画像符号化装置、画像処理システム及びプログラム
KR20210145754A (ko) * 2019-04-12 2021-12-02 베이징 바이트댄스 네트워크 테크놀로지 컴퍼니, 리미티드 행렬 기반 인트라 예측에서의 산출
CN114051735B (zh) * 2019-05-31 2024-07-05 北京字节跳动网络技术有限公司 基于矩阵的帧内预测中的一步下采样过程
JP2022534320A (ja) * 2019-06-05 2022-07-28 北京字節跳動網絡技術有限公司 マトリクスベースイントラ予測のためのコンテキスト決定
CN120302048A (zh) 2019-08-22 2025-07-11 Lg电子株式会社 图像解码方法、图像编码方法和比特流发送方法
US11677987B2 (en) * 2020-10-15 2023-06-13 Qualcomm Incorporated Joint termination of bidirectional data blocks for parallel coding
US20230407239A1 (en) * 2020-11-13 2023-12-21 Teewinot Life Sciences Corporation Tetrahydrocannabinolic acid (thca) synthase variants, and manufacture and use thereof
US12015785B2 (en) * 2020-12-04 2024-06-18 Ofinno, Llc No reference image quality assessment based decoder side inter prediction
US11729424B2 (en) * 2020-12-04 2023-08-15 Ofinno, Llc Visual quality assessment-based affine transformation
US11599748B2 (en) * 2020-12-18 2023-03-07 Tiliter Pty Ltd. Methods and apparatus for recognizing produce category, organic type, and bag type in an image using a concurrent neural network model
US11825090B1 (en) * 2022-07-12 2023-11-21 Qualcomm Incorporated Bit-rate estimation for video coding with machine learning enhancement

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4380156A4

Also Published As

Publication number Publication date
CN115695812A (zh) 2023-02-03
US12457351B2 (en) 2025-10-28
EP4380156A4 (en) 2025-07-09
EP4380156A1 (en) 2024-06-05
US20240380911A1 (en) 2024-11-14
JP2024521528A (ja) 2024-05-31
JP7698746B2 (ja) 2025-06-25
KR20240024975A (ko) 2024-02-26

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22848184

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2023578781

Country of ref document: JP

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 18577790

Country of ref document: US

ENP Entry into the national phase

Ref document number: 20247002467

Country of ref document: KR

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 1020247002467

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 2022848184

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2022848184

Country of ref document: EP

Effective date: 20240229

WWG Wipo information: grant in national office

Ref document number: 18577790

Country of ref document: US