WO2020192020A1 - Filtering method, device, encoder, and computer storage medium - Google Patents

Filtering method, device, encoder, and computer storage medium

Info

Publication number
WO2020192020A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
components
processing
filtered
component
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2019/104499
Other languages
English (en)
French (fr)
Inventor
马彦卓
万帅
霍俊彦
张伟
王铭泽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to JP2021556289A priority Critical patent/JP2022526107A/ja
Priority to CN201980094255.XA priority patent/CN113574884A/zh
Priority to KR1020217032825A priority patent/KR102916992B1/ko
Priority to EP19922221.7A priority patent/EP3941057A4/en
Publication of WO2020192020A1 publication Critical patent/WO2020192020A1/zh
Priority to US17/475,184 priority patent/US12206904B2/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N19/82Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/002Image coding using neural networks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117Filters, e.g. for pre-processing or post-processing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124Quantisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136Incoming video signal characteristics or properties
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/423Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/86Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness

Definitions

  • the embodiments of the present application relate to the technical field of video image processing, and in particular, to a filtering method, device, encoder, and computer storage medium.
  • the preprocessing filter is used to preprocess the original image to reduce the video resolution; because the resolution of the video to be encoded is then lower than that of the original video, fewer bits are needed to represent it, which can improve the overall coding efficiency;
  • the post-processing filter processes the in-loop filtered video and outputs it at an improved video resolution.
  • neural network-based filters are often composed of multiple basic units.
  • the input of the neural network is a single input or multiple inputs, that is, a single image component or multiple image components; as a result, the complexity of existing convolutional neural networks is relatively high, and the current convolutional neural network (CNN, Convolutional Neural Network) filter does not make full use of relevant information, which limits the improvement of reconstructed image quality.
  • the embodiments of the present application provide a filtering method, a device, an encoder, and a computer storage medium, which can reduce the complexity of the neural network-based filtering method and help improve the image quality of the reconstructed image.
  • an embodiment of the present application provides a filtering method, the method including:
  • at least two components of the pixel information to be filtered and the at least one type of side information are input into a neural network-based filter to output at least one filtered component of the pixel information to be filtered.
  • an embodiment of the present application provides a filtering device, and the filtering device includes:
  • the first obtaining module is configured to obtain pixel information to be filtered
  • the second obtaining module is configured to obtain at least one type of side information
  • a determining module configured to input at least two components of the pixel information to be filtered and the at least one type of side information into a neural network-based filter to output at least one filtered component of the pixel information to be filtered.
  • in a third aspect, an embodiment of the present application provides an encoder, and the encoder includes:
  • an embodiment of the present application provides a computer storage medium in which executable instructions are stored; when the executable instructions are executed by one or more processors, the processors execute the filtering method described in one or more of the foregoing embodiments.
  • the embodiments of the present application provide a filtering method, device, encoder, and computer storage medium.
  • the filtering device obtains pixel information to be filtered, obtains at least one type of side information, and inputs at least two components of the pixel information to be filtered and the at least one type of side information into the neural network-based filter to output at least one filtered component of the pixel information to be filtered;
  • that is, in the embodiment of the present application, at least two components of the pixel information to be filtered and at least one type of side information are acquired and input into a neural network-based filter for processing.
  • the side information of at least one component is incorporated to obtain filtered pixel information.
  • Figure 1 is a schematic structural diagram of a traditional coding block diagram
  • Figure 2 is a schematic structural diagram of a traditional decoding block diagram
  • FIG. 3 is a schematic flowchart of an optional filtering method provided by an embodiment of this application.
  • FIG. 4 is a schematic structural diagram of a block division matrix provided by an embodiment of the application.
  • FIG. 5 is a schematic structural diagram of a traditional CNN filter provided by an embodiment of the application.
  • FIG. 6A is a schematic structural diagram of another traditional CNN filter provided by an embodiment of the application.
  • FIG. 6B is a schematic structural diagram of yet another traditional CNN filter provided by an embodiment of the application.
  • FIG. 7 is a schematic structural diagram of an optional filtering framework provided by an embodiment of the application.
  • FIG. 8 is a schematic structural diagram of another optional filtering framework provided by an embodiment of the application.
  • FIG. 9 is a schematic structural diagram of yet another optional filtering framework provided by an embodiment of this application.
  • FIG. 10 is a schematic structural diagram of an optional filtering device provided by an embodiment of this application.
  • FIG. 11 is a schematic structural diagram of an optional encoder provided by an embodiment of the application.
  • the video to be coded includes the original image frame, and the original image frame includes the original image, and the original image is processed in various ways, such as prediction, transformation, quantization, reconstruction, and filtering.
  • the processed video image may have shifted in pixel value relative to the original image, causing visual impairment or artifacts.
  • because adjacent coding blocks use different coding parameters (such as different transformation processes, different QPs, different prediction methods, and different reference image frames), the magnitude of the error introduced by each coding block and its distribution characteristics are independent of each other, and the discontinuity at the boundaries of adjacent coding blocks produces blocking artifacts.
  • FIG 1 is a schematic structural diagram of a traditional coding block diagram.
  • the traditional coding block diagram 10 may include a transform and quantization unit 101, an inverse transform and inverse quantization unit 102, a prediction unit 103, an in-loop filtering unit 104, and an entropy coding unit 105 and other components; wherein the prediction unit 103 further includes an intra prediction unit 1031 and an inter prediction unit 1032.
  • coding tree units (CTU, Coding Tree Unit) can be obtained through preliminary division, and a CTU can be further divided content-adaptively to obtain CUs.
  • a CU generally contains one or more coding blocks (CB, Coding Block).
  • the prediction unit 103 is also used to provide the selected intra prediction data or inter prediction data to the entropy coding unit 105; in addition, the inverse transform and inverse quantization unit 102 is used for reconstruction of the coding block, reconstructing the residual block in the pixel domain; the in-loop filtering unit 104 removes blocking artifacts from the reconstructed residual block, which is then added to the decoded image buffer unit to generate a reconstructed reference image; the entropy coding unit 105 is used for encoding various coding parameters and quantized transform coefficients.
  • the in-loop filtering unit 104 is a loop filter, also called an in-loop filter (In-Loop Filter), which may include a de-blocking filter (DBF, De-Blocking Filter), a sample adaptive offset (SAO, Sample Adaptive Offset) filter, an adaptive loop filter (ALF, Adaptive Loop Filter), etc.
  • the preprocessing filtering unit 106 is used to receive the input original video frame, and perform preprocessing filtering on the original image frame in the original video frame to reduce the resolution of the video.
  • the post-processing filtering unit 107 is used to receive the in-loop filtered video frame and perform post-processing filtering on it to improve the resolution of the video, so that fewer bits can be used in the video encoding and decoding process to obtain reconstructed video frames, which can improve the overall coding and decoding efficiency.
  • the input of the neural network currently used by both the preprocessing filter and the post-processing filter is a single input or multiple inputs, that is, a single image component or multiple image components; as a result, the complexity of the existing convolutional neural network is high, and the current CNN filter does not fully utilize relevant information, which limits the improvement of the reconstructed image quality.
  • Figure 2 is a schematic structural diagram of a traditional decoding block diagram.
  • the traditional decoding block diagram 20 may include an entropy coding unit 201, an inverse quantization and inverse transformation unit 202, a prediction unit 203, an in-loop filtering unit 204, a post-processing filtering unit 205, and other components; the prediction unit 203 further includes an intra prediction unit 2031 and an inter prediction unit 2032.
  • the video decoding process is the reverse of the video encoding process, and the post-processing filtered image obtained in the video decoding process is determined as the reconstructed video frame. It can be seen from Figure 2 that the decoding process does not involve the preprocessing filtering unit used in the encoding process.
  • the embodiment of the application provides a filtering method, which is applied to a filtering device.
  • the filtering device can be set in the pre-processing filter and the post-processing filter in the encoder, or in the post-processing filter of the decoder.
  • the embodiments of the present application do not make specific limitations.
  • FIG. 3 is a schematic flowchart of an optional filtering method provided by an embodiment of this application.
  • the filtering method may include:
  • S303 Input at least two components of the pixel information to be filtered and at least one type of side information into the neural network-based filter to output at least one component after filtering the pixel information to be filtered.
  • the above-mentioned pixel information to be filtered refers to the image block to be filtered represented by the pixel value
  • the image block to be filtered includes three image components
  • the above-mentioned at least two components may be any two of the three image components or all three image components
  • the image components can include a first image component, a second image component, and a third image component; in the embodiment of the present application, the description takes as an example the case where the first image component represents the luminance component, the second image component represents the first chrominance component, and the third image component represents the second chrominance component.
  • the aforementioned at least one type of side information may be at least one of the side information corresponding to the first type of image component, the side information corresponding to the second type of image component, and the side information of the third type of image component.
  • the original image frame can be divided into CTUs, and a CTU can be divided into CUs; that is to say, the block division information in the embodiment of this application may refer to CTU division information or CU division information;
  • the filtering method of the embodiment of the present application can be applied not only to pre-processing filtering or post-processing filtering at the CU level, but also to pre-processing filtering or post-processing filtering at the CTU level, which is not specifically limited in the embodiment of the present application.
  • the pixel information to be filtered is the original image block in the video to be encoded, expressed in pixel values, or it is the image block obtained after the in-loop filtering of the video to be encoded during video encoding, expressed in pixel values.
  • the reference image of the image to be encoded can also be processed by inverse transformation and inverse quantization, reconstruction, and filtering. That is to say, the pixel information to be filtered can be an image block of the original image just input into the encoder, which is the case when the method is applied to the pre-processing filter; or it can be an image block just obtained after in-loop filtering, which is the case when the method is applied to the post-processing filter. In this way, the pixel information to be filtered is obtained.
  • At least two components of the pixel information to be filtered and at least one type of side information can be obtained.
  • the first image component, the second image component, and the third image component are generally used to characterize the original image or the image to be filtered.
  • the three image components are a luminance component, a blue chrominance (color difference) component, and a red chrominance (color difference) component; specifically, the luminance component is usually represented by the symbol Y, the blue chrominance component is usually represented by the symbol Cb or U, and the red chrominance component is usually represented by the symbol Cr or V.
  • At least one component represents one or more of the first image component, the second image component, and the third image component
  • the at least two components may be the first image component, the second image component, and the third image component; they may also be the first image component and the second image component, or the first image component and the third image component, or even the second image component and the third image component.
  • VVC next generation video coding standard
  • VTM VVC Test Model
  • the current standard test sequence adopts the YUV 4:2:0 format.
  • Each frame of the video to be encoded in this format can be composed of three components: a luminance component Y and two chrominance components U and V.
  • the size information corresponding to the first image component is $H \times W$
  • the size information corresponding to the second image component and to the third image component is $\frac{H}{2} \times \frac{W}{2}$ in both cases
  • the embodiment of the present application will take the YUV 4:2:0 format as an example for description, but the filtering method of the embodiment of the present application is also applicable to other sampling formats.
  • because the size information of the first image component differs from that of the second image component or the third image component, the three components need to be sampled or recombined so that their spatial size information is the same; only then can the first image component and/or the second image component and/or the third image component be input into the neural network-based filter at one time.
  • pixel rearrangement processing may be performed on high-resolution image components, so that the spatial size information of the three components is the same. Specifically, before S302, for at least two components of the pixel information to be filtered, high-resolution image components may be selected, and pixel rearrangement processing may be performed on the high-resolution image components.
  • the three components included in the original image are the original image components before any other processing is performed. If the first image component is the luminance component, the second image component is the first chrominance component, and the third image component is the second chrominance component, then the high-resolution image component is the first image component, and it is the first image component that undergoes pixel rearrangement processing.
  • taking an original image with a size of $2 \times 2$ as an example, it is converted into 4 channels, that is, the $2 \times 2 \times 1$ tensor is arranged into a $1 \times 1 \times 4$ tensor; then, when the size information of the first image component of the original image is $H \times W$, it can be converted to the form $\frac{H}{2} \times \frac{W}{2} \times 4$; because the size information of the second image component and the third image component is $\frac{H}{2} \times \frac{W}{2}$ in both cases, the spatial size information of the three image components becomes the same; the three image components after the pixel rearrangement processing are subsequently combined into the form $\frac{H}{2} \times \frac{W}{2} \times 6$ and input into the pre-processing filter or post-processing filter.
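The pixel rearrangement described above can be illustrated with a minimal NumPy sketch. The function names `pixel_rearrange` and `stack_yuv420` are hypothetical, and the channel order inside each 2×2 block (row-major) is an assumption, since the text does not specify it.

```python
import numpy as np

def pixel_rearrange(plane: np.ndarray) -> np.ndarray:
    """Rearrange an (H, W) plane into (H/2, W/2, 4); each 2x2 block becomes 4 channels."""
    h, w = plane.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    # Split rows and columns into (block index, offset inside the 2x2 block).
    blocks = plane.reshape(h // 2, 2, w // 2, 2)
    # Move both in-block offsets to the end and flatten them into 4 channels
    # (row-major order inside the block is an assumption of this sketch).
    return blocks.transpose(0, 2, 1, 3).reshape(h // 2, w // 2, 4)

def stack_yuv420(y: np.ndarray, u: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Combine the rearranged Y plane with U and V into an (H/2, W/2, 6) tensor."""
    return np.concatenate([pixel_rearrange(y), u[..., None], v[..., None]], axis=-1)

y = np.arange(16).reshape(4, 4)          # toy 4x4 luminance plane
u = np.zeros((2, 2), dtype=y.dtype)      # toy 2x2 chrominance planes
v = np.ones((2, 2), dtype=y.dtype)
packed = stack_yuv420(y, u, v)
print(packed.shape)      # (2, 2, 6)
print(packed[0, 0, :4])  # [0 1 4 5]
```

Rearranging the luminance plane keeps every sample, so no information is lost before the filter input is formed.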
  • a low-resolution component may be selected, and the low-resolution component may be up-sampled.
  • the low-resolution components can also be up-sampled (that is, adjusted upward).
  • for the low-resolution components, not only up-sampling processing but also deconvolution processing or even super-resolution processing can be performed. The effects of these three kinds of processing are the same, and the embodiment of the present application does not specifically limit this.
  • the three components contained in the original image are the original image components before other processing is performed. If the first image component is the luminance component, the second image component is the first chrominance component, and the third image component is the second chrominance component; then the low-resolution image component is the second image component or the third image component. It is necessary to perform up-sampling processing on the second image component or the third image component.
  • the size information of the second image component and the third image component of the original image is $\frac{H}{2} \times \frac{W}{2}$ in both cases; before filtering, they can be converted into $H \times W$ form by upsampling; since the size information of the first image component is $H \times W$, this also makes the spatial size information of the three image components the same.
  • the second image component after upsampling processing and the third image component after upsampling processing will maintain the same resolution as the first image component.
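As a rough illustration of the up-sampling alternative, the sketch below doubles the chrominance planes with nearest-neighbour repetition and stacks them with the luminance plane. Nearest-neighbour interpolation and the names `upsample_nearest_2x` and `stack_upsampled` are assumptions made for this example; the text equally allows deconvolution or super-resolution.

```python
import numpy as np

def upsample_nearest_2x(plane: np.ndarray) -> np.ndarray:
    """Double height and width by repeating every sample (nearest neighbour)."""
    return np.repeat(np.repeat(plane, 2, axis=0), 2, axis=1)

def stack_upsampled(y: np.ndarray, u: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Stack Y with the up-sampled U and V planes into an (H, W, 3) tensor."""
    return np.stack([y, upsample_nearest_2x(u), upsample_nearest_2x(v)], axis=-1)

y = np.zeros((4, 4), dtype=np.uint8)             # toy 4x4 luminance plane
u = np.array([[1, 2], [3, 4]], dtype=np.uint8)   # toy 2x2 chrominance planes
v = u + 10
x = stack_upsampled(y, u, v)
print(x.shape)    # (4, 4, 3)
print(x[..., 1])  # U plane at luminance resolution
```

Compared with pixel rearrangement, this route keeps the luminance plane untouched but enlarges the chrominance data fed to the filter.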
  • side information can be used to assist filtering and improve filtering quality.
  • the side information can be not only block division information (such as CU division information and/or CTU division information), but also quantization parameter information, or even motion vector (MV, Motion Vector) information, prediction direction information, etc.; these kinds of information can be used as side information alone or in any combination, for example, block division information alone, block division information together with quantization parameter information, or block division information together with MV information, which is not specifically limited in the embodiment of the present application.
  • the at least two components and at least one type of side information are input into the neural network-based filter for processing, which may include processing methods such as component processing, fusion processing, joint processing, and branch processing; these are not specifically limited in the embodiment of the present application.
  • At least two components of the pixel information to be filtered and at least one type of side information are used as input values and input into a neural network-based filter, and at least one component of the pixel information to be filtered can be output.
  • the structure of the neural network includes at least a joint processing stage and an independent processing stage; in the joint processing stage, all components are processed together; in the independent processing stage, each component is processed on an independent branch of the neural network.
  • S303 may include:
  • S3031 Process each of the at least two components separately to obtain at least two components after processing
  • S3032 Perform fusion processing according to at least one kind of side information and at least two processed components to obtain fusion information of the pixel information to be filtered;
  • S3033 Process the fusion information to obtain at least one component filtered by the pixel information to be filtered.
  • for S3031, the processing can be regarded as a splitting stage, used to obtain the at least two components separately; the fusion information includes at least information obtained by fusing the at least two components; for S3032, the processing can be regarded as a merging stage, used to fuse the at least two components; in this way, the embodiment of the present application adopts a cascaded processing structure.
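The cascade of S3031 to S3033 can be sketched as three small functions. This is only a shape-level illustration: `branch`, `fuse`, and `joint` are hypothetical stand-ins (a scalar gain and a 1×1 channel mix) for the learned network layers, and the components are assumed to have already been brought to the same spatial size.

```python
import numpy as np

def branch(component: np.ndarray, gain: float) -> np.ndarray:
    """S3031 stand-in: process one component on its own branch (here a scalar gain)."""
    return component * gain

def fuse(processed: list, side_info: list) -> np.ndarray:
    """S3032 stand-in: stack processed components and side information as channels."""
    return np.stack(processed + side_info, axis=-1)

def joint(fusion: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """S3033 stand-in: mix channels with a (C_in, C_out) matrix, like a 1x1 convolution."""
    return fusion @ weights

h, w = 4, 4
y = np.ones((h, w))
u = np.full((h, w), 2.0)
v = np.full((h, w), 3.0)
cu_map = np.ones((h, w))  # toy side information plane

fusion = fuse([branch(c, 0.5) for c in (y, u, v)], [cu_map])
out = joint(fusion, np.ones((4, 3)) / 4)
print(fusion.shape, out.shape)  # (4, 4, 4) (4, 4, 3)
```

In a real filter each stand-in would be replaced by trained convolutional layers; the sketch only shows how the components are split, merged with side information, and then processed together.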
  • S3031 may include:
  • Component processing is performed on each of the at least two components to obtain at least two components after processing.
  • the original image component YUV of the original image block is obtained, and Y, U, and V are respectively processed to obtain the YUV of the image block to be filtered, that is, at least two components of the image block to be filtered can be YU, or YV, or UV.
  • S3031 may include:
  • the first side information includes at least block division information and/or quantization parameter information.
  • the first side information can be used for auxiliary filtering to improve filtering quality.
  • the first side information can be not only block division information (such as CU division information and/or CTU division information), but also quantization parameter information, or even motion vector (MV, Motion Vector) information, prediction direction information, etc.; these kinds of information can be used as the first side information alone or in any combination, for example, block division information alone, block division information together with quantization parameter information, or block division information together with MV information, which is not specifically limited in the embodiment of the present application.
  • S3031 can be regarded as the first splitting stage.
  • after component processing (such as deep learning) is performed on each of the at least two components, the first side information corresponding to each component can be added to the corresponding component to obtain at least two processed components; that is, in the first splitting stage, the first side information may or may not be fused, which is not specifically limited in this embodiment of the application.
  • the CU division information may be used as the block division information corresponding to each component of the pixel information to be filtered.
  • the first value is filled in at each pixel position corresponding to a CU boundary, and the second value is filled in at the other pixel positions, to obtain the first matrix corresponding to the CU division information, where the first value is different from the second value; the first matrix is taken as the block division information corresponding to each component of the pixel information to be filtered.
  • the first value can be a preset value, letter, etc.
  • the second value can also be a preset value, letter, etc.
  • the first value is different from the second value; for example, the first value can be set to 2 and the second value can be set to 1, but the embodiment of the present application does not specifically limit this.
  • The CU division information may be used as the first side information to assist the filtering of the pixel information to be filtered. That is, in the process of encoding the original image in the video to be encoded, the CU division information can be fully utilized by fusing it with at least two components of the pixel information to be filtered and then using it to guide filtering.
  • The CU division information is converted into a coding unit map (CUmap) represented by a two-dimensional matrix, namely the CUmap matrix, which is the first matrix in the embodiments of this application. Taking the first image component of the original image as an example, the component can be divided into multiple CUs; each pixel position on the boundary of each CU is filled with the first value, and every other pixel position is filled with the second value, so that a first matrix reflecting the CU division information is constructed.
  • FIG. 4 is a schematic structural diagram of a block division matrix provided by an embodiment of this application. As shown in FIG. 4, each pixel position corresponding to a CU boundary is filled with 2 and every other pixel position is filled with 1; that is, the pixel positions filled with 2 indicate the CU boundaries, so that the CU division information, i.e. the first side information corresponding to the first image component of the pixel information to be filtered, can be determined.
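  • The construction of the CUmap matrix can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiments: the function name `build_cu_map`, the `(top, left, height, width)` rectangle format, and the choice of treating the outermost one-pixel ring of each CU as its boundary are all hypothetical.

```python
import numpy as np

def build_cu_map(height, width, cu_rects, boundary_value=2, interior_value=1):
    """Return an H x W matrix in which CU boundary pixels hold boundary_value.

    cu_rects is a hypothetical list of (top, left, cu_height, cu_width) tuples
    describing how the image component is divided into CUs.
    """
    cu_map = np.full((height, width), interior_value, dtype=np.uint8)
    for top, left, cu_h, cu_w in cu_rects:
        bottom, right = top + cu_h - 1, left + cu_w - 1
        cu_map[top, left:right + 1] = boundary_value      # top edge
        cu_map[bottom, left:right + 1] = boundary_value   # bottom edge
        cu_map[top:bottom + 1, left] = boundary_value     # left edge
        cu_map[top:bottom + 1, right] = boundary_value    # right edge
    return cu_map

# An 8x8 component divided into two 4x4 CUs and one 4x8 CU (made-up layout).
cu_map = build_cu_map(8, 8, [(0, 0, 4, 4), (0, 4, 4, 4), (4, 0, 4, 8)])
print(cu_map)
```

  • The resulting matrix has the same size as the image component, so it can be handled like an extra input plane when it is fused with that component.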
  • The CU division information of the first image component may be different from that of the second image component or the third image component. When they are different, the CU division information corresponding to the first image component of the pixel information to be filtered and the CU division information corresponding to the second or third image component need to be determined separately, and each is then fused, as first side information, into its corresponding image component. When they are the same, it suffices to determine the CU division information of any one of the first, second, or third image components, and the determined CU division information is then fused, as first side information, into the corresponding first, second, or third image component. In this way, the at least two new components obtained can subsequently be fused in order to perform pre-processing filtering or post-processing filtering on the pixel information to be filtered.
  • The first side information corresponding to each component may be determined as follows: based on the original image in the video to be encoded, the quantization parameter corresponding to each of the at least two components of the original image block is obtained, and the quantization parameter is used as the quantization parameter information corresponding to each component of the pixel information to be filtered.
  • Using the quantization parameter as the quantization parameter information corresponding to each component of the pixel information to be filtered may consist of establishing, for each component, a second matrix with the same size as that component of the original image, where each pixel position in the second matrix is filled with the normalized value of the quantization parameter corresponding to that component of the original image; the second matrix is then used as the quantization parameter information corresponding to each component of the pixel information to be filtered.
  • During training, the filter network can thus adaptively acquire the ability to handle any quantization parameter.
  • The quantization parameter information may also be used as the first side information to assist the filtering of the image block to be filtered. That is, in the process of encoding the original image in the video to be encoded, the quantization parameter information can be fully utilized by fusing it with at least two components of the pixel information to be filtered and then using it to guide filtering.
  • The quantization parameter information can be normalized, or it can be processed in a non-normalized way (such as classification processing or interval division processing); normalization of the quantization parameter is taken as the example in the detailed description below.
  • The quantization parameter information is converted into a second matrix reflecting the quantization parameter information. Taking the first image component of the original image as an example, a matrix with the same size as the first image component of the original image is established, and each pixel position in the matrix is filled with the normalized value of the quantization parameter corresponding to the first image component of the original image. The normalized value of the quantization parameter is denoted $QP_{map}(x, y)$, namely:

    $$QP_{map}(x, y) = \frac{QP}{QP_{max}}$$

  • Here, $QP$ represents the quantization parameter value corresponding to the first image component of the original image, $x$ represents the abscissa of each pixel position in the first image component of the original image, $y$ represents the ordinate of each pixel position in the first image component of the original image, and $QP_{max}$ represents the maximum value of the quantization parameter.
  • The value of $QP_{max}$ is 51, but $QP_{max}$ can also take other values, such as 29 or 31; this is not specifically limited in the embodiments of the present application.
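  • The second matrix can be sketched in a few lines. The sketch assumes that the normalized value is the quantization parameter divided by its maximum value (51 by default); the function name `build_qp_map` and the example sizes are hypothetical.

```python
import numpy as np

def build_qp_map(height, width, qp, qp_max=51):
    """Return an H x W matrix filled with the normalized quantization parameter.

    Assumption: normalization means dividing qp by qp_max, so every pixel
    position holds the same value in the range [0, 1].
    """
    return np.full((height, width), qp / qp_max, dtype=np.float32)

# A 4x6 image component encoded with QP = 37 (made-up numbers).
qp_map = build_qp_map(4, 6, qp=37)
print(qp_map.shape, float(qp_map[0, 0]))
```

  • Because the matrix has the same size as the image component, it can be fused with that component in the same way as the block division matrix.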
  • S3031 may include:
  • the second side information is different from the first side information.
  • S3032 may include:
  • the first side information and the second side information can be used for auxiliary filtering to improve filtering quality.
  • The first side information and the second side information may each be one or more of block division information, quantization parameter information, MV information, and prediction direction information. For example, when the first side information is block division information, the second side information may be quantization parameter information; when the first side information is quantization parameter information, the second side information may be block division information; when the first side information is block division information and quantization parameter information, the second side information may be MV information; or, when the first side information is block division information, the second side information may be quantization parameter information and MV information. This is not specifically limited in the embodiments of the present application.
  • The fusion stages of the first side information and the second side information may be the same or different. Assume that the first branching stage denotes the processing stage corresponding to the at least two components of the pixel information to be filtered, the merging stage denotes the processing stage corresponding to the fusion information of the pixel information to be filtered, and the second branching stage denotes the processing stage in which the residual information of each component is determined after the fusion processing. The fusion stage of the first side information can then be any one of the first branching stage, the merging stage, or the second branching stage, and the fusion stage of the second side information can likewise be any one of these three stages. For example, the fusion stages of the first side information and the second side information may both be the first branching stage, or both be the merging stage. This is not specifically limited in the embodiments of the present application.
  • S3033 may include:
  • Joint processing and branch processing are performed on the fusion information to obtain residual information corresponding to at least one of the at least two components;
  • At least one component of the at least two components and the residual information corresponding to the at least one component are summed to obtain at least one component after filtering the pixel information to be filtered.
  • The pre-processing filtering or post-processing filtering in the embodiments of the present application adopts a multi-stage cascaded processing structure, such as a split-merge-split processing structure, a split-merge processing structure, or a merge-split processing structure, which is not specifically limited in the embodiments of the present application.
  • Specifically, suppose that at least two components of the pixel information to be filtered are first obtained separately (the first branching stage) and are then fused (the merging stage). After all the information has been fused, when multiple components need to be output at the same time, the fusion information is jointly processed to obtain the residual information corresponding to the first image component, the residual information corresponding to the second image component, and the residual information corresponding to the third image component. The first image component is then summed with its residual information, the second image component with its residual information, and the third image component with its residual information, to obtain the filtered first image component, the filtered second image component, and the filtered third image component of the pixel information to be filtered. This process is the second branching stage, so the entire pre-processing filtering or post-processing filtering process adopts a split-merge-split processing structure.
  • The pre-processing filtering or post-processing filtering in the embodiments of the present application may also adopt a processing structure with more cascaded stages, such as a split-merge-split-merge-split processing structure.
  • The embodiments of the present application may adopt a typical cascade structure, such as a split-merge-split processing structure; a cascaded processing structure with fewer stages than the typical one, such as a split-merge processing structure or a merge-split processing structure; or even a cascaded processing structure with more stages than the typical one, such as a split-merge-split-merge-split processing structure. This is not specifically limited in the embodiments of the present application.
  • the pre-processing filter or the post-processing filter may include a convolutional neural network filter.
  • the pre-processing filter or the post-processing filter may be a convolutional neural network filter, or may be another filter established by deep learning, which is not specifically limited in the embodiment of the present application.
  • The convolutional neural network filter, also called the CNN filter, is a type of feedforward neural network that includes convolution calculations and has a deep structure; it is one of the representative algorithms of deep learning.
  • the input layer of the CNN filter can process multi-dimensional data, such as the three image component (Y/U/V) channels of the original image in the video to be encoded.
  • FIG. 5 is a schematic structural diagram of a traditional CNN filter provided by an embodiment of the application.
  • As shown in FIG. 5, the traditional CNN filter 50 is an improvement based on the previous-generation video coding standard H.265/High Efficiency Video Coding (HEVC). It contains a 2-layer convolutional network structure and can replace the deblocking filter and the sample adaptive offset filter.
  • After the image to be filtered (denoted by F_in) is input to the input layer of the traditional CNN filter 50, it passes through the first convolutional layer F_1 (assuming a 3×3 convolution kernel and 64 feature maps) and the second convolutional layer F_2 (assuming a 5×5 convolution kernel and 32 feature maps) to obtain the residual information F_3; the image to be filtered F_in and the residual information F_3 are then summed to obtain the filtered image output by the traditional CNN filter 50 (denoted by F_out).
  • This convolutional network structure is also called a residual neural network, and it outputs the residual information corresponding to the image to be filtered.
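  • The residual structure of such a two-layer filter can be illustrated with a small NumPy sketch. It is a toy under stated assumptions, not the traditional CNN filter itself: the weights are random rather than trained, a ReLU activation is assumed between the layers, and the second layer maps directly back to the input channel count so that the residual can be added to the input. The names `conv2d_same` and `residual_filter` are hypothetical.

```python
import numpy as np

def conv2d_same(x, weights, bias):
    """Zero-padded 'same' convolution (cross-correlation form).

    x: (C_in, H, W); weights: (C_out, C_in, k, k) with odd k; bias: (C_out,).
    """
    c_out, _, k, _ = weights.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, h, w = x.shape
    out = np.zeros((c_out, h, w), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            patch = xp[:, dy:dy + h, dx:dx + w]  # shifted view, (C_in, H, W)
            out += np.einsum('oc,chw->ohw', weights[:, :, dy, dx], patch)
    return out + bias[:, None, None]

def residual_filter(f_in, w1, b1, w2, b2):
    """Two convolutional layers predict a residual that is added to the input."""
    f1 = np.maximum(conv2d_same(f_in, w1, b1), 0.0)  # ReLU is an assumption
    f3 = conv2d_same(f1, w2, b2)                     # residual information
    return f_in + f3                                 # filtered output

rng = np.random.default_rng(0)
f_in = rng.random((1, 8, 8))                          # one 8x8 image plane
w1 = rng.normal(scale=0.1, size=(64, 1, 3, 3))        # 3x3 kernels, 64 feature maps
b1 = np.zeros(64)
w2 = rng.normal(scale=0.1, size=(1, 64, 5, 5))        # 5x5 kernels back to 1 channel
b2 = np.zeros(1)
f_out = residual_filter(f_in, w1, b1, w2, b2)
print(f_out.shape)
```

  • Because the network only predicts a residual, a second layer whose weights are all zero leaves the input unchanged, which is a convenient sanity check for this kind of structure.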
  • In the traditional CNN filter 60, the three image components of the image to be filtered are processed independently, but they share the same filter network and the related parameters of that filter network.
  • FIG. 6A is a schematic structural diagram of another traditional CNN filter provided by an embodiment of this application, and FIG. 6B is a schematic structural diagram of yet another traditional CNN filter provided by an embodiment of this application. Referring to FIG. 6A and FIG. 6B, the traditional CNN filter 60 uses two filter networks: the filter network shown in FIG. 6A is dedicated to outputting the first image component, and the filter network shown in FIG. 6B is dedicated to outputting the second image component or the third image component.
  • The size of the first image component is $H \times W$. The first image component can be rearranged into the form $\frac{H}{2} \times \frac{W}{2} \times 4$; because the size of the second image component or the third image component is $\frac{H}{2} \times \frac{W}{2}$, the three image components are then combined into the form $\frac{H}{2} \times \frac{W}{2} \times 6$ and input to the traditional CNN filter 60.
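  • The rearrangement can be sketched as a space-to-depth operation. The sketch assumes that the second and third image components have half the height and half the width of the first, which is consistent with the 6-channel input described for the filter networks; the order in which the four sub-sampled planes of the first component are stacked is an assumption, and the function name `pack_components` is hypothetical.

```python
import numpy as np

def pack_components(y, u, v):
    """Pack an H x W first component and two (H/2) x (W/2) components
    into a single (H/2) x (W/2) x 6 array.

    The first component is split into four sub-sampled planes (one per
    position inside each 2x2 block); the stacking order is an assumption.
    """
    h, w = y.shape
    assert h % 2 == 0 and w % 2 == 0, "H and W must be even"
    assert u.shape == (h // 2, w // 2) and v.shape == (h // 2, w // 2)
    y4 = np.stack(
        [y[0::2, 0::2], y[0::2, 1::2], y[1::2, 0::2], y[1::2, 1::2]],
        axis=-1,
    )                                                     # (H/2, W/2, 4)
    return np.concatenate([y4, u[..., None], v[..., None]], axis=-1)

y = np.arange(24).reshape(4, 6)   # toy 4x6 first component
u = np.full((2, 3), 100)          # toy second component
v = np.full((2, 3), 200)          # toy third component
packed = pack_components(y, u, v)
print(packed.shape)
```

  • No sample is discarded by this packing: the four sub-sampled planes together contain every sample of the first component exactly once.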
  • For the filter network shown in FIG. 6A, after the input layer receives the image to be filtered F_in (assuming the size of the convolution kernel is N×N and the number of channels is 6), the image passes through the first convolutional layer F_1-Y (assuming the size of the convolution kernel is L1×L1, the number of convolution kernels is M, and the number of channels is 6) and the second convolutional layer F_2-Y (assuming the size of the convolution kernel is L2×L2, the number of convolution kernels is 4, and the number of channels is M) to obtain the residual information F_3-Y (assuming the size of the convolution kernel is N×N and the number of channels is 4); the input image to be filtered F_in and the residual information F_3-Y are then summed to obtain the filtered first image component output by the traditional CNN filter 60 (denoted by F_out-Y).
  • For the filter network shown in FIG. 6B, after the input layer receives the image to be filtered F_in (assuming the size of the convolution kernel is N×N and the number of channels is 6), the image passes through the first convolutional layer F_1-U (assuming the size of the convolution kernel is L1×L1, the number of convolution kernels is M, and the number of channels is 6) and the second convolutional layer F_2-U (assuming the size of the convolution kernel is L2×L2, the number of convolution kernels is 2, and the number of channels is M) to obtain the residual information F_3-U (assuming the size of the convolution kernel is N×N and the number of channels is 2); the input image to be filtered F_in and the residual information F_3-U are then summed to obtain the filtered second image component or the filtered third image component output by the traditional CNN filter 60 (denoted by F_out-U).
  • For the traditional CNN filter 50 shown in FIG. 5 and the traditional CNN filter 60 shown in FIG. 6A and FIG. 6B, processing each image component independently is not reasonable because the relationship between different image components is not considered; in addition, coding parameters such as block division information and QP information are not fully utilized at the input. The distortion of the reconstructed image mainly comes from blocking artifacts, and the boundary information of blocking artifacts is determined by the CU division information; that is, the filter network in the CNN filter should focus on the boundary areas. Moreover, incorporating quantization parameter information into the filter network also helps to improve its generalization ability, so that it can filter distorted images of any quality.
  • The filtering method provided by the embodiments of the present application has a reasonable CNN filter structure: the same filter network can receive multiple image components at the same time, fully considers the relationship between these image components, and can output the enhanced images of these image components at the same time. In addition, the filtering method can incorporate coding parameters such as block division information and/or QP information as coding information for auxiliary filtering, thereby improving filtering quality.
  • S3031 in the embodiments of the present application may determine the side information (such as the first side information or the second side information) corresponding to each of the first image component, the second image component, and the third image component of the pixel information to be filtered, in which case three new image components are obtained after fusion processing. Alternatively, the side information may be determined for only two components, namely the first and second image components, the first and third image components, or the second and third image components, in which case two new image components are obtained after fusion processing. This is not specifically limited in the embodiments of the present application.
  • The fusion information for the pixel information to be filtered can be obtained by directly fusing the at least two components, or by fusing the at least two components together with the corresponding side information (such as the first side information or the second side information); this is not specifically limited in the embodiments of the present application.
  • When the components are fused directly, the first image component, the second image component, and the third image component of the pixel information to be filtered can be fused to obtain the fusion information; alternatively, the first and second image components, the first and third image components, or the second and third image components can be fused to obtain the fusion information.
  • If the fusion information is obtained by fusing at least two components together with the corresponding side information (such as the first side information or the second side information), then the first image component, the second image component, the third image component, and the side information of the pixel information to be filtered can be fused to obtain the fusion information; alternatively, the first and second image components and the side information, the first and third image components and the side information, or even the second and third image components and the side information can be fused to obtain the fusion information.
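  • One simple way to picture these fusion options is channel stacking. The text does not state which fusion operation is used, so stacking along a channel axis is an assumption here, as is the requirement that all planes have the same size; the function name `fuse` is hypothetical.

```python
import numpy as np

def fuse(components, side_info=None):
    """Stack image components, and optionally side-information planes,
    along a new leading channel axis.

    Assumption: fusion is plain channel stacking of equally sized planes.
    """
    planes = list(components)
    if side_info is not None:
        planes.extend(side_info)
    return np.stack(planes, axis=0)

h, w = 4, 4
y, u, v = np.zeros((h, w)), np.ones((h, w)), np.full((h, w), 2.0)
cu_plane = np.full((h, w), 1.0)             # stand-in for a side-information plane

fused_all = fuse([y, u, v], [cu_plane])     # three components plus side information
fused_pair = fuse([y, u])                   # two components fused directly
print(fused_all.shape, fused_pair.shape)
```

  • Each combination listed above then differs only in which planes are passed to the fusion step.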
  • the corresponding coding information such as the first side information or the second side information
  • In S303 in the embodiments of the present application, after the components of the pixel information to be filtered (such as the first image component, the second image component, and the third image component) and the side information (such as the first side information or the second side information) are fused and input to the filter, the filter can output only the filtered first image component, only the filtered second image component, or only the filtered third image component of the pixel information to be filtered; it can also output the filtered first and second image components, the filtered second and third image components, or the filtered first and third image components; or it can even output the filtered first, second, and third image components. This is not specifically limited in the embodiments of the present application.
  • FIG. 7 is a schematic structural diagram of an optional filtering framework provided by an embodiment of the present application. As shown in FIG. 7, the filtering framework 70 may include three components of the pixel information to be filtered (denoted by Y, U, and V, respectively) 701, a first branching unit 702, first side information 703, a Y image component first processing unit 704, a U image component first processing unit 705, a V image component first processing unit 706, second side information 707, an input fusion unit 708, a joint processing unit 709, a second branching unit 710, a Y image component second processing unit 711, a U image component second processing unit 712, a V image component second processing unit 713, a first adder 714, a second adder 715, a third adder 716, and three filtered image components (denoted by Out_Y, Out_U, and Out_V, respectively) 717.
  • After the three image components 701 of the image block to be filtered pass through the first branching unit 702, they are divided into three signals: the Y image component, the U image component, and the V image component. The Y image component and its corresponding first side information 703 enter the Y image component first processing unit 704, the U image component and its corresponding first side information 703 enter the U image component first processing unit 705, and the V image component and its corresponding first side information 703 enter the V image component first processing unit 706, so that three new image components are output. The input fusion unit 708 fuses the three new image components with the second side information 707 and feeds the result to the joint processing unit 709. The joint processing unit 709 includes a multi-layer convolutional filter network that performs convolution calculations on the input information; since the specific convolution calculation process is similar to that of related technical solutions, the specific execution steps of the joint processing unit 709 are not described again.
  • After the joint processing unit 709, the information enters the second branching unit 710, which divides it into three signals again. The three signals are input to the Y image component second processing unit 711, the U image component second processing unit 712, and the V image component second processing unit 713, respectively, which yield in turn the residual information of the Y image component, the residual information of the U image component, and the residual information of the V image component. The Y image component of the three image components 701 of the image block to be filtered and the obtained residual information of the Y image component are input together to the first adder 714, whose output is the filtered Y image component (denoted by Out_Y); the U image component of the three image components 701 and the obtained residual information of the U image component are input together to the second adder 715, whose output is the filtered U image component (denoted by Out_U); and the V image component of the three image components 701 and the obtained residual information of the V image component are input together to the third adder 716, whose output is the filtered V image component (denoted by Out_V).
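  • The data flow through the filtering framework 70 can be sketched as below. Only the wiring follows the description; every processing unit is replaced by a trivial placeholder (stacking, averaging, scaling) instead of a trained network, all planes are assumed to have the same size, and the function names are hypothetical.

```python
import numpy as np

COMPONENTS = ("Y", "U", "V")

def first_processing(component, side_info):
    # Placeholder for units 704/705/706: attach side information to the component.
    return np.stack([component, side_info], axis=0)

def joint_processing(fused):
    # Placeholder for unit 709 (a multi-layer convolutional network in the text).
    return fused.mean(axis=0, keepdims=True)

def second_processing(shared):
    # Placeholder for units 711/712/713: predict a residual for one component.
    return 0.1 * shared[0]

def filter_framework_70(yuv, first_side, second_side, outputs=COMPONENTS):
    # First branching stage: each component is processed with its first side information.
    branches = [first_processing(yuv[n], first_side[n]) for n in COMPONENTS]
    # Merging stage: fuse the new components with the second side information.
    fused = np.concatenate(branches + [second_side[None]], axis=0)
    shared = joint_processing(fused)
    # Second branching stage: per-component residual, then the adders 714/715/716.
    return {n: yuv[n] + second_processing(shared) for n in outputs}

h, w = 4, 4
yuv = {n: np.full((h, w), float(i)) for i, n in enumerate(COMPONENTS)}
first_side = {n: np.ones((h, w)) for n in COMPONENTS}
second_side = np.zeros((h, w))

out = filter_framework_70(yuv, first_side, second_side)
only_y = filter_framework_70(yuv, first_side, second_side, outputs=("Y",))
print(sorted(out), sorted(only_y))
```

  • The `outputs` argument is a hypothetical convenience that mirrors the idea of producing only some of the filtered components from one shared joint-processing pass.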
  • If only the filtered Y image component needs to be output, the filtering framework 70 may omit the second branching unit 710, the second adder 715, and the third adder 716; if only the filtered U image component needs to be output, the filtering framework 70 may omit the second branching unit 710, the first adder 714, and the third adder 716; if the filtered Y image component and the filtered U image component need to be output, the filtering framework 70 may omit the third adder 716. This is not specifically limited in the embodiments of the present application.
  • FIG. 8 is a schematic structural diagram of another optional filtering framework provided by an embodiment of the application. As shown in FIG. 8, the filtering framework 80 may include two components of the pixel information to be filtered (denoted by Y and U, respectively) 801, a first branching unit 702, first side information 703, a Y image component first processing unit 704, a U image component first processing unit 705, an input fusion unit 708, a joint processing unit 709, a Y image component second processing unit 711, and a first adder 714. After the two image components 801 of the image block to be filtered pass through the first branching unit 702, they are divided into two signals: the Y image component and the U image component. The Y image component and its corresponding first side information 703 enter the Y image component first processing unit 704, and the U image component and its corresponding first side information 703 enter the U image component first processing unit 705, so that two new image components are output. The input fusion unit 708 fuses the two new image components and feeds the result to the joint processing unit 709. Because only a single image component (the filtered Y image component) needs to be output after the joint processing unit 709, the second branching unit 710 is not needed; the information can be input directly to the Y image component second processing unit 711 to obtain the residual information of the Y image component. The Y image component of the two image components 801 of the image block to be filtered and the obtained residual information of the Y image component are then input together to the first adder 714, whose output is the filtered Y image component (denoted by Out_Y).
  • The pre-processing filter and the post-processing filter in the embodiments of the present application may include at least the input fusion unit 708, the joint processing unit 709, and the first adder 714, the second adder 715, and the third adder 716; they may also include the first branching unit 702, the Y image component first processing unit 704, the U image component first processing unit 705, the V image component first processing unit 706, and so on; and they may even include the second branching unit 710, the Y image component second processing unit 711, the U image component second processing unit 712, the V image component second processing unit 713, and so on. This is not specifically limited in the embodiments of the present application.
  • The filtering method provided in the embodiments of the present application may adopt a split-merge-split processing structure, such as the filtering framework 70 shown in FIG. 7; a split-merge processing structure with fewer stages, such as the filtering framework 80 shown in FIG. 8; a merge-split processing structure with fewer stages; or a processing structure with more stages, such as split-merge-split-merge-split. This is not specifically limited in the embodiments of the present application.
  • The first side information and the second side information can both participate in the filtering process, as in the filtering framework 70 shown in FIG. 7; they can also participate selectively, as in the filtering framework 80 shown in FIG. 8, in which the second side information does not participate in the filtering process. In other words, both the first side information and the second side information may participate in the filtering process, the first side information alone may not participate, the second side information alone may not participate, or neither may participate; this is not specifically limited in the embodiments of the present application.
  • The fusion stages of the first side information and the second side information can be the same or different; that is, the two can participate in the filtering process at the same stage or at different stages, which is not specifically limited in the embodiments of the present application. Still taking the filtering framework 70 shown in FIG. 7 as an example, the first side information 703 and the second side information 707 can both participate in the filtering process in the stage corresponding to the first branching unit 702, both in the stage corresponding to the input fusion unit 708, or both in the stage corresponding to the second branching unit 710. Alternatively, the first side information 703 can participate in the stage corresponding to the first branching unit 702 while the second side information 707 participates in the stage corresponding to the input fusion unit 708; the first side information 703 can participate before the stage corresponding to the first branching unit 702 while the second side information 707 participates in the stage corresponding to the input fusion unit 708; the first side information 703 can participate before the stage corresponding to the first branching unit 702 while the second side information 707 participates in the stage corresponding to the second branching unit 710; or the first side information 703 can participate in the stage corresponding to the input fusion unit 708 while the second side information 707 participates in the stage corresponding to the second branching unit 710. In other words, the first side information 703 and the second side information 707 can each be fused at any of these stages.
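  • The space of stage choices can be enumerated directly. The sketch below only counts the combinations that arise when each of the two kinds of side information is assigned to one of the three named stages; the stage labels are hypothetical identifiers, and positions such as "before the first branching unit" are not modelled.

```python
from itertools import product

# Hypothetical labels for the three stages named in the description.
STAGES = ("first_branching", "merging", "second_branching")

# Each pair is (stage of first side information, stage of second side information).
combos = list(product(STAGES, repeat=2))

same_stage = [c for c in combos if c[0] == c[1]]
different_stage = [c for c in combos if c[0] != c[1]]

for first_stage, second_stage in combos:
    print(f"first side info -> {first_stage}, second side info -> {second_stage}")
```

  • Under these assumptions there are nine combinations in total, three in which both kinds of side information are fused at the same stage and six in which they are fused at different stages.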
  • The filtering framework 70 shown in FIG. 7 uses a deep learning network (such as a CNN) for filtering.
  • the difference from traditional CNN filters is that the filters in the embodiments of the present application adopt a cascaded processing structure.
  • the three components of the pixel information to be filtered can be simultaneously input into the filter network, and other coding-related side information (such as block division information, quantization parameter information, MV information and other coding parameters) can be incorporated, and these side information can be The same stage or different stages are integrated into the filter network; in this way, not only the relationship between the three components is fully utilized, but also other coding-related coding information is used to assist the filtering, which improves the filtering quality; in addition, for the three components Simultaneous processing also effectively avoids the problem of three complete network forward calculations for these three components, thereby reducing the computational complexity and saving the coding rate.
  • coding-related side information such as block division information, quantization parameter information, MV information and other coding parameters
  • FIG. 9 is a schematic structural diagram of another optional filtering framework provided by an embodiment of the present application.
  • the filtering framework 90 may include the three components of the pixel information to be filtered (represented by Y, U, and V, respectively) 901, first side information 902, a Y image component first processing unit 903, a U image component first processing unit 904, a V image component first processing unit 905, second side information 906, a fusion unit 907, a joint processing unit 908, a branching unit 909, a Y image component second processing unit 910, a U image component second processing unit 911, a V image component second processing unit 912, a first adder 913, a second adder 914, a third adder 915, and the three filtered image components (represented by Out_Y, Out_U, and Out_V, respectively) 916.
  • the three components 901 of the pixel information to be filtered are subjected to component processing and divided into three signals: the Y image component, the U image component, and the V image component. The Y image component and its corresponding first side information 902 enter the Y image component first processing unit 903, the U image component and its corresponding first side information 902 enter the U image component first processing unit 904, and the V image component and its corresponding first side information 902 enter the V image component first processing unit 905; this outputs three new image components;
  • the fusion unit 907 fuses the three new image components with the second side information 906 and inputs the result to the joint processing unit 908;
  • the joint processing unit 908 includes a multi-layer convolutional filter network that performs convolution calculations on the input information; because the specific convolution calculation process is similar to related technical solutions, the specific execution steps of the joint processing unit 908 are not described again.
  • the output of the joint processing unit 908 enters the branching unit 909, which divides it into three signals again; the three signals are input to the Y image component second processing unit 910, the U image component second processing unit 911, and the V image component second processing unit 912, which yield, in turn, the residual information of the Y image component, the residual information of the U image component, and the residual information of the V image component;
  • the Y image component in the three components 901 of the pixel information to be filtered and the obtained residual information of the Y image component are input to the first adder 913, whose output is the filtered Y image component (denoted by Out_Y); the U image component in the three components 901 and the obtained residual information of the U image component are input to the second adder 914, whose output is the filtered U image component (denoted by Out_U); the V image component in the three components 901 and the obtained residual information of the V image component are input to the third adder 915, whose output is the filtered V image component (denoted by Out_V).
  • for the output components, if only the filtered Y image component needs to be output, the filtering framework 90 may omit the branching unit 909, the second adder 914, and the third adder 915; if only the filtered U image component needs to be output, the filtering framework 90 may omit the branching unit 909, the first adder 913, and the third adder 915; if the filtered Y image component and the filtered U image component need to be output, the filtering framework 90 may omit the third adder 915; the embodiments of the present application do not specifically limit this.
  • the neural network architecture provided by the embodiments of the present application makes reasonable and effective use of the components and the side information, and can bring better coding performance.
  • a filtering device obtains pixel information to be filtered, obtains at least one type of side information, and inputs at least two components of the pixel information to be filtered and the at least one type of side information into the neural-network-based filter, to output at least one filtered component of the pixel information to be filtered. That is, in the embodiments of the present application, at least two components of the pixel information to be filtered and at least one type of side information are obtained and input into the neural-network-based filter for processing, and the side information of at least one component is incorporated during filtering to obtain the filtered pixel information.
  • in this way, the relationship between multiple components is fully utilized, and the need to run multiple complete forward passes of the network for the at least two components is avoided. This reduces the computational complexity, saves bit rate, and improves the quality of the images obtained after pre-processing filtering and after post-processing filtering in the encoding and decoding process, thereby improving the quality of the reconstructed image.
  • FIG. 10 is a schematic structural diagram of an optional filtering device provided by an embodiment of the present application.
  • the filtering device may include a first acquiring module 101, a second acquiring module 102, and a determining module 103, where:
  • the first acquiring module 101 is configured to obtain pixel information to be filtered;
  • the second acquiring module 102 is configured to obtain at least one type of side information;
  • the determining module 103 is configured to input at least two components of the pixel information to be filtered and the at least one type of side information into the neural-network-based filter, to output at least one filtered component of the pixel information to be filtered.
  • the determining module 103 may include:
  • a first processing sub-module, configured to process each of the at least two components separately to obtain at least two processed components;
  • a fusion sub-module, configured to perform fusion processing on the at least one type of side information and the at least two processed components to obtain fusion information of the pixel information to be filtered;
  • a second processing sub-module, configured to process the fusion information to obtain at least one filtered component of the pixel information to be filtered.
  • the first processing sub-module may be specifically configured to perform component processing on each of the at least two components to obtain at least two processed components.
  • when the first side information corresponding to each component is acquired, the first processing sub-module may be specifically configured to fuse each of the at least two components with the first side information corresponding to that component, to obtain at least two processed components, where the first side information includes at least block division information and/or quantization parameter information.
  • the fusion sub-module may be specifically configured to fuse the at least two processed components with the first side information corresponding to each component, to obtain the fusion information of the pixel information to be filtered.
  • the second processing sub-module may be specifically configured to:
  • perform joint processing and branch processing on the fusion information to obtain residual information corresponding to at least one of the at least two components;
  • sum the at least one of the at least two components with the residual information corresponding to that component, to obtain at least one filtered component of the pixel information to be filtered.
  • when the second side information corresponding to each component is acquired, the first processing sub-module may be specifically configured to fuse each of the at least two components with the second side information corresponding to that component, to obtain at least two processed components, where the second side information is different from the first side information.
  • the fusion sub-module may be specifically configured to fuse the at least two processed components with the second side information corresponding to each component, to obtain the fusion information of the pixel information to be filtered.
  • the structure of the neural network includes at least a joint processing stage and an independent processing stage; in the joint processing stage, all components are processed together; in the independent processing stage, each component is processed on an independent branch of the neural network.
  • a "unit" may be a part of a circuit, a part of a processor, or a part of a program or software; it may also be a module, or may be non-modular.
  • the various components in this embodiment may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be realized in the form of hardware or in the form of a software function module.
  • if the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of this embodiment, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions that enable a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the method described in this embodiment.
  • the aforementioned storage media include USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical discs, and other media that can store program code.
  • FIG. 11 is a schematic structural diagram of an optional encoder provided by an embodiment of the present application. As shown in FIG. 11, an embodiment of the present application provides an encoder 1100.
  • the encoder 1100 includes a processor 1101 and a storage medium 1102 storing instructions executable by the processor 1101.
  • the storage medium 1102 relies on the processor 1101 to perform operations through the communication bus 1103; when the instructions are executed by the processor 1101, the filtering method of the foregoing embodiments is executed.
  • the communication bus 1103 is used to implement connection and communication between these components.
  • the communication bus 1103 also includes a power bus, a control bus, and a status signal bus.
  • the various buses are all marked as the communication bus 1103 in FIG. 11.
  • An embodiment of the present application provides a computer storage medium that stores executable instructions; when the executable instructions are executed by one or more processors, the processors execute the filtering method described in one or more of the embodiments above.
  • the memory in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory can be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory.
  • the volatile memory may be a random access memory (RAM), which is used as an external cache.
  • many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDRSDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchronous link dynamic random access memory (Synchlink DRAM, SLDRAM), and direct Rambus random access memory (DRRAM).
  • the processor may be an integrated circuit chip with signal processing capabilities.
  • the steps of the above method can be completed by hardware integrated logic circuits in the processor or by instructions in the form of software.
  • the aforementioned processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present application can be implemented or executed.
  • the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a storage medium that is mature in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers.
  • the storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
  • the embodiments described herein can be implemented by hardware, software, firmware, middleware, microcode, or a combination thereof.
  • the processing unit can be implemented in one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field-programmable gate arrays (FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in this application, or a combination thereof.
  • the technology described herein can be implemented through modules (such as procedures and functions) that perform the functions described herein.
  • the software code can be stored in the memory and executed by the processor.
  • the memory can be implemented in the processor or external to the processor.
  • the method of the above embodiments can be implemented by means of software plus the necessary general-purpose hardware platform; it can also be implemented by hardware, but in many cases the former is the better implementation.
  • the technical solution of this application, in essence, or the part of it that contributes to the prior art, can be embodied in the form of a software product; the computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions that enable a terminal (which may be a mobile phone, a computer, a server, a network device, etc.) to execute the method described in each embodiment of the present application.
  • the filtering device obtains pixel information to be filtered, obtains at least one type of side information, and inputs at least two components of the pixel information to be filtered and the at least one type of side information into the neural-network-based filter, to output at least one filtered component of the pixel information to be filtered. That is, in the embodiments of the present application, at least two components of the pixel information to be filtered and at least one type of side information are obtained and input into the neural-network-based filter for processing, and the side information of at least one component is incorporated during filtering to obtain the filtered pixel information.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

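The embodiments of the present application disclose a filtering method, a device, an encoder, and a computer storage medium. The method includes: obtaining pixel information to be filtered; obtaining at least one type of side information; and inputting at least two components of the pixel information to be filtered and the at least one type of side information into a neural-network-based filter, to output at least one filtered component of the pixel information to be filtered. The embodiments of the present application also provide a filtering device, an encoder, and a computer storage medium.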

Description

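Filtering method, device, encoder, and computer storage medium
Technical Field
The embodiments of the present application relate to the technical field of video image processing, and in particular to a filtering method, a device, an encoder, and a computer storage medium.
Background
In video codec systems, most video coding uses a hybrid coding framework based on block-shaped coding units (CU, Coding Unit). Adjacent CUs use different coding parameters, such as different transform processes, different quantization parameters (QP, Quantization Parameter), different prediction modes, and different reference image frames; moreover, the magnitude and the distribution of the error introduced by each CU are mutually independent. The resulting discontinuity at the boundaries of adjacent CUs produces blocking artifacts, which degrade the subjective and objective quality of the reconstructed image and can even affect the prediction accuracy of subsequent encoding and decoding.
Thus, in the encoding and decoding process, a pre-processing filter pre-processes the original image to reduce the video resolution: because the resolution of the video to be coded is lower than that of the original video, fewer bits are needed to represent it, which improves the overall coding efficiency. A post-processing filter processes the video after in-loop filtering to produce the output video and to raise the video resolution. For both the pre-processing filter and the post-processing filter, current neural-network-based filters are often composed of multiple basic units, and the input of the neural network is a single input or multiple inputs, that is, a single image component or multiple image components. As a result, existing convolutional neural networks have high complexity, and current convolutional neural network (CNN, Convolutional Neural Network) filters do not make full, combined use of the relevant information, so the improvement in reconstructed image quality is limited.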
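Summary
The embodiments of the present application provide a filtering method, a device, an encoder, and a computer storage medium that can reduce the complexity of neural-network-based filtering and help improve the image quality of reconstructed images.
The technical solutions of the embodiments of the present application can be implemented as follows:
In a first aspect, an embodiment of the present application provides a filtering method, the method including:
obtaining pixel information to be filtered;
obtaining at least one type of side information;
inputting at least two components of the pixel information to be filtered and the at least one type of side information into a neural-network-based filter, to output at least one filtered component of the pixel information to be filtered.
In a second aspect, an embodiment of the present application provides a filtering device, the filtering device including:
a first acquiring module, configured to obtain pixel information to be filtered;
a second acquiring module, configured to obtain at least one type of side information;
a determining module, configured to input at least two components of the pixel information to be filtered and the at least one type of side information into a neural-network-based filter, to output at least one filtered component of the pixel information to be filtered.
In a third aspect, an embodiment of the present application provides an encoder, the encoder including:
a processor and a storage medium storing instructions executable by the processor, where the storage medium relies on the processor to perform operations through a communication bus, and when the instructions are executed by the processor, the filtering method described in one or more of the foregoing embodiments is executed.
In a fourth aspect, an embodiment of the present application provides a computer storage medium storing executable instructions; when the executable instructions are executed by one or more processors, the processors execute the filtering method described in one or more of the foregoing embodiments.
The embodiments of the present application provide a filtering method, a device, an encoder, and a computer storage medium. First, the filtering device obtains pixel information to be filtered, obtains at least one type of side information, and inputs at least two components of the pixel information to be filtered and the at least one type of side information into a neural-network-based filter, to output at least one filtered component of the pixel information to be filtered. That is, in the embodiments of the present application, at least two components of the pixel information to be filtered and at least one type of side information are obtained and input into the neural-network-based filter for processing, and the side information of at least one component is incorporated during filtering to obtain the filtered pixel information. In this way, the relationship between multiple components is fully utilized, and the need to run multiple complete forward passes of the network for the at least two components is avoided. This reduces the computational complexity, saves bit rate, and improves the quality of the images obtained after pre-processing filtering and after post-processing filtering in the encoding and decoding process, thereby improving the quality of the reconstructed image.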
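Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of a traditional encoding block diagram;
FIG. 2 is a schematic structural diagram of a traditional decoding block diagram;
FIG. 3 is a schematic flowchart of an optional filtering method provided by an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a block division matrix provided by an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a traditional CNN filter provided by an embodiment of the present application;
FIG. 6A is a schematic structural diagram of another traditional CNN filter provided by an embodiment of the present application;
FIG. 6B is a schematic diagram of the composition and structure of yet another traditional CNN filter provided by an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an optional filtering framework provided by an embodiment of the present application;
FIG. 8 is a schematic structural diagram of another optional filtering framework provided by an embodiment of the present application;
FIG. 9 is a schematic structural diagram of yet another optional filtering framework provided by an embodiment of the present application;
FIG. 10 is a schematic structural diagram of an optional filtering device provided by an embodiment of the present application;
FIG. 11 is a schematic structural diagram of an optional encoder provided by an embodiment of the present application.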
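Detailed Description
For a more detailed understanding of the features and technical content of the embodiments of the present application, their implementation is described in detail below with reference to the accompanying drawings. The drawings are for reference and illustration only and are not intended to limit the embodiments of the present application.
In a video codec system, the video to be encoded includes original image frames, and each original image frame includes an original image. The original image undergoes various kinds of processing, such as prediction, transform, quantization, reconstruction, and filtering. During this processing, the pixel values of the processed video image may shift relative to the original image, causing visual impairments or artifacts. In addition, under the block-shaped-CU-based hybrid coding framework used by most video codec systems, adjacent coding blocks use different coding parameters (such as different transform processes, different QPs, different prediction modes, and different reference image frames), and the magnitude and the distribution of the error introduced by each coding block are mutually independent, so the boundaries of adjacent coding blocks are discontinuous and blocking artifacts arise. These distortions degrade the subjective and objective quality of the reconstructed image block; if the reconstructed image block serves as a reference image for subsequently coded pixels, they even affect the prediction accuracy of subsequent encoding and decoding, and in turn the number of bits in the video bitstream. Therefore, video codec systems often add a pre-processing filter and a post-processing filter to improve the subjective and objective quality of the reconstructed image.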
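FIG. 1 is a schematic structural diagram of a traditional encoding block diagram. As shown in FIG. 1, the traditional encoding block diagram 10 may include a transform and quantization unit 101, an inverse transform and inverse quantization unit 102, a prediction unit 103, an in-loop filtering unit 104, an entropy coding unit 105, and other components; the prediction unit 103 further includes an intra prediction unit 1031 and an inter prediction unit 1032. For an input original image, coding tree units (CTU, Coding Tree Unit) can be obtained through preliminary partitioning, and CUs can be obtained by further content-adaptive partitioning of a CTU; a CU generally contains one or more coding blocks (CB, Coding Block). Intra prediction by the intra prediction unit 1031 or inter prediction by the inter prediction unit 1032 is performed on a coding block to obtain residual information. The transform and quantization unit 101 transforms the coding block using this residual information, which includes transforming the residual information from the pixel domain to the transform domain and quantizing the resulting transform coefficients to further reduce the bit rate. After the prediction mode is determined, the prediction unit 103 also provides the selected intra prediction data or inter prediction data to the entropy coding unit 105. In addition, the inverse transform and inverse quantization unit 102 is used to reconstruct the coding block: a residual block is reconstructed in the pixel domain, the in-loop filtering unit 104 removes blocking artifacts from the reconstructed residual block, and the reconstructed residual block is then added to a decoded picture buffer unit to produce a reconstructed reference image. The entropy coding unit 105 encodes the various coding parameters and the quantized transform coefficients; for example, the entropy coding unit 105 uses header information coding and the context-based adaptive binary arithmetic coding (CABAC, Context-based Adaptive Binary Arithmetic Coding) algorithm, which can be used to encode coding information indicating the determined prediction mode, and outputs the corresponding bitstream.
For the traditional encoding block diagram 10 in FIG. 1, the in-loop filtering unit 104 is a loop filter, also called an in-loop filter (In-Loop Filter); it may include a de-blocking filter (DBF, De-Blocking Filter), a sample adaptive offset (SAO, Sample Adaptive Offset) filter, an adaptive loop filter (ALF, Adaptive Loop Filter), and so on.
For the traditional encoding block diagram 10 in FIG. 1, the pre-processing filtering unit 106 receives the input original video frames and performs pre-processing filtering on the original image frames in them to reduce the video resolution; the post-processing filtering unit 107 receives the in-loop-filtered video frames and performs post-processing filtering on them to raise the video resolution. In this way, the reconstructed video frames can be obtained with fewer bits during encoding and decoding, which improves the overall codec efficiency. However, the neural networks currently used by both the pre-processing filter and the post-processing filter take a single input or multiple inputs, that is, a single image component or multiple image components; as a result, existing convolutional neural networks have high complexity, and current CNN filters do not make full, combined use of the relevant information, so the improvement in reconstructed image quality is limited.
Similar to the encoding block diagram in FIG. 1, FIG. 2 is a schematic structural diagram of a traditional decoding block diagram. As shown in FIG. 2, the traditional decoding block diagram 20 may include an entropy decoding unit 201, an inverse quantization and inverse transform unit 202, a prediction unit 203, an in-loop filtering unit 204, a post-processing filtering unit 205, and other components; the prediction unit 203 further includes an intra prediction unit 2031 and an inter prediction unit 2032. It should be noted that the video decoding process is the reverse of the video encoding process; in the video decoding process, the image obtained after post-processing filtering is determined as the reconstructed video frame. As can be seen from FIG. 2, the decoding process does not involve the pre-processing filtering unit used in the encoding process.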
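The embodiments of the present application provide a filtering method that is applied in a filtering device. The filtering device can be arranged in the pre-processing filter and the post-processing filter of an encoder, or in the post-processing filter of a decoder; the embodiments of the present application do not specifically limit this.
FIG. 3 is a schematic flowchart of an optional filtering method provided by an embodiment of the present application. Referring to FIG. 3, the filtering method may include:
S301: obtain pixel information to be filtered;
S302: obtain at least one type of side information;
S303: input at least two components of the pixel information to be filtered and the at least one type of side information into a neural-network-based filter, to output at least one filtered component of the pixel information to be filtered.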
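Here, the pixel information to be filtered refers to an image block to be filtered, represented by pixel values, and the image block to be filtered includes three image components. The at least two components may be any two of the three image components, or all three. The image components may include a first image component, a second image component, and a third image component; the embodiments of the present application are described taking the first image component as the luma component, the second image component as the first chroma component, and the third image component as the second chroma component.
The at least one type of side information may be at least one of the side information corresponding to the first image component, the side information corresponding to the second image component, and the side information of the third image component.
It should be noted that an original image frame can be divided into CTUs, and a CTU can be divided into CUs; that is, the block division information in the embodiments of the present application may refer to CTU division information or to CU division information. The filtering method of the embodiments of the present application can therefore be applied not only to CU-level pre-processing filtering or post-processing filtering but also to CTU-level pre-processing filtering or post-processing filtering, which is not specifically limited.
The pixel information to be filtered is an original image block, represented by pixel values, in the video to be encoded; or the pixel information to be filtered is an image block, represented by pixel values, obtained after in-loop filtering of the video to be encoded during video encoding.
It should be noted that, when the original image in the video to be encoded is video-encoded, it undergoes CU partitioning, prediction, transform, quantization, and similar processing; to obtain a reference image for video-encoding subsequent images to be encoded, inverse transform and inverse quantization, reconstruction, filtering, and similar processing may also be performed. That is, the pixel information to be filtered may be an image block of an original image that has just been input to the encoder, which is the case when the method is applied in the pre-processing filter; or it may be an image block that has just been obtained through in-loop filtering, which is the case when the method is applied in the post-processing filter. The pixel information to be filtered is obtained in this way.
That is, after the pixel information to be filtered is obtained, at least two components of the pixel information to be filtered, and at least one type of side information, can be obtained.
It should be noted that, in video images, a first image component, a second image component, and a third image component are generally used to represent the original image or the image to be filtered. Under the luma-chroma representation, these three image components are one luma component, one blue chroma (color difference) component, and one red chroma (color difference) component; specifically, the luma component is usually denoted by the symbol Y, the blue chroma component is usually denoted by Cb or by U, and the red chroma component is usually denoted by Cr or by V.
In the embodiments of the present application, the at least one component denotes one or more of the first image component, the second image component, and the third image component, while the at least two components may be the first, second, and third image components, or the first and second image components, or the first and third image components, or even the second and third image components; the embodiments of the present application do not specifically limit this.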
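In the next-generation video coding standard (VVC, Versatile Video Coding), the corresponding test model is the VVC Test Model (VTM). When tests are run with the VTM, the current standard test sequences use the YUV 4:2:0 format; each frame of a video to be encoded in this format consists of three components: one luma component Y and two chroma components U and V. Assuming the original image in the video to be encoded has height H and width W, the size information of the first image component is H×W, and the size information of the second image component or the third image component is (H/2)×(W/2).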
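It should be noted that the embodiments of the present application are described taking the YUV 4:2:0 format as an example, but the filtering method of the embodiments is equally applicable to other sampling formats.
Taking the YUV 4:2:0 format as an example, the size information of the first image component differs from that of the second or third image component. To input the first image component and/or the second image component and/or the third image component into the neural-network-based filter at one time, the three components need to be sampled or reorganized so that their spatial size information is the same.
In some embodiments, pixel rearrangement processing (which can also be called down-sampling processing) can be performed on the high-resolution image component so that the spatial size information of the three components is the same. Specifically, before S302, the high-resolution image component can be selected from the at least two components of the pixel information to be filtered, and pixel rearrangement processing can be performed on it.
It should be noted that, before any other processing, the three components contained in the original image are the original image components. If the first image component is the luma component, the second image component is the first chroma component, and the third image component is the second chroma component, then the high-resolution image component is the first image component, and pixel rearrangement processing needs to be performed on the first image component.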
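For example, taking an original image of size 2×2, it is converted into 4 channels; that is, a 2×2×1 tensor is rearranged into a 1×1×4 tensor. Then, when the size information of the first image component of the original image is H×W, it can be converted by pixel rearrangement into the form (H/2)×(W/2)×4 before filtering. Since the size information of the second image component and the third image component is (H/2)×(W/2) in both cases, the spatial size information of the three image components is then the same. The three image components after pixel rearrangement are subsequently merged into the form (H/2)×(W/2)×6 and input to the pre-processing filter or the post-processing filter.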
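The pixel rearrangement described above can be sketched as follows. This is a minimal illustration in plain Python, not the implementation of the application: the function names `pixel_rearrange` and `merge_yuv420` are hypothetical, and the order of the four channels taken from each 2×2 block (left to right, top to bottom) and the placement of U and V after them are assumptions, since the text fixes only the shapes H×W → (H/2)×(W/2)×4 and the merged (H/2)×(W/2)×6.

```python
def pixel_rearrange(plane):
    """Rearrange an H x W plane into an (H/2) x (W/2) x 4 nested list.

    Each 2x2 block of the input becomes the 4 channels of one output
    position. The channel order is an assumption made for illustration.
    """
    h, w = len(plane), len(plane[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    return [
        [
            [plane[2 * i][2 * j], plane[2 * i][2 * j + 1],
             plane[2 * i + 1][2 * j], plane[2 * i + 1][2 * j + 1]]
            for j in range(w // 2)
        ]
        for i in range(h // 2)
    ]


def merge_yuv420(y, u, v):
    """Merge a rearranged Y plane with U and V into (H/2) x (W/2) x 6."""
    y4 = pixel_rearrange(y)
    return [
        [y4[i][j] + [u[i][j], v[i][j]] for j in range(len(u[0]))]
        for i in range(len(u))
    ]
```

For a 4×4 luma plane and 2×2 chroma planes, `merge_yuv420` returns a 2×2 grid whose entries each hold six values, which matches the (H/2)×(W/2)×6 form named in the text.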
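In some embodiments, up-sampling processing can instead be performed on the low-resolution image components so that the spatial size information of the three components is the same. Specifically, the low-resolution component can be selected from the at least two components of the pixel information to be filtered, and up-sampling processing can be performed on it.
It should be noted that, besides pixel rearrangement of the size information of the high-resolution component (that is, a downward adjustment), the embodiments of the present application can also perform up-sampling processing on the low-resolution component (that is, an upward adjustment). In addition, for the low-resolution component, not only up-sampling processing but also deconvolution processing or even super-resolution processing can be performed; these three kinds of processing have the same effect, and the embodiments of the present application do not specifically limit this.
It should also be noted that, before any other processing, the three components contained in the original image are the original image components. If the first image component is the luma component, the second image component is the first chroma component, and the third image component is the second chroma component, then the low-resolution image component is the second image component or the third image component, and up-sampling processing needs to be performed on the second image component or the third image component.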
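For example, when the size information of the second image component and the third image component of the original image is (H/2)×(W/2) in both cases, they can be converted into the form H×W by up-sampling processing before filtering. Since the size information of the first image component is H×W, the spatial size information of the three image components is then also the same, and the up-sampled second image component and the up-sampled third image component have the same resolution as the first image component.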
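As a rough illustration of the upward adjustment, the sketch below doubles the resolution of a chroma plane by nearest-neighbor repetition. The text does not say which interpolation is used (it also allows deconvolution or super-resolution), so nearest-neighbor is purely an assumption chosen for brevity, and `upsample_2x_nearest` is a hypothetical name.

```python
def upsample_2x_nearest(plane):
    """Upsample an (H/2) x (W/2) plane to H x W by repeating each sample.

    Nearest-neighbor repetition is an illustrative assumption; the source
    text only requires that the result match the luma plane's H x W size.
    """
    out = []
    for row in plane:
        wide = [value for value in row for _ in range(2)]  # repeat horizontally
        out.append(wide)
        out.append(list(wide))                             # repeat vertically
    return out
```

After this step the Y, U, and V planes all have size H×W and can be stacked as three channels of one input.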
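The side information can be used to assist filtering and improve the filtering quality. The side information may be block division information (such as CU division information and/or CTU division information), quantization parameter information, or even motion vector (MV, Motion Vector) information, prediction direction information, and so on. These kinds of information can be used as side information individually or in any combination; for example, block division information alone, or block division information together with quantization parameter information, or block division information together with MV information. The embodiments of the present application do not specifically limit this.
Then, after the at least two components of the pixel information to be filtered and the at least one type of side information are obtained, they are input into the neural-network-based filter for processing. This processing may include component processing, fusion processing, joint processing, branch processing, and so on; the embodiments of the present application do not specifically limit this.
At this point, with the obtained at least two components of the pixel information to be filtered and the at least one type of side information used as input values to the neural-network-based filter, at least one filtered component of the pixel information to be filtered can be output.
The structure of the neural network includes at least one joint processing stage and one independent processing stage; in the joint processing stage, all components are processed together; in the independent processing stage, each component is processed on an independent branch of the neural network.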
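To obtain at least one filtered component of the pixel information to be filtered, in an optional embodiment, S303 may include:
S3031: process each of the at least two components separately to obtain at least two processed components;
S3032: perform fusion processing on the at least one type of side information and the at least two processed components to obtain fusion information of the pixel information to be filtered;
S3033: process the fusion information to obtain at least one filtered component of the pixel information to be filtered.
The processing in S3031 can be regarded as a branching stage, used to obtain the at least two components separately. The fusion information includes at least the information obtained by fusing the at least two components, and the processing in S3032 can be regarded as a merging stage, used to fuse the at least two components. The embodiments of the present application thus adopt a cascaded processing structure: fusing the multiple input components not only makes full use of the relationship between them but also avoids having to run multiple complete forward passes of the network for these components, which reduces the computational complexity and saves bit rate. Finally, the fusion information is processed to obtain at least one filtered component of the pixel information to be filtered; the fusion information can thus further assist filtering, improving the subjective and objective quality of reconstructed video images in the encoding and decoding process.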
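To obtain the at least two processed components, in an optional embodiment, S3031 may include:
performing component processing on each of the at least two components to obtain at least two processed components.
Here, the original image components Y, U, and V of the original image block are obtained and processed separately to obtain the Y, U, and V components of the image block to be filtered; the at least two components of the image block to be filtered may thus be Y and U, or Y and V, or U and V.
To obtain the at least two processed components, in an optional embodiment, when the first side information corresponding to each component is acquired, S3031 may correspondingly include:
fusing each of the at least two components with the first side information corresponding to that component, to obtain at least two processed components;
where the first side information includes at least block division information and/or quantization parameter information.
Understandably, the first side information can be used to assist filtering and improve the filtering quality. The first side information may be block division information (such as CU division information and/or CTU division information), quantization parameter information, or even motion vector (MV, Motion Vector) information, prediction direction information, and so on. These kinds of information can be used as the first side information individually or in any combination; for example, block division information alone, or block division information together with quantization parameter information, or block division information together with MV information. The embodiments of the present application do not specifically limit this.
It should be noted that S3031 can be regarded as the first branching stage. For the at least two components of the pixel information to be filtered, component processing (such as deep learning) can be performed on each of them to obtain at least two processed components; alternatively, the first side information corresponding to each component can be added to that component to obtain at least two processed components. That is, in the first branching stage the first side information may or may not be fused; the embodiments of the present application do not specifically limit this.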
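Further, in some embodiments, CU division information can be used as the block division information corresponding to each component of the pixel information to be filtered. For the CU division information, a first value is filled in at each pixel position on a CU boundary and a second value is filled in at all other pixel positions, giving a first matrix that corresponds to the CU division information, where the first value differs from the second value; the first matrix is used as the block division information corresponding to each component of the pixel information to be filtered.
It should be noted that the first value may be a preset number, letter, or the like, and the second value may also be a preset number, letter, or the like, with the first value different from the second value; for example, the first value can be set to 2 and the second value to 1, but the embodiments of the present application do not specifically limit this.
In the embodiments of the present application, the CU division information can be used as the first side information to assist the filtering of the pixel information to be filtered. That is, in the process of video-encoding the original image in the video to be encoded, the CU division information can be fully utilized by fusing it with the at least two components of the pixel information to be filtered to guide the filtering.
Specifically, the CU division information is converted into a coding unit map (CUmap, Coding Unit Map) represented as a two-dimensional matrix, namely the CUmap matrix, which is the first matrix in the embodiments of the present application. Taking the first image component of the original image as an example, it can be divided into multiple CUs; each pixel position on a CU boundary is filled with the first value, and all other pixel positions are filled with the second value, which constructs a first matrix that reflects the CU division information. For example, FIG. 4 is a schematic structural diagram of a block division matrix provided by an embodiment of the present application. As shown in FIG. 4, if the figure represents one CTU, the CTU can be divided into 9 CUs. Assuming the first value is set to 2 and the second value to 1, each pixel position on a CU boundary is filled with 2 and all other pixel positions are filled with 1; the pixel positions filled with 2 thus mark the CU boundaries, from which the CU division information, that is, the first side information corresponding to the first image component of the pixel information to be filtered, can be determined.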
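The construction of the CUmap matrix can be sketched as below. The representation of each CU as a rectangle `(x0, y0, width, height)` and the choice to treat the outermost ring of pixels of every CU as its boundary are assumptions made for illustration; the text specifies only that boundary positions receive the first value (2 in the example) and all other positions the second value (1). The name `build_cu_map` is hypothetical.

```python
def build_cu_map(height, width, cus, boundary_value=2, interior_value=1):
    """Build a CUmap: boundary pixels get boundary_value, others interior_value.

    `cus` is a list of (x0, y0, w, h) rectangles in pixel units. Treating the
    outer one-pixel ring of each rectangle as "the boundary" is an assumption.
    """
    cu_map = [[interior_value] * width for _ in range(height)]
    for x0, y0, w, h in cus:
        x1, y1 = x0 + w - 1, y0 + h - 1
        for x in range(x0, x1 + 1):        # top and bottom edges
            cu_map[y0][x] = boundary_value
            cu_map[y1][x] = boundary_value
        for y in range(y0, y1 + 1):        # left and right edges
            cu_map[y][x0] = boundary_value
            cu_map[y][x1] = boundary_value
    return cu_map
```

The resulting matrix has the same spatial size as the component it describes, so it can be attached to that component as an extra input channel.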
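It should also be noted that, if the first image component is the luma component and the second and third image components are both chroma components, the CU division information of the first image component may differ from that of the second or third image component. Therefore, when the CU division information of the first image component differs from that of the second or third image component, the CU division information corresponding to the first image component of the pixel information to be filtered and the CU division information corresponding to the second or third image component need to be determined separately, and each is then fused as first side information into the corresponding first, second, or third image component. When the CU division information of the first image component is the same as that of the second or third image component, it suffices to determine the CU division information of only the first, second, or third image component and then fuse the determined CU division information as first side information into the corresponding first, second, or third image component. This makes it convenient to subsequently fuse the at least two new components obtained, so as to perform pre-processing filtering or post-processing filtering on the pixel information to be filtered.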
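In some embodiments, determining the first side information corresponding to each component may consist of obtaining, based on the original image in the video to be encoded, the quantization parameter corresponding to each of the at least two components of the original image block, and using the quantization parameter as the quantization parameter information corresponding to each component of the pixel information to be filtered.
Further, in some embodiments, using the quantization parameter as the quantization parameter information corresponding to each component of the pixel information to be filtered may consist of creating, for each component of the original image, a second matrix of the same size as that component, where every pixel position in the second matrix is filled with the normalized value of the quantization parameter corresponding to that component; the second matrix is used as the quantization parameter information corresponding to each component of the pixel information to be filtered.
It should be noted that pixel information to be filtered that corresponds to different quantization parameters has different degrees of distortion. If the quantization parameter information is incorporated, the filter network can adaptively acquire, during training, the ability to handle any quantization parameter.
In the embodiments of the present application, the quantization parameter information can also be used as the first side information to assist the filtering of the image block to be filtered. That is, in the process of video-encoding the original image in the video to be encoded, the quantization parameter information can be fully utilized by fusing it with the at least two components of the pixel information to be filtered to guide the filtering. The quantization parameter information can be normalized, or it can be processed without normalization (for example by classification or interval division); the following describes in detail the case where the quantization parameter is normalized.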
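Specifically, the quantization parameter information is converted into a second matrix that reflects the quantization parameter information. Taking the first image component of the original image as an example, a matrix of the same size as the first image component of the original image is created, and every pixel position in the matrix is filled with the normalized value of the quantization parameter corresponding to the first image component of the original image. The normalized value of the quantization parameter is denoted $QP_{max}(x, y)$, namely:
$$QP_{max}(x, y) = \frac{QP}{QP_{max}} \qquad (1)$$
In equation (1), $QP$ denotes the quantization parameter value corresponding to the first image component of the original image, $x$ denotes the abscissa of each pixel position in the first image component of the original image, and $y$ denotes the ordinate of each pixel position in the first image component of the original image; $QP_{max}$ denotes the maximum value of the quantization parameter. In general $QP_{max}$ is 51, but $QP_{max}$ can also take other values, such as 29 or 31; the embodiments of the present application do not specifically limit this.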
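Equation (1) amounts to filling a matrix of the component's size with one constant. A minimal sketch, assuming a single QP value per component and using the default maximum of 51 mentioned in the text; the name `build_qp_map` and the range check are illustrative additions, not part of the source.

```python
def build_qp_map(height, width, qp, qp_max=51):
    """Return a height x width matrix filled with the normalized QP, qp / qp_max.

    Rejecting values outside [0, qp_max] is an assumption for illustration.
    """
    if not 0 <= qp <= qp_max:
        raise ValueError("qp must lie in [0, qp_max]")
    normalized = qp / qp_max
    return [[normalized] * width for _ in range(height)]
```

For example, a component coded with QP 37 yields a matrix in which every entry is 37/51; like the CUmap, this matrix can be attached to the component as an additional input channel.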
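To obtain the at least two processed components, in an optional embodiment, when the second side information corresponding to each component is acquired, S3031 may correspondingly include:
fusing each of the at least two components with the second side information corresponding to that component, to obtain at least two processed components;
where the second side information is different from the first side information.
To obtain the fusion information of the pixel information to be filtered, in an optional embodiment, S3032 may include:
fusing the at least two processed components with the first side information corresponding to each component, to obtain the fusion information of the pixel information to be filtered.
To obtain the fusion information of the pixel information to be filtered, in an optional embodiment, S3032 may include:
fusing the at least two processed components with the second side information corresponding to each component, to obtain the fusion information of the pixel information to be filtered.
It should be noted that both the first side information and the second side information can be used to assist filtering and improve the filtering quality. The first side information and the second side information may each be one or more of block division information, quantization parameter information, MV information, prediction direction information, and so on. That is, when the first side information is block division information, the second side information may be quantization parameter information; or, when the first side information is quantization parameter information, the second side information may be block division information; or, when the first side information is block division information and quantization parameter information, the second side information may be MV information; or, when the first side information is block division information, the second side information may be quantization parameter information and MV information. The embodiments of the present application do not specifically limit this.
It should also be noted that the fusion stages of the first side information and the second side information may be the same or different. Assume that the first branching stage denotes the processing stage in which the at least two components of the pixel information to be filtered are obtained separately, the merging stage denotes the processing stage in which the fusion information of the pixel information to be filtered is determined, and the second branching stage denotes the processing stage in which, after the fusion processing, the residual information of each component is determined separately. Then the fusion stage of the first side information may be any of the first branching stage, the merging stage, and the second branching stage, and so may the fusion stage of the second side information. That is, the fusion stage of the first side information may be the first branching stage and that of the second side information the merging stage; or the fusion stage of the first side information may be the merging stage and that of the second side information the first branching stage; or the fusion stage of the first side information may be the second branching stage and that of the second side information the merging stage; or the fusion stage of the first side information may be the first branching stage and that of the second side information the second branching stage; or both may be the first branching stage; or both may be the merging stage. The embodiments of the present application do not specifically limit this.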
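To obtain at least one filtered component of the pixel information to be filtered, in an optional embodiment, S3033 may include:
performing joint processing and branch processing on the fusion information to obtain residual information corresponding to at least one of the at least two components;
summing the at least one of the at least two components with the residual information corresponding to that component, to obtain at least one filtered component of the pixel information to be filtered.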
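The summation in the last step is an element-wise addition of a component and its residual. The sketch below shows only that addition; `add_residual` is a hypothetical name, and no clipping to a valid sample range is applied because the text does not mention one.

```python
def add_residual(component, residual):
    """Return component + residual, element by element (both H x W lists)."""
    same_shape = len(component) == len(residual) and all(
        len(row_c) == len(row_r) for row_c, row_r in zip(component, residual)
    )
    if not same_shape:
        raise ValueError("component and residual must have the same size")
    return [
        [pixel + delta for pixel, delta in zip(row_c, row_r)]
        for row_c, row_r in zip(component, residual)
    ]
```

In the frameworks of FIG. 7 and FIG. 9 this role is played by the adders, one per output component.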
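It should be noted that the pre-processing filtering or post-processing filtering of the embodiments of the present application uses a multi-stage cascaded processing structure, such as a branch-merge-branch processing structure, a branch-merge processing structure, or a merge-branch processing structure; the embodiments of the present application do not specifically limit this.
Specifically, suppose the at least two components of the pixel information to be filtered first need to be obtained separately (the first branching stage) and are then fused (the merging stage). After all the information has been fused, when multiple components need to be output simultaneously, for example the first, second, and third image components, joint processing is performed on the fusion information to obtain the residual information corresponding to the first image component, the residual information corresponding to the second image component, and the residual information corresponding to the third image component. The first image component is then summed with its residual information, the second image component with its residual information, and the third image component with its residual information, giving the filtered first image component, the filtered second image component, and the filtered third image component of the pixel information to be filtered. This processing is the second branching stage, and the whole pre-processing filtering or post-processing filtering process then uses a branch-merge-branch processing structure;
Suppose instead that the at least two components of the pixel information to be filtered first need to be obtained (the first branching stage) and are then fused (the merging stage), and that after all the information has been fused only one component needs to be output, for example the first image component. Joint processing is then performed on the fusion information to obtain the residual information corresponding to the first image component, and the first image component is summed with that residual information to obtain the filtered first image component of the pixel information to be filtered. This processing has no second branching stage, and the whole pre-processing filtering or post-processing filtering process then uses a branch-merge processing structure.
In addition, if the at least two components of the pixel information to be filtered do not need to be obtained separately, that is, no first branching stage is needed, the at least two components can be fused directly, entering the merging stage at once; and because multiple components need to be output simultaneously after all the information has been fused, a second branching stage is still needed. The whole pre-processing filtering or post-processing filtering process then uses a merge-branch processing structure.
It should also be noted that the pre-processing filtering or post-processing filtering of the embodiments of the present application can use still more cascaded processing structures, such as a branch-merge-branch-merge-branch processing structure. Among these, the embodiments of the present application can use a typical cascaded structure, such as the branch-merge-branch processing structure; or a structure with fewer stages than the typical one, such as the branch-merge or merge-branch processing structure; or even a structure with more stages than the typical one, such as the branch-merge-branch-merge-branch processing structure. The embodiments of the present application do not specifically limit this.
The pre-processing filter or post-processing filter may include a convolutional neural network filter.
It should be noted that the pre-processing filter or post-processing filter may be a convolutional neural network filter or a filter built with another deep learning method; the embodiments of the present application do not specifically limit this. A convolutional neural network filter, also called a CNN filter, is a class of feedforward neural network that includes convolution calculations and has a deep structure, and is one of the representative algorithms of deep learning. The input layer of a CNN filter can process multi-dimensional data, such as the three image component (Y/U/V) channels of the original image in the video to be encoded.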
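FIG. 5 is a schematic structural diagram of a traditional CNN filter provided by an embodiment of the present application. As shown in FIG. 5, the traditional CNN filter 50 is an improvement based on the previous-generation video coding standard H.265/High Efficiency Video Coding (HEVC); it contains a 2-layer convolutional network structure and can replace the de-blocking filter and the sample adaptive offset filter. After the image to be filtered (denoted F_in) is input to the input layer of the traditional CNN filter 50, it passes through the first convolutional layer F_1 (assume a 3×3 convolution kernel and 64 feature maps) and the second convolutional layer F_2 (assume a 5×5 convolution kernel and 32 feature maps) to give residual information F_3; the image to be filtered F_in and the residual information F_3 are then summed to give the filtered image output by the traditional CNN filter 50 (denoted F_out). This convolutional network structure is also called a residual neural network and is used to output the residual information corresponding to the image to be filtered. In the traditional CNN filter 50, the three image components of the image to be filtered are processed independently, but they share the same filter network and its parameters.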
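FIG. 6A is a schematic structural diagram of another traditional CNN filter provided by an embodiment of the present application, and FIG. 6B is a schematic structural diagram of yet another traditional CNN filter provided by an embodiment of the present application. Referring to FIG. 6A and FIG. 6B, the traditional CNN filter 60 uses two filter networks: the filter network shown in FIG. 6A is dedicated to outputting the first image component, and the filter network shown in FIG. 6B is dedicated to outputting the second image component or the third image component. Assuming the original image in the video to be encoded has height H and width W, the size information of the first image component is H×W, and pixel rearrangement processing can convert the first image component into the form (H/2)×(W/2)×4; since the size information of the second image component or the third image component is (H/2)×(W/2), the three image components are merged into the form (H/2)×(W/2)×6 and input to the traditional CNN filter 60. With the filter network shown in FIG. 6A, after the input layer receives the image to be filtered F_in (assume a convolution kernel size of N×N and 6 channels), it passes through the first convolutional layer F_1-Y (assume a convolution kernel size of L1×L1, M convolution kernels, and 6 channels) and the second convolutional layer F_2-Y (assume a convolution kernel size of L2×L2, 4 convolution kernels, and M channels) to give residual information F_3-Y (assume a convolution kernel size of N×N and 4 channels); the input image to be filtered F_in and the residual information F_3-Y are then summed to give the filtered first image component output by the traditional CNN filter 60 (denoted F_out-Y). With the filter network shown in FIG. 6B, after the input layer receives the image to be filtered F_in (assume a convolution kernel size of N×N and 6 channels), it passes through the first convolutional layer F_1-U (assume a convolution kernel size of L1×L1, M convolution kernels, and 6 channels) and the second convolutional layer F_2-U (assume a convolution kernel size of L2×L2, 2 convolution kernels, and M channels) to give residual information F_3-U (assume a convolution kernel size of N×N and 2 channels); the input image to be filtered F_in and the residual information F_3-U are then summed to give the filtered second image component or filtered third image component output by the traditional CNN filter 60 (denoted F_out-U).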
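The traditional CNN filter 50 shown in FIG. 5 and the traditional CNN filter 60 shown in FIG. 6A and FIG. 6B do not consider the relationship between different image components, and processing each image component independently is not sufficiently reasonable. In addition, they do not make full use of coding parameters such as block division information and QP information at the input; yet the distortion of the reconstructed image comes mainly from blocking artifacts, and the boundary information of the blocking artifacts is determined by the CU division information. That is, the filter network in a CNN filter should focus on the boundary regions. Furthermore, incorporating the quantization parameter information into the filter network also helps improve its generalization ability, enabling it to filter distorted images of any quality. Therefore, in the filtering method provided by the embodiments of the present application, the CNN filtering structure is set up reasonably: the same filter network can receive multiple image components at the same time, the relationship between these image components is fully considered, and enhanced images of these image components can be output together after filtering. In addition, the filtering method can incorporate coding parameters such as block division information and/or QP information as coding information to assist filtering, which improves the filtering quality.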
需要说明的是,本申请实施例中的S3031,具体来说,可以是针对待滤波像素信息的第一图像分量、第二图像分量以及第三图像分量分别确定每种分量对应的边信息(比如第一边信息或第二边信息),经过融合处理后可以得到三个图像分量;还可以是针对待滤波像素信息的第一图像分量和第二图像分量分别确定每种分量对应的边信息,经过融合处理后可以得到二个图像分量;也可以是针对待滤波像素信息的第一图像分量和第三图像分量分别确定每种分量对应的边信息,经过融合处理后可以得到二种分量;甚至也可以是针对待滤波像素信息的第二图像分量和第三图像分量分别确定每种分量对应 的边信息,经过融合处理后可以得到二种新的分量;本申请实施例不作具体限定。
还需要说明的是,针对待滤波像素信息的融合信息,可以是由至少两种分量直接进行融合得到的,也可以是至少两种分量以及对应的边信息(比如第一边信息或第二边信息)共同融合得到的;本申请实施例不作具体限定。
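If the fusion information is obtained by directly fusing at least two components, then the first, second and third image components of the to-be-filtered pixel information may be fused to obtain the fusion information; or the first and second image components may be fused; or the first and third image components may be fused; or even the second and third image components may be fused.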
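If the fusion information is obtained by fusing at least two components together with the corresponding side information (for example, first side information or second side information), then the first, second and third image components of the to-be-filtered pixel information may be fused with the side information to obtain the fusion information; or the first and second image components may be fused with the side information; or the first and third image components may be fused with the side information; or even the second and third image components may be fused with the side information. Specifically, "fusing at least two components together with the corresponding side information (for example, first side information or second side information)" may mean first fusing the at least two components of the to-be-filtered pixel information and then incorporating the side information; it may also mean first incorporating the corresponding side information into each of the at least two components and then fusing the processed components. In other words, the embodiments of this application do not limit the specific manner of fusion processing.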
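The two fusion orders described above can be contrasted in a short sketch. It assumes that "fusion" means channel concatenation, which the text does not state explicitly; in a real network each per-component step would typically be followed by convolution. All function names are hypothetical.

```python
# Sketch of two fusion orders, modelling fusion as channel concatenation.
# Each "group" is a list of 2-D planes; names are illustrative assumptions.

def fuse(*channel_groups):
    """Concatenate several groups of planes into one list of planes."""
    fused = []
    for group in channel_groups:
        fused.extend(group)
    return fused

def fuse_components_then_side(components, side_info):
    """Order A: fuse the components first, then incorporate the side information."""
    return fuse(fuse(*components), side_info)

def fuse_side_per_component_then_merge(components, side_infos):
    """Order B: incorporate side information into each component, then fuse."""
    per_component = [fuse(c, s) for c, s in zip(components, side_infos)]
    return fuse(*per_component)
```

Order A yields the channel layout [Y, U, side], while order B yields [Y, side, U, side]; the second form lets each component be paired with side information at its own resolution before the branches are merged.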
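In addition, regarding S303 in the embodiments of this application: after multiple components of the to-be-filtered pixel information (for example, the first, second and third image components) and the side information (for example, first side information or second side information) are fused and input to the filter, the filter may output only the filtered first image component, or only the filtered second image component, or only the filtered third image component; it may output the filtered first and second image components, or the filtered second and third image components, or the filtered first and third image components; or it may even output the filtered first, second and third image components. The embodiments of this application impose no specific limitation.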
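Take as an example the case where three components of the to-be-filtered pixel information are input simultaneously to the neural-network-based filter and a cascaded split-merge-split processing structure is used. FIG. 7 is a schematic structural diagram of an optional filtering framework provided by an embodiment of this application. As shown in FIG. 7, the filtering framework 70 may include three components of the to-be-filtered pixel information (denoted Y, U and V) 701, a first splitting unit 702, first side information 703, a Y-component first processing unit 704, a U-component first processing unit 705, a V-component first processing unit 706, second side information 707, an input fusion unit 708, a joint processing unit 709, a second splitting unit 710, a Y-component second processing unit 711, a U-component second processing unit 712, a V-component second processing unit 713, a first adder 714, a second adder 715, a third adder 716, and three filtered image components (denoted Out_Y, Out_U and Out_V) 717. Specifically, the three image components 701 of the to-be-filtered image block pass through the first splitting unit 702 and are split into three signals: the Y image component, the U image component and the V image component. The Y image component on the first path, together with its corresponding first side information 703, enters the Y-component first processing unit 704; the U image component on the second path, together with its corresponding first side information 703, enters the U-component first processing unit 705; and the V image component on the third path, together with its corresponding first side information 703, enters the V-component first processing unit 706. This produces three new image components. The input fusion unit 708 fuses these three new image components with the second side information 707 and feeds the result to the joint processing unit 709. The joint processing unit 709 contains a multi-layer convolutional filtering network that performs convolution on the input information; because the convolution process is similar to that of the related technical solutions, the specific steps performed by the joint processing unit 709 are not described further. The output of the joint processing unit 709 enters the second splitting unit 710, which splits it into three signals again; these three signals are input to the Y-component second processing unit 711, the U-component second processing unit 712 and the V-component second processing unit 713 respectively, yielding the residual information of the Y image component, of the U image component and of the V image component in turn. The Y image component among the three image components 701 of the to-be-filtered image block and the obtained residual information of the Y image component are input together to the first adder 714, whose output is the filtered Y image component (denoted Out_Y). The U image component among the three image components 701 and the obtained residual information of the U image component are input together to the second adder 715, whose output is the filtered U image component (denoted Out_U). The V image component among the three image components 701 and the obtained residual information of the V image component are input together to the third adder 716, whose output is the filtered V image component (denoted Out_V). Regarding the output components: if only the filtered Y image component needs to be output, the filtering framework 70 may omit the second splitting unit 710, the second adder 715 and the third adder 716; if only the filtered U image component needs to be output, the filtering framework 70 may omit the second splitting unit 710, the first adder 714 and the third adder 716; and if the filtered Y image component and the filtered U image component need to be output, the filtering framework 70 may omit the third adder 716. The embodiments of this application impose no specific limitation.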
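The data flow of the split-merge-split framework of FIG. 7 can be traced with a small sketch in which every processing unit is passed in as a callable. The function names and signatures are hypothetical, the units used here are stand-ins for trained convolutional layers, and all components are assumed to share the same resolution.

```python
# Sketch of the split-merge-split data flow of FIG. 7.
# Units are injected as callables; names and signatures are assumptions.

def add_planes(a, b):
    """Element-wise sum of two equally sized 2-D planes (the adders 714-716)."""
    return [[p + q for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def run_framework(components, first_side, second_side,
                  first_units, joint_unit, second_units):
    """components: dict name -> plane, e.g. {"Y": ..., "U": ..., "V": ...}.
    first_units / second_units: dict name -> callable.
    joint_unit(fused, n) must return n feature planes, one per component."""
    names = list(components)
    # First splitting unit (702) and per-component first processing units
    # (704-706), each receiving its component plus the first side information.
    branches = [first_units[n](components[n], first_side) for n in names]
    # Input fusion unit (708): merge the branches with the second side information.
    fused = branches + [second_side]
    # Joint processing unit (709): processes all components together.
    joint = joint_unit(fused, len(names))
    outputs = {}
    # Second splitting unit (710), second processing units (711-713), adders.
    for name, feature in zip(names, joint):
        residual = second_units[name](feature)
        outputs[name] = add_planes(components[name], residual)
    return outputs
```

Because each output is formed as input plus residual, replacing the second processing units with functions that return zeros makes the framework an identity mapping, which is a convenient sanity check for the wiring.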
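Take as an example the case where two components of the to-be-filtered pixel information are input simultaneously to the neural-network-based filter and a cascaded split-merge processing structure is used. FIG. 8 is a schematic structural diagram of another optional filtering framework provided by an embodiment of this application. As shown in FIG. 8, the filtering framework 80 may include two components of the to-be-filtered pixel information (denoted Y and U) 801, the first splitting unit 702, the first side information 703, the Y-component first processing unit 704, the U-component first processing unit 705, the input fusion unit 708, the joint processing unit 709, the Y-component second processing unit 711, the first adder 714, and one filtered image component (denoted Out_Y) 802. Specifically, the two image components 801 of the to-be-filtered image block pass through the first splitting unit 702 and are split into two signals: the Y image component and the U image component. The Y image component on the first path, together with its corresponding first side information 703, enters the Y-component first processing unit 704, and the U image component on the second path, together with its corresponding first side information 703, enters the U-component first processing unit 705. This produces two new image components. The input fusion unit 708 fuses these two new image components and feeds the result to the joint processing unit 709. Because only a single image component (the filtered Y image component) needs to be output, the output of the joint processing unit 709 does not need to enter the second splitting unit 710 and can be input directly to the Y-component second processing unit 711, which yields the residual information of the Y image component. The Y image component among the two image components 801 of the to-be-filtered image block and the obtained residual information of the Y image component are input together to the first adder 714, whose output is the filtered Y image component (denoted Out_Y).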
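It should be noted that the size of the Y image component may differ from that of the U image component or the V image component. In the filtering framework 70 shown in FIG. 7 or the filtering framework 80 shown in FIG. 8, an upsampling unit (or a deconvolution unit or a super-resolution unit) may therefore be added before the U-component first processing unit 705 and the V-component first processing unit 706 to perform upsampling, so that the upsampled U image component or the upsampled V image component has the same resolution as the Y image component, which facilitates the subsequent pre-processing filtering and post-processing filtering. In addition, taking the filtering framework 70 shown in FIG. 7 as an example, the pre-processing filter and the post-processing filter in the embodiments of this application may include at least the input fusion unit 708, the joint processing unit 709, and the first adder 714, the second adder 715 and the third adder 716; they may also include the first splitting unit 702, the Y-component first processing unit 704, the U-component first processing unit 705, the V-component first processing unit 706 and so on; and they may even include the second splitting unit 710, the Y-component second processing unit 711, the U-component second processing unit 712, the V-component second processing unit 713 and so on. The embodiments of this application impose no specific limitation.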
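As a simple stand-in for the upsampling unit mentioned above, the sketch below doubles the resolution of a chroma plane by nearest-neighbour replication. The text equally allows a deconvolution or super-resolution unit; nearest-neighbour interpolation, the fixed factor of 2 and the function name are assumptions chosen only to keep the example short.

```python
# Sketch: nearest-neighbour 2x upsampling of a chroma plane so that it
# matches the luma resolution under 4:2:0 sampling (an assumption).

def upsample_nearest_2x(plane):
    """Repeat every sample twice horizontally and every row twice vertically."""
    out = []
    for row in plane:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))  # independent copy of the duplicated row
    return out
```

Applied to a 2×2 plane this returns a 4×4 plane, after which the U or V component can be concatenated channel-wise with Y.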
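In addition, the filtering method provided in the embodiments of this application may use a split-merge-split processing structure, such as the filtering framework 70 shown in FIG. 7; it may use a shorter split-merge processing structure, such as the filtering framework 80 shown in FIG. 8; it may use a shorter merge-split processing structure; or it may even use a longer split-merge-split-merge-split processing structure. The embodiments of this application impose no specific limitation.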
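It should also be noted that the first side information and the second side information may both take part in the filtering process, as in the filtering framework 70 shown in FIG. 7; they may also take part selectively, as in the filtering framework 80 shown in FIG. 8, where the second side information does not take part. In the embodiments of this application, both the first and the second side information may take part in the filtering process, or the first side information may not take part, or the second side information may not take part, or neither may take part. The embodiments of this application impose no specific limitation.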
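It should also be noted that the first side information and the second side information may be fused at the same stage or at different stages; that is, they may join the filtering process at the same stage or at different stages. The embodiments of this application impose no specific limitation. For example, still taking the filtering framework 70 shown in FIG. 7: the first side information 703 and the second side information 707 may both join the filtering process at the stage corresponding to the first splitting unit 702; or both may join at the stage corresponding to the input fusion unit 708; or both may join at the stage corresponding to the second splitting unit 710; or the first side information 703 may join at the stage corresponding to the first splitting unit 702 while the second side information 707 joins at the stage corresponding to the input fusion unit 708; or the first side information 703 may join before the stage corresponding to the first splitting unit 702 while the second side information 707 joins at the stage corresponding to the input fusion unit 708; or the first side information 703 may join before the stage corresponding to the first splitting unit 702 while the second side information 707 joins at the stage corresponding to the second splitting unit 710; or the first side information 703 may join at the stage corresponding to the input fusion unit 708 while the second side information 707 joins at the stage corresponding to the second splitting unit 710. In other words, the fusion stage of the first side information 703 and the second side information 707 can be chosen flexibly within the cascaded processing structure, and the embodiments of this application impose no specific limitation.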
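The flexible choice of fusion stage can be captured as a small configuration sketch. The `Stage` enumeration, the `plan` mapping and the helper function are hypothetical constructs for illustration; they only model which side information is appended to the features at which stage, with `None` meaning that a piece of side information is not used at all.

```python
# Sketch: configurable fusion stage for each piece of side information.
# Stage names and the plan structure are illustrative assumptions.
from enum import Enum

class Stage(Enum):
    BEFORE_FIRST_SPLIT = 0
    FIRST_SPLIT = 1
    INPUT_FUSION = 2
    SECOND_SPLIT = 3

def inputs_for_stage(stage, features, side_infos, plan):
    """Return `features` followed by every side-information plane that the
    plan schedules for `stage`. `plan` maps a side-information name to a
    Stage, or to None when that side information is not used."""
    extra = [side_infos[name] for name, when in plan.items() if when is stage]
    return features + extra
```

For example, the configuration of FIG. 7 corresponds to `{"first": Stage.FIRST_SPLIT, "second": Stage.INPUT_FUSION}`, while the configuration of FIG. 8 sets `"second"` to `None`.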
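Taking the filtering framework 70 shown in FIG. 7 as an example, it uses a deep learning network (such as a CNN) for filtering. It differs from the conventional CNN filters in that the filter in the embodiments of this application uses a cascaded processing structure, can take the three components of the to-be-filtered pixel information as simultaneous inputs to the filtering network, and also incorporates other coding-related side information (for example, coding parameters such as block partition information, quantization parameter information and MV information), which can be incorporated into the filtering network at the same stage or at different stages. In this way, the relationship among the three components is fully used, and other coding-related information is used to assist filtering, which improves filtering quality. Moreover, processing the three components at the same time avoids performing three complete forward passes of the network for the three components, which reduces computational complexity and saves coding bit rate.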
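FIG. 9 is a schematic structural diagram of yet another optional filtering framework provided by an embodiment of this application. As shown in FIG. 9, the filtering framework 90 may include three components of the to-be-filtered pixel information (denoted Y, U and V) 901, first side information 902, a Y-component first processing unit 903, a U-component first processing unit 904, a V-component first processing unit 905, second side information 906, a fusion unit 907, a joint processing unit 908, a splitting unit 909, a Y-component second processing unit 910, a U-component second processing unit 911, a V-component second processing unit 912, a first adder 913, a second adder 914, a third adder 915, and three filtered image components (denoted Out_Y, Out_U and Out_V) 916. Specifically, the three components 901 of the to-be-filtered pixel information undergo component processing and are split into three signals: the Y image component, the U image component and the V image component. The Y image component on the first path, together with its corresponding first side information 902, enters the Y-component first processing unit 903; the U image component on the second path, together with its corresponding first side information 902, enters the U-component first processing unit 904; and the V image component on the third path, together with its corresponding first side information 902, enters the V-component first processing unit 905. This produces three new image components. The fusion unit 907 fuses these three new image components with the second side information 906 and feeds the result to the joint processing unit 908. The joint processing unit 908 contains a multi-layer convolutional filtering network that performs convolution on the input information; because the convolution process is similar to that of the related technical solutions, the specific steps performed by the joint processing unit 908 are not described further. The output of the joint processing unit 908 enters the splitting unit 909, which splits it into three signals again; these three signals are input to the Y-component second processing unit 910, the U-component second processing unit 911 and the V-component second processing unit 912 respectively, yielding the residual information of the Y image component, of the U image component and of the V image component in turn. The Y image component among the three components 901 of the to-be-filtered pixel information and the obtained residual information of the Y image component are input together to the first adder 913, whose output is the filtered Y image component (denoted Out_Y). The U image component among the three components 901 and the obtained residual information of the U image component are input together to the second adder 914, whose output is the filtered U image component (denoted Out_U). The V image component among the three components 901 and the obtained residual information of the V image component are input together to the third adder 915, whose output is the filtered V image component (denoted Out_V). Regarding the output components: if only the filtered Y image component needs to be output, the filtering framework 90 may omit the splitting unit 909, the second adder 914 and the third adder 915; if only the filtered U image component needs to be output, the filtering framework 90 may omit the splitting unit 909, the first adder 913 and the third adder 915; and if the filtered Y image component and the filtered U image component need to be output, the filtering framework 90 may omit the third adder 915. The embodiments of this application impose no specific limitation.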
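The neural network architecture provided by the embodiments of this application makes reasonable and effective use of each component and of the side information, and can deliver better coding performance.
The embodiments of this application provide a filtering method. First, a filtering device acquires to-be-filtered pixel information and acquires at least one kind of side information, and inputs at least two components of the to-be-filtered pixel information and the at least one kind of side information into a neural-network-based filter, to output at least one filtered component of the to-be-filtered pixel information. That is, in the embodiments of this application, at least two components of the to-be-filtered pixel information and at least one kind of side information are acquired and input into the neural-network-based filter for processing, and the side information of at least one component is incorporated during filtering to obtain the filtered pixel information. In this way, the relationship among multiple components is fully used, and performing multiple complete forward passes of the network for the at least two components is avoided, which reduces computational complexity, saves coding bit rate, and improves the quality of the images obtained after pre-processing filtering and after post-processing filtering in the encoding and decoding process, thereby improving the quality of the reconstructed image.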
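Based on the same inventive concept, FIG. 10 is a schematic structural diagram of an optional filtering device provided by an embodiment of this application. As shown in FIG. 10, the filtering device may include a first acquisition module 101, a second acquisition module 102 and a determination module 103, wherein:
the first acquisition module 101 is configured to acquire to-be-filtered pixel information;
the second acquisition module 102 is configured to acquire at least one kind of side information;
the determination module 103 is configured to input at least two components of the to-be-filtered pixel information and the at least one kind of side information into a neural-network-based filter, to output at least one filtered component of the to-be-filtered pixel information.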
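In the above solution, the determination module 103 may include:
a first processing submodule, configured to process each of the at least two components separately to obtain at least two processed components;
a fusion submodule, configured to perform fusion processing based on the at least one kind of side information and the at least two processed components to obtain fusion information of the to-be-filtered pixel information;
a second processing submodule, configured to process the fusion information to obtain at least one filtered component of the to-be-filtered pixel information.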
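In the above solution, the first processing submodule may be specifically configured to:
perform component processing on each of the at least two components separately to obtain the at least two processed components.
In the above solution, when first side information corresponding to each component is acquired, the first processing submodule may correspondingly be specifically configured to:
fuse each of the at least two components with the first side information corresponding to that component to obtain the at least two processed components;
wherein the first side information includes at least block partition information and/or quantization parameter information.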
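In the above solution, the fusion submodule may be specifically configured to:
fuse the at least two processed components with the first side information corresponding to each component to obtain the fusion information of the to-be-filtered pixel information.
In the above solution, the second processing submodule may be specifically configured to:
perform joint processing and splitting processing on the fusion information to obtain residual information corresponding to at least one of the at least two components;
sum the at least one of the at least two components with the residual information corresponding to that component to obtain at least one filtered component of the to-be-filtered pixel information.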
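In the above solution, when second side information corresponding to each component is acquired, the first processing submodule may correspondingly be specifically configured to:
fuse each of the at least two components with the second side information corresponding to that component to obtain the at least two processed components;
wherein the second side information is different from the first side information.
In the above solution, the fusion submodule may be specifically configured to:
fuse the at least two processed components with the second side information corresponding to each component to obtain the fusion information of the to-be-filtered pixel information.
In the above solution, the structure of the neural network includes at least one joint processing stage and one independent processing stage; in the joint processing stage, all components are processed together; in the independent processing stage, each component is processed on a separate branch of the neural network.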
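It can be understood that, in this embodiment, a "unit" may be part of a circuit, part of a processor, part of a program or software, and so on; it may of course also be a module, or it may be non-modular. Moreover, the components in this embodiment may be integrated in one processing unit, each unit may exist physically on its own, or two or more units may be integrated in one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module.
If the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this embodiment in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) or a processor to perform all or some of the steps of the method described in this embodiment. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disc.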
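FIG. 11 is a schematic structural diagram of an optional encoder provided by an embodiment of this application. As shown in FIG. 11, an embodiment of this application provides an encoder 1100,
which includes a processor 1101 and a storage medium 1102 storing instructions executable by the processor 1101. The storage medium 1102 relies on the processor 1101, via a communication bus 1103, to perform operations; when the instructions are executed by the processor 1101, the filtering method of the above embodiments is performed.
It should be noted that, in practical applications, the components of the terminal are coupled together through the communication bus 1103. It can be understood that the communication bus 1103 is used to implement connection and communication among these components. In addition to a data bus, the communication bus 1103 also includes a power bus, a control bus and a status signal bus. For clarity of description, however, all of these buses are labeled as the communication bus 1103 in FIG. 11.
An embodiment of this application provides a computer storage medium storing executable instructions; when the executable instructions are executed by one or more processors, the processors perform the filtering method described in one or more of the above embodiments.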
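It can be understood that the memory in the embodiments of this application may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM) or a flash memory. The volatile memory may be a random access memory (Random Access Memory, RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDRSDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (Synchlink DRAM, SLDRAM) and direct Rambus random access memory (Direct Rambus RAM, DRRAM). The memory of the systems and methods described herein is intended to include, without being limited to, these and any other suitable types of memory.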
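The processor may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and it can implement or perform the methods, steps and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of this application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory or a register. The storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.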
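It can be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode or a combination thereof. For a hardware implementation, the processing unit may be implemented in one or more application-specific integrated circuits (Application Specific Integrated Circuits, ASIC), digital signal processors (Digital Signal Processor, DSP), digital signal processing devices (DSP Device, DSPD), programmable logic devices (Programmable Logic Device, PLD), field-programmable gate arrays (Field-Programmable Gate Array, FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in this application, or a combination thereof.
For a software implementation, the techniques described herein may be implemented by modules (for example, procedures or functions) that perform the functions described herein. The software code may be stored in a memory and executed by a processor. The memory may be implemented inside or outside the processor.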
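It should be noted that, herein, the terms "include", "comprise" and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
The sequence numbers of the above embodiments of this application are for description only and do not indicate the relative merits of the embodiments.
From the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk or an optical disc) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, a network device or the like) to perform the methods described in the embodiments of this application.
The embodiments of this application have been described above with reference to the accompanying drawings, but this application is not limited to the specific implementations described above, which are merely illustrative rather than restrictive. Inspired by this application, those of ordinary skill in the art may devise many other forms without departing from the purpose of this application and the scope protected by the claims, and all of these fall within the protection of this application.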
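Industrial Applicability
In the embodiments of this application, a filtering device first acquires to-be-filtered pixel information and acquires at least one kind of side information, and inputs at least two components of the to-be-filtered pixel information and the at least one kind of side information into a neural-network-based filter, to output at least one filtered component of the to-be-filtered pixel information. That is, in the embodiments of this application, at least two components of the to-be-filtered pixel information and at least one kind of side information are acquired and input into the neural-network-based filter for processing, and the side information of at least one component is incorporated during filtering to obtain the filtered pixel information. In this way, the relationship among multiple components is fully used, and performing multiple complete forward passes of the network for the at least two components is avoided, which reduces computational complexity, saves coding bit rate, and improves the quality of the images obtained after pre-processing filtering and after post-processing filtering in the encoding and decoding process, thereby improving the quality of the reconstructed image.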

Claims (12)

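  1. A filtering method, wherein the method comprises:
    acquiring to-be-filtered pixel information;
    acquiring at least one kind of side information;
    inputting at least two components of the to-be-filtered pixel information and the at least one kind of side information into a neural-network-based filter, to output at least one filtered component of the to-be-filtered pixel information.
  2. The method according to claim 1, wherein inputting the at least two components of the to-be-filtered pixel information and the at least one kind of side information into the neural-network-based filter, to output the at least one filtered component of the to-be-filtered pixel information, comprises:
    processing each of the at least two components separately to obtain at least two processed components;
    performing fusion processing based on the at least one kind of side information and the at least two processed components to obtain fusion information of the to-be-filtered pixel information;
    processing the fusion information to obtain the at least one filtered component of the to-be-filtered pixel information.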
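  3. The method according to claim 2, wherein processing each of the at least two components separately to obtain the at least two processed components comprises:
    performing component processing on each of the at least two components separately to obtain the at least two processed components.
  4. The method according to claim 2, wherein, when first side information corresponding to each component is acquired, processing each of the at least two components separately to obtain the at least two processed components correspondingly comprises:
    fusing each of the at least two components with the first side information corresponding to that component to obtain the at least two processed components;
    wherein the first side information includes at least block partition information and/or quantization parameter information.
  5. The method according to claim 4, wherein performing fusion processing based on the at least one kind of side information and the at least two processed components to obtain the fusion information of the to-be-filtered pixel information comprises:
    fusing the at least two processed components with the first side information corresponding to each component to obtain the fusion information of the to-be-filtered pixel information.
  6. The method according to claim 2, wherein processing the fusion information to obtain the at least one filtered component of the to-be-filtered pixel information comprises:
    performing joint processing and splitting processing on the fusion information to obtain residual information corresponding to at least one of the at least two components;
    summing the at least one of the at least two components with the residual information corresponding to the at least one component to obtain the at least one filtered component of the to-be-filtered pixel information.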
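  7. The method according to any one of claims 1 to 6, wherein, when second side information corresponding to each component is acquired, processing each of the at least two components separately to obtain the at least two processed components correspondingly comprises:
    fusing each of the at least two components with the second side information corresponding to that component to obtain the at least two processed components;
    wherein the second side information is different from the first side information.
  8. The method according to claim 7, wherein performing fusion processing based on the at least one kind of side information and the at least two processed components to obtain the fusion information of the to-be-filtered pixel information comprises:
    fusing the at least two processed components with the second side information corresponding to each component to obtain the fusion information of the to-be-filtered pixel information.
  9. The method according to claim 1, wherein the structure of the neural network includes at least one joint processing stage and one independent processing stage;
    in the joint processing stage, all components are processed together;
    in the independent processing stage, each component is processed on a separate branch of the neural network.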
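  10. A filtering device, wherein the filtering device comprises:
    a first acquisition module, configured to acquire to-be-filtered pixel information;
    a second acquisition module, configured to acquire at least one kind of side information;
    a determination module, configured to input at least two components of the to-be-filtered pixel information and the at least one kind of side information into a neural-network-based filter, to output at least one filtered component of the to-be-filtered pixel information.
  11. An encoder, wherein the encoder comprises:
    a processor and a storage medium storing instructions executable by the processor, the storage medium relying on the processor, via a communication bus, to perform operations, wherein, when the instructions are executed by the processor, the filtering method according to any one of claims 1 to 9 is performed.
  12. A computer storage medium storing executable instructions, wherein, when the executable instructions are executed by one or more processors, the processors perform the filtering method according to any one of claims 1 to 9.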
PCT/CN2019/104499 2019-03-24 2019-09-05 Filtering method and device, encoder and computer storage medium Ceased WO2020192020A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201962822951P 2019-03-24 2019-03-24
US62/822,951 2019-03-24


Publications (1)

Publication Number Publication Date
WO2020192020A1 true WO2020192020A1 (zh) 2020-10-01


